ERIC - Search Results

Filter: Test Reliability

Showing 91 to 105 of 133 results

Consideration for Sample Size in Reliability Studies for Mastery Tests. Publication Series in Mastery Testing.

Saunders, Joseph C.; Huynh, Huynh – 1980

In most reliability studies, the precision of a reliability estimate varies inversely with the number of examinees (sample size). Thus, to achieve a given level of accuracy, some minimum sample size is required. An approximation for this minimum size may be made if some reasonable assumptions regarding the mean and standard deviation of the test…

Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests

Equivalence Reliability of the Split-Half WISC-R Object Assembly Subtest in a Cohort of Juvenile Offenders.

Rodriguez-Aragon, Graciela; And Others – 1993

The predictive power of the Split-Half version of the Wechsler Intelligence Scale for Children--Revised (WISC-R) Object Assembly (OA) subtest was compared to that of the full administration of the OA subtest. A cohort of 218 male and 49 female adolescent offenders detained in a Texas juvenile detention facility between 1990 and 1992 was used. The…

Descriptors: Adolescents, Cohort Analysis, Comparative Testing, Correlation

A Comparison of Three Short Forms of the McCarthy Scales of Children's Abilities.

Peer reviewed

Harrington, Robert G.; Jennings, Valerie – Contemporary Educational Psychology, 1986

Three short forms of the McCarthy Scales of Children's Abilities (MSCA) have been developed to screen the cognitive skills of young children suspected of learning disorders and developmental delays. Correlations were obtained between scores on the full form of the MSCA and the Kaufman, Taylor, and McCarthy Screening Test short forms. (Author/LMO)

Descriptors: Cognitive Tests, Comparative Testing, Correlation, Early Childhood Education

The Standardized Mean Difference within the Framework of Item Response Theory

Peer reviewed

Wang, Wen-Chung; Chen, Hsueh-Chu – Educational and Psychological Measurement, 2004

As item response theory (IRT) becomes popular in educational and psychological testing, there is a need to report IRT-based effect size measures. In this study, we show how the standardized mean difference can be generalized into such a measure. A disattenuation procedure based on the IRT test reliability is proposed to correct the attenuation…

Descriptors: Test Reliability, Rating Scales, Sample Size, Error of Measurement

A Method for Determining the Length of Criterion-Referenced Tests Using Reliability and Validity Indices.

Mills, Craig N.; Simon, Robert – 1981

When criterion-referenced tests are used to assign examinees to states reflecting their performance level on a test, the better known methods for determining test length, which consider relationships among domain scores and errors of measurement, have their limitations. The purpose of this paper is to present a computer system named TESTLEN, which…

Descriptors: Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores, Error of Measurement

An Investigation of Proposed Revisions to Section 3 of the TOEFL Test. TOEFL Research Report 47.

Schedl, Mary; And Others – 1995

The Test of English as a Foreign Language (TOEFL) program is exploring a change in Section 3 of the TOEFL test that would replace the vocabulary subpart with additional reading comprehension questions. This study investigated the proposed revision in terms of the length and timing that would be necessary to address concerns of test speededness of…

Descriptors: Adult Students, English (Second Language), Language Tests, Psychometrics

The Use of the Sequential Probability Ratio Test in Making Grade Classifications in Conjunction with Tailored Testing.

Reckase, Mark D. – 1981

This report describes a study comparing the classification results obtained from a one-parameter and three-parameter logistic based tailored testing procedure used in conjunction with Wald's sequential probability ratio test (SPRT). Eighty-eight college students were classified into four grade categories using achievement test results obtained…

Descriptors: Adaptive Testing, Classification, Comparative Analysis, Computer Assisted Testing

Practical Procedures for Constructing Mastery Tests to Minimize Errors of Classification and to Maximize or Optimize Decision Reliability.

Byars, Alvin Gregg – 1980

The objectives of this investigation are to develop, describe, assess, and demonstrate procedures for constructing mastery tests to minimize errors of classification and to maximize decision reliability. The guidelines are based on conditions where item exchangeability is a reasonable assumption and the test constructor can control the number of…

Descriptors: Cutting Scores, Difficulty Level, Grade 4, Intermediate Grades

Q. How Many Options Should a Multiple-Choice Question Have? (a) 2. (b) 3. (c) 4. At-a-glance Research Report.

Catts, Ralph – 1978

The reliability of multiple choice tests--containing different numbers of response options--was investigated for 260 students enrolled in technical college economics courses. Four test forms, constructed from previously used four-option items, were administered, consisting of (1) 60 two-option items--two distractors randomly discarded; (2) 40…

Descriptors: Answer Sheets, Difficulty Level, Foreign Countries, Higher Education

An Investigation of the Differential Effort Received by Items on a Low-Stakes Computer-Based Test

Peer reviewed

Wise, Steven L. – Applied Measurement in Education, 2006

In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found…

Descriptors: Computer Assisted Testing, Motivation, Test Validity, Item Response Theory

Estimation of Interrater and Parallel Forms Reliability for the MCAT Essay.

Mitchell, Karen J.; Anderson, Judith A. – 1987

The Association of American Medical Colleges is conducting research to develop, implement, and evaluate a Medical College Admission Test (MCAT) essay testing program. Essay administration in the spring and fall of 1985 and 1986 suggested that additional research was needed on the development of topics which elicit similar skills and meet standard…

Descriptors: College Entrance Examinations, Essay Tests, Estimation (Mathematics), Generalizability Theory

Preschool Assessment Instrument Ratings Guide.

Metropolitan Atlanta Consortium of Consultants and Lead Speech-Language Pathologists, GA. – 1990

This guide presents ratings of assessment instruments for use by speech-language pathologists with preschool students. Tests are reviewed in alphabetical order on forms filled out by practicing speech-language pathologists, including data on speech components covered by each test, age range, factors of norms where norms are used, reliability,…

Descriptors: Diagnostic Tests, Examiners, Preschool Education, Preschool Tests

Tailoring Tests to Educational Levels.

de Jong, John H. A. L. – 1984

The Netherlands' secondary education system is highly differentiated, with four different school types for four scholastic ability levels. Final examinations must accommodate these four levels, and require a test-independent definition of the intended final ability levels as well as a sample-free evaluation of the range of ability levels at which…

Descriptors: Difficulty Level, Efficiency, Equated Scores, Foreign Countries

Comparison of Difficulties and Reliabilities of Math-Completion and Multiple-Choice Item Formats.

Oosterhof, Albert C.; Coats, Pamela K. – 1981

Instructors who develop classroom examinations that require students to provide a numerical response to a mathematical problem are often very concerned about the appropriateness of the multiple-choice format. The present study augments previous research relevant to this concern by comparing the difficulty and reliability of multiple-choice and…

Descriptors: Comparative Analysis, Difficulty Level, Grading, Higher Education

Effects of Test Length and Advancement Score on Several Criterion-Referenced Test Reliability and Validity Indices. Laboratory of Psychometric and Evaluation Research Report No. 86.

Eignor, Daniel R.; Hambleton, Ronald K. – 1979

The purpose of the investigation was to obtain some relationships among (1) test lengths, (2) shape of domain-score distributions, (3) advancement scores, and (4) several criterion-referenced test score reliability and validity indices. The study was conducted using computer simulation methods. The values of variables under study were set to be…

Descriptors: Comparative Analysis, Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores
