Peer reviewed: Hambleton, Ronald K.; And Others – Journal of Educational Measurement, 1983
A new method was developed to assist in the selection of a test length by utilizing computer simulation procedures and item response theory. A demonstration of the method presents results which address the influences of item pool heterogeneity matched to the objectives of interest and the method of item selection. (Author/PN)
Descriptors: Computer Programs, Criterion Referenced Tests, Item Banks, Latent Trait Theory
Peer reviewed: Green, Kathy – Journal of Experimental Education, 1979
Reliabilities and concurrent validities of teacher-made multiple-choice and true-false tests were compared. No significant differences were found even when multiple-choice reliability was adjusted to equate testing time. (Author/MH)
Descriptors: Comparative Testing, Higher Education, Multiple Choice Tests, Test Format
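The abstract above does not say which adjustment was used to equate testing time. One standard classical-test-theory tool for projecting reliability to a different test length is the Spearman-Brown prophecy formula, sketched here as an illustration only; the function name and the example figures are hypothetical and are not taken from the study.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Project reliability when a test is lengthened or shortened.

    reliability   -- reliability of the existing test (0 to 1)
    length_factor -- new length divided by old length (e.g. 2.0 doubles the test)

    Assumes the added or removed items are parallel to the existing ones.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


if __name__ == "__main__":
    # Hypothetical figures: a test with reliability 0.60, doubled in length.
    print(round(spearman_brown(0.60, 2.0), 2))
```

A length factor below 1 projects the reliability of a shortened form, which is the direction most of the short-form studies in this listing are concerned with.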
Peer reviewed: Harlan, Elena; Clark, Lee Anna – Assessment, 1999
Reports the development of a paragraph-descriptor short form of the Schedule for Nonadaptive and Adaptive Personality (SNAP; L. Clark, 1993) with self- and other versions. Data from 294 college students, with parental ratings for 94 students, support the reliability and validity of the measure. (SLD)
Descriptors: Adjustment (to Environment), College Students, Higher Education, Parents
Peer reviewed: Mulhern, Fiona; Rae, Gordon – Educational and Psychological Measurement, 1998
Data from 196 Irish school children were analyzed and used to develop a shortened version of the Fennema-Sherman Mathematics Attitudes Scales (E. Fennema and J. Sherman, 1976). Internal consistency estimates of the reliability of scores on the whole scale and each of the subscales of the original and short form were favorable. (SLD)
Descriptors: Attitude Measures, Elementary Education, Elementary School Students, Foreign Countries
Livingston, Samuel A. – 1984
Much previously published material for estimating the reliability of classification has been based on the assumption that a test consists of a known number of equally weighted items. The test score is the number of those items answered correctly. These methods cannot be used with classifications based on weighted composite scores, especially if…
Descriptors: Equated Scores, Essay Tests, Estimation (Mathematics), Mathematical Models
Peer reviewed: Hambleton, Ronald K., Ed. – Applied Psychological Measurement, 1980
This special issue covers recent technical developments in the field of criterion-referenced testing. An introduction, six papers, and two commentaries dealing with test development, test score uses, and evaluation of scores review relevant literature, offer new models and/or results, and suggest directions for additional research. (SLD)
Descriptors: Criterion Referenced Tests, Mastery Tests, Measurement Techniques, Standard Setting (Scoring)
Peer reviewed: Valencia, Richard R.; Rankin, Richard J. – Educational and Psychological Measurement, 1983
The concurrent validity and reliability of Kaufman's short-form version of the McCarthy Scales of Children's Abilities were examined for a sample of 342 Mexican-American preschool and kindergarten-age children. The results showed that generally the positive psychometric properties of the Kaufman short form were also noted for the children in this…
Descriptors: High Risk Students, Mexican Americans, Preschool Education, Preschool Tests
Peer reviewed: Kristof, Walter – Psychometrika, 1971
Descriptors: Cognitive Measurement, Error of Measurement, Mathematical Models, Psychological Testing
Burton, Richard F. – Assessment and Evaluation in Higher Education, 2005
Examiners seeking guidance on multiple-choice and true/false tests are likely to encounter various faulty or questionable ideas. Twelve of these are discussed in detail, having to do mainly with the effects on test reliability of test length, guessing and scoring method (i.e. number-right scoring or negative marking). Some misunderstandings could…
Descriptors: Guessing (Tests), Multiple Choice Tests, Objective Tests, Test Reliability
Gilmer, Jerry S.; Feldt, Leonard S. – 1982
The Feldt-Gilmer congeneric reliability coefficients make it possible to estimate the reliability of a test composed of parts of unequal, unknown length. The approximate standard errors of the Feldt-Gilmer coefficients are derived via a method using the multivariate Taylor's expansion. Monte Carlo simulation is employed to corroborate the…
Descriptors: Educational Testing, Error of Measurement, Mathematical Formulas, Mathematical Models
Phillips, Phyllis P.; Halpin, Gerald – 1975
Because it generally took over an hour to administer the Porch Index of Communicative Ability (PICA), a shorter but comparable version of the test was developed. The original test was designed to quantify aphasic patients' ability level on common communicative tasks and consisted of 18 ten-item subtests. Each item resulted in a proficiency rating,…
Descriptors: Adults, Aphasia, Equated Scores, Language Handicaps
Watkins, John M.; And Others – 1978
Generalizability theory was applied to the Matching Familiar Figures Test (MFF), an instrument commonly employed to assess reflection-impulsivity in children, in order to analyze the dependability of the MFF at four grade levels: second, third, fourth, and fifth. The MFF was individually administered to 114 boys. A completely crossed, two-facet…
Descriptors: Age Differences, Cognitive Development, Cognitive Tests, Conceptual Tempo
Multiple Choice and True/False Tests: Reliability Measures and Some Implications of Negative Marking
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004
The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…
Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items
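As a rough illustration of the confidence limits mentioned in the abstract above, the textbook classical-test-theory definition of the standard error of measurement is SD × sqrt(1 − reliability). The sketch below uses that general definition; it is not the article's own measure, and the function names, the normal-theory multiplier z = 1.96, and the example figures are assumptions made for illustration.

```python
import math


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """Classical SEM: the score standard deviation scaled by sqrt(1 - reliability)."""
    if score_sd < 0:
        raise ValueError("score_sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


def score_interval(score: float, score_sd: float, reliability: float,
                   z: float = 1.96) -> tuple[float, float]:
    """Approximate confidence limits around an observed score.

    z = 1.96 gives roughly 95% limits under a normal-error assumption.
    """
    sem = standard_error_of_measurement(score_sd, reliability)
    return (score - z * sem, score + z * sem)


if __name__ == "__main__":
    # Hypothetical test: score SD of 10 points, reliability 0.84.
    low, high = score_interval(50.0, 10.0, 0.84)
    print(f"SEM = {standard_error_of_measurement(10.0, 0.84):.2f}")
    print(f"limits = ({low:.2f}, {high:.2f})")
```

Because the reliability coefficient depends on the spread of examinee scores, two tests with the same SEM can report different reliabilities, which is the comparison problem the abstract raises.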
Haladyna, Tom; Roid, Gale – 1981
Two approaches to criterion-referenced test construction are compared. Classical test theory is based on the practice of random sampling from a well-defined domain of test items; latent trait theory suggests that the difficulty of the items should be matched to the achievement level of the student. In addition to these two methods of test…
Descriptors: Criterion Referenced Tests, Error of Measurement, Latent Trait Theory, Test Construction
Myers, Charles T. – 1978
The viewpoint is expressed that adding to test reliability by either selecting a more homogeneous set of items, restricting the range of item difficulty as closely as possible to the most efficient level, or increasing the number of items will not add to test validity and that there is considerable danger that efforts to increase reliability may…
Descriptors: Achievement Tests, Item Analysis, Multiple Choice Tests, Test Construction

