Peer reviewed: Serlin, Ronald C.; Kaiser, Henry F. – Educational and Psychological Measurement, 1978
When multiple-choice tests are scored in the usual manner, giving each correct answer one point, information concerning response patterns is lost. A method for utilizing this information is suggested. An example is presented and compared with two conventional methods of scoring. (Author/JKS)
Descriptors: Correlation, Factor Analysis, Item Analysis, Multiple Choice Tests
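The point Serlin and Kaiser raise can be seen with a tiny sketch: number-correct scoring collapses each examinee's row of item scores into a single count, discarding the response pattern. The data below are invented for illustration and the sketch is not the authors' actual pattern-scoring method.

```python
# Conventional number-correct scoring discards the response pattern;
# keeping the full 0/1 item-score matrix preserves it.

responses = [  # rows: examinees, cols: items (1 = correct)
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]

# Number-correct scores: one point per correct answer.
number_correct = [sum(row) for row in responses]
print(number_correct)  # [3, 3, 2]

# The first two examinees tie on number-correct, yet their patterns differ:
patterns = ["".join(str(x) for x in row) for row in responses]
print(patterns)  # ['1011', '1101', '0011']
```

Any scoring method that uses the pattern matrix, rather than only its row sums, retains the information the abstract refers to.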
Peer reviewed: Taylor, James B. – Educational and Psychological Measurement, 1977
The reliability and item homogeneity of personality scales are in part dependent on the content domain being sampled, and this characteristic reliability cannot be explained by item ambiguity or scale length. It is suggested that clarity of self concept is also a determinant. (Author/JKS)
Descriptors: Item Analysis, Personality Assessment, Personality Measures, Personality Theories
Peer reviewed: Strommen, Erik F.; Smith, Jeffrey K. – Educational and Psychological Measurement, 1987
The internal consistency of the Goodenough-Harris Draw-A-Person Test was examined using 150 children, aged 5-8. The 72-item full scales showed good internal consistency at all ages, with no sex differences. Administration of a 42-item short form resulted in sex effects and differential internal consistency. (Author/GDC)
Descriptors: Freehand Drawing, Primary Education, Sex Differences, Test Bias
Peer reviewed: Safrit, Margaret J.; And Others – Research Quarterly for Exercise and Sport, 1985
Constraints on criterion-referenced tests to make mastery/nonmastery classifications of motor skills can lead to excessively long tests. A sequential probability ratio test classified many subjects' golf shots quickly but required many trials for four subjects. The test's classification accuracy makes it a potentially useful device for physical…
Descriptors: Criterion Referenced Tests, Golf, Higher Education, Mastery Tests
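A sequential probability ratio test of the kind the abstract describes can be sketched with Wald's classic Bernoulli SPRT: each trial (here, a golf shot scored hit/miss) updates a log-likelihood ratio until a mastery or nonmastery boundary is crossed. The success rates `p0`/`p1` and error rates `alpha`/`beta` below are illustrative values, not those used by Safrit et al.

```python
import math

def sprt_mastery(trials, p0=0.4, p1=0.7, alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test on Bernoulli trials.
    p0 = success rate under nonmastery, p1 = under mastery (assumed values).
    Returns ('master' | 'nonmaster' | 'undecided', number of trials used)."""
    upper = math.log((1 - beta) / alpha)   # cross above: classify as master
    lower = math.log(beta / (1 - alpha))   # cross below: classify as nonmaster
    llr = 0.0
    for n, hit in enumerate(trials, start=1):
        if hit:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "master", n
        if llr <= lower:
            return "nonmaster", n
    return "undecided", len(trials)

print(sprt_mastery([1, 1, 1, 1, 1, 1, 1]))  # ('master', 6)
print(sprt_mastery([0, 0, 0, 0, 0, 0]))     # ('nonmaster', 5)
```

This shows the property the abstract highlights: a consistent performer is classified after only a handful of trials, while a borderline examinee can run through many trials before a boundary is crossed.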
Peer reviewed: Sudman, Seymour; Bradburn, Norman – New Directions for Program Evaluation, 1984
Situations in which mailed questionnaires are most appropriate are identified. Population variables, characteristics of questionnaires, and social desirability variables are examined in depth. (Author)
Descriptors: Attitude Measures, Evaluation Methods, Program Evaluation, Research Methodology
Neustel, Sandra – 2001
As a continuing part of its validity studies, the Association of American Medical Colleges commissioned a study of the speediness of the Medical College Admission Test (MCAT). If speed is a hidden part of the test, it is a threat to its construct validity. As a general rule, the criterion used to indicate lack of speediness is that 80% of the…
Descriptors: College Applicants, College Entrance Examinations, Higher Education, Medical Education
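The rule of thumb the abstract cites (a test is not considered speeded if at least 80% of examinees reach the end) is easy to operationalize. This is a minimal sketch of that single criterion with invented data; the full speediness analysis in the report has additional components.

```python
def speediness_check(attempted_counts, n_items, threshold=0.80):
    """Apply the 80%-completion rule of thumb for test speediness.
    `attempted_counts` = number of items each examinee attempted.
    Returns (proportion reaching the last item, whether the rule is met).
    The threshold default is the criterion quoted in the abstract."""
    reached_end = sum(1 for a in attempted_counts if a >= n_items)
    prop = reached_end / len(attempted_counts)
    return prop, prop >= threshold

# Hypothetical data: 4 of 5 examinees finish all 50 items.
print(speediness_check([50, 50, 48, 50, 50], n_items=50))  # (0.8, True)
```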
Ito, Kyoko; Sykes, Robert C. – 2000
This study investigated the practice of weighting a type of test item, such as constructed response, more than other types of items, such as selected response, to compute student scores for a mixed-item type of test. The study used data from statewide writing field tests in grades 3, 5, and 8 and considered two contexts, that in which a single…
Descriptors: Constructed Response, Elementary Education, Essay Tests, Test Construction
Peer reviewed: Berk, Ronald A. – Journal of Experimental Education, 1980
A sampling methodology is proposed for determining lengths of tests designed to assess the comprehension of written discourse. It is based on Bormuth's transformational analysis, within a domain-referenced framework. Guidelines are provided for computing sample size and selecting sentences to which the transformational rules can be applied.…
Descriptors: Reading Comprehension, Reading Tests, Sampling, Test Construction
Peer reviewed: De Ayala, R. J. – Applied Psychological Measurement, 1994
Previous work on the effects of dimensionality on parameter estimation for dichotomous models is extended to the graded response model. Datasets are generated that differ in the number of latent factors as well as their interdimensional association, number of test items, and sample size. (SLD)
Descriptors: Estimation (Mathematics), Item Response Theory, Maximum Likelihood Statistics, Sample Size
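Generating data under the graded response model, as this study does at scale, follows Samejima's formulation: boundary curves P(X >= k | theta) are logistic in theta, and category probabilities are differences of adjacent boundaries. The sketch below draws single-factor responses for one item; the parameter values are illustrative, and the study's multidimensional generation design is not reproduced here.

```python
import math
import random

def simulate_grm(theta, a, bs, rng):
    """Draw one graded response in {0, ..., len(bs)} under the GRM.
    a = discrimination, bs = ordered category boundary locations.
    P(X >= k | theta) = 1 / (1 + exp(-a * (theta - b_k)))."""
    def p_ge(b):
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))
    # Cumulative boundary probabilities, padded so differencing covers all categories.
    cum = [1.0] + [p_ge(b) for b in bs] + [0.0]
    probs = [cum[k] - cum[k + 1] for k in range(len(bs) + 1)]
    u, acc = rng.random(), 0.0
    for k, p in enumerate(probs):
        acc += p
        if u <= acc:
            return k
    return len(bs)

rng = random.Random(0)
# 1000 simulated responses to one 4-category item (illustrative parameters).
data = [simulate_grm(theta=0.5, a=1.2, bs=[-1.0, 0.0, 1.0], rng=rng)
        for _ in range(1000)]
print(len(data), min(data), max(data))
```

Repeating this over multiple items, examinees, and latent dimensions yields datasets like those the study varies by factor structure, test length, and sample size.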
Peer reviewed: Livingston, Samuel A.; Lewis, Charles – Journal of Educational Measurement, 1995
A method is presented for estimating the accuracy and consistency of classifications based on test scores. The reliability of the score is used to estimate effective test length in terms of discrete items. The true-score distribution is estimated by fitting a four-parameter beta model. (SLD)
Descriptors: Classification, Estimation (Mathematics), Scores, Statistical Distributions
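The "effective test length" idea rests on the classical link between reliability and test length. The sketch below is not the Livingston-Lewis procedure itself (which also fits a four-parameter beta true-score distribution), only the standard Spearman-Brown relation that underlies it, with illustrative numbers.

```python
def spearman_brown(r, k):
    """Reliability of a test k times as long as one with reliability r
    (classical Spearman-Brown prophecy formula)."""
    return k * r / (1 + (k - 1) * r)

def length_factor_for(r, target):
    """Solve Spearman-Brown for the lengthening factor k needed to move
    reliability from r to `target`."""
    return target * (1 - r) / (r * (1 - target))

print(round(spearman_brown(0.80, 2), 3))        # 0.889
print(round(length_factor_for(0.80, 0.90), 2))  # 2.25
```

In the Livingston-Lewis method, a relation of this kind converts the observed reliability into a length in discrete items, which then feeds the classification accuracy and consistency estimates.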
Peer reviewed: Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
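One standard part-test reliability estimate for mixed-format tests is stratified alpha, which weights each format's unreliability by its score variance. The sketch below uses that standard formula with hypothetical part statistics; the article's contribution is the discussion of which parallelism model (classical, tau-equivalent, congeneric) justifies such an estimate, which is not reproduced here.

```python
def stratified_alpha(parts, total_variance):
    """Stratified coefficient alpha for a test scored as a sum of parts:
        alpha_s = 1 - sum(var_i * (1 - alpha_i)) / var_total
    `parts` = [(part_score_variance, part_alpha), ...]."""
    unreliable_variance = sum(v * (1 - a) for v, a in parts)
    return 1 - unreliable_variance / total_variance

# Hypothetical mixed-format test: a multiple-choice section
# (variance 16, alpha .85) and an essay section (variance 9, alpha .70),
# with total-score variance 36.
print(round(stratified_alpha([(16, 0.85), (9, 0.70)], total_variance=36), 3))  # 0.858
```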
Peer reviewed: Paolo, Anthony M.; Ryan, Joseph J. – Psychological Assessment, 1993
The Satz-Mogel Abbreviation of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) was compared with a 7-subtest short form of 130 healthy and 40 neurologically impaired older adults. Both short forms were found similar for normal or impaired adults in comparison with the full WAIS-R. (SLD)
Descriptors: Comparative Testing, Intelligence Tests, Neurological Impairments, Older Adults
Peer reviewed: Fitzpatrick, Anne R.; Yen, Wendy M. – Applied Measurement in Education, 2001
Examined the effects of test length and sample size on the alternate forms reliability and equating of simulated mathematics tests composed of constructed response items scaled using the two-parameter partial credit model. Results suggest that, to obtain acceptable reliabilities and accurate equated scores, tests should have at least 8 6-point…
Descriptors: Constructed Response, Equated Scores, Mathematics Tests, Reliability
DeMars, Christine E. – Educational and Psychological Measurement, 2005
Type I error rates for PARSCALE's fit statistic were examined. Data were generated to fit the partial credit or graded response model, with test lengths of 10 or 20 items. The ability distribution was simulated to be either normal or uniform. Type I error rates were inflated for the shorter test length and, for the graded-response model, also for…
Descriptors: Test Length, Item Response Theory, Psychometrics, Error of Measurement
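The logic of a Type I error study like this one is simple: generate data under the null model, apply the fit test at a nominal level, and count rejections; the rate should match the nominal level, and "inflation" means it runs well above it. The sketch below illustrates only that counting logic with uniform null p-values, not PARSCALE's fit statistic.

```python
import random

def estimated_type1_rate(n_replications, alpha=0.05, seed=1):
    """Monte Carlo estimate of a Type I error rate. Under a true null
    model, p-values are uniform on (0, 1), so the rejection rate at
    level alpha should be close to alpha; a well-calibrated fit
    statistic behaves the same way."""
    rng = random.Random(seed)
    rejections = sum(rng.random() < alpha for _ in range(n_replications))
    return rejections / n_replications

rate = estimated_type1_rate(10_000)
print(rate)
```

In the actual study, the same tally is computed from fit statistics on data simulated under the partial credit or graded response model, across the test-length and ability-distribution conditions.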
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models

