Wiberg, Marie – International Journal of Testing, 2006
A simulation study of a sequential computerized mastery test is carried out with items modeled with the 3 parameter logistic item response theory model. The examinees' responses are either identically distributed, not identically distributed, or not identically distributed together with estimation errors in the item characteristics. The…
Descriptors: Test Length, Computer Simulation, Mastery Tests, Item Response Theory
Brennan, Robert L. – 1990
In 1955, R. Levine introduced two linear equating procedures for the common-item non-equivalent populations design. His procedures make the same assumptions about true scores; they differ in terms of the nature of the equating function used. In this paper, two parameterizations of a classical congeneric model are introduced to model the variables…
Descriptors: Equated Scores, Equations (Mathematics), Mathematical Models, Research Design
Wilcox, Rand R. – 1980
Wilcox (1977) examines two methods of estimating the probability of a false-positive or false-negative decision with a mastery test. Both procedures make assumptions about the form of the true score distribution which might not give good results in all situations. In this paper, upper and lower bounds on the two possible error types are described…
Descriptors: Cutting Scores, Mastery Tests, Mathematical Models, Student Placement
Serlin, Ronald C.; Kaiser, Henry F. – Educational and Psychological Measurement, 1978 (peer reviewed)
When multiple-choice tests are scored in the usual manner, giving each correct answer one point, information concerning response patterns is lost. A method for utilizing this information is suggested. An example is presented and compared with two conventional methods of scoring. (Author/JKS)
Descriptors: Correlation, Factor Analysis, Item Analysis, Multiple Choice Tests
Taylor, James B. – Educational and Psychological Measurement, 1977 (peer reviewed)
The reliability and item homogeneity of personality scales are in part dependent on the content domain being sampled, and this characteristic reliability cannot be explained by item ambiguity or scale length. It is suggested that clarity of self concept is also a determinant. (Author/JKS)
Descriptors: Item Analysis, Personality Assessment, Personality Measures, Personality Theories
Strommen, Erik F.; Smith, Jeffrey K. – Educational and Psychological Measurement, 1987 (peer reviewed)
The internal consistency of the Goodenough-Harris Draw-A-Person Test was examined using 150 children, aged 5-8. The 72-item full scales showed good internal consistency at all ages, with no sex differences. Administration of a 42-item short form resulted in sex effects and differential internal consistency. (Author/GDC)
Descriptors: Freehand Drawing, Primary Education, Sex Differences, Test Bias
Safrit, Margaret J.; And Others – Research Quarterly for Exercise and Sport, 1985 (peer reviewed)
Constraints on criterion-referenced tests to make mastery/nonmastery classifications of motor skills can lead to excessively long tests. A sequential probability ratio test classified many subjects' golf shots quickly but required many trials for four subjects. The test's classification accuracy makes it a potentially useful device for physical…
Descriptors: Criterion Referenced Tests, Golf, Higher Education, Mastery Tests
Sudman, Seymour; Bradburn, Norman – New Directions for Program Evaluation, 1984 (peer reviewed)
Situations in which mailed questionnaires are most appropriate are identified. Population variables, characteristics of questionnaires, and social desirability variables are examined in depth. (Author)
Descriptors: Attitude Measures, Evaluation Methods, Program Evaluation, Research Methodology
Neustel, Sandra – 2001
As a continuing part of its validity studies, the Association of American Medical Colleges commissioned a study of the speediness of the Medical College Admission Test (MCAT). If speed is a hidden part of the test, it is a threat to its construct validity. As a general rule, the criterion used to indicate lack of speediness is that 80% of the…
Descriptors: College Applicants, College Entrance Examinations, Higher Education, Medical Education
Ito, Kyoko; Sykes, Robert C. – 2000
This study investigated the practice of weighting a type of test item, such as constructed response, more than other types of items, such as selected response, to compute student scores for a mixed-item type of test. The study used data from statewide writing field tests in grades 3, 5, and 8 and considered two contexts, that in which a single…
Descriptors: Constructed Response, Elementary Education, Essay Tests, Test Construction
Berk, Ronald A. – Journal of Experimental Education, 1980 (peer reviewed)
A sampling methodology is proposed for determining lengths of tests designed to assess the comprehension of written discourse. It is based on Bormuth's transformational analysis, within a domain-referenced framework. Guidelines are provided for computing sample size and selecting sentences to which the transformational rules can be applied.…
Descriptors: Reading Comprehension, Reading Tests, Sampling, Test Construction
De Ayala, R. J. – Applied Psychological Measurement, 1994 (peer reviewed)
Previous work on the effects of dimensionality on parameter estimation for dichotomous models is extended to the graded response model. Datasets are generated that differ in the number of latent factors as well as their interdimensional association, number of test items, and sample size. (SLD)
Descriptors: Estimation (Mathematics), Item Response Theory, Maximum Likelihood Statistics, Sample Size
Livingston, Samuel A.; Lewis, Charles – Journal of Educational Measurement, 1995 (peer reviewed)
A method is presented for estimating the accuracy and consistency of classifications based on test scores. The reliability of the score is used to estimate effective test length in terms of discrete items. The true-score distribution is estimated by fitting a four-parameter beta model. (SLD)
Descriptors: Classification, Estimation (Mathematics), Scores, Statistical Distributions
Qualls, Audrey L. – Applied Measurement in Education, 1995 (peer reviewed)
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
Paolo, Anthony M.; Ryan, Joseph J. – Psychological Assessment, 1993 (peer reviewed)
The Satz-Mogel Abbreviation of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) was compared with a 7-subtest short form for 130 healthy and 40 neurologically impaired older adults. Both short forms performed similarly for normal and impaired adults in comparison with the full WAIS-R. (SLD)
Descriptors: Comparative Testing, Intelligence Tests, Neurological Impairments, Older Adults

