Publication Date
| Date range | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 1 |
| Since 2007 (last 20 years) | 1 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Test Reliability | 11 |
| Scores | 5 |
| Latent Trait Theory | 4 |
| Test Items | 4 |
| Item Response Theory | 3 |
| College Entrance Examinations | 2 |
| Correlation | 2 |
| Cutting Scores | 2 |
| Error of Measurement | 2 |
| Item Analysis | 2 |
| Item Bias | 2 |
Source
| Source | Records |
| --- | --- |
| Applied Measurement in… | 2 |
| Education Policy Analysis… | 1 |
| Educational and Psychological… | 1 |
| Journal of College Admissions | 1 |
| Journal of Educational… | 1 |
| Journal of Educational… | 1 |
| Journal of Educational and… | 1 |
Author
| Author | Records |
| --- | --- |
| Wainer, Howard | 11 |
| Grabovsky, Irina | 1 |
| Holland, Paul W. | 1 |
| Lukhele, Robert | 1 |
| Morgan, Anne | 1 |
| Thissen, David | 1 |
Publication Type
| Type | Records |
| --- | --- |
| Journal Articles | 8 |
| Reports - Research | 5 |
| Reports - Evaluative | 3 |
| Book/Product Reviews | 1 |
| Information Analyses | 1 |
| Opinion Papers | 1 |
| Speeches/Meeting Papers | 1 |
Audience
| Audience | Records |
| --- | --- |
| Researchers | 1 |
Location
| Location | Records |
| --- | --- |
| Massachusetts | 1 |
Assessments and Surveys
| Assessment | Records |
| --- | --- |
| SAT (College Admission Test) | 2 |
| Test of English as a Foreign… | 1 |
Grabovsky, Irina; Wainer, Howard – Journal of Educational and Behavioral Statistics, 2017
In this article, we extend the methodology of the Cut-Score Operating Function that we introduced previously and apply it to a testing scenario with multiple independent components and different testing policies. We derive analytically the overall classification error rate for a test battery under the policy when several retakes are allowed for…
Descriptors: Cutting Scores, Weighted Scores, Classification, Testing
Peer reviewed: Wainer, Howard – Journal of Educational Measurement, 1986
An example demonstrates and explains that summary statistics commonly used to measure test quality can be seriously misleading and that summary statistics for the whole test are not sufficient for judging the quality of the test. (Author/LMO)
Descriptors: Correlation, Item Analysis, Statistical Bias, Statistical Studies
Wainer, Howard – 1982
This paper is the transcript of a talk given to those who use test information but have little technical background in test theory. The concepts of modern test theory are compared with those of traditional test theory and with a probable future test theory. The explanations given are couched within an extended metaphor that allows a full description…
Descriptors: Difficulty Level, Latent Trait Theory, Metaphors, Test Items
Peer reviewed: Wainer, Howard; Lukhele, Robert – Educational and Psychological Measurement, 1997
The reliability of scores from four forms of the Test of English as a Foreign Language (TOEFL) was estimated using a hybrid item response theory model. It was found that there was very little difference between overall reliability when the testlet items were assumed to be independent and when their dependence was modeled. (Author/SLD)
Descriptors: English (Second Language), Item Response Theory, Scores, Second Language Learning
Wainer, Howard – 1985
Techniques derived from item response theory are useful for estimating the reliability of test classification above and below the cutting score. Test developers can construct a test whose information is peaked in the region of the cutting score; users can select a test which provides the most information in this region. The Cut-Score…
Descriptors: Cutting Scores, Item Analysis, Latent Trait Theory, Mastery Tests
Wainer, Howard; And Others – 1991
It is sometimes sensible to think of the fundamental unit of test construction as being larger than an individual item. This unit, dubbed the testlet, must pass muster in the same way that items do. One criterion of a good item is the absence of differential item functioning (DIF). The item must function in the same way as all important…
Descriptors: Definitions, Identification, Item Bias, Item Response Theory
Peer reviewed: Morgan, Anne; Wainer, Howard – Journal of Educational Statistics, 1980
Two estimation procedures for the Rasch Model of test analysis are reviewed in detail, particularly with respect to new developments that make the more statistically rigorous conditional maximum likelihood estimation practical for use with longish tests. (Author/JKS)
Descriptors: Error of Measurement, Latent Trait Theory, Maximum Likelihood Statistics, Psychometrics
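The conditional approach reviewed here rests on the fact that, under the Rasch model, the raw score is sufficient for ability, so ability drops out of the likelihood conditioned on the score. A minimal sketch of that conditional probability, using the standard elementary-symmetric-function recursion (the difficulties below are hypothetical; this is not the authors' implementation):

```python
import math

def elem_symmetric(eps):
    """Elementary symmetric functions gamma_r of eps_i = exp(-b_i),
    computed with the usual in-place summation recursion."""
    gamma = [1.0] + [0.0] * len(eps)
    for e in eps:
        for r in range(len(gamma) - 1, 0, -1):
            gamma[r] += e * gamma[r - 1]
    return gamma

def conditional_prob(pattern, difficulties):
    """Conditional probability of a 0/1 response pattern given its raw
    score; the person's ability cancels, which is what makes conditional
    maximum likelihood estimation possible."""
    eps = [math.exp(-b) for b in difficulties]
    r = sum(pattern)
    num = math.prod(e for e, x in zip(eps, pattern) if x == 1)
    return num / elem_symmetric(eps)[r]

# Two equally difficult items: each pattern with raw score 1 is equally likely.
print(conditional_prob([1, 0], [0.0, 0.0]))  # 0.5
```

CML maximizes the product of these conditional probabilities over persons; the recursion above is what makes that practical for longer tests.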
Peer reviewed: Wainer, Howard – Education Policy Analysis Archives, 1999
The critique of the Massachusetts Teacher Tests by W. Haney and others points out some flaws in the tests but ignores the fact that the tests provide some useful information to guide teacher selection decisions. Calls for additional study of these teacher evaluation instruments. (SLD)
Descriptors: Beginning Teachers, Elementary Secondary Education, State Programs, Teacher Evaluation
Peer reviewed: Wainer, Howard; Thissen, David – Applied Measurement in Education, 1993
Because assessment instruments of the future may well be composed of a combination of types of questions, a way to combine those scores effectively is discussed. Two new graphic tools are presented that show that it may not be practical to equalize the reliability of different components. (SLD)
Descriptors: Constructed Response, Educational Assessment, Graphs, Item Response Theory
Peer reviewed: Holland, Paul W.; Wainer, Howard – Applied Measurement in Education, 1990
Two attempts to adjust state mean Scholastic Aptitude Test (SAT) scores for differential participation rates are examined. Both attempts are rejected, and five rules for performing adjustments are outlined to foster follow-up checks on untested assumptions. National Assessment of Educational Progress state data are determined to be more accurate.…
Descriptors: College Applicants, College Entrance Examinations, Estimation (Mathematics), Item Bias
Peer reviewed: Wainer, Howard – Journal of College Admissions, 1983
Discusses changes in testing as a result of the availability of extensive inexpensive computing and some recent developments in statistical test theory. Describes the role of the Computerized Adaptive Test (CAT) and modern Item Response Theory (IRT) in ability testing tailored to each student's knowledge and ability. (JAC)
Descriptors: Cognitive Ability, College Entrance Examinations, Computer Assisted Testing, Higher Education
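The tailoring that CAT performs can be sketched in a few lines: at each step, administer the unused item that carries the most Fisher information at the current ability estimate. The item bank below is hypothetical, and a real CAT would also update the ability estimate after each response:

```python
import math

def rasch_info(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def next_item(theta_hat, item_bank, administered):
    """Maximum-information item selection: pick the unused item whose
    difficulty is most informative at the current ability estimate."""
    candidates = [i for i in range(len(item_bank)) if i not in administered]
    return max(candidates, key=lambda i: rasch_info(theta_hat, item_bank[i]))

bank = [-2.0, -1.0, 0.0, 1.0, 2.0]   # hypothetical item difficulties
print(next_item(0.2, bank, {2}))     # item 2 (b=0.0) used, so item 3 (b=1.0) is next
```

Because information is maximized when difficulty is near ability, the selector keeps choosing items matched to the examinee, which is what lets an adaptive test reach a given precision with far fewer items than a fixed form.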
