Showing 1 to 15 of 20 results
Peer reviewed
Sakworawich, Arnond; Wainer, Howard – Journal of Educational and Behavioral Statistics, 2020
Test scoring models vary in their generality; some even adjust for examinees answering multiple-choice items correctly by accident (guessing), but no models that we are aware of automatically adjust an examinee's score when there is internal evidence of cheating. In this study, we use a combination of jackknife technology with an adaptive robust…
Descriptors: Scoring, Cheating, Test Items, Licensing Examinations (Professions)
Peer reviewed
Clauser, Amanda L.; Wainer, Howard – Educational Measurement: Issues and Practice, 2016
It is widely accepted dogma that consequential decisions are better made with multiple measures, because using but a single one is thought more likely to be laden with biases and errors that can be better controlled with a wider source of evidence for making judgments. Unfortunately, advocates of using multiple measures too rarely provide detailed…
Descriptors: Tests, Examiners, College Entrance Examinations, Measurement
Peer reviewed
Wainer, Howard; Bradlow, Eric; Wang, Xiaohui – Journal of Educational and Behavioral Statistics, 2010
Confucius pointed out that the first step toward wisdom is calling things by the right name. The term "Differential Item Functioning" (DIF) did not arise fully formed from the miasma of psychometrics; it evolved from a variety of less accurate terms. Among its forebears was "item bias," but that term has a pejorative connotation…
Descriptors: Test Bias, Difficulty Level, Test Items, Statistical Analysis
Peer reviewed
Wainer, Howard – Journal of Educational and Behavioral Statistics, 2010
In this essay, the author tries to look forward into the 21st century to divine three things: (i) What skills will researchers in the future need to solve the most pressing problems? (ii) What are some of the most likely candidates to be those problems? and (iii) What are some current areas of research that seem mined out and should not distract…
Descriptors: Research Skills, Researchers, Internet, Access to Information
Peer reviewed
Wainer, Howard; Thissen, David – Psychometrika, 1975
In this study of robust regression techniques, it was found that the jackknife performs particularly poorly in estimating a correlation when there are sharp deviations from normality. A simple example is provided. (RC)
Descriptors: Correlation, Statistical Analysis, Statistical Bias
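A minimal Python sketch (not the paper's code) of the jackknife estimate of a correlation, built from leave-one-out pseudo-values, shows the kind of breakdown under heavy-tailed departures from normality that the abstract describes; the sample size, contamination scheme, and seed below are illustrative assumptions.

```python
import numpy as np

def jackknife_corr(x, y):
    """Jackknife estimate of a correlation via leave-one-out pseudo-values."""
    n = len(x)
    r_full = np.corrcoef(x, y)[0, 1]
    pseudo = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        r_loo = np.corrcoef(x[mask], y[mask])[0, 1]  # leave-one-out r
        pseudo[i] = n * r_full - (n - 1) * r_loo
    return pseudo.mean()

rng = np.random.default_rng(0)
n, rho = 30, 0.7
x, y = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n).T
print("clean normal data:", jackknife_corr(x, y))  # close to 0.7

# A few gross outliers mimic sharp deviations from normality; the
# jackknife estimate, like Pearson's r itself, can be pulled far off.
x[:3] += 10 * rng.standard_cauchy(3)
print("contaminated data:", jackknife_corr(x, y))
```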
Peer reviewed
Wainer, Howard – Journal of Educational Statistics, 1990
It is suggested that some of the technology applied to state Scholastic Aptitude Test scores to measure states' educational performance (particularly use of a truncated Gaussian model) may make it possible to adjust National Assessment of Educational Progress (NAEP) scores so that inferences about state educational progress can be drawn. (SLD)
Descriptors: Academic Achievement, Educational Assessment, Elementary Secondary Education, Mathematical Models
Peer reviewed
Wainer, Howard – Journal of Educational Measurement, 1986
Describes recent research attempts to draw inferences about the relative standing of the states on the basis of mean SAT scores. This paper identifies five serious errors that call into question the validity of such inferences. Some plausible ways to avoid the errors are described. (Author/LMO)
Descriptors: College Entrance Examinations, Equated Scores, Mathematical Models, Predictor Variables
Peer reviewed
Wainer, Howard – Journal of Educational Measurement, 1986
An example demonstrates and explains that summary statistics commonly used to measure test quality can be seriously misleading and that summary statistics for the whole test are not sufficient for judging the quality of the test. (Author/LMO)
Descriptors: Correlation, Item Analysis, Statistical Bias, Statistical Studies
Peer reviewed
Wainer, Howard; Lukhele, Robert – Applied Measurement in Education, 1997
The screening for flaws done for multiple-choice items is often not done for large items. Examines continuous item weighting as a way to manage the influence of differential item functioning (DIF). Data from the College Board Advanced Placement History Test are used to illustrate the method. (SLD)
Descriptors: Advanced Placement, College Entrance Examinations, History, Item Bias
Wainer, Howard – 1994
This study examined the Law School Admission Test (LSAT) through the use of testlet methods to model its inherent, locally dependent structure. Precision, measured by reliability, and fairness, measured by the comparability of performance across all identified subgroups of examinees, were the focus of the study. The polytomous item response theory…
Descriptors: College Entrance Examinations, Item Response Theory, Reading Comprehension, Reading Tests
Peer reviewed
Wainer, Howard; Thissen, David – Applied Psychological Measurement, 1979
A class of naive estimators of correlation was tested for robustness, accuracy, and efficiency against Pearson's r, Tukey's r, and Spearman's r. It was found that this class of estimators seems to be superior, being less affected by outliers, reasonably efficient, and frequently more easily calculated. (Author/CTM)
Descriptors: Comparative Analysis, Correlation, Goodness of Fit, Nonparametric Statistics
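As a rough illustration of the comparison the abstract describes, the sketch below contaminates bivariate normal samples with gross outliers and contrasts Pearson's r with the rank-based Spearman's r; the paper's specific naive estimators and Tukey's r are not reproduced here, and all simulation settings are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def trial(n=50, rho=0.6, n_outliers=2):
    """One contaminated sample; returns (Pearson r, Spearman r)."""
    cov = [[1.0, rho], [rho, 1.0]]
    x, y = rng.multivariate_normal([0, 0], cov, size=n).T
    x[:n_outliers] += 15.0  # plant gross outliers
    y[:n_outliers] -= 15.0
    return stats.pearsonr(x, y)[0], stats.spearmanr(x, y)[0]

results = np.array([trial() for _ in range(1000)])
print("mean Pearson  r:", results[:, 0].mean())  # dragged well below 0.6
print("mean Spearman r:", results[:, 1].mean())  # stays closer to 0.6
```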
Peer reviewed
Wainer, Howard – Applied Measurement in Education, 1995
Analysis of the 1991 Law School Admission Test (LSAT) shows that the testlet structure of the reading comprehension and analytic reasoning sections has a significant effect on the statistical characteristics of the test. The testlet-based reliability of these two sections is lower than had been previously calculated. (SLD)
Descriptors: Admission (School), Item Bias, Law Schools, Psychometrics
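The following simulation sketch, with entirely assumed data-generating values, illustrates the abstract's point: when items within a testlet share variance beyond the common trait (local dependence), coefficient alpha computed over individual items overstates reliability relative to alpha computed over testlet scores.

```python
import numpy as np

rng = np.random.default_rng(2)

def cronbach_alpha(scores):
    """Coefficient alpha for an (examinees x parts) score matrix."""
    k = scores.shape[1]
    part_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - part_vars / total_var)

n_people, n_testlets, items_per = 2000, 8, 5
theta = rng.normal(size=(n_people, 1, 1))                   # common trait
testlet = rng.normal(size=(n_people, n_testlets, 1))        # shared within a testlet
noise = rng.normal(size=(n_people, n_testlets, items_per))  # item-specific error

# Item score = trait + its testlet's shared effect + item noise.
items3d = theta + 0.8 * testlet + noise
items = items3d.reshape(n_people, n_testlets * items_per)
testlet_scores = items3d.sum(axis=2)

print("item-level alpha   :", cronbach_alpha(items))           # optimistic
print("testlet-level alpha:", cronbach_alpha(testlet_scores))  # lower
```

Treating each testlet sum as the scoring unit pushes the within-testlet shared variance out of the true-score covariance, which is why the testlet-based figure comes out lower, as the abstract reports.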
Wainer, Howard; Wright, Benjamin D. – 1980
The pure Rasch model was compared with four modifications of the model in a number of different simulations in order to ascertain the comparative efficiencies of the parameter estimations of these modifications. Because there is always noise in test score data, some individuals may have response patterns that do not fit the model and their…
Descriptors: Error of Measurement, Guessing (Tests), Item Analysis, Latent Trait Theory
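For context, here is a minimal sketch of the pure Rasch response function alongside one guessing-adjusted modification of the general kind the abstract compares; the fixed guessing floor c is an illustrative assumption, not one of the paper's actual variants.

```python
import math

def rasch_p(theta, b):
    """Pure Rasch model: P(correct | ability theta, item difficulty b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def rasch_with_guessing(theta, b, c=0.2):
    """Rasch probability with a guessing floor (a 3PL with discrimination 1)."""
    return c + (1.0 - c) * rasch_p(theta, b)

# A low-ability examinee on a hard multiple-choice item: the pure model
# predicts near-zero success, while the modified model keeps the
# chance-level floor that guessing produces in real response data.
print(rasch_p(-3.0, 2.0))              # ~0.007
print(rasch_with_guessing(-3.0, 2.0))  # ~0.205
```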
Peer reviewed
Wainer, Howard; And Others – Journal of Educational Statistics, 1985
In this paper, scores from the Department of Education's table, "State Education Statistics," are examined to see if they can be used for state-by-state comparisons to aid in the evaluation of educational policies that vary across states. (Author/LMO)
Descriptors: Educational Assessment, Educational Indicators, Multivariate Analysis, National Norms
Peer reviewed
Wainer, Howard; And Others – Journal of Educational Measurement, 1991
A testlet is an integrated group of test items presented as a unit. The concept of testlet differential item functioning (testlet DIF) is defined, and a statistical method is presented to detect testlet DIF. Data from a testlet-based experimental version of the Scholastic Aptitude Test illustrate the methodology. (SLD)
Descriptors: College Entrance Examinations, Definitions, Graphs, Item Bias