ERIC - Search Results

Showing all 7 results

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Detection and Treatment of Careless Responses to Improve Item Parameter Estimation

Peer reviewed

Patton, Jeffrey M.; Cheng, Ying; Hong, Maxwell; Diao, Qi – Journal of Educational and Behavioral Statistics, 2019

In psychological and survey research, the prevalence and serious consequences of careless responses from unmotivated participants are well known. In this study, we propose to iteratively detect careless responders and cleanse the data by removing their responses. The careless responders are detected using person-fit statistics. In two simulation…

Descriptors: Test Items, Response Style (Tests), Identification, Computation

Evaluating the Consistency of Angoff-Based Cut Scores Using Subsets of Items within a Generalizability Theory Framework

Peer reviewed

Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015

The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…

Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items

Assessing Goodness of Fit in Item Response Theory with Nonparametric Models: A Comparison of Posterior Probabilities and Kernel-Smoothing Approaches

Peer reviewed

Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011

The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…

Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability

Evaluation of Methods to Compute Complex Sample Standard Errors in Latent Regression Models. Research Report. ETS RR-09-49

Peer reviewed

Oranje, Andreas; Li, Deping; Kandathil, Mathew – ETS Research Report Series, 2009

Several complex sample standard error estimators based on linearization and resampling for the latent regression model of the National Assessment of Educational Progress (NAEP) are studied with respect to design choices such as number of items, number of regressors, and the efficiency of the sample. This paper provides an evaluation of the extent…

Descriptors: Error of Measurement, Computation, Regression (Statistics), National Competency Tests

On the Theory of a Set of Tests Which Differ Only in Length

Peer reviewed

Kristof, Walter – Psychometrika, 1971

Descriptors: Cognitive Measurement, Error of Measurement, Mathematical Models, Psychological Testing

An Investigation of Methods for Reducing Sampling Error in Certain IRT Procedures.

Wingersky, Marilyn S.; Lord, Frederic M. – 1983

The sampling errors of maximum likelihood estimates of item-response theory parameters are studied in the case where both people and item parameters are estimated simultaneously. A check on the validity of the standard error formulas is carried out. The effect of varying sample size, test length, and the shape of the ability distribution is…

Descriptors: Error of Measurement, Estimation (Mathematics), Item Banks, Latent Trait Theory
