| Publication Date | Results |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 4 |
| Since 2007 (last 20 years) | 8 |

| Descriptor | Results |
| --- | --- |
| Comparative Analysis | 9 |
| Computation | 9 |
| Item Response Theory | 5 |
| Test Items | 4 |
| Methods | 3 |
| Accuracy | 2 |
| Adaptive Testing | 2 |
| Bayesian Statistics | 2 |
| Bias | 2 |
| Computer Assisted Testing | 2 |
| Foreign Countries | 2 |

| Source | Results |
| --- | --- |
| Applied Measurement in Education | 9 |

| Author | Results |
| --- | --- |
| Sinharay, Sandip | 2 |
| Andrich, David | 1 |
| Beretvas, S. Natasha | 1 |
| Cho, Sun-Joo | 1 |
| Dodd, Barbara G. | 1 |
| Haag, Nicole | 1 |
| Haberman, Shelby | 1 |
| Heldsinger, Sandra | 1 |
| Ho, Tsung-Han | 1 |
| Humphry, Stephen | 1 |
| Kim, Kyung Yong | 1 |

| Publication Type | Results |
| --- | --- |
| Journal Articles | 9 |
| Reports - Research | 8 |
| Reports - Evaluative | 1 |

| Education Level | Results |
| --- | --- |
| Secondary Education | 3 |
| Elementary Education | 2 |
| Grade 7 | 2 |
| Junior High Schools | 2 |
| Middle Schools | 2 |
| Elementary Secondary Education | 1 |
| Grade 4 | 1 |
| Grade 5 | 1 |
| Grade 6 | 1 |
| Grade 8 | 1 |
| Intermediate Grades | 1 |

| Assessments and Surveys | Results |
| --- | --- |
| Program for International Student Assessment (PISA) | 1 |
| Trends in International Mathematics and Science Study (TIMSS) | 1 |

Kim, Kyung Yong; Lee, Won-Chan – Applied Measurement in Education, 2017
This article provides a detailed description of three factors (specification of the ability distribution, numerical integration, and frame of reference for the item parameter estimates) that might affect the item parameter estimation of the three-parameter logistic model, and compares five item calibration methods, which are combinations of the…
Descriptors: Test Items, Item Response Theory, Comparative Analysis, Methods
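
The three-parameter logistic (3PL) model whose calibration is compared here has a standard item response function. As a point of reference, a minimal sketch in Python; the parameter names a, b, and c are the conventional discrimination, difficulty, and pseudo-guessing labels, and the example values are illustrative, not drawn from the article:

```python
import numpy as np

def irf_3pl(theta, a, b, c):
    """Three-parameter logistic (3PL) item response function:
    probability of a correct response given ability theta, with
    discrimination a, difficulty b, and pseudo-guessing c."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

# An examinee of average ability on an item of average difficulty
# with a 20% guessing floor answers correctly 60% of the time.
print(irf_3pl(theta=0.0, a=1.2, b=0.0, c=0.2))  # 0.6
```
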
Sachse, Karoline A.; Haag, Nicole – Applied Measurement in Education, 2017
Standard errors computed according to the operational practices of international large-scale assessment studies such as the Programme for International Student Assessment (PISA) or the Trends in International Mathematics and Science Study (TIMSS) may be biased when cross-national differential item functioning (DIF) and item parameter drift are…
Descriptors: Error of Measurement, Test Bias, International Assessment, Computation
Sinharay, Sandip – Applied Measurement in Education, 2017
Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristic (ROC) curves and found the "H^T" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…
Descriptors: Nonparametric Statistics, Goodness of Fit, Simulation, Comparative Analysis
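
Several of the nonparametric statistics compared in that study (U3 in particular) are built on counts of Guttman errors: an easier item answered incorrectly while a harder item is answered correctly. A minimal sketch of the raw Guttman error count; this illustrates the underlying idea only and is not an implementation of H^T, C, MCI, or U3 themselves:

```python
import numpy as np

def guttman_errors(responses, p_correct):
    """Count Guttman errors for one examinee: pairs where an easier
    item (higher proportion correct in the group) is answered
    incorrectly while a harder item is answered correctly. Large
    counts flag potentially aberrant response patterns."""
    order = np.argsort(-np.asarray(p_correct))  # easiest item first
    x = np.asarray(responses)[order]
    errors = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if x[i] == 0 and x[j] == 1:
                errors += 1
    return errors

# A clean Guttman pattern produces zero errors; a reversed one, many.
print(guttman_errors([1, 1, 0, 0], p_correct=[0.9, 0.7, 0.5, 0.3]))  # 0
print(guttman_errors([0, 0, 1, 1], p_correct=[0.9, 0.7, 0.5, 0.3]))  # 4
```
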
Lee, Wooyeol; Cho, Sun-Joo – Applied Measurement in Education, 2017
Utilizing a longitudinal item response model, this study investigated the effect of item parameter drift (IPD) on item parameters and person scores via a Monte Carlo study. Item parameter recovery was investigated for various IPD patterns in terms of bias and root mean-square error (RMSE), and the percentage of time the 95% confidence interval covered…
Descriptors: Item Response Theory, Test Items, Bias, Computation
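
Bias and RMSE in a parameter-recovery study of this kind are computed from the differences between estimates and generating values across replications. A minimal sketch with hypothetical numbers (the data are illustrative, not from the study):

```python
import numpy as np

def recovery_stats(true_vals, est_vals):
    """Bias (mean error) and root mean-square error of parameter
    estimates across Monte Carlo replications."""
    err = np.asarray(est_vals) - np.asarray(true_vals)
    return err.mean(), np.sqrt((err ** 2).mean())

# Hypothetical difficulty estimates for one item over 5 replications
# of a generating value of 0.5.
bias, rmse = recovery_stats([0.5] * 5, [0.48, 0.55, 0.52, 0.47, 0.53])
print(f"bias={bias:.3f}, rmse={rmse:.3f}")  # bias=0.010, rmse=0.032
```
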
Humphry, Stephen; Heldsinger, Sandra; Andrich, David – Applied Measurement in Education, 2014
One of the best-known methods for setting a benchmark standard on a test is Angoff's, together with its modifications. When items are scored dichotomously, judges estimate the probability that a benchmark student will answer each item correctly. As in most methods of standard setting, it is assumed implicitly that the unit of the latent scale of the…
Descriptors: Foreign Countries, Standard Setting (Scoring), Judges, Item Response Theory
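
In the dichotomous Angoff procedure described here, the cut score on the raw-score metric is each judge's item probability estimates summed over items and then averaged across judges. A minimal sketch with hypothetical ratings (three judges, four items; all values are made up for illustration):

```python
import numpy as np

# Each row holds one judge's estimated probabilities that a benchmark
# examinee answers each of four dichotomous items correctly.
ratings = np.array([
    [0.60, 0.75, 0.40, 0.85],
    [0.55, 0.70, 0.50, 0.80],
    [0.65, 0.80, 0.45, 0.90],
])

# Sum within judge, then average across judges.
cut_score = ratings.sum(axis=1).mean()
print(round(cut_score, 2))  # 2.65 raw-score points
```
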
Ho, Tsung-Han; Dodd, Barbara G. – Applied Measurement in Education, 2012
In this study we compared five item selection procedures using three ability estimation methods in the context of a mixed-format adaptive test based on the generalized partial credit model. The item selection procedures used were maximum posterior-weighted information, maximum expected information, maximum posterior-weighted Kullback-Leibler…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
Murphy, Daniel L.; Beretvas, S. Natasha – Applied Measurement in Education, 2015
This study examines the use of cross-classified random effects models (CCrem) and cross-classified multiple membership random effects models (CCMMrem) to model rater bias and estimate teacher effectiveness. Effect estimates are compared using classical test theory (CTT) versus item response theory (IRT) scaling methods and three models (i.e., conventional multilevel…
Descriptors: Teacher Effectiveness, Comparative Analysis, Hierarchical Linear Modeling, Test Theory
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Do subscores provide information beyond what is provided by the total score? Is there a method that can estimate more trustworthy subscores than the observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or by the total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
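
A standard way to produce a more trustworthy estimate than the observed subscore is Kelley's regressed-score estimate, which shrinks the observed score toward the group mean in proportion to the subscore's reliability. A minimal sketch; the numbers are hypothetical, and this is one classical approach rather than necessarily the specific method the article evaluates:

```python
def kelley_estimate(observed, group_mean, reliability):
    """Kelley's regressed-score estimate of the true subscore:
    the observed score shrunk toward the group mean by the
    subscore's reliability."""
    return group_mean + reliability * (observed - group_mean)

# Hypothetical subscore with group mean 20 and reliability 0.6:
# an observed 26 regresses to 23.6.
print(kelley_estimate(observed=26.0, group_mean=20.0, reliability=0.6))
```
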
Penfield, Randall D. – Applied Measurement in Education, 2006
This study applied two computerized adaptive testing item selection approaches, maximum expected information (MEI) and maximum posterior-weighted information (MPI), to the case of a test using polytomous items following the partial credit model. The MEI and MPI approaches are described. A simulation study compared the efficiency of ability…
Descriptors: Bayesian Statistics, Adaptive Testing, Computer Assisted Testing, Test Items
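
The study applies MEI and MPI to polytomous partial credit items; purely to illustrate the posterior-weighting idea, here is a sketch of MPI item selection simplified to dichotomous two-parameter logistic (2PL) items over a discrete theta grid. The model choice and all parameter values are assumptions for the example, not the article's setup:

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def mpi_select(posterior, grid, item_params, administered):
    """Maximum posterior-weighted information (MPI) item selection:
    weight each candidate item's Fisher information across the theta
    grid by the current posterior and pick the unadministered item
    with the largest weighted sum."""
    best, best_val = None, -np.inf
    for idx, (a, b) in enumerate(item_params):
        if idx in administered:
            continue
        p = p_2pl(grid, a, b)
        info = a ** 2 * p * (1.0 - p)    # 2PL item information
        val = np.sum(posterior * info)   # posterior weighting
        if val > best_val:
            best, best_val = idx, val
    return best

grid = np.linspace(-4, 4, 81)
posterior = np.exp(-0.5 * grid ** 2)     # standard normal prior early in the test
posterior /= posterior.sum()
items = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]  # (a, b) per item
print(mpi_select(posterior, grid, items, administered=set()))  # selects item 1
```
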
