ERIC - Search Results

Publication Date

In 2026	0
Since 2025	1
Since 2022 (last 5 years)	2
Since 2017 (last 10 years)	5

Source

Journal of Educational…

Author

Albano, Anthony D.	1
Artur Pokropek	1
Bunch, Michael B.	1
Cai, Liuhan	1
Carl Westine	1
Carmen Köhler	1
Johannes Hartig	1
Lale Khorramdel	1
Lease, Erin M.	1
McConnell, Scott R.	1
Michelle Boyer	1
Palermo, Corey	1
Ridge, Kirk	1
Shear, Benjamin R.	1
Stella Y. Kim	1
Tong Wu	1

Publication Type

Journal Articles	5
Reports - Research	5

Education Level

Elementary Education	1
Secondary Education	1

Assessments and Surveys

Program for International…

Showing all 5 results

IRT Observed-Score Equating for Rater-Mediated Assessments Using a Hierarchical Rater Model

Peer reviewed

Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025

While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…

Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity

DIF Detection for Multiple Groups: Comparing Three-Level GLMMs and Multiple-Group IRT Models

Peer reviewed

Carmen Köhler; Lale Khorramdel; Artur Pokropek; Johannes Hartig – Journal of Educational Measurement, 2024

For assessment scales applied to different groups (e.g., students from different states; patients in different countries), multigroup differential item functioning (MG-DIF) needs to be evaluated in order to ensure that respondents with the same trait level but from different groups have equal response probabilities on a particular item. The…

Descriptors: Measures (Individuals), Test Bias, Models, Item Response Theory

Using Hierarchical Logistic Regression to Study DIF and DIF Variance in Multilevel Data

Peer reviewed

Shear, Benjamin R. – Journal of Educational Measurement, 2018

When contextual features of test-taking environments differentially affect item responding for different test takers and these features vary across test administrations, they may cause differential item functioning (DIF) that varies across test administrations. Because many common DIF detection methods ignore potential DIF variance, this article…

Descriptors: Test Bias, Regression (Statistics), Hierarchical Linear Modeling

Scoring Stability in a Large-Scale Assessment Program: A Longitudinal Analysis of Leniency/Severity Effects

Peer reviewed

Palermo, Corey; Bunch, Michael B.; Ridge, Kirk – Journal of Educational Measurement, 2019

Although much attention has been given to rater effects in rater-mediated assessment contexts, little research has examined the overall stability of leniency and severity effects over time. This study examined longitudinal scoring data collected during three consecutive administrations of a large-scale, multi-state summative assessment program.…

Descriptors: Scoring, Interrater Reliability, Measurement, Summative Evaluation

Computerized Adaptive Testing in Early Education: Exploring the Impact of Item Position Effects on Ability Estimation

Peer reviewed

Albano, Anthony D.; Cai, Liuhan; Lease, Erin M.; McConnell, Scott R. – Journal of Educational Measurement, 2019

Studies have shown that item difficulty can vary significantly based on the context of an item within a test form. In particular, item position may be associated with practice and fatigue effects that influence item parameter estimation. The purpose of this research was to examine the relevance of item position specifically for assessments used in…

Descriptors: Test Items, Computer Assisted Testing, Item Analysis, Difficulty Level

Descriptor

Hierarchical Linear Modeling	5
Item Response Theory	3
Evaluation Methods	2
Models	2
Test Bias	2
Test Items	2
Achievement Tests	1
Bias	1
Computer Assisted Testing	1
Correlation	1
Difficulty Level	1
Educational Assessment	1
Elementary School Students	1
Equated Scores	1
Error of Measurement	1
Evaluators	1
Foreign Countries	1
International Assessment	1
Interrater Reliability	1
Item Analysis	1
Measurement	1
Measures (Individuals)	1
Performance	1
Prediction	1
Psychometrics	1