| Publication Date | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 3 |
| Since 2017 (last 10 years) | 14 |
| Since 2007 (last 20 years) | 21 |
| Descriptor | Records |
| --- | --- |
| Data Analysis | 37 |
| Simulation | 13 |
| Item Response Theory | 10 |
| Models | 10 |
| Test Items | 9 |
| Evaluation Methods | 7 |
| Measurement | 7 |
| Comparative Analysis | 5 |
| Scores | 5 |
| Error of Measurement | 4 |
| Sampling | 4 |
| Source | Records |
| --- | --- |
| Journal of Educational… | 37 |
| Author | Records |
| --- | --- |
| Wilson, Mark | 3 |
| Bolt, Daniel M. | 2 |
| Sinharay, Sandip | 2 |
| Suh, Youngsuk | 2 |
| Aleven, Vincent | 1 |
| Allen, Nancy L. | 1 |
| Amanda Goodwin | 1 |
| Baldwin, Su G. | 1 |
| Birenbaum, Menucha | 1 |
| Chang, Hua-Hua | 1 |
| Choi, In-Hee | 1 |
| Publication Type | Records |
| --- | --- |
| Journal Articles | 29 |
| Reports - Research | 19 |
| Reports - Evaluative | 8 |
| Reports - Descriptive | 2 |
| Speeches/Meeting Papers | 1 |
| Education Level | Records |
| --- | --- |
| Middle Schools | 2 |
| Junior High Schools | 1 |
| Secondary Education | 1 |
| Audience | Records |
| --- | --- |
| Researchers | 2 |
| Assessments and Surveys | Records |
| --- | --- |
| SAT (College Admission Test) | 2 |
| Advanced Placement… | 1 |
Sandip Sinharay; Randy E. Bennett; Michael Kane; Jesse R. Sparks – Journal of Educational Measurement, 2025
Personalized assessments are of increasing interest because of their potential to lead to more equitable decisions about the examinees. However, one obstacle to the widespread use of personalized assessments is the lack of a measurement toolkit that can be used to analyze data from these assessments. This article takes one step toward building…
Descriptors: Test Validity, Data Analysis, Advanced Placement Programs, Art
Wenchao Ma; Miguel A. Sorrel; Xiaoming Zhai; Yuan Ge – Journal of Educational Measurement, 2024
Most existing diagnostic models are developed to detect whether students have mastered a set of skills of interest, but few have focused on identifying what scientific misconceptions students possess. This article developed a general dual-purpose model for simultaneously estimating students' overall ability and the presence and absence of…
Descriptors: Models, Misconceptions, Diagnostic Tests, Ability
Sun-Joo Cho; Amanda Goodwin; Matthew Naveiras; Paul De Boeck – Journal of Educational Measurement, 2024
Explanatory item response models (EIRMs) have been applied to investigate the effects of person covariates, item covariates, and their interactions in the fields of reading education and psycholinguistics. In practice, it is often assumed that the relationships between the covariates and the logit transformation of item response probability are…
Descriptors: Item Response Theory, Test Items, Models, Maximum Likelihood Statistics
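The linearity assumption flagged in this abstract can be made concrete with a small simulation. The Python sketch below builds an EIRM-style linear predictor on the logit scale from a person covariate, an item covariate, and their interaction, then generates dichotomous responses; the covariates and effect sizes (person_cov, item_cov, gamma1-gamma3) are illustrative placeholders, not values from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

n_persons, n_items = 500, 20
person_cov = rng.normal(size=n_persons)   # hypothetical person covariate (e.g., vocabulary measure)
item_cov = rng.normal(size=n_items)       # hypothetical item covariate (e.g., word length)

# EIRM-style linear predictor:
# logit P(y_pi = 1) = theta_p + gamma1*person_cov_p + gamma2*item_cov_i
#                     + gamma3*person_cov_p*item_cov_i - beta_i
theta = rng.normal(size=n_persons)        # residual person ability
beta = rng.normal(size=n_items)           # residual item difficulty
gamma1, gamma2, gamma3 = 0.5, -0.8, 0.3   # illustrative covariate effects

logit = (theta[:, None]
         + gamma1 * person_cov[:, None]
         + gamma2 * item_cov[None, :]
         + gamma3 * person_cov[:, None] * item_cov[None, :]
         - beta[None, :])
prob = 1.0 / (1.0 + np.exp(-logit))
responses = rng.binomial(1, prob)         # simulated 0/1 item responses
```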
Guo, Hongwen; Dorans, Neil J. – Journal of Educational Measurement, 2020
We make a distinction between the operational practice of using an observed score to assess differential item functioning (DIF) and the concept of departure from measurement invariance (DMI) that conditions on a latent variable. DMI and DIF indices of effect sizes, based on the Mantel-Haenszel test of common odds ratio, converge under restricted…
Descriptors: Weighted Scores, Test Items, Item Response Theory, Measurement
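As a point of reference for the Mantel-Haenszel effect-size index discussed above, here is a minimal Python sketch that stratifies on an observed matching score, accumulates the common odds ratio, and converts it to the ETS delta scale; the function name and interface are illustrative, not taken from the article.

```python
import numpy as np

def mantel_haenszel_dif(item, total, group):
    """Mantel-Haenszel common odds ratio for one dichotomous item.

    item  : 0/1 responses to the studied item
    total : matching variable (e.g., observed total score)
    group : 0 = reference group, 1 = focal group
    """
    item, total, group = map(np.asarray, (item, total, group))
    num, den = 0.0, 0.0
    for k in np.unique(total):                         # stratify on the matching score
        m = total == k
        a = np.sum((group[m] == 0) & (item[m] == 1))   # reference, correct
        b = np.sum((group[m] == 0) & (item[m] == 0))   # reference, incorrect
        c = np.sum((group[m] == 1) & (item[m] == 1))   # focal, correct
        d = np.sum((group[m] == 1) & (item[m] == 0))   # focal, incorrect
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n
        den += b * c / n
    alpha_mh = num / den                               # common odds ratio
    delta_mh = -2.35 * np.log(alpha_mh)                # ETS delta (MH D-DIF) scale
    return alpha_mh, delta_mh
```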
Wind, Stefanie A.; Sebok-Syer, Stefanie S. – Journal of Educational Measurement, 2019
When practitioners use modern measurement models to evaluate rating quality, they commonly examine rater fit statistics that summarize how well each rater's ratings fit the expectations of the measurement model. Essentially, this approach involves examining the unexpected ratings that each misfitting rater assigned (i.e., carrying out analyses of…
Descriptors: Measurement, Models, Evaluators, Simulation
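The rater fit statistics referred to above are commonly infit and outfit mean squares computed from standardized residuals. A minimal sketch follows, assuming the model-expected ratings and their variances have already been obtained from a fitted rating-scale or many-facet model; the function and argument names are hypothetical.

```python
import numpy as np

def rater_fit(observed, expected, variance, rater_ids):
    """Infit and outfit mean-square statistics summarized by rater.

    observed, expected, variance : one entry per rating, where expected values and
        variances come from a previously fitted measurement model.
    rater_ids : which rater assigned each rating.
    """
    observed, expected, variance, rater_ids = map(
        np.asarray, (observed, expected, variance, rater_ids))
    z2 = (observed - expected) ** 2 / variance       # squared standardized residuals
    fit = {}
    for r in np.unique(rater_ids):
        m = rater_ids == r
        outfit = z2[m].mean()                                                  # unweighted mean square
        infit = ((observed[m] - expected[m]) ** 2).sum() / variance[m].sum()   # information-weighted
        fit[r] = {"infit": infit, "outfit": outfit}
    return fit
```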
Man, Kaiwen; Harring, Jeffrey R.; Sinharay, Sandip – Journal of Educational Measurement, 2019
Data mining methods have drawn considerable attention across diverse scientific fields. However, few applications could be found in the areas of psychological and educational measurement, and particularly pertinent to this article, in test security research. In this study, various data mining methods for detecting cheating behaviors on large-scale…
Descriptors: Information Retrieval, Data Analysis, Identification, Tests
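The study's specific features and methods are not shown in the truncated abstract, so the sketch below only illustrates the general workflow of applying an off-the-shelf classifier to per-examinee features when screening for possible test-security concerns; the features, labels, and classifier choice here are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# Hypothetical per-examinee features; the actual features and labels in the study differ.
n = 1000
features = np.column_stack([
    rng.normal(size=n),    # e.g., standardized score gain between administrations
    rng.normal(size=n),    # e.g., mean log response time
    rng.uniform(size=n),   # e.g., answer-similarity index
])
flagged = rng.binomial(1, 0.05, size=n)   # placeholder labels for known/suspected cases

clf = RandomForestClassifier(n_estimators=200, random_state=0)
auc = cross_val_score(clf, features, flagged, cv=5, scoring="roc_auc")
print("cross-validated AUC:", auc.mean())
```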
Feuerstahler, Leah; Wilson, Mark – Journal of Educational Measurement, 2019
Scores estimated from multidimensional item response theory (IRT) models are not necessarily comparable across dimensions. In this article, the concept of aligned dimensions is formalized in the context of Rasch models, and two methods are described--delta dimensional alignment (DDA) and logistic regression alignment (LRA)--to transform estimated…
Descriptors: Item Response Theory, Models, Scores, Comparative Analysis
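The truncated abstract does not spell out the alignment procedures, so the following is only a rough sketch of the idea of matching item-difficulty distributions across the dimensions of a Rasch model, one common reading of delta dimensional alignment; it is not the authors' exact DDA or LRA algorithm.

```python
import numpy as np

def delta_dimensional_alignment(difficulties_by_dim):
    """Roughly align Rasch item difficulties across dimensions by matching each
    dimension's difficulty mean and SD to those of the pooled difficulties.
    Sketch only; the published procedure may differ in detail."""
    pooled = np.concatenate([np.asarray(b, dtype=float)
                             for b in difficulties_by_dim.values()])
    target_mean, target_sd = pooled.mean(), pooled.std(ddof=1)
    transforms = {}
    for dim, b in difficulties_by_dim.items():
        b = np.asarray(b, dtype=float)
        slope = target_sd / b.std(ddof=1)
        intercept = target_mean - slope * b.mean()
        # The same linear transformation would be applied to person estimates on that dimension.
        transforms[dim] = (slope, intercept)
    return transforms
```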
Drabinová, Adéla; Martinková, Patrícia – Journal of Educational Measurement, 2017
In this article we present a general approach that does not rely on item response theory models (non-IRT) to detect differential item functioning (DIF) in dichotomous items in the presence of guessing. The proposed nonlinear regression (NLR) procedure for DIF detection is an extension of a method based on logistic regression. As a non-IRT approach, NLR can…
Descriptors: Test Items, Regression (Statistics), Guessing (Tests), Identification
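A minimal sketch of a nonlinear logistic regression with a guessing (lower-asymptote) parameter, in the spirit of the NLR procedure described above; the parameterization and estimation details in the article may differ.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import expit

def nlr_item(xg, b0, b1, b2, b3, c):
    """Logistic regression with lower asymptote c (guessing); x is the matching
    score, g the group indicator. b2 and b3 capture uniform and nonuniform DIF."""
    x, g = xg
    return c + (1.0 - c) * expit(b0 + b1 * x + b2 * g + b3 * x * g)

def fit_nlr(y, x, g):
    """y: 0/1 item responses, x: standardized total scores, g: 0/1 group membership."""
    p0 = [0.0, 1.0, 0.0, 0.0, 0.1]                          # rough starting values
    bounds = ([-np.inf] * 4 + [0.0], [np.inf] * 4 + [1.0])  # keep c in [0, 1]
    params, cov = curve_fit(nlr_item, (x, g), y, p0=p0, bounds=bounds)
    return params, cov   # test b2 and b3 against 0 to screen the item for DIF
```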
Sinharay, Sandip – Journal of Educational Measurement, 2017
Person-fit assessment (PFA) is concerned with uncovering atypical test performance as reflected in the pattern of scores on individual items on a test. Existing person-fit statistics (PFSs) include both parametric and nonparametric statistics. Comparison of PFSs has been a popular research topic in PFA, but almost all comparisons have employed…
Descriptors: Goodness of Fit, Testing, Test Items, Scores
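One widely used parametric person-fit statistic of the kind compared in such studies is the standardized log-likelihood statistic l_z; a minimal sketch is below (the article compares several PFSs, not necessarily this one).

```python
import numpy as np

def lz_statistic(responses, probs):
    """Standardized log-likelihood person-fit statistic (l_z) for one examinee.

    responses : 0/1 vector of item scores
    probs     : model-implied probabilities of a correct response at the examinee's
                estimated ability (obtained elsewhere).
    Large negative values suggest atypical (misfitting) response patterns.
    """
    u = np.asarray(responses, dtype=float)
    p = np.asarray(probs, dtype=float)
    l0 = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))             # observed log-likelihood
    e_l0 = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))           # its expectation
    v_l0 = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)            # its variance
    return (l0 - e_l0) / np.sqrt(v_l0)
```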
Liu, Chen-Wei; Wang, Wen-Chung – Journal of Educational Measurement, 2017
The examinee-selected-item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set of items (e.g., choose one item to respond from a pair of items), always yields incomplete data (i.e., only the selected items are answered and the others have missing data) that are likely nonignorable. Therefore, using…
Descriptors: Item Response Theory, Models, Maximum Likelihood Statistics, Data Analysis
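A small simulation can show why ESI data are likely nonignorable: if item choice depends on the latent trait, the observed responses to each item come from a self-selected subsample. The selection rule below is purely hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
theta = rng.normal(size=n)                       # latent ability

b_easy, b_hard = -0.5, 0.5                       # Rasch difficulties of the item pair
# Hypothetical selection rule: higher-ability examinees tend to pick the harder item,
# so missingness depends on the latent trait (nonignorable).
pick_hard = rng.binomial(1, 1 / (1 + np.exp(-theta)))

p_easy = 1 / (1 + np.exp(-(theta - b_easy)))
p_hard = 1 / (1 + np.exp(-(theta - b_hard)))
y_easy = np.where(pick_hard == 0, rng.binomial(1, p_easy), np.nan)   # missing if not selected
y_hard = np.where(pick_hard == 1, rng.binomial(1, p_hard), np.nan)

# The observed mean on the hard item exceeds what the full group would have scored,
# because the examinees who chose it are not a random subsample.
print(np.nanmean(y_hard), p_hard.mean())
```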
Olsen, Jennifer; Aleven, Vincent; Rummel, Nikol – Journal of Educational Measurement, 2017
Within educational data mining, many statistical models capture the learning of students working individually. However, not much work has been done to extend these statistical models of individual learning to a collaborative setting, despite the effectiveness of collaborative learning activities. We extend a widely used model (the additive factors…
Descriptors: Mathematical Models, Information Retrieval, Data Analysis, Educational Research
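The additive factors model mentioned above (before the authors' collaborative extension) expresses the log-odds of a correct step as a student proficiency plus, for each required skill, an easiness term and a learning-rate term scaled by prior practice opportunities. A minimal sketch of that individual-level predictor, with made-up parameter values:

```python
import numpy as np

def afm_logit(theta_i, q_j, beta, gamma, opportunities_i):
    """Linear predictor of the additive factors model (AFM) for one student-step pair.

    theta_i         : student proficiency
    q_j             : 0/1 vector of which skills (knowledge components) the step requires
    beta, gamma     : skill easiness and skill learning-rate parameters
    opportunities_i : the student's prior practice opportunities on each skill
    """
    q_j = np.asarray(q_j, dtype=float)
    return theta_i + np.sum(q_j * (np.asarray(beta) + np.asarray(gamma) * np.asarray(opportunities_i)))

# Example: a step requiring skills 0 and 2, after 3 and 1 prior opportunities on them.
p_correct = 1 / (1 + np.exp(-afm_logit(0.2, [1, 0, 1], [0.5, -0.3, 0.1],
                                       [0.05, 0.1, 0.2], [3, 0, 1])))
```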
Guo, Rui; Zheng, Yi; Chang, Hua-Hua – Journal of Educational Measurement, 2015
An important assumption of item response theory is item parameter invariance. Sometimes, however, item parameters are not invariant across different test administrations due to factors other than sampling error; this phenomenon is termed item parameter drift. Several methods have been developed to detect drifted items. However, most of the…
Descriptors: Item Response Theory, Test Items, Evaluation Methods, Equated Scores
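For context on what detecting drifted items usually involves, here is a generic screening rule that flags items whose difficulty estimates shift significantly between administrations once the scales are linked; it is not the specific method proposed in the article.

```python
import numpy as np

def flag_drifted_items(b_old, b_new, se_old, se_new, z_crit=2.576):
    """Flag items whose Rasch difficulty shifted between administrations.

    b_old, b_new   : difficulty estimates from the two administrations, assumed
                     already placed on a common scale.
    se_old, se_new : their standard errors.
    Items whose standardized difference exceeds z_crit are flagged as drifted.
    """
    b_old, b_new, se_old, se_new = map(np.asarray, (b_old, b_new, se_old, se_new))
    z = (b_new - b_old) / np.sqrt(se_old ** 2 + se_new ** 2)
    return np.abs(z) > z_crit, z
```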
Lee, Soo; Suh, Youngsuk – Journal of Educational Measurement, 2018
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…
Descriptors: Item Response Theory, Sample Size, Models, Error of Measurement
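Lord's Wald test itself has a compact form: the difference between an item's parameter estimates in the two groups is weighted by the inverse of its pooled covariance and referred to a chi-square distribution. A minimal sketch, assuming the group-specific estimates and covariance matrices (from MML or MCMC estimation) are already on a common scale:

```python
import numpy as np
from scipy.stats import chi2

def lords_wald_test(params_ref, params_focal, cov_ref, cov_focal):
    """Lord's Wald test comparing one item's parameter estimates across two groups.

    params_ref, params_focal : item parameter vectors (e.g., [a, b]) estimated
                               separately in the reference and focal groups.
    cov_ref, cov_focal       : their estimated covariance matrices.
    """
    diff = np.asarray(params_ref, dtype=float) - np.asarray(params_focal, dtype=float)
    pooled = np.asarray(cov_ref, dtype=float) + np.asarray(cov_focal, dtype=float)
    w = diff @ np.linalg.solve(pooled, diff)      # Wald chi-square statistic
    p_value = chi2.sf(w, df=diff.size)
    return w, p_value
```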
Raczynski, Kevin R.; Cohen, Allan S.; Engelhard, George, Jr.; Lu, Zhenqiu – Journal of Educational Measurement, 2015
There is a large body of research on the effectiveness of rater training methods in the industrial and organizational psychology literature. Less has been reported in the measurement literature on large-scale writing assessments. This study compared the effectiveness of two widely used rater training methods--self-paced and collaborative…
Descriptors: Interrater Reliability, Writing Evaluation, Training Methods, Pacing
Shin, Hyo Jeong; Wilson, Mark; Choi, In-Hee – Journal of Educational Measurement, 2017
This study proposes a structured constructs model (SCM) to examine measurement in the context of a multidimensional learning progression (LP). The LP is assumed to have features that go beyond a typical multidimensional IRT model, in that there are hypothesized to be certain cross-dimensional linkages that correspond to requirements between the…
Descriptors: Middle School Students, Student Evaluation, Measurement Techniques, Learning Processes

