| Publication Date | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 3 |
| Since 2017 (last 10 years) | 14 |
| Since 2007 (last 20 years) | 21 |
| Descriptor | Records |
| --- | --- |
| Data Analysis | 37 |
| Simulation | 13 |
| Item Response Theory | 10 |
| Models | 10 |
| Test Items | 9 |
| Evaluation Methods | 7 |
| Measurement | 7 |
| Comparative Analysis | 5 |
| Scores | 5 |
| Error of Measurement | 4 |
| Sampling | 4 |
| Source | Records |
| --- | --- |
| Journal of Educational… | 37 |
| Author | Records |
| --- | --- |
| Wilson, Mark | 3 |
| Bolt, Daniel M. | 2 |
| Sinharay, Sandip | 2 |
| Suh, Youngsuk | 2 |
| Aleven, Vincent | 1 |
| Allen, Nancy L. | 1 |
| Amanda Goodwin | 1 |
| Baldwin, Su G. | 1 |
| Birenbaum, Menucha | 1 |
| Chang, Hua-Hua | 1 |
| Choi, In-Hee | 1 |
| Publication Type | Records |
| --- | --- |
| Journal Articles | 29 |
| Reports - Research | 19 |
| Reports - Evaluative | 8 |
| Reports - Descriptive | 2 |
| Speeches/Meeting Papers | 1 |
| Education Level | Records |
| --- | --- |
| Middle Schools | 2 |
| Junior High Schools | 1 |
| Secondary Education | 1 |
| Audience | Records |
| --- | --- |
| Researchers | 2 |
| Assessments and Surveys | Records |
| --- | --- |
| SAT (College Admission Test) | 2 |
| Advanced Placement… | 1 |
Sandip Sinharay; Randy E. Bennett; Michael Kane; Jesse R. Sparks – Journal of Educational Measurement, 2025
Personalized assessments are of increasing interest because of their potential to lead to more equitable decisions about the examinees. However, one obstacle to the widespread use of personalized assessments is the lack of a measurement toolkit that can be used to analyze data from these assessments. This article takes one step toward building…
Descriptors: Test Validity, Data Analysis, Advanced Placement Programs, Art
Wenchao Ma; Miguel A. Sorrel; Xiaoming Zhai; Yuan Ge – Journal of Educational Measurement, 2024
Most existing diagnostic models are developed to detect whether students have mastered a set of skills of interest, but few have focused on identifying what scientific misconceptions students possess. This article developed a general dual-purpose model for simultaneously estimating students' overall ability and the presence and absence of…
Descriptors: Models, Misconceptions, Diagnostic Tests, Ability
Sun-Joo Cho; Amanda Goodwin; Matthew Naveiras; Paul De Boeck – Journal of Educational Measurement, 2024
Explanatory item response models (EIRMs) have been applied to investigate the effects of person covariates, item covariates, and their interactions in the fields of reading education and psycholinguistics. In practice, it is often assumed that the relationships between the covariates and the logit transformation of item response probability are…
Descriptors: Item Response Theory, Test Items, Models, Maximum Likelihood Statistics
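The linearity assumption flagged in this abstract can be made concrete with a small simulation. The Python sketch below builds an EIRM-style linear predictor on the logit scale from a person covariate, an item covariate, and their interaction, then generates dichotomous responses; the covariates and effect sizes (person_cov, item_cov, gamma1-gamma3) are illustrative placeholders, not values from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

n_persons, n_items = 500, 20
person_cov = rng.normal(size=n_persons)   # hypothetical person covariate (e.g., vocabulary measure)
item_cov = rng.normal(size=n_items)       # hypothetical item covariate (e.g., word length)

# EIRM-style linear predictor:
# logit P(y_pi = 1) = theta_p + gamma1*person_cov_p + gamma2*item_cov_i
#                     + gamma3*person_cov_p*item_cov_i - beta_i
theta = rng.normal(size=n_persons)        # residual person ability
beta = rng.normal(size=n_items)           # residual item difficulty
gamma1, gamma2, gamma3 = 0.5, -0.8, 0.3   # illustrative covariate effects

logit = (theta[:, None]
         + gamma1 * person_cov[:, None]
         + gamma2 * item_cov[None, :]
         + gamma3 * person_cov[:, None] * item_cov[None, :]
         - beta[None, :])
prob = 1.0 / (1.0 + np.exp(-logit))
responses = rng.binomial(1, prob)         # simulated 0/1 item responses
```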
Guo, Hongwen; Dorans, Neil J. – Journal of Educational Measurement, 2020
We make a distinction between the operational practice of using an observed score to assess differential item functioning (DIF) and the concept of departure from measurement invariance (DMI) that conditions on a latent variable. DMI and DIF indices of effect sizes, based on the Mantel-Haenszel test of common odds ratio, converge under restricted…
Descriptors: Weighted Scores, Test Items, Item Response Theory, Measurement
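As a point of reference for the Mantel-Haenszel effect-size index discussed above, here is a minimal Python sketch that stratifies on an observed matching score, accumulates the common odds ratio, and converts it to the ETS delta scale; the function name and interface are illustrative, not taken from the article.

```python
import numpy as np

def mantel_haenszel_dif(item, total, group):
    """Mantel-Haenszel common odds ratio for one dichotomous item.

    item  : 0/1 responses to the studied item
    total : matching variable (e.g., observed total score)
    group : 0 = reference group, 1 = focal group
    """
    item, total, group = map(np.asarray, (item, total, group))
    num, den = 0.0, 0.0
    for k in np.unique(total):                         # stratify on the matching score
        m = total == k
        a = np.sum((group[m] == 0) & (item[m] == 1))   # reference, correct
        b = np.sum((group[m] == 0) & (item[m] == 0))   # reference, incorrect
        c = np.sum((group[m] == 1) & (item[m] == 1))   # focal, correct
        d = np.sum((group[m] == 1) & (item[m] == 0))   # focal, incorrect
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n
        den += b * c / n
    alpha_mh = num / den                               # common odds ratio
    delta_mh = -2.35 * np.log(alpha_mh)                # ETS delta (MH D-DIF) scale
    return alpha_mh, delta_mh
```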
Wind, Stefanie A.; Sebok-Syer, Stefanie S. – Journal of Educational Measurement, 2019
When practitioners use modern measurement models to evaluate rating quality, they commonly examine rater fit statistics that summarize how well each rater's ratings fit the expectations of the measurement model. Essentially, this approach involves examining the unexpected ratings that each misfitting rater assigned (i.e., carrying out analyses of…
Descriptors: Measurement, Models, Evaluators, Simulation
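The rater fit statistics referred to above are commonly infit and outfit mean squares computed from standardized residuals. A minimal sketch follows, assuming the model-expected ratings and their variances have already been obtained from a fitted rating-scale or many-facet model; the function and argument names are hypothetical.

```python
import numpy as np

def rater_fit(observed, expected, variance, rater_ids):
    """Infit and outfit mean-square statistics summarized by rater.

    observed, expected, variance : one entry per rating, where expected values and
        variances come from a previously fitted measurement model.
    rater_ids : which rater assigned each rating.
    """
    observed, expected, variance, rater_ids = map(
        np.asarray, (observed, expected, variance, rater_ids))
    z2 = (observed - expected) ** 2 / variance       # squared standardized residuals
    fit = {}
    for r in np.unique(rater_ids):
        m = rater_ids == r
        outfit = z2[m].mean()                                                  # unweighted mean square
        infit = ((observed[m] - expected[m]) ** 2).sum() / variance[m].sum()   # information-weighted
        fit[r] = {"infit": infit, "outfit": outfit}
    return fit
```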
Man, Kaiwen; Harring, Jeffrey R.; Sinharay, Sandip – Journal of Educational Measurement, 2019
Data mining methods have drawn considerable attention across diverse scientific fields. However, few applications could be found in the areas of psychological and educational measurement, and particularly pertinent to this article, in test security research. In this study, various data mining methods for detecting cheating behaviors on large-scale…
Descriptors: Information Retrieval, Data Analysis, Identification, Tests
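The study's specific features and methods are not shown in the truncated abstract, so the sketch below only illustrates the general workflow of applying an off-the-shelf classifier to per-examinee features when screening for possible test-security concerns; the features, labels, and classifier choice here are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# Hypothetical per-examinee features; the actual features and labels in the study differ.
n = 1000
features = np.column_stack([
    rng.normal(size=n),    # e.g., standardized score gain between administrations
    rng.normal(size=n),    # e.g., mean log response time
    rng.uniform(size=n),   # e.g., answer-similarity index
])
flagged = rng.binomial(1, 0.05, size=n)   # placeholder labels for known/suspected cases

clf = RandomForestClassifier(n_estimators=200, random_state=0)
auc = cross_val_score(clf, features, flagged, cv=5, scoring="roc_auc")
print("cross-validated AUC:", auc.mean())
```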
Feuerstahler, Leah; Wilson, Mark – Journal of Educational Measurement, 2019
Scores estimated from multidimensional item response theory (IRT) models are not necessarily comparable across dimensions. In this article, the concept of aligned dimensions is formalized in the context of Rasch models, and two methods are described--delta dimensional alignment (DDA) and logistic regression alignment (LRA)--to transform estimated…
Descriptors: Item Response Theory, Models, Scores, Comparative Analysis
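The truncated abstract does not spell out the alignment procedures, so the following is only a rough sketch of the idea of matching item-difficulty distributions across the dimensions of a Rasch model, one common reading of delta dimensional alignment; it is not the authors' exact DDA or LRA algorithm.

```python
import numpy as np

def delta_dimensional_alignment(difficulties_by_dim):
    """Roughly align Rasch item difficulties across dimensions by matching each
    dimension's difficulty mean and SD to those of the pooled difficulties.
    Sketch only; the published procedure may differ in detail."""
    pooled = np.concatenate([np.asarray(b, dtype=float)
                             for b in difficulties_by_dim.values()])
    target_mean, target_sd = pooled.mean(), pooled.std(ddof=1)
    transforms = {}
    for dim, b in difficulties_by_dim.items():
        b = np.asarray(b, dtype=float)
        slope = target_sd / b.std(ddof=1)
        intercept = target_mean - slope * b.mean()
        # The same linear transformation would be applied to person estimates on that dimension.
        transforms[dim] = (slope, intercept)
    return transforms
```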
Drabinová, Adéla; Martinková, Patrícia – Journal of Educational Measurement, 2017
In this article we present a general approach that does not rely on item response theory models (non-IRT) to detect differential item functioning (DIF) in dichotomous items in the presence of guessing. The proposed nonlinear regression (NLR) procedure for DIF detection is an extension of a method based on logistic regression. As a non-IRT approach, NLR can…
Descriptors: Test Items, Regression (Statistics), Guessing (Tests), Identification
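A minimal sketch of a nonlinear logistic regression with a guessing (lower-asymptote) parameter, in the spirit of the NLR procedure described above; the parameterization and estimation details in the article may differ.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import expit

def nlr_item(xg, b0, b1, b2, b3, c):
    """Logistic regression with lower asymptote c (guessing); x is the matching
    score, g the group indicator. b2 and b3 capture uniform and nonuniform DIF."""
    x, g = xg
    return c + (1.0 - c) * expit(b0 + b1 * x + b2 * g + b3 * x * g)

def fit_nlr(y, x, g):
    """y: 0/1 item responses, x: standardized total scores, g: 0/1 group membership."""
    p0 = [0.0, 1.0, 0.0, 0.0, 0.1]                          # rough starting values
    bounds = ([-np.inf] * 4 + [0.0], [np.inf] * 4 + [1.0])  # keep c in [0, 1]
    params, cov = curve_fit(nlr_item, (x, g), y, p0=p0, bounds=bounds)
    return params, cov   # test b2 and b3 against 0 to screen the item for DIF
```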
Sinharay, Sandip – Journal of Educational Measurement, 2017
Person-fit assessment (PFA) is concerned with uncovering atypical test performance as reflected in the pattern of scores on individual items on a test. Existing person-fit statistics (PFSs) include both parametric and nonparametric statistics. Comparison of PFSs has been a popular research topic in PFA, but almost all comparisons have employed…
Descriptors: Goodness of Fit, Testing, Test Items, Scores
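One widely used parametric person-fit statistic of the kind compared in such studies is the standardized log-likelihood statistic l_z; a minimal sketch is below (the article compares several PFSs, not necessarily this one).

```python
import numpy as np

def lz_statistic(responses, probs):
    """Standardized log-likelihood person-fit statistic (l_z) for one examinee.

    responses : 0/1 vector of item scores
    probs     : model-implied probabilities of a correct response at the examinee's
                estimated ability (obtained elsewhere).
    Large negative values suggest atypical (misfitting) response patterns.
    """
    u = np.asarray(responses, dtype=float)
    p = np.asarray(probs, dtype=float)
    l0 = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))             # observed log-likelihood
    e_l0 = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))           # its expectation
    v_l0 = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)            # its variance
    return (l0 - e_l0) / np.sqrt(v_l0)
```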
Liu, Chen-Wei; Wang, Wen-Chung – Journal of Educational Measurement, 2017
The examinee-selected-item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set of items (e.g., choose one item to respond from a pair of items), always yields incomplete data (i.e., only the selected items are answered and the others have missing data) that are likely nonignorable. Therefore, using…
Descriptors: Item Response Theory, Models, Maximum Likelihood Statistics, Data Analysis
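A small simulation can show why ESI data are likely nonignorable: if item choice depends on the latent trait, the observed responses to each item come from a self-selected subsample. The selection rule below is purely hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
theta = rng.normal(size=n)                       # latent ability

b_easy, b_hard = -0.5, 0.5                       # Rasch difficulties of the item pair
# Hypothetical selection rule: higher-ability examinees tend to pick the harder item,
# so missingness depends on the latent trait (nonignorable).
pick_hard = rng.binomial(1, 1 / (1 + np.exp(-theta)))

p_easy = 1 / (1 + np.exp(-(theta - b_easy)))
p_hard = 1 / (1 + np.exp(-(theta - b_hard)))
y_easy = np.where(pick_hard == 0, rng.binomial(1, p_easy), np.nan)   # missing if not selected
y_hard = np.where(pick_hard == 1, rng.binomial(1, p_hard), np.nan)

# The observed mean on the hard item exceeds what the full group would have scored,
# because the examinees who chose it are not a random subsample.
print(np.nanmean(y_hard), p_hard.mean())
```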
Olsen, Jennifer; Aleven, Vincent; Rummel, Nikol – Journal of Educational Measurement, 2017
Within educational data mining, many statistical models capture the learning of students working individually. However, not much work has been done to extend these statistical models of individual learning to a collaborative setting, despite the effectiveness of collaborative learning activities. We extend a widely used model (the additive factors…
Descriptors: Mathematical Models, Information Retrieval, Data Analysis, Educational Research
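The additive factors model mentioned above (before the authors' collaborative extension) expresses the log-odds of a correct step as a student proficiency plus, for each required skill, an easiness term and a learning-rate term scaled by prior practice opportunities. A minimal sketch of that individual-level predictor, with made-up parameter values:

```python
import numpy as np

def afm_logit(theta_i, q_j, beta, gamma, opportunities_i):
    """Linear predictor of the additive factors model (AFM) for one student-step pair.

    theta_i         : student proficiency
    q_j             : 0/1 vector of which skills (knowledge components) the step requires
    beta, gamma     : skill easiness and skill learning-rate parameters
    opportunities_i : the student's prior practice opportunities on each skill
    """
    q_j = np.asarray(q_j, dtype=float)
    return theta_i + np.sum(q_j * (np.asarray(beta) + np.asarray(gamma) * np.asarray(opportunities_i)))

# Example: a step requiring skills 0 and 2, after 3 and 1 prior opportunities on them.
p_correct = 1 / (1 + np.exp(-afm_logit(0.2, [1, 0, 1], [0.5, -0.3, 0.1],
                                       [0.05, 0.1, 0.2], [3, 0, 1])))
```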
Guo, Rui; Zheng, Yi; Chang, Hua-Hua – Journal of Educational Measurement, 2015
An important assumption of item response theory is item parameter invariance. Sometimes, however, item parameters are not invariant across different test administrations due to factors other than sampling error; this phenomenon is termed item parameter drift. Several methods have been developed to detect drifted items. However, most of the…
Descriptors: Item Response Theory, Test Items, Evaluation Methods, Equated Scores
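For context on what detecting drifted items usually involves, here is a generic screening rule that flags items whose difficulty estimates shift significantly between administrations once the scales are linked; it is not the specific method proposed in the article.

```python
import numpy as np

def flag_drifted_items(b_old, b_new, se_old, se_new, z_crit=2.576):
    """Flag items whose Rasch difficulty shifted between administrations.

    b_old, b_new   : difficulty estimates from the two administrations, assumed
                     already placed on a common scale.
    se_old, se_new : their standard errors.
    Items whose standardized difference exceeds z_crit are flagged as drifted.
    """
    b_old, b_new, se_old, se_new = map(np.asarray, (b_old, b_new, se_old, se_new))
    z = (b_new - b_old) / np.sqrt(se_old ** 2 + se_new ** 2)
    return np.abs(z) > z_crit, z
```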
Lee, Soo; Suh, Youngsuk – Journal of Educational Measurement, 2018
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…
Descriptors: Item Response Theory, Sample Size, Models, Error of Measurement
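Lord's Wald test itself has a compact form: the difference between an item's parameter estimates in the two groups is weighted by the inverse of its pooled covariance and referred to a chi-square distribution. A minimal sketch, assuming the group-specific estimates and covariance matrices (from MML or MCMC estimation) are already on a common scale:

```python
import numpy as np
from scipy.stats import chi2

def lords_wald_test(params_ref, params_focal, cov_ref, cov_focal):
    """Lord's Wald test comparing one item's parameter estimates across two groups.

    params_ref, params_focal : item parameter vectors (e.g., [a, b]) estimated
                               separately in the reference and focal groups.
    cov_ref, cov_focal       : their estimated covariance matrices.
    """
    diff = np.asarray(params_ref, dtype=float) - np.asarray(params_focal, dtype=float)
    pooled = np.asarray(cov_ref, dtype=float) + np.asarray(cov_focal, dtype=float)
    w = diff @ np.linalg.solve(pooled, diff)      # Wald chi-square statistic
    p_value = chi2.sf(w, df=diff.size)
    return w, p_value
```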
Raczynski, Kevin R.; Cohen, Allan S.; Engelhard, George, Jr.; Lu, Zhenqiu – Journal of Educational Measurement, 2015
There is a large body of research on the effectiveness of rater training methods in the industrial and organizational psychology literature. Less has been reported in the measurement literature on large-scale writing assessments. This study compared the effectiveness of two widely used rater training methods--self-paced and collaborative…
Descriptors: Interrater Reliability, Writing Evaluation, Training Methods, Pacing
Shin, Hyo Jeong; Wilson, Mark; Choi, In-Hee – Journal of Educational Measurement, 2017
This study proposes a structured constructs model (SCM) to examine measurement in the context of a multidimensional learning progression (LP). The LP is assumed to have features that go beyond a typical multidimensional IRT model, in that there are hypothesized to be certain cross-dimensional linkages that correspond to requirements between the…
Descriptors: Middle School Students, Student Evaluation, Measurement Techniques, Learning Processes

