ERIC - Search Results

Publication Date

In 2026	0
Since 2025	0
Since 2022 (last 5 years)	1
Since 2017 (last 10 years)	8
Since 2007 (last 20 years)	16

Descriptor

Comparative Analysis	27
Evaluation Methods	15
Item Response Theory	11
Test Items	11
Simulation	9
Methods	6
Monte Carlo Methods	6
Models	5
Scores	5
Adaptive Testing	4
College Entrance Examinations	4
Computer Assisted Testing	4
Equated Scores	4
Item Analysis	4
Scoring	4
Test Bias	4
Accuracy	3
Computation	3
Data Analysis	3
Error of Measurement	3
Markov Processes	3
Sample Size	3
Tests	3
Bayesian Statistics	2
Bias	2
More ▼

Source

Journal of Educational…

Publication Type

Journal Articles	26
Reports - Research	18
Reports - Evaluative	7
Reports - Descriptive	1

Education Level

Elementary Secondary Education	1
Secondary Education	1

Audience

Location

Laws, Policies, & Programs

Assessments and Surveys

SAT (College Admission Test)	2
National Assessment of…	1
Program for International…	1
Trends in International…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 27 results Save | Export

Two IRT Characteristic Curve Linking Methods Weighted by Information

Peer reviewed

Direct link

Wang, Shaojie; Zhang, Minqiang; Lee, Won-Chan; Huang, Feifei; Li, Zonglong; Li, Yixing; Yu, Sufang – Journal of Educational Measurement, 2022

Traditional IRT characteristic curve linking methods ignore parameter estimation errors, which may undermine the accuracy of estimated linking constants. Two new linking methods are proposed that take into account parameter estimation errors. The item- (IWCC) and test-information-weighted characteristic curve (TWCC) methods employ weighting…

Descriptors: Item Response Theory, Error of Measurement, Accuracy, Monte Carlo Methods

Scale Alignment in Between-Item Multidimensional Rasch Models

Peer reviewed

Direct link

Feuerstahler, Leah; Wilson, Mark – Journal of Educational Measurement, 2019

Scores estimated from multidimensional item response theory (IRT) models are not necessarily comparable across dimensions. In this article, the concept of aligned dimensions is formalized in the context of Rasch models, and two methods are described--delta dimensional alignment (DDA) and logistic regression alignment (LRA)--to transform estimated…

Descriptors: Item Response Theory, Models, Scores, Comparative Analysis

Structural Zeros and Their Implications with Log-Linear Bivariate Presmoothing under the Internal-Anchor Design

Peer reviewed

Direct link

Kim, Hyung Jin; Brennan, Robert L.; Lee, Won-Chan – Journal of Educational Measurement, 2017

In equating, when common items are internal and scoring is conducted in terms of the number of correct items, some pairs of total scores ("X") and common-item scores ("V") can never be observed in a bivariate distribution of "X" and "V"; these pairs are called "structural zeros." This simulation…

Descriptors: Test Items, Equated Scores, Comparative Analysis, Methods

Classroom Assessment and Large-Scale Psychometrics: Shall the Twain Meet? (A Conversation with Margaret Heritage and Neal Kingston)

Peer reviewed

Direct link

Heritage, Margaret; Kingston, Neal M. – Journal of Educational Measurement, 2019

Classroom assessment and large-scale assessment have, for the most part, existed in mutual isolation. Some experts have felt this is for the best and others have been concerned that the schism limits the potential contribution of both forms of assessment. Margaret Heritage has long been a champion of best practices in classroom assessment. Neal…

Descriptors: Measurement, Psychometrics, Context Effect, Classroom Environment

On-the-Fly Constraint-Controlled Assembly Methods for Multistage Adaptive Testing for Cognitive Diagnosis

Peer reviewed

Direct link

Liu, Shuchang; Cai, Yan; Tu, Dongbo – Journal of Educational Measurement, 2018

This study applied the mode of on-the-fly assembled multistage adaptive testing to cognitive diagnosis (CD-OMST). Several and several module assembly methods for CD-OMST were proposed and compared in terms of measurement precision, test security, and constrain management. The module assembly methods in the study included the maximum priority index…

Descriptors: Adaptive Testing, Monte Carlo Methods, Computer Security, Clinical Diagnosis

Evaluating Statistical Targets for Assembling Parallel Mixed-Format Test Forms

Peer reviewed

Direct link

Debeer, Dries; Ali, Usama S.; van Rijn, Peter W. – Journal of Educational Measurement, 2017

Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…

Descriptors: Test Format, Test Construction, Statistical Analysis, Comparative Analysis

Subjective Priors for Item Response Models: Application of Elicitation by Design

Peer reviewed

Direct link

Ames, Allison; Smith, Elizabeth – Journal of Educational Measurement, 2018

Bayesian methods incorporate model parameter information prior to data collection. Eliciting information from content experts is an option, but has seen little implementation in Bayesian item response theory (IRT) modeling. This study aims to use ethical reasoning content experts to elicit prior information and incorporate this information into…

Descriptors: Item Response Theory, Bayesian Statistics, Ethics, Specialists

Lord's Wald Test for Detecting Dif in Multidimensional Irt Models: A Comparison of Two Estimation Approaches

Peer reviewed

Direct link

Lee, Soo; Suh, Youngsuk – Journal of Educational Measurement, 2018

Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…

Descriptors: Item Response Theory, Sample Size, Models, Error of Measurement

Optimal Bandwidth Selection in Observed-Score Kernel Equating

Peer reviewed

Direct link

Häggström, Jenny; Wiberg, Marie – Journal of Educational Measurement, 2014

The selection of bandwidth in kernel equating is important because it has a direct impact on the equated test scores. The aim of this article is to examine the use of double smoothing when selecting bandwidths in kernel equating and to compare double smoothing with the commonly used penalty method. This comparison was made using both an equivalent…

Descriptors: Equated Scores, Data Analysis, Comparative Analysis, Simulation

Evaluating Equating Accuracy and Assumptions for Groups that Differ in Performance

Peer reviewed

Direct link

Powers, Sonya; Kolen, Michael J. – Journal of Educational Measurement, 2014

Accurate equating results are essential when comparing examinee scores across exam forms. Previous research indicates that equating results may not be accurate when group differences are large. This study compared the equating results of frequency estimation, chained equipercentile, item response theory (IRT) true-score, and IRT observed-score…

Descriptors: Accuracy, Equated Scores, Differences, Groups

Multidimensional CAT Item Selection Methods for Domain Scores and Composite Scores with Item Exposure Control and Content Constraints

Peer reviewed

Direct link

Yao, Lihua – Journal of Educational Measurement, 2014

The intent of this research was to find an item selection procedure in the multidimensional computer adaptive testing (CAT) framework that yielded higher precision for both the domain and composite abilities, had a higher usage of the item pool, and controlled the exposure rate. Five multidimensional CAT item selection procedures (minimum angle;…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection

Detection of Invalid Test Scores: The Usefulness of Simple Nonparametric Statistics

Peer reviewed

Direct link

Tendeiro, Jorge N.; Meijer, Rob R. – Journal of Educational Measurement, 2014

In recent guidelines for fair educational testing it is advised to check the validity of individual test scores through the use of person-fit statistics. For practitioners it is unclear on the basis of the existing literature which statistic to use. An overview of relatively simple existing nonparametric approaches to identify atypical response…

Descriptors: Educational Assessment, Test Validity, Scores, Statistical Analysis

Differential Item Functioning Assessment in Cognitive Diagnostic Modeling: Application of the Wald Test to Investigate DIF in the DINA Model

Peer reviewed

Direct link

Hou, Likun; de la Torre, Jimmy; Nandakumar, Ratna – Journal of Educational Measurement, 2014

Analyzing examinees' responses using cognitive diagnostic models (CDMs) has the advantage of providing diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this article, the Wald test is proposed to examine DIF in the context of CDMs. This study…

Descriptors: Test Bias, Models, Simulation, Error Patterns

A Comparison of Linking Methods for Estimating National Trends in International Comparative Large-Scale Assessments in the Presence of Cross-national DIF

Peer reviewed

Direct link

Sachse, Karoline A.; Roppelt, Alexander; Haag, Nicole – Journal of Educational Measurement, 2016

Trend estimation in international comparative large-scale assessments relies on measurement invariance between countries. However, cross-national differential item functioning (DIF) has been repeatedly documented. We ran a simulation study using national item parameters, which required trends to be computed separately for each country, to compare…

Descriptors: Comparative Analysis, Measurement, Test Bias, Simulation

Estimation Methods for One-Parameter Testlet Models

Peer reviewed

Direct link

Jiao, Hong; Wang, Shudong; He, Wei – Journal of Educational Measurement, 2013

This study demonstrated the equivalence between the Rasch testlet model and the three-level one-parameter testlet model and explored the Markov Chain Monte Carlo (MCMC) method for model parameter estimation in WINBUGS. The estimation accuracy from the MCMC method was compared with those from the marginalized maximum likelihood estimation (MMLE)…

Descriptors: Computation, Item Response Theory, Models, Monte Carlo Methods

Previous Page | Next Page »

Pages: 1 | 2

Lee, Won-Chan	2
Nandakumar, Ratna	2
Ali, Usama S.	1
Ames, Allison	1
Ansley, Timothy	1
Baxter, Gail P.	1
Brennan, Robert L.	1
Cahn, Miriam F.	1
Cai, Yan	1
Chang, Hua-Hua	1
Chen, Shu-Ying	1
Debeer, Dries	1
Deng, Hui	1
Dorans, Neil J.	1
Feuerstahler, Leah	1
Finch, Holmes	1
Haag, Nicole	1
Habing, Brian	1
He, Wei	1
Heritage, Margaret	1
Hills, John R.	1
Hocevar, Dennis	1
Hou, Likun	1
Huang, Feifei	1
More ▼