Showing all 11 results
Peer reviewed | PDF on ERIC (download full text)
Ayse Bilicioglu Gunes; Bayram Bicak – International Journal of Assessment Tools in Education, 2023
The main purpose of this study is to examine the Type I error and statistical power ratios of Differential Item Functioning (DIF) techniques based on different theories under different conditions. For this purpose, a simulation study was conducted by using Mantel-Haenszel (MH), Logistic Regression (LR), Lord's chi-squared, and Raju's Areas…
Descriptors: Test Items, Item Response Theory, Error of Measurement, Test Bias
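The entry above compares classical DIF detectors such as Mantel-Haenszel. As a point of reference, here is a minimal Python sketch of the MH common odds ratio for one studied item, stratified on the raw total score; it illustrates the general technique, not the authors' implementation, and the -2.35 rescaling is the conventional ETS delta constant.

```python
import numpy as np

def mantel_haenszel_dif(responses, item, group):
    """MH common odds ratio for one studied item.

    responses: (n_persons, n_items) 0/1 matrix
    item:      index of the studied item
    group:     0 = reference, 1 = focal (length n_persons)
    """
    total = responses.sum(axis=1)                 # matching variable: raw total score
    num = den = 0.0
    for k in np.unique(total):
        stratum = total == k
        ref = stratum & (group == 0)
        foc = stratum & (group == 1)
        a = responses[ref, item].sum()            # reference correct
        b = ref.sum() - a                         # reference incorrect
        c = responses[foc, item].sum()            # focal correct
        d = foc.sum() - c                         # focal incorrect
        t = a + b + c + d
        if t == 0:
            continue
        num += a * d / t
        den += b * c / t
    alpha_mh = num / den                          # common odds ratio across strata
    return alpha_mh, -2.35 * np.log(alpha_mh)     # MH D-DIF on the ETS delta scale

# toy data: 500 examinees, 20 items, random group membership
rng = np.random.default_rng(0)
resp = (rng.random((500, 20)) < 0.6).astype(int)
grp = rng.integers(0, 2, 500)
print(mantel_haenszel_dif(resp, item=0, group=grp))
```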
Peer reviewed | PDF on ERIC (download full text)
Mehmet Fatih Doguyurt; Seref Tan – International Journal of Assessment Tools in Education, 2025
This study investigates the impact of violating the local item independence assumption by loading certain items onto a second dimension on test equating errors in unidimensional and dichotomous tests. The research was designed as a simulation study, using data generated based on the PISA 2018 mathematics exam. Analyses were conducted under 36…
Descriptors: Equated Scores, Test Items, Mathematics Tests, International Assessment
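For readers who want to see what "loading certain items onto a second dimension" looks like as a data-generating model, the following sketch simulates dichotomous responses from a compensatory two-dimensional 2PL in which only a subset of items loads on the nuisance dimension. Sample sizes, loadings, and the number of contaminated items are illustrative placeholders, not the study's conditions.

```python
import numpy as np

rng = np.random.default_rng(42)
n_persons, n_items, n_second = 1000, 30, 6    # last 6 items violate local independence

theta1 = rng.normal(size=n_persons)           # primary (target) ability
theta2 = rng.normal(size=n_persons)           # nuisance second dimension
a1 = rng.lognormal(mean=0.0, sigma=0.3, size=n_items)
b  = rng.normal(size=n_items)
a2 = np.zeros(n_items)
a2[-n_second:] = 0.8                          # secondary loading on the contaminated items

# compensatory M2PL: P(X = 1) = logistic(a1*theta1 + a2*theta2 - a1*b)
logit = np.outer(theta1, a1) + np.outer(theta2, a2) - a1 * b
prob = 1.0 / (1.0 + np.exp(-logit))
responses = (rng.random(prob.shape) < prob).astype(int)
print(responses.shape, responses.mean())
```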
Peer reviewed | Direct link
Yasuda, Jun-ichiro; Mae, Naohiro; Hull, Michael M.; Taniguchi, Masa-aki – Physical Review Physics Education Research, 2021
As a method to shorten the test time of the Force Concept Inventory (FCI), we suggest the use of computerized adaptive testing (CAT). CAT is the process of administering a test on a computer, with items (i.e., questions) selected based upon the responses of the examinee to prior items. In so doing, the test length can be significantly shortened.…
Descriptors: Foreign Countries, College Students, Student Evaluation, Computer Assisted Testing
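The abstract describes CAT as selecting each item from the examinee's responses to prior items. A common selection rule (only a sketch, assuming a 2PL item bank; the FCI-CAT in the article may use a different rule) is maximum Fisher information at the current ability estimate:

```python
import numpy as np

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a ** 2 * p * (1.0 - p)

def next_item(theta_hat, a, b, administered):
    """Pick the unadministered item with maximum information at the current estimate."""
    info = item_information(theta_hat, a, b)
    info[list(administered)] = -np.inf        # never re-administer an item
    return int(np.argmax(info))

# toy bank of 30 items
rng = np.random.default_rng(1)
a = rng.lognormal(0.0, 0.3, 30)
b = rng.normal(0.0, 1.0, 30)
print(next_item(theta_hat=0.0, a=a, b=b, administered={3, 7}))
```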
Peer reviewed | Direct link
Lee, HyeSun – Applied Measurement in Education, 2018
The current simulation study examined the effects of Item Parameter Drift (IPD) occurring in a short scale on parameter estimates in multilevel models where scores from a scale were employed as a time-varying predictor to account for outcome scores. Five factors, including three decisions about IPD, were considered for simulation conditions. It…
Descriptors: Test Items, Hierarchical Linear Modeling, Predictor Variables, Scores
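As a rough illustration of the kind of item parameter drift manipulated here, the sketch below lets the difficulty of a few items shift across measurement occasions before wave-specific raw scores are computed; all numbers are placeholders rather than the study's simulation conditions.

```python
import numpy as np

rng = np.random.default_rng(7)
n_persons, n_items, n_waves = 300, 10, 4
drift_items, drift_per_wave = [0, 1], 0.15    # two items drift by 0.15 logits per wave

a = rng.lognormal(0.0, 0.3, n_items)
b0 = rng.normal(0.0, 1.0, n_items)
theta = rng.normal(0.0, 1.0, n_persons)

scores = np.empty((n_persons, n_waves))
for w in range(n_waves):
    b = b0.copy()
    b[drift_items] += drift_per_wave * w      # difficulty drifts over occasions
    p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
    scores[:, w] = (rng.random(p.shape) < p).sum(axis=1)   # raw score per wave

print(scores.mean(axis=0))                    # mean scale score at each occasion
```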
Peer reviewed | Direct link
Cao, Mengyang; Tay, Louis; Liu, Yaowu – Educational and Psychological Measurement, 2017
This study examined the performance of a proposed iterative Wald approach for detecting differential item functioning (DIF) between two groups when preknowledge of anchor items is absent. The iterative approach utilizes the Wald-2 approach to identify anchor items and then iteratively tests for DIF items with the Wald-1 approach. Monte Carlo…
Descriptors: Monte Carlo Methods, Test Items, Test Bias, Error of Measurement
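The abstract outlines the procedure's structure: screen all items with Wald-2 to find anchors, then retest with Wald-1 until the flagged set stabilizes. The skeleton below captures only that control flow; `wald2_test` and `wald1_test` are hypothetical callables standing in for the IRT-based Wald statistics, which would come from an actual calibration rather than from this sketch.

```python
def iterative_wald_dif(items, wald2_test, wald1_test, alpha=0.05, max_iter=20):
    """Two-stage skeleton: Wald-2 screens every item for anchors, then Wald-1
    retests against the current anchors until the flagged set stops changing."""
    p2 = wald2_test(items)                               # stage 1: screen every item
    anchors = {i for i in items if p2[i] >= alpha}       # non-significant items anchor the scale
    flagged = set(items) - anchors
    for _ in range(max_iter):
        p1 = wald1_test(items, anchors)                  # stage 2: retest with current anchors
        new_flagged = {i for i in items if p1[i] < alpha}
        if new_flagged == flagged:
            break                                        # converged: flagged set is stable
        flagged = new_flagged
        anchors = set(items) - flagged
    return flagged, anchors

# toy usage with made-up p-values standing in for real Wald tests
items = list(range(5))
fake_p = {0: 0.70, 1: 0.01, 2: 0.40, 3: 0.03, 4: 0.60}
print(iterative_wald_dif(items, lambda it: fake_p, lambda it, anc: fake_p))
```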
Peer reviewed | Direct link
Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Descriptors: Test Bias, Test Reliability, Performance, Scores
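The reliability definition referenced in this abstract, the ratio of true-score variance to observed-score variance, is simple enough to state directly; the toy example below just checks it against the classical expectation when the error variance is known.

```python
import numpy as np

def reliability(true_scores, observed_scores):
    """Reliability as the ratio of true-score variance to observed-score variance."""
    return np.var(true_scores) / np.var(observed_scores)

# toy example: observed = true + independent error with variance 0.25
rng = np.random.default_rng(3)
true = rng.normal(0, 1, 10_000)
observed = true + rng.normal(0, 0.5, 10_000)
print(round(reliability(true, observed), 3))   # approx 1 / (1 + 0.25) = 0.8
```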
Peer reviewed | PDF on ERIC (download full text)
Atalay Kabasakal, Kübra; Arsan, Nihan; Gök, Bilge; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2014
This simulation study compared the performances (Type I error and power) of Mantel-Haenszel (MH), SIBTEST, and item response theory-likelihood ratio (IRT-LR) methods under certain conditions. Manipulated factors were sample size, ability differences between groups, test length, the percentage of differential item functioning (DIF), and underlying…
Descriptors: Comparative Analysis, Item Response Theory, Statistical Analysis, Test Bias
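Type I error and power in studies like this are usually computed as empirical rejection rates over replications, separately for items simulated without and with DIF. A minimal sketch (my own illustration, not the authors' code):

```python
import numpy as np

def rejection_rates(p_values, dif_items, alpha=0.05):
    """Type I error = rejection rate over non-DIF items;
    power = rejection rate over items simulated with DIF.

    p_values:  (n_replications, n_items) array of per-item p-values
    dif_items: indices of items generated with DIF
    """
    p_values = np.asarray(p_values)
    null_items = np.setdiff1d(np.arange(p_values.shape[1]), dif_items)
    reject = p_values < alpha
    return reject[:, null_items].mean(), reject[:, dif_items].mean()

# toy example: 100 replications, 20 items, items 0-1 carry DIF
rng = np.random.default_rng(5)
p = rng.random((100, 20))
p[:, :2] = rng.beta(0.3, 3.0, (100, 2))   # DIF items tend to yield small p-values
print(rejection_rates(p, dif_items=[0, 1]))
```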
Peer reviewed | Direct link
DeMars, Christine E. – Journal of Educational and Behavioral Statistics, 2009
The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…
Descriptors: Regression (Statistics), Test Bias, Error of Measurement, True Scores
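For context on the LR procedure discussed here, a common form is a likelihood-ratio test comparing a matching-variable-only model with a model that adds group and group-by-matching terms. The sketch below uses statsmodels and a simplified continuous matching variable; it illustrates the technique rather than reproducing the article's setup.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

def lr_dif_test(y, matching, group):
    """Likelihood-ratio LR DIF test for one item:
    compare  logit(y) ~ matching                              (no DIF)
    against  logit(y) ~ matching + group + matching*group     (uniform + nonuniform DIF)."""
    X0 = sm.add_constant(np.column_stack([matching]))
    X1 = sm.add_constant(np.column_stack([matching, group, matching * group]))
    m0 = sm.Logit(y, X0).fit(disp=0)
    m1 = sm.Logit(y, X1).fit(disp=0)
    g2 = 2 * (m1.llf - m0.llf)                 # likelihood-ratio statistic, df = 2
    return g2, chi2.sf(g2, df=2)

# toy data: the item favours the reference group at equal matching scores
rng = np.random.default_rng(11)
n = 2000
group = rng.integers(0, 2, n)
matching = rng.normal(0, 1, n)
p = 1 / (1 + np.exp(-(matching - 0.5 * group)))
y = (rng.random(n) < p).astype(int)
print(lr_dif_test(y, matching, group))
```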
Peer reviewed | Direct link
Finch, Holmes – Applied Psychological Measurement, 2010
The accuracy of item parameter estimates in the multidimensional item response theory (MIRT) model context is one that has not been researched in great detail. This study examines the ability of two confirmatory factor analysis models specifically for dichotomous data to properly estimate item parameters using common formulae for converting factor…
Descriptors: Item Response Theory, Computation, Factor Analysis, Models
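The abstract is truncated, but the "common formulae" for moving between a categorical factor-analysis parameterization and IRT item parameters are usually the normal-ogive conversions from a standardized loading and threshold. The sketch below states one widely cited version, which may differ in detail from the formulae used in the study.

```python
import numpy as np

def loading_to_irt(loading, threshold, D=1.702):
    """Commonly cited conversion from a standardized factor loading and item
    threshold to 2PL discrimination and difficulty (normal-ogive metric;
    D approximately 1.702 rescales to the logistic metric):
        a = D * loading / sqrt(1 - loading**2)
        b = threshold / loading
    """
    a = D * loading / np.sqrt(1.0 - loading ** 2)
    b = threshold / loading
    return a, b

print(loading_to_irt(loading=0.6, threshold=0.3))
```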
Peer reviewed | Direct link
Finch, Holmes – Applied Psychological Measurement, 2005
This study compares the ability of the multiple indicators, multiple causes (MIMIC) confirmatory factor analysis model to correctly identify cases of differential item functioning (DIF) with more established methods. Although the MIMIC model might have application in identifying DIF for multiple grouping variables, there has been little…
Descriptors: Identification, Factor Analysis, Test Bias, Models
Peer reviewed | Direct link
Wang, Wen-Chung; Su, Ya-Hui – Applied Psychological Measurement, 2004
Eight independent variables (differential item functioning [DIF] detection method, purification procedure, item response model, mean latent trait difference between groups, test length, DIF pattern, magnitude of DIF, and percentage of DIF items) were manipulated, and two dependent variables (Type I error and power) were assessed through…
Descriptors: Test Length, Test Bias, Simulation, Item Response Theory
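With eight crossed factors, a design like this is conveniently enumerated as a Cartesian product of factor levels. The levels below are illustrative placeholders only (the abstract does not list them); the point is just the fully crossed layout.

```python
from itertools import product

# Every combination of the manipulated factors becomes one simulation cell.
# Levels are placeholders, not the levels used in the study.
factors = {
    "dif_method":    ["MH", "IRT-LR"],
    "purification":  [False, True],
    "irt_model":     ["1PL", "2PL", "3PL"],
    "mean_diff":     [0.0, 0.5, 1.0],
    "test_length":   [20, 40],
    "dif_pattern":   ["uniform", "nonuniform"],
    "dif_magnitude": [0.4, 0.8],
    "pct_dif_items": [0.1, 0.2],
}
cells = [dict(zip(factors, combo)) for combo in product(*factors.values())]
print(len(cells), "design cells")
```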