Showing 1 to 15 of 18 results
Peer reviewed
Beyza Aksu Dunya; Stefanie Wind – International Journal of Testing, 2025
We explored the practicality of relatively small item pools in the context of low-stakes Computer-Adaptive Testing (CAT), such as CAT procedures that might be used for quick diagnostic or screening exams. We used a basic CAT algorithm without content balancing and exposure control restrictions to reflect low-stakes testing scenarios. We examined…
Descriptors: Item Banks, Adaptive Testing, Computer Assisted Testing, Achievement
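To make the kind of procedure this abstract describes concrete, here is a minimal Python sketch (our illustration under assumed names and parameters, not the authors' code) of a single CAT step: maximum-information item selection from a small Rasch-calibrated pool, with no exposure control or content balancing.

import numpy as np

def item_information(theta, b):
    # Rasch item information: I(theta) = P(theta) * (1 - P(theta))
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    return p * (1.0 - p)

def select_next_item(theta_hat, difficulties, administered):
    # Pick the unadministered item with maximum information at the provisional ability.
    info = item_information(theta_hat, difficulties)
    info[list(administered)] = -np.inf   # mask items already given
    return int(np.argmax(info))

pool = np.random.default_rng(1).uniform(-2.0, 2.0, size=30)   # small 30-item pool
print(select_next_item(0.4, pool, administered={3, 7}))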
Peer reviewed
Rios, Joseph A.; Guo, Hongwen; Mao, Liyang; Liu, Ou Lydia – International Journal of Testing, 2017
When examinees' test-taking motivation is questionable, practitioners must determine whether careless responding is of practical concern and if so, decide on the best approach to filter such responses. As there has been insufficient research on these topics, the objectives of this study were to: a) evaluate the degree of underestimation in the…
Descriptors: Response Style (Tests), Scores, Motivation, Computation
Peer reviewed
Snow, Eric; Rutstein, Daisy; Basu, Satabdi; Bienkowski, Marie; Everson, Howard T. – International Journal of Testing, 2019
Computational thinking is a core skill in computer science that has become a focus of instruction in primary and secondary education worldwide. Since 2010, researchers have leveraged Evidence-Centered Design (ECD) methods to develop measures of students' Computational Thinking (CT) practices. This article describes how ECD was used to develop CT…
Descriptors: Evidence Based Practice, Test Construction, Computation, Cognitive Tests
Peer reviewed
Arce-Ferrer, Alvaro J.; Bulut, Okan – International Journal of Testing, 2017
This study examines separate and concurrent approaches to combine the detection of item parameter drift (IPD) and the estimation of scale transformation coefficients in the context of the common item nonequivalent groups design with the three-parameter item response theory equating. The study uses real and synthetic data sets to compare the two…
Descriptors: Item Response Theory, Equated Scores, Identification, Computation
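For context, one standard way to estimate scale transformation coefficients in the common item nonequivalent groups design is the mean/sigma method applied to the common items' difficulty parameters (our illustration; the article may use a different linking method, e.g., Stocking-Lord):
\[
A = \frac{\sigma\!\left(b^{Y}\right)}{\sigma\!\left(b^{X}\right)}, \qquad
B = \mu\!\left(b^{Y}\right) - A\,\mu\!\left(b^{X}\right), \qquad
a_j^{*} = \frac{a_j^{X}}{A}, \quad b_j^{*} = A\,b_j^{X} + B, \quad c_j^{*} = c_j^{X}.
\]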
Peer reviewed
Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Descriptors: Test Bias, Test Reliability, Performance, Scores
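The two reliability-related quantities named in the abstract correspond to the classical definitions, stated here for reference:
\[
\rho_{XX'} = \frac{\sigma^{2}_{T}}{\sigma^{2}_{X}}, \qquad
SE\!\left(\hat{\theta}\right) = \frac{1}{\sqrt{I(\theta)}},
\]
the ratio of true-score variance to observed-score variance and the standard error of the IRT ability estimate at a given information level.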
Peer reviewed
Maeda, Hotaka; Zhang, Bo – International Journal of Testing, 2017
The omega (ω) statistic is reputed to be one of the best indices for detecting answer copying on multiple choice tests, but its performance relies on the accurate estimation of copier ability, which is challenging because responses from the copiers may have been contaminated. We propose an algorithm that aims to identify and delete the suspected…
Descriptors: Cheating, Test Items, Mathematics, Statistics
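For reference, the ω statistic in its commonly cited form (following Wollack, 1997; whether the article uses this exact form is our assumption) standardizes the observed number of matching responses h_cs between suspected copier c and source s against its model-implied expectation:
\[
\omega = \frac{h_{cs} - \sum_{j} P_{j}}{\sqrt{\sum_{j} P_{j}\left(1 - P_{j}\right)}},
\]
where P_j is the model-based probability, given the copier's estimated ability, that the copier selects the option chosen by the source on item j.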
Peer reviewed
Sen, Sedat – International Journal of Testing, 2018
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Maximum Likelihood Statistics
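For reference, the mixed Rasch model referred to above applies the Rasch model within latent classes; a standard statement (our gloss, not text from the article) is
\[
P\!\left(X_{ij} = 1 \mid \theta_{i}, g\right) = \frac{\exp\!\left(\theta_{ig} - \beta_{jg}\right)}{1 + \exp\!\left(\theta_{ig} - \beta_{jg}\right)},
\]
where g indexes the latent class, θ_ig is person i's ability within class g, and β_jg is the class-specific difficulty of item j.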
Peer reviewed
Rutkowski, Leslie; Rutkowski, David; Zhou, Yan – International Journal of Testing, 2016
Using an empirically-based simulation study, we show that typically used methods of choosing an item calibration sample have significant impacts on achievement bias and system rankings. We examine whether recent PISA accommodations, especially for lower performing participants, can mitigate some of this bias. Our findings indicate that standard…
Descriptors: Simulation, International Programs, Adolescents, Student Evaluation
Peer reviewed
Chiu, Chia-Yi; Köhn, Hans-Friedrich; Wu, Huey-Min – International Journal of Testing, 2016
The Reduced Reparameterized Unified Model (Reduced RUM) is a diagnostic classification model for educational assessment that has received considerable attention among psychometricians. However, the computational options for researchers and practitioners who wish to use the Reduced RUM in their work, but do not feel comfortable writing their own…
Descriptors: Educational Diagnosis, Classification, Models, Educational Assessment
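For reference, the Reduced RUM item response function in its standard parameterization (our summary, not quoted from the article) is
\[
P\!\left(X_{ij} = 1 \mid \boldsymbol{\alpha}_{i}\right) = \pi_{j}^{*} \prod_{k} \left(r_{jk}^{*}\right)^{\,q_{jk}\left(1 - \alpha_{ik}\right)},
\]
where π*_j is the correct-response probability for examinees who master all attributes required by item j, r*_jk is the penalty for lacking required attribute k, and q_jk and α_ik are the Q-matrix and attribute-profile entries.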
Peer reviewed
Oliveri, Maria Elena; von Davier, Matthias – International Journal of Testing, 2014
In this article, we investigate the creation of comparable score scales across countries in international assessments. We examine potential improvements to current score scale calibration procedures used in international large-scale assessments. Our approach seeks to improve fairness in scoring international large-scale assessments, which often…
Descriptors: Test Bias, Scores, International Programs, Educational Assessment
Peer reviewed
Cui, Ying; Mousavi, Amin – International Journal of Testing, 2015
The current study applied the person-fit statistic, l_z, to data from a Canadian provincial achievement test to explore the usefulness of conducting person-fit analysis on large-scale assessments. Item parameter estimates were compared before and after the misfitting student responses, as identified by l_z, were removed. The…
Descriptors: Measurement, Achievement Tests, Comparative Analysis, Test Items
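A minimal sketch of the l_z statistic in its standard formulation (Drasgow, Levine, and Williams, 1985); the function and variable names below are ours, not the authors':

import numpy as np

def lz(responses, p):
    # responses: 0/1 vector; p: model-implied correct-response probabilities.
    responses, p = np.asarray(responses, float), np.asarray(p, float)
    l0 = np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    e_l0 = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    var_l0 = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (l0 - e_l0) / np.sqrt(var_l0)

# Large negative values flag misfitting (aberrant) response patterns.
print(lz([1, 1, 0, 1, 0], [0.9, 0.8, 0.6, 0.7, 0.3]))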
Peer reviewed
Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul – International Journal of Testing, 2011
We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…
Descriptors: Language Skills, Identification, Foreign Countries, Evaluation Methods
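The generalization described above can be illustrated schematically (a sketch under simulated data and assumed coding choices, not the authors' procedure) by comparing nested logistic regression models for one item, with and without group and group-by-ability terms, across more than two groups:

import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
n, n_groups = 900, 3
score = rng.normal(size=n)                         # matching variable (e.g., total score)
group = rng.integers(0, n_groups, size=n)          # reference group plus two focal groups
item = rng.binomial(1, 1 / (1 + np.exp(-score)))   # simulated dichotomous item responses

g_dummies = np.eye(n_groups)[group][:, 1:]         # drop the first group as reference
x_null = sm.add_constant(score)                    # ability only
x_full = sm.add_constant(np.column_stack([score, g_dummies,
                                           g_dummies * score[:, None]]))

m0 = sm.Logit(item, x_null).fit(disp=False)
m1 = sm.Logit(item, x_full).fit(disp=False)
lr = 2 * (m1.llf - m0.llf)                         # joint test of uniform and nonuniform DIF
df = x_full.shape[1] - x_null.shape[1]
print(lr, stats.chi2.sf(lr, df))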
Peer reviewed
Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M. – International Journal of Testing, 2010
Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…
Descriptors: Monte Carlo Methods, Simulation, Computer Assisted Testing, Adaptive Testing
Peer reviewed
Wyse, Adam E.; Mapuranga, Raymond – International Journal of Testing, 2009
Differential item functioning (DIF) analysis is a statistical technique used for ensuring the equity and fairness of educational assessments. This study formulates a new DIF analysis method using the information similarity index (ISI). ISI compares item information functions when data fits the Rasch model. Through simulations and an international…
Descriptors: Test Bias, Evaluation Methods, Test Items, Educational Assessment
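Under the Rasch model, the item information function that ISI compares across groups reduces to the following (stated for reference; the definition of the similarity index itself is given in the article):
\[
I_{j}(\theta) = P_{j}(\theta)\,\bigl(1 - P_{j}(\theta)\bigr), \qquad
P_{j}(\theta) = \frac{\exp\!\left(\theta - b_{j}\right)}{1 + \exp\!\left(\theta - b_{j}\right)}.
\]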
Peer reviewed
Kahraman, Nilufer; De Boeck, Paul; Janssen, Rianne – International Journal of Testing, 2009
This study introduces an approach for modeling multidimensional response data with construct-relevant group and domain factors. The item level parameter estimation process is extended to incorporate the refined effects of test dimension and group factors. Differences in item performances over groups are evaluated, distinguishing two levels of…
Descriptors: Test Bias, Test Items, Groups, Interaction