Showing all 10 results
Peer reviewed
Cao, Mengyang; Song, Q. Chelsea; Tay, Louis – International Journal of Testing, 2018
There is a growing use of noncognitive assessments around the world, and recent research has posited an ideal point response process underlying such measures. A critical issue is whether the typical use of dominance approaches (e.g., average scores, factor analysis, and Samejima's graded response model) in scoring such measures is adequate.…
Descriptors: Comparative Analysis, Item Response Theory, Factor Analysis, Models
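For the Cao, Song, and Tay (2018) entry above, a minimal sketch (not the authors' code) of the dominance-style scoring the abstract refers to: category probabilities under Samejima's graded response model for one polytomous item. The discrimination and threshold values are illustrative assumptions.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Category probabilities for one item under the graded response model.

    theta : latent trait value
    a     : discrimination parameter
    b     : increasing category thresholds (length = categories - 1)
    """
    b = np.asarray(b, dtype=float)
    # P(X >= k | theta) for k = 1..K-1, bounded by 1 (k = 0) and 0 (k = K)
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    cum = np.concatenate(([1.0], p_star, [0.0]))
    return cum[:-1] - cum[1:]          # P(X = k) = P*_k - P*_{k+1}

# Example: a 5-point Likert item (parameter values assumed)
print(grm_category_probs(theta=0.5, a=1.2, b=[-1.5, -0.5, 0.5, 1.5]))
```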
Peer reviewed
Eckes, Thomas; Jin, Kuan-Yu – International Journal of Testing, 2021
Severity and centrality are two main kinds of rater effects posing threats to the validity and fairness of performance assessments. Adopting Jin and Wang's (2018) extended facets modeling approach, we separately estimated the magnitude of rater severity and centrality effects in the web-based TestDaF (Test of German as a Foreign Language) writing…
Descriptors: Language Tests, German, Second Languages, Writing Tests
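For the Eckes and Jin (2021) entry above, a rough sketch of how rater severity and centrality can enter a facets-style rating model, loosely in the spirit of Jin and Wang's (2018) extension; this is not the TestDaF analysis code, and every parameter value is an illustrative assumption.

```python
import numpy as np

def rating_probs(theta, beta, severity, centrality, tau):
    """Category probabilities for examinee theta, task beta, and one rater.

    severity   : shifts all of the rater's ratings down (harsh) or up (lenient)
    centrality : weight on the thresholds; values > 1 spread the thresholds
                 and push ratings toward the middle categories
    tau        : centered category thresholds (length = categories - 1)
    """
    tau = np.asarray(tau, dtype=float)
    # adjacent-categories (partial-credit style) step terms
    steps = theta - beta - severity - centrality * tau
    logits = np.concatenate(([0.0], np.cumsum(steps)))
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

# A lenient, central rater vs. a neutral rater (values assumed)
print(rating_probs(0.3, 0.0, severity=-0.5, centrality=1.5, tau=[-1.0, 0.0, 1.0]))
print(rating_probs(0.3, 0.0, severity=0.0, centrality=1.0, tau=[-1.0, 0.0, 1.0]))
```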
Peer reviewed
Man, Kaiwen; Harring, Jeffery R.; Ouyang, Yunbo; Thomas, Sarah L. – International Journal of Testing, 2018
Many important high-stakes decisions--college admission, academic performance evaluation, and even job promotion--depend on accurate and reliable scores from valid large-scale assessments. However, examinees sometimes cheat by copying answers from other test-takers or practicing with test items ahead of time, which can undermine the effectiveness…
Descriptors: Reaction Time, High Stakes Tests, Test Wiseness, Cheating
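For the Man, Harring, Ouyang, and Thomas (2018) entry above, a minimal sketch of a common response-time idea in this cheating-detection literature (not necessarily the authors' method): a standardized lognormal residual that flags responses given much faster than the item's time demand and the examinee's working speed would predict. All parameter values are assumed.

```python
import numpy as np

def rt_residual(log_rt, time_demand, speed, precision):
    """Standardized lognormal residual; large negative values = suspiciously fast.

    log_rt      : observed log response time
    time_demand : item time-intensity parameter (on the log-seconds scale)
    speed       : examinee speed parameter (higher = faster)
    precision   : item-specific inverse standard deviation of log times
    """
    expected = time_demand - speed
    return precision * (log_rt - expected)

# Example: an examinee answers a 60-second item in 5 seconds (values assumed)
print(rt_residual(np.log(5.0), time_demand=np.log(60.0), speed=0.0, precision=2.0))
```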
Peer reviewed
Maeda, Hotaka; Zhang, Bo – International Journal of Testing, 2017
The omega (ω) statistic is reputed to be one of the best indices for detecting answer copying on multiple choice tests, but its performance relies on the accurate estimation of copier ability, which is challenging because responses from the copiers may have been contaminated. We propose an algorithm that aims to identify and delete the suspected…
Descriptors: Cheating, Test Items, Mathematics, Statistics
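For the Maeda and Zhang (2017) entry above, a minimal sketch of the basic form of the omega statistic the abstract evaluates: the observed number of answer matches between a suspected copier and a source, standardized against the number expected from the copier's estimated ability. In practice the per-item match probabilities come from an IRT model; here they are assumed constants, and the authors' proposed deletion algorithm is not reproduced.

```python
import numpy as np

def omega(matches_observed, p_match):
    """Standardized difference between observed and expected answer matches."""
    p = np.asarray(p_match, dtype=float)
    expected = p.sum()                       # expected matches given copier ability
    sd = np.sqrt((p * (1.0 - p)).sum())      # SD of a sum of independent Bernoullis
    return (matches_observed - expected) / sd

# Example: 38 matches on 40 items when ~30% matching is expected by ability alone
print(omega(38, p_match=np.full(40, 0.30)))
```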
Peer reviewed
Sen, Sedat – International Journal of Testing, 2018
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Maximum Likelihood Statistics
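For the Sen (2018) entry above, a minimal sketch of the class-enumeration step such studies turn on: mixed Rasch solutions with increasing numbers of latent classes are fitted by maximum likelihood and compared with an information criterion such as BIC. The log-likelihoods, parameter counts, and sample size below are illustrative assumptions, not values from the study.

```python
import numpy as np

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; smaller is better."""
    return -2.0 * log_lik + n_params * np.log(n_obs)

# assumed maximum-likelihood results for 1-, 2-, and 3-class solutions
fits = {1: (-10450.0, 21), 2: (-10310.0, 43), 3: (-10302.0, 65)}
n_obs = 1000
for k, (ll, p) in fits.items():
    print(f"{k} classes: BIC = {bic(ll, p, n_obs):.1f}")
# The smallest BIC would indicate the preferred number of latent classes.
```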
Peer reviewed
Choi, Youn-Jeng; Alexeev, Natalia; Cohen, Allan S. – International Journal of Testing, 2015
The purpose of this study was to explore what may be contributing to differences in performance in mathematics on the Trends in International Mathematics and Science Study 2007. This was done by using a mixture item response theory modeling approach to first detect latent classes in the data and then to examine differences in performance on items…
Descriptors: Test Bias, Mathematics Achievement, Mathematics Tests, Item Response Theory
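For the Choi, Alexeev, and Cohen (2015) entry above, a minimal sketch of the second step the abstract describes: once a mixture IRT model has identified latent classes, class-specific item difficulty estimates can be compared item by item to locate where the classes perform differently. The difficulty values and flagging threshold are illustrative assumptions.

```python
import numpy as np

# assumed class-specific item difficulty estimates from a fitted mixture model
diff_class1 = np.array([-0.8, 0.1, 0.9, 1.4])
diff_class2 = np.array([-0.7, 0.5, 0.2, 1.5])

gap = diff_class2 - diff_class1
for i, d in enumerate(gap, start=1):
    flag = "  <- large between-class difference" if abs(d) > 0.3 else ""
    print(f"item {i}: difficulty difference = {d:+.2f}{flag}")
```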
Peer reviewed
In'nami, Yo; Koizumi, Rie – International Journal of Testing, 2013
The importance of sample size, although widely discussed in the literature on structural equation modeling (SEM), has not been widely recognized among applied SEM researchers. To narrow this gap, we focus on second language testing and learning studies and examine the following: (a) Is the sample size sufficient in terms of precision and power of…
Descriptors: Structural Equation Models, Sample Size, Second Language Instruction, Monte Carlo Methods
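For the In'nami and Koizumi (2013) entry above, a minimal sketch of the Monte Carlo logic behind questions of sample-size adequacy, precision, and power: simulate data from an assumed population model many times, refit the analysis model, and record how often the focal parameter is detected. For brevity the sketch uses a single regression path rather than a full SEM; a real study would use SEM software, and all values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def power_for_n(n, beta=0.3, reps=2000, alpha_z=1.96):
    """Approximate power to detect a standardized path of size beta at sample size n."""
    hits = 0
    for _ in range(reps):
        x = rng.standard_normal(n)
        y = beta * x + rng.standard_normal(n) * np.sqrt(1.0 - beta**2)
        b = (x @ y) / (x @ x)                             # OLS slope
        resid = y - b * x
        se = np.sqrt((resid @ resid) / (n - 2) / (x @ x))
        hits += abs(b / se) > alpha_z                     # approximate z test
    return hits / reps

for n in (50, 100, 200):
    print(n, round(power_for_n(n), 3))
```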
Peer reviewed
Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M. – International Journal of Testing, 2010
Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…
Descriptors: Monte Carlo Methods, Simulation, Computer Assisted Testing, Adaptive Testing
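For the Schmitt, Sass, Sullivan, and Walker (2010) entry above, a minimal sketch of the kind of response-pattern manipulation the abstract describes: an examinee runs out of time and rapidly guesses on the last few items, so end-of-test responses stop reflecting ability. Ability, item difficulties, and the guessing rate are assumed values, and the CAT item-selection machinery is omitted.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_responses(theta, difficulties, n_speeded, p_guess=0.25):
    """Simulate 0/1 responses, with rapid guessing on the last n_speeded items."""
    b = np.asarray(difficulties, dtype=float)
    p = 1.0 / (1.0 + np.exp(-(theta - b)))   # Rasch success probabilities
    if n_speeded:
        p[-n_speeded:] = p_guess             # end-of-test items answered by guessing
    return (rng.random(b.size) < p).astype(int)

# Example: a 20-item test where the last 5 items are speeded (values assumed)
print(simulate_responses(theta=1.0, difficulties=np.linspace(-2, 2, 20), n_speeded=5))
```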
Peer reviewed
De Ayala, Ralph J.; Kim, Seock-Ho; Stapleton, Laura M.; Dayton, C. Mitchell – International Journal of Testing, 2002
Conducted a Monte Carlo study to compare various approaches to detecting differential item functioning (DIF) under a conceptualization of DIF that recognizes that observed data are a mixture of data from multiple latent populations or classes. Demonstrated the usefulness of the approach. (SLD)
Descriptors: Data Analysis, Item Bias, Monte Carlo Methods, Simulation
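For the De Ayala, Kim, Stapleton, and Dayton (2002) entry above, a minimal sketch of one standard DIF procedure of the kind such Monte Carlo comparisons typically include (not necessarily one this study used): the Mantel-Haenszel common odds ratio for a single item across total-score strata, with the ETS delta transformation. The stratum counts are illustrative assumptions.

```python
import numpy as np

# per stratum: (A, B, C, D) = reference-correct, reference-incorrect,
#                             focal-correct, focal-incorrect (assumed counts)
strata = np.array([
    [30, 20, 22, 28],
    [45, 15, 35, 25],
    [60, 10, 50, 20],
], dtype=float)

A, B, C, D = strata.T
n = strata.sum(axis=1)
alpha_mh = (A * D / n).sum() / (B * C / n).sum()   # common odds ratio
delta_mh = -2.35 * np.log(alpha_mh)                 # ETS delta metric
print(round(alpha_mh, 3), round(delta_mh, 3))
```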
Peer reviewed
Bartfay, Emma – International Journal of Testing, 2003
Used Monte Carlo simulation to compare the properties of a goodness-of-fit (GOF) procedure and a test statistic developed by E. Bartfay and A. Donner (2001) to the likelihood ratio test in assessing the existence of extra variation. Results show the GOF procedure possesses a satisfactory Type I error rate and power. (SLD)
Descriptors: Goodness of Fit, Interrater Reliability, Monte Carlo Methods, Simulation
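For the Bartfay (2003) entry above, a minimal sketch of the likelihood ratio test used as the comparison procedure: twice the difference in maximized log-likelihoods between models with and without the extra-variation parameter is referred to a chi-square distribution. The log-likelihood values are assumed, and boundary corrections to the reference distribution are ignored.

```python
from scipy.stats import chi2

ll_null = -412.7   # model assuming no extra variation (assumed value)
ll_full = -408.1   # model allowing extra variation (assumed value)

lr_stat = 2.0 * (ll_full - ll_null)
p_value = chi2.sf(lr_stat, df=1)   # one extra parameter in the full model
print(round(lr_stat, 2), round(p_value, 4))
```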