Showing all 10 results
Peer reviewed
Cao, Mengyang; Song, Q. Chelsea; Tay, Louis – International Journal of Testing, 2018
There is a growing use of noncognitive assessments around the world, and recent research has posited an ideal point response process underlying such measures. A critical issue is whether the typical use of dominance approaches (e.g., average scores, factor analysis, and Samejima's graded response model) in scoring such measures is adequate.…
Descriptors: Comparative Analysis, Item Response Theory, Factor Analysis, Models
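For the Cao, Song, and Tay (2018) entry above, a minimal sketch (not the authors' code) of the dominance-style scoring the abstract refers to: category probabilities under Samejima's graded response model for one polytomous item. The discrimination and threshold values are illustrative assumptions.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Category probabilities for one item under the graded response model.

    theta : latent trait value
    a     : discrimination parameter
    b     : increasing category thresholds (length = categories - 1)
    """
    b = np.asarray(b, dtype=float)
    # P(X >= k | theta) for k = 1..K-1, bounded by 1 (k = 0) and 0 (k = K)
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    cum = np.concatenate(([1.0], p_star, [0.0]))
    return cum[:-1] - cum[1:]          # P(X = k) = P*_k - P*_{k+1}

# Example: a 5-point Likert item (parameter values assumed)
print(grm_category_probs(theta=0.5, a=1.2, b=[-1.5, -0.5, 0.5, 1.5]))
```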
Peer reviewed
Eckes, Thomas; Jin, Kuan-Yu – International Journal of Testing, 2021
Severity and centrality are two main kinds of rater effects posing threats to the validity and fairness of performance assessments. Adopting Jin and Wang's (2018) extended facets modeling approach, we separately estimated the magnitude of rater severity and centrality effects in the web-based TestDaF (Test of German as a Foreign Language) writing…
Descriptors: Language Tests, German, Second Languages, Writing Tests
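For the Eckes and Jin (2021) entry above, a rough sketch of how rater severity and centrality can enter a facets-style rating model, loosely in the spirit of Jin and Wang's (2018) extension; this is not the TestDaF analysis code, and every parameter value is an illustrative assumption.

```python
import numpy as np

def rating_probs(theta, beta, severity, centrality, tau):
    """Category probabilities for examinee theta, task beta, and one rater.

    severity   : shifts all of the rater's ratings down (harsh) or up (lenient)
    centrality : weight on the thresholds; values > 1 spread the thresholds
                 and push ratings toward the middle categories
    tau        : centered category thresholds (length = categories - 1)
    """
    tau = np.asarray(tau, dtype=float)
    # adjacent-categories (partial-credit style) step terms
    steps = theta - beta - severity - centrality * tau
    logits = np.concatenate(([0.0], np.cumsum(steps)))
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

# A lenient, central rater vs. a neutral rater (values assumed)
print(rating_probs(0.3, 0.0, severity=-0.5, centrality=1.5, tau=[-1.0, 0.0, 1.0]))
print(rating_probs(0.3, 0.0, severity=0.0, centrality=1.0, tau=[-1.0, 0.0, 1.0]))
```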
Peer reviewed
Man, Kaiwen; Harring, Jeffery R.; Ouyang, Yunbo; Thomas, Sarah L. – International Journal of Testing, 2018
Many important high-stakes decisions--college admission, academic performance evaluation, and even job promotion--depend on accurate and reliable scores from valid large-scale assessments. However, examinees sometimes cheat by copying answers from other test-takers or practicing with test items ahead of time, which can undermine the effectiveness…
Descriptors: Reaction Time, High Stakes Tests, Test Wiseness, Cheating
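For the Man, Harring, Ouyang, and Thomas (2018) entry above, a minimal sketch of a common response-time idea in this cheating-detection literature (not necessarily the authors' method): a standardized lognormal residual that flags responses given much faster than the item's time demand and the examinee's working speed would predict. All parameter values are assumed.

```python
import numpy as np

def rt_residual(log_rt, time_demand, speed, precision):
    """Standardized lognormal residual; large negative values = suspiciously fast.

    log_rt      : observed log response time
    time_demand : item time-intensity parameter (on the log-seconds scale)
    speed       : examinee speed parameter (higher = faster)
    precision   : item-specific inverse standard deviation of log times
    """
    expected = time_demand - speed
    return precision * (log_rt - expected)

# Example: an examinee answers a 60-second item in 5 seconds (values assumed)
print(rt_residual(np.log(5.0), time_demand=np.log(60.0), speed=0.0, precision=2.0))
```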
Peer reviewed
Maeda, Hotaka; Zhang, Bo – International Journal of Testing, 2017
The omega (ω) statistic is reputed to be one of the best indices for detecting answer copying on multiple choice tests, but its performance relies on the accurate estimation of copier ability, which is challenging because responses from the copiers may have been contaminated. We propose an algorithm that aims to identify and delete the suspected…
Descriptors: Cheating, Test Items, Mathematics, Statistics
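For the Maeda and Zhang (2017) entry above, a minimal sketch of the basic form of the omega statistic the abstract evaluates: the observed number of answer matches between a suspected copier and a source, standardized against the number expected from the copier's estimated ability. In practice the per-item match probabilities come from an IRT model; here they are assumed constants, and the authors' proposed deletion algorithm is not reproduced.

```python
import numpy as np

def omega(matches_observed, p_match):
    """Standardized difference between observed and expected answer matches."""
    p = np.asarray(p_match, dtype=float)
    expected = p.sum()                       # expected matches given copier ability
    sd = np.sqrt((p * (1.0 - p)).sum())      # SD of a sum of independent Bernoullis
    return (matches_observed - expected) / sd

# Example: 38 matches on 40 items when ~30% matching is expected by ability alone
print(omega(38, p_match=np.full(40, 0.30)))
```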
Peer reviewed
Sen, Sedat – International Journal of Testing, 2018
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Maximum Likelihood Statistics
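For the Sen (2018) entry above, a minimal sketch of the class-enumeration step such studies turn on: mixed Rasch solutions with increasing numbers of latent classes are fitted by maximum likelihood and compared with an information criterion such as BIC. The log-likelihoods, parameter counts, and sample size below are illustrative assumptions, not values from the study.

```python
import numpy as np

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; smaller is better."""
    return -2.0 * log_lik + n_params * np.log(n_obs)

# assumed maximum-likelihood results for 1-, 2-, and 3-class solutions
fits = {1: (-10450.0, 21), 2: (-10310.0, 43), 3: (-10302.0, 65)}
n_obs = 1000
for k, (ll, p) in fits.items():
    print(f"{k} classes: BIC = {bic(ll, p, n_obs):.1f}")
# The smallest BIC would indicate the preferred number of latent classes.
```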
Peer reviewed
Choi, Youn-Jeng; Alexeev, Natalia; Cohen, Allan S. – International Journal of Testing, 2015
The purpose of this study was to explore what may be contributing to differences in performance in mathematics on the Trends in International Mathematics and Science Study 2007. This was done by using a mixture item response theory modeling approach to first detect latent classes in the data and then to examine differences in performance on items…
Descriptors: Test Bias, Mathematics Achievement, Mathematics Tests, Item Response Theory
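For the Choi, Alexeev, and Cohen (2015) entry above, a minimal sketch of the second step the abstract describes: once a mixture IRT model has identified latent classes, class-specific item difficulty estimates can be compared item by item to locate where the classes perform differently. The difficulty values and flagging threshold are illustrative assumptions.

```python
import numpy as np

# assumed class-specific item difficulty estimates from a fitted mixture model
diff_class1 = np.array([-0.8, 0.1, 0.9, 1.4])
diff_class2 = np.array([-0.7, 0.5, 0.2, 1.5])

gap = diff_class2 - diff_class1
for i, d in enumerate(gap, start=1):
    flag = "  <- large between-class difference" if abs(d) > 0.3 else ""
    print(f"item {i}: difficulty difference = {d:+.2f}{flag}")
```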
Peer reviewed
In'nami, Yo; Koizumi, Rie – International Journal of Testing, 2013
The importance of sample size, although widely discussed in the literature on structural equation modeling (SEM), has not been widely recognized among applied SEM researchers. To narrow this gap, we focus on second language testing and learning studies and examine the following: (a) Is the sample size sufficient in terms of precision and power of…
Descriptors: Structural Equation Models, Sample Size, Second Language Instruction, Monte Carlo Methods
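For the In'nami and Koizumi (2013) entry above, a minimal sketch of the Monte Carlo logic behind questions of sample-size adequacy, precision, and power: simulate data from an assumed population model many times, refit the analysis model, and record how often the focal parameter is detected. For brevity the sketch uses a single regression path rather than a full SEM; a real study would use SEM software, and all values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def power_for_n(n, beta=0.3, reps=2000, alpha_z=1.96):
    """Approximate power to detect a standardized path of size beta at sample size n."""
    hits = 0
    for _ in range(reps):
        x = rng.standard_normal(n)
        y = beta * x + rng.standard_normal(n) * np.sqrt(1.0 - beta**2)
        b = (x @ y) / (x @ x)                             # OLS slope
        resid = y - b * x
        se = np.sqrt((resid @ resid) / (n - 2) / (x @ x))
        hits += abs(b / se) > alpha_z                     # approximate z test
    return hits / reps

for n in (50, 100, 200):
    print(n, round(power_for_n(n), 3))
```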
Peer reviewed
Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M. – International Journal of Testing, 2010
Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…
Descriptors: Monte Carlo Methods, Simulation, Computer Assisted Testing, Adaptive Testing
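For the Schmitt, Sass, Sullivan, and Walker (2010) entry above, a minimal sketch of the kind of response-pattern manipulation the abstract describes: an examinee runs out of time and rapidly guesses on the last few items, so end-of-test responses stop reflecting ability. Ability, item difficulties, and the guessing rate are assumed values, and the CAT item-selection machinery is omitted.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_responses(theta, difficulties, n_speeded, p_guess=0.25):
    """Simulate 0/1 responses, with rapid guessing on the last n_speeded items."""
    b = np.asarray(difficulties, dtype=float)
    p = 1.0 / (1.0 + np.exp(-(theta - b)))   # Rasch success probabilities
    if n_speeded:
        p[-n_speeded:] = p_guess             # end-of-test items answered by guessing
    return (rng.random(b.size) < p).astype(int)

# Example: a 20-item test where the last 5 items are speeded (values assumed)
print(simulate_responses(theta=1.0, difficulties=np.linspace(-2, 2, 20), n_speeded=5))
```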
Peer reviewed
De Ayala, Ralph J.; Kim, Seock-Ho; Stapleton, Laura M.; Dayton, C. Mitchell – International Journal of Testing, 2002
Conducted a Monte Carlo study to compare various approaches to detecting differential item functioning (DIF) under a conceptualization of DIF that recognizes that observed data are a mixture of data from multiple latent populations or classes. Demonstrated the usefulness of the approach. (SLD)
Descriptors: Data Analysis, Item Bias, Monte Carlo Methods, Simulation
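For the De Ayala, Kim, Stapleton, and Dayton (2002) entry above, a minimal sketch of one standard DIF procedure of the kind such Monte Carlo comparisons typically include (not necessarily one this study used): the Mantel-Haenszel common odds ratio for a single item across total-score strata, with the ETS delta transformation. The stratum counts are illustrative assumptions.

```python
import numpy as np

# per stratum: (A, B, C, D) = reference-correct, reference-incorrect,
#                             focal-correct, focal-incorrect (assumed counts)
strata = np.array([
    [30, 20, 22, 28],
    [45, 15, 35, 25],
    [60, 10, 50, 20],
], dtype=float)

A, B, C, D = strata.T
n = strata.sum(axis=1)
alpha_mh = (A * D / n).sum() / (B * C / n).sum()   # common odds ratio
delta_mh = -2.35 * np.log(alpha_mh)                 # ETS delta metric
print(round(alpha_mh, 3), round(delta_mh, 3))
```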
Peer reviewed
Bartfay, Emma – International Journal of Testing, 2003
Used Monte Carlo simulation to compare the properties of a goodness-of-fit (GOF) procedure and a test statistic developed by E. Bartfay and A. Donner (2001) to the likelihood ratio test in assessing the existence of extra variation. Results show the GOF procedure possesses a satisfactory Type I error rate and power. (SLD)
Descriptors: Goodness of Fit, Interrater Reliability, Monte Carlo Methods, Simulation
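For the Bartfay (2003) entry above, a minimal sketch of the likelihood ratio test used as the comparison procedure: twice the difference in maximized log-likelihoods between models with and without the extra-variation parameter is referred to a chi-square distribution. The log-likelihood values are assumed, and boundary corrections to the reference distribution are ignored.

```python
from scipy.stats import chi2

ll_null = -412.7   # model assuming no extra variation (assumed value)
ll_full = -408.1   # model allowing extra variation (assumed value)

lr_stat = 2.0 * (ll_full - ll_null)
p_value = chi2.sf(lr_stat, df=1)   # one extra parameter in the full model
print(round(lr_stat, 2), round(p_value, 4))
```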