Publication Date
In 2025 | 3 |
Since 2024 | 8 |
Since 2021 (last 5 years) | 21 |
Since 2016 (last 10 years) | 31 |
Since 2006 (last 20 years) | 53 |
Descriptor
Comparative Analysis | 63 |
Item Analysis | 63 |
Models | 63 |
Item Response Theory | 31 |
Test Items | 26 |
Evaluation Methods | 15 |
Factor Analysis | 15 |
Scores | 14 |
Simulation | 13 |
Goodness of Fit | 12 |
Foreign Countries | 11 |
Author
Chun Wang | 2 |
Finch, Holmes | 2 |
Gongjun Xu | 2 |
Raykov, Tenko | 2 |
von Davier, Matthias | 2 |
Acquaye, Rosemary | 1 |
Andreas Gold | 1 |
Arnon, Sara | 1 |
Arvay, Anett | 1 |
Baron, Simon | 1 |
Barrett, Frank | 1 |
Education Level
Higher Education | 8 |
Secondary Education | 6 |
Postsecondary Education | 5 |
Elementary Secondary Education | 3 |
High Schools | 3 |
Elementary Education | 2 |
Grade 12 | 1 |
Grade 8 | 1 |
Junior High Schools | 1 |
Middle Schools | 1 |
Audience
Practitioners | 1 |
Researchers | 1 |
Students | 1 |
Location
Israel | 2 |
South Korea | 2 |
United States | 2 |
Argentina | 1 |
Brazil | 1 |
Cyprus | 1 |
France | 1 |
Germany | 1 |
Greece | 1 |
Massachusetts | 1 |
Netherlands | 1 |
Assessments and Surveys
Trends in International… | 3 |
National Assessment of… | 2 |
Child Behavior Checklist | 1 |
Graduate Record Examinations | 1 |
National Longitudinal Study… | 1 |
Test of English as a Foreign… | 1 |
Wechsler Adult Intelligence… | 1 |
Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
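As a minimal illustration of what IRT linking does (mean/sigma linking for the unidimensional 2PL, not the multidimensional bifactor methods this study compares), a sketch with toy common-item parameters:

```python
# Mean/sigma IRT linking sketch: place a newly calibrated form on an old
# form's theta scale using the common items' difficulty (b) parameters.
# All parameter values below are toy values for illustration.
import statistics

def mean_sigma_link(b_old, b_new):
    """Slope A and intercept B mapping the new scale onto the old scale."""
    A = statistics.stdev(b_old) / statistics.stdev(b_new)
    B = statistics.mean(b_old) - A * statistics.mean(b_new)
    return A, B

def transform(a_new, b_new, A, B):
    """Rescale new-form item parameters: b -> A*b + B, a -> a / A."""
    return [a / A for a in a_new], [A * b + B for b in b_new]

# Same common items calibrated on two forms; the new scale is shifted by 0.5
b_old = [-1.0, 0.0, 1.0, 2.0]
b_new = [-0.5, 0.5, 1.5, 2.5]
a_new = [1.2, 0.8, 1.0, 1.5]

A, B = mean_sigma_link(b_old, b_new)
a_t, b_t = transform(a_new, b_new, A, B)
# After linking, the transformed difficulties line up with the old form's.
```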
Xiaowen Liu – International Journal of Testing, 2024
Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…
Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation
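A hedged sketch of one of the named methods, the Mantel-Haenszel DIF statistic on the ETS delta scale, using toy 2x2 tables per matched score stratum (the flagging threshold mentioned in the comment is a rough convention, not this study's criterion):

```python
import math

def mantel_haenszel_delta(strata):
    """strata: list of (A, B, C, D) counts per matched-score stratum,
    where A/B = reference group correct/incorrect and
    C/D = focal group correct/incorrect."""
    num = sum(A * D / (A + B + C + D) for A, B, C, D in strata)
    den = sum(B * C / (A + B + C + D) for A, B, C, D in strata)
    alpha = num / den              # MH common odds ratio
    # ETS delta scale; negative delta means the item favors the reference
    # group. Roughly, |delta| >= 1.5 is treated as sizable DIF.
    return -2.35 * math.log(alpha)

# Toy data: three score strata in which the reference group is favored
strata = [(40, 10, 30, 20), (30, 20, 20, 30), (20, 30, 10, 40)]
delta = mantel_haenszel_delta(strata)
```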
Karl Schweizer; Andreas Gold; Dorothea Krampen; Stefan Troche – Educational and Psychological Measurement, 2024
Conceptualizing two-variable disturbances preventing good model fit in confirmatory factor analysis as item-level method effects instead of correlated residuals avoids violating the principle that residual variation is unique for each item. The possibility of representing such a disturbance by a method factor of a bifactor measurement model was…
Descriptors: Correlation, Factor Analysis, Measurement Techniques, Item Analysis
Kazuhiro Yamaguchi – Journal of Educational and Behavioral Statistics, 2025
This study proposes a Bayesian method for diagnostic classification models (DCMs) in a partially known Q-matrix setting, intermediate between exploratory and confirmatory DCMs. This setting is practical and useful because test experts have prior knowledge of the Q-matrix but cannot readily specify it completely. The proposed method employs priors for…
Descriptors: Models, Classification, Bayesian Statistics, Evaluation Methods
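For context, a minimal sketch of how a Q-matrix row drives item responses in the DINA model, one common DCM (toy slip/guess values; this is not the paper's Bayesian estimator):

```python
def dina_prob(alpha, q_row, slip, guess):
    """P(correct) under the DINA model: the ideal response eta is 1 iff the
    examinee's skill profile alpha covers every skill the Q-matrix row
    q_row requires; masters answer correctly unless they slip, and
    non-masters succeed only by guessing."""
    eta = all(a >= q for a, q in zip(alpha, q_row))
    return (1.0 - slip) if eta else guess

# Toy Q-matrix row: the item requires skills 1 and 3 (of three skills)
q_row = [1, 0, 1]
p_master = dina_prob([1, 0, 1], q_row, slip=0.1, guess=0.2)     # covers both
p_nonmaster = dina_prob([1, 1, 0], q_row, slip=0.1, guess=0.2)  # lacks skill 3
```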
Jiaying Xiao; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Accurate item parameters and standard errors (SEs) are crucial for many multidimensional item response theory (MIRT) applications. A recent study proposed the Gaussian Variational Expectation Maximization (GVEM) algorithm to improve computational efficiency and estimation accuracy (Cho et al., 2021). However, the SE estimation procedure has yet to…
Descriptors: Error of Measurement, Models, Evaluation Methods, Item Analysis
Zsuzsa Bakk – Structural Equation Modeling: A Multidisciplinary Journal, 2024
A standard assumption of latent class (LC) analysis is conditional independence; that is, the items of the LC model are independent of the covariates given the latent classes. Several approaches have been proposed for identifying violations of this assumption. The recently proposed likelihood ratio approach is compared to residual statistics (bivariate residuals…
Descriptors: Goodness of Fit, Error of Measurement, Comparative Analysis, Models
Gyamfi, Abraham; Acquaye, Rosemary – Acta Educationis Generalis, 2023
Introduction: Item response theory (IRT) has received much attention in the validation of assessment instruments because it allows students' ability to be estimated from any set of items. IRT also allows the difficulty and discrimination level of each item on the test to be estimated. In the framework of IRT, item characteristics are…
Descriptors: Item Response Theory, Models, Test Items, Difficulty Level
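A minimal sketch of the item characteristic curve that carries those difficulty and discrimination parameters (the two-parameter logistic model, with toy values):

```python
import math

def p_correct(theta, a, b):
    """2PL item characteristic curve: probability of a correct response
    for ability theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# At theta == b the probability is exactly 0.5; a larger discrimination
# makes the curve steeper around the difficulty point.
p_at_difficulty = p_correct(0.0, a=1.2, b=0.0)
p_easy_item = p_correct(0.0, a=1.2, b=-2.0)   # item well below ability
```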
Markus T. Jansen; Ralf Schulze – Educational and Psychological Measurement, 2024
Thurstonian forced-choice modeling is considered to be a powerful new tool to estimate item and person parameters while simultaneously testing the model fit. This assessment approach is associated with the aim of reducing faking and other response tendencies that plague traditional self-report trait assessments. As a result of major recent…
Descriptors: Factor Analysis, Models, Item Analysis, Evaluation Methods
Raykov, Tenko – Measurement: Interdisciplinary Research and Perspectives, 2023
This software review discusses the capabilities of Stata to conduct item response theory modeling. The commands needed for fitting the popular one-, two-, and three-parameter logistic models are initially discussed. The procedure for testing the discrimination parameter equality in the one-parameter model is then outlined. The commands for fitting…
Descriptors: Item Response Theory, Models, Comparative Analysis, Item Analysis
Finch, Holmes – Applied Measurement in Education, 2022
Much research has been devoted to identification of differential item functioning (DIF), which occurs when the item responses for individuals from two groups differ after they are conditioned on the latent trait being measured by the scale. There has been less work examining differential step functioning (DSF), which is present for polytomous…
Descriptors: Comparative Analysis, Item Response Theory, Item Analysis, Simulation
Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022
This article presents the performance of item response theory (IRT) models when double ratings are used as item scores over single ratings when rater effects are present. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…
Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy
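A minimal sketch of the generalized partial credit model referenced above (category probabilities for one polytomous item, toy step difficulties; this is not the study's rater-effects design):

```python
import math

def gpcm_probs(theta, a, b_steps):
    """GPCM category probabilities for one item. b_steps[v] is the step
    difficulty for moving from category v into category v+1; category 0's
    cumulative term is defined as 0."""
    cums = [0.0]
    for b in b_steps:
        cums.append(cums[-1] + a * (theta - b))
    exps = [math.exp(c) for c in cums]
    total = sum(exps)
    return [e / total for e in exps]

# Three-category item with symmetric step difficulties around theta = 0,
# so the middle category is most likely and the extremes are equally likely.
probs = gpcm_probs(theta=0.0, a=1.0, b_steps=[-1.0, 1.0])
```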
Bayesian Adaptive Lasso for the Detection of Differential Item Functioning in Graded Response Models
Na Shan; Ping-Feng Xu – Journal of Educational and Behavioral Statistics, 2025
The detection of differential item functioning (DIF) is important in psychological and behavioral sciences. Standard DIF detection methods perform an item-by-item test iteratively, often assuming that all items except the one under investigation are DIF-free. This article proposes a Bayesian adaptive Lasso method to detect DIF in graded response…
Descriptors: Bayesian Statistics, Item Response Theory, Adolescents, Longitudinal Studies
Leventhal, Brian C.; Zigler, Christina K. – Measurement: Interdisciplinary Research and Perspectives, 2023
Survey score interpretations are often plagued by sources of construct-irrelevant variation, such as response styles. In this study, we propose the use of an IRTree Model to account for response styles by making use of self-report items and anchoring vignettes. Specifically, we investigate how the IRTree approach with anchoring vignettes compares…
Descriptors: Scores, Vignettes, Response Style (Tests), Item Response Theory
Zhang, Mengxue; Heffernan, Neil; Lan, Andrew – International Educational Data Mining Society, 2023
Automated scoring of student responses to open-ended questions, including short-answer questions, has great potential to scale to a large number of responses. Recent approaches for automated scoring rely on supervised learning, i.e., training classifiers or fine-tuning language models on a small number of responses with human-provided score…
Descriptors: Scoring, Computer Assisted Testing, Mathematics Instruction, Mathematics Tests
Salem, Alexandra C.; Gale, Robert; Casilio, Marianne; Fleegle, Mikala; Fergadiotis, Gerasimos; Bedrick, Steven – Journal of Speech, Language, and Hearing Research, 2023
Purpose: ParAlg (Paraphasia Algorithms) is a software package that automatically categorizes the naming errors (paraphasias) of a person with aphasia in relation to the intended target on a picture-naming test. These classifications (based on lexicality as well as semantic, phonological, and morphological similarity to the target) are important for…
Descriptors: Semantics, Computer Software, Aphasia, Classification