Publication Date
| In 2026 | 0 |
| Since 2025 | 4 |
| Since 2022 (last 5 years) | 48 |
| Since 2017 (last 10 years) | 150 |
| Since 2007 (last 20 years) | 397 |
Descriptor
| Comparative Analysis | 530 |
| Models | 530 |
| Foreign Countries | 129 |
| Scores | 113 |
| Item Response Theory | 109 |
| Test Items | 96 |
| Statistical Analysis | 91 |
| Correlation | 83 |
| Teaching Methods | 67 |
| Evaluation Methods | 65 |
| Mathematics Tests | 59 |
Author
| von Davier, Matthias | 7 |
| Cho, Sun-Joo | 5 |
| Kolen, Michael J. | 4 |
| Suh, Youngsuk | 4 |
| Xu, Xueli | 4 |
| Amisha Jindal | 3 |
| Ashish Gurung | 3 |
| Cohen, Allan S. | 3 |
| DeMars, Christine E. | 3 |
| Erin Ottmar | 3 |
| Ji-Eun Lee | 3 |
Audience
| Researchers | 6 |
| Practitioners | 5 |
| Teachers | 2 |
| Policymakers | 1 |
| Students | 1 |
Location
| Australia | 13 |
| United States | 13 |
| Indonesia | 11 |
| Netherlands | 10 |
| California | 9 |
| Florida | 9 |
| South Korea | 9 |
| Canada | 8 |
| China | 8 |
| Germany | 8 |
| Italy | 8 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 3 |
| Elementary and Secondary… | 2 |
| Every Student Succeeds Act… | 2 |
| Defunis v Odegaard | 1 |
| Education Consolidation… | 1 |
| Hawkins Stafford Act 1988 | 1 |
| Workforce Investment Act 1998… | 1 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
| Does not meet standards | 1 |
Mingfeng Xue; Ping Chen – Journal of Educational Measurement, 2025
Response styles pose great threats to psychological measurements. This research compares IRTree models and anchoring vignettes in addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and total-score level (ratios of extreme and middle responses to vignettes). Four models…
Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes
Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
Kazuhiro Yamaguchi – Journal of Educational and Behavioral Statistics, 2025
This study proposes a Bayesian method for diagnostic classification models (DCMs) for a partially known Q-matrix setting between exploratory and confirmatory DCMs. This Q-matrix setting is practical and useful because test experts have pre-knowledge of the Q-matrix but cannot readily specify it completely. The proposed method employs priors for…
Descriptors: Models, Classification, Bayesian Statistics, Evaluation Methods
Huang, Hung-Yu – Educational and Psychological Measurement, 2023
The forced-choice (FC) item formats used for noncognitive tests typically develop a set of response options that measure different traits and instruct respondents to make judgments among these options in terms of their preference to control the response biases that are commonly observed in normative tests. Diagnostic classification models (DCMs)…
Descriptors: Test Items, Classification, Bayesian Statistics, Decision Making
von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023
Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We are comparing classification accuracy of convolutional and feed-forward approaches. Our…
Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education
Mohaddeseh Salimpoor Aghdam; Javad Gholami; Mahnaz Saeidi – Language Teaching Research Quarterly, 2024
The need to create more efficient approaches to teaching L2 writing has grown due to the increasing demands placed on English writing proficiency in a global setting. The present investigation aimed to explore the impacts of the ENGAGE Model and task-based language teaching (TBLT) method on Iranian EFL learners' overall L2 writing performance.…
Descriptors: Task Analysis, English (Second Language), Second Language Learning, Second Language Instruction
Afsharrad, Mohammad; Pishghadam, Reza; Baghaei, Purya – International Journal of Language Testing, 2023
Testing organizations are faced with increasing demand to provide subscores in addition to the total test score. However, psychometricians argue that most subscores do not have added value to be worth reporting. To have added value, subscores need to meet a number of criteria: they should be reliable, distinctive, and distinct from each other and…
Descriptors: Comparative Analysis, Scores, Value Added Models, Psychometrics
Min, Shangchao; Cai, Hongwen; He, Lianzhen – Language Assessment Quarterly, 2022
The present study examined the performance of the bi-factor multidimensional item response theory (MIRT) model and higher-order (HO) cognitive diagnostic models (CDM) in providing diagnostic information and general ability estimation simultaneously in a listening test. The data used were 1,611 examinees' item-level responses to an in-house EFL…
Descriptors: Listening Comprehension Tests, English (Second Language), Second Language Learning, Foreign Countries
Michael C. Robbins; Zhuping Li – Field Methods, 2025
The Nolan Index (NI) is a normed, quantitative measure for comparing the degree of resemblance (similarity or dissimilarity) between free listings with an Excel program for calculating it. This article enhances that effort with the addition of an R program and additional applications. Free-list resemblance measures have been used to investigate…
Descriptors: Computation, Norm Referenced Tests, Comparative Analysis, Spreadsheets
Zhang, Mengxue; Heffernan, Neil; Lan, Andrew – International Educational Data Mining Society, 2023
Automated scoring of student responses to open-ended questions, including short-answer questions, has great potential to scale to a large number of responses. Recent approaches for automated scoring rely on supervised learning, i.e., training classifiers or fine-tuning language models on a small number of responses with human-provided score…
Descriptors: Scoring, Computer Assisted Testing, Mathematics Instruction, Mathematics Tests
Rachel L. Schechter; Maddie Lee Mason; Laura Janakiefski – Online Submission, 2024
The ongoing literacy crisis in the U.S. highlights an urgent need for effective, scalable literacy instruction. REED Charitable Foundation (RCF) is a non-profit organization that provides structured literacy training informed by Orton-Gillingham, along with ongoing professional coaching and comprehensive implementation support to help all students…
Descriptors: Literacy Education, Reading Instruction, Reading Achievement, Achievement Gains
Lúcio, Patrícia Silva; Vandekerckhove, Joachim; Polanczyk, Guilherme V.; Cogo-Moreira, Hugo – Journal of Psychoeducational Assessment, 2021
The present study compares the fit of two- and three-parameter logistic (2PL and 3PL) models of item response theory in the performance of preschool children on the Raven's Colored Progressive Matrices. The test of Raven is widely used for evaluating nonverbal intelligence of factor g. Studies comparing models with real data are scarce on the…
Descriptors: Guessing (Tests), Item Response Theory, Test Validity, Preschool Children
Salem, Alexandra C.; Gale, Robert; Casilio, Marianne; Fleegle, Mikala; Fergadiotis, Gerasimos; Bedrick, Steven – Journal of Speech, Language, and Hearing Research, 2023
Purpose: ParAlg (Paraphasia Algorithms) is a software that automatically categorizes a person with aphasia's naming error (paraphasia) in relation to its intended target on a picture-naming test. These classifications (based on lexicality as well as semantic, phonological, and morphological similarity to the target) are important for…
Descriptors: Semantics, Computer Software, Aphasia, Classification
Chu, Wei; Pavlik, Philip I., Jr. – International Educational Data Mining Society, 2023
In adaptive learning systems, various models are employed to obtain the optimal learning schedule and review for a specific learner. Models of learning are used to estimate the learner's current recall probability by incorporating features or predictors proposed by psychological theory or empirically relevant to learners' performance. Logistic…
Descriptors: Reaction Time, Accuracy, Models, Predictor Variables
Xiaowen Liu – International Journal of Testing, 2024
Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using the three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…
Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation

