ERIC - Search Results

Publication Date

In 2026	0
Since 2025	1
Since 2022 (last 5 years)	2
Since 2017 (last 10 years)	6
Since 2007 (last 20 years)	7

Descriptor

Classification	8
Accuracy	6
Testing	4
Computer Assisted Testing	3
Adaptive Testing	2
Cognitive Tests	2
Comparative Analysis	2
Computation	2
Decision Making	2
Diagnostic Tests	2
Item Response Theory	2
Models	2
Monte Carlo Methods	2
Simulation	2
Test Items	2
Alternative Assessment	1
Artificial Intelligence	1
Clinical Diagnosis	1
Cognitive Measurement	1
Cognitive Structures	1
Computational Linguistics	1
Computer Games	1
Computer Security	1
Computer Software	1
Error Patterns	1

Source

Journal of Educational…

Author

Chen, Ping	2
Ding, Shuliang	2
Song, Lihong	2
Wang, Wenyi	2
Alex J. Mechaber	1
Brian E. Clauser	1
Cai, Yan	1
Davey, Tim	1
Huang, Hung-Yu	1
Kai North	1
Kalohn, John C.	1
Kim, Kyung Yong	1
Le An Ha	1
Lee, Won-Chan	1
Lim, Hwanggyu	1
Liu, Shuchang	1
Meng, Yaru	1
Park, Seohee	1
Peter Baldwin	1
Spray, Judith A.	1
Tu, Dongbo	1
Victoria Yaneva	1
Wells, Craig S.	1
Yiyun Zhou	1

Publication Type

Journal Articles	8
Reports - Research	7
Reports - Evaluative	1

Showing all 8 results

The Vulnerability of AI-Based Scoring Systems to Gaming Strategies: A Case Study

Peer reviewed

Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025

Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…

Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy

Estimating Classification Accuracy and Consistency Indices for Multiple Measures with the Simple Structure MIRT Model

Peer reviewed

Park, Seohee; Kim, Kyung Yong; Lee, Won-Chan – Journal of Educational Measurement, 2023

Multiple measures, such as multiple content domains or multiple types of performance, are used in various testing programs to classify examinees for screening or selection. Despite the popular usages of multiple measures, there is little research on classification consistency and accuracy of multiple measures. Accordingly, this study introduces an…

Descriptors: Testing, Computation, Classification, Accuracy

A Recursion-Based Analytical Approach to Evaluate the Performance of MST

Peer reviewed

Lim, Hwanggyu; Davey, Tim; Wells, Craig S. – Journal of Educational Measurement, 2021

This study proposed a recursion-based analytical approach to assess measurement precision of ability estimation and classification accuracy in multistage adaptive tests (MSTs). A simulation study was conducted to compare the proposed recursion-based analytical method with an analytical method proposed by Park, Kim, Chung, and Dodd and with the…

Descriptors: Adaptive Testing, Measurement, Accuracy, Classification

On-the-Fly Constraint-Controlled Assembly Methods for Multistage Adaptive Testing for Cognitive Diagnosis

Peer reviewed

Liu, Shuchang; Cai, Yan; Tu, Dongbo – Journal of Educational Measurement, 2018

This study applied the mode of on-the-fly assembled multistage adaptive testing to cognitive diagnosis (CD-OMST). Several module assembly methods for CD-OMST were proposed and compared in terms of measurement precision, test security, and constraint management. The module assembly methods in the study included the maximum priority index…

Descriptors: Adaptive Testing, Monte Carlo Methods, Computer Security, Clinical Diagnosis

An Item-Level Expected Classification Accuracy and Its Applications in Cognitive Diagnostic Assessment

Peer reviewed

Wang, Wenyi; Song, Lihong; Chen, Ping; Ding, Shuliang – Journal of Educational Measurement, 2019

Most of the existing classification accuracy indices of attribute patterns lose effectiveness when the response data is absent in diagnostic testing. To handle this issue, this article proposes new indices to predict the correct classification rate of a diagnostic test before administering the test under the deterministic noise input…

Descriptors: Cognitive Tests, Classification, Accuracy, Diagnostic Tests

Multilevel Cognitive Diagnosis Models for Assessing Changes in Latent Attributes

Peer reviewed

Huang, Hung-Yu – Journal of Educational Measurement, 2017

Cognitive diagnosis models (CDMs) have been developed to evaluate the mastery status of individuals with respect to a set of defined attributes or skills that are measured through testing. When individuals are repeatedly administered a cognitive diagnosis test, a new class of multilevel CDMs is required to assess the changes in their attributes…

Descriptors: Testing, Cognitive Measurement, Test Items, Classification

Attribute-Level and Pattern-Level Classification Consistency and Accuracy Indices for Cognitive Diagnostic Assessment

Peer reviewed

Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang – Journal of Educational Measurement, 2015

Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…

Descriptors: Classification, Reliability, Accuracy, Cognitive Tests

The Effect of Model Misspecification on Classification Decisions Made Using a Computerized Test

Peer reviewed

Kalohn, John C.; Spray, Judith A. – Journal of Educational Measurement, 1999

Examined the effects of model misspecification on the precision of decisions made using the sequential probability ratio test (SPRT) in computer testing. Simulation results show that the one-parameter logistic model produced more errors than the true model. (SLD)

Descriptors: Classification, Computer Assisted Testing, Decision Making, Models