ERIC - Search Results

Publication Date

In 2026	0
Since 2025	12
Since 2022 (last 5 years)	23
Since 2017 (last 10 years)	36
Since 2007 (last 20 years)	57

Descriptor

Computer Assisted Testing	119
Adaptive Testing	59
Test Items	57
Test Construction	36
Item Response Theory	33
Simulation	24
Comparative Analysis	20
Item Banks	18
Scoring	17
Automation	15
Scores	15
Higher Education	13
Psychometrics	12
Test Format	12
College Students	11
Item Analysis	11
Models	11
Accuracy	9
Difficulty Level	9
Comparative Testing	8
Evaluation Methods	8
Statistical Analysis	8
Test Length	8
College Entrance Examinations	7
Computer Simulation	7

Source

Journal of Educational…	119

Publication Type

Journal Articles	119
Reports - Research	72
Reports - Evaluative	34
Reports - Descriptive	8
Speeches/Meeting Papers	7
Book/Product Reviews	3
Information Analyses	2
Opinion Papers	1

Education Level

Higher Education	2
Secondary Education	2
Elementary Education	1
Postsecondary Education	1

Audience

Researchers

Location

United Kingdom

Assessments and Surveys

Graduate Record Examinations	4
Indiana Statewide Testing for…	2
Program for International…	2
Advanced Placement…	1
Law School Admission Test	1

Showing 1 to 15 of 119 results

Identifying Features Contributing to Differential Prediction Bias of Automated Scoring Systems

Peer reviewed

Ikkyu Choi; Matthew S. Johnson – Journal of Educational Measurement, 2025

Automated scoring systems provide multiple benefits but also pose challenges, notably potential bias. Various methods exist to evaluate these algorithms and their outputs for bias. Upon detecting bias, the next logical step is to investigate its cause, often by examining feature distributions. Recently, Johnson and McCaffrey proposed an…

Descriptors: Prediction, Bias, Automation, Scoring

Influence of Intersectional Routing Modules between Dimensions on Measurement Precision in Multidimensional Multistage Testing

Peer reviewed

Yi-Ling Wu; Yao-Hsuan Huang; Chia-Wen Chen; Po-Hsi Chen – Journal of Educational Measurement, 2025

Multistage testing (MST), a variant of computerized adaptive testing (CAT), differs from conventional CAT in that it is adapted at the module level rather than at the individual item level. Typically, all examinees begin the MST with a linear test form in the first stage, commonly known as the routing stage. In 2020, Han introduced an innovative…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Format, Measurement

Detecting Group Collaboration Using Multiple Correspondence Analysis

Peer reviewed

Grochowalski, Joseph H.; Hendrickson, Amy – Journal of Educational Measurement, 2023

Test takers wishing to gain an unfair advantage often share answers with other test takers, either sharing all answers (a full key) or some (a partial key). Detecting key sharing during a tight testing window requires an efficient, easily interpretable, and rich form of analysis that is descriptive and inferential. We introduce a detection method…

Descriptors: Identification, Cooperative Learning, Cheating, Statistical Analysis

Using Response Time in Multidimensional Computerized Adaptive Testing

Peer reviewed

He, Yinhong; Qi, Yuanyuan – Journal of Educational Measurement, 2023

In multidimensional computerized adaptive testing (MCAT), item selection strategies are generally constructed based on responses, and they do not consider the response times required by items. This study constructed two new criteria (referred to as DT-inc and DT) for MCAT item selection by utilizing information from response times. The new designs…

Descriptors: Reaction Time, Adaptive Testing, Computer Assisted Testing, Test Items

Using GPT-4 to Augment Imbalanced Data for Automatic Scoring

Peer reviewed

Luyang Fang; Gyeonggeon Lee; Xiaoming Zhai – Journal of Educational Measurement, 2025

Machine learning-based automatic scoring faces challenges with imbalanced student responses across scoring categories. To address this, we introduce a novel text data augmentation framework that leverages GPT-4, a generative large language model, specifically tailored for imbalanced datasets in automatic scoring. Our experimental dataset consisted…

Descriptors: Computer Assisted Testing, Artificial Intelligence, Automation, Scoring

Two-Phase Content-Balancing CD-CAT Online Item Calibration

Peer reviewed

Jing Huang; Yuxiao Zhang; Jason W. Morphew; Jayson M. Nissen; Ben Van Dusen; Hua Hua Chang – Journal of Educational Measurement, 2025

Online calibration estimates new item parameters alongside previously calibrated items, supporting efficient item replenishment. However, most existing online calibration procedures for Cognitive Diagnostic Computerized Adaptive Testing (CD-CAT) lack mechanisms to ensure content balance during live testing. This limitation can lead to uneven…

Descriptors: Adaptive Testing, Computer Assisted Testing, Cognitive Measurement, Test Items

A Generalized Objective Function for Computer Adaptive Item Selection

Peer reviewed

Harold Doran; Testsuhiro Yamada; Ted Diaz; Emre Gonulates; Vanessa Culver – Journal of Educational Measurement, 2025

Computer adaptive testing (CAT) is an increasingly common mode of test administration offering improved test security, better measurement precision, and the potential for shorter testing experiences. This article presents a new item selection algorithm based on a generalized objective function to support multiple types of testing conditions and…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Algorithms

The Vulnerability of AI-Based Scoring Systems to Gaming Strategies: A Case Study

Peer reviewed

Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025

Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…

Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy

Using Multiple Maximum Exposure Rates in Computerized Adaptive Testing

Peer reviewed

Kylie Gorney; Mark D. Reckase – Journal of Educational Measurement, 2025

In computerized adaptive testing, item exposure control methods are often used to provide a more balanced usage of the item pool. Many of the most popular methods, including the restricted method (Revuelta and Ponsoda), use a single maximum exposure rate to limit the proportion of times that each item is administered. However, Barrada et al.…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Item Banks

Automatic Prompt Engineering for Automatic Scoring

Peer reviewed

Mingfeng Xue; Yunting Liu; Xingyao Xiao; Mark Wilson – Journal of Educational Measurement, 2025

Prompts play a crucial role in eliciting accurate outputs from large language models (LLMs). This study examines the effectiveness of an automatic prompt engineering (APE) framework for automatic scoring in educational measurement. We collected constructed-response data from 930 students across 11 items and used human scores as the true labels. A…

Descriptors: Computer Assisted Testing, Prompting, Educational Assessment, Automation

Online Monitoring of Test-Taking Behavior Based on Item Responses and Response Times

Peer reviewed

Han, Suhwa; Kang, Hyeon-Ah – Journal of Educational Measurement, 2023

The study presents multivariate sequential monitoring procedures for examining test-taking behaviors online. The procedures monitor examinees' responses and response times and signal aberrancy as soon as a significant change is detected in the test-taking behavior. The study in particular proposes three schemes to track different…

Descriptors: Test Wiseness, Student Behavior, Item Response Theory, Computer Assisted Testing

Evaluating the Consistency and Reliability of Attribution Methods in Automated Short Answer Grading (ASAG) Systems: Toward an Explainable Scoring System

Peer reviewed

Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025

In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…

Descriptors: Automation, Grading, Computer Assisted Testing, Scoring

Pretest Item Calibration in Computerized Multistage Adaptive Testing

Peer reviewed

Ersen, Rabia Karatoprak; Lee, Won-Chan – Journal of Educational Measurement, 2023

The purpose of this study was to compare calibration and linking methods for placing pretest item parameter estimates on the item pool scale in a 1-3 computerized multistage adaptive testing design in terms of item parameter recovery. Two models were used: embedded-section, in which pretest items were administered within a separate module, and…

Descriptors: Pretesting, Test Items, Computer Assisted Testing, Adaptive Testing

Online Calibration in Multidimensional Computerized Adaptive Testing with Polytomously Scored Items

Peer reviewed

Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023

Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…

Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Author

Vispoel, Walter P.	7
Bennett, Randy Elliot	5
Chang, Hua-Hua	5
Bridgeman, Brent	4
Rock, Donald A.	4
Wainer, Howard	4
van der Linden, Wim J.	4
Bleiler, Timothy	3
Stocking, Martha L.	3
Tatsuoka, Kikumi K.	3
Veldkamp, Bernard P.	3
Wang, Tianyou	3
Bejar, Isaac I.	2
Cai, Yan	2
Chen, Shu-Ying	2
Choi, Ikkyu	2
Choi, Seung W.	2
Finkelman, Matthew	2
Kang, Hyeon-Ah	2
Kim, Dong-In	2
Lewis, Charles	2
Li, Jie	2
Luecht, Richard M.	2
Morley, Mary	2