ERIC - Search Results

Showing 1 to 15 of 59 results

The Effect of Polytomous Item Ratio on Ability Estimation in Multistage Tests

Peer reviewed

Hasibe Yahsi Sari; Hulya Kelecioglu – International Journal of Assessment Tools in Education, 2025

The aim of the study is to examine the effect of polytomous item ratio on ability estimation in different conditions in multistage tests (MST) using mixed tests. The study is simulation-based research. In the PISA 2018 application, the ability parameters of the individuals and the item pool were created by using the item parameters estimated from…

Descriptors: Test Items, Test Format, Accuracy, Test Length

Evaluating Six Approaches to Handling Zero-Frequency Scores under Equipercentile Equating

Peer reviewed

Sun, Ting; Kim, Stella Yun – Measurement: Interdisciplinary Research and Perspectives, 2021

In many large testing programs, equipercentile equating has been widely used under a random groups design to adjust test difficulty between forms. However, one thorny issue occurs with equipercentile equating when a particular score has no observed frequency. The purpose of this study is to suggest and evaluate six potential methods in…

Descriptors: Equated Scores, Test Length, Sample Size, Methods

The Impact of Scoring Later on Mixed Format Adaptive Testing

Jing Ma – ProQuest LLC, 2024

This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, test lengths, and numbers and locations of polytomous items. Results showed that while…

Descriptors: Scoring, Adaptive Testing, Test Items, Classification

A Simulation Study on the Performance of Different Reliability Estimation Methods

Peer reviewed

Edwards, Ashley A.; Joyner, Keanan J.; Schatschneider, Christopher – Educational and Psychological Measurement, 2021

The accuracy of certain internal consistency estimators has been questioned in recent years. The present study tests the accuracy of six reliability estimators (Cronbach's alpha, omega, omega hierarchical, Revelle's omega, and greatest lower bound) in 140 simulated conditions of unidimensional continuous data with uncorrelated errors with varying…

Descriptors: Reliability, Computation, Accuracy, Sample Size

Utilizing Response Time for Item Selection in On-the-Fly Multistage Adaptive Testing for PISA Assessment

Peer reviewed

Xiuxiu Tang; Yi Zheng; Tong Wu; Kit-Tai Hau; Hua-Hua Chang – Journal of Educational Measurement, 2025

Multistage adaptive testing (MST) has recently been adopted for international large-scale assessments such as the Programme for International Student Assessment (PISA). MST offers improved measurement efficiency over traditional nonadaptive tests and improved practical convenience over single-item-adaptive computerized adaptive testing (CAT). As a…

Descriptors: Reaction Time, Test Items, Achievement Tests, Foreign Countries

Accuracy and Sensitivity of Coefficient Alpha and Its Alternatives with Unidimensional and Contaminated Scales

Peer reviewed

Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023

We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed for unidimensional scales, (a) all indices except omega h performed similarly well for most conditions; (b) alpha is still good; (c) GLB and coefficient H overestimated reliability with small…

Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length

IRT Characteristic Curve Linking Methods Weighted by Information for Mixed-Format Tests

Peer reviewed

Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024

To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…

Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement

A Note on Improving Variational Estimation for Multidimensional Item Response Theory

Peer reviewed

Chenchen Ma; Jing Ouyang; Chun Wang; Gongjun Xu – Grantee Submission, 2024

Survey instruments and assessments are frequently used in many domains of social science. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and convenient statistical tool for item analysis, calibration, and scoring. However, the computational…

Descriptors: Algorithms, Item Response Theory, Scoring, Accuracy

Exploring Number of Response Categories in Factor Analysis: Implications for Sample Size

Peer reviewed

Fatih Orçan – International Journal of Assessment Tools in Education, 2025

Factor analysis is a statistical method to explore the relationships among observed variables and identify latent structures. It is crucial in scale development and validity analysis. Key factors affecting the accuracy of factor analysis results include the type of data, sample size, and the number of response categories. While some studies…

Descriptors: Factor Analysis, Factor Structure, Item Response Theory, Sample Size

Are We There Yet? Evaluating the Effectiveness of a Recurrent Neural Network-Based Stopping Algorithm for an Adaptive Assessment

Peer reviewed

Matayoshi, Jeffrey; Cosyn, Eric; Uzun, Hasan – International Journal of Artificial Intelligence in Education, 2021

Many recent studies have looked at the viability of applying recurrent neural networks (RNNs) to educational data. In most cases, this is done by comparing their performance to existing models in the artificial intelligence in education (AIED) and educational data mining (EDM) fields. While there is increasing evidence that, in many situations,…

Descriptors: Artificial Intelligence, Data Analysis, Student Evaluation, Adaptive Testing

Multidimensional Forced-Choice CAT with Dominance Items: An Empirical Comparison with Optimal Static Testing under Different Desirability Matching

Peer reviewed

Lin, Yin; Brown, Anna; Williams, Paul – Educational and Psychological Measurement, 2023

Several forced-choice (FC) computerized adaptive tests (CATs) have emerged in the field of organizational psychology, all of them employing ideal-point items. However, although most items developed historically follow dominance response models, research on FC CAT using dominance items is limited. Existing research is heavily dominated by…

Descriptors: Measurement Techniques, Computer Assisted Testing, Adaptive Testing, Industrial Psychology

Real-Life Applications of Competence-Based Test Development to the Construction, Improvement, and Shortening of Tests

Peer reviewed

Pasquale Anselmi; Jürgen Heller; Luca Stefanutti; Egidio Robusto; Giulia Barillari – Education and Information Technologies, 2025

Competence-based test development (CbTD) is a novel method for constructing tests that are as informative as possible about the competence state (the set of skills an individual masters) underlying the item responses. If desired, the tests can also be minimal, meaning that no item can be eliminated without reducing their informativeness. To…

Descriptors: Competency Based Education, Test Construction, Test Length, Usability

Application of Change Point Analysis of Response Time Data to Detect Test Speededness

Peer reviewed

Cheng, Ying; Shao, Can – Educational and Psychological Measurement, 2022

Computer-based and web-based testing have become increasingly popular in recent years. Their popularity has dramatically expanded the availability of response time data. Compared to the conventional item response data that are often dichotomous or polytomous, response time has the advantage of being continuous and can be collected in an…

Descriptors: Reaction Time, Test Wiseness, Computer Assisted Testing, Simulation

Differential Performance of Computerized Adaptive Testing in Students with and without Disabilities -- A Simulation Study

Peer reviewed

Nikola Ebenbeck; Markus Gebhardt – Journal of Special Education Technology, 2024

Technologies that enable individualization for students have significant potential in special education. Computerized Adaptive Testing (CAT) refers to digital assessments that automatically adjust their difficulty level based on students' abilities, allowing for personalized, efficient, and accurate measurement. This article examines whether CAT…

Descriptors: Computer Assisted Testing, Students with Disabilities, Special Education, Grade 3

The Comparison of Estimation Methods for the Four-Parameter Logistic Item Response Theory Model

Peer reviewed

Kalkan, Ömür Kaya – Measurement: Interdisciplinary Research and Perspectives, 2022

The four-parameter logistic (4PL) Item Response Theory (IRT) model has recently been reconsidered in the literature due to the advances in the statistical modeling software and the recent developments in the estimation of the 4PL IRT model parameters. The current simulation study evaluated the performance of expectation-maximization (EM),…

Descriptors: Comparative Analysis, Sample Size, Test Length, Algorithms
