Publication Date
| In 2026 | 0 |
| Since 2025 | 4 |
| Since 2022 (last 5 years) | 16 |
| Since 2017 (last 10 years) | 35 |
| Since 2007 (last 20 years) | 59 |
Descriptor
| Accuracy | 59 |
| Test Length | 59 |
| Test Items | 38 |
| Item Response Theory | 35 |
| Sample Size | 23 |
| Computer Assisted Testing | 21 |
| Computation | 20 |
| Adaptive Testing | 18 |
| Simulation | 15 |
| Correlation | 14 |
| Classification | 13 |
| More ▼ | |
Source
Author
| Cheng, Ying | 4 |
| Wang, Chun | 3 |
| Baris Pekmezci, Fulya | 2 |
| Bradshaw, Laine | 2 |
| He, Wei | 2 |
| Huggins-Manley, Anne Corinne | 2 |
| Hull, Michael M. | 2 |
| Lathrop, Quinn N. | 2 |
| Mae, Naohiro | 2 |
| Svetina, Dubravka | 2 |
| Wolfe, Edward W. | 2 |
| More ▼ | |
Publication Type
| Reports - Research | 46 |
| Journal Articles | 45 |
| Dissertations/Theses -… | 10 |
| Reports - Evaluative | 3 |
| Speeches/Meeting Papers | 2 |
| Numerical/Quantitative Data | 1 |
Education Level
Audience
Laws, Policies, & Programs
Assessments and Surveys
| Program for International… | 2 |
| Force Concept Inventory | 1 |
| MacArthur Communicative… | 1 |
| National Assessment of… | 1 |
| Trends in International… | 1 |
What Works Clearinghouse Rating
Hasibe Yahsi Sari; Hulya Kelecioglu – International Journal of Assessment Tools in Education, 2025
The aim of the study is to examine the effect of polytomous item ratio on ability estimation in different conditions in multistage tests (MST) using mixed tests. The study is simulation-based research. In the PISA 2018 application, the ability parameters of the individuals and the item pool were created by using the item parameters estimated from…
Descriptors: Test Items, Test Format, Accuracy, Test Length
Sun, Ting; Kim, Stella Yun – Measurement: Interdisciplinary Research and Perspectives, 2021
In many large testing programs, equipercentile equating has been widely used under a random groups design to adjust test difficulty between forms. However, one thorny issue occurs with equipercentile equating when a particular score has no observed frequency. The purpose of this study is to suggest and evaluate six potential methods in…
Descriptors: Equated Scores, Test Length, Sample Size, Methods
Jing Ma – ProQuest LLC, 2024
This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, lengths, number and location of polytomous item. Results showed that while…
Descriptors: Scoring, Adaptive Testing, Test Items, Classification
Edwards, Ashley A.; Joyner, Keanan J.; Schatschneider, Christopher – Educational and Psychological Measurement, 2021
The accuracy of certain internal consistency estimators have been questioned in recent years. The present study tests the accuracy of six reliability estimators (Cronbach's alpha, omega, omega hierarchical, Revelle's omega, and greatest lower bound) in 140 simulated conditions of unidimensional continuous data with uncorrelated errors with varying…
Descriptors: Reliability, Computation, Accuracy, Sample Size
Xiuxiu Tang; Yi Zheng; Tong Wu; Kit-Tai Hau; Hua-Hua Chang – Journal of Educational Measurement, 2025
Multistage adaptive testing (MST) has been recently adopted for international large-scale assessments such as Programme for International Student Assessment (PISA). MST offers improved measurement efficiency over traditional nonadaptive tests and improved practical convenience over single-item-adaptive computerized adaptive testing (CAT). As a…
Descriptors: Reaction Time, Test Items, Achievement Tests, Foreign Countries
Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023
We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed for unidimensional scales, (a) all indices except omega h performed similarly well for most conditions; (b) alpha is still good; (c) GLB and coefficient H overestimated reliability with small…
Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
Chenchen Ma; Jing Ouyang; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Survey instruments and assessments are frequently used in many domains of social science. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and convenient statistical tool for item analysis, calibration, and scoring. However, the computational…
Descriptors: Algorithms, Item Response Theory, Scoring, Accuracy
Fatih Orçan – International Journal of Assessment Tools in Education, 2025
Factor analysis is a statistical method to explore the relationships among observed variables and identify latent structures. It is crucial in scale development and validity analysis. Key factors affecting the accuracy of factor analysis results include the type of data, sample size, and the number of response categories. While some studies…
Descriptors: Factor Analysis, Factor Structure, Item Response Theory, Sample Size
Matayoshi, Jeffrey; Cosyn, Eric; Uzun, Hasan – International Journal of Artificial Intelligence in Education, 2021
Many recent studies have looked at the viability of applying recurrent neural networks (RNNs) to educational data. In most cases, this is done by comparing their performance to existing models in the artificial intelligence in education (AIED) and educational data mining (EDM) fields. While there is increasing evidence that, in many situations,…
Descriptors: Artificial Intelligence, Data Analysis, Student Evaluation, Adaptive Testing
Lin, Yin; Brown, Anna; Williams, Paul – Educational and Psychological Measurement, 2023
Several forced-choice (FC) computerized adaptive tests (CATs) have emerged in the field of organizational psychology, all of them employing ideal-point items. However, despite most items developed historically follow dominance response models, research on FC CAT using dominance items is limited. Existing research is heavily dominated by…
Descriptors: Measurement Techniques, Computer Assisted Testing, Adaptive Testing, Industrial Psychology
Pasquale Anselmi; Jürgen Heller; Luca Stefanutti; Egidio Robusto; Giulia Barillari – Education and Information Technologies, 2025
Competence-based test development (CbTD) is a novel method for constructing tests that are as informative as possible about the competence state (the set of skills an individual masters) underlying the item responses. If desired, the tests can also be minimal, meaning that no item can be eliminated without reducing their informativeness. To…
Descriptors: Competency Based Education, Test Construction, Test Length, Usability
Cheng, Ying; Shao, Can – Educational and Psychological Measurement, 2022
Computer-based and web-based testing have become increasingly popular in recent years. Their popularity has dramatically expanded the availability of response time data. Compared to the conventional item response data that are often dichotomous or polytomous, response time has the advantage of being continuous and can be collected in an…
Descriptors: Reaction Time, Test Wiseness, Computer Assisted Testing, Simulation
Nikola Ebenbeck; Markus Gebhardt – Journal of Special Education Technology, 2024
Technologies that enable individualization for students have significant potential in special education. Computerized Adaptive Testing (CAT) refers to digital assessments that automatically adjust their difficulty level based on students' abilities, allowing for personalized, efficient, and accurate measurement. This article examines whether CAT…
Descriptors: Computer Assisted Testing, Students with Disabilities, Special Education, Grade 3
Kalkan, Ömür Kaya – Measurement: Interdisciplinary Research and Perspectives, 2022
The four-parameter logistic (4PL) Item Response Theory (IRT) model has recently been reconsidered in the literature due to the advances in the statistical modeling software and the recent developments in the estimation of the 4PL IRT model parameters. The current simulation study evaluated the performance of expectation-maximization (EM),…
Descriptors: Comparative Analysis, Sample Size, Test Length, Algorithms

Peer reviewed
Direct link
