Publication Date
| In 2026 | 1 |
| Since 2025 | 6 |
| Since 2022 (last 5 years) | 35 |
| Since 2017 (last 10 years) | 91 |
| Since 2007 (last 20 years) | 303 |
Descriptor
| Classification | 432 |
| Hypothesis Testing | 123 |
| Computer Assisted Testing | 122 |
| Testing | 86 |
| Foreign Countries | 84 |
| Models | 64 |
| Comparative Analysis | 57 |
| Test Items | 53 |
| Statistical Analysis | 51 |
| Evaluation Methods | 50 |
| Accuracy | 49 |
Author
| Wang, Wen-Chung | 4 |
| Fields, Lanny | 3 |
| Sinharay, Sandip | 3 |
| Thompson, Nathan A. | 3 |
| Abedi, Jamal | 2 |
| Bell, Raoul | 2 |
| Buchner, Axel | 2 |
| Chen, Ping | 2 |
| Chin-Parker, Seth | 2 |
| Chung, Hyewon | 2 |
| Cui, Ying | 2 |
Location
| Australia | 9 |
| Canada | 7 |
| China | 7 |
| United Kingdom | 6 |
| United States | 6 |
| Germany | 5 |
| New York | 5 |
| United Kingdom (England) | 5 |
| California | 4 |
| Georgia | 4 |
| Greece | 4 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 5 |
| Every Student Succeeds Act… | 1 |
| Individuals with Disabilities… | 1 |
Amanda A. Wolkowitz; Russell Smith – Practical Assessment, Research & Evaluation, 2024
A decision consistency (DC) index is an estimate of the consistency of a classification decision on an exam. More specifically, DC estimates the percentage of examinees that would have the same classification decision on an exam if they were to retake the same or a parallel form of the exam again without memory of taking the exam the first time.…
Descriptors: Testing, Test Reliability, Replication (Evaluation), Decision Making
V. N. Vimal Rao; Jeffrey K. Bye; Sashank Varma – Cognitive Research: Principles and Implications, 2024
The 0.05 boundary within Null Hypothesis Statistical Testing (NHST) "has made a lot of people very angry and been widely regarded as a bad move" (to quote Douglas Adams). Here, we move past meta-scientific arguments and ask an empirical question: What is the psychological standing of the 0.05 boundary for statistical significance? We…
Descriptors: Psychological Patterns, Statistical Analysis, Testing, Statistical Significance
Coggeshall, Whitney Smiley – Educational Measurement: Issues and Practice, 2021
The continuous testing framework, where both successful and unsuccessful examinees have to demonstrate continued proficiency at frequent prespecified intervals, is a framework that is used in noncognitive assessment and is gaining in popularity in cognitive assessment. Despite the rigorous advantages of this framework, this paper demonstrates that…
Descriptors: Classification, Accuracy, Testing, Failure
Kang, Yewon; Ha, Hyorim; Lee, Hee Seung – Educational Psychology Review, 2023
Natural category learning is important in science education. One strategy that has been empirically supported for enhancing category learning is testing, which facilitates not only the learning of previously studied information (backward testing effect) but also the learning of newly studied information (forward testing effect). However, in…
Descriptors: Science Education, Science Tests, Testing, Classification
Kayla V. Campaña; Benjamin G. Solomon – Assessment for Effective Intervention, 2025
The purpose of this study was to compare the classification accuracy of data produced by the previous year's end-of-year New York state assessment, a computer-adaptive diagnostic assessment ("i-Ready"), and the gating combination of both assessments to predict the rate of students passing the following year's end-of-year state assessment…
Descriptors: Accuracy, Classification, Diagnostic Tests, Adaptive Testing
Sinharay, Sandip – Educational and Psychological Measurement, 2022
Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores and hence to incomplete data on mastery tests such as the AP and U.S. Medical Licensing examinations. Investigators are often interested in estimating the probabilities of passing of the examinees with incomplete data on mastery tests.…
Descriptors: Mastery Tests, Computer Assisted Testing, Probability, Test Wiseness
Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Putnikovic, Marko; Jovanovic, Jelena – IEEE Transactions on Learning Technologies, 2023
Automatic grading of short answers is an important task in computer-assisted assessment (CAA). Recently, embeddings, as semantic-rich textual representations, have been increasingly used to represent short answers and predict the grade. Despite the recent trend of applying embeddings in automatic short answer grading (ASAG), there are no…
Descriptors: Automation, Computer Assisted Testing, Grading, Natural Language Processing
Park, Seohee; Kim, Kyung Yong; Lee, Won-Chan – Journal of Educational Measurement, 2023
Multiple measures, such as multiple content domains or multiple types of performance, are used in various testing programs to classify examinees for screening or selection. Despite the popular usages of multiple measures, there is little research on classification consistency and accuracy of multiple measures. Accordingly, this study introduces an…
Descriptors: Testing, Computation, Classification, Accuracy
Lim, Hwanggyu; Davey, Tim; Wells, Craig S. – Journal of Educational Measurement, 2021
This study proposed a recursion-based analytical approach to assess measurement precision of ability estimation and classification accuracy in multistage adaptive tests (MSTs). A simulation study was conducted to compare the proposed recursion-based analytical method with an analytical method proposed by Park, Kim, Chung, and Dodd and with the…
Descriptors: Adaptive Testing, Measurement, Accuracy, Classification
Jonathan Liu; Seth Poulsen; Erica Goodwin; Hongxuan Chen; Grace Williams; Yael Gertner; Diana Franklin – ACM Transactions on Computing Education, 2025
Algorithm design is a vital skill developed in most undergraduate Computer Science (CS) programs, but few research studies focus on pedagogy related to algorithms coursework. To understand the work that has been done in the area, we present a systematic survey and literature review of CS Education studies. We search for research that is both…
Descriptors: Teaching Methods, Algorithms, Design, Computer Science Education
Daniel Corral; Shana K. Carpenter – Cognitive Research: Principles and Implications, 2024
We report six experiments that examine how two essential components of a category-learning paradigm, training and feedback, can be manipulated to maximize learning and transfer of real-world, complex concepts. Some subjects learned through classification and were asked to classify hypothetical experiment scenarios as either true or non-true…
Descriptors: Concept Formation, Teaching Methods, Observational Learning, Classification
Yang Du; Susu Zhang – Journal of Educational and Behavioral Statistics, 2025
Item compromise has long posed challenges in educational measurement, jeopardizing both test validity and test security of continuous tests. Detecting compromised items is therefore crucial to address this concern. The present literature on compromised item detection reveals two notable gaps: First, the majority of existing methods are based upon…
Descriptors: Item Response Theory, Item Analysis, Bayesian Statistics, Educational Assessment
Beechey, Timothy – Journal of Speech, Language, and Hearing Research, 2023
Purpose: This article provides a tutorial introduction to ordinal pattern analysis, a statistical analysis method designed to quantify the extent to which hypotheses of relative change across experimental conditions match observed data at the level of individuals. This method may be a useful addition to familiar parametric statistical methods…
Descriptors: Hypothesis Testing, Multivariate Analysis, Data Analysis, Statistical Inference
Dalia Khairy; Nouf Alharbi; Mohamed A. Amasha; Marwa F. Areed; Salem Alkhalaf; Rania A. Abougalala – Education and Information Technologies, 2024
Student outcomes are of great importance in higher education institutions. Accreditation bodies focus on them as an indicator to measure the performance and effectiveness of the institution. Forecasting students' academic performance is crucial for every educational establishment seeking to enhance performance and perseverance of its students and…
Descriptors: Prediction, Tests, Scores, Information Retrieval

