Publication Date
| In 2026 | 0 |
| Since 2025 | 10 |
Descriptor
| Scoring | 10 |
| Test Reliability | 10 |
| Test Validity | 6 |
| Scores | 5 |
| Item Response Theory | 4 |
| Test Items | 4 |
| Error of Measurement | 3 |
| Foreign Countries | 3 |
| Test Construction | 3 |
| Testing | 3 |
| Comparative Analysis | 2 |
Author
| Amanda Hut | 1 |
| Barbara Bruno | 1 |
| Benjamin W. Domingue | 1 |
| Boris Forthmann | 1 |
| Doris Lee | 1 |
| Esmat Babaii | 1 |
| Estefanía Martín-Barroso | 1 |
| Farshad Effatpanah | 1 |
| Francesco Mondada | 1 |
| G. Thomas Schanding Jr. | 1 |
| James G. Soland | 1 |
Publication Type
| Journal Articles | 8 |
| Reports - Research | 8 |
| Guides - Non-Classroom | 1 |
| Information Analyses | 1 |
| Reports - Descriptive | 1 |
| Tests/Questionnaires | 1 |
Education Level
| Higher Education | 3 |
| Postsecondary Education | 3 |
| Elementary Education | 2 |
| Early Childhood Education | 1 |
| High Schools | 1 |
| Kindergarten | 1 |
| Primary Education | 1 |
| Secondary Education | 1 |
Location
| Austria | 1 |
| Germany | 1 |
| Iran | 1 |
| Switzerland | 1 |
Assessments and Surveys
| ACT Assessment | 1 |
| Program for International… | 1 |
| Test of English as a Foreign… | 1 |
Janika Saretzki; Rosalie Andrae; Boris Forthmann; Mathias Benedek – Journal of Creative Behavior, 2025
Divergent thinking (DT) ability is widely regarded as a central cognitive capacity underlying creativity, but its assessment is challenged by the fact that DT tasks yield a variable number of responses. Various approaches for the scoring of DT tasks have been proposed, which differ in how responses are evaluated and aggregated within a task. The…
Descriptors: Creative Thinking, Creativity Tests, Scoring, Metacognition
Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025
Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…
Descriptors: Value Added Models, Tests, Testing, Scoring
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025
In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…
Descriptors: Automation, Grading, Computer Assisted Testing, Scoring
Katherine L. Buchanan; Milena Keller-Margulis; Amanda Hut; Weihua Fan; Sarah S. Mire; G. Thomas Schanding Jr. – Early Childhood Education Journal, 2025
There is considerable research regarding measures of early reading but much less in early writing. Nevertheless, writing is a critical skill for success in school and early difficulties in writing are likely to persist without intervention. A necessary step toward identifying those students who need additional support is the use of screening…
Descriptors: Writing Evaluation, Evaluation Methods, Emergent Literacy, Beginning Writing
Reuben S. Asempapa; Doris Lee – Discover Education, 2025
Across the world, standards and practices for preparing teachers of mathematics emphasize the importance of math modeling (MM) in developing students' mathematical thinking. The aim of this research study was to develop the Mathematical Modeling Knowledge Scale (MAMKS), capable of determining preservice teachers' (PSTs') knowledge of MM. The study…
Descriptors: Preservice Teachers, Preservice Teacher Education, Mathematics Education, Mathematics Curriculum
Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction
Jeff Allen; Ty Cruce – ACT Education Corp., 2025
This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…
Descriptors: College Entrance Examinations, Testing, Change, Scores
Laila El-Hamamsy; María Zapata-Cáceres; Estefanía Martín-Barroso; Francesco Mondada; Jessica Dehler Zufferey; Barbara Bruno; Marcos Román-González – Technology, Knowledge and Learning, 2025
The introduction of computing education into curricula worldwide requires multi-year assessments to evaluate the long-term impact on learning. However, no single Computational Thinking (CT) assessment spans primary school, and no group of CT assessments provides a means of transitioning between instruments. This study therefore investigated…
Descriptors: Cognitive Tests, Computation, Thinking Skills, Test Validity
Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025
This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…
Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

