Publication Date
    In 2025: 2
    Since 2024: 10
    Since 2021 (last 5 years): 42
    Since 2016 (last 10 years): 62
    Since 2006 (last 20 years): 99
Descriptor
    Scoring: 99
    Test Format: 99
    Test Items: 46
    Computer Assisted Testing: 37
    Test Validity: 22
    Test Reliability: 20
    Foreign Countries: 18
    Language Tests: 16
    Test Construction: 16
    Student Evaluation: 15
    Comparative Analysis: 14
    …
Author
    Kim, Sooyeon: 3
    Wind, Stefanie A.: 3
    Boyer, Michelle: 2
    Cheng, Liying: 2
    Guo, Wenjing: 2
    Kim, Doyoung: 2
    Martin, Michael O., Ed.: 2
    Moses, Tim: 2
    Mullis, Ina V. S., Ed.: 2
    Stergiopoulos, Charalampos: 2
    Triantis, Dimos: 2
    …
Audience
    Administrators: 7
    Teachers: 6
    Students: 2
    Parents: 1
Location
    New York: 8
    Louisiana: 4
    United States: 3
    Asia: 2
    California: 2
    China: 2
    Turkey: 2
    United Kingdom (England): 2
    Arizona: 1
    Australia: 1
    Austria: 1
    …
Laws, Policies, & Programs
    Individuals with Disabilities…: 1
    No Child Left Behind Act 2001: 1
Stefanie A. Wind; Yuan Ge – Measurement: Interdisciplinary Research and Perspectives, 2024
Mixed-format assessments made up of multiple-choice (MC) items and constructed-response (CR) items scored using rater judgments involve unique psychometric considerations. When these item types are combined to estimate examinee achievement, information about the psychometric quality of each component can depend on that of the other. For…
Descriptors: Interrater Reliability, Test Bias, Multiple Choice Tests, Responses
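The interrater-reliability question this entry raises can be made concrete with a standard agreement statistic. Below is a minimal sketch in Python, not the authors' method: quadratically weighted Cohen's kappa between two raters of a constructed-response item, with invented score vectors.

    # Hypothetical CR rubric scores from two raters (toy data).
    from sklearn.metrics import cohen_kappa_score

    rater_a = [0, 1, 2, 2, 3, 1, 0, 2]
    rater_b = [0, 1, 2, 3, 3, 1, 1, 2]

    # Quadratic weights penalize large disagreements more heavily,
    # a common choice for ordinal rubric scores.
    kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
    print(f"Quadratically weighted kappa: {kappa:.2f}")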
Brian E. Clauser; Victoria Yaneva; Peter Baldwin; Le An Ha; Janet Mee – Applied Measurement in Education, 2024
Multiple-choice questions have become ubiquitous in educational measurement because the format allows for efficient and accurate scoring. Nonetheless, interest in constructed-response formats persists. This interest has driven efforts to develop computer-based scoring procedures that can accurately and efficiently score these items.…
Descriptors: Computer Uses in Education, Artificial Intelligence, Scoring, Responses
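As a toy illustration of automated constructed-response scoring in general, not the procedure studied in this article, a short response can be scored by its TF-IDF cosine similarity to a model answer, with low-similarity responses routed to human review; all texts below are invented.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    model_answer = "the heart pumps oxygenated blood through the arteries"
    responses = [
        "blood is pumped by the heart into the arteries",  # on topic
        "plants absorb sunlight to make food",             # off topic
    ]

    # Similarity of each response to the model answer on a 0-1 scale.
    matrix = TfidfVectorizer().fit_transform([model_answer] + responses)
    sims = cosine_similarity(matrix[0], matrix[1:]).ravel()
    for resp, s in zip(responses, sims):
        print(f"{s:.2f}  {resp}")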
Zesch, Torsten; Horbach, Andrea; Zehner, Fabian – Educational Measurement: Issues and Practice, 2023
In this article, we systematize the factors influencing performance and feasibility of automatic content scoring methods for short text responses. We argue that performance (i.e., how well an automatic system agrees with human judgments) mainly depends on the linguistic variance seen in the responses and that this variance is indirectly influenced…
Descriptors: Influences, Academic Achievement, Feasibility Studies, Automation
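The paper's central claim, that scoring performance tracks the linguistic variance of the responses, can be approximated crudely by counting how many distinct normalized answer strings a prompt elicits. A rough sketch with invented data and an assumed normalization:

    from collections import Counter

    answers = ["Photosynthesis", "photosynthesis!", "making food from light",
               "photo synthesis", "the plant makes sugar using sunlight"]

    def normalize(text: str) -> str:
        # Lowercase and strip punctuation; a deliberately naive proxy.
        return "".join(c for c in text.lower() if c.isalnum() or c.isspace()).strip()

    variants = Counter(normalize(a) for a in answers)
    # Fewer distinct forms -> lower variance -> easier automatic scoring.
    print(f"{len(variants)} distinct forms over {len(answers)} responses")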
Selcuk Acar; Peter Organisciak; Denis Dumas – Journal of Creative Behavior, 2025
In this three-study investigation, we applied various approaches to score drawings created in response to both Form A and Form B of the Torrance Tests of Creative Thinking-Figural (broadly TTCT-F) as well as the Multi-Trial Creative Ideation task (MTCI). We focused on TTCT-F in Study 1, and utilizing a random forest classifier, we achieved 79% and…
Descriptors: Scoring, Computer Assisted Testing, Models, Correlation
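For orientation only, here is a skeletal random forest classifier of the kind the abstract mentions, assuming drawing features have already been extracted into vectors; the features and labels below are random placeholders, so accuracy lands near chance rather than the study's 79%.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 32))    # hypothetical drawing-feature vectors
    y = rng.integers(0, 2, size=200)  # hypothetical scoring labels

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_tr, y_tr)
    print(f"Holdout accuracy: {clf.score(X_te, y_te):.2f}")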
Herwin, Herwin; Pristiwaluyo, Triyanto; Ruslan, Ruslan; Dahalan, Shakila Che – Cypriot Journal of Educational Sciences, 2022
The application of multiple-choice tests often does not consider the scoring technique and the number of choices. The study aims to describe the effect of the scoring technique and the number of options on the reliability of multiple-choice objective tests in elementary school social studies. The study is quantitative research with…
Descriptors: Scoring, Multiple Choice Tests, Test Reliability, Elementary School Students
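One classic scoring technique in this literature is formula scoring (correction for guessing), in which the penalty for a wrong answer depends directly on the number of options k; a sketch of the general idea, not necessarily the technique examined in the study:

    def corrected_score(right: int, wrong: int, k: int) -> float:
        """Formula score: rights minus wrongs discounted by (k - 1)."""
        return right - wrong / (k - 1)

    # With 4 options per item, 30 right and 10 wrong:
    print(corrected_score(30, 10, 4))  # 26.67; blind guessing nets ~0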
Jing Ma – ProQuest LLC, 2024
This study investigated the impact of deferred scoring of polytomous items on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow-test approach, a simulation study was conducted across various test designs, test lengths, and numbers and locations of polytomous items. Results showed that while…
Descriptors: Scoring, Adaptive Testing, Test Items, Classification
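Scoring a polytomous item typically rests on a polytomous IRT model such as the generalized partial credit model; the sketch below computes GPCM category probabilities with invented parameters (the shadow-test machinery itself is not shown).

    import numpy as np

    def gpcm_probs(theta: float, a: float, b: list[float]) -> np.ndarray:
        """Category probabilities given thresholds b (category 0 is the base)."""
        steps = np.concatenate(([0.0], a * (theta - np.asarray(b))))
        logits = np.cumsum(steps)
        expd = np.exp(logits - logits.max())  # stabilized softmax
        return expd / expd.sum()

    # A 4-category item at ability theta = 0.5:
    print(gpcm_probs(theta=0.5, a=1.2, b=[-1.0, 0.0, 1.0]).round(3))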
Rebecka Weegar; Peter Idestam-Almquist – International Journal of Artificial Intelligence in Education, 2024
Machine learning methods can be used to reduce the manual workload in exam grading, making it possible for teachers to spend more time on other tasks. However, when it comes to grading exams, fully eliminating manual work is not yet possible even with very accurate automated grading, as any grading mistakes could have significant consequences for…
Descriptors: Grading, Computer Assisted Testing, Introductory Courses, Computer Science Education
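The hybrid workflow the abstract points toward, auto-grading only confident predictions and deferring the rest to a human, can be sketched with a confidence threshold; the model, data, and cutoff below are all placeholders.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    X_train = rng.normal(size=(100, 5))      # hypothetical answer features
    y_train = rng.integers(0, 2, size=100)   # hypothetical pass/fail grades
    X_new = rng.normal(size=(10, 5))

    model = LogisticRegression().fit(X_train, y_train)
    confidence = model.predict_proba(X_new).max(axis=1)

    THRESHOLD = 0.9  # hypothetical cutoff; tune to the cost of a misgrade
    for i, p in enumerate(confidence):
        route = "auto-grade" if p >= THRESHOLD else "human review"
        print(f"exam {i}: confidence {p:.2f} -> {route}")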
Meaghan McKenna; Hope Gerde; Nicolette Grasley-Boy – Reading and Writing: An Interdisciplinary Journal, 2025
This article describes the development and administration of the "Kindergarten-Second Grade (K-2) Writing Data-Based Decision Making (DBDM) Survey." The "K-2 Writing DBDM Survey" was developed to learn more about current DBDM practices specific to early writing. A total of 376 educational professionals (175 general education…
Descriptors: Writing Evaluation, Writing Instruction, Preschool Teachers, Kindergarten
McCaffrey, Daniel F.; Casabianca, Jodi M.; Ricker-Pedley, Kathryn L.; Lawless, René R.; Wendler, Cathy – ETS Research Report Series, 2022
This document describes a set of best practices for developing, implementing, and maintaining the critical process of scoring constructed-response tasks. These practices address both the use of human raters and automated scoring systems as part of the scoring process and cover the scoring of written, spoken, performance, or multimodal responses.…
Descriptors: Best Practices, Scoring, Test Format, Computer Assisted Testing
Susan K. Johnsen – Gifted Child Today, 2024
The author provides a checklist for educators who are selecting technically adequate tests for identifying and referring students for gifted education services and programs. The checklist includes questions related to how the test was normed and to reliability and validity studies, as well as questions related to types of scores, administration, and…
Descriptors: Test Selection, Academically Gifted, Gifted Education, Test Validity
Han, Chao – Language Testing, 2022
Over the past decade, testing and assessing spoken-language interpreting has garnered an increasing amount of attention from stakeholders in interpreter education, professional certification, and interpreting research. This is because in these fields assessment results provide a critical evidential basis for high-stakes decisions, such as the…
Descriptors: Translation, Language Tests, Testing, Evaluation Methods
Glazer, Nancy; Wolfe, Edward W. – Applied Measurement in Education, 2020
This introductory article describes how constructed-response scoring is carried out, particularly the rater-monitoring processes, and illustrates three potential designs for conducting rater monitoring in an operational scoring project. The introduction also presents a framework for interpreting research conducted by those who study the constructed…
Descriptors: Scoring, Test Format, Responses, Predictor Variables
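One common rater-monitoring device, sketched here as a general idea rather than any of the authors' three designs, is to seed pre-scored validity papers into each rater's queue and flag raters whose exact-agreement rate falls below a tolerance; all values are invented.

    validity_true = [2, 3, 1, 2, 4, 3]  # hypothetical pre-scored papers
    rater_scores = {
        "rater_01": [2, 3, 1, 2, 4, 3],
        "rater_02": [2, 2, 1, 3, 3, 3],
    }

    for rater, scores in rater_scores.items():
        agree = sum(a == b for a, b in zip(scores, validity_true)) / len(validity_true)
        flag = "  <- review" if agree < 0.8 else ""
        print(f"{rater}: exact agreement {agree:.2f}{flag}")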
Baldwin, Peter; Clauser, Brian E. – Journal of Educational Measurement, 2022
While score comparability across test forms typically relies on common (or randomly equivalent) examinees or items, innovations in item formats, test delivery, and efforts to extend the range of score interpretation may require a special data collection before examinees or items can be used in this way--or may be incompatible with common examinee…
Descriptors: Scoring, Testing, Test Items, Test Format
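For contrast with the conventional approach the abstract starts from, here is a textbook linear-equating sketch under randomly equivalent groups, matching the mean and standard deviation of a new form to a reference form; the scores are simulated and this is not the article's method.

    import numpy as np

    rng = np.random.default_rng(2)
    form_x = rng.normal(70, 10, size=500)  # reference form X scores
    form_y = rng.normal(65, 12, size=500)  # new form Y scores

    def linear_equate(y, x_ref, y_ref):
        """Map a form-Y score onto the form-X scale by matching mean and SD."""
        return x_ref.mean() + x_ref.std() / y_ref.std() * (y - y_ref.mean())

    print(f"Form-Y score 77 -> form-X scale {linear_equate(77, form_x, form_y):.1f}")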
Harrison, Scott; Kroehne, Ulf; Goldhammer, Frank; Lüdtke, Oliver; Robitzsch, Alexander – Large-scale Assessments in Education, 2023
Background: Mode effects, the variations in item and scale properties attributed to the mode of test administration (paper vs. computer), have stimulated research around test equivalence and trend estimation in PISA. The PISA assessment framework provides the backbone to the interpretation of the results of the PISA test scores. However, an…
Descriptors: Scoring, Test Items, Difficulty Level, Foreign Countries
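A minimal mode-effect check, far short of the PISA methodology: compare an item's proportion-correct between a paper group and a computer group, with a standard error for the difference; the responses are simulated.

    import numpy as np

    rng = np.random.default_rng(3)
    paper = rng.binomial(1, 0.62, size=400)     # hypothetical paper responses
    computer = rng.binomial(1, 0.55, size=400)  # hypothetical screen responses

    diff = paper.mean() - computer.mean()
    se = np.sqrt(paper.var(ddof=1) / len(paper)
                 + computer.var(ddof=1) / len(computer))
    print(f"p-correct: paper {paper.mean():.2f}, computer {computer.mean():.2f}; "
          f"difference {diff:.2f} (SE {se:.2f})")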
Gustafsson, Martin; Barakat, Bilal Fouad – Comparative Education Review, 2023
International assessments inform education policy debates, yet little is known about their floor effects: To what extent do they fail to differentiate between the lowest performers, and what are the implications of this? TIMSS, SACMEQ, and LLECE data are analyzed to answer this question. In TIMSS, floor effects have been reduced through the…
Descriptors: Achievement Tests, Elementary Secondary Education, International Assessment, Foreign Countries
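The floor-effect question can be operationalized roughly as the share of examinees scoring at or below what guessing alone would produce; a sketch with invented scores and test specifications:

    import numpy as np

    rng = np.random.default_rng(4)
    scores = rng.binomial(40, 0.35, size=1000)  # hypothetical raw scores
    n_items, n_options = 40, 4
    chance = n_items / n_options  # expected score from blind guessing

    floor_share = np.mean(scores <= chance)
    print(f"{floor_share:.1%} of examinees at or below the chance score ({chance:.0f})")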