ERIC - Search Results

Showing 1 to 15 of 50 results

Evaluating the Consistency and Reliability of Attribution Methods in Automated Short Answer Grading (ASAG) Systems: Toward an Explainable Scoring System

Peer reviewed

Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025

In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…

Descriptors: Automation, Grading, Computer Assisted Testing, Scoring

A Systematic Review of Early Writing Assessment Tools

Peer reviewed

Katherine L. Buchanan; Milena Keller-Margulis; Amanda Hut; Weihua Fan; Sarah S. Mire; G. Thomas Schanding Jr. – Early Childhood Education Journal, 2025

There is considerable research regarding measures of early reading but much less in early writing. Nevertheless, writing is a critical skill for success in school and early difficulties in writing are likely to persist without intervention. A necessary step toward identifying those students who need additional support is the use of screening…

Descriptors: Writing Evaluation, Evaluation Methods, Emergent Literacy, Beginning Writing

Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients

Peer reviewed

Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022

The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…

Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory

Reconsidering the Assessment Policy: Practical Use of Liberal Multiple-Choice Tests (SAC Method)

Peer reviewed

Cesur, Kursat – Educational Policy Analysis and Strategic Research, 2019

Examinees' performances are assessed using a wide variety of different techniques. Multiple-choice (MC) tests are among the most frequently used ones. Nearly all standardized achievement tests make use of MC test items and there is a variety of ways to score these tests. The study compares number right and liberal scoring (SAC) methods. Mixed…

Descriptors: Multiple Choice Tests, Scoring, Evaluation Methods, Guessing (Tests)

2023-2024 NSCAS Growth: English Language Arts, Mathematics, and Science Technical Report

Nebraska Department of Education, 2024

The Nebraska Student-Centered Assessment System (NSCAS) is a statewide assessment system that embodies Nebraska's holistic view of students and helps them prepare for success in postsecondary education, career, and civic life. It uses multiple measures throughout the year to provide educators and decision-makers at all levels with the insights…

Descriptors: Student Evaluation, Evaluation Methods, Elementary School Students, Middle School Students

Testing Methodology in the Student Learning Process

Peer reviewed

Gorbunova, Tatiana N. – European Journal of Contemporary Education, 2017

The subject of the research is to build methodologies to evaluate the student knowledge by testing. The author points to the importance of feedback about the mastering level in the learning process. Testing is considered as a tool. The object of the study is to create the test system models for defence practice problems. Special attention is paid…

Descriptors: Testing, Evaluation Methods, Feedback (Response), Simulation

A Review of Evidence Presented in Support of Three Key Claims in the Validity Argument for the "TextEvaluator"® Text Analysis Tool. Research Report. ETS RR-16-12

Peer reviewed

Sheehan, Kathleen M. – ETS Research Report Series, 2016

The "TextEvaluator"® text analysis tool is a fully automated text complexity evaluation tool designed to help teachers and other educators select texts that are consistent with the text complexity guidelines specified in the Common Core State Standards (CCSS). This paper provides an overview of the TextEvaluator measurement approach and…

Descriptors: Automation, Evaluation Methods, Reading Material Selection, Common Core State Standards

ITC Guidelines for the Large-Scale Assessment of Linguistically and Culturally Diverse Populations

Peer reviewed

International Journal of Testing, 2019

These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…

Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage

Subjective Scoring of Divergent Thinking: Examining the Reliability of Unusual Uses, Instances, and Consequences Tasks

Peer reviewed

Silvia, Paul J. – Thinking Skills and Creativity, 2011

The present research examined the reliability of three types of divergent thinking tasks (unusual uses, instances, consequences/implications) and two types of subjective scoring (an average across all responses vs. the responses people chose as their top-two responses) within a latent variable framework, using the maximal-reliability "H"…

Descriptors: Scoring, Creative Thinking, Thinking Skills, Test Reliability

Retell as an Indicator of Reading Comprehension

Peer reviewed

Reed, Deborah K.; Vaughn, Sharon – Scientific Studies of Reading, 2012

The purpose of this narrative synthesis is to determine the reliability and validity of retell protocols for assessing reading comprehension of students in grades K-12. Fifty-four studies were systematically coded for data related to the administration protocol, scoring procedures, and technical adequacy of the retell component. Retell was…

Descriptors: Reading Comprehension, Reading Difficulties, Elementary Secondary Education, Learning Disabilities

Using Calibrated Exemplars in the Teacher-Assessment of Writing: An Empirical Study

Peer reviewed

Heldsinger, Sandra A.; Humphry, Stephen M. – Educational Research, 2013

Background: Many in education argue for the importance of incorporating teacher judgements in the assessment and reporting of student performance. Advocates of such an approach are cognisant, though, that obtaining a satisfactory level of consistency in teacher judgements poses a challenge. Purpose: This study investigates the extent to which the…

Descriptors: Evaluation Methods, Student Evaluation, Teacher Attitudes, Comparative Analysis

Scoring Subjectivity and Item Performance on Measures Used to Assess Violence Risk: The PCL-R and HCR-20 as Exemplars

Peer reviewed

Rufino, Katrina A.; Boccaccini, Marcus T.; Guy, Laura S. – Assessment, 2011

Although reliability is essential to validity, most research on violence risk assessment tools has paid little attention to strategies for improving rater agreement. The authors evaluated the degree to which perceived subjectivity in scoring guidelines for items from two measures--the Psychopathy Checklist-Revised (PCL-R) and the Historical,…

Descriptors: Risk Management, Predictive Validity, Interrater Reliability, Scoring

Test Review: Kovacs, M. "Children's Depression Inventory 2 (CDI 2)" (2nd ed.). North Tonawanda, NY: Multi-Health Systems Inc, 2011

Peer reviewed

Bae, Yunhee – Journal of Psychoeducational Assessment, 2012

This article presents a review of the Children's Depression Inventory 2 (CDI 2), published by Multi-Health Systems (MHS) to assess depressive symptoms in 7- to 17-year-old children and adolescents. Given the importance of early diagnosis and treatment (Kovacs & Devlin, 1998), the CDI 2 can assist professionals to pinpoint critical depressive…

Descriptors: Disability Identification, Depression (Psychology), Mental Disorders, Norms

Gathering Feedback for Teaching: Combining High-Quality Observations with Student Surveys and Achievement Gains. Research Paper. MET Project

Kane, Thomas J.; Staiger, Douglas O. – Bill & Melinda Gates Foundation, 2012

There is a growing consensus that teacher evaluation in the United States is fundamentally broken. Few would argue that a system that tells 98 percent of teachers they are "satisfactory" benefits anyone--including teachers. The nation's collective failure to invest in high-quality professional feedback to teachers is inconsistent with…

Descriptors: Teacher Effectiveness, Achievement Gains, Evaluation Methods, Teaching Methods

Gathering Feedback for Teaching: Combining High-Quality Observations with Student Surveys and Achievement Gains. Policy and Practice Brief. MET Project

Kane, Thomas J.; Staiger, Douglas O. – Bill & Melinda Gates Foundation, 2012

Research has long been clear that teachers matter more to student learning than any other in-school factor. Improving the quality of teaching is critical to student success. Yet only recently have many states and districts begun to take seriously the importance of evaluating teacher performance and providing teachers with the feedback they need to…

Descriptors: Teacher Effectiveness, Achievement Gains, Evaluation Methods, Teaching Methods
