ERIC - Search Results

Showing all 10 results

A Model-Data-Fit-Informed Approach to Score Resolution in Performance Assessments

Peer reviewed

Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021

Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…

Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making

The Impact of Rater Variability on Relationships among Different Effect-Size Indices for Inter-Rater Agreement between Human and Automated Essay Scoring

Yun, Jiyeo – ProQuest LLC, 2017

Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…

Descriptors: Interrater Reliability, Essays, Scoring, Evaluators

How Do Raters Judge Spoken Vocabulary?

Peer reviewed

Li, Hui – English Language Teaching, 2016

The aim of the study was to investigate how raters come to their decisions when judging spoken vocabulary. Segmental rating was introduced to quantify raters' decision-making process. It is hoped that this simulated study brings fresh insight to future methodological considerations with spoken data. Twenty trainee raters assessed five Chinese…

Descriptors: Foreign Countries, Evaluators, Interrater Reliability, Decision Making

Refining Methods for Estimating Critical Values for an Alignment Index

Peer reviewed

Polikoff, Morgan S.; Fulmer, Gavin W. – Journal of Research on Educational Effectiveness, 2013

The alignment among standards, assessments, and teachers' instruction is an essential element of standards-based educational reforms. The Surveys of Enacted Curriculum (SEC) is the only common tool that can be used to measure the alignment among all three of these sources (Martone & Sireci, 2009). Prior SEC alignment work has been limited by…

Descriptors: Alignment (Education), Academic Standards, Educational Assessment, Instruction

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

Understanding Discrepancies in Rater Judgement on National-Level Oral Examination Tasks

Ang-Aw, Hui Teng; Goh, Christine Chuen Meng – RELC Journal: A Journal of Language Teaching and Research, 2011

The oral examination is an important component of the high-stakes "O" Level examination in Singapore taken by 16-17 year olds whose first language may or may not be English. In spite of this, there has been sparse research into the examination. This paper reports findings of an exploratory study which attempted to determine whether there…

Descriptors: Protocol Analysis, Rating Scales, Examiners, Foreign Countries

Rater Training to Support High-Stakes Simulation-Based Assessments

Peer reviewed

Feldman, Moshe; Lazzara, Elizabeth H.; Vanderbilt, Allison A.; DiazGranados, Deborah – Journal of Continuing Education in the Health Professions, 2012

Competency-based assessment and an emphasis on obtaining higher-level outcomes that reflect physicians' ability to demonstrate their skills have created a need for more advanced assessment practices. Simulation-based assessments provide medical education planners with tools to better evaluate the 6 Accreditation Council for Graduate Medical…

Descriptors: Performance Based Assessment, Physicians, Accuracy, High Stakes Tests

Score Resolution in Essay Grading: A View from a Signal Detection Model of Rater Behavior

DeCarlo, Lawrence T.; Kim, YoungKoung – College Board, 2008

[Slides] presented at the American Educational Research Association (AERA) Conference in New York in March 2008. This presentation explores what cues are used as a deciding factor in essay scoring by the essay grader.

Descriptors: Essays, Grading, Evaluation Criteria, Scoring Rubrics

New Stuff in I/O (In-Baskets and Orals). The Development, Administration and Scoring of In-Baskets and Orals for the New York State Correction Captain Examination.

Kaiser, Paul D.; Brull, Harry – 1994

The design, administration, scoring, and results of the 1993 New York State Correctional Captain Examination are described. The examination was administered to 405 candidates. As in previous Sergeant and Lieutenant examinations, candidates also completed latent image written simulation problems and open/closed book multiple choice test components.…

Descriptors: Competitive Selection, Correctional Rehabilitation, Decision Making, Educational Innovation

The Investigator Planning Exercise: The Selection of Detectives in the Chicago Police Department.

Conley, Patrick; Jegerski, Jane – 1991

Construction of a work sample test, the Investigator Planning Exercise (IPE), for the job of detective in the Chicago (Illinois) Police Department is described. Simulated crime scenarios, a mock crime scene, and five checklists of necessary skills (i.e., ability to summarize and communicate facts, identify inconsistencies, and determine the next…

Descriptors: Check Lists, Deduction, Evaluators, Interrater Reliability
