ERIC - Search Results

Showing all 7 results

Exploring Differences in Measurement and Reporting of Classroom Observation Inter-Rater Reliability

Peer reviewed

Wilhelm, Anne Garrison; Gillespie Rouse, Amy; Jones, Francesca – Practical Assessment, Research & Evaluation, 2018

Although inter-rater reliability is an important aspect of using observational instruments, it has received little theoretical attention. In this article, we offer some guidance for practitioners and consumers of classroom observations so that they can make decisions about inter-rater reliability, both for study design and in the reporting of data…

Descriptors: Interrater Reliability, Measurement, Observation, Educational Research

Applying Kane's Validity Framework to a Simulation Based Assessment of Clinical Competence

Peer reviewed

Tavares, Walter; Brydges, Ryan; Myre, Paul; Prpic, Jason; Turner, Linda; Yelle, Richard; Huiskamp, Maud – Advances in Health Sciences Education, 2018

Assessment of clinical competence is complex and inference based. Trustworthy and defensible assessment processes must have favourable evidence of validity, particularly where decisions are considered high stakes. We aimed to organize, collect and interpret validity evidence for a high stakes simulation based assessment strategy for certifying…

Descriptors: Competence, Simulation, Allied Health Personnel, Certification

The Impact of Rater Variability on Relationships among Different Effect-Size Indices for Inter-Rater Agreement between Human and Automated Essay Scoring

Yun, Jiyeo – ProQuest LLC, 2017

Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…

Descriptors: Interrater Reliability, Essays, Scoring, Evaluators

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

New Stuff in I/O (In-Baskets and Orals). The Development, Administration and Scoring of In-Baskets and Orals for the New York State Correction Captain Examination.

Kaiser, Paul D.; Brull, Harry – 1994

The design, administration, scoring, and results of the 1993 New York State Correctional Captain Examination are described. The examination was administered to 405 candidates. As in previous Sergeant and Lieutenant examinations, candidates also completed latent image written simulation problems and open/closed book multiple choice test components.…

Descriptors: Competitive Selection, Correctional Rehabilitation, Decision Making, Educational Innovation

Traditional In-Baskets vs. the General Management In-Basket (GMIB).

Joines, Richard C. – 1991

The development and validation of the General Management In-Basket (GMIB) is described. The GMIB is a theory-based generic in-basket simulation, designed to assess supervisory and management skills independent of any job classification. Three of the 15 in-basket items in the GMIB are critical and are scored on a 0-5 scale. The remaining 12 items…

Descriptors: Administrator Evaluation, Concurrent Validity, Factor Analysis, Interrater Reliability

The Investigator Planning Exercise: The Selection of Detectives in the Chicago Police Department.

Conley, Patrick; Jegerski, Jane – 1991

Construction of a work sample test, the Investigator Planning Exercise (IPE), for the job of detective in the Chicago (Illinois) Police Department is described. Simulated crime scenarios, a mock crime scene, and five checklists of necessary skills (i.e., ability to summarize and communicate facts, identify inconsistencies, and determine the next…

Descriptors: Check Lists, Deduction, Evaluators, Interrater Reliability
