ERIC - Search Results

Showing 1 to 15 of 72 results

Validity and Reliability of Speech Data in the Norwegian Registry of Cleft Lip and Palate

Peer reviewed

Øydis Hide; Dagrun Slettebø Daltveit; Åse Sivertsen; Anne Katherine Hvistendahl; Randi Lovise Kjerstad; Marit Berntsen Kvinnsland; Nina Helen Pedersen; Christina Sørensen – International Journal of Language & Communication Disorders, 2025

Background: Cleft lip and palate (CLP) treatment in Norway is centralized and multidisciplinary, with long-term follow-up from birth to adulthood. The Norwegian Registry of Cleft Lip and Palate was established to ensure high-quality care and enable systematic data collection. Speech data are a key component, assessed by speech-language therapists…

Descriptors: Foreign Countries, Validity, Reliability, Data Collection

Chasing Rainbows? Ofsted's Quest for Inter-Inspector Reliability

Peer reviewed

Pearson, Terry – FORUM: for promoting 3-19 comprehensive education, 2023

Ofsted has frequently defended the judgements made during inspections by claiming that inspection ratings are reliable, as shown by the results from the collection of studies the inspectorate has conducted. I outline the inspectorate's view of reliability and problematise the studies that it has carried out, noting that these provide insufficient…

Descriptors: Inspection, Interrater Reliability, Decision Making, Value Judgment

The Effects of X-Axis Time Compression on the Visual Analysis of Single-Case Data

Peer reviewed

Dart, Evan H.; Radley, Keith C. – Psychology in the Schools, 2023

Single-case design is a research methodology that entails repeated measurement to assess the influence of an independent variable on a dependent variable over time. Data collected in this manner are regularly analyzed using visual analysis of data displayed in a linear graph. Although there is agreement regarding critical elements of visual…

Descriptors: Research Design, Research Methodology, Data Collection, Data Analysis

The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues

Peer reviewed

Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022

How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…

Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making

Interrater Reliability in Systematic Review Methodology: Exploring Variation in Coder Decision-Making

Peer reviewed

Belur, Jyoti; Tompson, Lisa; Thornton, Amy; Simon, Miranda – Sociological Methods & Research, 2021

A methodologically sound systematic review is characterized by transparency, replicability, and a clear inclusion criterion. However, little attention has been paid to reporting the details of interrater reliability (IRR) when multiple coders are used to make decisions at various points in the screening and data extraction stages of a study. Prior…

Descriptors: Interrater Reliability, Decision Making, Accuracy, Coding

Agreement between Visual Inspection and Objective Analysis Methods: A Replication and Extension

Peer reviewed

Taylor, Tessa; Lanovaz, Marc J. – Journal of Applied Behavior Analysis, 2022

Behavior analysts typically rely on visual inspection of single-case experimental designs to make treatment decisions. However, visual inspection is subjective, which has led to the development of supplemental objective methods such as the conservative dual-criteria method. To replicate and extend a study conducted by Wolfe et al. (2018) on the…

Descriptors: Visual Perception, Artificial Intelligence, Decision Making, Evaluators

Who Gets the Grant? A Persona-Based Investigation into Research Funding Panelist Preferences

Peer reviewed

João M. Santos – Research Evaluation, 2024

The allocation of scientific funding through grant programs is crucial for research advancement. While independent peer panels typically handle evaluations, their decisions can lean on personal preferences that go beyond the stated criteria, leading to inconsistencies and potential biases. Given these concerns, our study employs a novel method,…

Descriptors: Grants, Program Proposals, Funding Formulas, Scientific Research

Using Inter-Rater Discourse to Trace the Origins of Disagreement: Towards Collective Reflective Practice in L2 Assessment

Peer reviewed

Matthews, Joshua – RELC Journal: A Journal of Language Teaching and Research, 2023

This article explores how the analysis of inter-rater discourse can be used to support collective reflective practice in second language (L2) assessment. To demonstrate, a focused case of the discourse between two experienced language teachers as they negotiate assessment decisions on L2 written texts is presented. Of particular interest was the…

Descriptors: Interrater Reliability, Discourse Analysis, Student Evaluation, Second Language Learning

Evidence on the Dimensionality and Reliability of Professional References' Ratings of Teacher Applicants. Working Paper No. 237-0620

Goldhaber, Dan; Grout, Cyrus; Wolf, Malcom; Martinkova, Patricia – National Center for Analysis of Longitudinal Data in Education Research (CALDER), 2020

There is growing interest in using measures of teacher applicant quality to improve hiring decisions, but the statistical properties of such measures are poorly understood. We present evidence on structured ratings solicited from teacher applicants' references. We find that the reference ratings capture only one underlying dimension of applicant…

Descriptors: Job Applicants, Teacher Selection, Interrater Reliability, Decision Making

Adaptation and Validation of a Test of Ethical Sensitivity in Teaching

Peer reviewed

Maxwell, Bruce; Boon, Helen; Tanchuk, Nicolas; Rauwerda, Bryan – Journal of Moral Education, 2021

This article documents the adaptation, piloting and validation of a measure of teachers' ethical sensitivity. To create the test, we modified a measure from dentistry drawing on literature in teacher professional ethics and drew on the expertise of professional ethics scholars and practitioners. Based on the results of Rasch analysis combined with…

Descriptors: Ethics, Moral Values, Scores, Teacher Education Programs

Depth-Perception-Based Representation in Holistic Rating on ESL Essay Writing

Peer reviewed

Lian Li; Jiehui Hu; Yu Dai; Ping Zhou; Wanhong Zhang – Reading & Writing Quarterly, 2024

This paper proposes to use depth perception to represent raters' decision in holistic evaluation of ESL essays, as an alternative medium to conventional form of numerical scores. The researchers verified the new method's accuracy and inter/intra-rater reliability by inviting 24 ESL teachers to perform different representations when rating 60…

Descriptors: Essays, Holistic Approach, Writing Evaluation, Accuracy

A Model-Data-Fit-Informed Approach to Score Resolution in Performance Assessments

Peer reviewed

Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021

Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…

Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making

Automated Assessment of Second Language Comprehensibility: Review, Training, Validation, and Generalization Studies

Peer reviewed

Saito, Kazuya; Macmillan, Konstantinos; Kachlicka, Magdalena; Kunihara, Takuya; Minematsu, Nobuaki – Studies in Second Language Acquisition, 2023

Whereas many scholars have emphasized the relative importance of "comprehensibility" as an ecologically valid goal for L2 speech training, testing, and development, eliciting listeners' judgments is time-consuming. Following calls for research on more efficient L2 speech rating methods in applied linguistics, and growing attention toward…

Descriptors: Second Language Learning, Second Language Instruction, Interrater Reliability, Speech Communication

Measuring Adaptive Intelligence of the Gifted through Critical Problem Analysis

Peer reviewed

Robert J. Sternberg; Jenna Landy; Jennifer Long – Roeper Review, 2024

Procedures for identifying the gifted often make use of tests of general intelligence, among other assessments. Robert J. Sternberg recently suggested that identification of the gifted should further involve assessment of what he refers to as adaptive intelligence--the ability to adapt to real-world environments. Such a conception of intelligence…

Descriptors: Intelligence, Intelligence Tests, Gifted, Identification

Vocal Development in a Large-Scale Crosslinguistic Corpus

Peer reviewed

Cychosz, Margaret; Cristia, Alejandrina; Bergelson, Elika; Casillas, Marisa; Baudet, Gladys; Warlaumont, Anne S.; Scaff, Camila; Yankowitz, Lisa; Seidl, Amanda – Developmental Science, 2021

This study evaluates whether early vocalizations develop in similar ways in children across diverse cultural contexts. We analyze data from daylong audio recordings of 49 children (1-36 months) from five different language/cultural backgrounds. Citizen scientists annotated these recordings to determine if child vocalizations contained canonical…

Descriptors: Cultural Context, Contrastive Linguistics, Audio Equipment, Cultural Differences
