ERIC - Search Results

Showing 16 to 30 of 231 results

The Reliability of Simultaneous versus Individual Data Collection during Stuttering Assessment

Peer reviewed

Davidow, Jason H.; Ye, Jun; Edge, Robin L. – International Journal of Language & Communication Disorders, 2023

Background: Speech-language pathologists often multitask in order to be efficient with their commonly large caseloads. In stuttering assessment, multitasking often involves collecting multiple measures simultaneously. Aims: The present study sought to determine reliability when collecting multiple measures simultaneously versus individually.…

Descriptors: Graduate Students, Measurement, Reliability, Group Activities

Visualizing Agreement: Bland-Altman Plots as a Supplement to Inter-Rater Reliability Indices

Peer reviewed

Brogan L. Barr; Virginia V. W. McIntosh; Eileen F. Britt; Jennifer Jordan; Janet D. Carter – Measurement: Interdisciplinary Research and Perspectives, 2024

Even when raters demonstrate agreement in the use of a measure, limited score variability or violation of often-ignored statistical assumptions can result in lower reliability estimates than intuitively expected. This article uses data drawn from two randomized controlled trials of schema therapy and cognitive behavioral therapy for the treatment…

Descriptors: Evaluators, Interrater Reliability, Reliability, Measurement Techniques

Teacher Evaluation and Reliability: Additional Insights Gathered from Inter-Rater Reliability Analyses

Peer reviewed

Zepeda, Sally J.; Jimenez, Albert M. – Journal of Educational Supervision, 2019

Using a newly created teacher evaluation instrument, Inter-rater Reliability (IRR) analyses were conducted on four teacher videos as a means to establish instrument reliability. Raters included 42 principals and assistant principals in a southern US school district. The videos used spanned the teacher quality spectrum and the IRR findings across…

Descriptors: Teacher Evaluation, Interrater Reliability, Classroom Observation Techniques, Validity

Intra- and Inter-Rater Reliability of the Behaviour Mapping Schedule: A Direct Observational Tool for Classifying Children's Play Behaviour

Peer reviewed

Dankiw, Kylie A.; Baldock, Katherine L.; Kumar, Saravana; Tsiros, Margarita D. – Australasian Journal of Early Childhood, 2021

Identifying and describing children's play behaviours is an important component of evaluating child development. The Behaviour Mapping Schedule is a direct observational tool which aims to describe and quantify children's play behaviours but is yet to undergo reliability testing. This study aimed to determine the intra- and inter-rater reliability…

Descriptors: Interrater Reliability, Classification, Child Behavior, Play

Developing a Tool for Measuring Student Orientations with Respect to Understanding in Mathematical Learning

Peer reviewed

Siqi Huang – North American Chapter of the International Group for the Psychology of Mathematics Education, 2023

The goal of this paper is twofold. First, the paper clarifies and elaborates on an important theoretical construct called orientation with respect to understanding in mathematics, which denotes the degree to which students exhibit an inclination towards and demonstrate an earnest concern for understanding in mathematical learning. Second, the…

Descriptors: Mathematics Instruction, Teaching Methods, Problem Solving, Reliability

Spanish Validation of the Impact of Event Scale for People with Intellectual Disabilities, IES-ID

Peer reviewed

Nuñez-Polo, Mercedes H. – Journal of Mental Health Research in Intellectual Disabilities, 2022

Introduction: The aim of this study is to validate a Spanish version of the Impact of Event Scale on People with ID (IES-ID). Methods: IES-ID was administered to adults with ID (n = 120), analyzing internal consistency, inter-rater and test-retest reliability, criterion validity, construct validity and feasibility. Results: Good internal…

Descriptors: Spanish, Translation, Construct Validity, Factor Analysis

Adaptation, Content Validity and Reliability of the Autism Classification System of Functioning for Social Communication: From Toddlerhood to Adolescent-Aged Children with Autism

Peer reviewed

Di Rezze, Briano; Gentles, Stephen James; Hidecker, Mary Jo Cooley; Zwaigenbaum, Lonnie; Rosenbaum, Peter; Duku, Eric; Georgiades, Stelios; Roncadin, Caroline; Fang, Hanna; Tajik-Parvinchi, Diana; Viveiros, Helena – Journal of Autism and Developmental Disorders, 2022

The Autism Classification System of Functioning: Social Communication (ACSF) describes social communication functioning levels. First developed for preschoolers with ASD, this study tests an expanded age range (2-to-18 years). The ACSF rates the child's typical and best (i.e., capacity) performance. Qualitative methods tested parent and clinician…

Descriptors: Content Validity, Reliability, Autism Spectrum Disorders, Classification

Scoring Rubric Reliability and Internal Validity in Rater-Mediated EFL Writing Assessment: Insights from Many-Facet Rasch Measurement

Peer reviewed

Li, Wentao – Reading and Writing: An Interdisciplinary Journal, 2022

Scoring rubrics are known to be effective for assessing writing for both testing and classroom teaching purposes. How raters interpret the descriptors in a rubric can significantly impact the subsequent final score, and further, the descriptors may also color a rater's judgment of a student's writing quality. Little is known, however, about how…

Descriptors: Scoring Rubrics, Interrater Reliability, Writing Evaluation, Teaching Methods

The Correlation between Perceptual Ratings and Nasalance Scores in Resonance Disorders: A Systematic Review

Peer reviewed

Liu, Yilan; Lee, Sue Ann S.; Chen, Wenjun – Journal of Speech, Language, and Hearing Research, 2022

Introduction: Assessment of resonance characteristics is essential in research and clinical practice in individuals with velopharyngeal impairment. The purpose of this study was to systematically review correlations between auditory perceptual ratings and nasalance scores obtained by a nasometer in individuals with resonance disorders and to…

Descriptors: Correlation, Auditory Perception, Meta Analysis, Guidelines

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Perceptual and Acoustic Assessment of Strain Using Synthetically Modified Voice Samples

Peer reviewed

Park, Yeonggwang; Cádiz, Manuel Díaz; Nagle, Kathleen F.; Stepp, Cara E. – Journal of Speech, Language, and Hearing Research, 2020

Purpose: Assessment of strained voice quality is difficult due to the weak reliability of auditory-perceptual evaluation and lack of strong acoustic correlates. This study evaluated the contributions of relative fundamental frequency (RFF) and mid-to-high frequency noise to the perception of strain. Method: Stimuli were created using recordings of…

Descriptors: Acoustics, Audio Equipment, Auditory Perception, Correlation

Reliable Application of the MATH Taxonomy Sheds Light on Assessment Practices

Peer reviewed

Kinnear, George; Bennett, Max; Binnie, Rachel; Bolt, Róisín; Zheng, Yinglan – Teaching Mathematics and Its Applications, 2020

The MATH taxonomy classifies questions according to the mathematical skills required to answer them. It was created to aid the development of more balanced assessments in undergraduate mathematics and has since been used to compare different assessment regimes across school and university. To date, there has been no systematic investigation of the…

Descriptors: Taxonomy, Mathematics Instruction, Teaching Methods, Reliability

How Much Is Enough? Evaluating Intervention Implementation Efficiently

Peer reviewed

Fritz, Ronda; Harn, Beth; Biancarosa, Gina; Lucero, Audrey; Flannery, K. Brigid – Assessment for Effective Intervention, 2019

This study investigated the use of brief observations to measure implementation of small group interventions using the Quality of Intervention Delivery and Receipt (QIDR) tool. Videos of 10-min segments representing the beginning, middle, and end of each 30-min intervention lesson were coded for implementation. Results indicated that (a)…

Descriptors: Intervention, Program Implementation, Efficiency, Observation

Analytic or Holistic: A Study of Agreement between Different Grading Models

Peer reviewed

Jönsson, Anders; Balan, Andreia – Practical Assessment, Research & Evaluation, 2018

Research on teachers' grading has shown that there is great variability among teachers regarding both the process and product of grading, resulting in low comparability and issues of inequality when using grades for selection purposes. Despite this situation, not much is known about the merits or disadvantages of different models for grading. In…

Descriptors: Grading, Models, Reliability, Validity

The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues

Peer reviewed

Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022

How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…

Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
