ERIC - Search Results

Publication Date

In 2026	0
Since 2025	56
Since 2022 (last 5 years)	282
Since 2017 (last 10 years)	778
Since 2007 (last 20 years)	2040

Descriptor

Interrater Reliability	3122
Foreign Countries	654
Test Reliability	503
Evaluation Methods	502
Test Validity	410
Correlation	401
Scoring	347
Comparative Analysis	327
Scores	324
Validity	310
Student Evaluation	308
Measures (Individuals)	298
Evaluators	295
Rating Scales	282
Statistical Analysis	268
Higher Education	264
Psychometrics	241
Reliability	230
Observation	229
Scoring Rubrics	216
Test Construction	212
English (Second Language)	211
Teaching Methods	208
Writing Evaluation	206
Intervention	200
More ▼

Education Level

Higher Education	574
Postsecondary Education	420
Elementary Education	282
Secondary Education	179
Early Childhood Education	145
Elementary Secondary Education	120
Middle Schools	109
High Schools	85
Preschool Education	72
Junior High Schools	65
Adult Education	58
Primary Education	57
Kindergarten	45
Grade 4	41
Grade 5	40
Intermediate Grades	40
Grade 1	37
Grade 6	35
Grade 8	32
Grade 3	31
Grade 2	27
Grade 7	27
Grade 10	13
Grade 9	11
Two Year Colleges	8
More ▼

Audience

Researchers	130
Practitioners	42
Teachers	22
Administrators	11
Counselors	3
Policymakers	2

Location

Australia	56
Turkey	53
United Kingdom	46
Canada	45
Netherlands	40
China	38
California	37
United States	30
United Kingdom (England)	24
Taiwan	23
Germany	22
Japan	22
Pennsylvania	22
Florida	21
Sweden	21
Iran	19
North Carolina	19
Hong Kong	17
South Korea	17
Texas	17
Georgia	16
Israel	15
New Zealand	14
South Africa	14
Washington	14
More ▼

Laws, Policies, & Programs

No Child Left Behind Act 2001	13
Individuals with Disabilities…	7
Elementary and Secondary…	3
Race to the Top	3
Elementary and Secondary…	2
American Recovery and…	1
Americans with Disabilities…	1
Education Consolidation…	1
Education for All Handicapped…	1
Every Student Succeeds Act…	1
Improving Americas Schools…	1
Individuals with Disabilities…	1
Individuals with Disabilities…	1
Pell Grant Program	1
Rehabilitation Act 1973…	1
Stewart B McKinney Homeless…	1
Temporary Assistance for…	1
More ▼

What Works Clearinghouse Rating

Meets WWC Standards without Reservations	3
Meets WWC Standards with or without Reservations	3
Does not meet standards	3

Showing 61 to 75 of 3,122 results Save | Export

Interactional Competencies in Medical Student Admission -- What Makes a "Good Medical Doctor"?

Peer reviewed

Direct link

Leonie Fleck; Dorothee Amelung; Anna Fuchs; Benjamin Mayer; Malvin Escher; Lena Listunova; Jobst-Hendrik Schultz; Andreas Möltner; Clara Schütte; Tim Wittenberg; Isabella Schneider; Sabine C. Herpertz – Advances in Health Sciences Education, 2025

Doctors' interactional competencies play a crucial role in patient satisfaction, well-being, and compliance. Accordingly, it is in medical schools' interest to select candidates with strong interactional abilities. While Multiple Mini Interviews (MMIs) provide a useful context to assess such abilities, the evaluation of candidate performance…

Descriptors: Medical Students, Medical Schools, College Admission, Admission Criteria

Examining the Consistency of Instructor versus Large Language Model Ratings on Summary Content: Toward Checklist-Based Feedback Provision with Second Language Writers

Peer reviewed

Direct link

Yasuyo Sawaki; Yutaka Ishii; Hiroaki Yamada; Takenobu Tokunaga – Language Testing, 2025

This study examined the consistency between instructor ratings of learner-generated summaries and those estimated by a large language model (LLM) on summary content checklist items designed for undergraduate second language (L2) writing instruction in Japan. The effects of the LLM prompt design on the consistency between the two were also explored…

Descriptors: Interrater Reliability, Writing Teachers, College Faculty, Artificial Intelligence

Artificial Intelligence in International English Language Testing System Writing Assessments: A Comparative Study of Human Ratings and DeepAI

Peer reviewed
PDF on ERIC

Download full text

Somayeh Fathali; Fatemeh Mohajeri – Technology in Language Teaching & Learning, 2025

The International English Language Testing System (IELTS) is a high-stakes exam where Writing Task 2 significantly influences the overall scores, requiring reliable evaluation. While trained human raters perform this task, concerns about subjectivity and inconsistency have led to growing interest in artificial intelligence (AI)-based assessment…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Artificial Intelligence

Reliability Evidence for the NC Teacher Evaluation Process Using a Variety of Indicators of Inter-Rater Agreement

Peer reviewed
PDF on ERIC

Download full text

Holcomb, T. Scott; Lambert, Richard; Bottoms, Bryndle L. – Journal of Educational Supervision, 2022

In this study, various statistical indexes of agreement were calculated using empirical data from a group of evaluators (n = 45) of early childhood teachers. The group of evaluators rated ten fictitious teacher profiles using the North Carolina Teacher Evaluation Process (NCTEP) rubric. The exact and adjacent agreement percentages were calculated…

Descriptors: Interrater Reliability, Teacher Evaluation, Statistical Analysis, Early Childhood Teachers

Inter-Evaluator Reliability of Sagittal and Rotational Spinal Measurements from 3D Ultrasound Imaging of Healthy Females in Standing with Varying Arm Positions

Peer reviewed

Direct link

Aislinn Ganci; Miran Qazizada; Brianna Fehr; Ana Vucenovic; Edmond Lou; Eric Parent – Measurement in Physical Education and Exercise Science, 2024

Spinal alignment can be assessed without radiation using three-dimensional ultrasound imaging (3DUS). Reliable measurements could inform the ideal arm position for scoliosis radiographs. This study determined the inter-evaluator reliability of axial vertebral rotation (AVR) measurements and sagittal curve angles in healthy females from 3DUS spinal…

Descriptors: Foreign Countries, Young Adults, Adults, Adolescents

The Development of a Novel, Standardized, Norm-Referenced Arabic Discourse Assessment Tool (ADAT), Including an Examination of Psychometric Properties of Discourse Measures in Aphasia

Peer reviewed

Direct link

Reem S. W. Alyahya – International Journal of Language & Communication Disorders, 2024

Background: People with aphasia (PWA) typically exhibit deficits in spoken discourse. Discourse analysis is the gold standard approach to assess language deficits beyond sentence level. However, the available discourse assessment tools are biased towards English and European languages and Western culture. Additionally, there is a lack of consensus…

Descriptors: Arabic, Aphasia, Psychometrics, Test Construction

The Impact of Online Peer Assessment on Academic Performance in Higher Education: A Meta-Analytic Review

Peer reviewed
PDF on ERIC

Download full text

Kübra Karakaya Özyer – Journal of Educators Online, 2025

This meta-analytic study investigates the impact of online peer assessment on academic achievement in higher education. By synthesizing 20 effect sizes, we provide a comprehensive understanding of how online peer assessment influences student learning outcomes. The findings reveal a statistically significant positive effect (Hedges's g = 0.672),…

Descriptors: Electronic Learning, Peer Evaluation, Higher Education, Meta Analysis

Interrater Reliability of the Test of Gross Motor Development--Third Edition following Raters' Agreement on Measurement Criteria

Peer reviewed

Direct link

Carballo-Fazanes, Aida; Rey, Ezequiel; Valentini, Nadia C.; Varela-Casal, Cristina; Abelairas-Gómez, Cristian – Journal of Motor Learning and Development, 2023

We aimed to calculate interrater reliability of the Test of Gross Motor Development--Third Edition (TGMD-3) after raters reached a consensus regarding measurement criteria. Three raters measured the fundamental movement skills of 25 children on the TGMD-3 at two different times: (a) once when simply following the measurement criteria in the TGMD-3…

Descriptors: Motor Development, Children, Norm Referenced Tests, Interrater Reliability

Examining Inter-Rater Reliability of Evaluators Judging Teacher Performance: Proposing an Alternative to Cohen's Kappa. CEME Technical Report. CEMETR-2022-02

Download full text

Lambert, Richard G.; Holcomb, T. Scott; Bottoms, Bryndle – Center for Educational Measurement and Evaluation, 2022

The validity of the Kappa coefficient of chance-corrected agreement has been questioned when the prevalence of specific rating scale categories is low and agreement between raters is high. The researchers proposed the Lambda Coefficient of Rater-Mediated Agreement as an alternative to Kappa to address these concerns. Lambda corrects for chance…

Descriptors: Interrater Reliability, Evaluators, Rating Scales, Teacher Evaluation

New Tests of Rater Drift in Trend Scoring

Peer reviewed

Direct link

John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024

Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…

Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics

The Politics of Reading Textbooks: Intergenerational and International Reflections on China

Peer reviewed

Direct link

Liz Jackson; Michael W. Apple; Fei Yan; Jason Cong Lin; Chenxi Jiang; Tongzhou Li; Edward Vickers – Educational Philosophy and Theory, 2024

In this collective essay the authors consider the nature and consequences of reading and researching across difference in an international and intergenerational team, whose core members are focused on understanding how curriculum operates and the nature of textbook representation of diversity in Mainland China, Hong Kong, Taiwan, and Macau.…

Descriptors: Foreign Countries, Textbooks, Reading Research, Educational Research

Naive Listener Ratings of Speech Intelligibility over the Course of Motor-Based Intervention in Children with Childhood Apraxia of Speech

Peer reviewed

Direct link

Emily W. Wang; Maria I. Grigos – Journal of Speech, Language, and Hearing Research, 2024

Purpose: The aim of this study was to describe changes in speech intelligibility and interrater and intrarater reliability of naive listeners' ratings of words produced by young children diagnosed with childhood apraxia of speech (CAS) over a period of motor-based intervention (dynamic temporal and tactile cueing [DTTC]). Method: A total of 120…

Descriptors: Speech Communication, Intelligibility, Speech Impairments, Perceptual Motor Learning

Development of a Scoring Key to Evaluate the Creative Story Writing Levels of Secondary School Seventh Grade Students

Peer reviewed
PDF on ERIC

Download full text

Ebru Öztürk; Erol Duran – Educational Policy Analysis and Strategic Research, 2024

In this study, it was aimed to develop a rubric to evaluate the creative story writing skill levels of seventh grade secondary school students. The research was designed in quantitative research method and survey model. In the research, convenience sampling technique was used and 270 students studying at the seventh grade level of secondary school…

Descriptors: Scoring Rubrics, Writing Evaluation, Creative Writing, Middle School Students

Using Machine Learning to Score Multi-Dimensional Assessments of Chemistry and Physics

Peer reviewed

Direct link

Maestrales, Sarah; Zhai, Xiaoming; Touitou, Israel; Baker, Quinton; Schneider, Barbara; Krajcik, Joseph – Journal of Science Education and Technology, 2021

In response to the call for promoting three-dimensional science learning (NRC, 2012), researchers argue for developing assessment items that go beyond rote memorization tasks to ones that require deeper understanding and the use of reasoning that can improve science literacy. Such assessment items are usually performance-based constructed…

Descriptors: Artificial Intelligence, Scoring, Evaluation Methods, Chemistry

Large-Sample Variance of Fleiss Generalized Kappa

Peer reviewed

Direct link

Gwet, Kilem L. – Educational and Psychological Measurement, 2021

Cohen's kappa coefficient was originally proposed for two raters only, and it later extended to an arbitrarily large number of raters to become what is known as Fleiss' generalized kappa. Fleiss' generalized kappa and its large-sample variance are still widely used by researchers and were implemented in several software packages, including, among…

Descriptors: Sample Size, Statistical Analysis, Interrater Reliability, Computation

« Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | ... | 209

ProQuest LLC	86
Journal of Speech, Language,…	62
Educational and Psychological…	61
Journal of Autism and…	56
Grantee Submission	40
Language Testing	39
Online Submission	35
International Journal of…	34
Assessment & Evaluation in…	33
Research in Developmental…	31
Applied Measurement in…	28
Advances in Health Sciences…	26
Assessment for Effective…	26
ETS Research Report Series	25
Journal of Educational…	24
Educational Measurement:…	23
Measurement in Physical…	20
Language Assessment Quarterly	19
Psychology in the Schools	19
Topics in Early Childhood…	19
Psychological Assessment	18
Educational Assessment	16
Autism: The International…	15
Journal of Consulting and…	15
Personnel Psychology	15
More ▼

Lunz, Mary E.	10
Wind, Stefanie A.	10
Engelhard, George, Jr.	8
Epstein, Michael H.	8
Ingham, Roger J.	8
Johnson, Evelyn S.	8
Matson, Johnny L.	7
McLeod, Bryce D.	7
Moylan, Laura A.	7
Cason, Carolyn L.	6
Cordes, Anne K.	6
Jaeger, Richard M.	6
Johnson, Robert L.	6
Lecavalier, Luc	6
Plake, Barbara S.	6
Tasse, Marc J.	6
Wyse, Adam E.	6
Zheng, Yuzhu	6
Aman, Michael G.	5
Barton, Erin E.	5
Cason, Gerald J.	5
Coniam, David	5
Conroy, Maureen A.	5
Crawford, Angela R.	5
More ▼

Journal Articles	2553
Reports - Research	2241
Reports - Evaluative	515
Speeches/Meeting Papers	272
Reports - Descriptive	163
Tests/Questionnaires	162
Information Analyses	130
Dissertations/Theses -…	89
Opinion Papers	61
Numerical/Quantitative Data	31
Guides - Non-Classroom	11
Books	7
Collected Works - General	3
Guides - Classroom - Teacher	3
Non-Print Media	3
Book/Product Reviews	2
Collected Works - Serials	2
Dissertations/Theses	2
ERIC Digests in Full Text	2
ERIC Publications	2
Guides - General	2
Reports - General	2
Collected Works - Proceedings	1
Reference Materials -…	1
Reference Materials - General	1
More ▼

Test of English as a Foreign…	30
Child Behavior Checklist	18
National Assessment of…	14
Vineland Adaptive Behavior…	14
Autism Diagnostic Observation…	13
Strengths and Difficulties…	11
Woodcock Johnson Tests of…	10
Peabody Picture Vocabulary…	9
SAT (College Admission Test)	9
Wechsler Intelligence Scale…	9
Behavior Assessment System…	8
Dynamic Indicators of Basic…	8
Early Childhood Environment…	8
Graduate Record Examinations	8
International English…	7
Teacher Performance…	6
ACT Assessment	5
Advanced Placement…	5
Behavioral and Emotional…	5
Childhood Autism Rating Scale	5
Classroom Assessment Scoring…	5
Conners Teacher Rating Scale	5
Draw a Person Test	5
Raven Progressive Matrices	5
ACTFL Oral Proficiency…	4
More ▼