Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
| More ▼ | |
Source
Author
Publication Type
Education Level
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
| More ▼ | |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Bryce D. McLeod; Nicole Porter; Aaron Hogue; Emily M. Becker-Haimes; Amanda Jensen-Doss – Grantee Submission, 2023
Objective: The precise measurement of treatment fidelity (quantity and quality in the delivery of treatment strategies in an intervention) is essential for intervention development, evaluation, and implementation. Various informants are used in fidelity assessment (e.g., observers, practitioners [clinicians, teachers], clients), but these…
Descriptors: Measurement, Fidelity, Educational Research, Evidence Based Practice
Bonett, Douglas G. – Journal of Educational and Behavioral Statistics, 2022
The limitations of Cohen's ? are reviewed and an alternative G-index is recommended for assessing nominal-scale agreement. Maximum likelihood estimates, standard errors, and confidence intervals for a two-rater G-index are derived for one-group and two-group designs. A new G-index of agreement for multirater designs is proposed. Statistical…
Descriptors: Statistical Inference, Statistical Data, Interrater Reliability, Design
Marcus Messer; Neil C. C. Brown; Michael Kölling; Miaojing Shi – ACM Transactions on Computing Education, 2025
Providing consistent summative assessment to students is important, as the grades they are awarded affect their progression through university and future career prospects. While small cohorts are typically assessed by a single assessor, such as the module/class leader, larger cohorts are often assessed by multiple assessors, typically teaching…
Descriptors: Foreign Countries, Grading, Interrater Reliability, Teaching Assistants
Barbara Jane Cunningham; Peter Rosenbaum; Anastasia Nepotiuk; Nancy Thomas-Stonell – Communication Disorders Quarterly, 2024
This brief report presents interrater reliability data for the Focus on the Outcomes of Communication Under Six (FOCUS-34) between parents, and between parents and speech-language pathologists (SLPs). Reliability for all three raters combined was good to excellent across three assessments. Reliability for pairs of raters was variable but generally…
Descriptors: Interrater Reliability, Outcome Measures, Preschool Children, Parents
Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024
This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated…
Descriptors: Foreign Countries, Interrater Reliability, Evaluators, Item Response Theory
Enninga, Annemieke; Waninge, Aly; Post, Wendy J.; van der Putten, Annette A. J. – Journal of Applied Research in Intellectual Disabilities, 2023
Background: Persons with profound intellectual and multiple disabilities (PIMD) are vulnerable when it comes to experiencing pain. Reliable assessment of pain-related behaviour in these persons is difficult. "Aim" To determine how pain items can be reliably scored in adults with PIMD. Methods: We developed an instruction protocol for the…
Descriptors: Test Reliability, Pain, Behavior, Adults
Gilstrap, Donald L.; Whitver, Sara Maurice; Scalfani, Vincent F.; Bray, Nathaniel J. – Innovative Higher Education, 2023
This article explores how well bibliometrics and altmetrics reflect research impact in relation to Boyer's Model of the Scholarship. Indices used for both types of metrics are explored and discussed while including an analysis on primary methodological works performed on each in the literature to date. As confirmatory in nature, we chose as our…
Descriptors: Bibliometrics, Models, Scholarship, Research
Ichikowitz, Kerri; Bruce, Carolyn; Meitanis, Vanessa; Cheung, Kelly; Kim, Yekyung; Talbourdet, Esther; Newton, Caroline – International Journal of Language & Communication Disorders, 2023
Background: People with aphasia (PWA) can experience functional numeracy difficulties, that is, problems understanding or using numbers in everyday life, which can have numerous negative impacts on their daily lives. There is growing interest in designing functional numeracy interventions for PWA; however, there are limited suitable assessments…
Descriptors: Test Construction, Test Validity, Numeracy, Adults
Lamprianou, Iasonas – Sociological Methods & Research, 2023
This study investigates inter- and intracoder reliability, proposing a new approach based on social network analysis (SNA) and exponential random graph models (ERGM). During a recent exit poll, the responses of voters to two open-ended questions were recorded. A coding experiment was conducted where a group of coders coded a sample of text…
Descriptors: Interrater Reliability, Coding, Social Networks, Network Analysis
Evans, Tanya; Mejía-Ramos, Juan Pablo; Inglis, Matthew – Educational Studies in Mathematics, 2022
Offering explanations is a central part of teaching mathematics, and understanding those explanations is a vital activity for learners. Given this, it is natural to ask what makes a good mathematical explanation. This question has received surprisingly little attention in the mathematics education literature, perhaps because the field has no…
Descriptors: Mathematics, Professional Personnel, Undergraduate Students, Mathematics Activities
Rice, C. E.; Carpenter, L. A.; Morrier, M. J.; Lord, C.; DiRienzo, M.; Boan, A.; Skowyra, C.; Fusco, A.; Baio, J.; Esler, A.; Zahorodny, W.; Hobson, N.; Mars, A.; Thurm, A.; Bishop, S.; Wiggins, L. D. – Journal of Autism and Developmental Disorders, 2022
This paper describes a process to define a comprehensive list of exemplars for seven core Diagnostic and Statistical Manual (DSM) diagnostic criteria for autism spectrum disorder (ASD), and report on interrater reliability in applying these exemplars to determine ASD case classification. Clinicians completed an iterative process to map specific…
Descriptors: Autism Spectrum Disorders, Clinical Diagnosis, Test Reliability, Interrater Reliability
Vonna L. Hemmler; Allison W. Kenney; Susan Dulong Langley; Carolyn M. Callahan; E. Jean Gubbins; Shannon Holder – Grantee Submission, 2022
Though qualitative research has become more prevalent in practice over the last 30 years, there is still considerable uncertainty among researchers regarding how to ensure inter-rater consistency when teams are tasked with coding qualitative data. In this article, we offer an explanation of a methodology our qualitative team used to achieve…
Descriptors: Interrater Reliability, Coding, Guides, Data Collection
Rosanna Cole – Sociological Methods & Research, 2024
The use of inter-rater reliability (IRR) methods may provide an opportunity to improve the transparency and consistency of qualitative case study data analysis in terms of the rigor of how codes and constructs have been developed from the raw data. Few articles on qualitative research methods in the literature conduct IRR assessments or neglect to…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Research Methodology
Matthew K. Burns; Heba Z. Abdelnaby; Jonie B. Welland; Katherine A. Graves; Kari Kurto – Assessment for Effective Intervention, 2024
The current study examined the reliability of The Reading League Curriculum-Evaluation Guidelines (CEGs), which were developed to help school-based teams rate the presence of red flags when considering adopting specific literacy curricula. Coders (n = 30) independently used the CEGs to evaluate a free online English language arts curriculum. The…
Descriptors: English Curriculum, English Instruction, Language Arts, Curriculum Evaluation
Cheung, Kason Ka Ching; Tai, Kevin W. H. – Research in Science & Technological Education, 2023
Background: Intercoder reliability is a statistic commonly reported by researchers to demonstrate the rigour of coding procedures during data analysis. Its importance is debatable in the analysis of qualitative interview data. It raises a question on whether researchers should identify the same codes and themes in a transcript or they should…
Descriptors: Interrater Reliability, Data Analysis, Interviews, Research Methodology

Peer reviewed
Direct link
