Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Peltier, Corey; Flores, Margaret M.; Strickland, Tricia K. – Learning Disability Quarterly, 2023
Single-case research design is a useful methodology for evaluating the presence of a functional relation between an intervention and the mathematical performance of students with a learning disability. However, a functional relation cannot be established with threats to internal validity of the design. External validity is impacted if researchers…
Descriptors: Research Design, Intervention, Mathematics Achievement, Students with Disabilities
Quinn McAvoy – ProQuest LLC, 2023
Assessment and evaluation are at the heart of all functioning systems, as progress can be determined with the information collected by these assessments. The problem addressed in this study is the lack of actionable data to evaluate whether assessments aligned with Montessori benchmarks support the achievement of student outcomes. To address the…
Descriptors: Mathematics Tests, Scores, Montessori Method, Benchmarking
Wesley A. Sims; Rondy Yu; Kathleen R. King; Danielle Zahn; Nina Mandracchia; Elissa Monteiro; Melissa Klaib – Assessment for Effective Intervention, 2023
Classroom management (CM) practices have a well-established, intuitive, and empirical connection with student academic, social, emotional, and behavioral outcomes. CM, defined as educator practices used to create supportive classroom environments, may be the implementation factor that is most impactful of the universal Tier I supports. Recognizing…
Descriptors: Classroom Techniques, Secondary School Teachers, Inservice Teacher Education, Behavior Rating Scales
Rashid, Sehar; Mahmood, Nasir – Bulletin of Education and Research, 2020
The study aimed to validate the factors that affect the inter-rater reliability of secondary school certificate (SSC) papers in high-stake testing. For this purpose, papers of Urdu and English of Board of Intermediate and Secondary Education (BISE) were selected. A survey method was used to collect marking on the same set of papers from each rater…
Descriptors: High Stakes Tests, Interrater Reliability, Secondary School Students, Foreign Countries
Baimpos, Theodoros; Dittel, Nils; Borissov, Roumen – Research Evaluation, 2020
In this study, we analyze the two-phase bottom-up procedure applied by the Future and Emerging Technologies Program (FET-Open) at the Research Executive Agency (REA) of the European Commission (EC), for the evaluation of highly interdisciplinary, multi-beneficiary research proposals which request funding. In the first phase, remote experts assess…
Descriptors: Peer Evaluation, Research Proposals, Interdisciplinary Approach, Financial Support
Using Differential Item Functioning to Test for Interrater Reliability in Constructed Response Items
Walker, Cindy M.; Göçer Sahin, Sakine – Educational and Psychological Measurement, 2020
The purpose of this study was to investigate a new way of evaluating interrater reliability that can allow one to determine if two raters differ with respect to their rating on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and compared…
Descriptors: Test Bias, Interrater Reliability, Responses, Correlation
Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020
Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…
Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries
Gunjawate, Dhanshree R.; Ravi, Rohit; Bhagavan, Srividya – Journal of Speech, Language, and Hearing Research, 2020
Purpose: The purpose of this study was to evaluate the reliability and validity of the Kannada version of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V). Method: The Kannada version of CAPE-V comprises six phrases that are phonetically designed as per the CAPE-V requirements. Sixty-five (21 individuals with dysphonia and 44…
Descriptors: Test Reliability, Test Validity, Dravidian Languages, Voice Disorders
Pin, Tamis W.; So, Vincent K. K.; Siu, Cynthia S. H.; Yip, Sheila S. N.; Cheung, Stella See-wing; Kan, Jenny Yim-mui – Journal of Autism and Developmental Disorders, 2021
This study examined the reliability and validity of the new Social Motor Function Classification System for Children with Autism Spectrum Disorders (SMFCS-ASD). The SMFCS-ASD reliability was examined on 25 children (62.4 months, SD 7.8) with ASD among six physical therapists. The validity study involved 1001 children (57.0 months, SD 9.9) with ASD using the…
Descriptors: Autism, Pervasive Developmental Disorders, Children, Classification
Wind, Stefanie A.; Jones, Eli; Bergin, Christi – School Effectiveness and School Improvement, 2021
Classroom observation is a common approach to teacher evaluation. Yet, concerns about differences in rater judgment are widespread. Despite this concern, few researchers have examined the practical impact of such differences in rater judgments on teachers' judged effectiveness. This study fills that gap. Using data from a large-scale teacher…
Descriptors: Principals, Teacher Evaluation, Interrater Reliability, Elementary School Teachers
Weston, Timothy J.; Hayward, Charles N.; Laursen, Sandra L. – American Journal of Evaluation, 2021
Observations are widely used in research and evaluation to characterize teaching and learning activities. Because conducting observations is typically resource intensive, it is important that inferences from observation data are made confidently. While attention focuses on interrater reliability, the reliability of a single-class measure over the…
Descriptors: Generalizability Theory, Observation, Inferences, Social Science Research
Konstantin Vinokic; Lukas Begrich; Mareike Kunter; Susanne Kuger – Frontline Learning Research, 2024
Thin-slice ratings (i.e., ratings based on first impressions) have yielded intriguingly accurate results in various domains. Among others, researchers have applied the thin-slices technique to assess instructional quality, showing that teacher-student interactions can be reliably inferred from very short snippets of classroom instruction. The…
Descriptors: Teacher Effectiveness, Teacher Student Relationship, Foreign Countries, Classroom Observation Techniques
Primary School Students' Ratings of Teaching -- Do They Differentiate between Subjects and Teachers?
Svenja Rieser; Alexander Naumann – School Effectiveness and School Improvement, 2024
Our study aims to provide empirical evidence for and against the valid use of primary school students' ratings of three generic dimensions of teaching quality (classroom management, supportive climate, cognitive activation). We examine whether students discriminate between corresponding dimensions in different subjects, taking into account whether…
Descriptors: Foreign Countries, Elementary School Students, Elementary School Teachers, Student Evaluation of Teacher Performance
Walker, Grant M.; Basilakos, Alexandra; Fridriksson, Julius; Hickok, Gregory – Journal of Speech, Language, and Hearing Research, 2022
Purpose: Meaningful changes in picture naming responses may be obscured when measuring accuracy instead of quality. A statistic that incorporates information about the severity and nature of impairments may be more sensitive to the effects of treatment. Method: We analyzed data from repeated administrations of a naming test to 72 participants with…
Descriptors: Naming, Change, Aphasia, Severity (of Disability)
Tschida, Jessica E.; Yerys, Benjamin E. – Autism: The International Journal of Research and Practice, 2022
Executive function challenges are commonly reported in the home setting for children with an autism spectrum disorder diagnosis (hereafter, autism), but little is known about these challenges in the school setting. A total of 337 youth (autism, N = 241 and typically developing, N = 96) were assessed using Behavior Rating Inventory of Executive…
Descriptors: Executive Function, Students with Disabilities, Age Differences, Behavior Problems