Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Bieda, Kristen N.; Salloum, Serena J.; Hu, Sihua; Sweeny, Shannon; Lane, John; Torphy, Kaitlin – Journal of Classroom Interaction, 2020
This paper discusses the challenges and lessons learned from conducting observations to measure the quality of classroom practice for a large-scale study of elementary teachers' mathematics instruction. Specifically, this paper shares our process for obtaining valid data for quality of elementary mathematics instruction; what we learned can inform…
Descriptors: Mathematics Instruction, Classroom Observation Techniques, Elementary School Teachers, Interrater Reliability
Steele, Catriona M.; Peladeau-Pigeon, Melanie; Nagy, Ahmed; Waito, Ashley A. – Journal of Speech, Language, and Hearing Research, 2020
Purpose: The field lacks consensus about preferred metrics for capturing pharyngeal residue on videofluoroscopy. We explored four different methods, namely, the visuoperceptual Eisenhuber scale and three pixel-based methods: (a) residue area divided by vallecular or pyriform sinus spatial housing ("%-Full"), (b) the Normalized Residue…
Descriptors: Human Body, Physiology, Speech Language Pathology, Measurement Techniques
Chekurov, Sergei; Wang, Meng; Salmi, Mika; Partanen, Jouni – Education Sciences, 2020
The purpose of this article is to present a design for additive manufacturing assignment focused on creativity rather than functionality and to analyze its results (N = 70) acquired during five years. The assignment teaches the unique advantages of additive manufacturing to engineering students and encourages learning from failure to achieve…
Descriptors: Undergraduate Students, Engineering Education, Manufacturing, Computer Peripherals
Erguvan, Inan Deniz; Aksu Dunya, Beyza – Language Testing in Asia, 2020
This study examined the rater severity of instructors using a multi-trait rubric in a freshman composition course offered in a private university in Kuwait. Use of standardized multi-trait rubrics is a recent development in this course and student feedback and anchor papers provided by instructors for each essay exam necessitated the assessment of…
Descriptors: Foreign Countries, College Freshmen, Freshman Composition, Writing Evaluation
Simonsen, Brandi; Freeman, Jennifer; Kooken, Janice; Dooley, Kathryn; Gambino, Anthony J.; Wilkinson, Sarah; VanLone, Janet; Walters, Sharon; Byun, Sang Gyu; Xu, Xin; Lupo, Kelly; Kern, Laura – School Psychology, 2020
Effective classroom management is critical for student and teacher success. Because teachers receive limited preservice preparation and in-service support in classroom management, educational leaders (e.g., school psychologists, behavior coaches, mentor teachers, and administrators) need efficient and effective tools to identify teachers'…
Descriptors: Test Validity, Classroom Techniques, Teacher Evaluation, Rating Scales
Evaluating an Explicit Instruction Teacher Observation Protocol through a Validity Argument Approach
Johnson, Evelyn S.; Zheng, Yuzhu; Crawford, Angela R.; Moylan, Laura A. – Grantee Submission, 2020
In this study, we examined the scoring and generalizability assumptions of an Explicit Instruction (EI) special education teacher observation protocol using many-faceted Rasch measurement (MFRM). Video observations of classroom instruction from 48 special education teachers across four states were collected. External raters (n = 20) were trained…
Descriptors: Direct Instruction, Teacher Evaluation, Classroom Observation Techniques, Validity
Bergmann, Thomas; Heinrich, Manuel; Ziegler, Matthias; Dziobek, Isabel; Diefenbacher, Albert; Sappok, Tanja – Journal of Autism and Developmental Disorders, 2019
Initial studies have presented the "Music-based Scale for Autism Diagnostics" (MUSAD) as a promising DSM-5-based observational tool to identify autism spectrum disorder (ASD) in adults with intellectual disability (ID). The current study is the first to address its clinical utility in a new sample of 124 adults with ID (60.5% diagnosed…
Descriptors: Autism, Pervasive Developmental Disorders, Adults, Intellectual Disability
Palermo, Corey; Bunch, Michael B.; Ridge, Kirk – Journal of Educational Measurement, 2019
Although much attention has been given to rater effects in rater-mediated assessment contexts, little research has examined the overall stability of leniency and severity effects over time. This study examined longitudinal scoring data collected during three consecutive administrations of a large-scale, multi-state summative assessment program.…
Descriptors: Scoring, Interrater Reliability, Measurement, Summative Evaluation
Sondergeld, Toni A.; Johnson, Carla C. – School Science and Mathematics, 2019
In response to the call for more rigorously validated educational assessments, this study used an iterative multimethod validation process to develop and validate outcomes from the 21st Century Skills Assessment global rating scale. Qualitative and quantitative data sources were used to inform four types of validity evidence: content, response…
Descriptors: 21st Century Skills, Test Construction, Test Validity, Educational Assessment
Cascio, M. Ariel; Lee, Eunlye; Vaudrin, Nicole; Freedman, Darcy A. – Field Methods, 2019
In this article, we discuss methodological opportunities related to using a team-based approach for iterative-inductive analysis of qualitative data involving detailed open coding of semistructured interviews and focus groups. Iterative-inductive methods generate rich thematic analyses useful in sociology, anthropology, public health, and many…
Descriptors: Coding, Teamwork, Interrater Reliability, Data Analysis
Isbell, Daniel R.; Son, Young-A – Studies in Second Language Acquisition, 2022
Elicited Imitation Tests (EITs) are commonly used in second language acquisition (SLA)/bilingualism research contexts to assess the general oral proficiency of study participants. While previous studies have provided valuable EIT construct-related validity evidence, some key gaps remain. This study uses an integrative data analysis to further…
Descriptors: Bilingualism, Imitation, Language Tests, Second Language Learning
Rossman, Teri L. – ProQuest LLC, 2022
This study utilized a quantitative research approach with a survey design to determine whether a sample of K-8 principals in the state of Illinois, that rated teachers in a set of classroom instructional videos, exhibited inter-rater reliability since they all received the Illinois state-approved training and certification for teacher evaluators.…
Descriptors: Interrater Reliability, Kindergarten, Principals, Administrator Surveys
McKenna, Meaghan; Dedrick, Robert F.; Goldstein, Howard – Assessment for Effective Intervention, 2022
This article describes the development of the Early Elementary Writing Rubric (EEWR), an analytic assessment designed to measure kindergarten and first-grade writing and inform educators' instruction. Crocker and Algina's (1986) approach to instrument development and validation was used as a guide to create and refine the writing measure. Study 1…
Descriptors: Scoring Rubrics, Beginning Writing, Writing Evaluation, Test Construction
Yesildag Hasancebi, Funda; Yuksel, Busra Tuncay; Mesci, Gunkut – International Journal of Assessment Tools in Education, 2022
The purpose of this study was to develop a reliable and valid rating scale for the use of the assessment and evaluation of lesson plans and teaching practices that are based on argumentation-based inquiry (ABI). The study covered two academic years (four academic semesters). Qualitative and quantitative methods were utilized throughout the…
Descriptors: Foreign Countries, Rating Scales, Test Construction, Test Validity
Marzieh Pashmdarfard; Afsoon Hassani Mehraban; Narges Shafaroodi; Kamran Soltani Arabshahi; Soroor Parvizy; Akram Azad; Samaneh Karamali Esmaeili – Journal of Occupational Therapy Education, 2022
Fieldwork education is an integral part of the educational process in occupational therapy and assessing student competency at the end of fieldwork is important. The aim of this study was to design and conduct an Objective Structured Clinical Examination (OSCE) based on the Occupational Therapy Practice Framework (OTPF) for occupational therapy…
Descriptors: Occupational Therapy, Allied Health Occupations Education, Test Construction, Test Validity

