Álvarez-Díaz, Marcos; Muñiz-Bascón, Luis Magín; Soria-Alemany, Antonio; Veintimilla-Bonet, Alberto; Fernández-Alonso, Rubén – International Journal of Music Education, 2021
Evaluation of music performance in competitive contexts often produces discrepancies between the expert judges. These discrepancies can be reduced by using appropriate rubrics that minimise the differences between judges. The objective of this study was the design and validation of an analytical evaluation rubric, which would allow the most…
Descriptors: Competition, Music Activities, Performance, Scoring Rubrics
Han, Chao; Zhao, Xiao – Assessment & Evaluation in Higher Education, 2021
The accuracy of peer ratings on students' performance has attracted much attention from higher education researchers. In this study, we attempted to explore the accuracy of peer ratings on the quality of spoken-language interpreting in the context of tertiary-level interpreter training. We sought to understand how different types of peer raters…
Descriptors: Accuracy, Peer Evaluation, Oral Language, Interpretive Skills
Lian Li; Jiehui Hu; Yu Dai; Ping Zhou; Wanhong Zhang – Reading & Writing Quarterly, 2024
This paper proposes using depth perception to represent raters' decisions in the holistic evaluation of ESL essays, as an alternative to the conventional form of numerical scores. The researchers verified the new method's accuracy and inter/intra-rater reliability by inviting 24 ESL teachers to perform different representations when rating 60…
Descriptors: Essays, Holistic Approach, Writing Evaluation, Accuracy
Anne Hyslop – Urban Institute, 2025
Governors and other state leaders know that the future economic competitiveness of their state depends on the strength of their education system. For years, college and career readiness (CCR) has been the mantra of many education leaders. Using education and workforce data, states went beyond "measuring" whether their high school…
Descriptors: College Readiness, Career Readiness, High School Students, High Schools
Emrah Higde; Ahmet Volkan Yüzüak; Zekiye Merve Öcal; Hilal Aktamis – Journal of Baltic Science Education, 2024
The Many-Facet Rasch model is frequently used to analyse and minimize disparities in rater (judge) severity in performance evaluations, in which raters assign scores to test-takers' performances. The aim of the present study was to analyse science teacher candidates' laboratory activities by using the Many-Facet Rasch model.…
Descriptors: Science Laboratories, Learning Activities, Science Process Skills, Student Attitudes
Kastelic, Kaja; Šarabon, Nejc – Measurement in Physical Education and Exercise Science, 2019
Self-reports are commonly used tools for measuring sedentary behaviors. The aim of our study was to assess agreement between two self-reports of sedentary time and a gold-standard objective monitor of sedentary time. A worksite sample (n = 42) completed the Slovenian version of the Global Physical Activity Questionnaire (GPAQ), the Slovenian version…
Descriptors: Physical Activity Level, Measurement Techniques, Measurement Equipment, Foreign Countries
Qiu, Jia; Barton, Erin E.; Choi, Gounah – Journal of Special Education, 2019
The purpose of this study was to examine the efficacy of the system of least prompts (SLP) for increasing the levels of play behaviors in four young children with disabilities. A multiple-probe-across-participants single-case research design was used to examine the relation between SLP and child-targeted behaviors. The results demonstrated that…
Descriptors: Prompting, Play, Young Children, Disabilities
Gitomer, Drew H.; Martínez, José Felipe; Battey, Dan; Hyland, Nora E. – American Educational Research Journal, 2021
The Educative Teacher Performance Assessment (edTPA) is a system of standardized portfolio assessments of teaching performance mandated for use by educator preparation programs in 18 states, and approved in 21 others, as part of initial certification for preservice teachers. Because of the high stakes involved for examinees, it is critical that…
Descriptors: Evaluation, Performance Based Assessment, Test Reliability, Test Validity
Lemay, Lise; Cantin, Gilles; Lemire, Julie; Bouchard, Caroline – Early Years: An International Journal of Research and Development, 2021
The objective of this paper is to present the conception and validation of the Quality of Educators' Observation and Planning Practices Scale (QEOPPS). The instrument assesses the quality of early childhood educators' observation practices (observing children, collecting information, using the collected information, paying attention to each child)…
Descriptors: Test Construction, Test Validity, Observation, Educational Planning
Lynsey Joohyun Lee – ProQuest LLC, 2021
Reliability and validity are two important topics that have been studied for many decades in the educational measurement field, including discussions of Writing Studies' subfield of writing assessment, since the establishment of the College Entrance Exam Board [CEEB] in 1899 (Huot et al., 2010). In recent years, scholarly conversations of fairness…
Descriptors: Writing Evaluation, Test Validity, Test Reliability, Case Studies
Begrich, Lukas; Fauth, Benjamin; Kunter, Mareike – Social Psychology of Education: An International Journal, 2020
In recent decades, the assessment of instructional quality has grown into a popular and well-funded arm of educational research. The present study contributes to this field by exploring first impressions of untrained raters as an innovative approach of assessment. We apply the thin slice procedure to obtain ratings of instructional quality along…
Descriptors: Student Attitudes, Expertise, Classroom Techniques, Educational Assessment
Gardiner, Lorraine R.; Kim, Dong-Gook; Helms, Marilyn M. – Journal of Education for Business, 2020
Assurance of learning assessment results are used as a basis for curriculum changes at Association to Advance Collegiate Schools of Business-accredited institutions. Because budget constraints and opportunity costs impact all schools, assessment quality is critical. This research uses a longitudinal, business school case study to highlight bias…
Descriptors: College Outcomes Assessment, Business Schools, Bias, Interrater Reliability
Mary M. Stone; Sudi Kash; Teresa Butler; Karolina Callahan; Miguel A. Verdugo; Laura E. Gómez – Journal of Developmental and Physical Disabilities, 2020
Quality of life (QoL) is a key outcome used to monitor service planning and delivery for individuals with Intellectual and Developmental Disabilities (IDD). Unfortunately, many current instruments used to measure QoL have psychometric and content limitations and none are suitable for use with individuals with the lowest levels of functioning and…
Descriptors: Quality of Life, Autism Spectrum Disorders, Residential Care, Measures (Individuals)
Jingwan Tang – ProQuest LLC, 2023
This study aims to explore the validity of measuring joint attention through gaze coordination in computer-supported collaborative learning (CSCL) research. Gaze coordination, aligning visual attention in social contexts, aids comprehension and communication. Many CSCL researchers use gaze coordination to gauge joint attention quantitatively.…
Descriptors: Mathematics Instruction, Problem Solving, Cooperative Learning, Eye Movements
Rosario A. Marroquín-Flores; Rose Marie Tijerina; Mason Tedeschi; Sofia Banjara; Redmon Warmsley; Luke McFather; Zianna Casas; Lisa B. Limeri – CBE - Life Sciences Education, 2024
Students who hold minoritized identities are underrepresented in science, technology, engineering, and math (STEM) fields. Educational institutions often apply a deficit lens to understanding disproportionate outcomes between minoritized students and those from the cultural majority. Community Cultural Wealth (CCW) is an asset-based framework that…
Descriptors: Undergraduate Students, Minority Group Students, Low Income Students, STEM Education

