Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Brendan Bartanen; Andrew Kwok – Annenberg Institute for School Reform at Brown University, 2020
Using rich longitudinal data from one of the largest teacher education programs in Texas, we examine the measurement of pre-service teacher (PST) quality and its relationship with entry into the K-12 public school teacher workforce. Drawing on rubric-based observations of PSTs during clinical teaching, we find that little of the variation in…
Descriptors: Longitudinal Studies, Preservice Teachers, Teacher Education Programs, Kindergarten
Maxwell, Bruce; Boon, Helen; Tanchuk, Nicolas; Rauwerda, Bryan – Journal of Moral Education, 2021
This article documents the adaptation, piloting and validation of a measure of teachers' ethical sensitivity. To create the test, we modified a measure from dentistry drawing on literature in teacher professional ethics and drew on the expertise of professional ethics scholars and practitioners. Based on the results of Rasch analysis combined with…
Descriptors: Ethics, Moral Values, Scores, Teacher Education Programs
Wang, Yuqi; Ren, Wei – Language Learning Journal, 2022
L2 pragmatics research has explored the effects of different factors on different aspects of learners' pragmatic performance, but often not simultaneously. In addition, syntactic complexity is rarely examined in L2 pragmatics. This cross-sectional study aimed to conduct a multidimensional analysis to explore the effects of proficiency and study-abroad…
Descriptors: Pragmatics, Second Language Learning, Second Language Instruction, English (Second Language)
Li, Zijia; Gooden, Caroline; Toland, Michael D. – Journal of Early Intervention, 2019
This study provides preliminary evidence for reliability and validity of the Hawaii Early Learning Profile Strands 0-3 (HELP Strands 0-3), an assessment instrument for young children. First, the degree of interobserver agreement for a sample of representative HELP items was examined; results indicated that HELP scoring was dependable and…
Descriptors: Measures (Individuals), Psychometrics, Early Childhood Education, Test Reliability
Massar, Michelle M.; McIntosh, Kent; Mercer, Sterett H. – Remedial and Special Education, 2019
Assessing fidelity of implementation of school-based interventions is a critical factor in successful implementation and sustainability. The Tiered Fidelity Inventory (TFI) was developed as a comprehensive measure of all three tiers of School-Wide Positive Behavioral Interventions and Supports (SWPBIS) and is intended to measure the extent to…
Descriptors: Fidelity, Intervention, Program Implementation, Positive Behavior Supports
Banerjee, Rashida; Movahedazarhouligh, Sara; Millen, Kaitlyn; Luckner, John L. – Topics in Early Childhood Special Education, 2018
Valid and evidence-informed practices are critical to help young children with disabilities and their families with highly effective interventions and instruction to reach their potentials. Replication research is critical for appraising research and identifying evidence-based practices. The purpose of this study was to replicate the methods used…
Descriptors: Evidence, Early Childhood Education, Special Education, Replication (Evaluation)
Morris, Darrell; Pennell, Ashley M.; Perney, Jan; Trathen, Woodrow – Reading Psychology, 2018
This study compared reading rate to reading fluency (as measured by a rating scale). After listening to first graders read short passages, we assigned an overall fluency rating (low, average, or high) to each reading. We then used predictive discriminant analyses to determine which of five measures--accuracy, rate (objective); accuracy, phrasing,…
Descriptors: Reading Fluency, Prediction, Grade 1, Elementary School Students
van Rijn, Peter; Graf, Edith Aurora; Arieli-Attali, Meirav; Song, Yi – ETS Research Report Series, 2018
In this study, we explored the extent to which teachers agree on the ordering and separation of levels of two different learning progressions (LPs) in English language arts (ELA) and mathematics. In a panel meeting akin to a standard-setting procedure, we asked teachers to link the items and responses of summative educational assessments to LP…
Descriptors: Teacher Attitudes, Student Evaluation, Summative Evaluation, Language Arts
Musselwhite, Dorothy J.; Wesolowski, Brian C. – Journal of Research in Music Education, 2018
The purpose of this study was to evaluate the psychometric quality (i.e., validity and reliability) of a rating scale to assess pre-service teachers' lesson plan development in the context of secondary-level music performance classrooms. The research questions that guided this study include: (1) What items demonstrate acceptable model fit for the…
Descriptors: Psychometrics, Likert Scales, Preservice Teachers, Lesson Plans
Åhsberg, Elizabeth; Fahlström, Gunilla; Rönnbäck, Eva; Granberg, Ann-Kristin; Almborg, Ann-Helene – Research on Social Work Practice, 2017
Objective: To construct a needs assessment instrument for older people using a standardized terminology (International classification of functioning, disability, and health [ICF]) and assess its psychometrical properties. Method: An instrument was developed comprising questions to older people regarding their perceived care needs. The instrument's…
Descriptors: Caseworkers, Social Work, Older Adults, Needs Assessment
Edmunds, Sarah R.; Rozga, Agata; Li, Yin; Karp, Elizabeth A.; Ibanez, Lisa V.; Rehg, James M.; Stone, Wendy L. – Journal of Autism and Developmental Disorders, 2017
Children with autism spectrum disorder (ASD) show reduced gaze to social partners. Eye contact during live interactions is often measured using stationary cameras that capture various views of the child, but determining a child's precise gaze target within another's face is nearly impossible. This study compared eye gaze coding derived from…
Descriptors: Young Children, Autism, Pervasive Developmental Disorders, Eye Movements
Yun, Jiyeo – ProQuest LLC, 2017
Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…
Descriptors: Interrater Reliability, Essays, Scoring, Evaluators
McGough, David J. – AERA Online Paper Repository, 2017
This paper describes the implementation of an inter-rater reliability measure for assessing portfolio scores in a teacher education program. The reliability coefficient for the portfolio scores from completers of a newly revised program was compared with the reliability coefficient of the scores from a second set of reviewers who discussed the…
Descriptors: Interrater Reliability, Teacher Education Programs, Program Evaluation, Portfolio Assessment
Smolinsky, Lawrence; Marx, Brian D.; Olafsson, Gestur; Ma, Yanxia A. – Journal of Educational Computing Research, 2020
Computer-based testing is an expanding use of technology offering advantages to teachers and students. We studied Calculus II classes for science, technology, engineering, and mathematics majors using different testing modes. Three sections with 324 students employed: paper-and-pencil testing, computer-based testing, and both. Computer tests gave…
Descriptors: Test Format, Computer Assisted Testing, Paper (Material), Calculus
Chan, Stephanie W. Y.; Cheung, Wai Ming; Huang, Yanli; Lam, Wai-Ip; Lin, Chin-Hsi – Language Testing, 2020
Demand for second-language (L2) Chinese education for kindergarteners has grown rapidly, but little is known about these kindergarteners' L2 skills, with existing studies focusing on school-age populations and alphabetic languages. Accordingly, we developed a six-subtest Chinese character acquisition assessment to measure L2 kindergarteners'…
Descriptors: Chinese, Second Language Learning, Second Language Instruction, Written Language