Publication Date
| In 2026 | 0 |
| Since 2025 | 58 |
| Since 2022 (last 5 years) | 284 |
| Since 2017 (last 10 years) | 780 |
| Since 2007 (last 20 years) | 2042 |
Descriptor
| Interrater Reliability | 3124 |
| Foreign Countries | 655 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 25 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
He, Tung-hsien – SAGE Open, 2019
This study employed a mixed-design approach and the Many-Facet Rasch Measurement (MFRM) framework to investigate whether rater bias occurred between the onscreen scoring (OSS) mode and the paper-based scoring (PBS) mode. Nine human raters analytically marked scanned scripts and paper scripts using a six-category (i.e., six-criterion) rating…
Descriptors: Computer Assisted Testing, Scoring, Item Response Theory, Essays
Jeong, Heejeong – Language Testing in Asia, 2019
In writing assessment, finding a valid, reliable, and efficient scale is critical. Appropriate scales increase rater reliability and can also save time and money. This exploratory study compared the effects of a binary scale and an analytic scale across teacher raters and expert raters. The purpose of the study is to find out how different scale…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction
Cooke, Nancy L.; Slee, Jill M.; Young, Cheryl A. – Reading Improvement, 2020
There is some evidence that reading and spelling are complementary processes. The purpose of this study was to investigate the extent to which contextualized spelling (i.e., spelling activities within the context of reading instruction) is used to support reading in first-grade core reading programs. Analysis of 75 lessons across five programs…
Descriptors: Spelling, Reading Instruction, Teaching Methods, Reading Programs
Su, King-Dow – Journal of Baltic Science Education, 2020
To be familiar with micro and symbolic performances, students could work out more effective approaches of innovated techniques known as five hierarchical designs in chemistry equilibrium. However, the most frequently reported problem in students' assessment of chemistry study is attributed to their poor skill in recognizing basic concepts. The aim of…
Descriptors: Science Instruction, Scientific Concepts, Concept Formation, Chemistry
Ahmadi, Alireza – Taiwan Journal of TESOL, 2020
Rater subjectivity has long been an intriguing topic. The use of discussion as a resolution method is a practical way to reduce this subjectivity. However, the efficacy of discussion depends on whether different raters get equally engaged in it or one rater tends to dominate others. This study investigated whether and how rater dominance occurs in…
Descriptors: Evaluators, Interrater Reliability, Discussion, Discourse Analysis
Floman, James L.; Hagelskamp, Carolin; Brackett, Marc A.; Rivers, Susan E. – Journal of Psychoeducational Assessment, 2017
Classroom observations increasingly inform high-stakes decisions and research in education, including the allocation of school funding and the evaluation of school-based interventions. However, trends in rater scoring tendencies over time may undermine the reliability of classroom observations. Accordingly, the present investigations, grounded in…
Descriptors: Observation, Bias, Psychological Patterns, Grade 5
Caspersen, Janna R.; Van Holt, Tracy; Johnson, Jeffrey C. – Field Methods, 2017
This article offers a way to measure agreement in participatory mapping. We asked subject matter experts (SMEs) to draw where Sudanese ethnic groups were located on a map. We then used an eigenanalysis approach to determine whether SMEs agreed on the location of ethnic groups. We used minimum residual factor analysis to assess the extent of…
Descriptors: Measurement Techniques, Expertise, Maps, Ethnic Groups
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
Wainer, Allison L.; Berger, Natalie I.; Ingersoll, Brooke R. – Journal of Autism and Developmental Disorders, 2017
Despite the expansion of early intervention approaches for young children with ASD, investigators have struggled to identify measures capable of assessing social communication change in response to these interventions. Addressing recent calls for efficient, sensitive, and reliable social communication measures, the current paper outlines the…
Descriptors: Psychometrics, Pervasive Developmental Disorders, Autism, Communication Disorders
Curtis, Mary D.; Green, Ambra L. – Social Studies, 2021
Progressing through schools may be challenging for some students, especially those with learning disabilities (LD). In social studies, for example, students grapple with increasingly complex texts, independent work, direct instruction, critical thinking, analysis, and other learning demands. As students transition from elementary schools where…
Descriptors: Social Studies, Teaching Methods, Evidence Based Practice, Students with Disabilities
Kovalkov, Anastasia; Paassen, Benjamin; Segal, Avi; Gal, Kobi; Pinkwart, Niels – International Educational Data Mining Society, 2021
Promoting creativity is considered an important goal of education, but creativity is notoriously hard to define and measure. In this paper, we make the journey from defining a formal measure of creativity to applying the measure in a practical domain. The measure relies on core theoretical concepts in creativity theory, namely fluency, flexibility, and…
Descriptors: Creativity, Theory Practice Relationship, Evaluators, Specialists
Takeda, Kazuya; Tanabe, Shigeo; Koyama, Soichiro; Nagai, Tomoko; Sakurai, Hiroaki; Kanada, Yoshikiyo; Shomoto, Koji – Measurement in Physical Education and Exercise Science, 2018
The aim of this study was to clarify the intra- and inter-rater reliability of the rate of force development in hip abductor muscle force measurements using a hand-held dynamometer. Thirty healthy adults were separately assessed by two independent raters on two separate days. Rate of force development was calculated from the slope of the…
Descriptors: Interrater Reliability, Human Body, Measurement Equipment, Handheld Devices
Mandy, William; Clarke, Kiri; McKenner, Michele; Strydom, Andre; Crabtree, Jason; Lai, Meng-Chuan; Allison, Carrie; Baron-Cohen, Simon; Skuse, David – Journal of Autism and Developmental Disorders, 2018
We developed a brief, informant-report interview for assessing autism spectrum conditions (ASC) in adults, called the Developmental, Dimensional and Diagnostic Interview-Adult Version (3Di-Adult); and completed a preliminary evaluation. Informant reports were collected for participants with ASC (n = 39), a non-clinical comparison group (n = 29)…
Descriptors: Autism, Pervasive Developmental Disorders, Adults, Diagnostic Tests
Duijm, Klaartje; Schoonen, Rob; Hulstijn, Jan H. – Language Testing, 2018
It is general practice to use rater judgments in speaking proficiency testing. However, it has been shown that raters' knowledge and experience may influence their ratings, both in terms of leniency and varied focus on different aspects of speech. The purpose of this study is to identify raters' relative responsiveness to fluency and linguistic…
Descriptors: Language Fluency, Accuracy, Second Languages, Language Tests
Mason, Kazlin N.; Pua, Eshan; Perry, Jamie L. – International Journal of Language & Communication Disorders, 2018
Background: Posterior nasal fricatives are a learned compensatory articulation error and commonly substituted for oral fricatives. Treatment of such articulation errors requires the modification or teaching of skilled movements. A motor-based approach is designed to teach the complex motor skill movement sequences required in the production of…
Descriptors: Speech Language Pathology, Intervention, Psychomotor Skills, Articulation Impairments

