Rojahn, Johannes; Tasse, Marc J.; Sturmey, Peter – American Journal on Mental Retardation, 1997
Development of the Stereotyped Behavior Scale for adolescents and adults with mental retardation is described. Use with 600 individuals resulted in refinement and a 26-item scale with an internal consistency alpha of 0.88, test-retest reliability of p=0.90, and interrater reliability of p=0.76. (DB)
Descriptors: Adolescents, Adults, Behavior Patterns, Behavior Rating Scales
Peer reviewed: Falchikov, Nancy; Magin, Douglas – Assessment & Evaluation in Higher Education, 1997
Discusses concerns and research about college student peer evaluation, particularly with regard to gender bias. Reports a study using blind marking, and examines results in relation to task and other contextual variables. Concludes that the blind marking technique contributes to the reliability of peer assessment, and outlines additional…
Descriptors: College Instruction, College Students, Group Dynamics, Higher Education
Peer reviewed: Dreman, Solly; Ronen-Eliav, Hagar – Journal of Marriage and the Family, 1997
Investigates divorced mothers' (N=119) perceptions of family cohesion and adaptability in relation to children's self-reports of behavioral problems. Results indicate that most mothers believed their children had the fewest behavior problems when family cohesion and adaptability were perceived as high and the most behavioral problems when these…
Descriptors: Adjustment (to Environment), Adolescents, Behavior Problems, Children
Peer reviewed: Ke, Chuanren – Modern Language Journal, 1996
Investigated the relationship between Chinese character recognition and production by second-language learners. Subjects were 47 first-year Chinese language students in the United States. (15 references) (Author/CK)
Descriptors: Chinese, College Students, Data Collection, Ideography
Peer reviewed: Perkins, William H. – Journal of Speech and Hearing Disorders, 1990
A response is presented to commentaries (EC 232 375-377) on two papers (EC 232 373 and EC 232 374), focusing on research methodology on stuttering, the impact of improving intrajudge and interjudge agreement, the importance of studying stuttering as a private experience rather than an acoustical event, and speakers' experience of loss of control…
Descriptors: Auditory Perception, Clinical Diagnosis, Definitions, Evaluation
Peer reviewed: Upshur, John A.; Turner, Carolyn E. – ELT Journal, 1995
Reviews the place of rating scales in second-language measurement and summarizes some of the problems associated with them. Standard and alternative scales were studied. High agreement among raters can be achieved even under conditions not favorable to high interrater reliability. The full range of score categories is effectively utilized. (17…
Descriptors: Evaluation Problems, Interrater Reliability, Language Tests, Measurement Techniques
Peer reviewed: Smitherman, Geneva – Language and Education, 1992
Nearly 1,800 essays written by 17-year-old African-American students were examined in terms of the frequency and distribution of Black English Vernacular (BEV) and the covariance of BEV with rater scores. Results suggest that BEV has converged with Edited American/Standard English and that students were not penalized for BEV in…
Descriptors: Black Dialects, Black Students, Essays, Interrater Reliability
Peer reviewed: Engelhard, George, Jr. – Journal of Educational Measurement, 1994
Rater errors (rater severity, halo effect, central tendency, and restriction of range) are described, and criteria are presented for evaluating rating quality based on a many-faceted Rasch (FACETS) model. Ratings of 264 compositions from the Eighth Grade Writing Test in Georgia by 15 raters illustrate the discussion. (SLD)
Descriptors: Criteria, Educational Assessment, Elementary Education, Elementary School Students
Peer reviewed: Wenrich, Marjorie D.; And Others – Academic Medicine, 1993
In a survey, 1,851 registered nurses evaluated 232 internists' humanistic qualities, communication skills, and selected aspects of their clinical skills. Their ratings corresponded moderately with peer physician evaluations and had a common structure but were lower for several humanistic qualities. A reliable assessment required 11-15 nurses'…
Descriptors: Communication Skills, Higher Education, Hospitals, Internal Medicine
Peer reviewed: Geisinger, Kurt F. – Educational Measurement: Issues and Practice, 1991
Ways to use standard-setting data to adjust cutoff scores on examinations are reviewed. Ten sources of information to be used in determining standards are listed. The decision to modify passing scores should be based on these types of information and consideration of adverse impact or rating process irregularities. (SLD)
Descriptors: Cutting Scores, Evaluation Utilization, Evaluators, Interrater Reliability
Peer reviewed: Ruiz-Funes, Marcela – Foreign Language Annals, 2001
Explored how third-year-level university students represented an assigned reading-to-write task as indicated by the type of papers they produced, and the relationship between the linguistic quality of those papers and the type of task representation. Findings suggest that the ability to interpret a reading-to-write task appropriately is dependent…
Descriptors: College Students, Grammar, Higher Education, Interrater Reliability
Robbins, Robyn; Merrell, Kenneth W. – Diagnostique, 1998
The social behavior of 122 students in grades 6-8 who participated in a prevention program for behaviorally at-risk youth was rated by a parent, a general education teacher, and an at-risk program teacher. Parents and special program teachers were more likely than general education teachers to provide positive ratings of behaviors. (Author/CR)
Descriptors: Antisocial Behavior, Behavior Disorders, Behavior Rating Scales, High Risk Students
Yarbrough, Jamie L.; Skinner, Christopher H.; Lee, Young Ju; Lemmons, Cathy – Journal of Applied School Psychology, 2004
Campbell and Skinner used an A-B design to evaluate the effects of the Timely Transitions Game (TTG) on room-to-room transitions in a sixth-grade classroom. The TTG incorporated explicit timing, publicly posted feedback, and an interdependent group contingency with randomly selected transitions and criteria. The purpose of the current study was to…
Descriptors: Grade 2, Research Design, Integrity, Interrater Reliability
Beuttler, Marybeth Grant; Leininger, Peter M.; Palisano, Robert J. – Physical & Occupational Therapy in Pediatrics, 2004
Purpose: The purpose of this study was to examine the test-retest and inter-rater reliability of a measure of muscle extensibility developed by Tardieu, de la Tour, Bret, and Tardieu (1982) in fullterm and preterm newborns. Method: Twenty-one fullterm infants and twenty preterm infants were examined by two physical therapists. Each physical…
Descriptors: Premature Infants, Neonates, Human Body, Motor Development
Coniam, David – Language Assessment Quarterly, 2005
This article describes a study that emerged as a result of the severe acute respiratory syndrome crisis that struck Hong Kong in 2003. One outcome of the severe acute respiratory syndrome crisis was that all personnel in all educational institutions in Hong Kong were compelled to wear face masks for the period April-August 2003. Consequently, the…
Descriptors: Schools, Examiners, Foreign Countries, Grade 11

