Publication Date
| In 2026 | 0 |
| Since 2025 | 58 |
| Since 2022 (last 5 years) | 284 |
| Since 2017 (last 10 years) | 780 |
| Since 2007 (last 20 years) | 2042 |
Descriptor
| Interrater Reliability | 3124 |
| Foreign Countries | 655 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 25 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Peer reviewed: Hellawell, D. J.; Signorini, D. F. – International Journal of Rehabilitation Research, 1997
Describes pilot studies of the Edinburgh Extended Glasgow Outcome Scale (EEGOS), designed to retain the advantages of the GOS (a measure commonly used in head injury research) but to allow comparison of recovery patterns in behavioral, cognitive, and physical function. Studies show that the interrater reliability of the EEGOS is comparable to that…
Descriptors: Head Injuries, Interrater Reliability, Neurological Impairments, Outcomes of Treatment
Peer reviewed: Penny, Jim; Johnson, Robert L.; Gordon, Belita – Journal of Experimental Education, 2000
Used an analytic rubric to score 120 writing samples from Georgia's 11th grade writing assessment. Raters augmented scores by adding a "+" or "-" to the score. Results indicate that this method of augmentation tends to improve most indices of interrater reliability, although the percentage of exact and adjacent agreement…
Descriptors: High School Students, High Schools, Interrater Reliability, Scoring Rubrics
Peer reviewed: Canivez, Gary L.; Watkins, Marley W.; Schaefer, Barbara A. – Psychology in the Schools, 2002
Investigation of interrater agreement for the Adjustment Scales for Children and Adolescents (ASCA) discriminant classifications is reported. Two teaching professionals provided independent ratings of the same child using the ASCA. A total of 119 students ranging in age from 7 to 18 years were independently rated. Results indicated significant and…
Descriptors: Adolescents, Children, Elementary Secondary Education, Interrater Reliability
Peer reviewed: Kaufman, James C.; Gentile, Claudia A.; Baer, John – Gifted Child Quarterly, 2005
Little research has been conducted on how gifted novices compare to experts in their judgments of creative writing. If novices and experts assign similar ratings, it could be argued that gifted novices are able to offer their peers feedback of a similar quality to that provided by experts. Such a finding would support the use of collaborative…
Descriptors: Psychologists, Literary Genres, Interrater Reliability, Feedback
Raghavan, R.; Marshall, M.; Lockwood, A.; Duggan, L. – Journal of Intellectual Disability Research, 2004
People with learning disability (LD) experience a range of mental health problems. They are a complex population, whose needs are not well understood. This study focuses on the development of a systematic process of needs assessment for this population. The Cardinal Needs Schedule used in general psychiatry was adapted for people with learning…
Descriptors: Psychiatry, Needs Assessment, Mental Disorders, Interrater Reliability
van der Schaaf, Marieke; Stokking, Karel; Verloop, Nico – Studies in Educational Evaluation, 2005
Portfolios are frequently used to assess teachers' competences. In portfolio assessment, the issue of rater reliability is a notorious problem. To improve the quality of assessments, insight into raters' judgment processes is crucial. Using a mixed quantitative and qualitative approach, we studied cognitive processes underlying raters' judgments and…
Descriptors: Portfolios (Background Materials), Systems Approach, Cognitive Processes, Portfolio Assessment
Jobes, David A.; Nelson, Kathryn N.; Peterson, Erin M.; Pentiuc, Daniel; Downing, Vanessa; Francini, Kristen; Kiernan, Amy – Suicide and Life-Threatening Behavior, 2004
Given the incidence and seriousness of suicidality in clinical practice, the need for new and better ways to assess suicide risk is clear. While there are many published assessment instruments in the literature, survey data suggest that these measures are not widely used. One possible explanation is that current quantitatively developed assessment…
Descriptors: Patients, Research Methodology, Interrater Reliability, Suicide
Livingston, Samuel A. – Journal of Educational and Behavioral Statistics, 2004
A performance assessment consisting of 10 separate exercises was scored with a randomized scoring procedure. All responses to each exercise were rated once; in addition, a randomly selected subset of the responses to each exercise received an independent second rating. Each second rating was averaged with the corresponding first rating before the…
Descriptors: Scoring, Performance Based Assessment, Interrater Reliability, Computation
Beckung, E.; Carlsson, G.; Carlsdotter, S.; Uvebrant, P. – Developmental Medicine & Child Neurology, 2007
The aim of this study was to explore motor development in children with cerebral palsy (CP) using developmental curves for CP, subtypes, and the five severity levels of the Gross Motor Function Classification System (GMFCS). The Gross Motor Function Measure (GMFM) and the GMFCS were applied to 317 children (145 females, 172 males) with CP, aged…
Descriptors: Age Differences, Classification, Motor Development, Cerebral Palsy
Keenan, Kate; Wakschlag, Lauren S.; Danis, Barbara; Hill, Carri; Humphries, Marisha; Duax, Jeanne; Donald, Radiah – Journal of the American Academy of Child & Adolescent Psychiatry, 2007
Objective: To test the reliability and validity of DSM-IV oppositional defiant and conduct disorders (ODD and CD) and symptoms using the Kiddie Disruptive Behavior Disorders Schedule and generate data on the manifestation of symptoms of ODD and CD in 3- to 5-year-old children. Method: One hundred twenty-three consecutive referrals to a child and…
Descriptors: Psychiatry, Psychopathology, Test Validity, Preschool Children
Zhou, Zheng; Xin, Tao – Psychology in the Schools, 2007
The traditional kappa statistic in assessing interrater agreement is not adequate when multiraters and multiattributes are involved. In this article, latent trait models are proposed to assess the multirater multiattribute (MRMA) agreement. Data from the Third International Mathematics and Science Studies (TIMSS) are used to illustrate the…
Descriptors: Intervention, School Psychology, Interrater Reliability, Item Response Theory
Hogan, Thomas P.; Murphy, Gavin – Applied Measurement in Education, 2007
We determined the recommendations for preparing and scoring constructed-response (CR) test items in 25 sources (textbooks and chapters) on educational and psychological measurement. The project was similar to Haladyna's (2004) analysis for multiple-choice items. We identified 12 recommendations for preparing CR items given by multiple sources,…
Descriptors: Test Items, Scoring, Test Construction, Educational Indicators
Trickett, Susan Bell; Trafton, J. Gregory – Cognitive Science, 2007
The term "conceptual simulation" refers to a type of everyday reasoning strategy commonly called "what if" reasoning. It has been suggested in a number of contexts that this type of reasoning plays an important role in scientific discovery; however, little "direct" evidence exists to support this claim. This article proposes that conceptual…
Descriptors: Logical Thinking, Scientists, Inferences, Models
Ingvarsson, Einar T.; Tiger, Jeffrey H.; Hanley, Gregory P.; Stephenson, Kasey M. – Journal of Applied Behavior Analysis, 2007
Four preschool children (with and without disabilities), who often responded inappropriately to questions, participated in the current study. Pretest results were used to create sets of questions that the children either did or did not answer correctly (i.e., known and unknown questions). We then sequentially taught two different responses to a…
Descriptors: Disabilities, Preschool Children, Questioning Techniques, Responses
Alexander, Jennifer K.; Scherer, Robert F.; Lecoutre, Marc – Journal of Education for Business, 2007
The authors compared business journal ranking systems from 6 countries. Results revealed a low degree of agreement among the systems, and a low to moderate relationship between pairs of systems. In addition, the French and United Kingdom ranking systems were different from each other and from the systems in Australia, Germany, Hong Kong, and the…
Descriptors: Foreign Countries, Comparative Education, Journal Articles, Business

