Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Peer reviewed: Epstein, Michael H.; Cullinan, Douglas; Harniss, Mark K.; Ryser, Gail – Behavioral Disorders, 1999
Three studies are reported addressing the reliability of the Scale for Assessing Emotional Disturbance (SAED), a standardized, norm-referenced measure linked to the federal definition of emotional disturbance (ED). Results indicate the SAED possesses acceptable test-retest reliability and reasonable interrater reliability and can assist in the…
Descriptors: Disability Identification, Elementary Secondary Education, Eligibility, Emotional Disturbances
Peer reviewed: Fitzpatrick, Anne R.; Ercikan, Kadriye; Yen, Wendy M.; Ferrara, Steven – Applied Measurement in Education, 1998
The consistency between raters over three years of a high-stakes performance assessment was examined in two studies involving a total of approximately 3,000 students in grades three, five, and eight. Results show that raters in different years differ in severity, with raters in mathematics most consistent, and those in language arts least…
Descriptors: Elementary Education, Elementary School Students, High Stakes Tests, Interrater Reliability
Peer reviewed: Gillberg, Christopher; Gillberg, Carina; Rastam, Maria; Wentz, Elisabeth – Autism: The International Journal of Research and Practice, 2001
The development of the Asperger Syndrome (and high-functioning autism) Diagnostic Interview (ASDI) is described. Preliminary data from a clinical study of 20 individuals (ages 6-55) suggest that interrater reliability and test-retest stability may be excellent, with kappas exceeding 0.90 in both instances. The validity appears to be relatively…
Descriptors: Adults, Asperger Syndrome, Autism, Children
Peer reviewed: Melby, Janet M.; And Others – Journal of Marriage and the Family, 1995
Multiple observer ratings of 424 families were obtained across 2 observational task situations using the Iowa Family Interaction Rating Scales. Observer ratings and family member reports were assessed simultaneously through structural equation modeling. Findings support the reliability of the global assessments of warm/supportive marital…
Descriptors: Affective Behavior, Higher Education, Interpersonal Relationship, Interrater Reliability
Spreat, Scott; Connelly, Lisa – American Journal on Mental Retardation, 1996
Reliability analysis of the Motivation Assessment Scale was conducted on subscales completed by staff members working with 47 institutionalized adults with severe to profound mental retardation and self-injurious behavior problems. Internal consistency was found to be superior to interrater reliability. The instrument's internal consistency…
Descriptors: Adults, Behavior Problems, Institutionalized Persons, Interrater Reliability
Peer reviewed: Congdon, Peter J.; McQueen, Joy – Journal of Educational Measurement, 2000
Studied the stability of rater severity over an extended rating period by applying multifaceted Rasch analysis to ratings of 16 raters of writing performances of 8,285 elementary school students. Findings cast doubt on the practice of using a single calibration of rater severity as the basis for adjustment of person measures. (SLD)
Descriptors: Educational Assessment, Elementary Education, Elementary School Students, Interrater Reliability
Peer reviewed: Baird, Christopher; Wagner, Dennis; Healy, Theresa; Johnson, Kristen – Child Welfare, 1999
Compared reliability of three widely used child protective service risk-assessment models (one actuarial, two consensus based). Found that, although no system approached 100% interrater reliability, raters employing the actuarial model made consistent estimates of risk for a high percentage of cases they assessed. Interrater reliability for the…
Descriptors: At Risk Persons, Child Welfare, Children, Comparative Analysis
Peer reviewed: Stoker, J. I.; Van der Heijden, B. I. J. M. – Journal of Career Development, 2001
In Study 1, 313 supervisor/supervisee pairs rated supervisees' professional expertise; supervisees gave themselves higher ratings. Study 2 compared 63 team leaders' and 593 team members' ratings of leaders, finding different perceptions of competence. Results suggest the use of self-other ratings can be improved through feedback, joint training…
Descriptors: Competence, Foreign Countries, Interrater Reliability, Personnel Evaluation
Read, Barbara; Francis, Becky; Robson, Jocelyn – Assessment and Evaluation in Higher Education, 2005
This paper reports on findings relating to a project on gender and essay assessment in HE. It focuses on one aspect of the study: the assessment of and feedback given to two sample essays by 50 historians based at universities in England and Wales. We found considerable variation both as to the classification awarded to the essays and to positive…
Descriptors: Foreign Countries, Historians, Feedback, Gender Issues
McLeod, Bryce D.; Weisz, John R. – Journal of Consulting and Clinical Psychology, 2005
The authors describe psychometric characteristics of the new Therapy Process Observational Coding System-Alliance scale (TPOCS-A; B. D. McLeod, 2001) and illustrate its use in the study of treatment as usual. The TPOCS-A uses session observation to assess child-therapist and parent-therapist alliance. Both child and parent forms showed acceptable…
Descriptors: Measures (Individuals), Therapy, Psychometrics, Depression (Psychology)
Feldman, M. A.; Atkinson, L.; Foti-Gervais, L.; Condillac, R. – Journal of Intellectual Disability Research, 2004
Although effective, humane treatments exist for persons with intellectual disabilities (ID) who have challenging behaviour, little research has examined the extent to which clients receive formal, documented vs. undocumented interventions. Caregivers (of 625 persons with ID living in community and institutional residences in Ontario, Canada) were…
Descriptors: Foreign Countries, Supervision, Intervention, Incidence
Oriogun, Peter K.; Cook, John – American Journal of Distance Education, 2003
In this article, we extend previous work with respect to interrater reliability measure of computer-mediated conferencing and suggest coding categories relevant to problem-based learning. Calculating interrater reliability agreement by using a Transcript Reliability Cleaning Percentage (TRCP) approach is simple for academics with limited…
Descriptors: Problem Based Learning, Interrater Reliability, Teleconferencing, Discourse Analysis
Sung, Yao-Ting; Lin, Chen-Shan; Lee, Chi-Lung; Chang, Kuo-En – Teaching of Psychology, 2003
Thirty-four undergraduates used Web-based self- and peer-assessment procedures for evaluating proposals in experimental psychology courses. Students presented their proposals and commented on the proposals of others on the Web. Results indicated that proposal observation and peer interaction enhanced the quality of students' proposals. These…
Descriptors: Internet, Interrater Reliability, Experimental Psychology, Undergraduate Students
Azrin, Nathan H.; Kellen, Michael J.; Ehle, Christopher T.; Brooks, Jeannie S. – Behavior Modification, 2006
Studies of self-induced vomiting of retarded persons have found that the rate of eating and the amount eaten alter this problem. The present study attempted to determine whether this same relationship was exhibited by the nonretarded bulimic. A nonretarded bulimic woman provided her subjective ratings of her desire to vomit after eating her taboo…
Descriptors: Eating Disorders, Females, Food, Research Design
Sharma, Anu; Botzet, Andria M.; Sechrist, Rebecca A. J.; Arthur, Nikki; Winters, Ken C. – Journal of Child and Adolescent Substance Abuse, 2006
This study reports on norms developed for the Minnesota Institute of Public Health's (1999) Community Readiness Survey. Prevention experts from ten states and the Red Lake Nation sorted data from 50 communities into high and low readiness groups using a Q-sort process. High inter-rater agreement was achieved on communities sorted. Tests of…
Descriptors: State Norms, Community Surveys, Interrater Reliability, Substance Abuse