Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Peer reviewed: Bannister, Brendan D.; And Others – Educational and Psychological Measurement, 1987
To control for response bias in student ratings of college teachers, an index of rater error was used that was theoretically independent of actual performance. Partialing out the effects of this extraneous response bias enhanced validity, but partialing out overall effectiveness resulted in reduced convergent and discriminant validities.…
Descriptors: Error of Measurement, Higher Education, Interrater Reliability, Response Style (Tests)
Peer reviewed: Cooke, Robert A.; And Others – Educational and Psychological Measurement, 1987
Lafferty's Life Styles Inventory was completed by 556 managers (Level I, Self-Description) and by 2,922 peers, subordinates, and supervisors (Level II, Description by Others). Factor analysis revealed the same three factors in both ratings. Coworkers generally agreed with each other's ratings, but correlations between self and coworker ratings…
Descriptors: Administrator Evaluation, Adults, Behavior Rating Scales, Cognitive Style
Peer reviewed: Watson, Wilbur H. – International Journal of Aging and Human Development, 1988
Analyzed interprofessional agreements (N=26) between nurses and social workers when rating older patients on their physical self-maintenance abilities, mental status, and dispositions to social interaction with other residents of a home for the aged. Findings showed significant intercorrelations of physical self-maintenance abilities and mental…
Descriptors: Evaluation, Evaluation Criteria, Institutionalized Persons, Interrater Reliability
Peer reviewed: Brown, Ronald T. – Educational and Psychological Measurement, 1985
Elementary school children with attention deficit disorder were classified according to the Diagnostic and Statistical Manual of Mental Disorders (DSM-III) as hyperactive or not. Teachers' ratings indicated hyperactive children were more problematic. Teachers were able to distinguish between the groups, using the Abbreviated Conners' Rating Scale…
Descriptors: Attention Deficit Disorders, Behavior Rating Scales, Educational Diagnosis, Elementary Education
Peer reviewed: Mahoney, Gerald; And Others – Topics in Early Childhood Special Education, 1986
Independent ratings of videotaped sessions in which mothers (N=60) interacted with their mentally retarded children (ages 1-3) suggested that potentially important components of maternal behavior (child orientedness/pleasure and control) may be assessed with the seven-item short form of the Maternal Behavior Rating Scale. (JW)
Descriptors: Behavior Patterns, Behavior Rating Scales, Downs Syndrome, Interaction Process Analysis
Peer reviewed: Meskauskas, John A. – Evaluation and the Health Professions, 1986
Two new indices of stability of content-referenced standard-setting results are presented, relating variability of judges' decisions to the variability of candidate scores and to the reliability of the test. These indices are used to indicate whether scores resulting from a standard-setting study are of sufficient precision. (Author/LMO)
Descriptors: Certification, Credentials, Error of Measurement, Generalizability Theory
Solano-Flores, Guillermo; Shavelson, Richard J.; Ruiz-Primo, Maria Araceli; Schults, Susan Elise; Wiley, Edward W.; Brown, Janet H. – 1997
In a project on the development of performance assessments in science, researchers have developed four types of assessment: comparative, component identification, classification, and observation. They have noted that assessments belonging to the same type of task can be scored for the same performance properties. This paper reports on the…
Descriptors: Classification, Elementary School Students, Grade 5, Intermediate Grades
Gilbert, Sharon L. – 1997
This study examined whether variations in the Developmental Observation Checklist (DC) format influence congruence of scores among both parents and the child's teacher. The DC was varied by adding pictorial illustrations and examples and having three response categories instead of two. Results from 100 sets of participants were evaluated with…
Descriptors: Check Lists, Developmental Delays, Early Intervention, Fathers
Flanders, Anne K.; Wick, John – 1998
This paper examines whether the peer-review process of the North Central Association (NCA) is reliable and valid. Reliance on peer judgments has been a part of NCA accreditation, but confidence in the use of peer decisions to certify a school's readiness to implement the improvement plan--Outcomes Accreditation (OA)--was weak. The study focused on…
Descriptors: Accreditation (Institutions), Educational Assessment, Educational Improvement, Elementary Secondary Education
Peer reviewed: Webster-Stratton, Carolyn – Journal of Consulting and Clinical Psychology, 1988
Mothers (N=120) and fathers (N=85) of young children with conduct problems completed measures of child adjustment, personal adjustment, a Life Experience Survey, and were observed with child. Teachers (N=107) completed Behar Preschool Questionnaire. Found that fathers' perceptions of children's behaviors were significantly correlated with…
Descriptors: Adjustment (to Environment), Behavior Problems, Fathers, Interrater Reliability
Ulrich, Dale A.; And Others – American Journal on Mental Retardation, 1989
The study of variance in observers assessing movement control in children with mild mental retardation found that observers who received informal training needed to observe twice as many trials to reach an acceptable standard of reliability compared to the observers who received competency-based training. (Author/DB)
Descriptors: Classroom Observation Techniques, Competency Based Teacher Education, Elementary Secondary Education, Interrater Reliability
Peer reviewed: Burchard, Kenneth W.; And Others – Academic Medicine, 1995
A study measured interrater reliability among 140 United States and Canadian surgery exam raters and the influences of age, years in practice, and experience as an examiner on individual scores. Results indicate three aspects of examinee performance influenced scores: verbal style, dress, and content of answers. No rater characteristic…
Descriptors: Higher Education, Hygiene, Individual Characteristics, Interrater Reliability
Peer reviewed: Lee, Steven W.; And Others – Behavioral Disorders, 1994
The Child Behavior Checklist and related forms were completed for 171 boys referred for school-based assessment resulting from academic and/or behavioral problems. Adolescents consistently underreported behavioral problems relative to parents and teachers regardless of subsequent diagnosis. Implications of these discrepancies in school-based…
Descriptors: Adolescents, Behavior Problems, Disability Identification, Educational Diagnosis
Peer reviewed: Yarbrough, Cornelia; And Others – Bulletin of the Council for Research in Music Education, 1994
Reports on a study of 614 experienced music teachers, non-music teachers, college-level music students, and non-music students on the effect of sequential patterns and different modes of presentation of music teaching. Finds that experienced teachers' evaluations were significantly higher than those of university students. (CFR)
Descriptors: Educational Strategies, Evaluative Thinking, Evaluators, Higher Education
Peer reviewed: Shaw, Darlene L.; And Others – Academic Medicine, 1995
A study found that interviewers of medical school applicants (n=471) were influenced in their ratings of applicants' noncognitive attributes by grade point average and Medical College Admission Test scores, when available, and by gender and race in accordance with affirmative action goals. Only moderate reliability across interviewers was found.…
Descriptors: Affirmative Action, College Admission, College Applicants, Higher Education


