Peer reviewed: Calhoun, Judith G.; And Others – Evaluation and the Health Professions, 1988
Second-year medical students (N=187) evaluated their own videotaped performances during physical assessment examinations. Peers and expert evaluators also rated the performances. Significant differences emerged across types of raters. Results show that these students did not accurately assess their own or their peers' skill performance. (TJH)
Descriptors: Clinical Diagnosis, Interrater Reliability, Medical Care Evaluation, Medical Students
Peer reviewed: Peterson, Donovan – Educational Research Quarterly, 1986
This article describes procedures to be followed in developing a system for observation of teachers in the classroom and use of the observation to evaluate teachers. A list of criteria is presented, including various types of validity, measurement characteristics, and practicality characteristics for observation systems. (Author/LMO)
Descriptors: Achievement Gains, Classroom Observation Techniques, Educational Assessment, Elementary Secondary Education
Peer reviewed: Seibert, Jeffrey M.; And Others – Merrill-Palmer Quarterly, 1986
Tests hypotheses that social and object skills can be equated for cognitive complexity and that correlations between measures organized according to levels are high for all samples spanning the developmental range. Cognitive levels of both object and social skills were assessed for 50 normal and 34 handicapped children. Hypotheses were confirmed.…
Descriptors: Chronological Age, Cognitive Development, Day Care, Disabilities
Peer reviewed: Smith, Peter K.; Vollstedt, Ralph – Child Development, 1985
Five common play criteria were applied by subjects to a videotape of nursery school children's behavior, rated separately for occurrence of play. The idea that play is best predicted by a combination of criteria was supported by the finding that the more criteria occurred simultaneously, the more certainly a judgment of play was implied.…
Descriptors: Criteria, Day Care, Definitions, Early Childhood Education
Peer reviewed: Mahler, Charles A. – School Psychology Review, 1986
A cross-age tutoring program designed to improve school performance of tutors (adolescents classified as emotionally disturbed) and tutees (children classified as educable mentally retarded) was replicated in an urban public school district. Results showed that both tutors and tutees improved on academic and social measures of school performance.…
Descriptors: Academic Achievement, Adolescents, Children, Cross Age Teaching
Corcoran, Kevin J.; White, Lyle J.; Michels, Jennifer L.; Gilbert, David G. – 1997
Recently, a great deal of attention has been focused on the development of a system of relational diagnosis to be incorporated into the American Psychiatric Association's diagnostic system, that of the Diagnostic and Statistical Manual (DSM). One of the more intriguing components of this effort is the Global Assessment of Relational Functioning…
Descriptors: Adults, Diagnostic Tests, Emotional Response, Graduate Students
Gulek, Cengiz – 1999
This study explored the possibility of using the Multi-Trait Multi-Method (MTMM) approach of D. Campbell and D. Fiske (1959) to examine the educational ecology of classrooms. Using the MTMM approach, the study focuses on the extent to which multiple methods of inquiry (surveys, drawings, and videos) are valid indicators of classroom teaching and…
Descriptors: Educational Environment, Elementary Education, Evaluation Methods, Freehand Drawing
Peer reviewed: Waters, L. K.; And Others – Perceptual and Motor Skills, 1982
Multitrait-multimethod analysis was performed on instructors' ratings from behaviorally anchored rating scales, graphic rating scales, and mixed standard scales. Two samples of 100 undergraduate students were distinguished on the basis of whether the statements on the mixed-standard scale were behaviorally specific or more generic descriptions of…
Descriptors: Behavior Rating Scales, Discriminant Analysis, Higher Education, Interrater Reliability
Peer reviewed: Schor, Nina F.; And Others – Academic Medicine, 1997
A study investigated whether medical school faculty can arrive at consistent, non-idiosyncratic grades in a problem-based learning course. Analysis of grades given by three teachers, based on seven performance categories, to 16 groups of nine students in a seven-week University of Pittsburgh (Pennsylvania) course revealed that given specific…
Descriptors: Academic Achievement, Curriculum Design, Grading, Higher Education
Peer reviewed: Kwan, Kam-por; Leung, Roberta – Assessment & Evaluation in Higher Education, 1996
Performance in a simulation exercise of 96 third-year college students studying the hotel and tourism industries was assessed separately by teacher and peers using an identical checklist. Although results showed some agreement between teacher and peers, when averaged marks were converted into grades, agreement occurred in under half the cases.…
Descriptors: Comparative Analysis, Evaluation Criteria, Evaluation Methods, Higher Education
Peer reviewed: Peacock, Matthew – ELT Journal, 1997
Investigates whether authentic materials increase student motivation in the classroom. Findings reveal that while on-task behavior and observed motivation increased significantly when authentic materials were used, self-reported motivation only increased over the last 12 days of the study. Students reported authentic materials to be less…
Descriptors: Class Activities, Data Collection, English (Second Language), Instructional Materials
Peer reviewed: Delandshere, Ginette; Petrosky, Anthony R. – Educational Researcher, 1994
Discusses the role and consistency of judges' interpretations of teacher performance as part of an evaluative scheme for complex performance, with reference to the ideological framework of professional standards. The tension between assessment decisions and the recognition that assessment involves interpretation is explored. (SLD)
Descriptors: Decision Making, Educational Assessment, Epistemology, Evaluators
Peer reviewed: Montgomery, Michelle S. – Journal of Learning Disabilities, 1994
Sixth-, seventh-, and eighth-grade students (n=135) were administered the Multidimensional Self-Concept Scale. Teachers and parents also rated the children's self-concepts. Teachers generally underestimated the self-concepts of average students and students with learning disabilities but overestimated high-achieving students' self-concepts.…
Descriptors: High Achievement, Intermediate Grades, Interrater Reliability, Junior High Schools
Merrell, Kenneth W.; Popinga, Monique R. – Diagnostique, 1994
Analysis of social competence and problem behaviors for 164 special education students in grades K-3, using the Social Skills Rating System, found weak to moderate relationships between parents' and teachers' ratings. Gender comparisons revealed mixed findings. Results provide evidence for the instrument's convergent validity and suggest that…
Descriptors: Behavior Problems, Behavior Rating Scales, Disabilities, Emotional Problems
Peer reviewed: Sweedler-Brown, Carol O. – Journal of Second Language Writing, 1993
A study compared the influences of rhetorical and sentence-level features on holistic essay scores assigned by raters who are experienced writing instructors but not trained in English-as-a-Second-Language (ESL) instruction. In scoring six university-level essays, these raters placed emphasis on ESL sentence-level errors far more than on essays'…
Descriptors: Discourse Analysis, English (Second Language), Error Correction, Essays


