Peer reviewed: Campion, Michael A.; And Others – Personnel Psychology, 1988
Proposes a highly structured six-step employment interviewing technique which includes asking the same questions, consistently administering the process to all candidates, and having an interview panel. Results of a field study of 243 job applicants using this technique demonstrated interrater reliability, predictive validity, test fairness for…
Descriptors: Employment Interviews, Interrater Reliability, Job Applicants, Measures (Individuals)
Peer reviewed: Phelps, Le Adelle; And Others – Journal of Teacher Education, 1986
A performance-based student teacher evaluation process was investigated to see if halo and leniency errors could be eliminated. Results are presented. (MT)
Descriptors: Cooperating Teachers, Evaluation Criteria, Higher Education, Interrater Reliability
Peer reviewed: Collier, Michael – Assessment and Evaluation in Higher Education, 1986
A study revealing wide variation in the grading of electronics engineering test items by different evaluators has implications for evaluator and test item selection, analysis and manipulation of grades, and the use of numerical methods of assessment. (MSE)
Descriptors: Electronics, Engineering Education, Evaluation Methods, Evaluators
Peer reviewed: Wilson, F. Robert; Griswold, Mary Lynn – Measurement and Evaluation in Counseling and Development, 1985
Type and comprehensiveness of training were experimentally manipulated (N=128) to study their effects on the reliability and validity of rated counselor empathy. Implications for observer training are discussed. (Author)
Descriptors: College Students, Counselor Characteristics, Empathy, Interrater Reliability
Hertz, Norman R.; Chinn, Roberta N. – 2002
Nearly all of the research on standard setting focuses on different standard setting methods rather than the interaction of group members and the instructions given to group members. This study explored the effect of deliberation style and the requirement to reach consensus on the passing score, on rater satisfaction, and on postdecision…
Descriptors: Decision Making, Evaluation Methods, Evaluators, Interaction
O'Neill, Thomas R.; Lunz, Mary E. – 1997
This paper illustrates a method to study rater severity across exam administrations. A multi-facet Rasch model defined the ratings as being dominated by four facets: examinee ability, rater severity, project difficulty, and task difficulty. Ten years of data from administrations of a histotechnology performance assessment were pooled and analyzed…
Descriptors: Ability, Comparative Analysis, Equated Scores, Interrater Reliability
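For readers unfamiliar with the model named in the O'Neill and Lunz abstract, a many-facet Rasch model with the four facets listed there is commonly written in the following adjacent-category form. This is a sketch only: the symbols and the rating-scale threshold term are assumptions taken from the standard formulation, not from the paper itself.

```latex
% Assumed notation (not from the abstract):
%   B_n = ability of examinee n        C_j = severity of rater j
%   D_i = difficulty of project i      E_m = difficulty of task m
%   F_k = threshold between rating categories k-1 and k
%   P_{nijmk} = probability that examinee n receives rating k
%               from rater j on task m of project i
\[
  \ln\!\left(\frac{P_{nijmk}}{P_{nijm(k-1)}}\right)
    = B_n - C_j - D_i - E_m - F_k
\]
```

Under this form, a more severe rater (larger C_j) lowers the log-odds of the higher rating category in the same way that a harder project or task does, which is what allows rater severity to be compared across administrations on a common scale.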
Taherbhai, Husein; Young, Michael James – 2000
This empirical study used data from the Reading: Basic Understanding section of the New Standards English Language Arts Examination. Data were collected for 3,200 high school students randomly selected from those who took the examination. The resulting sample had 16 raters who scored 200 students each, with each student rated by only 1 rater. The…
Descriptors: Evaluators, High School Students, High Schools, Interrater Reliability
Wang, Ning; Wiser, Randall F.; Newman, Larry S. – 2001
This paper provides both logical and empirical evidence to justify the use of an item mapping method for establishing passing scores for multiple-choice licensure and certification examinations. After describing the item-mapping standard setting process, the paper discusses the theoretical basis and rationale for this newly developed method and…
Descriptors: Certification, Cutting Scores, Interrater Reliability, Item Response Theory
Peer reviewed: Fleishman, Rachel; And Others – Evaluation Review, 1996
An interjudge reliability test was conducted to evaluate questionnaires used in the surveillance of residential care institutions in Israel. Results from 32 institutions (each evaluated by two surveyor teams of one social worker and one nurse) and the variance in reliability were used to improve the questionnaires and their administration. (SLD)
Descriptors: Evaluators, Foreign Countries, Institutional Characteristics, Interrater Reliability
Peer reviewed: Lombard, Matthew; Snyder-Duch, Jennifer; Bracken, Cheryl Campanella – Human Communication Research, 2002
Reviews the importance of intercoder agreement for content analysis in mass communication research. Describes several indices for calculating this type of reliability (varying in appropriateness, complexity, and apparent prevalence of use). Presents a content analysis of content analyses reported in communication journals to establish how…
Descriptors: Communication Research, Content Analysis, Higher Education, Interrater Reliability
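The Lombard et al. abstract refers to several indices of intercoder agreement without naming them. As an illustration only, the sketch below computes two widely used ones, simple percent agreement and Cohen's kappa, for two coders who assign nominal categories to the same units. The function names and the sample codes are hypothetical and are not taken from the article.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of units on which the two coders chose the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must rate the same non-empty set of units")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: product of each coder's marginal proportions,
    # summed over every category either coder used.
    categories = freq_a.keys() | freq_b.keys()
    p_expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes assigned by two coders to the same ten units.
coder_a = ["news", "ad", "news", "news", "ad", "news", "ad", "ad", "news", "news"]
coder_b = ["news", "ad", "ad", "news", "ad", "news", "ad", "news", "news", "news"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Percent agreement is the simpler figure but does not account for agreement expected by chance, so kappa is lower than percent agreement on the same data whenever some chance agreement exists and the coders are not in perfect agreement.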
Peer reviewed: Kolevzon, Michael S.; And Others – Journal of Marital and Family Therapy, 1988
Employed triangulation strategy for assessing family interaction, involving family members, therapist, and coders independently viewing videotapes. Found weak agreement between paired assessments within family triad, and within therapist-coder dyad. Findings suggest that methodological and/or scaling strategies designed to maximize agreement may…
Descriptors: Counselor Attitudes, Evaluation Criteria, Evaluation Methods, Evaluation Problems
Peer reviewed: Sagi, Abraham; And Others – Developmental Psychology, 1994
Interviewed Israeli students to assess the Adult Attachment Interview's test-retest reliability and effects of the interviewers on the interview itself. Information about subjects' memory and intellectual abilities was obtained from external sources. Found a high degree of interrater and test-retest reliabilities, irrespective of interviewers.…
Descriptors: Foreign Countries, Intelligence, Interrater Reliability, Memory
Peer reviewed: Colliver, Jerry R.; And Others – Academic Medicine, 1991
Case means and case failures in performance-based medical student evaluations were examined to evaluate the consistency of ratings made by two or more standardized patients (SPs) simulating the same case. Results demonstrate a need for caution in interpreting scores obtained from a case checklist completed by multiple SPs. (Author/MSE)
Descriptors: Evaluation Methods, Higher Education, Interrater Reliability, Medical Education
Peer reviewed: Korner, Anneliese F.; And Others – Child Development, 1991
The Neurobehavioral Assessment of the Preterm Infant instrument was developed by means of pilot, exploratory, and validation studies. The validation study tested the generalizability of results for different cohorts, test versions, hospitals, and examiners. Seven stable functions were identified: motor development; scarf sign; popliteal angle;…
Descriptors: Behavior Development, Cluster Analysis, Cohort Analysis, Interrater Reliability
Peer reviewed: Bradley, Clare – Assessment and Evaluation in Higher Education, 1993
Analysis of a study of sex bias in undergraduate student project evaluations revealed evidence of bias that was overlooked by the researchers. Research methodology and interpretation are discussed further. (MSE)
Descriptors: College Students, Higher Education, Interrater Reliability, Research Methodology


