Publication Date
| In 2026 | 0 |
| Since 2025 | 13 |
| Since 2022 (last 5 years) | 48 |
| Since 2017 (last 10 years) | 151 |
| Since 2007 (last 20 years) | 301 |
Descriptor
| Interrater Reliability | 503 |
| Test Reliability | 503 |
| Test Validity | 260 |
| Test Construction | 106 |
| Foreign Countries | 103 |
| Psychometrics | 91 |
| Evaluation Methods | 90 |
| Scores | 67 |
| Correlation | 62 |
| Scoring | 61 |
| Rating Scales | 58 |
Author
| Epstein, Michael H. | 7 |
| Johnson, Evelyn S. | 4 |
| Matson, Johnny L. | 4 |
| Tasse, Marc J. | 4 |
| Aman, Michael G. | 3 |
| Canivez, Gary L. | 3 |
| Capie, William | 3 |
| Conroy, Maureen A. | 3 |
| Crawford, Angela R. | 3 |
| Lecavalier, Luc | 3 |
| McLeod, Bryce D. | 3 |
Audience
| Researchers | 41 |
| Practitioners | 8 |
| Administrators | 3 |
| Teachers | 3 |
| Counselors | 1 |
Location
| Turkey | 11 |
| Canada | 10 |
| Australia | 9 |
| United Kingdom | 9 |
| Pennsylvania | 7 |
| Florida | 6 |
| Netherlands | 6 |
| Sweden | 5 |
| United Kingdom (England) | 5 |
| China | 4 |
| Illinois | 4 |
Laws, Policies, & Programs
| Individuals with Disabilities… | 2 |
| No Child Left Behind Act 2001 | 1 |
| Pell Grant Program | 1 |
Nadeau, Luc; Richard, Jean-Francois; Godbout, Paul – Physical Education and Sport Pedagogy, 2008
Background: Coaches and physical educators must obtain valid data relating to the contribution of each of their players in order to assess their level of performance in team sport competition. This information must also be collected and used in real game situations to be more valid. Developed initially for a physical education class context, the…
Descriptors: Physical Education, Team Sports, Observation, Performance Based Assessment
Sagi, Abraham; And Others – Developmental Psychology, 1994 (peer reviewed)
Interviewed Israeli students to assess the Adult Attachment Interview's test-retest reliability and effects of the interviewers on the interview itself. Information about subjects' memory and intellectual abilities was obtained from external sources. Found a high degree of interrater and test-retest reliabilities, irrespective of interviewers.…
Descriptors: Foreign Countries, Intelligence, Interrater Reliability, Memory
Korner, Anneliese F.; And Others – Child Development, 1991 (peer reviewed)
The Neurobehavioral Assessment of the Preterm Infant instrument was developed by means of pilot, exploratory, and validation studies. The validation study tested the generalizability of results for different cohorts, test versions, hospitals, and examiners. Seven stable functions were identified: motor development; scarf sign; popliteal angle;…
Descriptors: Behavior Development, Cluster Analysis, Cohort Analysis, Interrater Reliability
Powell, Thomas W. – Clinical Linguistics & Phonetics, 2006
The third edition of the "Boston Diagnostic Aphasia Examination" (Goodglass, Kaplan, and Barresi) introduced standardized procedures for coding discourse samples elicited using the well known Cookie Theft illustration. To evaluate the reliability of this discourse coding procedure, a transcribed sample was coded by 14 novice examiners…
Descriptors: Examiners, Interrater Reliability, Test Reliability, Aphasia
Arnold, Margery E. – 1996
It is incorrect to say "the test is reliable" because reliability is a function not only of the test itself, but of many factors. The present paper explains how different factors affect classical reliability estimates such as test-retest, interrater, internal consistency, and equivalent forms coefficients. Furthermore, the limits of classical test…
Descriptors: Estimation (Mathematics), Generalizability Theory, Heuristics, Interrater Reliability
Powers, Stephen; And Others – Educational and Psychological Measurement, 1985 (peer reviewed)
Results of an administration of the Language Proficiency Measure indicated that the interrater reliability was adequate, internal-consistency reliability estimates were high, concurrent validity coefficients were adequate, and the classification validity was acceptable. (Author/LMO)
Descriptors: Elementary Education, Interrater Reliability, Language Proficiency, Language Tests
Henk, William A.; Selders, Mary L. – Reading Teacher, 1984 (peer reviewed)
Shows that synonymic scoring of cloze tests is highly variable: the score seems to depend simply on who grades the test. (FL)
Descriptors: Cloze Procedure, Interrater Reliability, Reading Instruction, Reading Research
Johnson, Brian W. – Educational and Psychological Measurement, 1983 (peer reviewed)
Regression analyses indicated that the Coopersmith Self-Esteem Inventory has convergent validity with regard to the Piers-Harris Children's Self-Concept Scale and the Coopersmith Behavioral Academic Assessment Scale, has discriminant validity with regard to the Children's Social Desirability Scale, is sensitive to differences in achievement level,…
Descriptors: Academic Achievement, Intermediate Grades, Interrater Reliability, Self Concept Measures
Ingham, Roger J.; Cordes, Anne K. – Journal of Speech, Language, and Hearing Research, 1997 (peer reviewed)
Stuttering self-judgments from 15 adults who stutter, judgments of each others' stuttering, and the judgments of a panel of 10 stuttering researchers were compared. Results found substantial differences in stuttering judgments across speakers, judges, and judgment conditions, but across-task comparisons were complicated by low self-agreement among…
Descriptors: Adults, Interrater Reliability, Measurement Techniques, Self Evaluation (Individuals)
Abedi, Jamal – Multivariate Behavioral Research, 1996 (peer reviewed)
The Interrater/Test Reliability System (ITRS) is described. The ITRS is a comprehensive computer tool used to address questions of interrater reliability that computes several different indices of interrater reliability and the generalizability coefficient over raters and topics. The system is available in IBM compatible or Macintosh format. (SLD)
Descriptors: Computer Software, Computer Software Evaluation, Evaluation Methods, Evaluators
Gilbride, Dennis; Vandergoot, David; Golden, Kristie; Stensrud, Robert – Rehabilitation Counseling Bulletin, 2006
This study describes the four-phase process used in developing the "Employer Openness Survey" (EOS). The EOS is an 18-item instrument designed to measure the openness of employers to hiring, accommodating, and promoting workers with disabilities. During the first phase, the authors generated potential questions and pilot-tested them with…
Descriptors: Test Validity, Rehabilitation Counseling, Placement, Interrater Reliability
Johnson, Robert L.; Penny, Jim; Fisher, Steve; Kuhs, Therese – Applied Measurement in Education, 2003
When raters assign different scores to a performance task, a method for resolving rating differences is required to report a single score to the examinee. Recent studies indicate that decisions about examinees, such as pass/fail decisions, differ across resolution methods. Previous studies also investigated the interrater reliability of…
Descriptors: Test Reliability, Test Validity, Scores, Interrater Reliability
Stewart, Krista J. – Psychology in the Schools, 1987 (peer reviewed)
Evaluated the technical aspects of three Wechsler Intelligence Scale for Children-Revised (WISC-R) administrations of five psychology graduate students using the WISC-R Administration Observational Checklist (WAOC) to evaluate interrater agreement. Students performed significantly better on the second than on the first observation, with…
Descriptors: Educational Diagnosis, Error Patterns, Examiners, Graduate Students
Muris, Peter; Steerneman, Pim; Ratering, Elise – Journal of Autism and Developmental Disorders, 1997 (peer reviewed)
A study of 10 children (ages 3-6) with pervasive developmental disorders investigated the interrater reliability of the Psychoeducational Profile (PEP). Results show good interrater reliability for the developmental items, indicating that the PEP can be used to evaluate progress in development of children with pervasive developmental disorders.…
Descriptors: Child Development, Children, Evaluation Methods, Foreign Countries
Conroy, Maureen A.; And Others – Education and Training in Mental Retardation and Developmental Disabilities, 1996 (peer reviewed)
This study assessed the intra-rater and inter-rater reliability of the Motivation Assessment Scale as used with 20 adults with mental retardation, expanding the results of previous research by evaluating across additional time and administrations. Results from 19 raters indicated variable moderate-to-low intra-rater and inter-rater reliability.…
Descriptors: Adults, Behavior Problems, Interrater Reliability, Measures (Individuals)

