International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
New York State Education Department, 2014
This technical report provides an overview of the New York State Alternate Assessment (NYSAA), including a description of the purpose of the NYSAA, the processes utilized to develop and implement the NYSAA program, and stakeholder involvement in those processes. The purpose of this report is to document the technical aspects of the 2013-14 NYSAA.…
Descriptors: Alternative Assessment, Educational Assessment, State Departments of Education, Student Evaluation
Meier, Augustine; Boivin, Micheline – Journal of Consulting and Clinical Psychology, 1986 (peer reviewed)
The Client Verbal Response Category System classifies client responses into Temporal, Directional and Experiential categories. The categories with their subcategories are defined, interjudge reliability data is presented, and the instrument's utility in psychotherapy process research is demonstrated. Initial results indicate that the instrument is…
Descriptors: Client Characteristics (Human Services), Interrater Reliability, Psychotherapy, Research Tools
Conroy, Maureen A.; And Others – Education and Training in Mental Retardation and Developmental Disabilities, 1996 (peer reviewed)
This study assessed the intra-rater and inter-rater reliability of the Motivation Assessment Scale as used with 20 adults with mental retardation, expanding the results of previous research by evaluating across additional time and administrations. Results from 19 raters indicated variable moderate-to-low intra-rater and inter-rater reliability.…
Descriptors: Adults, Behavior Problems, Interrater Reliability, Measures (Individuals)
Reckase, Mark D. – 1997
This paper argues that special procedures for constructing assessment tools containing performance assessment tasks are unnecessary and that current test methodology can easily be generalized to complex performance assessment tasks without destroying the desirable characteristics of those tasks. Reasonable statistical requirements for sound…
Descriptors: Educational Assessment, Generalizability Theory, High Stakes Tests, Interrater Reliability
Smith, Richard Merrill – Academic Medicine, 1993 (peer reviewed)
A University of Hawaii study compared objective and subjective assessments of the three-step triple jump examination which tests medical students' clinical problem-solving processes. Subjects were 58 first-year students. Results found the subjective assessments were more consistent across problems of varying difficulty level than were objective…
Descriptors: Case Studies, Difficulty Level, Higher Education, Interrater Reliability
Alderson, J. Charles; And Others – 1995
The guide is intended for teachers who must construct language tests and for other professionals who may need to construct, evaluate, or use the results of language tests. Most examples are drawn from the field of English-as-a-Second-Language instruction in the United Kingdom, but the principles and practices described may be applied to the…
Descriptors: Educational Trends, English (Second Language), Interrater Reliability, Language Tests
Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991 (peer reviewed)
Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)
Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques
Wolfe, Edward W. – 1996
Although portfolio assessment is becoming increasingly popular, it may not survive unless portfolio scoring can meet the demands of large-scale assessment standards. The results of studies of interrater reliability with large-scale portfolio assessments have been mixed. This paper reports the scoring results of a nationwide portfolio pilot in…
Descriptors: Decision Making, Generalizability Theory, Interrater Reliability, Language Arts
Gearhart, Maryl; Novak, John R.; Herman, Joan L. – 1994
This study examined technical questions regarding the reliability and validity of large-scale portfolio assessment, focusing on: (1) whether raters can score collections of writing reliably with rubrics designed for single samples; (2) whether ratings derived from different frameworks differ in their capacities to support technically sound…
Descriptors: Educational Assessment, Elementary Education, Elementary School Students, Essay Tests
Florida State Dept. of Education, Tallahassee. Div. of Vocational, Adult, and Community Education. – 1991
This packet contains a manual and a workbook for developing performance tests in vocational education. The manual gives an in-depth description of how to develop, score, and use performance tests. It includes the following sections: definitions of performance testing, steps in developing a performance test, selecting a performance development…
Descriptors: Interrater Reliability, Performance Tests, Postsecondary Education, Scoring

