Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Scope, Alison; Empson, Janet; McHale, Sue; Nabuzoka, Dabie – Emotional & Behavioural Difficulties, 2007
The objective of this paper is to report the development and use of an observation checklist to identify typically developing children with behavioural manifestations associated with inattention, hyperactivity and impulsivity. This measure is termed the Scope Classroom Observation Checklist (SCOC). The SCOC was developed, assessed for reliability…
Descriptors: Check Lists, Conceptual Tempo, Classroom Observation Techniques, Academic Achievement
Engelhard, George, Jr.; And Others – 1994
A set of procedures is described for constructing an assessment network composed of a connected system of rater and writing task banks within the context of large-scale assessments of written composition. The calibration of the assessment tasks and the measurement of individuals are viewed as separate, although complementary, activities. The…
Descriptors: Data Collection, Educational Assessment, Interrater Reliability, Item Banks
Plake, Barbara S.; Impara, James C. – 1996
This study investigated the intrajudge consistency of Angoff-based item performance estimates. The examination used was a certification examination in an emergency medicine specialty. Ten expert panelists rated the same 24 items twice during an operational standard setting study. Results indicate that the panelists were highly consistent, in terms…
Descriptors: Cutting Scores, Interrater Reliability, Licensing Examinations (Professions), Performance Based Assessment
Piper, David Warren – 1994
This book offers a descriptive account of the world of the British external examiner: who they are, what they do, how they are appointed, and what problems they face. Findings are based on a questionnaire survey of and personal interviews with examiners. The external examiner is independent of the university awarding the degree and quite…
Descriptors: Examiners, Experimenter Characteristics, Foreign Countries, Higher Education
Wheeler, Wendy; And Others – 1994
The Communication and Symbolic Behavior Scales (CSBS) were developed to standardize communication sampling procedures in assessing preverbal and early verbal stage children. To assess interrater reliability in using the scales with children with developmental disabilities, eight children (ages 33-113 months) with developmental disabilities were…
Descriptors: Behavior Rating Scales, Communication Skills, Developmental Disabilities, Early Childhood Education
Ferguson, Gibson; Maclean, Joan – Edinburgh Working Papers in Linguistics, 1991
This study is the first stage of a wider enquiry into alternative ways of assessing the readability of specialist texts. The interest in assessing these texts arose from the need to grade 60 medical journal articles for an individualized English-as-a-Foreign-Language reading scheme for doctors. The study reports on an investigation of subjective…
Descriptors: Difficulty Level, English (Second Language), Foreign Countries, Interrater Reliability
Lumley, Tom; McNamara, T. F. – 1993
Recent developments in multi-faceted Rasch measurement (Linacre, 1989) have made possible new kinds of investigations of aspects of performance assessments. Bias analysis, interactions between elements of any facet, can also be analyzed, which permits investigation of the way a particular aspect of the test situation may elicit a consistently…
Descriptors: English (Second Language), Experimenter Characteristics, Foreign Countries, Interrater Reliability
Schael, Jocelyne; Dionne, Jean-Paul – 1991
The basis of agreement or disagreement among judges/evaluators when applying a coding scheme to concurrent verbal protocols was studied. The sample included 20 university graduates, from varied backgrounds; 10 subjects had and 10 subjects did not have experience in protocol analysis. The total sample was divided into four balanced groups according…
Descriptors: Adults, College Graduates, Comparative Analysis, Encoding (Psychology)
Rogoff, Barbara; And Others – 1985
Examined are developmental changes in infants' strategies for using adults instrumentally to achieve goals. Data were derived from longitudinal observations of 1 girl and 1 boy twin individually interacting with 21 somewhat or totally unfamiliar adults at 2- or 3-week intervals from the age of 4 to 15 months, inclusive. Videotapes of interactions…
Descriptors: Adult Child Relationship, Adults, Criteria, Individual Development
Harker, Jill K.; Cope, Ronald T. – 1988
Cut scores obtained for licensure tests using different judgmental methods of standard setting (holistic, test blueprint, Angoff, and modified Angoff) were compared. Nineteen educators and practitioners participated in this study as judges. Pre- and post-test feedback (feedback of total- and low-group item p-value) ratings were obtained under the…
Descriptors: Cutting Scores, Feedback, Holistic Evaluation, Interrater Reliability
Barnwell, David – 1986
A study examined inter-rater reliability on the American Council on the Teaching of Foreign Languages/Educational Testing Service (ACTFL/ETS) oral language proficiency rating scale. Seven raters, all elementary or intermediate college Spanish teachers given only brief formal training in the use of the scale, evaluated recorded interviews with…
Descriptors: College Faculty, Higher Education, Interrater Reliability, Language Teachers
Rose, Janet S.; Huynh, Huynh – 1984
As part of a new teacher evaluation program initiated by the local school board, the Charleston County School District (South Carolina) adopted the Assessments of Performance in Teaching (APT) as a major evaluation tool to assess the teaching performance of annual contract teachers. Since evaluation procedures can ultimately lead to teacher…
Descriptors: Classroom Observation Techniques, Elementary Secondary Education, Evaluation Methods, Interrater Reliability
Jackson, Christine; Levine, Douglas W. – 1983
This study assessed the interchangeability of the Matthews Youth Test for Health (MYTH) and Hunter-Wolf A-B Rating Scale. Data from 25 elementary teachers and 300 of their students showed these scales to be weakly correlated, and the concordance of their A-B classifications to be only slightly above that expected by chance. Weak agreement was…
Descriptors: Behavior Rating Scales, Correlation, Elementary Education, High Schools
Christine, Charles T.; And Others – 1982
Thirty-two children aged 7 to 12 participated in a study to determine the reliability of the Ekwall Reading Inventory (ERI) and the Classroom Reading Inventory (CRI). The children were randomly assigned to take one of the two inventories, which were administered by four different specially trained teachers. The study used a test-retest design, in…
Descriptors: Comparative Analysis, Elementary Secondary Education, Informal Reading Inventories, Interrater Reliability
Deno, Stanley L.; And Others – 1983
Using instructional variables identified by the literature as important in predicting classroom achievement, a bi-polar rating scale was designed to assess the structure of instruction in resource rooms. The data for 158 elementary school children in four school districts were analyzed. The scale evidenced good reliability, both in terms of…
Descriptors: Academic Achievement, Classroom Environment, Elementary Education, Factor Structure

