Showing 1,951 to 1,965 of 3,124 results
Peer reviewed
Newman, Jody L.; Fuqua, Dale R. – Counselor Education and Supervision, 1986
Examined the effects of order of stimulus presentation on observer ratings of counseling performance. Results revealed a statistically significant interaction between quality of performance and the order in which the performances were rated. (Author/ABB)
Descriptors: Counselor Evaluation, Counselor Performance, Interrater Reliability, Observation
Peer reviewed
Ansorge, Charles J.; Scheer, John K. – Research Quarterly for Exercise and Sport, 1988
Analysis of gymnastics judges' scores of their own and other countries' gymnasts' performances during the 1984 Olympic Games indicated that the judges were biased in favor of their own country's gymnasts. (Author/CB)
Descriptors: Bias, Competition, Gymnastics, International Relations
Peer reviewed
Kane, Robert L.; And Others – Journal of Consulting and Clinical Psychology, 1987
Three experienced neuropsychologists rated brain-damaged and control subjects for brain damage using the Halstead-Reitan Battery and the Luria-Nebraska Neuropsychological Battery. Using either battery, raters were accurate in judging the presence of brain damage. There was a high degree of consistency between raters and test batteries when both…
Descriptors: Interrater Reliability, Neurological Impairments, Psychological Testing, Psychometrics
Peer reviewed
Cicchetti, Domenic V.; And Others – Educational and Psychological Measurement, 1984
This program computes multiple judge reliability levels under the following conditions: (1) different sets of judges perform the ratings; (2) the number of judges is a constant; and (3) the scale of measurement is nominal. (Author)
Descriptors: Computer Software, Interrater Reliability, Judgment Analysis Technique, Test Reliability
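The three conditions above (different judges per subject, a constant number of judges, nominal categories) are exactly those of Fleiss' multirater kappa. As an illustration of that standard computation (not Cicchetti's program itself, whose source is not shown here), a minimal sketch:

```python
def fleiss_kappa(counts):
    """Fleiss' multirater kappa for nominal ratings.

    counts[i][j] = number of judges assigning subject i to category j;
    every row must sum to the same number of judges n.
    """
    N = len(counts)        # subjects
    n = sum(counts[0])     # judges per subject (constant)
    k = len(counts[0])     # nominal categories

    # Observed agreement: mean proportion of agreeing judge pairs per subject.
    p_bar = sum(
        (sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts
    ) / N

    # Chance agreement from the observed (fixed) category marginals.
    p_j = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    p_e = sum(p * p for p in p_j)

    return (p_bar - p_e) / (1 - p_e)
```

For example, three judges rating three subjects into two categories, with row counts [[3, 0], [2, 1], [0, 3]], yields kappa = 0.55.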
Peer reviewed
Vance, B.; And Others – Psychology in the Schools, 1983
Investigated the interscorer reliability between a novice and a professional psychologist for the Minnesota Percepto-Diagnostic Test-Revised (MPDT-R), using a sample of 30 individuals. Results indicated that for three of the four MPDT-R scores there was a significant positive correlation between expert and novice scoring criteria. (JAC)
Descriptors: Experimenter Characteristics, Interrater Reliability, Psychological Evaluation, Psychologists
Randolph, Justus J. – Online Submission, 2005
Fleiss' popular multirater kappa is known to be influenced by prevalence and bias, which can lead to the paradox of high agreement but low kappa. It also assumes that raters are restricted in how they can distribute cases across categories, which is not a typical feature of many agreement studies. In this article, a free-marginal, multirater…
Descriptors: Multivariate Analysis, Statistical Distributions, Statistical Bias, Interrater Reliability
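The free-marginal variant described above differs from Fleiss' fixed-marginal kappa only in the chance-agreement term: when raters are not constrained to reproduce particular category marginals, chance agreement for k categories is simply 1/k. A minimal sketch of the published formula (not Randolph's own code):

```python
def free_marginal_kappa(counts):
    """Free-marginal multirater kappa (Randolph, 2005).

    counts[i][j] = number of raters assigning case i to category j;
    every row must sum to the same number of raters n.
    """
    N = len(counts)      # cases
    n = sum(counts[0])   # raters per case (constant)
    k = len(counts[0])   # categories

    # Observed agreement, computed exactly as in Fleiss' kappa.
    p_bar = sum(
        (sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts
    ) / N

    # Free marginals: chance agreement is 1/k regardless of the data.
    p_e = 1.0 / k
    return (p_bar - p_e) / (1 - p_e)
```

Because p_e no longer depends on skewed category prevalences, this variant avoids the high-agreement/low-kappa paradox the abstract mentions.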
Peer reviewed
Bartfay, Emma – International Journal of Testing, 2003
Used Monte Carlo simulation to compare the properties of a goodness-of-fit (GOF) procedure and a test statistic developed by E. Bartfay and A. Donner (2001) to the likelihood ratio test in assessing the existence of extra variation. Results show that the GOF procedure possesses a satisfactory Type I error rate and power. (SLD)
Descriptors: Goodness of Fit, Interrater Reliability, Monte Carlo Methods, Simulation
Peer reviewed
VanLeeuwen, Dawn M. – Journal of Agricultural Education, 1997
Generalizability Theory can be used to assess reliability in the presence of multiple sources and different types of error. It provides a flexible alternative to Classical Theory and can handle estimation of interrater reliability with any number of raters. (SK)
Descriptors: Error of Measurement, Generalizability Theory, Interrater Reliability, Measurement Techniques
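For the simplest fully crossed persons × raters design, the generalizability analysis described above reduces to estimating variance components from a two-way ANOVA and forming coefficients from them. A minimal one-facet sketch (an illustration of the standard G-study estimates, not VanLeeuwen's analysis; variable names are ours):

```python
def g_study(scores):
    """One-facet crossed G-study: scores[i][j] = rating of person i by rater j.

    Returns (relative G coefficient, absolute dependability) for a single
    rater, using expected-mean-square estimates of the variance components.
    """
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(row[j] for row in scores) / n_p for j in range(n_r)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_p = n_r * sum((m - grand) ** 2 for m in p_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in r_means)
    ss_tot = sum((x - grand) ** 2 for row in scores for x in row)
    ss_e = ss_tot - ss_p - ss_r

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_e = ss_e / ((n_p - 1) * (n_r - 1))

    # Variance-component estimates from the expected mean squares.
    var_p = (ms_p - ms_e) / n_r   # person (universe-score) variance
    var_r = (ms_r - ms_e) / n_p   # rater variance
    var_e = ms_e                  # residual variance

    g_rel = var_p / (var_p + var_e)           # relative decisions
    phi = var_p / (var_p + var_r + var_e)     # absolute decisions
    return g_rel, phi
```

The rater variance enters only the absolute (dependability) coefficient, which is how the framework separates the different error sources the abstract refers to; averaging over n' raters would divide the error components by n'.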
Peer reviewed
Horowitz, Leonard M.; And Others – Journal of Consulting and Clinical Psychology, 1989
Developed method for aggregating psychodynamic formulations of independent clinicians. Panels of clinicians observed videotaped interviews of patients and wrote individual formulations which were combined into consensual formulation. Other clinical raters read each consensual formulation and judged whether each problem was apt to be distressing…
Descriptors: Clinical Diagnosis, Interpersonal Relationship, Interrater Reliability, Psychological Evaluation
Peer reviewed
Tsui, Anne S.; Ohlott, Patricia – Personnel Psychology, 1988
To test a model of general managerial effectiveness, superiors (N=271), subordinates (N=605), and peers (N=469) rated 344 managers. The study, designed to test three specific hypotheses on criterion type and criterion weights, found consensus in the effectiveness models of superiors, subordinates, and peers. Consensus among different raters was high on both…
Descriptors: Administrator Effectiveness, Congruence (Psychology), Evaluation Problems, Interrater Reliability
Peer reviewed
Fabbris, Luigi; Gallo, Francesca – Educational and Psychological Measurement, 1993
New coefficients of agreement are suggested for the measure of intraclass consistency between observations on two variables. The coefficients are derived from a general coefficient for measuring intraclass dependence in a bivariate analysis context. Various coefficients for the univariate agreement analysis are shown to be cases of the suggested…
Descriptors: Correlation, Equations (Mathematics), Interrater Reliability, Judges
Peer reviewed
Corty, Eric; And Others – Journal of Consulting and Clinical Psychology, 1993
Examined interrater reliability of diagnoses made on basis of structured interview for psychiatric patients with and without psychoactive substance use disorders (PSUDs). Results from 47 pairs of ratings by 9 clinical interviewers revealed that interrater reliability for non-PSUD psychiatric diagnoses was quite high when patient had no diagnosable…
Descriptors: Clinical Diagnosis, Interrater Reliability, Patients, Psychiatric Hospitals
Peer reviewed
Kember, David; Jones, Alice; Loke, Alice; McKay, Jan; Sinclair, Kit; Tse, Harrison; Webb, Celia; Wong, Frances; Wong, Marian; Yeung, Ella – International Journal of Lifelong Education, 1999
A coding method for measuring reflective thinking in student journals was tested twice, demonstrating acceptable reliability among evaluators and supporting the precision of the guidelines for coding. Coding categories were as follows: habitual action, introspection, thoughtful action, content reflection, process reflection, content and process…
Descriptors: Adult Education, Coding, Evaluation Methods, Interrater Reliability
Peer reviewed
Berning, Lisa C.; Weed, Nathan C.; Aloia, Mark S. – Assessment, 1998
To examine the interrater reliability of the Ruff Figural Fluency Test (RFFT) (R. Ruff, 1988), 124 college students completed the measure and scored RFFT test protocols. Results indicated substantial interscorer reliability on the RFFT, particularly for number of unique designs. Reliability was lower for scoring perseverative errors and error…
Descriptors: College Students, Higher Education, Interrater Reliability, Scoring
Peer reviewed
Arnault, E. Jane; Gordon, Louis; Joines, Douglas H.; Phillips, G. Michael – Industrial and Labor Relations Review, 2001
Three commercial job evaluation firms rated the same set of 27 jobs. Statistical analysis indicated that evaluators differed in which job traits they used to evaluate inherent job worth. Comparable worth may thus be sensitive to the choice of evaluator. (Contains 24 references.) (Author/SK)
Descriptors: Comparable Worth, Evaluation Problems, Evaluators, Interrater Reliability