| Publication Date | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 58 |
| Since 2022 (last 5 years) | 284 |
| Since 2017 (last 10 years) | 780 |
| Since 2007 (last 20 years) | 2042 |
| Descriptor | Count |
| --- | --- |
| Interrater Reliability | 3124 |
| Foreign Countries | 655 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
| Audience | Count |
| --- | --- |
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
| Location | Count |
| --- | --- |
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 25 |
| Taiwan | 23 |
| Germany | 22 |
| What Works Clearinghouse Rating | Count |
| --- | --- |
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Froman, Richard L., Jr. – 1988
The reliability of a taxonomy of humor was tested in two studies. The first study involved rater identification of nine categories for humorous incidents excerpted from television comedy programs (wordplay, exaggeration/understatement, contrast, audience knowledge, aggression, emotion, taboo, pratfall/slapstick, and repetition). The second study,…
Descriptors: Classification, Humor, Interrater Reliability, Psychometrics
Brown, R. L. – 1987
This paper explores the use of K. G. Joreskog's (1970) congeneric modeling approach to reliability using censored quantitative variables. Two Monte Carlo studies were conducted. The first explored the robustness of Normal Theory Generalized Least-Squares (NTGLS) estimates for a single-factor congeneric model across several sample sizes…
Descriptors: Interrater Reliability, Monte Carlo Methods, Sample Size
Peer reviewed: Whitehurst, Grover J. – American Psychologist, 1984
Holds that interrater agreement for journal manuscript reviews has seemed unacceptably low because it has been assessed using techniques such as the intraclass correlation, which compares error variance with the variance due to manuscripts. Describes and recommends an alternative approach for computing interrater agreement. (GC)
Descriptors: Interrater Reliability, Periodicals, Psychological Studies, Statistical Analysis
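The intraclass-correlation approach that the abstract above takes issue with can be sketched as a one-way random-effects ICC. The following is a minimal illustration of that standard index, not Whitehurst's recommended alternative; the function name and data layout are illustrative.

```python
# A minimal sketch of a one-way random-effects intraclass correlation,
# ICC(1). `scores` is a list of subjects (e.g. manuscripts), each a list
# of the same number of raters' scores.

def icc1(scores):
    n = len(scores)                    # number of subjects
    k = len(scores[0])                 # raters per subject
    grand = sum(sum(row) for row in scores) / (n * k)
    means = [sum(row) / k for row in scores]

    # Between-subjects and within-subject mean squares from one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2
              for row, m in zip(scores, means)
              for x in row) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)
```

When raters agree perfectly within subjects and subjects differ, the within-subject mean square is zero and the index is 1; when all the variance is disagreement between raters, the index turns negative, which is part of why agreement-focused alternatives have been proposed.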
Peer reviewed: Collis, Glyn M. – Educational and Psychological Measurement, 1985
Some suggestions for measuring marginal symmetry in agreement matrices for categorical data are discussed, together with measures of item-by-item agreement conditional on marginal asymmetry. Connections with intraclass correlations for dichotomous data are noted. (Author)
Descriptors: Correlation, Interrater Reliability, Item Analysis, Matrices
Peer reviewed: Li, Mao-Neng Fred; Lautenschlager, Gary – Educational and Psychological Measurement, 1997
Illustrates a link between the multiple-rater kappa of J. Fleiss (1971) or other analogues and the generalizability (G) coefficient for a single-facet design, and discusses the use and interpretation of G theory in the study of interrater agreement when data are measured on a nominal scale. (SLD)
Descriptors: Classification, Generalizability Theory, Interrater Reliability, Research Design
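The multiple-rater kappa of Fleiss (1971) referenced above can be sketched briefly. This is a generic illustration of that statistic, not the authors' SAS implementation; the data layout is assumed for the example.

```python
# A minimal sketch of Fleiss' (1971) multiple-rater kappa for nominal
# data. `counts` is an N x k matrix: counts[i][j] is the number of raters
# who assigned subject i to category j; every row sums to the same
# number of raters n.

def fleiss_kappa(counts):
    N = len(counts)            # number of subjects
    k = len(counts[0])         # number of categories
    n = sum(counts[0])         # raters per subject (assumed constant)

    # Proportion of all assignments falling in each category.
    p_j = [sum(row[j] for row in counts) / (N * n) for j in range(k)]

    # Observed pairwise agreement per subject, averaged over subjects.
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    P_bar = sum(P_i) / N

    # Agreement expected by chance from the marginal proportions.
    P_e = sum(p * p for p in p_j)

    return (P_bar - P_e) / (1 - P_e)
```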
Peer reviewed: Li, Mao-Neng Fred; Lautenschlager, Gary J. – Educational and Psychological Measurement, 1999
Describes a Statistical Analysis System (SAS) MACRO for computing various indices of interrater agreement, including a new generalizability coefficient, for categorical data in a single-facet, crossed design. (Author/SLD)
Descriptors: Classification, Generalizability Theory, Interrater Reliability, Qualitative Research
Peer reviewed: Lindell, Michael K.; Brandt, Christina J.; Whitney, David J. – Applied Psychological Measurement, 1999
Proposes a revised index of interrater agreement for multi-item ratings of a single target. This index is an inverse linear function of the ratio of the average obtained variance to the variance of the uniformly distributed random error. Discusses the importance of sample size for the index. (SLD)
Descriptors: Error of Measurement, Interrater Reliability, Sample Size
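The index described in the abstract above — an inverse linear function of the ratio of the average obtained variance to the uniform-error variance — can be sketched as follows. This is an illustration of that description, not the authors' own code; the variance convention (population vs. sample) varies in the agreement literature and is an assumption here.

```python
# A minimal sketch of a revised within-group agreement index of the form
# 1 - (mean observed item variance / variance of a uniform random-response
# distribution). `ratings` is a list of items, each a list of the raters'
# scores on a discrete scale with `scale_points` response options.

def r_star_wg(ratings, scale_points):
    # Variance of a discrete uniform distribution over A response options.
    sigma2_eu = (scale_points ** 2 - 1) / 12

    def var(xs):  # population variance (convention assumed for this sketch)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    mean_obs_var = sum(var(item) for item in ratings) / len(ratings)
    return 1 - mean_obs_var / sigma2_eu
```

Perfect agreement on an item gives zero observed variance and an index of 1; ratings spread evenly across the scale give an index near 0.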
Schuster, Christof; Smith, David A. – Psychometrika, 2005
The rater agreement literature is complicated by the fact that it must accommodate at least two different properties of rating data: the number of raters (two versus more than two) and the rating scale level (nominal versus metric). While kappa statistics are most widely used for nominal scales, intraclass correlation coefficients have been…
Descriptors: Psychometrics, Statistics, Rating Scales, Correlation
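For the two-rater nominal case the abstract contrasts with the metric case, the most widely used kappa statistic is Cohen's kappa; a minimal sketch (not Schuster and Smith's own method) is:

```python
# A minimal sketch of Cohen's kappa for two raters' nominal
# classifications of the same n subjects.

def cohen_kappa(r1, r2):
    n = len(r1)
    cats = set(r1) | set(r2)

    # Observed proportion of agreement.
    p_o = sum(a == b for a, b in zip(r1, r2)) / n

    # Chance agreement from each rater's marginal category proportions.
    p_e = sum((r1.count(c) / n) * (r2.count(c) / n) for c in cats)

    return (p_o - p_e) / (1 - p_e)
```

Identical classifications yield kappa of 1, while agreement no better than the raters' marginals predict yields 0.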
Baird, Jo-Anne; Greatorex, Jackie; Bell, John F. – Assessment in Education: Principles, Policy & Practice, 2004
Marking reliability is purported to be produced by having an effective community of practice. No experimental research has been identified which attempts to verify empirically the aspects of a community of practice that have been observed to produce marking reliability. This research outlines what that community of practice might entail and…
Descriptors: Foreign Countries, Grades (Scholastic), Grading, Interrater Reliability
Munson, Benjamin; Brinkman, Kayla N. – American Journal of Speech-Language Pathology, 2004
Two experiments examined whether listening to multiple presentations of recorded speech stimuli influences the reliability and accuracy of judgments of children's speech production accuracy. In Experiment 1, 10 listeners phonetically transcribed words produced by children with phonological impairments after a single presentation and after the word…
Descriptors: Speech, Children, Phonetics, Speech Impairments
Roberts, Felicia; Robinson, Jeffrey D. – Human Communication Research, 2004
This investigation assesses interobserver agreement on conversation analytic (CA) transcription. Four professional CA transcribers spent a maximum of 3 hours transcribing 2.5 minutes of a previously unknown, naturally occurring, mundane telephone call. Researchers unitized transcripts into words, sounds, silences, inbreaths, outbreaths, and laugh…
Descriptors: Interrater Reliability, Discourse Analysis, Semantics, Pragmatics
Fleming, Judith A.; Taylor, Janeen McCracken; Carran, Deborah – Assessment for Effective Intervention, 2004
This article offers an alternative methodology for practitioners and researchers to use in establishing interrater reliability for testing purposes. The majority of studies on interrater reliability use a traditional methodology whereby two raters are compared using a Pearson product-moment correlation. This traditional method of estimating…
Descriptors: Interrater Reliability, Methods, Correlation, Evaluation Methods
Schuster, Christof; Smith, David A. – Educational and Psychological Measurement, 2006
Because nominal-scale judgments cannot directly be aggregated into meaningful composites, the addition of a second rater is usually motivated by a desire to estimate the quality of a single rater's classifications rather than to improve reliability. When raters agree, the aggregation problem does not arise. Nevertheless, a proportion of this…
Descriptors: Models, Interrater Reliability, Measures (Individuals), Evaluation Criteria
Millar, Dorothy Squatrito – Education and Training in Developmental Disabilities, 2009
IEP transition-related content was compared between young adults with developmental disabilities who had or did not have legal guardians. It was found that students with guardians were more likely to earn a certificate of completion and to want to remain living with their families, in comparison to students without guardians, who were more likely…
Descriptors: Developmental Disabilities, Young Adults, Individualized Education Programs, Self Determination
Coniam, David – ReCALL, 2009
This paper describes a study of the computer essay-scoring program BETSY. While the use of computers in rating written scripts has been criticised in some quarters for lacking transparency or for its lack of fit with how human raters rate written scripts, a number of essay rating programs are available commercially, many of which claim to offer comparable…
Descriptors: Writing Tests, Scoring, Foreign Countries, Interrater Reliability