ERIC - Search Results

Showing all 13 results

Investigating Constructed-Response Scoring over Time: The Effects of Study Design on Trend Rescore Statistics. Research Report. ETS RR-22-15

Peer reviewed

Donoghue, John R.; McClellan, Catherine A.; Hess, Melinda R. – ETS Research Report Series, 2022

When constructed-response items are administered for a second time, it is necessary to evaluate whether the current Time B administration's raters have drifted from the scoring of the original administration at Time A. To study this, Time A papers are sampled and rescored by Time B scorers. Commonly the scores are compared using the proportion of…

Descriptors: Item Response Theory, Test Construction, Scoring, Testing

Metrics for Discrete Student Models: Chance Levels, Comparisons, and Use Cases

Peer reviewed

Bosch, Nigel; Paquette, Luc – Journal of Learning Analytics, 2018

Metrics including Cohen's kappa, precision, recall, and F₁ are common measures of performance for models of discrete student states, such as a student's affect or behaviour. This study examined discrete model metrics for previously published student model examples to identify situations where metrics provided differing perspectives on…

Descriptors: Models, Comparative Analysis, Prediction, Probability

Sources of Variance in the Accuracy of Interviewer Observations

Peer reviewed

West, Brady T.; Li, Dan – Sociological Methods & Research, 2019

In face-to-face surveys, interviewer observations are a cost-effective source of paradata for nonresponse adjustment of survey estimates and responsive survey designs. Unfortunately, recent studies have suggested that the accuracy of these observations can vary substantially among interviewers, even after controlling for household-, area-, and…

Descriptors: Observation, Interviews, Error of Measurement, Accuracy

Kappa and Rater Accuracy: Paradigms and Parameters

Peer reviewed

Conger, Anthony J. – Educational and Psychological Measurement, 2017

Drawing parallels to classical test theory, this article clarifies the difference between rater accuracy and reliability and demonstrates how category marginal frequencies affect rater agreement and Cohen's kappa. Category assignment paradigms are developed: comparing raters to a standard (index) versus comparing two raters to one another…

Descriptors: Interrater Reliability, Evaluators, Accuracy, Statistical Analysis

Coordinating Multiple Representations in a Reform Calculus Textbook

Peer reviewed

Chang, Briana L.; Cromley, Jennifer G.; Tran, Nhi – International Journal of Science and Mathematics Education, 2016

Coordination of multiple representations (CMR) is widely recognized as a critical skill in mathematics and is frequently demanded in reform calculus textbooks. However, little is known about the prevalence of coordination tasks in such textbooks. We coded 707 instances of CMR in a widely used reform calculus textbook and analyzed the distributions…

Descriptors: Calculus, Textbooks, Teaching Methods, Mathematics Instruction

Item Response Theory for Peer Assessment

Peer reviewed

Uto, Masaki; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2016

As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. However, in peer assessment, a problem remains that reliability depends on the rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. Those models are expected to improve…

Descriptors: Item Response Theory, Peer Evaluation, Bayesian Statistics, Simulation

Utilizing Generalizability Theory to Investigate the Reliability of the Grades Assigned to Undergraduate Research Papers

Peer reviewed

Gugiu, Mihaiela R.; Gugiu, Paul C.; Baldus, Robert – Journal of MultiDisciplinary Evaluation, 2012

Background: Educational researchers have long espoused the virtues of writing with regard to student cognitive skills. However, research on the reliability of the grades assigned to written papers reveals a high degree of contradiction, with some researchers concluding that the grades assigned are very reliable whereas others suggesting that they…

Descriptors: Grades (Scholastic), Grading, Scoring Rubrics, Research Design

Rater Experience, Rating Scale Length, and Judgments of L2 Pronunciation: Revisiting Research Conventions

Peer reviewed

Isaacs, Talia; Thomson, Ron I. – Language Assessment Quarterly, 2013

This mixed-methods study examines the effects of rating scale length and rater experience on listeners' judgments of second-language (L2) speech. Twenty experienced and 20 novice raters, who were randomly assigned to 5-point or 9-point rating scale conditions, judged speech samples of 38 newcomers to Canada on numerical rating scales for…

Descriptors: Foreign Countries, Adults, Second Language Learning, English (Second Language)

IRR (Inter-Rater Reliability) of a COP (Classroom Observation Protocol)--A Critical Appraisal

Rui, Ning; Feldman, Jill M. – Online Submission, 2012

Notwithstanding broad utility of COPs (classroom observation protocols), there has been limited documentation of the psychometric properties of even the most popular COPs. This study attempted to fill this void by closely examining the item and domain-level IRR (inter-rater reliability) of a COP that was used in a federally funded striving readers…

Descriptors: Classroom Observation Techniques, Interrater Reliability, Correlation, Psychometrics

Misunderstanding and Misrepresentation: A Reply to Hutchison and Schagen

Peer reviewed

Gorard, Stephen – International Journal of Research & Method in Education, 2009

The author previously published a paper discussing how to conduct an analysis based on a cluster sample. In that paper, the author outlined several widely adopted alternative approaches, and pointed out that such approaches are anyway not needed for population figures, and not possible for non-probability samples. Thus, the author queried the…

Descriptors: Probability, Misconceptions, Reader Response, Research Methodology

Factors that Influence Fast Mapping in Children Exposed to Spanish and English

Peer reviewed

Alt, Mary; Meyers, Christina; Figueroa, Cecilia – Journal of Speech, Language, and Hearing Research, 2013

Purpose: The purpose of this study was to determine whether children exposed to 2 languages would benefit from the phonotactic probability cues of a single language in the same way as monolingual peers and to determine whether crosslinguistic influence would be present in a fast-mapping task. Method: Two groups of typically developing children…

Descriptors: Regression (Statistics), Spanish, Cues, Task Analysis

The Gifted Rating Scales-School Form: A Validation Study Based on Age, Gender, and Race

Peer reviewed

Pfeiffer, Steven; Petscher, Yaacov; Kumtepe, Alper – Roeper Review, 2008

This study examined the internal consistency and validity of a new rating scale to identify gifted students, the Gifted Rating Scales-School Form (GRS-S). The study explored the effect of gender, race/ethnicity, age, and rater familiarity on GRS-S ratings. One hundred twenty-two students in first to eighth grade from elementary and middle schools…

Descriptors: Ethnicity, Middle Schools, Academically Gifted, Talent

Proceedings of the International Association for Development of the Information Society (IADIS) International Conference on Cognition and Exploratory Learning in Digital Age (CELDA) (Madrid, Spain, October 19-21, 2012)

International Association for Development of the Information Society, 2012

The IADIS CELDA 2012 Conference intention was to address the main issues concerned with evolving learning processes and supporting pedagogies and applications in the digital age. There had been advances in both cognitive psychology and computing that have affected the educational arena. The convergence of these two disciplines is increasing at a…

Descriptors: Academic Achievement, Academic Persistence, Academic Support Services, Access to Computers
