Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Attali, Yigal; Lewis, Will; Steier, Michael – Language Testing, 2013
Automated essay scoring can produce reliable scores that are highly correlated with human scores, but is limited in its evaluation of content and other higher-order aspects of writing. The increased use of automated essay scoring in high-stakes testing underscores the need for human scoring that is focused on higher-order aspects of writing. This…
Descriptors: Scoring, Essay Tests, Reliability, High Stakes Tests

Peer reviewed
