Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 3 |
| Since 2007 (last 20 years) | 4 |
Descriptor
| Accuracy | 4 |
| Interrater Reliability | 4 |
| Evaluators | 3 |
| Expertise | 2 |
| Foreign Countries | 2 |
| Generalizability Theory | 2 |
| Language Tests | 2 |
| Scores | 2 |
| Ability | 1 |
| Certification | 1 |
| Communicative Competence… | 1 |
Source
| Language Testing | 4 |
Author
| Attali, Yigal | 1 |
| Duijm, Klaartje | 1 |
| Hulstijn, Jan H. | 1 |
| Lin, Chih-Kai | 1 |
| Oostdam, Ron J. | 1 |
| Schoonen, Rob | 1 |
| de Jong, Nivja H. | 1 |
| van Batenburg, Eline S. L. | 1 |
| van Gelderen, Amos J. S. | 1 |
Publication Type
| Journal Articles | 4 |
| Reports - Research | 4 |
| Tests/Questionnaires | 1 |
Education Level
| Secondary Education | 1 |
Location
| Netherlands | 2 |
Duijm, Klaartje; Schoonen, Rob; Hulstijn, Jan H. – Language Testing, 2018
It is general practice to use rater judgments in speaking proficiency testing. However, it has been shown that raters' knowledge and experience may influence their ratings, both in terms of leniency and varied focus on different aspects of speech. The purpose of this study is to identify raters' relative responsiveness to fluency and linguistic…
Descriptors: Language Fluency, Accuracy, Second Languages, Language Tests
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
van Batenburg, Eline S. L.; Oostdam, Ron J.; van Gelderen, Amos J. S.; de Jong, Nivja H. – Language Testing, 2018
This article explores ways to assess interactional performance, and reports on the use of a test format that standardizes the interlocutor's linguistic and interactional contributions to the exchange. It describes the construction and administration of six scripted speech tasks (instruction, advice, and sales tasks) with pre-vocational learners (n…
Descriptors: Second Language Learning, Speech Tests, Interaction, Test Reliability
Attali, Yigal – Language Testing, 2016
A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session and 14 trainees who passed an…
Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators

