Schaefer, Edward – Language Testing, 2008
The present study employed multi-faceted Rasch measurement (MFRM) to explore the rater bias patterns of native English-speaker (NES) raters when they rate EFL essays. Forty NES raters rated 40 essays written by female Japanese university students on a single topic adapted from the TOEFL Test of Written English (TWE). The essays were assessed using…
Descriptors: Writing Evaluation, Writing Tests, Program Effectiveness, Essays
Peer reviewed
Schoonen, Rob; And Others – Language Testing, 1997
Reports on three studies conducted in the Netherlands about the reading reliability of lay and expert readers in rating content and language usage of students' writing performances in three kinds of writing assignments. Findings reveal that expert readers are more reliable in rating usage, whereas both lay and expert readers are reliable raters of…
Descriptors: Foreign Countries, Interrater Reliability, Language Usage, Models
Peer reviewed
Brown, Annie – Language Testing, 2003
Examines the question of variation among interviewers of oral language proficiency interviews in the ways that they elicit demonstrations of communicative ability and the impact of this variation on candidate performance and raters' perceptions of candidate ability. A discourse analysis of two interviews involving the same candidate with two…
Descriptors: Discourse Analysis, Interrater Reliability, Interviews, Language Proficiency
Peer reviewed
Henning, Grant – Language Testing, 1996
Analyzes simulated performance ratings on a six-point scale by two independent raters to account for nonsystematic error in performance ratings. Results suggest that rater agreement or covariance is not always a dependable estimate of score reliability and that the practice of seeking additional raters for adjudication of discrepant ratings is not…
Descriptors: Correlation, Error Patterns, Interrater Reliability, Language Tests
Kozaki, Y. – Language Testing, 2004
This article presents a standard-setting procedure for performance assessment in a foreign language, through which some of the major problems facing performance assessment in criterion-referenced testing can be addressed. The procedure, which was geared to revealing and accommodating inter-judge variability, employed the synergy of multiple…
Descriptors: Data Analysis, Testing, Performance Tests, Generalizability Theory
Elder, Catherine; Barkhuizen, Gary; Knoch, Ute; von Randow, Janet – Language Testing, 2007
The use of online rater self-training is growing in popularity and has obvious practical benefits, facilitating access to training materials and rating samples and allowing raters to reorient themselves to the rating scale and self-monitor their behaviour at their own convenience. However, there has thus far been little research into rater…
Descriptors: Writing Evaluation, Writing Tests, Scoring Rubrics, Rating Scales
Peer reviewed
Grant, Leslie – Language Testing, 1997
Describes current procedures used for testing bilingual teachers in the United States and focuses on one means of assessment used in Arizona. Examinee questionnaire responses, teacher questionnaire responses and test section analysis all contributed evidence for validity. (33 references) (Author/CK)
Descriptors: Bilingualism, Criterion Referenced Tests, Interrater Reliability, Language Teachers
Van Moere, Alistair – Language Testing, 2006
This article investigates a group oral test as administered at a university in Japan to determine whether it is appropriate to use scores for higher-stakes decision making. It is one component of an in-house English proficiency test used for placing students, evaluating their progress, and making informed decisions for the development of the English…
Descriptors: Foreign Countries, Generalizability Theory, Achievement Tests, English (Second Language)
Peer reviewed
Wigglesworth, Gillian – Language Testing, 1997
In this study, planning time was manipulated as a variable in a trial administration of a semi-direct oral interaction test. Discourse analytic techniques were used to determine the nature and/or significance of differences in the elicited discourse across two conditions in terms of complexity and accuracy. Findings suggest that planning time may…
Descriptors: Cognitive Development, Communicative Competence (Languages), Comparative Analysis, Discourse Analysis
