ERIC - Search Results

Showing all 6 results

Comparison of Traditional Machine Learning and Neural Network Approaches for Automated Scoring of Second Language English Essays

Peer reviewed

Erik Voss – Language Testing, 2025

An increasing number of language testing companies are developing and deploying deep learning-based automated essay scoring systems (AES) to replace traditional approaches that rely on handcrafted feature extraction. However, there is hesitation to accept neural network approaches to automated essay scoring because the features are automatically…

Descriptors: Artificial Intelligence, Automation, Scoring, English (Second Language)

Communal Factors in Rater Severity and Consistency over Time in High-Stakes Oral Assessment

Peer reviewed

Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024

This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated…

Descriptors: Foreign Countries, Interrater Reliability, Evaluators, Item Response Theory

Examining the Consistency of Instructor versus Large Language Model Ratings on Summary Content: Toward Checklist-Based Feedback Provision with Second Language Writers

Peer reviewed

Yasuyo Sawaki; Yutaka Ishii; Hiroaki Yamada; Takenobu Tokunaga – Language Testing, 2025

This study examined the consistency between instructor ratings of learner-generated summaries and those estimated by a large language model (LLM) on summary content checklist items designed for undergraduate second language (L2) writing instruction in Japan. The effects of the LLM prompt design on the consistency between the two were also explored…

Descriptors: Interrater Reliability, Writing Teachers, College Faculty, Artificial Intelligence

All Types of Experience Are Equal, but Some Are More Equal: The Effect of Different Types of Experience on Rater Severity and Rater Consistency

Peer reviewed

Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024

This article focuses on rater severity and consistency and their relation to different types of rater experience over a long period of time. The article is based on longitudinal data collected from 2009 to 2019 from the second language Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. The study investigated…

Descriptors: Foreign Countries, Interrater Reliability, Error of Measurement, Experience

Do Source Use Features Impact Raters' Judgment of Argumentation? An Experimental Study

Peer reviewed

Ping-Lin Chuang – Language Testing, 2025

This experimental study explores how source use features impact raters' judgment of argumentation in a second language (L2) integrated writing test. One hundred four experienced and novice raters were recruited to complete a rating task that simulated the scoring assignment of a local English Placement Test (EPT). Sixty written responses were…

Descriptors: Interrater Reliability, Evaluators, Information Sources, Primary Sources

"How Do Raters Learn to Rate?" Many-Facet Rasch Modeling of Rater Performance over the Course of a Rater Certification Program

Peer reviewed

Xun Yan; Ping-Lin Chuang – Language Testing, 2023

This study employed a mixed-methods approach to examine how rater performance develops during a semester-long rater certification program for an English as a Second Language (ESL) writing placement test at a large US university. From 2016 to 2018, we tracked three groups of novice raters (n = 30) across four rounds in the certification program.…

Descriptors: Evaluators, Interrater Reliability, Item Response Theory, Certification
