ERIC - Search Results

Showing 1 to 15 of 196 results

Comparison of Traditional Machine Learning and Neural Network Approaches for Automated Scoring of Second Language English Essays

Peer reviewed

Erik Voss – Language Testing, 2025

An increasing number of language testing companies are developing and deploying deep learning-based automated essay scoring systems (AES) to replace traditional approaches that rely on handcrafted feature extraction. However, there is hesitation to accept neural network approaches to automated essay scoring because the features are automatically…

Descriptors: Artificial Intelligence, Automation, Scoring, English (Second Language)

Artificial Intelligence in International English Language Testing System Writing Assessments: A Comparative Study of Human Ratings and DeepAI

Peer reviewed
PDF on ERIC

Somayeh Fathali; Fatemeh Mohajeri – Technology in Language Teaching & Learning, 2025

The International English Language Testing System (IELTS) is a high-stakes exam where Writing Task 2 significantly influences the overall scores, requiring reliable evaluation. While trained human raters perform this task, concerns about subjectivity and inconsistency have led to growing interest in artificial intelligence (AI)-based assessment…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Artificial Intelligence

Do Source Use Features Impact Raters' Judgment of Argumentation? An Experimental Study

Peer reviewed

Ping-Lin Chuang – Language Testing, 2025

This experimental study explores how source use features impact raters' judgment of argumentation in a second language (L2) integrated writing test. One hundred four experienced and novice raters were recruited to complete a rating task that simulated the scoring assignment of a local English Placement Test (EPT). Sixty written responses were…

Descriptors: Interrater Reliability, Evaluators, Information Sources, Primary Sources

ChatGPT4o as an AI Peer Assessor in EFL Speaking Classrooms: Examining Scoring Reliability and Feedback Effectiveness

Peer reviewed

Junfei Li; Jinyan Huang; Thomas Sheeran – SAGE Open, 2025

This study investigated the role of ChatGPT4o as an AI peer assessor in English-as-a-foreign-language (EFL) speaking classrooms, with a focus on its scoring reliability and the effectiveness of its feedback. The research involved 40 first-year English major students from two parallel classes at a Chinese university. Twenty from one class served as…

Descriptors: Artificial Intelligence, Technology Uses in Education, Peer Evaluation, English (Second Language)

Examining AI-Based Accuracy Assessment in L2 Learners' Writing

Peer reviewed

On-Soon Lee – Journal of Pan-Pacific Association of Applied Linguistics, 2024

Despite the increasing interest in using AI tools as assistant agents in instructional settings, the effectiveness of ChatGPT, the generative pretrained AI, for evaluating the accuracy of second language (L2) writing has been largely unexplored in formative assessment. Therefore, the current study aims to examine how ChatGPT, as an evaluator,…

Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning

The Rashomon Effect: Which Features of a Speaker's Talk Do Listeners Notice?

Peer reviewed

Seedhouse, Paul; Satar, Müge – Classroom Discourse, 2023

The same L2 speaking performance may be analysed and evaluated in very different ways by different teachers or raters. We present a new, technology-assisted research design which opens up to investigation the trajectories of convergence and divergence between raters. We tracked and recorded what different raters noticed when, whilst grading a…

Descriptors: Language Tests, English (Second Language), Second Language Learning, Oral Language

Using Inter-Rater Discourse to Trace the Origins of Disagreement: Towards Collective Reflective Practice in L2 Assessment

Peer reviewed

Matthews, Joshua – RELC Journal: A Journal of Language Teaching and Research, 2023

This article explores how the analysis of inter-rater discourse can be used to support collective reflective practice in second language (L2) assessment. To demonstrate, a focused case of the discourse between two experienced language teachers as they negotiate assessment decisions on L2 written texts is presented. Of particular interest was the…

Descriptors: Interrater Reliability, Discourse Analysis, Student Evaluation, Second Language Learning

Examining Rater Reliability When Using an Analytical Rubric for Oral Presentation Assessments

Peer reviewed
PDF on ERIC

Sasithorn Limgomolvilas; Patsawut Sukserm – LEARN Journal: Language Education and Acquisition Research Network, 2025

The assessment of English speaking in EFL environments can be inherently subjective and influenced by various factors beyond linguistic ability, including choice of assessment criteria, and even the rubric type. In classroom assessment, the type of rubric recommended for English speaking tasks is the analytical rubric. Driven by three aims, this…

Descriptors: Oral Language, Speech Communication, English (Second Language), Second Language Learning

Automated Sign Language Vocabulary Assessment: Comparing Human and Machine Ratings and Studying Learner Perceptions

Peer reviewed

Franz Holzknecht; Sandrine Tornay; Alessia Battisti; Aaron Olaf Batty; Katja Tissi; Tobias Haug; Sarah Ebling – Language Assessment Quarterly, 2024

Although automated spoken language assessment is rapidly growing, such systems have not been widely developed for signed languages. This study provides validity evidence for an automated web application that was developed to assess and give feedback on handshape and hand movement of L2 learners' Swiss German Sign Language signs. The study shows…

Descriptors: Sign Language, Vocabulary Development, Educational Assessment, Automation

Impact of Self-Construal on Rater Severity in Peer Assessments of Oral Presentations

Peer reviewed

Tanaka, Mitsuko; Ross, Steven J. – Assessment in Education: Principles, Policy & Practice, 2023

Raters vary from each other in their severity and leniency in rating performance. This study examined the factors affecting rater severity in peer assessments of oral presentations in English as a Foreign Language (EFL), focusing on peer raters' self-construal and presentation abilities. Japanese university students enrolled in EFL classes…

Descriptors: Evaluators, Interrater Reliability, Item Response Theory, Peer Evaluation

Examining Rater Biases of Peer Assessors in Different Assessment Environments

Peer reviewed
PDF on ERIC

Yesilçinar, Sabahattin; Sata, Mehmet – International Journal of Psychology and Educational Studies, 2021

The current study employed many-facet Rasch measurement (MFRM) to explain the rater bias patterns of EFL student teachers (hereafter students) when they rate the teaching performance of their peers in three assessment environments: online, face-to-face, and anonymous. Twenty-four students and two instructors rated 72 micro-teachings performed by…

Descriptors: Peer Evaluation, Preservice Teachers, English (Second Language), Second Language Learning

"How Do Raters Learn to Rate?" Many-Facet Rasch Modeling of Rater Performance over the Course of a Rater Certification Program

Peer reviewed

Yan, Xun; Chuang, Ping-Lin – Language Testing, 2023

This study employed a mixed-methods approach to examine how rater performance develops during a semester-long rater certification program for an English as a Second Language (ESL) writing placement test at a large US university. From 2016 to 2018, we tracked three groups of novice raters (n = 30) across four rounds in the certification program.…

Descriptors: Evaluators, Interrater Reliability, Item Response Theory, Certification

How Many Raters Can Be Enough: G Theory Applied to Assessment and Measurement of L2 Speech Perception

Peer reviewed
PDF on ERIC

Kevin Hirschi; Okim Kang – Language Teaching Research Quarterly, 2023

This paper extends the use of Generalizability Theory to the measurement of extemporaneous L2 speech through the lens of speech perception. Using six datasets of previous studies, it reports on "G studies"--a method of breaking down measurement variance--and "D studies"--a predictive study of the impact on reliability when…

Descriptors: Evaluators, Generalization, Evaluation Methods, Speech Communication

Meta-Analysis of Inter-Rater Agreement and Discrepancy Between Human and Automated English Essay Scoring

Peer reviewed
PDF on ERIC

Jiyeo Yun – English Teaching, 2023

Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…

Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring

Examining Consistency among Different Rubrics for Assessing Writing

Peer reviewed

Shabani, Enayat A.; Panahi, Jaleh – Language Testing in Asia, 2020

The literature on using scoring rubrics in writing assessment denotes the significance of rubrics as practical and useful means to assess the quality of writing tasks. This study tries to investigate the agreement among rubrics endorsed and used for assessing the essay writing tasks by the internationally recognized tests of English language…

Descriptors: Writing Evaluation, Scoring Rubrics, Scores, Interrater Reliability
