The Intersection of AI and Language Assessment: A Study on the Reliability of ChatGPT in Grading IELTS Writing Task 2.

Osama Koraishi

Peer reviewed
ERIC Number: EJ1457168

Record Type: Journal

Publication Date: 2024

Pages: 21

Abstractor: As Provided

ISBN: N/A

ISSN: N/A

EISSN: 2667-6753

Available Date: N/A

Language Teaching Research Quarterly, v43 p22-42 2024

This study presents a quantitative evaluation of OpenAI's language model, ChatGPT 4, as a grader of IELTS Writing Task 2 responses. The objective is to assess how closely ChatGPT's grades align with those of official human raters. The analysis compared mean scores and computed reliability measures, including Cohen's weighted kappa and the intraclass correlation. The results showed close agreement in means and substantial reliability between the two grading methods for the majority of texts. However, individual discrepancies and outliers were also identified, underscoring the nuanced nature of grading. While ChatGPT was efficient and generally aligned with human grading, the study concludes that it should not replace human judgment, particularly because of these observed inconsistencies. The findings offer insight into the potential and limitations of AI in educational grading and emphasize the importance of comprehensive quantitative evaluation.
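For readers unfamiliar with the agreement statistic named in the abstract, the following is a minimal sketch of Cohen's weighted kappa for two raters scoring the same essays. It is an illustration under stated assumptions, not the study's procedure: the abstract does not say which weighting scheme was used (quadratic weights are assumed here), the function name `weighted_kappa` is hypothetical, and the band scores in the usage example are invented for demonstration and are not data from the article.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, power=2):
    """Cohen's weighted kappa for two raters on an ordered scale.

    power=2 gives quadratic weights, power=1 linear weights.
    The choice of quadratic weights is an assumption; the abstract
    does not state which scheme the study used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of texts")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two ordered categories are required")

    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Joint counts of (rater A category, rater B category) and each rater's marginals.
    joint = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, 1 at maximum distance.
        return (abs(i - j) / (k - 1)) ** power

    # Weighted observed disagreement.
    observed = sum(weight(i, j) * c / n for (i, j), c in joint.items())
    # Weighted disagreement expected if the raters were independent.
    expected = sum(
        weight(i, j) * marg_a[i] * marg_b[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed / expected


# IELTS band scale in half-band steps from 0.0 to 9.0.
IELTS_BANDS = [x / 2 for x in range(19)]

if __name__ == "__main__":
    # Hypothetical scores for illustration only; NOT data from the study.
    human = [6.0, 6.5, 7.0, 5.5, 8.0, 6.0]
    chatgpt = [6.0, 7.0, 7.0, 5.0, 7.5, 6.5]
    print(f"quadratic weighted kappa: {weighted_kappa(human, chatgpt, IELTS_BANDS):.3f}")
```

Weighted kappa suits band scores because it penalizes a two-band disagreement more heavily than a half-band one, which unweighted kappa would treat identically.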

Descriptors: Second Language Learning, English (Second Language), Language Tests, Artificial Intelligence, Computer Software, Writing Evaluation, Writing Tests, Comparative Analysis, Grading, Efficiency, Technology Uses in Education, Correlation, Reliability, Evaluators, Natural Language Processing, Language Proficiency

European Knowledge Development (EUROKD). e-mail: editorial@eurokd.com; Web site: https://www.eurokd.com/journal/jd/1

Publication Type: Journal Articles; Reports - Research

Education Level: N/A

Audience: N/A

Language: English

Sponsor: N/A

Authoring Institution: N/A

Identifiers - Assessments and Surveys: International English Language Testing System

Grant or Contract Numbers: N/A

Author Affiliations: N/A