Publication Date
| Range | Results |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 10 |
| Since 2022 (last 5 years) | 24 |
| Since 2017 (last 10 years) | 33 |
| Since 2007 (last 20 years) | 43 |
Descriptor
| Descriptor | Results |
| --- | --- |
| Grading | 43 |
| Natural Language Processing | 43 |
| Artificial Intelligence | 27 |
| Automation | 18 |
| Computer Assisted Testing | 14 |
| Technology Uses in Education | 14 |
| Feedback (Response) | 12 |
| Foreign Countries | 10 |
| Student Evaluation | 10 |
| Educational Technology | 9 |
| Intelligent Tutoring Systems | 8 |
Author
| Author | Results |
| --- | --- |
| Cai, Zhiqiang | 3 |
| Hang Li | 2 |
| Hu, Xiangen | 2 |
| Jiliang Tang | 2 |
| Joseph Krajcik | 2 |
| Kaiqi Yang | 2 |
| Yasemin Copur-Gencturk | 2 |
| Yucheng Chu | 2 |
| Abdulkadir Kara | 1 |
| Abubakir Siedahmed | 1 |
| Agustín Garagorry Guerra | 1 |
Audience
| Audience | Results |
| --- | --- |
| Administrators | 1 |
| Researchers | 1 |
| Students | 1 |
| Teachers | 1 |
Location
| Location | Results |
| --- | --- |
| Australia | 2 |
| Canada | 2 |
| India | 2 |
| Slovenia | 2 |
| Africa | 1 |
| Brazil | 1 |
| California | 1 |
| China | 1 |
| Finland | 1 |
| France | 1 |
| Georgia (Atlanta) | 1 |
Assessments and Surveys
| Assessment | Results |
| --- | --- |
| International English… | 1 |
| Program for International… | 1 |
| Test of English as a Foreign… | 1 |
Da-Wei Zhang; Melissa Boey; Yan Yu Tan; Alexis Hoh Sheng Jia – npj Science of Learning, 2024
This study evaluates the ability of large language models (LLMs) to deliver criterion-based grading and examines the impact of prompt engineering with detailed criteria on grading. Using well-established human benchmarks and quantitative analyses, we found that even free LLMs achieve criterion-based grading with a detailed understanding of the…
Descriptors: Artificial Intelligence, Natural Language Processing, Criterion Referenced Tests, Grading
Michel C. Desmarais; Arman Bakhtiari; Ovide Bertrand Kuichua Kandem; Samira Chiny Folefack Temfack; Chahé Nerguizian – International Educational Data Mining Society, 2025
We propose a novel method for automated short answer grading (ASAG) designed for practical use in real-world settings. The method combines LLM embedding similarity with a nonlinear regression function, enabling accurate prediction from a small number of expert-graded responses. In this use case, a grader manually assesses a few responses, while…
Descriptors: Grading, Automation, Artificial Intelligence, Natural Language Processing
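The similarity-plus-regression pipeline this abstract describes can be sketched roughly as follows. The embedding vectors, anchor grades, and the piecewise-linear interpolation used as the nonlinear fit are all illustrative stand-ins, not the authors' actual implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# A few expert-graded responses, each reduced to
# (similarity to the reference answer, grade assigned by the grader).
anchors = sorted([(0.35, 0.0), (0.62, 0.5), (0.80, 0.8), (0.93, 1.0)])

def predict_grade(sim):
    """Map embedding similarity to a grade by piecewise-linear
    interpolation through the expert-graded anchors (a simple
    monotone stand-in for the paper's nonlinear regression)."""
    if sim <= anchors[0][0]:
        return anchors[0][1]
    for (s0, g0), (s1, g1) in zip(anchors, anchors[1:]):
        if sim <= s1:
            return g0 + (g1 - g0) * (sim - s0) / (s1 - s0)
    return anchors[-1][1]

# Toy vectors standing in for LLM embeddings of the two answers.
reference = [0.2, 0.7, 0.1]
student = [0.25, 0.6, 0.2]
print(round(predict_grade(cosine(reference, student)), 2))
```

The appeal of this setup, as the abstract notes, is that only a handful of expert-graded responses are needed to calibrate the mapping from similarity to grade.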
Owen Henkel; Libby Hills; Bill Roberts; Joshua McGrane – International Journal of Artificial Intelligence in Education, 2025
Formative assessment plays a critical role in improving learning outcomes by providing feedback on student mastery. Open-ended questions, which require students to produce multi-word, nontrivial responses, are a popular tool for formative assessment as they provide more specific insights into what students do and do not know. However, grading…
Descriptors: Artificial Intelligence, Grading, Reading Comprehension, Natural Language Processing
Smitha S. Kumar; Michael A. Lones; Manuel Maarek; Hind Zantout – ACM Transactions on Computing Education, 2025
Programming demands a variety of cognitive skills, and mastering these competencies is essential for success in computer science education. The importance of formative feedback is well acknowledged in programming education, and thus, a diverse range of techniques has been proposed to generate and enhance formative feedback for programming…
Descriptors: Automation, Computer Science Education, Programming, Feedback (Response)
Abdulkadir Kara; Eda Saka Simsek; Serkan Yildirim – Asian Journal of Distance Education, 2024
Evaluation is an essential component of the learning process when discerning learning situations. Assessing natural language responses, like short answers, takes time and effort. Artificial intelligence and natural language processing advancements have led to more studies on automatically grading short answers. In this review, we systematically…
Descriptors: Automation, Natural Language Processing, Artificial Intelligence, Grading
Putnikovic, Marko; Jovanovic, Jelena – IEEE Transactions on Learning Technologies, 2023
Automatic grading of short answers is an important task in computer-assisted assessment (CAA). Recently, embeddings, as semantic-rich textual representations, have been increasingly used to represent short answers and predict the grade. Despite the recent trend of applying embeddings in automatic short answer grading (ASAG), there are no…
Descriptors: Automation, Computer Assisted Testing, Grading, Natural Language Processing
Abubakir Siedahmed; Jaclyn Ocumpaugh; Zelda Ferris; Dinesh Kodwani; Eamon Worden; Neil Heffernan – International Educational Data Mining Society, 2025
Recent advances in AI have opened the door for the automated scoring of open-ended math problems, which were previously much more difficult to assess at scale. However, we know that biases still remain in some of these algorithms. For example, recent research on the automated scoring of student essays has shown that certain varieties of English…
Descriptors: Artificial Intelligence, Automation, Scoring, Mathematics Tests
Yucheng Chu; Hang Li; Kaiqi Yang; Harry Shomer; Yasemin Copur-Gencturk; Leonora Kaldaras; Kevin Haudek; Joseph Krajcik; Namsoo Shin; Hui Liu; Jiliang Tang – International Educational Data Mining Society, 2025
Open-text responses provide researchers and educators with rich, nuanced insights that multiple-choice questions cannot capture. When reliably assessed, such responses have the potential to enhance teaching and learning. However, scaling and consistently capturing these nuances remain significant challenges, limiting the widespread use of…
Descriptors: Grading, Automation, Artificial Intelligence, Natural Language Processing
Yucheng Chu; Peng He; Hang Li; Haoyu Han; Kaiqi Yang; Yu Xue; Tingting Li; Yasemin Copur-Gencturk; Joseph Krajcik; Jiliang Tang – International Educational Data Mining Society, 2025
Short answer assessment is a vital component of science education, allowing evaluation of students' complex three-dimensional understanding. Large language models (LLMs) that possess human-like ability in linguistic tasks are increasingly popular in assisting human graders to reduce their workload. However, LLMs' limitations in domain knowledge…
Descriptors: Artificial Intelligence, Science Education, Technology Uses in Education, Natural Language Processing
Naima Debbar – International Journal of Contemporary Educational Research, 2024
Intelligent systems of essay grading constitute important tools for educational technologies. They can significantly replace the manual scoring efforts and provide instructional feedback as well. These systems typically include two main parts: a feature extractor and an automatic grading model. The latter is generally based on computational and…
Descriptors: Test Scoring Machines, Computer Uses in Education, Artificial Intelligence, Essay Tests
Mengqian Wang; Wenge Guo – ECNU Review of Education, 2025
This review compares generative artificial intelligence with five representative educational technologies in history and concludes that AI technology can become a knowledge producer and thus can be utilized as educative AI to enhance teaching and learning outcomes. From a historical perspective, each technological breakthrough has affected…
Descriptors: Artificial Intelligence, Man Machine Systems, Natural Language Processing, History
Lixiang Yan; Lele Sha; Linxuan Zhao; Yuheng Li; Roberto Martinez-Maldonado; Guanliang Chen; Xinyu Li; Yueqiao Jin; Dragan Gašević – British Journal of Educational Technology, 2024
Educational technology innovations leveraging large language models (LLMs) have shown the potential to automate the laborious process of generating and analysing textual content. While various innovations have been developed to automate a range of educational tasks (eg, question generation, feedback provision, and essay grading), there are…
Descriptors: Educational Technology, Artificial Intelligence, Natural Language Processing, Educational Innovation
Jussi S. Jauhiainen; Agustín Garagorry Guerra – Innovations in Education and Teaching International, 2025
The study highlights ChatGPT-4's potential in educational settings for the evaluation of university students' open-ended written examination responses. ChatGPT-4 evaluated 54 written responses, ranging from 24 to 256 words in English. It assessed each response using five criteria and assigned a grade on a six-point scale from fail to excellent,…
Descriptors: Artificial Intelligence, Technology Uses in Education, Student Evaluation, Writing Evaluation
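A rubric-driven grading prompt of the kind this study describes might be assembled along these lines. The five criteria named below are hypothetical placeholders, since the abstract does not list the study's actual criteria; only the six-point fail-to-excellent scale comes from the abstract:

```python
# Placeholder criteria for illustration; the study's actual five
# criteria are not given in the abstract.
CRITERIA = ["accuracy", "completeness", "coherence", "use of evidence", "clarity"]
# Six-point scale from fail to excellent, as described in the abstract.
SCALE = ["fail", "poor", "satisfactory", "good", "very good", "excellent"]

def build_grading_prompt(question: str, answer: str) -> str:
    """Assemble an instruction asking an LLM to grade one open-ended response."""
    rubric = "\n".join(f"- {c}" for c in CRITERIA)
    return (
        "You are grading a university student's open-ended exam response.\n"
        "Assess the response on each of the following criteria:\n"
        f"{rubric}\n"
        "Then assign one overall grade from this six-point scale: "
        f"{', '.join(SCALE)}.\n\n"
        f"Question: {question}\n"
        f"Student response: {answer}"
    )

print(build_grading_prompt("Define formative assessment.",
                           "Assessment used during learning to give feedback."))
```

Spelling out the rubric and the scale in the prompt, rather than asking for an unconstrained judgment, is what makes the model's grades comparable across responses.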
Condor, Aubrey; Litster, Max; Pardos, Zachary – International Educational Data Mining Society, 2021
We explore how different components of an Automatic Short Answer Grading (ASAG) model affect the model's ability to generalize to questions outside of those used for training. For supervised automatic grading models, human ratings are primarily used as ground truth labels. Producing such ratings can be resource heavy, as subject matter experts…
Descriptors: Automation, Grading, Test Items, Generalization
Schneider, Johannes; Richner, Robin; Riser, Micha – International Journal of Artificial Intelligence in Education, 2023
Autograding short textual answers has become much more feasible due to the rise of NLP and the increased availability of question-answer pairs brought about by a shift to online education. Autograding performance is still inferior to human grading. The statistical and black-box nature of state-of-the-art machine learning models makes them…
Descriptors: Grading, Natural Language Processing, Computer Assisted Testing, Ethics

