Yucheng Chu; Peng He; Hang Li; Haoyu Han; Kaiqi Yang; Yu Xue; Tingting Li; Yasemin Copur-Gencturk; Joseph Krajcik; Jiliang Tang – International Educational Data Mining Society, 2025
Short answer assessment is a vital component of science education, allowing evaluation of students' complex three-dimensional understanding. Large language models (LLMs) that possess human-like ability in linguistic tasks are increasingly popular in assisting human graders to reduce their workload. However, LLMs' limitations in domain knowledge…
Descriptors: Artificial Intelligence, Science Education, Technology Uses in Education, Natural Language Processing
Jussi S. Jauhiainen; Agustín Garagorry Guerra – Innovations in Education and Teaching International, 2025
The study highlights ChatGPT-4's potential in educational settings for the evaluation of university students' open-ended written examination responses. ChatGPT-4 evaluated 54 written responses, ranging from 24 to 256 words in English. It assessed each response using five criteria and assigned a grade on a six-point scale from fail to excellent,…
Descriptors: Artificial Intelligence, Technology Uses in Education, Student Evaluation, Writing Evaluation
Schneider, Johannes; Richner, Robin; Riser, Micha – International Journal of Artificial Intelligence in Education, 2023
Autograding short textual answers has become much more feasible due to the rise of NLP and the increased availability of question-answer pairs brought about by a shift to online education. Autograding performance is still inferior to human grading. The statistical and black-box nature of state-of-the-art machine learning models makes them…
Descriptors: Grading, Natural Language Processing, Computer Assisted Testing, Ethics

