ERIC Number: EJ1487671
Record Type: Journal
Publication Date: 2025-Oct
Pages: 34
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-1069-4730
EISSN: EISSN-2168-9830
Available Date: 2025-08-31
Analysis of Student Understanding in Short-Answer Explanations to Concept Questions Using a Human-Centered AI Approach
Harpreet Auby1; Namrata Shivagunde2; Vijeta Deshpande2; Anna Rumshisky2; Milo D. Koretsky1,3
Journal of Engineering Education, v114 n4 e70032 2025
Background: Analyzing students' short-answer written justifications to conceptually challenging questions has proven helpful for understanding student thinking and improving conceptual understanding. However, qualitative analyses are limited by the burden of analyzing large amounts of text. Purpose: We apply dense and sparse Large Language Models (LLMs) to explore how machine learning can automate the coding of responses in engineering mechanics and thermodynamics. Design/Method: We first identify the cognitive resources students use through human coding of seven questions. We then compare the performance of four dense LLMs and a sparse Mixture of Experts (Mixtral) model at automating the coding. Finally, we investigate the extent to which domain-specific training is necessary for accurate coding. Findings: In a sample question, we analyze 904 responses to identify 48 unique cognitive resources, which we then organize into six themes. In contrast to recommendations in the literature, students who activated molecular resources were less likely to answer correctly. This example illustrates the usefulness of qualitatively analyzing large datasets. Of the LLMs, Mixtral and Llama-3 perform best on in-domain, same-dataset coding tasks, especially as the training set size increases. Phi-3.5-mini, while effective in mechanics, shows inconsistent improvements with additional data and struggles in thermodynamics. In contrast, GPT-4 and GPT-4o-mini stand out for their robust generalization across in- and cross-domain tasks. Conclusions: Open-source models like Mixtral have the potential to perform well when coding short-answer justifications to challenging concept questions. However, further fine-tuning is needed to make them robust enough for use with a resources-based framing.
Descriptors: Student Evaluation, Thinking Skills, Test Format, Cognitive Processes, Coding, Artificial Intelligence, Natural Language Processing, Automation, Accuracy
Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Authoring Institution: N/A
Grant or Contract Numbers: 2226553; 2226601
Author Affiliations: 1Department of Chemical & Biological Engineering, Tufts University, Medford, Massachusetts, USA; 2Department of Computer Science, University of Massachusetts Lowell, Lowell, Massachusetts, USA; 3Department of Education, Tufts University, Medford, Massachusetts, USA

Peer reviewed
