NotesFAQContact Us
Collection
Advanced
Search Tips
Back to results
Peer reviewed Peer reviewed
Direct linkDirect link
ERIC Number: EJ1488317
Record Type: Journal
Publication Date: 2025
Pages: 40
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-1560-4292
EISSN: EISSN-1560-4306
Available Date: 2024-09-13
Neural Networks or Linguistic Features? -- Comparing Different Machine-Learning Approaches for Automated Assessment of Text Quality Traits among L1- and L2-Learners' Argumentative Essays
International Journal of Artificial Intelligence in Education, v35 n3 p1178-1217 2025
Recent investigations in automated essay scoring research imply that hybrid models, which combine feature engineering and the powerful tools of deep neural networks (DNNs), reach state-of-the-art performance. However, most of these findings are from holistic scoring tasks. In the present study, we use a total of four prompts from two different corpora consisting of both L1 and L2 learner essays annotated with trait scores (e.g., content, organization, and language quality). In our main experiments, we compare three variants of trait-specific models using different inputs: (1) models based on 220 linguistic features, (2) models using essay-level contextual embeddings from the distilled version of the pre-trained transformer BERT (DistilBERT), and (3) a hybrid model using both types of features. Results imply that when trait-specific models are trained based on a single resource, the feature-based models slightly outperform the embedding-based models. These differences are most prominent for the organization traits. The hybrid models outperform the single-resource models, indicating that linguistic features and embeddings indeed capture partially different aspects relevant for the assessment of essay traits. To gain more insights into the interplay between both feature types, we run addition and ablation tests for individual feature groups. Trait-specific addition tests across prompts indicate that the embedding-based models can most consistently be enhanced in content assessment when combined with morphological complexity features. Most consistent performance gains in the organization traits are achieved when embeddings are combined with length features, and most consistent performance gains in the assessment of the language traits when combined with lexical complexity, error, and occurrence features. Cross-prompt scoring again reveals slight advantages for the feature-based models.
Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://link-springer-com.bibliotheek.ehb.be/
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Author Affiliations: 1Kiel University, Institute for Psychology of Learning and Instruction, Kiel, Germany; 2University of Hildesheim, Hildesheim, Germany; 3University of Applied Sciences and Arts Northwestern Switzerland, Basel, Switzerland; 4University of Teacher Education Zürich, Zürich, Switzerland; 5Leibniz Institute for Science and Mathematics Education at Kiel University, Kiel, Germany; 6FernUniversität in Hagen, CATALPA, Hagen, Germany