Publication Date
| In 2026 | 0 |
| Since 2025 | 6 |
| Since 2022 (last 5 years) | 32 |
| Since 2017 (last 10 years) | 59 |
| Since 2007 (last 20 years) | 86 |
Descriptor
Source
Author
| Heffernan, Neil | 3 |
| McNamara, Danielle S. | 3 |
| Ramineni, Chaitanya | 3 |
| Andreea Dutulescu | 2 |
| Attali, Yigal | 2 |
| Baral, Sami | 2 |
| Breyer, F. Jay | 2 |
| Bridgeman, Brent | 2 |
| Crossley, Scott A. | 2 |
| Denis Dumas | 2 |
| Lan, Andrew | 2 |
| More ▼ | |
Publication Type
| Reports - Research | 98 |
| Journal Articles | 76 |
| Speeches/Meeting Papers | 15 |
| Tests/Questionnaires | 9 |
Education Level
Audience
| Researchers | 1 |
Location
| Japan | 3 |
| Netherlands | 3 |
| Australia | 2 |
| China | 2 |
| Iran | 2 |
| United Kingdom | 2 |
| Algeria | 1 |
| Arizona | 1 |
| Europe | 1 |
| Florida | 1 |
| Germany | 1 |
| More ▼ | |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Selcuk Acar; Peter Organisciak; Denis Dumas – Journal of Creative Behavior, 2025
In this three-study investigation, we applied various approaches to score drawings created in response to both Form A and Form B of the Torrance Tests of Creative Thinking-Figural (broadly TTCT-F) as well as the Multi-Trial Creative Ideation task (MTCI). We focused on TTCT-F in Study 1, and utilizing a random forest classifier, we achieved 79% and…
Descriptors: Scoring, Computer Assisted Testing, Models, Correlation
Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Rebecka Weegar; Peter Idestam-Almquist – International Journal of Artificial Intelligence in Education, 2024
Machine learning methods can be used to reduce the manual workload in exam grading, making it possible for teachers to spend more time on other tasks. However, when it comes to grading exams, fully eliminating manual work is not yet possible even with very accurate automated grading, as any grading mistakes could have significant consequences for…
Descriptors: Grading, Computer Assisted Testing, Introductory Courses, Computer Science Education
Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024
Automatic essay scoring (AES) is an essential educational application in natural language processing. This automated process will alleviate the burden by increasing the reliability and consistency of the assessment. With the advances in text embedding libraries and neural network models, AES systems achieved good results in terms of accuracy.…
Descriptors: Scoring, Essays, Writing Evaluation, Memory
Andreea Dutulescu; Stefan Ruseti; Mihai Dascalu; Danielle S. McNamara – Grantee Submission, 2025
The assessment of student responses to learning-strategy prompts, such as self-explanation, summarization, and paraphrasing, is essential for evaluating cognitive engagement and comprehension. However, manual scoring is resource-intensive, limiting its scalability in educational settings. This study investigates the use of Large Language Models…
Descriptors: Scoring, Computational Linguistics, Computer Software, Artificial Intelligence
Andreea Dutulescu; Stefan Ruseti; Mihai Dascalu; Danielle McNamara – International Educational Data Mining Society, 2025
The assessment of student responses to learning-strategy prompts, such as self-explanation, summarization, and paraphrasing, is essential for evaluating cognitive engagement and comprehension. However, manual scoring is resource-intensive, limiting its scalability in educational settings. This study investigates the use of Large Language Models…
Descriptors: Scoring, Computational Linguistics, Computer Software, Artificial Intelligence
Andrea Horbach; Joey Pehlke; Ronja Laarmann-Quante; Yuning Ding – International Journal of Artificial Intelligence in Education, 2024
This paper investigates crosslingual content scoring, a scenario where scoring models trained on learner data in one language are applied to data in a different language. We analyze data in five different languages (Chinese, English, French, German and Spanish) collected for three prompts of the established English ASAP content scoring dataset. We…
Descriptors: Contrastive Linguistics, Scoring, Learning Analytics, Chinese
Eran Hadas; Arnon Hershkovitz – Journal of Learning Analytics, 2025
Creativity is an imperative skill for today's learners, one that has important contributions to issues of inclusion and equity in education. Therefore, assessing creativity is of major importance in educational contexts. However, scoring creativity based on traditional tools suffers from subjectivity and is heavily time- and labour-consuming. This…
Descriptors: Creativity, Evaluation Methods, Computer Assisted Testing, Artificial Intelligence
Zhang, Mengxue; Baral, Sami; Heffernan, Neil; Lan, Andrew – International Educational Data Mining Society, 2022
Automatic short answer grading is an important research direction in the exploration of how to use artificial intelligence (AI)-based tools to improve education. Current state-of-the-art approaches use neural language models to create vectorized representations of students responses, followed by classifiers to predict the score. However, these…
Descriptors: Grading, Mathematics Instruction, Artificial Intelligence, Form Classes (Languages)
Kunal Sareen – Innovations in Education and Teaching International, 2024
This study examines the proficiency of Chat GPT, an AI language model, in answering questions on the Situational Judgement Test (SJT), a widely used assessment tool for evaluating the fundamental competencies of medical graduates in the UK. A total of 252 SJT questions from the "Oxford Assess and Progress: Situational Judgement" Test…
Descriptors: Ethics, Decision Making, Artificial Intelligence, Computer Software
Kevin C. Haudek; Xiaoming Zhai – International Journal of Artificial Intelligence in Education, 2024
Argumentation, a key scientific practice presented in the "Framework for K-12 Science Education," requires students to construct and critique arguments, but timely evaluation of arguments in large-scale classrooms is challenging. Recent work has shown the potential of automated scoring systems for open response assessments, leveraging…
Descriptors: Accuracy, Persuasive Discourse, Artificial Intelligence, Learning Management Systems
Uto, Masaki; Okano, Masashi – IEEE Transactions on Learning Technologies, 2021
In automated essay scoring (AES), scores are automatically assigned to essays as an alternative to grading by humans. Traditional AES typically relies on handcrafted features, whereas recent studies have proposed AES models based on deep neural networks to obviate the need for feature engineering. Those AES models generally require training on a…
Descriptors: Essays, Scoring, Writing Evaluation, Item Response Theory
Yishen Song; Qianta Zhu; Huaibo Wang; Qinhua Zheng – IEEE Transactions on Learning Technologies, 2024
Manually scoring and revising student essays has long been a time-consuming task for educators. With the rise of natural language processing techniques, automated essay scoring (AES) and automated essay revising (AER) have emerged to alleviate this burden. However, current AES and AER models require large amounts of training data and lack…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Zhang, Mengxue; Heffernan, Neil; Lan, Andrew – International Educational Data Mining Society, 2023
Automated scoring of student responses to open-ended questions, including short-answer questions, has great potential to scale to a large number of responses. Recent approaches for automated scoring rely on supervised learning, i.e., training classifiers or fine-tuning language models on a small number of responses with human-provided score…
Descriptors: Scoring, Computer Assisted Testing, Mathematics Instruction, Mathematics Tests
Reagan Mozer; Luke Miratrix; Jackie Eunjung Relyea; James S. Kim – Journal of Educational and Behavioral Statistics, 2024
In a randomized trial that collects text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by human raters. An impact analysis can then be conducted to compare treatment and control groups, using the hand-coded scores as a measured outcome. This…
Descriptors: Scoring, Evaluation Methods, Writing Evaluation, Comparative Analysis

Peer reviewed
Direct link
