Publication Date
| Period | Results |
|---|---|
| In 2026 | 0 |
| Since 2025 | 5 |
| Since 2022 (last 5 years) | 30 |
| Since 2017 (last 10 years) | 56 |
| Since 2007 (last 20 years) | 99 |
Author
| Author | Results |
|---|---|
| Attali, Yigal | 3 |
| Ramineni, Chaitanya | 3 |
| Bejar, Isaac I. | 2 |
| Breyer, F. Jay | 2 |
| Bridgeman, Brent | 2 |
| Burstein, Jill | 2 |
| Dikli, Semire | 2 |
| Gentile, Claudia | 2 |
| Kantor, Robert | 2 |
| Lee, Yong-Won | 2 |
| McCurry, Doug | 2 |
Publication Type
| Publication Type | Results |
|---|---|
| Journal Articles | 124 |
| Reports - Research | 76 |
| Reports - Descriptive | 27 |
| Reports - Evaluative | 16 |
| Tests/Questionnaires | 7 |
| Book/Product Reviews | 4 |
| Information Analyses | 3 |
| Guides - General | 1 |
| Opinion Papers | 1 |
Audience
| Audience | Results |
|---|---|
| Practitioners | 1 |
| Researchers | 1 |
| Teachers | 1 |
Location
| Location | Results |
|---|---|
| Australia | 4 |
| Japan | 4 |
| Netherlands | 3 |
| China | 2 |
| Europe | 2 |
| Hong Kong | 2 |
| Iran | 2 |
| South Korea | 2 |
| Turkey | 2 |
| Ukraine | 2 |
| United Kingdom | 2 |
Selcuk Acar; Peter Organisciak; Denis Dumas – Journal of Creative Behavior, 2025
In this three-study investigation, we applied various approaches to score drawings created in response to both Form A and Form B of the Torrance Tests of Creative Thinking-Figural (broadly TTCT-F) as well as the Multi-Trial Creative Ideation task (MTCI). We focused on TTCT-F in Study 1, and utilizing a random forest classifier, we achieved 79% and…
Descriptors: Scoring, Computer Assisted Testing, Models, Correlation
Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Shermis, Mark D. – Journal of Educational Measurement, 2022
One of the challenges of discussing validity arguments for machine scoring of essays centers on the absence of a commonly held definition and theory of good writing. At best, the algorithms attempt to measure select attributes of writing and calibrate them against human ratings with the goal of accurate prediction of scores for new essays.…
Descriptors: Scoring, Essays, Validity, Writing Evaluation
Rebecka Weegar; Peter Idestam-Almquist – International Journal of Artificial Intelligence in Education, 2024
Machine learning methods can be used to reduce the manual workload in exam grading, making it possible for teachers to spend more time on other tasks. However, when it comes to grading exams, fully eliminating manual work is not yet possible even with very accurate automated grading, as any grading mistakes could have significant consequences for…
Descriptors: Grading, Computer Assisted Testing, Introductory Courses, Computer Science Education
Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024
Automatic essay scoring (AES) is an essential educational application in natural language processing. This automated process will alleviate the burden by increasing the reliability and consistency of the assessment. With the advances in text embedding libraries and neural network models, AES systems achieved good results in terms of accuracy.…
Descriptors: Scoring, Essays, Writing Evaluation, Memory
Andrea Horbach; Joey Pehlke; Ronja Laarmann-Quante; Yuning Ding – International Journal of Artificial Intelligence in Education, 2024
This paper investigates crosslingual content scoring, a scenario where scoring models trained on learner data in one language are applied to data in a different language. We analyze data in five different languages (Chinese, English, French, German and Spanish) collected for three prompts of the established English ASAP content scoring dataset. We…
Descriptors: Contrastive Linguistics, Scoring, Learning Analytics, Chinese
Raykov, Tenko – Measurement: Interdisciplinary Research and Perspectives, 2023
This software review discusses the capabilities of Stata to conduct item response theory modeling. The commands needed for fitting the popular one-, two-, and three-parameter logistic models are initially discussed. The procedure for testing the discrimination parameter equality in the one-parameter model is then outlined. The commands for fitting…
Descriptors: Item Response Theory, Models, Comparative Analysis, Item Analysis
Eran Hadas; Arnon Hershkovitz – Journal of Learning Analytics, 2025
Creativity is an imperative skill for today's learners, one that has important contributions to issues of inclusion and equity in education. Therefore, assessing creativity is of major importance in educational contexts. However, scoring creativity with traditional tools suffers from subjectivity and is highly time- and labour-intensive. This…
Descriptors: Creativity, Evaluation Methods, Computer Assisted Testing, Artificial Intelligence
Kunal Sareen – Innovations in Education and Teaching International, 2024
This study examines the proficiency of ChatGPT, an AI language model, in answering questions on the Situational Judgement Test (SJT), a widely used assessment tool for evaluating the fundamental competencies of medical graduates in the UK. A total of 252 SJT questions from the "Oxford Assess and Progress: Situational Judgement Test"…
Descriptors: Ethics, Decision Making, Artificial Intelligence, Computer Software
Kevin C. Haudek; Xiaoming Zhai – International Journal of Artificial Intelligence in Education, 2024
Argumentation, a key scientific practice presented in the "Framework for K-12 Science Education," requires students to construct and critique arguments, but timely evaluation of arguments in large-scale classrooms is challenging. Recent work has shown the potential of automated scoring systems for open response assessments, leveraging…
Descriptors: Accuracy, Persuasive Discourse, Artificial Intelligence, Learning Management Systems
Ecem Kopuz; Galip Kartal – PASAA: Journal of Language Teaching and Learning in Thailand, 2025
The developments in artificial intelligence (AI) have significantly transformed second language (L2) learning and assessment, and the role of AI technologies in L2 assessment has been investigated in recent research. This study presents a bibliosystematic analysis of AI-assisted L2 assessment. Using both systematic analysis and bibliometric…
Descriptors: Artificial Intelligence, Computer Software, Technology Integration, Feedback (Response)
Tahereh Firoozi; Okan Bulut; Mark J. Gierl – International Journal of Assessment Tools in Education, 2023
The proliferation of large language models represents a paradigm shift in the landscape of automated essay scoring (AES) systems, fundamentally elevating their accuracy and efficacy. This study presents an extensive examination of large language models, with a particular emphasis on the transformative influence of transformer-based models, such as…
Descriptors: Turkish, Writing Evaluation, Essays, Accuracy
Uto, Masaki; Okano, Masashi – IEEE Transactions on Learning Technologies, 2021
In automated essay scoring (AES), scores are automatically assigned to essays as an alternative to grading by humans. Traditional AES typically relies on handcrafted features, whereas recent studies have proposed AES models based on deep neural networks to obviate the need for feature engineering. Those AES models generally require training on a…
Descriptors: Essays, Scoring, Writing Evaluation, Item Response Theory
Yishen Song; Qianta Zhu; Huaibo Wang; Qinhua Zheng – IEEE Transactions on Learning Technologies, 2024
Manually scoring and revising student essays has long been a time-consuming task for educators. With the rise of natural language processing techniques, automated essay scoring (AES) and automated essay revising (AER) have emerged to alleviate this burden. However, current AES and AER models require large amounts of training data and lack…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Reagan Mozer; Luke Miratrix; Jackie Eunjung Relyea; James S. Kim – Journal of Educational and Behavioral Statistics, 2024
In a randomized trial that collects text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by human raters. An impact analysis can then be conducted to compare treatment and control groups, using the hand-coded scores as a measured outcome. This…
Descriptors: Scoring, Evaluation Methods, Writing Evaluation, Comparative Analysis