Publication Date
| In 2026 | 0 |
| Since 2025 | 13 |
| Since 2022 (last 5 years) | 97 |
| Since 2017 (last 10 years) | 218 |
| Since 2007 (last 20 years) | 351 |
Descriptor
| Computer Assisted Testing | 514 |
| Scoring | 514 |
| Test Items | 111 |
| Test Construction | 102 |
| Automation | 95 |
| Essays | 82 |
| Foreign Countries | 81 |
| Scores | 79 |
| Adaptive Testing | 78 |
| Evaluation Methods | 77 |
| Computer Software | 75 |
Author
| Bennett, Randy Elliot | 11 |
| Attali, Yigal | 9 |
| Anderson, Paul S. | 7 |
| Williamson, David M. | 6 |
| Bejar, Isaac I. | 5 |
| Ramineni, Chaitanya | 5 |
| Stocking, Martha L. | 5 |
| Xi, Xiaoming | 5 |
| Zechner, Klaus | 5 |
| Bridgeman, Brent | 4 |
| Davey, Tim | 4 |
Location
| Australia | 10 |
| China | 10 |
| New York | 9 |
| Japan | 7 |
| Netherlands | 6 |
| Canada | 5 |
| Germany | 5 |
| Iran | 4 |
| Taiwan | 4 |
| United Kingdom | 4 |
| United Kingdom (England) | 4 |
Kunal Sareen – Innovations in Education and Teaching International, 2024
This study examines the proficiency of ChatGPT, an AI language model, in answering questions on the Situational Judgement Test (SJT), a widely used assessment tool for evaluating the fundamental competencies of medical graduates in the UK. A total of 252 SJT questions from the "Oxford Assess and Progress: Situational Judgement" Test…
Descriptors: Ethics, Decision Making, Artificial Intelligence, Computer Software
Dorsey, David W.; Michaels, Hillary R. – Journal of Educational Measurement, 2022
We have dramatically advanced our ability to create rich, complex, and effective assessments across a range of uses through technology advancement. Artificial Intelligence (AI) enabled assessments represent one such area of advancement--one that has captured our collective interest and imagination. Scientists and practitioners within the domains…
Descriptors: Validity, Ethics, Artificial Intelligence, Evaluation Methods
Li, Xu; Ouyang, Fan; Liu, Jianwen; Wei, Chengkun; Chen, Wenzhi – Journal of Educational Computing Research, 2023
The computer-supported writing assessment (CSWA) has been widely used to reduce instructor workload and provide real-time feedback. Interpretability of CSWA draws extensive attention because it can benefit the validity, transparency, and knowledge-aware feedback of academic writing assessments. This study proposes a novel assessment tool,…
Descriptors: Computer Assisted Testing, Writing Evaluation, Feedback (Response), Natural Language Processing
Peter Organisciak; Selcuk Acar; Denis Dumas; Kelly Berthiaume – Grantee Submission, 2023
Automated scoring for divergent thinking (DT) seeks to overcome a key obstacle to creativity measurement: the effort, cost, and reliability of scoring open-ended tests. For a common test of DT, the Alternate Uses Task (AUT), the primary automated approach casts the problem as a semantic distance between a prompt and the resulting idea in a text…
Descriptors: Automation, Computer Assisted Testing, Scoring, Creative Thinking
Zhai, Xiaoming; Shi, Lehong; Nehm, Ross H. – Journal of Science Education and Technology, 2021
Machine learning (ML) has been increasingly employed in science assessment to facilitate automatic scoring efforts, although with varying degrees of success (i.e., magnitudes of machine-human score agreements [MHAs]). Little work has empirically examined the factors that impact MHA disparities in this growing field, thus constraining the…
Descriptors: Meta Analysis, Man Machine Systems, Artificial Intelligence, Computer Assisted Testing
Uto, Masaki; Okano, Masashi – IEEE Transactions on Learning Technologies, 2021
In automated essay scoring (AES), scores are automatically assigned to essays as an alternative to grading by humans. Traditional AES typically relies on handcrafted features, whereas recent studies have proposed AES models based on deep neural networks to obviate the need for feature engineering. Those AES models generally require training on a…
Descriptors: Essays, Scoring, Writing Evaluation, Item Response Theory
Shahid A. Choudhry; Timothy J. Muckle; Christopher J. Gill; Rajat Chadha; Magnus Urosev; Matt Ferris; John C. Preston – Practical Assessment, Research & Evaluation, 2024
The National Board of Certification and Recertification for Nurse Anesthetists (NBCRNA) conducted a one-year research study comparing performance on the traditional continued professional certification assessment, administered at a test center or online with remote proctoring, to a longitudinal assessment that required answering quarterly…
Descriptors: Nurses, Certification, Licensing Examinations (Professions), Computer Assisted Testing
Keith Cochran; Clayton Cohn; Peter Hastings; Noriko Tomuro; Simon Hughes – International Journal of Artificial Intelligence in Education, 2024
To succeed in the information age, students need to learn to communicate their understanding of complex topics effectively. This is reflected in both educational standards and standardized tests. To improve their writing ability for highly structured domains like scientific explanations, students need feedback that accurately reflects the…
Descriptors: Science Process Skills, Scientific Literacy, Scientific Concepts, Concept Formation
Yishen Song; Qianta Zhu; Huaibo Wang; Qinhua Zheng – IEEE Transactions on Learning Technologies, 2024
Manually scoring and revising student essays has long been a time-consuming task for educators. With the rise of natural language processing techniques, automated essay scoring (AES) and automated essay revising (AER) have emerged to alleviate this burden. However, current AES and AER models require large amounts of training data and lack…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Betts, Joe; Muntean, William; Kim, Doyoung; Kao, Shu-chuan – Educational and Psychological Measurement, 2022
The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw…
Descriptors: Scoring, Test Items, Test Format, Raw Scores
Binglin Chen – ProQuest LLC, 2022
Assessment is a key component of education. Routine grading of students' work, however, is time consuming. Automating the grading process allows instructors to spend more of their time helping their students learn and engaging their students with more open-ended, creative activities. One way to automate grading is through computer-based…
Descriptors: College Students, STEM Education, Student Evaluation, Grading
Zhang, Mengxue; Heffernan, Neil; Lan, Andrew – International Educational Data Mining Society, 2023
Automated scoring of student responses to open-ended questions, including short-answer questions, has great potential to scale to a large number of responses. Recent approaches for automated scoring rely on supervised learning, i.e., training classifiers or fine-tuning language models on a small number of responses with human-provided score…
Descriptors: Scoring, Computer Assisted Testing, Mathematics Instruction, Mathematics Tests
Georgios Zacharis; Stamatios Papadakis – Educational Process: International Journal, 2025
Background/purpose: Generative artificial intelligence (GenAI) is often promoted as a transformative tool for assessment, yet evidence of its validity compared to human raters remains limited. This study examined whether an AI-based rater could be used interchangeably with trained faculty in scoring complex coursework. Materials/methods:…
Descriptors: Artificial Intelligence, Technology Uses in Education, Computer Assisted Testing, Grading
Brandon J. Yik; David G. Schreurs; Jeffrey R. Raker – Journal of Chemical Education, 2023
Acid-base chemistry, and in particular the Lewis acid-base model, is foundational to understanding mechanistic ideas. This is due to the similarity in language chemists use to describe Lewis acid-base reactions and nucleophile-electrophile interactions. The development of artificial intelligence and machine learning technologies has led to the…
Descriptors: Educational Technology, Formative Evaluation, Molecular Structure, Models
Wan, Qian; Crossley, Scott; Allen, Laura; McNamara, Danielle – Grantee Submission, 2020
In this paper, we extracted content-based and structure-based features of text to predict human annotations for claims and nonclaims in argumentative essays. We compared Logistic Regression, Bernoulli Naive Bayes, Gaussian Naive Bayes, Linear Support Vector Classification, Random Forest, and Neural Networks to train classification models. Random…
Descriptors: Persuasive Discourse, Essays, Writing Evaluation, Natural Language Processing

