NotesFAQContact Us
Collection
Advanced
Search Tips
Laws, Policies, & Programs
No Child Left Behind Act 20011
What Works Clearinghouse Rating
Showing 1 to 15 of 93 results Save | Export
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Bin Tan; Nour Armoush; Elisabetta Mazzullo; Okan Bulut; Mark J. Gierl – International Journal of Assessment Tools in Education, 2025
This study reviews existing research on the use of large language models (LLMs) for automatic item generation (AIG). We performed a comprehensive literature search across seven research databases, selected studies based on predefined criteria, and summarized 60 relevant studies that employed LLMs in the AIG process. We identified the most commonly…
Descriptors: Artificial Intelligence, Test Items, Automation, Test Format
Peer reviewed Peer reviewed
Direct linkDirect link
Hui Jin; Cynthia Lima; Limin Wang – Educational Measurement: Issues and Practice, 2025
Although AI transformer models have demonstrated notable capability in automated scoring, it is difficult to examine how and why these models fall short in scoring some responses. This study investigated how transformer models' language processing and quantification processes can be leveraged to enhance the accuracy of automated scoring. Automated…
Descriptors: Automation, Scoring, Artificial Intelligence, Accuracy
Peer reviewed Peer reviewed
Direct linkDirect link
Ting Wang; Keith Stelter; Thomas O’Neill; Nathaniel Hendrix; Andrew Bazemore; Kevin Rode; Warren P. Newton – Journal of Applied Testing Technology, 2025
Precise item categorisation is essential in aligning exam questions with content domains outlined in assessment blueprints. Traditional methods, such as manual classification or supervised machine learning, are often time-consuming, error-prone, or limited by the need for large training datasets. This study presents a novel approach using…
Descriptors: Test Items, Automation, Classification, Artificial Intelligence
Peer reviewed Peer reviewed
Direct linkDirect link
Yanyan Fu – Educational Measurement: Issues and Practice, 2024
The template-based automated item-generation (TAIG) approach that involves template creation, item generation, item selection, field-testing, and evaluation has more steps than the traditional item development method. Consequentially, there is more margin for error in this process, and any template errors can be cascaded to the generated items.…
Descriptors: Error Correction, Automation, Test Items, Test Construction
Peer reviewed Peer reviewed
Direct linkDirect link
Stella Y. Kim; Sungyeun Kim – Educational Measurement: Issues and Practice, 2025
This study presents several multivariate Generalizability theory designs for analyzing automatic item-generated (AIG) based test forms. The study used real data to illustrate the analysis procedure and discuss practical considerations. We collected the data from two groups of students, each group receiving a different form generated by AIG. A…
Descriptors: Generalizability Theory, Automation, Test Items, Students
Peer reviewed Peer reviewed
Direct linkDirect link
Said Al Faraby; Adiwijaya Adiwijaya; Ade Romadhony – International Journal of Artificial Intelligence in Education, 2024
Questioning plays a vital role in education, directing knowledge construction and assessing students' understanding. However, creating high-level questions requires significant creativity and effort. Automatic question generation is expected to facilitate the generation of not only fluent and relevant but also educationally valuable questions.…
Descriptors: Test Items, Automation, Computer Software, Input Output Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Becker, Benjamin; Weirich, Sebastian; Goldhammer, Frank; Debeer, Dries – Journal of Educational Measurement, 2023
When designing or modifying a test, an important challenge is controlling its speededness. To achieve this, van der Linden (2011a, 2011b) proposed using a lognormal response time model, more specifically the two-parameter lognormal model, and automated test assembly (ATA) via mixed integer linear programming. However, this approach has a severe…
Descriptors: Test Construction, Automation, Models, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Po-Chun Huang; Ying-Hong Chan; Ching-Yu Yang; Hung-Yuan Chen; Yao-Chung Fan – IEEE Transactions on Learning Technologies, 2024
Question generation (QG) task plays a crucial role in adaptive learning. While significant QG performance advancements are reported, the existing QG studies are still far from practical usage. One point that needs strengthening is to consider the generation of question group, which remains untouched. For forming a question group, intrafactors…
Descriptors: Automation, Test Items, Computer Assisted Testing, Test Construction
Peer reviewed Peer reviewed
Direct linkDirect link
Ben Babcock; Kim Brunnert – Journal of Applied Testing Technology, 2023
Automatic Item Generation (AIG) is an extremely useful tool to construct many high-quality exam items more efficiently than traditional item writing methods. A large pool of items, however, presents challenges like identifying a particular item to meet a specific need. For example, when making a fixed form exam, best practices forbid item stems…
Descriptors: Test Items, Automation, Algorithms, Artificial Intelligence
Peer reviewed Peer reviewed
Direct linkDirect link
Fu, Yanyan; Choe, Edison M.; Lim, Hwanggyu; Choi, Jaehwa – Educational Measurement: Issues and Practice, 2022
This case study applied the "weak theory" of Automatic Item Generation (AIG) to generate isomorphic item instances (i.e., unique but psychometrically equivalent items) for a large-scale assessment. Three representative instances were selected from each item template (i.e., model) and pilot-tested. In addition, a new analytical framework,…
Descriptors: Test Items, Measurement, Psychometrics, Test Construction
Peer reviewed Peer reviewed
Andreea Dutulescu; Stefan Ruseti; Mihai Dascalu; Danielle S. McNamara – Grantee Submission, 2024
Assessing the difficulty of reading comprehension questions is crucial to educational methodologies and language understanding technologies. Traditional methods of assessing question difficulty rely frequently on human judgments or shallow metrics, often failing to accurately capture the intricate cognitive demands of answering a question. This…
Descriptors: Difficulty Level, Reading Tests, Test Items, Reading Comprehension
Peer reviewed Peer reviewed
Direct linkDirect link
Guher Gorgun; Okan Bulut – Education and Information Technologies, 2024
In light of the widespread adoption of technology-enhanced learning and assessment platforms, there is a growing demand for innovative, high-quality, and diverse assessment questions. Automatic Question Generation (AQG) has emerged as a valuable solution, enabling educators and assessment developers to efficiently produce a large volume of test…
Descriptors: Computer Assisted Testing, Test Construction, Test Items, Automation
Peer reviewed Peer reviewed
Direct linkDirect link
Andreea Dutulescu; Stefan Ruseti; Denis Iorga; Mihai Dascalu; Danielle S. McNamara – Grantee Submission, 2025
Automated multiple-choice question (MCQ) generation is valuable for scalable assessment and enhanced learning experiences. How-ever, existing MCQ generation methods face challenges in ensuring plausible distractors and maintaining answer consistency. This paper intro-duces a method for MCQ generation that integrates reasoning-based explanations…
Descriptors: Automation, Computer Assisted Testing, Multiple Choice Tests, Natural Language Processing
Peer reviewed Peer reviewed
Direct linkDirect link
Wesley Morris; Langdon Holmes; Joon Suh Choi; Scott Crossley – International Journal of Artificial Intelligence in Education, 2025
Recent developments in the field of artificial intelligence allow for improved performance in the automated assessment of extended response items in mathematics, potentially allowing for the scoring of these items cheaply and at scale. This study details the grand prize-winning approach to developing large language models (LLMs) to automatically…
Descriptors: Automation, Computer Assisted Testing, Mathematics Tests, Scoring
Jonathan Seiden – Annenberg Institute for School Reform at Brown University, 2025
Direct assessments of early childhood development (ECD) are a cornerstone of research in developmental psychology and are increasingly used to evaluate programs and policies in lower- and middle-income countries. Despite strong psychometric properties, these assessments are too expensive and time consuming for use in large-scale monitoring or…
Descriptors: Young Children, Child Development, Performance Based Assessment, Developmental Psychology
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7