ERIC - Search Results

Showing 1 to 15 of 73 results

An Approach for Being Able to Use the Options of Calculating Inter-Coder Reliability Manually and through Software in Qualitative Research of Education and Training in Sports

Peer reviewed
PDF on ERIC

Sevilmis, Ali; Yildiz, Özer – International Journal of Progressive Education, 2021

Reliability that can be proved by numeric indicators in quantitative studies has become a very discussible issue. The reason for this is to be thought that in qualitative researches, reliability is not based on positive perspective and those forming reliability criteria is difficult. However, for testing the reliability of a qualitative study or…

Descriptors: Interrater Reliability, Qualitative Research, Educational Research, Physical Education

Evaluating Quadratic Weighted Kappa as the Standard Performance Metric for Automated Essay Scoring

Peer reviewed
PDF on ERIC

Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023

Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In the existing research work on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…

Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy

Graders of the Future: Comparing the Consistency and Accuracy of GPT4 and Pre-Service Teachers in Physics Essay Question Assessments

Peer reviewed
PDF on ERIC

Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025

As the development and application of large language models (LLMs) in physics education progress, the well-known AI-based chatbot ChatGPT4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools in practical educational assessment carries profound significance. This study explored the comparative…

Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy

Monitoring Implementation in Program Evaluation with Direct Audio Coding

Peer reviewed
PDF on ERIC

Farley, Jennifer; Duppong Hurley, Kristin; Aitken, A. Angelique – Grantee Submission, 2020

This project explored the reliability and utility of transcription in coding qualitative data across two studies in a program evaluation context. The first study tested the method of direct audio coding, or coding audio files without transcripts, using qualitative data software. The presence and frequency of codes applied in direct audio coding…

Descriptors: Program Implementation, Audio Equipment, Coding, Usability

Revolutionising Essay Evaluation: A Cutting-Edge Rubric for AI-Assisted Writing

Peer reviewed

Hassan Saleh Mahdi; Ahmed Alkhateeb – International Journal of Computer-Assisted Language Learning and Teaching, 2025

This study aims to develop a robust rubric for evaluating artificial intelligence (AI)-assisted essay writing in English as a Foreign Language (EFL) contexts. Employing a modified Delphi technique, we conducted a comprehensive literature review and administered Likert scale questionnaires. This process yielded nine key evaluation criteria,…

Descriptors: Scoring Rubrics, Essays, Writing Evaluation, Artificial Intelligence

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Perceptual and Acoustic Assessment of Strain Using Synthetically Modified Voice Samples

Peer reviewed

Park, Yeonggwang; Cádiz, Manuel Díaz; Nagle, Kathleen F.; Stepp, Cara E. – Journal of Speech, Language, and Hearing Research, 2020

Purpose: Assessment of strained voice quality is difficult due to the weak reliability of auditory-perceptual evaluation and lack of strong acoustic correlates. This study evaluated the contributions of relative fundamental frequency (RFF) and mid-to-high frequency noise to the perception of strain. Method: Stimuli were created using recordings of…

Descriptors: Acoustics, Audio Equipment, Auditory Perception, Correlation

Impacts of ChatGPT-Assisted Writing for EFL English Majors: Feasibility and Challenges

Peer reviewed

Chung-You Tsai; Yi-Ti Lin; Iain Kelsall Brown – Education and Information Technologies, 2024

To determine the impacts of using ChatGPT to assist English as a foreign language (EFL) English college majors in revising essays and the possibility of leading to higher scores and potentially causing unfairness. A prospective, double-blinded, paired-comparison study was conducted in Feb. 2023. A total of 44 students provided 44 original essays…

Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, English (Second Language)

Vocal Development in a Large-Scale Crosslinguistic Corpus

Peer reviewed

Cychosz, Margaret; Cristia, Alejandrina; Bergelson, Elika; Casillas, Marisa; Baudet, Gladys; Warlaumont, Anne S.; Scaff, Camila; Yankowitz, Lisa; Seidl, Amanda – Developmental Science, 2021

This study evaluates whether early vocalizations develop in similar ways in children across diverse cultural contexts. We analyze data from daylong audio recordings of 49 children (1-36 months) from five different language/cultural backgrounds. Citizen scientists annotated these recordings to determine if child vocalizations contained canonical…

Descriptors: Cultural Context, Contrastive Linguistics, Audio Equipment, Cultural Differences

Comparing Machine and Human Reviewers to Evaluate the Risk of Bias in Randomized Controlled Trials

Peer reviewed

Armijo-Olivo, Susan; Craig, Rodger; Campbell, Sandy – Research Synthesis Methods, 2020

Background: Evidence from new health technologies is growing, along with demands for evidence to inform policy decisions, creating challenges in completing health technology assessments (HTAs)/systematic reviews (SRs) in a timely manner. Software can decrease the time and burden by automating the process, but evidence validating such software is…

Descriptors: Comparative Analysis, Computer Software, Decision Making, Randomized Controlled Trials

The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues

Peer reviewed
PDF on ERIC

Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022

How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…

Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making

Comparative Judgement: Assess Student Production without Absolute Judgements

Peer reviewed
PDF on ERIC

Sumner, Josh – Research-publishing.net, 2021

Comparative Judgement (CJ) has emerged as a technique that typically makes use of holistic judgement to assess difficult-to-specify constructs such as production (speaking and writing) in Modern Foreign Languages (MFL). In traditional approaches, markers assess candidates' work one-by-one in an absolute manner, assigning scores to different…

Descriptors: Holistic Approach, Student Evaluation, Comparative Analysis, Decision Making

Contextual Definition Generation

Peer reviewed
PDF on ERIC

Yarbro, Jeffrey T.; Olney, Andrew M. – Grantee Submission, 2021

This paper explores the concept of dynamically generating definitions using a deep-learning model. We do this by creating a dataset that contains definition entries and contexts associated with each definition. We then fine-tune a GPT-2 based model on the dataset to allow the model to generate contextual definitions. We evaluate our model with…

Descriptors: Definitions, Learning Processes, Models, Context Effect

The Use of Semantic Similarity Tools in Automated Content Scoring of Fact-Based Essays Written by EFL Learners

Peer reviewed

Wang, Qiao – Education and Information Technologies, 2022

This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring

A Human-Centric Automated Essay Scoring and Feedback System for the Development of Ethical Reasoning

Peer reviewed

Lee, Alwyn Vwen Yen; Luco, Andrés Carlos; Tan, Seng Chee – Educational Technology & Society, 2023

Although artificial intelligence (AI) is prevalent and impacts facets of daily life, there is limited research on responsible and humanistic design, implementation, and evaluation of AI, especially in the field of education. After all, learning is inherently a social endeavor involving human interactions, rendering the need for AI designs to be…

Descriptors: Essays, Scoring, Writing Evaluation, Computer Software
