ERIC - Search Results

Showing 1 to 15 of 17 results

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

The Use of Semantic Similarity Tools in Automated Content Scoring of Fact-Based Essays Written by EFL Learners

Peer reviewed

Wang, Qiao – Education and Information Technologies, 2022

This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring

A Human-Centric Automated Essay Scoring and Feedback System for the Development of Ethical Reasoning

Peer reviewed

Lee, Alwyn Vwen Yen; Luco, Andrés Carlos; Tan, Seng Chee – Educational Technology & Society, 2023

Although artificial intelligence (AI) is prevalent and impacts facets of daily life, there is limited research on responsible and humanistic design, implementation, and evaluation of AI, especially in the field of education. After all, learning is inherently a social endeavor involving human interactions, rendering the need for AI designs to be…

Descriptors: Essays, Scoring, Writing Evaluation, Computer Software

Comparison of Automatic and Expert Teachers' Rating of Computerized English Listening-Speaking Test

Peer reviewed

Linlin, Cao – English Language Teaching, 2020

Through Many-Facet Rasch analysis, this study explores the rating differences between 1 computer automatic rater and 5 expert teacher raters on scoring 119 students in a computerized English listening-speaking test. Results indicate that both the automatic rater and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…

Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning

The Impact of Rater Variability on Relationships among Different Effect-Size Indices for Inter-Rater Agreement between Human and Automated Essay Scoring

Yun, Jiyeo – ProQuest LLC, 2017

Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…

Descriptors: Interrater Reliability, Essays, Scoring, Evaluators

Human vs. Computer Diagnosis of Students' Natural Selection Knowledge: Testing the Efficacy of Text Analytic Software

Peer reviewed

Nehm, Ross H.; Haertig, Hendrik – Journal of Science Education and Technology, 2012

Our study examines the efficacy of Computer Assisted Scoring (CAS) of open-response text relative to expert human scoring within the complex domain of evolutionary biology. Specifically, we explored whether CAS can diagnose the explanatory elements (or Key Concepts) that comprise undergraduate students' explanatory models of natural selection with…

Descriptors: Evolution, Undergraduate Students, Interrater Reliability, Computers

What Are They Thinking? Automated Analysis of Student Writing about Acid-Base Chemistry in Introductory Biology

Peer reviewed

Haudek, Kevin C.; Prevost, Luanna B.; Moscarella, Rosa A.; Merrill, John; Urban-Lurain, Mark – CBE - Life Sciences Education, 2012

Students' writing can provide better insight into their thinking than can multiple-choice questions. However, resource constraints often prevent faculty from using writing assessments in large undergraduate science courses. We investigated the use of computer software to analyze student writing and to uncover student ideas about chemistry in an…

Descriptors: Chemistry, Biology, Introductory Courses, Science Instruction

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

Can Machine Scoring Deal with Broad and Open Writing Tests as Well as Human Readers?

Peer reviewed

McCurry, Doug – Assessing Writing, 2010

This article considers the claim that machine scoring of writing test responses agrees with human readers as much as humans agree with other humans. These claims about the reliability of machine scoring of writing are usually based on specific and constrained writing tasks, and there is reason for asking whether machine scoring of writing requires…

Descriptors: Writing Tests, Scoring, Interrater Reliability, Computer Assisted Testing

Factors that Influence Fast Mapping in Children Exposed to Spanish and English

Peer reviewed

Alt, Mary; Meyers, Christina; Figueroa, Cecilia – Journal of Speech, Language, and Hearing Research, 2013

Purpose: The purpose of this study was to determine whether children exposed to 2 languages would benefit from the phonotactic probability cues of a single language in the same way as monolingual peers and to determine whether crosslinguistic influence would be present in a fast-mapping task. Method: Two groups of typically developing children…

Descriptors: Regression (Statistics), Spanish, Cues, Task Analysis

Experimenting with a Computer Essay-Scoring Program Based on ESL Student Writing Scripts

Peer reviewed

Coniam, David – ReCALL, 2009

This paper describes a study of the computer essay-scoring program BETSY. While the use of computers in rating written scripts has been criticised in some quarters for lacking transparency or lack of fit with how human raters rate written scripts, a number of essay rating programs are available commercially, many of which claim to offer comparable…

Descriptors: Writing Tests, Scoring, Foreign Countries, Interrater Reliability

Computer Grading of Student Prose, Using Modern Concepts and Software.

Peer reviewed

Page, Ellis Batten – Journal of Experimental Education, 1994

National Assessment of Educational Progress writing sample essays from 1988 and 1990 (495 and 599 essays) were subjected to computerized grading and human ratings. Cross-validation suggests that computer scoring is superior to a two-judge panel, a finding encouraging for large programs of essay evaluation. (SLD)

Descriptors: Computer Assisted Testing, Computer Software, Essays, Evaluation Methods

Toward an Understanding of the Role of Speech Recognition in Nonnative Speech Assessment. TOEFL iBT Research Report. TOEFL iBT-02. ETS RR-07-02

Peer reviewed

Zechner, Klaus; Bejar, Isaac I.; Hemat, Ramin – ETS Research Report Series, 2007

The increasing availability and performance of computer-based testing has prompted more research on the automatic assessment of language and speaking proficiency. In this investigation, we evaluated the feasibility of using an off-the-shelf speech-recognition system for scoring speaking prompts from the LanguEdge field test of 2002. We first…

Descriptors: Role, Computer Assisted Testing, Language Proficiency, Oral Language

Computer-Assisted Portfolio Scoring: Can Technology Enhance the Process of Scoring Portfolios?

Solano-Flores, Guillermo; Raymond, Bruce; Schneider, Steven A. – 1997

The need for effective ways of monitoring the quality of scoring of portfolios resulted in the development of a software package that provides scoring leaders with updated information on their assessors' scoring quality. Assessors with computers enter data as they score, and this information is analyzed and reported to scoring leaders. The…

Descriptors: Art Teachers, Computer Assisted Testing, Computer Software, Computer Software Evaluation

A Computer-Based Approach for Deriving and Measuring Individual and Team Knowledge Structure from Essay Questions

Peer reviewed

Clariana, Roy B.; Wallace, Patricia – Journal of Educational Computing Research, 2007

This proof-of-concept investigation describes a computer-based approach for deriving the knowledge structure of individuals and of groups from their written essays, and considers the convergent criterion-related validity of the computer-based scores relative to human rater essay scores and multiple-choice test scores. After completing a…

Descriptors: Computer Assisted Testing, Multiple Choice Tests, Construct Validity, Cognitive Structures
