| Publication Date | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 3 |
| Since 2017 (last 10 years) | 7 |
| Since 2007 (last 20 years) | 13 |
| Descriptor | Records |
| --- | --- |
| Computer Assisted Testing | 20 |
| Interrater Reliability | 20 |
| Writing Evaluation | 20 |
| Scoring | 13 |
| Essays | 11 |
| English (Second Language) | 9 |
| Second Language Learning | 8 |
| Automation | 6 |
| Computer Software | 6 |
| Foreign Countries | 6 |
| Evaluators | 5 |
| Publication Type | Records |
| --- | --- |
| Journal Articles | 16 |
| Reports - Research | 11 |
| Reports - Evaluative | 6 |
| Speeches/Meeting Papers | 3 |
| Books | 1 |
| Collected Works - General | 1 |
| Information Analyses | 1 |
| Reports - Descriptive | 1 |
| Tests/Questionnaires | 1 |
| Education Level | Records |
| --- | --- |
| Higher Education | 7 |
| Postsecondary Education | 4 |
| Elementary Secondary Education | 2 |
| Elementary Education | 1 |
| Grade 11 | 1 |
| Grade 8 | 1 |
| High Schools | 1 |
| Middle Schools | 1 |
| Secondary Education | 1 |
| Audience | Records |
| --- | --- |
| Researchers | 2 |
| Practitioners | 1 |
| Teachers | 1 |
| Location | Records |
| --- | --- |
| China | 1 |
| Germany | 1 |
| Hong Kong | 1 |
| South Korea | 1 |
| Switzerland | 1 |
| Taiwan | 1 |
| Texas | 1 |
| Turkey | 1 |
| Assessments and Surveys | Records |
| --- | --- |
| National Assessment of… | 2 |
| Test of English as a Foreign… | 2 |
| Graduate Management Admission… | 1 |
| Graduate Record Examinations | 1 |
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In the existing research work on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
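The human-automated score agreement that Doewes et al. treat as the accuracy benchmark is most commonly quantified with quadratic weighted kappa (QWK). A minimal plain-Python sketch; the score vectors and the 1-4 scale below are hypothetical, for illustration only:

```python
def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """QWK between two integer score vectors on [min_score, max_score]."""
    n_cats = max_score - min_score + 1
    n = len(human)
    # Observed joint distribution of (human, machine) score pairs
    obs = [[0.0] * n_cats for _ in range(n_cats)]
    for h, m in zip(human, machine):
        obs[h - min_score][m - min_score] += 1
    # Marginal histograms for each rater
    hist_h = [sum(row) for row in obs]
    hist_m = [sum(obs[i][j] for i in range(n_cats)) for j in range(n_cats)]
    num = 0.0  # observed weighted disagreement
    den = 0.0  # chance-expected weighted disagreement
    for i in range(n_cats):
        for j in range(n_cats):
            w = ((i - j) ** 2) / ((n_cats - 1) ** 2)
            num += w * obs[i][j] / n
            den += w * hist_h[i] * hist_m[j] / (n * n)
    return 1.0 - num / den

# Hypothetical essay scores from a human rater and an AES tool
human = [1, 2, 3, 4, 4, 2]
machine = [1, 2, 3, 3, 4, 3]
print(round(quadratic_weighted_kappa(human, machine, 1, 4), 3))
```

QWK penalizes disagreements by the square of their distance, so a machine score one point off a human score costs far less than one three points off; 1.0 indicates perfect agreement and 0 chance-level agreement.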
On-Soon Lee – Journal of Pan-Pacific Association of Applied Linguistics, 2024
Despite the increasing interest in using AI tools as assistant agents in instructional settings, the effectiveness of ChatGPT, the generative pretrained AI, for evaluating the accuracy of second language (L2) writing has been largely unexplored in formative assessment. Therefore, the current study aims to examine how ChatGPT, as an evaluator,…
Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning
Jiyeo Yun – English Teaching, 2023
Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…
Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring
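The inter-rater agreement and discrepancy indices meta-analyzed in studies like this one typically include exact agreement, adjacent agreement (scores within one point), and the mean score difference between raters. A small sketch assuming integer ratings; the data are invented for illustration:

```python
def agreement_indices(rater_a, rater_b, tolerance=1):
    """Exact agreement rate, adjacent agreement rate, and mean discrepancy."""
    n = len(rater_a)
    # Proportion of identical scores
    exact = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Proportion of scores within `tolerance` points of each other
    adjacent = sum(abs(a - b) <= tolerance for a, b in zip(rater_a, rater_b)) / n
    # Signed mean difference: positive means rater_a scores higher on average
    mean_diff = sum(a - b for a, b in zip(rater_a, rater_b)) / n
    return exact, adjacent, mean_diff

# Hypothetical human and machine scores for five essays
human = [4, 3, 5, 2, 4]
machine = [4, 4, 5, 2, 3]
exact, adjacent, mean_diff = agreement_indices(human, machine)
```

Exact agreement is the strictest index; adjacent agreement is the criterion commonly used operationally to decide whether a human-machine score pair needs adjudication by a second human.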
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
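The "training" step Wind et al. describe fits a scoring model to human ratings; in the simplest case this is a regression from essay features onto human scores. A toy one-feature least-squares sketch, with word counts and ratings invented for illustration (operational AESEs use many linguistic features and validate against held-out human ratings):

```python
def fit_linear_scorer(features, scores):
    """Ordinary least squares fit of score = a * feature + b."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(scores) / n
    # Slope from covariance over variance, intercept from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, scores))
    var = sum((x - mean_x) ** 2 for x in features)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Hypothetical training data: essay word counts and human ratings on a 1-6 scale
word_counts = [120, 250, 310, 400, 520]
human_scores = [2, 3, 4, 4, 6]
a, b = fit_linear_scorer(word_counts, human_scores)
# Machine scores are the fitted values rounded back to the rating scale
predicted = [round(a * x + b) for x in word_counts]
```

The key idea this illustrates is the one the abstract raises: the engine can only be as good as the human ratings it is trained on, since those ratings define the target the regression reproduces.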
He, Tung-hsien – SAGE Open, 2019
This study employed a mixed-design approach and the Many-Facet Rasch Measurement (MFRM) framework to investigate whether rater bias occurred between the onscreen scoring (OSS) mode and the paper-based scoring (PBS) mode. Nine human raters analytically marked scanned scripts and paper scripts using a six-category (i.e., six-criterion) rating…
Descriptors: Computer Assisted Testing, Scoring, Item Response Theory, Essays
Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019
In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…
Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests
Razi, Salim – SAGE Open, 2015
Similarity reports of plagiarism detectors should be approached with caution as they may not be sufficient to support allegations of plagiarism. This study developed a 50-item rubric to simplify and standardize evaluation of academic papers. In the spring semester of 2011-2012 academic year, 161 freshmen's papers at the English Language Teaching…
Descriptors: Foreign Countries, Scoring Rubrics, Writing Evaluation, Writing (Composition)
Liu, Ming; Li, Yi; Xu, Weiwei; Liu, Li – IEEE Transactions on Learning Technologies, 2017
Writing an essay is an important skill for students to master, but a difficult one, particularly for English as a Second Language (ESL) students in China. Students would benefit greatly from timely and effective feedback on their writing. Automatic essay feedback generation is a challenging task,…
Descriptors: Foreign Countries, College Students, Second Language Learning, English (Second Language)
McCurry, Doug – Assessing Writing, 2010
This article considers the claim that machine scoring of writing test responses agrees with human readers as much as humans agree with other humans. These claims about the reliability of machine scoring of writing are usually based on specific and constrained writing tasks, and there is reason for asking whether machine scoring of writing requires…
Descriptors: Writing Tests, Scoring, Interrater Reliability, Computer Assisted Testing
Ferster, Bill; Hammond, Thomas C.; Alexander, R. Curby; Lyman, Hunt – Journal of Interactive Learning Research, 2012
The hurried pace of the modern classroom does not permit formative feedback on writing assignments at the frequency or quality recommended by the research literature. One solution for increasing individual feedback to students is to incorporate some form of computer-generated assessment. This study explores the use of automated assessment of…
Descriptors: Feedback (Response), Scripts, Formative Evaluation, Essays
Coniam, David – Educational Research and Evaluation, 2009
This paper describes a study comparing paper-based marking (PBM) and onscreen marking (OSM) in Hong Kong utilising English language essay scripts drawn from the live 2007 Hong Kong Certificate of Education Examination (HKCEE) Year 11 English Language Writing Paper. In the study, 30 raters from the 2007 HKCEE Writing Paper marked on paper 100…
Descriptors: Student Attitudes, Foreign Countries, Essays, Comparative Analysis
Page, Ellis Batten – Journal of Experimental Education, 1994 (peer reviewed)
National Assessment of Educational Progress writing sample essays from 1988 and 1990 (495 and 599 essays) were subjected to computerized grading and human ratings. Cross-validation suggests that computer scoring is superior to a two-judge panel, a finding encouraging for large programs of essay evaluation. (SLD)
Descriptors: Computer Assisted Testing, Computer Software, Essays, Evaluation Methods
Wang, Jinhao; Brown, Michelle Stallone – Journal of Technology, Learning, and Assessment, 2007
The current research was conducted to investigate the validity of automated essay scoring (AES) by comparing group mean scores assigned by an AES tool, IntelliMetric [TM] and human raters. Data collection included administering the Texas version of the WriterPlacer "Plus" test and obtaining scores assigned by IntelliMetric [TM] and by…
Descriptors: Test Scoring Machines, Scoring, Comparative Testing, Intermode Differences
Merrill, Beverly; Peterson, Sarah – 1986
When the Mesa, Arizona Public Schools initiated an ambitious writing instruction program in 1978, two assessments based on student writing samples were developed. The first is based on a ninth grade proficiency test. If the student does not pass the test, high school remediation is provided. After 1987, students must pass this test in order to…
Descriptors: Computer Assisted Testing, Elementary Secondary Education, Graduation Requirements, Holistic Evaluation
Ben-Simon, Anat; Bennett, Randy Elliott – Journal of Technology, Learning, and Assessment, 2007
This study evaluated a "substantively driven" method for scoring NAEP writing assessments automatically. The study used variations of an existing commercial program, e-rater[R], to compare the performance of three approaches to automated essay scoring: a "brute-empirical" approach in which variables are selected and weighted solely according to…
Descriptors: Writing Evaluation, Writing Tests, Scoring, Essays
