Publication Date
| Range | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 2 |
| Since 2007 (last 20 years) | 11 |
Descriptor
| Term | Count |
| --- | --- |
| Scoring | 21 |
| Computer Software | 18 |
| Computer Assisted Testing | 10 |
| Essays | 6 |
| Foreign Countries | 6 |
| Writing Evaluation | 5 |
| Comparative Analysis | 4 |
| Computer Software Evaluation | 4 |
| Correlation | 4 |
| Essay Tests | 4 |
| Evaluation Methods | 4 |
Author
| Name | Count |
| --- | --- |
| O'Neil, Harold F., Jr. | 2 |
| Schacter, John | 2 |
| Abrash, Victor | 1 |
| Bratt, Harry | 1 |
| Chung, Gregory K. W. K. | 1 |
| Coniam, David | 1 |
| Drasgow, Fritz | 1 |
| Franco, Horacio | 1 |
| Garrigues, Mylene | 1 |
| Gentile, Claudia | 1 |
| Grimes, Douglas | 1 |
Publication Type
| Type | Count |
| --- | --- |
| Reports - Evaluative | 21 |
| Journal Articles | 16 |
| Numerical/Quantitative Data | 2 |
| Reports - Descriptive | 1 |
| Speeches/Meeting Papers | 1 |
Education Level
| Level | Count |
| --- | --- |
| Elementary Secondary Education | 2 |
| Secondary Education | 2 |
| Elementary Education | 1 |
| Grade 3 | 1 |
| Grade 4 | 1 |
| Grade 5 | 1 |
| Higher Education | 1 |
| Middle Schools | 1 |
Location
| Place | Count |
| --- | --- |
| California | 1 |
| Canada | 1 |
| Hong Kong | 1 |
| Japan | 1 |
| Malaysia | 1 |
| South Korea | 1 |
| Turkey | 1 |
Assessments and Surveys
| Assessment | Count |
| --- | --- |
| National Assessment of Educational Progress | 2 |
| Test of English as a Foreign Language | 2 |
Raykov, Tenko – Measurement: Interdisciplinary Research and Perspectives, 2023
This software review discusses Stata's capabilities for item response theory modeling. The commands for fitting the popular one-, two-, and three-parameter logistic models are discussed first, followed by the procedure for testing discrimination-parameter equality in the one-parameter model. The commands for fitting…
Descriptors: Item Response Theory, Models, Comparative Analysis, Item Analysis
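For reference, the three logistic models this review covers have the following standard textbook forms (standard IRT notation; the formulas are not quoted from the review itself):

```latex
% 1PL (Rasch-type): common discrimination a, item difficulty b_j
P(X_{ij} = 1 \mid \theta_i) = \frac{1}{1 + e^{-a(\theta_i - b_j)}}

% 2PL: item-specific discrimination a_j
P(X_{ij} = 1 \mid \theta_i) = \frac{1}{1 + e^{-a_j(\theta_i - b_j)}}

% 3PL: adds a lower asymptote ("guessing") parameter c_j
P(X_{ij} = 1 \mid \theta_i) = c_j + \frac{1 - c_j}{1 + e^{-a_j(\theta_i - b_j)}}
```

Testing discrimination equality in the 1PL amounts to asking whether the item-specific slopes a_j of the 2PL can be constrained to a single common a.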
Tahereh Firoozi; Okan Bulut; Mark J. Gierl – International Journal of Assessment Tools in Education, 2023
The proliferation of large language models represents a paradigm shift in automated essay scoring (AES), substantially improving accuracy and efficacy. This study presents an extensive examination of large language models, with particular emphasis on the transformative influence of transformer-based models, such as…
Descriptors: Turkish, Writing Evaluation, Essays, Accuracy
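The general recipe behind transformer-based AES is to put a regression head on a pretrained encoder and fine-tune it on human-scored essays. A minimal sketch of that recipe, not the study's own system; the model name and all details here are assumptions:

```python
# Sketch of transformer-based essay scoring: a pretrained encoder with a
# single-output regression head. Untrained here; it must be fine-tuned on
# human-scored essays before its outputs mean anything.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed model choice
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1, problem_type="regression"
)

inputs = tok("The essay text to score ...", return_tensors="pt", truncation=True)
predicted_score = model(**inputs).logits.item()  # a scalar score after fine-tuning
```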
McCurry, Doug – Assessing Writing, 2010
This article considers the claim that machine scoring of writing test responses agrees with human readers as much as humans agree with one another. Such claims about the reliability of machine scoring of writing are usually based on specific, constrained writing tasks, and there is reason to ask whether machine scoring of writing requires…
Descriptors: Writing Tests, Scoring, Interrater Reliability, Computer Assisted Testing
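Agreement claims of this kind are typically checked by computing the same agreement statistic for a human-human pair and a human-machine pair. A minimal sketch under assumptions (invented scores, and quadratic weighted kappa as the metric; the article is not committed to either):

```python
# Compare human-human vs. human-machine agreement on the same essays.
# Scores are made up for illustration; quadratic weighted kappa is one
# common AES agreement statistic, not necessarily McCurry's.
from sklearn.metrics import cohen_kappa_score

human_1 = [3, 4, 2, 5, 3, 4, 1, 3, 4, 2]  # first human rater
human_2 = [3, 3, 2, 5, 4, 4, 2, 3, 4, 2]  # second human rater
machine = [3, 4, 3, 4, 3, 4, 2, 3, 5, 2]  # machine scores for the same essays

print(f"human-human kappa:   {cohen_kappa_score(human_1, human_2, weights='quadratic'):.2f}")
print(f"human-machine kappa: {cohen_kappa_score(human_1, machine, weights='quadratic'):.2f}")
```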
Nehm, Ross H.; Haertig, Hendrik – Journal of Science Education and Technology, 2012
Our study examines the efficacy of Computer Assisted Scoring (CAS) of open-response text relative to expert human scoring within the complex domain of evolutionary biology. Specifically, we explored whether CAS can diagnose the explanatory elements (or Key Concepts) that comprise undergraduate students' explanatory models of natural selection with…
Descriptors: Evolution, Undergraduate Students, Interrater Reliability, Computers
Lee, Yong-Won; Gentile, Claudia; Kantor, Robert – Applied Linguistics, 2010
The main purpose of the study was to investigate the distinctness and reliability of analytic (or multi-trait) rating dimensions and their relationships to holistic scores and "e-rater"[R] essay feature variables in the context of the TOEFL[R] computer-based test (TOEFL CBT) writing assessment. Data analyzed in the study were holistic…
Descriptors: Writing Evaluation, Writing Tests, Scoring, Essays
Nakayama, Minoru; Yamamoto, Hiroh; Santiago, Rowena – Electronic Journal of e-Learning, 2010
e-Learning imposes some restrictions on how learning performance is assessed. Online testing usually takes the form of multiple-choice questions, with no essay-type assessment. Major reasons for employing multiple-choice tasks in e-learning include ease of implementation and ease of managing learners' responses. To address this…
Descriptors: Electronic Learning, Testing, Essay Tests, Online Courses
Lau, Paul Ngee Kiong; Lau, Sie Hoe; Hong, Kian Sam; Usop, Hasbee – Educational Technology & Society, 2011
The number right (NR) method, in which students pick one option as the answer, is the conventional method for scoring multiple-choice tests that is heavily criticized for encouraging students to guess and failing to credit partial knowledge. In addition, computer technology is increasingly used in classroom assessment. This paper investigates the…
Descriptors: Guessing (Tests), Multiple Choice Tests, Computers, Scoring
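A toy contrast between number-right scoring and one partial-credit alternative, elimination scoring, which credits a student for crossing out options known to be wrong. The snippet above does not specify the paper's own scheme, so the elimination rule below is an illustrative textbook variant, not the method studied:

```python
# Number-right (NR) vs. elimination scoring: NR gives all-or-nothing credit,
# while elimination scoring rewards partial knowledge. The penalty rule here
# (-(k-1) for eliminating the key) is one common variant, assumed for
# illustration only.

def score_number_right(chosen: str, key: str) -> int:
    """NR: 1 point if the single chosen option is the key, else 0."""
    return 1 if chosen == key else 0

def score_elimination(eliminated: set[str], options: set[str], key: str) -> int:
    """+1 per distractor correctly eliminated; -(k-1) if the key is eliminated."""
    k = len(options)
    if key in eliminated:
        return -(k - 1)
    return len(eliminated & (options - {key}))

options = {"A", "B", "C", "D"}
print(score_number_right("B", key="C"))                 # 0: no credit at all
print(score_elimination({"A", "D"}, options, key="C"))  # 2: partial knowledge credited
```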
Franco, Horacio; Bratt, Harry; Rossier, Romain; Rao Gadde, Venkata; Shriberg, Elizabeth; Abrash, Victor; Precoda, Kristin – Language Testing, 2010
SRI International's EduSpeak[R] system is a software development toolkit that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology. Automatic pronunciation scoring allows the computer to provide feedback on the overall quality of pronunciation and to point to…
Descriptors: Feedback (Response), Sentences, Oral Language, Predictor Variables
Grimes, Douglas; Warschauer, Mark – Journal of Technology, Learning, and Assessment, 2010
Automated writing evaluation (AWE) software uses artificial intelligence (AI) to score student essays and support revision. We studied how an AWE program called MY Access![R] was used in eight middle schools in Southern California over a three-year period. Although many teachers and students considered automated scoring unreliable, and teachers'…
Descriptors: Automation, Writing Evaluation, Essays, Artificial Intelligence
Kim, Seong-in; Hameed, Ibrahim A. – Art Therapy: Journal of the American Art Therapy Association, 2009
For mental health professionals, art assessment is a useful tool for patient evaluation and diagnosis. Consideration of various color-related elements is important in art assessment. This correlational study introduces the concept of variety of color as a new color-related element of an artwork. This term represents a comprehensive use of color,…
Descriptors: Mental Health Workers, Essays, Scoring, Visual Stimuli
Coniam, David – ReCALL, 2009
This paper describes a study of the computer essay-scoring program BETSY. While the use of computers to rate written scripts has been criticised in some quarters for lacking transparency and for fitting poorly with how human raters work, a number of essay rating programs are available commercially, many of which claim to offer comparable…
Descriptors: Writing Tests, Scoring, Foreign Countries, Interrater Reliability
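BETSY (the Bayesian Essay Test Scoring sYstem) is at heart a Bayesian text classifier that assigns essays to score bands. A toy sketch of that general approach, with invented training data and a multinomial naive Bayes over word counts, not BETSY's own features or implementation:

```python
# Bayesian text classification in the spirit of BETSY: treat score bands as
# classes and classify essays by their word counts. Training data is invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_essays = [
    "good clear argument with evidence",
    "strong thesis and well organised paragraphs",
    "weak unsupported claims",
    "poor organisation and little evidence",
]
train_bands = ["high", "high", "low", "low"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(train_essays, train_bands)
print(clf.predict(["clear thesis supported by evidence"]))  # likely ['high']
```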
Harasym, Peter H.; And Others – Journal of Educational Computing Research, 1993
Discusses the use of human markers for write-in questions, focusing on a study that assessed the feasibility of using a computer program to mark write-in responses for the Medical Council of Canada Qualifying Examination. The computer's performance was compared with that of physician markers.
Descriptors: Comparative Analysis, Computer Assisted Testing, Computer Software Development, Computer Software Evaluation
Kaplan, Randy M.; And Others – 1995
The increased use of constructed-response items, such as essays, creates a need for tools that score these responses automatically, in part or as a whole. This study explores one approach to analyzing essay-length natural language constructed responses. A decision model for scoring essays was developed and evaluated. The decision model uses…
Descriptors: Computer Software, Constructed Response, Essay Tests, Grammar
Drasgow, Fritz; And Others – Applied Psychological Measurement, 1995
This study examined how well current software implementations of four polytomous item response theory models fit several multiple-choice tests. The main conclusion is that fitting polytomous item response models to multiple-choice item responses is more complex than fitting the three-parameter logistic model to dichotomously scored responses.
Descriptors: Computer Software, Goodness of Fit, Item Response Theory, Models
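The abstract does not name the four models. As one example of the polytomous family (chosen here for illustration; it may or may not be among the four), Samejima's graded response model replaces the single item response curve with cumulative category curves:

```latex
% Samejima's graded response model for ordered categories k = 0, ..., m_j:
% cumulative probability of responding in category k or higher
P(X_{ij} \ge k \mid \theta_i) = \frac{1}{1 + e^{-a_j(\theta_i - b_{jk})}}

% probability of responding in exactly category k
P(X_{ij} = k \mid \theta_i) = P(X_{ij} \ge k \mid \theta_i) - P(X_{ij} \ge k + 1 \mid \theta_i)
```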
Page, Ellis Batten – Journal of Experimental Education, 1994
National Assessment of Educational Progress writing sample essays from 1988 and 1990 (495 and 599 essays, respectively) were subjected to computerized grading and human ratings. Cross-validation suggests that computer scoring is superior to a two-judge panel, an encouraging finding for large-scale essay evaluation programs.
Descriptors: Computer Assisted Testing, Computer Software, Essays, Evaluation Methods
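Page's approach regressed human ratings on machine-countable surface features of the text and judged the result by cross-validated agreement with human scores. A minimal sketch in that spirit; the features, essays, and scores below are invented placeholders, not Page's actual variables or NAEP data:

```python
# Regression-based essay scoring: predict human ratings from surface text
# features, evaluated by cross-validated machine-human correlation.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

def surface_features(essay: str) -> list[float]:
    """Crude machine-countable proxies (illustrative feature set)."""
    words = essay.split()
    return [
        float(len(words)),                        # essay length
        float(np.mean([len(w) for w in words])),  # mean word length
        float(essay.count(",")),                  # rough clause-density proxy
    ]

essays = [
    "Short reply.",
    "A longer answer, with one clause, and then another clause added on.",
    "Medium answer with a few plain words only.",
    "An extended, carefully qualified, multi-clause response, which continues.",
    "Tiny note.",
    "Another moderately long response, reasonably detailed, with commas.",
]
human_scores = np.array([1.0, 4.0, 2.0, 5.0, 1.0, 4.0])  # invented ratings

X = np.array([surface_features(e) for e in essays])
preds = cross_val_predict(LinearRegression(), X, human_scores, cv=3)
print(np.corrcoef(preds, human_scores)[0, 1])  # cross-validated correlation
```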
