ERIC - Search Results

Publication Date

In 2026	0
Since 2025	7
Since 2022 (last 5 years)	19
Since 2017 (last 10 years)	60
Since 2007 (last 20 years)	106

Descriptor

Interrater Reliability	170
Language Tests	170
Second Language Learning	91
English (Second Language)	87
Foreign Countries	69
Language Proficiency	57
Scoring	51
Second Language Instruction	47
Oral Language	45
Evaluators	41
Test Validity	40
Scores	34
Rating Scales	32
Test Reliability	32
Comparative Analysis	31
Testing	28
Correlation	24
Interviews	21
Statistical Analysis	19
Computer Assisted Testing	18
Evaluation Methods	18
Test Items	16
Writing Evaluation	16
College Students	15
Speech Communication	15

Publication Type

Reports - Research	129
Journal Articles	126
Speeches/Meeting Papers	22
Tests/Questionnaires	20
Reports - Evaluative	19
Reports - Descriptive	9
Information Analyses	6
Dissertations/Theses -…	4
Guides - Non-Classroom	2
Opinion Papers	2
Books	1
Collected Works - General	1
Collected Works - Serials	1
ERIC Digests in Full Text	1
ERIC Publications	1
Numerical/Quantitative Data	1
Reports - General	1

Education Level

Higher Education	30
Postsecondary Education	25
Elementary Education	9
Secondary Education	6
Early Childhood Education	5
Adult Education	4
Primary Education	4
Intermediate Grades	3
Grade 2	2
Grade 6	2
High Schools	2
Kindergarten	2
Preschool Education	2
Grade 1	1
Grade 11	1
Grade 4	1

Audience

Practitioners	3
Researchers	1
Teachers	1

Location

Iran	9
China	7
Japan	5
India	4
Netherlands	4
Sweden	4
Australia	3
Canada	3
Hong Kong	3
Turkey	3
Arizona	2
Germany	2
South Korea	2
Switzerland	2
Taiwan	2
California	1
China (Beijing)	1
Colombia	1
Cyprus	1
Denmark	1
Europe	1
Finland	1
France	1
Iran (Tehran)	1
Ireland (Dublin)	1

Assessments and Surveys

Test of English as a Foreign…	27
International English…	7
ACTFL Oral Proficiency…	4
Peabody Picture Vocabulary…	3
Test of English for…	2
Alabama High School…	1
Clinical Evaluation of…	1
Graduate Record Examinations	1
Illinois Test of…	1
Modern Language Aptitude Test	1
National Assessment of…	1
Raven Progressive Matrices	1
Strengths and Difficulties…	1
Test of Language Development	1
Wechsler Adult Intelligence…	1
Woodcock Reading Mastery Test	1

Showing 1 to 15 of 170 results

Developing an Automatic Pronunciation Scorer: Aligning Speech Evaluation Models and Applied Linguistics Constructs

Peer reviewed

Danwei Cai; Ben Naismith; Maria Kostromitina; Zhongwei Teng; Kevin P. Yancey; Geoffrey T. LaFlair – Language Learning, 2025

Globalization and increases in the numbers of English language learners have led to a growing demand for English proficiency assessments of spoken language. In this paper, we describe the development of an automatic pronunciation scorer built on state-of-the-art deep neural network models. The model is trained on a bespoke human-rated dataset that…

Descriptors: Automation, Scoring, Pronunciation, Speech Tests

Comparison of Traditional Machine Learning and Neural Network Approaches for Automated Scoring of Second Language English Essays

Peer reviewed

Erik Voss – Language Testing, 2025

An increasing number of language testing companies are developing and deploying deep learning-based automated essay scoring systems (AES) to replace traditional approaches that rely on handcrafted feature extraction. However, there is hesitation to accept neural network approaches to automated essay scoring because the features are automatically…

Descriptors: Artificial Intelligence, Automation, Scoring, English (Second Language)

Automated Essay Scoring with GPT-4 for a Local Placement Test: Investigating Prompting Strategies, Intra-Rater Reliability, and Alignment with Human Scores

Peer reviewed

Yoonseo Kim – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2025

This study explores the potential of OpenAI's ChatGPT-4 (gpt-4-0613) as an automated essay scoring (AES) tool in a trial involving 300 essays from an American university's academic English program placement test. Three prompting strategies (minimal/detailed rubric, require/not require rationale, and with/without scoring examples) were tested for…

Descriptors: Automation, Scoring, Artificial Intelligence, Placement Tests

Communal Factors in Rater Severity and Consistency over Time in High-Stakes Oral Assessment

Peer reviewed

Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024

This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated…

Descriptors: Foreign Countries, Interrater Reliability, Evaluators, Item Response Theory

Artificial Intelligence in International English Language Testing System Writing Assessments: A Comparative Study of Human Ratings and DeepAI

Peer reviewed

Somayeh Fathali; Fatemeh Mohajeri – Technology in Language Teaching & Learning, 2025

The International English Language Testing System (IELTS) is a high-stakes exam where Writing Task 2 significantly influences the overall scores, requiring reliable evaluation. While trained human raters perform this task, concerns about subjectivity and inconsistency have led to growing interest in artificial intelligence (AI)-based assessment…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Artificial Intelligence

Investigating the Effect of Classroom-Based Feedback on Speaking Assessment: A Multifaceted Rasch Analysis

Peer reviewed

Bijani, Houman; Hashempour, Bahareh; Ibrahim, Khaled Ahmed Abdel-Al; Orabah, Salim Said Bani; Heydarnejad, Tahereh – Language Testing in Asia, 2022

Due to subjectivity in oral assessment, much concentration has been put on obtaining a satisfactory measure of consistency among raters. However, the process for obtaining more consistency might not result in valid decisions. One matter that is at the core of both reliability and validity in oral assessment is rater training. Recently,…

Descriptors: Oral Language, Language Tests, Feedback (Response), Bias

The Vague Language Use Scale: Clinical Utility and Psychometrics from Adults with Traumatic Brain Injury

Peer reviewed

Kathryn J. Greenslade; Julia K. Bushell; Emily F. Dillon; Amy E. Ramage – International Journal of Language & Communication Disorders, 2025

Background: Pragmatic communication difficulties encompass many distinct behaviours, including the use of vague and/or insufficient language, a common characteristic following traumatic brain injury (TBI) that negatively impacts psychosocial outcomes. Existing assessments evaluate pragmatic communication broadly, often with only one or two items…

Descriptors: Neurological Impairments, Head Injuries, Language Impairments, Language Tests

The Insufficiency of Norm-Referenced Writing Assessment for Identifying Writing Weaknesses in Children Who Are Deaf and Hard of Hearing

Peer reviewed

Brittany Grey; Marren C. Brooks; Emily A. Lund; Krystal L. Werfel – Language, Speech, and Hearing Services in Schools, 2025

Purpose: This study examined the internal consistency reliability, interrater reliability, and concurrent validity of the norm-referenced Test of Early Written Language--Third Edition (TEWL-3) to determine if it is an appropriate measure to use when determining if elementary children who are deaf and hard of hearing (DHH) meet grade-level writing…

Descriptors: Hard of Hearing, Sensory Aids, Writing Improvement, Writing Instruction

When Do Pragmatic Abilities Peak? Assessment of Pragmatic Abilities and Cognitive Substrates--French Version Psychometric Properties across the Lifespan

Peer reviewed

Nicolas Petit; Flavia Mengarelli; Marie-Maude Geoffray Cassar; Giorgio Arcara; Valentina Bambini – Journal of Speech, Language, and Hearing Research, 2025

Purpose: This study aims (a) to assess the psychometric properties of a French adaptation of the Assessment of Pragmatic Abilities and Cognitive Substrates (APACS-Fr), a comprehensive test of pragmatic abilities for French-speaking adolescents and adults, and (b) to use it to study lifespan variations in pragmatic abilities, to determine when…

Descriptors: Pragmatics, Cognitive Ability, Language Skills, Cognitive Measurement

The Rashomon Effect: Which Features of a Speaker's Talk Do Listeners Notice?

Peer reviewed

Seedhouse, Paul; Satar, Müge – Classroom Discourse, 2023

The same L2 speaking performance may be analysed and evaluated in very different ways by different teachers or raters. We present a new, technology-assisted research design which opens up to investigation the trajectories of convergence and divergence between raters. We tracked and recorded what different raters noticed when, whilst grading a…

Descriptors: Language Tests, English (Second Language), Second Language Learning, Oral Language

Examining Consistency among Different Rubrics for Assessing Writing

Peer reviewed

Shabani, Enayat A.; Panahi, Jaleh – Language Testing in Asia, 2020

The literature on using scoring rubrics in writing assessment denotes the significance of rubrics as practical and useful means to assess the quality of writing tasks. This study tries to investigate the agreement among rubrics endorsed and used for assessing the essay writing tasks by the internationally recognized tests of English language…

Descriptors: Writing Evaluation, Scoring Rubrics, Scores, Interrater Reliability

A Rasch Analysis of Rater Behaviour in Speaking Assessment

Peer reviewed

Polat, Murat – International Online Journal of Education and Teaching, 2020

The assessment of speaking skills in foreign language testing has always had some pros (testing learners' speaking skills doubles the validity of any language test) and cons (many test-relevant/irrelevant variables interfere) since it is a multi-dimensional process. In the meantime, exploring grader behaviours while scoring learners' speaking…

Descriptors: Item Response Theory, Interrater Reliability, Speech Skills, Second Language Learning

Validation of Rating Processes within an Argument-Based Framework

Peer reviewed

Knoch, Ute; Chapelle, Carol A. – Language Testing, 2018

Argument-based validation requires test developers and researchers to specify what is entailed in test interpretation and use. Doing so has been shown to yield advantages (Chapelle, Enright, & Jamieson, 2010), but it also requires an analysis of how the concerns of language testers can be conceptualized in the terms used to construct a…

Descriptors: Test Validity, Language Tests, Evaluation Research, Rating Scales

Measurement Properties of a Standardized Elicited Imitation Test: An Integrative Data Analysis

Peer reviewed

Isbell, Daniel R.; Son, Young-A – Studies in Second Language Acquisition, 2022

Elicited Imitation Tests (EITs) are commonly used in second language acquisition (SLA)/bilingualism research contexts to assess the general oral proficiency of study participants. While previous studies have provided valuable EIT construct-related validity evidence, some key gaps remain. This study uses an integrative data analysis to further…

Descriptors: Bilingualism, Imitation, Language Tests, Second Language Learning

Applying Generalizability Theory in Language Testing: Comparing Nested and Crossed Scoring Designs in the Assessment of Speaking Skills

Peer reviewed

Polat, Murat; Turhan, Nihan Sölpük – International Journal of Curriculum and Instruction, 2021

Scoring language learners' speaking skills is open to a number of measurement errors since raters' personal judgements could involve in the process. Different grading designs in which raters score a student's whole speaking skills or a specific dimension of the speaking performance could be settled to control and minimize the amount of the error…

Descriptors: Language Tests, Scoring, Speech Communication, State Universities

Source

Language Testing	23
Language Assessment Quarterly	9
ETS Research Report Series	8
English Language Teaching	5
International Journal of…	5
Journal of Speech, Language,…	3
Language, Speech, and Hearing…	3
ProQuest LLC	3
Canadian Modern Language…	2
Clinical Linguistics &…	2
Cogent Education	2
ELT Journal	2
Educational Testing Service	2
Educational and Psychological…	2
Foreign Language Annals	2
International Journal of…	2
JALT CALL Journal	2
Journal of Communication…	2
Language Education &…	2
Language Learning Journal	2
Language Testing in Asia	2
Studies in Second Language…	2
System	2
Advances in Language and…	1
Annual Review of Applied…	1

Author

Nakamura, Yuji	3
Ahmadi, Alireza	2
Anna-Maria Fall	2
Barnwell, David	2
Bejar, Isaac I.	2
Beula M. Magimairaj	2
Bijani, Houman	2
Coniam, David	2
Davis, Larry	2
Elder, Catherine	2
Grant, Leslie	2
Greg Roberts	2
Henning, Grant	2
Hsieh, Mingchuan	2
Knoch, Ute	2
Magnan, Sally Sieloff	2
Mollaun, Pam	2
Philip Capin	2
Polat, Murat	2
Ronald B. Gillam	2
Sandra L. Gillam	2
Sharon Vaughn	2
Stansfield, Charles W.	2
Wigglesworth, Gillian	2