ERIC - Search Results

Publication Date

In 2026	0
Since 2025	4
Since 2022 (last 5 years)	22
Since 2017 (last 10 years)	38
Since 2007 (last 20 years)	72

Source

Language Testing

Publication Type

Journal Articles	72
Reports - Research	58
Reports - Evaluative	12
Tests/Questionnaires	7
Information Analyses	1
Reports - Descriptive	1

Education Level

Higher Education	20
Postsecondary Education	18
Elementary Education	11
Secondary Education	6
Early Childhood Education	2
Elementary Secondary Education	2
Grade 6	2
Intermediate Grades	2
Primary Education	2
Grade 1	1
Grade 5	1
High Schools	1
Kindergarten	1
Middle Schools	1

Location

Australia	5
China	5
Japan	5
South Korea	3
Finland	2
Germany	2
Hawaii	2
Hong Kong	2
Netherlands	2
Bulgaria	1
California (Los Angeles)	1
Canada	1
Colombia	1
Georgia	1
Indonesia	1
Iran	1
Iran (Tehran)	1
Japan (Tokyo)	1
Kenya	1
Poland	1
Russia	1
Sweden	1
Switzerland	1
Taiwan	1
Thailand	1

Assessments and Surveys

Test of English as a Foreign…	4
International English…	3
Kuder Occupational Interest…	1
Peabody Picture Vocabulary…	1
Test of English for…	1

Showing 1 to 15 of 72 results

Communal Factors in Rater Severity and Consistency over Time in High-Stakes Oral Assessment

Peer reviewed

Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024

This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated…

Descriptors: Foreign Countries, Interrater Reliability, Evaluators, Item Response Theory

A Systematic Review of Differential Item Functioning in Second Language Assessment

Peer reviewed

Xueliang Chen; Vahid Aryadoust; Wenxin Zhang – Language Testing, 2025

The growing diversity among test takers in second or foreign language (L2) assessments makes the importance of fairness front and center. This systematic review aimed to examine how fairness in L2 assessments was evaluated through differential item functioning (DIF) analysis. A total of 83 articles from 27 journals were included in a systematic…

Descriptors: Second Language Learning, Language Tests, Test Items, Item Analysis

Examining the Consistency of Instructor versus Large Language Model Ratings on Summary Content: Toward Checklist-Based Feedback Provision with Second Language Writers

Peer reviewed

Yasuyo Sawaki; Yutaka Ishii; Hiroaki Yamada; Takenobu Tokunaga – Language Testing, 2025

This study examined the consistency between instructor ratings of learner-generated summaries and those estimated by a large language model (LLM) on summary content checklist items designed for undergraduate second language (L2) writing instruction in Japan. The effects of the LLM prompt design on the consistency between the two were also explored…

Descriptors: Interrater Reliability, Writing Teachers, College Faculty, Artificial Intelligence

The Development of a Chinese Vocabulary Proficiency Test (CVPT) for Learners of Chinese as a Second/Foreign Language

Peer reviewed

Haiwei Zhang; Peng Sun; Yaowaluk Bianglae; Winda Widiawati – Language Testing, 2024

In order to address the needs of the continually growing number of Chinese language learners, the present study developed and presented initial validation of a 100-item Chinese vocabulary proficiency test (CVPT) for learners of Chinese as a second/foreign language (CS/FL) using Item Response Theory among 170 CS/FL learners from Indonesia and 354…

Descriptors: Test Construction, Vocabulary, Language Proficiency, Language Tests

The Effect of Viewing Visual Cues in a Listening Comprehension Test on Second Language Learners' Test-Taking Process and Performance: An Eye-Tracking Study

Peer reviewed

Suh Keong Kwon; Guoxing Yu – Language Testing, 2024

In this study, we examined the effect of visual cues in a second language listening test on test takers' viewing behaviours and their test performance. Fifty-seven learners of English in Korea took a video-based listening test, with their eye movements recorded, and 23 of them were interviewed individually after the test. The participants viewed…

Descriptors: Foreign Countries, English (Second Language), Second Language Learning, Eye Movements

Triangulating Natural Language Processing (NLP)-Based Analysis of Rater Comments and Many-Facet Rasch Measurement (MFRM): An Innovative Approach to Investigating Raters' Application of Rating Scales in Writing Assessment

Peer reviewed

Huiying Cai; Xun Yan – Language Testing, 2024

Rater comments tend to be qualitatively analyzed to indicate raters' application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The…

Descriptors: Natural Language Processing, Item Response Theory, Rating Scales, Writing Evaluation

Feedforwarding Diagnostic Language Assessment: Artificial Intelligence- (AI-) Driven Weakness Identification and Contextualised Feedback for Second Language Speaking

Peer reviewed

Shungo Suzuki; Hiroaki Takatsu; Ryuki Matsuura; Miina Koyama; Mao Saeki; Yoichi Matsuyama – Language Testing, 2025

The current study proposes a new approach to weakness identification in diagnostic language assessment (DLA) for speaking skills. We also propose to design actionable and contextualised diagnostic feedback through the systematic integration of feedback and remedial learning activities. Focusing on lexical use in second language speaking, the…

Descriptors: English (Second Language), Speech Skills, Artificial Intelligence, Second Language Learning

Revisiting Raters' Accent Familiarity in Speaking Tests: Evidence That Presentation Mode Interacts with Accent Familiarity to Variably Affect Comprehensibility Ratings

Peer reviewed

Michael D. Carey; Stefan Szocs – Language Testing, 2024

This controlled experimental study investigated the interaction of variables associated with rating the pronunciation component of high-stakes English-language-speaking tests such as IELTS and TOEFL iBT. One hundred experienced raters who were all either familiar or unfamiliar with Brazilian-accented English or Papua New Guinean Tok Pisin-accented…

Descriptors: Dialects, Pronunciation, Suprasegmentals, Familiarity

Modeling Local Item Dependence in C-Tests with the Loglinear Rasch Model

Peer reviewed

Baghaei, Purya; Christensen, Karl Bang – Language Testing, 2023

C-tests are gap-filling tests mainly used as rough and economical measures of second-language proficiency for placement and research purposes. A C-test usually consists of several short independent passages where the second half of every other word is deleted. Owing to their interdependent structure, C-test items violate the local independence…

Descriptors: Item Response Theory, Language Tests, Language Proficiency, Second Language Learning

Making Each Point Count: Revising a Local Adaptation of the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE Rubric

Peer reviewed

Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024

In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…

Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)

Construct Validity and Fairness of an Operational Listening Test with World Englishes

Peer reviewed

Nishizawa, Hitoshi – Language Testing, 2023

In this study, I investigate the construct validity and fairness pertaining to the use of a variety of Englishes in listening test input. I obtained data from a post-entry English language placement test administered at a public university in the United States. In addition to expectedly familiar American English, the test features Hawai'i,…

Descriptors: Construct Validity, Listening Comprehension Tests, Language Tests, English (Second Language)

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

A Context-Aligned Two Thousand Test: Toward Estimating High-Frequency French Vocabulary Knowledge for Beginner-to-Low Intermediate Proficiency Adolescent Learners in England

Peer reviewed

Amber Dudley; Emma Marsden; Giulia Bovolenta – Language Testing, 2024

Vocabulary knowledge strongly predicts second language reading, listening, writing, and speaking. Yet, few tests have been developed to assess vocabulary knowledge in French. The primary aim of this pilot study was to design and initially validate the Context-Aligned Two Thousand Test (CA-TTT), following open research practices. The CA-TTT is a…

Descriptors: French, Vocabulary Development, Secondary School Students, Language Tests

Psychometric Approaches to Analyzing C-Tests

Peer reviewed

Alpizar, David; Li, Tongyun; Norris, John M.; Gu, Lixiong – Language Testing, 2023

The C-test is a type of gap-filling test designed to efficiently measure second language proficiency. The typical C-test consists of several short paragraphs with the second half of every second word deleted. The words with deleted parts are considered as items nested within the corresponding paragraph. Given this testlet structure, it is commonly…

Descriptors: Psychometrics, Language Tests, Second Language Learning, Test Items

Operationalizing the Reading-into-Writing Construct in Analytic Rating Scales: Effects of Different Approaches on Rating

Peer reviewed

Lestari, Santi B.; Brunfaut, Tineke – Language Testing, 2023

Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating…

Descriptors: Reading Writing Relationship, Writing Tests, Rating Scales, Writing Processes

Author

Knoch, Ute	5
Min, Shangchao	4
He, Lianzhen	3
Batty, Aaron Olaf	2
Brunfaut, Tineke	2
Choi, Ikkyu	2
Eckes, Thomas	2
Elder, Catherine	2
Jang, Eunice Eunhee	2
McNamara, Tim	2
Pill, John	2
Abrash, Victor	1
Alanen, Riikka	1
Alderson, J. Charles	1
Alpizar, David	1
Amber Dudley	1
Ann Tai Choe	1
Aryadoust, Vahid	1
Attali, Yigal	1
Audeoud, Mireille	1
Bachman, Lyle F.	1
Bae, Jungok	1
Baghaei, Purya	1
Barkhuizen, Gary	1
Bishop, Kyoungwon	1

Descriptor

Language Tests	54
Second Language Learning	46
Item Response Theory	44
English (Second Language)	40
Foreign Countries	37
Feedback (Response)	23
Language Proficiency	20
Test Items	17
Scores	16
Evaluators	15
Comparative Analysis	14
Scoring	14
Interrater Reliability	12
Second Language Instruction	12
Statistical Analysis	11
Test Reliability	11
Correlation	10
Listening Comprehension Tests	10
Writing Evaluation	10
College Students	9
Elementary School Students	9
Item Analysis	9
Models	9
Oral Language	9
Difficulty Level	8