ERIC - Search Results

Publication Date

In 2026	0
Since 2025	3
Since 2022 (last 5 years)	10
Since 2017 (last 10 years)	18
Since 2007 (last 20 years)	37

Descriptor

Foreign Countries	38
Language Tests	30
Item Response Theory	28
English (Second Language)	24
Second Language Learning	22
Test Items	12
Language Proficiency	11
Feedback (Response)	9
Comparative Analysis	8
Scoring	8
Scores	7
College Students	6
Correlation	6
Difficulty Level	6
Interrater Reliability	6
Listening Comprehension Tests	6
Reading Comprehension	6
Item Analysis	5
Questionnaires	5
Reading Tests	5
Second Language Instruction	5
Secondary School Students	5
Statistical Analysis	5
Test Format	5
Test Reliability	5

Source

Language Testing

Publication Type

Journal Articles	38
Reports - Research	32
Reports - Evaluative	6
Tests/Questionnaires	5

Education Level

Higher Education	13
Postsecondary Education	12
Elementary Education	6
Secondary Education	5
Elementary Secondary Education	2
Grade 6	2
Intermediate Grades	2
Early Childhood Education	1
Grade 5	1
High Schools	1
Kindergarten	1
Middle Schools	1
Primary Education	1

Location

Australia	5
China	5
Japan	5
South Korea	3
Canada	2
Finland	2
Germany	2
Hong Kong	2
Netherlands	2
Bulgaria	1
Colombia	1
Indonesia	1
Iran	1
Iran (Tehran)	1
Japan (Tokyo)	1
Kenya	1
Poland	1
Russia	1
Sweden	1
Switzerland	1
Taiwan	1
Thailand	1
Turkey (Ankara)	1
Ukraine	1
United Kingdom	1

Assessments and Surveys

International English…	2
Test of English as a Foreign…	2
Kuder Occupational Interest…	1
Peabody Picture Vocabulary…	1
Test of English for…	1

Showing 1 to 15 of 38 results

Communal Factors in Rater Severity and Consistency over Time in High-Stakes Oral Assessment

Peer reviewed

Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024

This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated…

Descriptors: Foreign Countries, Interrater Reliability, Evaluators, Item Response Theory

Examining the Consistency of Instructor versus Large Language Model Ratings on Summary Content: Toward Checklist-Based Feedback Provision with Second Language Writers

Peer reviewed

Yasuyo Sawaki; Yutaka Ishii; Hiroaki Yamada; Takenobu Tokunaga – Language Testing, 2025

This study examined the consistency between instructor ratings of learner-generated summaries and those estimated by a large language model (LLM) on summary content checklist items designed for undergraduate second language (L2) writing instruction in Japan. The effects of the LLM prompt design on the consistency between the two were also explored…

Descriptors: Interrater Reliability, Writing Teachers, College Faculty, Artificial Intelligence

The Development of a Chinese Vocabulary Proficiency Test (CVPT) for Learners of Chinese as a Second/Foreign Language

Peer reviewed

Haiwei Zhang; Peng Sun; Yaowaluk Bianglae; Winda Widiawati – Language Testing, 2024

In order to address the needs of the continually growing number of Chinese language learners, the present study developed and presented initial validation of a 100-item Chinese vocabulary proficiency test (CVPT) for learners of Chinese as a second/foreign language (CS/FL) using Item Response Theory among 170 CS/FL learners from Indonesia and 354…

Descriptors: Test Construction, Vocabulary, Language Proficiency, Language Tests

The Effect of Viewing Visual Cues in a Listening Comprehension Test on Second Language Learners' Test-Taking Process and Performance: An Eye-Tracking Study

Peer reviewed

Suh Keong Kwon; Guoxing Yu – Language Testing, 2024

In this study, we examined the effect of visual cues in a second language listening test on test takers' viewing behaviours and their test performance. Fifty-seven learners of English in Korea took a video-based listening test, with their eye movements recorded, and 23 of them were interviewed individually after the test. The participants viewed…

Descriptors: Foreign Countries, English (Second Language), Second Language Learning, Eye Movements

Feedforwarding Diagnostic Language Assessment: Artificial Intelligence- (AI-) Driven Weakness Identification and Contextualised Feedback for Second Language Speaking

Peer reviewed

Shungo Suzuki; Hiroaki Takatsu; Ryuki Matsuura; Miina Koyama; Mao Saeki; Yoichi Matsuyama – Language Testing, 2025

The current study proposes a new approach to weakness identification in diagnostic language assessment (DLA) for speaking skills. We also propose to design actionable and contextualised diagnostic feedback through the systematic integration of feedback and remedial learning activities. Focusing on lexical use in second language speaking, the…

Descriptors: English (Second Language), Speech Skills, Artificial Intelligence, Second Language Learning

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

A Context-Aligned Two Thousand Test: Toward Estimating High-Frequency French Vocabulary Knowledge for Beginner-to-Low Intermediate Proficiency Adolescent Learners in England

Peer reviewed

Amber Dudley; Emma Marsden; Giulia Bovolenta – Language Testing, 2024

Vocabulary knowledge strongly predicts second language reading, listening, writing, and speaking. Yet, few tests have been developed to assess vocabulary knowledge in French. The primary aim of this pilot study was to design and initially validate the Context-Aligned Two Thousand Test (CA-TTT), following open research practices. The CA-TTT is a…

Descriptors: French, Vocabulary Development, Secondary School Students, Language Tests

Linking Scores from Two Written Receptive English Academic Vocabulary Tests--The VLT-Ac and the AVT

Peer reviewed

Warnby, Marcus; Malmström, Hans; Hansen, Kajsa Yang – Language Testing, 2023

The academic section of the Vocabulary Levels Test (VLT-Ac) and the Academic Vocabulary Test (AVT) both assess meaning-recognition knowledge of written receptive academic vocabulary, deemed central for engagement in academic activities. Depending on the purpose and context of the testing, either of the tests can be appropriate, but for research…

Descriptors: Foreign Countries, Scores, Written Language, Receptive Language

Bridging Local Needs and National Standards: Use of Standards-Based Individualized Feedback of an In-House EFL Listening Test in China

Peer reviewed

Min, Shangchao; Zhang, Juan; Li, Yue; He, Lianzhen – Language Testing, 2022

Local language tests are an arena where national language standards can be operationalized to create a hub for integrating assessment results and language support. Few studies, however, have examined the operationalization of national standards in local language assessment contexts. In this study, we proposed a model to present the integration of…

Descriptors: Language Tests, Listening Comprehension Tests, Second Language Learning, English (Second Language)

Validity Evidence for a Sentence Repetition Test of Swiss German Sign Language

Peer reviewed

Haug, Tobias; Batty, Aaron Olaf; Venetz, Martin; Notter, Christa; Girard-Groeber, Simone; Knoch, Ute; Audeoud, Mireille – Language Testing, 2020

In this study we seek evidence of validity according to the socio-cognitive framework (Weir, 2005) for a new sentence repetition test (SRT) for young Deaf L1 Swiss German Sign Language (DSGS) users. SRTs have been developed for various purposes for both spoken and sign languages to assess language development in children. In order to address the…

Descriptors: Foreign Countries, Language Tests, Sentences, Repetition

IRT-Based Classification Analysis of an English Language Reading Proficiency Subtest

Peer reviewed

Kaya, Elif; O'Grady, Stefan; Kalender, Ilker – Language Testing, 2022

Language proficiency testing serves an important function of classifying examinees into different categories of ability. However, misclassification is to some extent inevitable and may have important consequences for stakeholders. Recent research suggests that classification efficacy may be enhanced substantially using computerized adaptive…

Descriptors: Item Response Theory, Test Items, Language Tests, Classification

Mapping the Fluctuating Effect of Strategy Use Ability on English Reading Performance for Nursing Students: A Multi-Layered Moderation Analysis Approach

Peer reviewed

Cai, Yuyang; Kunnan, Antony John – Language Testing, 2020

An essential hypothesis of modern language assessment theory pertains to the interaction between strategy use ability (strategic competence) and second language knowledge. However, how they interact with each other is rarely explored. Drawing on relevant research in the literature, in this paper we proposed three interaction patterns (i.e.,…

Descriptors: English (Second Language), Second Language Learning, Nursing Education, Reading Tests

Development and Validation of a Chinese Character Acquisition Assessment for Second-Language Kindergarteners

Peer reviewed

Chan, Stephanie W. Y.; Cheung, Wai Ming; Huang, Yanli; Lam, Wai-Ip; Lin, Chin-Hsi – Language Testing, 2020

Demand for second-language (L2) Chinese education for kindergarteners has grown rapidly, but little is known about these kindergarteners' L2 skills, with existing studies focusing on school-age populations and alphabetic languages. Accordingly, we developed a six-subtest Chinese character acquisition assessment to measure L2 kindergarteners'…

Descriptors: Chinese, Second Language Learning, Second Language Instruction, Written Language

Evaluating Subscore Uses across Multiple Levels: A Case of Reading and Listening Subscores for Young EFL Learners

Peer reviewed

Choi, Ikkyu; Papageorgiou, Spiros – Language Testing, 2020

Stakeholders of language tests are often interested in subscores. However, reporting a subscore is not always justified; a subscore should provide reliable and distinct information to be worth reporting. When a subscore is used for decisions across multiple levels (e.g., individual test takers and schools), it needs to be justified for its…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Scores

Setting Cut Scores on an EFL Placement Test Using the Prototype Group Method: A Receiver Operating Characteristic (ROC) Analysis

Peer reviewed

Eckes, Thomas – Language Testing, 2017

This paper presents an approach to standard setting that combines the prototype group method (PGM; Eckes, 2012) with a receiver operating characteristic (ROC) analysis. The combined PGM-ROC approach is applied to setting cut scores on a placement test of English as a foreign language (EFL). To implement the PGM, experts first named learners whom…

Descriptors: English (Second Language), Language Tests, Cutting Scores, Standard Setting (Scoring)

Author

Batty, Aaron Olaf	2
Eckes, Thomas	2
He, Lianzhen	2
Jang, Eunice Eunhee	2
Knoch, Ute	2
McNamara, Tim	2
Min, Shangchao	2
Alanen, Riikka	1
Amber Dudley	1
Aryadoust, Vahid	1
Audeoud, Mireille	1
Cai, Yuyang	1
Campfield, Dorota E.	1
Chan, Stephanie W. Y.	1
Cheung, Wai Ming	1
Choi, Hyeran	1
Choi, Ikkyu	1
Culligan, Brent	1
Deygers, Bart	1
Dunlop, Maggie	1
Elder, Catherine	1
Emma Marsden	1
Esmat Babaii	1
Farshad Effatpanah	1
Ferne, Tracy	1