ERIC - Search Results

Showing 1 to 15 of 47 results

Examining the Consistency of Instructor versus Large Language Model Ratings on Summary Content: Toward Checklist-Based Feedback Provision with Second Language Writers

Peer reviewed

Yasuyo Sawaki; Yutaka Ishii; Hiroaki Yamada; Takenobu Tokunaga – Language Testing, 2025

This study examined the consistency between instructor ratings of learner-generated summaries and those estimated by a large language model (LLM) on summary content checklist items designed for undergraduate second language (L2) writing instruction in Japan. The effects of the LLM prompt design on the consistency between the two were also explored…

Descriptors: Interrater Reliability, Writing Teachers, College Faculty, Artificial Intelligence

The Effect of Viewing Visual Cues in a Listening Comprehension Test on Second Language Learners' Test-Taking Process and Performance: An Eye-Tracking Study

Peer reviewed

Suh Keong Kwon; Guoxing Yu – Language Testing, 2024

In this study, we examined the effect of visual cues in a second language listening test on test takers' viewing behaviours and their test performance. Fifty-seven learners of English in Korea took a video-based listening test, with their eye movements recorded, and 23 of them were interviewed individually after the test. The participants viewed…

Descriptors: Foreign Countries, English (Second Language), Second Language Learning, Eye Movements

Feedforwarding Diagnostic Language Assessment: Artificial Intelligence- (AI-) Driven Weakness Identification and Contextualised Feedback for Second Language Speaking

Peer reviewed

Shungo Suzuki; Hiroaki Takatsu; Ryuki Matsuura; Miina Koyama; Mao Saeki; Yoichi Matsuyama – Language Testing, 2025

The current study proposes a new approach to weakness identification in diagnostic language assessment (DLA) for speaking skills. We also propose to design actionable and contextualised diagnostic feedback through the systematic integration of feedback and remedial learning activities. Focusing on lexical use in second language speaking, the…

Descriptors: English (Second Language), Speech Skills, Artificial Intelligence, Second Language Learning

Revisiting Raters' Accent Familiarity in Speaking Tests: Evidence That Presentation Mode Interacts with Accent Familiarity to Variably Affect Comprehensibility Ratings

Peer reviewed

Michael D. Carey; Stefan Szocs – Language Testing, 2024

This controlled experimental study investigated the interaction of variables associated with rating the pronunciation component of high-stakes English-language-speaking tests such as IELTS and TOEFL iBT. One hundred experienced raters who were all either familiar or unfamiliar with Brazilian-accented English or Papua New Guinean Tok Pisin-accented…

Descriptors: Dialects, Pronunciation, Suprasegmentals, Familiarity

Making Each Point Count: Revising a Local Adaptation of the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE Rubric

Peer reviewed

Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024

In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…

Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)

Construct Validity and Fairness of an Operational Listening Test with World Englishes

Peer reviewed

Nishizawa, Hitoshi – Language Testing, 2023

In this study, I investigate the construct validity and fairness pertaining to the use of a variety of Englishes in listening test input. I obtained data from a post-entry English language placement test administered at a public university in the United States. In addition to expectedly familiar American English, the test features Hawai'i,…

Descriptors: Construct Validity, Listening Comprehension Tests, Language Tests, English (Second Language)

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

Psychometric Approaches to Analyzing C-Tests

Peer reviewed

Alpizar, David; Li, Tongyun; Norris, John M.; Gu, Lixiong – Language Testing, 2023

The C-test is a type of gap-filling test designed to efficiently measure second language proficiency. The typical C-test consists of several short paragraphs with the second half of every second word deleted. The words with deleted parts are considered as items nested within the corresponding paragraph. Given this testlet structure, it is commonly…

Descriptors: Psychometrics, Language Tests, Second Language Learning, Test Items

"How Do Raters Learn to Rate?" Many-Facet Rasch Modeling of Rater Performance over the Course of a Rater Certification Program

Peer reviewed

Yan, Xun; Chuang, Ping-Lin – Language Testing, 2023

This study employed a mixed-methods approach to examine how rater performance develops during a semester-long rater certification program for an English as a Second Language (ESL) writing placement test at a large US university. From 2016 to 2018, we tracked three groups of novice raters (n = 30) across four rounds in the certification program.…

Descriptors: Evaluators, Interrater Reliability, Item Response Theory, Certification

Linking Scores from Two Written Receptive English Academic Vocabulary Tests--The VLT-Ac and the AVT

Peer reviewed

Warnby, Marcus; Malmström, Hans; Hansen, Kajsa Yang – Language Testing, 2023

The academic section of the Vocabulary Levels Test (VLT-Ac) and the Academic Vocabulary Test (AVT) both assess meaning-recognition knowledge of written receptive academic vocabulary, deemed central for engagement in academic activities. Depending on the purpose and context of the testing, either of the tests can be appropriate, but for research…

Descriptors: Foreign Countries, Scores, Written Language, Receptive Language

Bridging Local Needs and National Standards: Use of Standards-Based Individualized Feedback of an In-House EFL Listening Test in China

Peer reviewed

Min, Shangchao; Zhang, Juan; Li, Yue; He, Lianzhen – Language Testing, 2022

Local language tests are an arena where national language standards can be operationalized to create a hub for integrating assessment results and language support. Few studies, however, have examined the operationalization of national standards in local language assessment contexts. In this study, we proposed a model to present the integration of…

Descriptors: Language Tests, Listening Comprehension Tests, Second Language Learning, English (Second Language)

Developing Individualized Feedback for Listening Assessment: Combining Standard Setting and Cognitive Diagnostic Assessment Approaches

Peer reviewed

Min, Shangchao; He, Lianzhen – Language Testing, 2022

In this study, we present the development of individualized feedback for a large-scale listening assessment by combining standard setting and cognitive diagnostic assessment (CDA) approaches. We used the performance data from 3,358 students' item-level responses to a field test of a national EFL test primarily intended for tertiary-level EFL…

Descriptors: Feedback (Response), Second Language Learning, Second Language Instruction, English (Second Language)

Developing Tools for Learning Oriented Assessment of Interactional Competence: Bridging Theory and Practice

Peer reviewed

May, Lyn; Nakatsuhara, Fumiyo; Lam, Daniel; Galaczi, Evelina – Language Testing, 2020

In this paper we report on a project in which we developed tools to support the classroom assessment of learners' interactional competence (IC) and provided learning oriented feedback in the context of preparation for a high-stakes face-to-face speaking test. Six trained examiners provided stimulated verbal reports (n = 72) on 12 paired…

Descriptors: Intercultural Communication, High Stakes Tests, Feedback (Response), Evaluators

Examining the Effects of Different English Speech Varieties on an L2 Academic Listening Comprehension Test at the Item Level

Peer reviewed

Shin, Sun-Young; Lee, Senyung; Lidster, Ryan – Language Testing, 2021

In this study we investigated the potential for a shared-first-language (shared-L1) effect on second language (L2) listening test scores using differential item functioning (DIF) analyses. We did this in order to understand how accented speech may influence performance at the item level, while controlling for key variables including listening…

Descriptors: Listening Comprehension Tests, Language Tests, Native Language, Scores

IRT-Based Classification Analysis of an English Language Reading Proficiency Subtest

Peer reviewed

Kaya, Elif; O'Grady, Stefan; Kalender, Ilker – Language Testing, 2022

Language proficiency testing serves an important function of classifying examinees into different categories of ability. However, misclassification is to some extent inevitable and may have important consequences for stakeholders. Recent research suggests that classification efficacy may be enhanced substantially using computerized adaptive…

Descriptors: Item Response Theory, Test Items, Language Tests, Classification
