Publication Date
| In 2026 | 0 |
| Since 2025 | 24 |
| Since 2022 (last 5 years) | 141 |
| Since 2017 (last 10 years) | 302 |
| Since 2007 (last 20 years) | 476 |
Descriptor
| Language Tests | 699 |
| Test Items | 699 |
| English (Second Language) | 422 |
| Second Language Learning | 388 |
| Foreign Countries | 331 |
| Second Language Instruction | 213 |
| Test Construction | 204 |
| Language Proficiency | 201 |
| Item Analysis | 170 |
| Difficulty Level | 137 |
| Scores | 131 |
Author
| Baghaei, Purya | 9 |
| Stansfield, Charles W. | 8 |
| Alonzo, Julie | 7 |
| Anderson, Daniel | 7 |
| Park, Bitnara Jasmine | 7 |
| Tindal, Gerald | 7 |
| Perkins, Kyle | 6 |
| Ravand, Hamdollah | 5 |
| Aryadoust, Vahid | 4 |
| Brown, James Dean | 4 |
| Coniam, David | 4 |
Audience
| Practitioners | 29 |
| Teachers | 28 |
| Researchers | 10 |
| Students | 6 |
| Administrators | 4 |
Location
| Iran | 42 |
| China | 33 |
| Japan | 29 |
| Canada | 24 |
| Turkey | 21 |
| South Korea | 12 |
| Germany | 11 |
| Taiwan | 10 |
| Europe | 9 |
| Thailand | 8 |
| Hong Kong | 7 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 3 |
| Elementary and Secondary… | 1 |
| Elementary and Secondary… | 1 |
| Lau v Nichols | 1 |
| Race to the Top | 1 |
Ikkyu Choi; Jiyun Zu – Language Testing, 2025
Today's language models can produce syntactically accurate and semantically coherent texts. This capability presents new opportunities for generating content for language assessments, which have traditionally required intensive expert resources. However, these models are also known to generate biased texts, leading to representational harms.…
Descriptors: Artificial Intelligence, Language Tests, Test Bias, Test Construction
Linh Thi Thao Le; Nam Thi Phuong Ho; Nguyen Huynh Trang; Hung Tan Ha – SAGE Open, 2025
The International English Language Testing System (IELTS) has served as one of the most reliable proofs of people's English language proficiency. There have been rumors about the discrepancy in difficulty between the two modules of IELTS, namely Academic (AC) and General Training (GT); however, there is little empirical evidence to confirm such a…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Reading Tests
Ildiko Porter-Szucs; Cynthia J. Macknish; Suzanne Toohey – John Wiley & Sons, Inc, 2025
"A Practical Guide to Language Assessment" helps educators at every level redefine their approach to language assessment. Grounded in extensive research and aligned with the latest advances in language education, this comprehensive guide introduces foundational concepts and explores key principles in test development and item writing.…
Descriptors: Student Evaluation, Language Tests, Test Construction, Test Items
Jiawei Xiong; George Engelhard; Allan S. Cohen – Measurement: Interdisciplinary Research and Perspectives, 2025
It is common to find mixed-format data resulting from the use of both multiple-choice (MC) and constructed-response (CR) questions on assessments. Dealing with these mixed response types involves understanding what the assessment is measuring, and the use of suitable measurement models to estimate latent abilities. Past research in educational…
Descriptors: Responses, Test Items, Test Format, Grade 8
Kaja Haugen; Cecilie Hamnes Carlsen; Christine Möller-Omrani – Language Awareness, 2025
This article presents the process of constructing and validating a test of metalinguistic awareness (MLA) for young school children (age 8-10). The test was developed between 2021 and 2023 as part of the MetaLearn research project, financed by The Research Council of Norway. The research team defines MLA as using metalinguistic knowledge at a…
Descriptors: Language Tests, Test Construction, Elementary School Students, Metalinguistics
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025
This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…
Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
Xueliang Chen; Vahid Aryadoust; Wenxin Zhang – Language Testing, 2025
The growing diversity among test takers in second or foreign language (L2) assessments puts fairness front and center. This systematic review aimed to examine how fairness in L2 assessments was evaluated through differential item functioning (DIF) analysis. A total of 83 articles from 27 journals were included in a systematic…
Descriptors: Second Language Learning, Language Tests, Test Items, Item Analysis
Paula Elosua – Language Assessment Quarterly, 2024
In sociolinguistic contexts where standardized languages coexist with regional dialects, the study of differential item functioning is a valuable tool for examining certain linguistic uses or varieties as threats to score validity. From an ecological perspective, this paper describes three stages in the study of differential item functioning…
Descriptors: Reading Tests, Reading Comprehension, Scores, Test Validity
Al Lawati, Zahra Ali – Language Testing in Asia, 2023
This study discusses the characteristics of test specifications (specs) and item writer guidelines (IWGs), their role in item development of English as a Second Language (ESL) reading tests, and the use of the CEFR for specs development. This mixed-method study analyzed specs, IWGs, tests, and the Pearson Test of English General test statistics.…
Descriptors: Language Tests, Test Items, Test Construction, English (Second Language)
Dongkwang Shin; Jang Ho Lee – ELT Journal, 2024
Although automated item generation has gained a considerable amount of attention in a variety of fields, it is still a relatively new technology in ELT contexts. Therefore, the present article aims to provide an accessible introduction to this powerful resource for language teachers based on a review of the available research. Particularly, it…
Descriptors: Language Tests, Artificial Intelligence, Test Items, Automation
Jeong-eun Kim – English Teaching, 2025
This study investigated the thematic and lexical characteristics of high-difficulty English reading items--commonly referred to as "killer questions"--in the Korean College Scholastic Ability Test (CSAT) between 2018 and 2025. Using text mining methods, including Latent Dirichlet Allocation (LDA) and CEFR-based lexical profiling, the…
Descriptors: English (Second Language), Difficulty Level, Test Items, Questioning Techniques
Apichat Khamboonruang – Language Testing in Asia, 2025
The Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…
Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests
Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction

