Showing 1 to 15 of 35 results
Peer reviewed
Longabach, Tanya; Peyton, Vicki – Language Testing, 2018
K-12 English language proficiency tests that assess multiple content domains (e.g., listening, speaking, reading, writing) often have subsections based on these content domains; scores assigned to these subsections are commonly known as subscores. Testing programs face increasing customer demands for the reporting of subscores in addition to the…
Descriptors: Comparative Analysis, Test Reliability, Second Language Learning, Language Proficiency
Peer reviewed
van Batenburg, Eline S. L.; Oostdam, Ron J.; van Gelderen, Amos J. S.; de Jong, Nivja H. – Language Testing, 2018
This article explores ways to assess interactional performance, and reports on the use of a test format that standardizes the interlocutor's linguistic and interactional contributions to the exchange. It describes the construction and administration of six scripted speech tasks (instruction, advice, and sales tasks) with pre-vocational learners (n…
Descriptors: Second Language Learning, Speech Tests, Interaction, Test Reliability
Peer reviewed
Eckes, Thomas – Language Testing, 2017
This paper presents an approach to standard setting that combines the prototype group method (PGM; Eckes, 2012) with a receiver operating characteristic (ROC) analysis. The combined PGM-ROC approach is applied to setting cut scores on a placement test of English as a foreign language (EFL). To implement the PGM, experts first named learners whom…
Descriptors: English (Second Language), Language Tests, Cutting Scores, Standard Setting (Scoring)
Peer reviewed
Li, Hongli; Hunter, C. Vincent; Lei, Pui-Wa – Language Testing, 2016
Cognitive diagnostic models (CDMs) have great promise for providing diagnostic information to aid learning and instruction, and a large number of CDMs have been proposed. However, the assumptions and performances of different CDMs and their applications in regard to reading comprehension tests are not fully understood. In the present study, we…
Descriptors: Reading Comprehension, Reading Tests, Models, Comparative Analysis
Peer reviewed
McCray, Gareth; Brunfaut, Tineke – Language Testing, 2018
This study investigates test-takers' processing while completing banked gap-fill tasks, designed to test reading proficiency, in order to test theoretically based expectations about the variation in cognitive processes of test-takers across levels of performance. Twenty-eight test-takers' eye traces on 24 banked gap-fill items (on six tasks) were…
Descriptors: Language Tests, Test Items, Item Analysis, Eye Movements
Peer reviewed
Davis, Larry – Language Testing, 2016
Two factors were investigated that are thought to contribute to consistency in rater scoring judgments: rater training and experience in scoring. Also considered were the relative effects of scoring rubrics and exemplars on rater performance. Experienced teachers of English (N = 20) scored recorded responses from the TOEFL iBT speaking test prior…
Descriptors: Evaluators, Oral Language, Scores, Language Tests
Peer reviewed
Lee, Shinhye; Winke, Paula – Language Testing, 2018
We investigated how young language learners process their responses on and perceive a computer-mediated, timed speaking test. Twenty 8-, 9-, and 10-year-old non-native English-speaking children (NNSs) and eight same-aged, native English-speaking children (NSs) completed seven computerized sample TOEFL® Primary™ speaking test tasks. We investigated…
Descriptors: Elementary School Students, Second Language Learning, Responses, Computer Assisted Testing
Peer reviewed
Shin, Sun-Young; Lidster, Ryan – Language Testing, 2017
In language programs, it is crucial to place incoming students into appropriate levels to ensure that course curriculum and materials are well targeted to their learning needs. Deciding how and where to set cutscores on placement tests is thus of central importance to programs, but previous studies in educational measurement disagree as to which…
Descriptors: Language Tests, English (Second Language), Standard Setting (Scoring), Student Placement
Peer reviewed
Kyle, Kristopher; Crossley, Scott – Language Testing, 2017
Over the past 45 years, the construct of syntactic sophistication has been assessed in L2 writing using what Bulté and Housen (2012) refer to as absolute complexity (Lu, 2011; Ortega, 2003; Wolfe-Quintero, Inagaki, & Kim, 1998). However, it has been argued that making inferences about learners based on absolute complexity indices (e.g., mean…
Descriptors: Syntax, Verbs, Second Language Learning, Word Frequency
Peer reviewed
LaFlair, Geoffrey T.; Staples, Shelley – Language Testing, 2017
Investigations of the validity of a number of high-stakes language assessments are conducted using an argument-based approach, which requires evidence for inferences that are critical to score interpretation (Chapelle, Enright, & Jamieson, 2008b; Kane, 2013). The current study investigates the extrapolation inference for a high-stakes test of…
Descriptors: Computational Linguistics, Language Tests, Test Validity, Inferences
Peer reviewed
Trace, Jonathan; Brown, James Dean; Janssen, Gerriet; Kozhevnikova, Liudmila – Language Testing, 2017
Cloze tests have been the subject of numerous studies regarding their function and use in both first language and second language contexts (e.g., Jonz & Oller, 1994; Watanabe & Koyama, 2008). From a validity standpoint, one area of investigation has been the extent to which cloze tests measure reading ability beyond the sentence level.…
Descriptors: Cloze Procedure, Language Tests, Test Items, Item Analysis
Peer reviewed
Kyle, Kristopher; Crossley, Scott A.; McNamara, Danielle S. – Language Testing, 2016
This study explores the construct validity of speaking tasks included in the TOEFL iBT (e.g., integrated and independent speaking tasks). Specifically, advanced natural language processing (NLP) tools, MANOVA difference statistics, and discriminant function analyses (DFA) are used to assess the degree to which and in what ways responses to these…
Descriptors: Construct Validity, Natural Language Processing, Speech Skills, Speech Acts
Peer reviewed
Campfield, Dorota E. – Language Testing, 2017
This paper reports a post-hoc analysis of the influence of lexical difficulty of cue sentences on performance in an elicited imitation (EI) task to assess oral production skills for 645 child L2 English learners in instructional settings. This formed part of a large-scale investigation into effectiveness of foreign language teaching in Polish…
Descriptors: Difficulty Level, Second Language Learning, Second Language Instruction, Elementary School Students
Peer reviewed
Jarvis, Scott – Language Testing, 2017
The present study discusses the relevance of measures of lexical diversity (LD) to the assessment of learner corpora. It also argues that existing measures of LD, many of which have become specialized for use with language corpora, are fundamentally measures of lexical repetition, are based on an etic perspective of language, and lack construct…
Descriptors: Computational Linguistics, English (Second Language), Second Language Learning, Native Speakers
Peer reviewed
O'Hagan, Sally; Pill, John; Zhang, Ying – Language Testing, 2016
Criticism of specific-purpose language (LSP) tests is often directed at their limited ability to represent fully the demands of the target language use situation. Such criticisms extend to the criteria used to assess test performance, which may fail to capture what matters to participants in the domain of interest. This paper reports on the…
Descriptors: Health Personnel, Language Tests, English for Special Purposes, Criticism