ERIC - Search Results


Showing 31 to 45 of 120 results

The Longitudinal Stability of Rating Characteristics in an EFL Examination: Methodological and Substantive Considerations

Peer reviewed

Lamprianou, Iasonas; Tsagari, Dina; Kyriakou, Nansia – Language Testing, 2021

This longitudinal study (2002-2014) investigates the stability of rating characteristics of a large group of raters over time in the context of the writing paper of a national high-stakes examination. The study uses one measure of rater severity and two measures of rater consistency. The results suggest that the rating characteristics of…

Descriptors: Longitudinal Studies, Evaluators, High Stakes Tests, Writing Evaluation

Monitoring the Performance of Human and Automated Scores for Spoken Responses

Peer reviewed

Wang, Zhen; Zechner, Klaus; Sun, Yu – Language Testing, 2018

As automated scoring systems for spoken responses are increasingly used in language assessments, testing organizations need to analyze their performance, as compared to human raters, across several dimensions, for example, on individual items or based on subgroups of test takers. In addition, there is a need in testing organizations to establish…

Descriptors: Automation, Scoring, Speech Tests, Language Tests

Examining the L2 Reading Comprehension Ability of Adult ELLs: Developing a Diagnostic Test within the Cognitive Diagnostic Assessment Framework

Peer reviewed

Toprak, Tugba Elif; Cakir, Abdulvahit – Language Testing, 2021

Cognitive diagnostic assessment (CDA) has been applied to language assessment in a number of studies in which a diagnostic classification model (DCM) was retrofitted to the results of a non-diagnostic assessment. However, the need to apply CDA through utilization of an inductive rather than a retrofitted approach has been a recurrent theme in…

Descriptors: English (Second Language), Second Language Learning, Undergraduate Students, Young Adults

Validity Evidence for a Sentence Repetition Test of Swiss German Sign Language

Peer reviewed

Haug, Tobias; Batty, Aaron Olaf; Venetz, Martin; Notter, Christa; Girard-Groeber, Simone; Knoch, Ute; Audeoud, Mireille – Language Testing, 2020

In this study we seek evidence of validity according to the socio-cognitive framework (Weir, 2005) for a new sentence repetition test (SRT) for young Deaf L1 Swiss German Sign Language (DSGS) users. SRTs have been developed for various purposes for both spoken and sign languages to assess language development in children. In order to address the…

Descriptors: Foreign Countries, Language Tests, Sentences, Repetition

A Systematic Review of Methods for Evaluating Rating Quality in Language Assessment

Peer reviewed

Wind, Stefanie A.; Peterson, Meghan E. – Language Testing, 2018

The use of assessments that require rater judgment (i.e., rater-mediated assessments) has become increasingly popular in high-stakes language assessments worldwide. Using a systematic literature review, the purpose of this study is to identify and explore the dominant methods for evaluating rating quality within the context of research on…

Descriptors: Language Tests, Evaluators, Evaluation Methods, Interrater Reliability

Measuring the Development of General Language Skills in English as a Foreign Language--Longitudinal Invariance of the C-Test

Peer reviewed

Schnoor, Birger; Hartig, Johannes; Klinger, Thorsten; Naumann, Alexander; Usanova, Irina – Language Testing, 2023

Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed. In such settings, tests for measuring language development must meet high standards of test quality such as validity, reliability, and objectivity, as…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Longitudinal Studies

Cloze Testing for Comprehension Assessment: The HyTeC-cloze

Peer reviewed

Kleijn, Suzanne; Pander Maat, Henk; Sanders, Ted – Language Testing, 2019

Although there are many methods available for assessing text comprehension, the cloze test is not widely acknowledged as one of them. Critiques on cloze testing center on its supposedly limited ability to measure comprehension beyond the sentence. However, these critiques do not hold for all types of cloze tests; the particular configuration of a…

Descriptors: Cloze Procedure, Language Tests, Semantics, Scoring

Test Review: TestDaF

Peer reviewed

Norris, John; Drackert, Anastasia – Language Testing, 2018

The Test of German as a Foreign Language (TestDaF) plays a critical role as a standardized test of German language proficiency. Developed and administered by the Society for Academic Study Preparation and Test Development (g.a.s.t.), TestDaF was launched in 2001 and has experienced persistent annual growth, with more than 44,000 test takers in…

Descriptors: German, Second Language Learning, Language Tests, Language Proficiency

Professional and Non-Professional Raters' Responsiveness to Fluency and Accuracy in L2 Speech: An Experimental Approach

Peer reviewed

Duijm, Klaartje; Schoonen, Rob; Hulstijn, Jan H. – Language Testing, 2018

It is general practice to use rater judgments in speaking proficiency testing. However, it has been shown that raters' knowledge and experience may influence their ratings, both in terms of leniency and varied focus on different aspects of speech. The purpose of this study is to identify raters' relative responsiveness to fluency and linguistic…

Descriptors: Language Fluency, Accuracy, Second Languages, Language Tests

ACTFL Oral Proficiency Interview -- Computer (OPIc)

Peer reviewed

Isbell, Dan; Winke, Paula – Language Testing, 2019

The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…

Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning

Mapping the Fluctuating Effect of Strategy Use Ability on English Reading Performance for Nursing Students: A Multi-Layered Moderation Analysis Approach

Peer reviewed

Cai, Yuyang; Kunnan, Antony John – Language Testing, 2020

An essential hypothesis of modern language assessment theory pertains to the interaction between strategy use ability (strategic competence) and second language knowledge. However, how they interact with each other is rarely explored. Drawing on relevant research in the literature, in this paper we proposed three interaction patterns (i.e.,…

Descriptors: English (Second Language), Second Language Learning, Nursing Education, Reading Tests

Development and Validation of a Chinese Character Acquisition Assessment for Second-Language Kindergarteners

Peer reviewed

Chan, Stephanie W. Y.; Cheung, Wai Ming; Huang, Yanli; Lam, Wai-Ip; Lin, Chin-Hsi – Language Testing, 2020

Demand for second-language (L2) Chinese education for kindergarteners has grown rapidly, but little is known about these kindergarteners' L2 skills, with existing studies focusing on school-age populations and alphabetic languages. Accordingly, we developed a six-subtest Chinese character acquisition assessment to measure L2 kindergarteners'…

Descriptors: Chinese, Second Language Learning, Second Language Instruction, Written Language

Evaluating Subscore Uses across Multiple Levels: A Case of Reading and Listening Subscores for Young EFL Learners

Peer reviewed

Choi, Ikkyu; Papageorgiou, Spiros – Language Testing, 2020

Stakeholders of language tests are often interested in subscores. However, reporting a subscore is not always justified; a subscore should provide reliable and distinct information to be worth reporting. When a subscore is used for decisions across multiple levels (e.g., individual test takers and schools), it needs to be justified for its…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Scores

Scoring with the Computer: Alternative Procedures for Improving the Reliability of Holistic Essay Scoring

Peer reviewed

Attali, Yigal; Lewis, Will; Steier, Michael – Language Testing, 2013

Automated essay scoring can produce reliable scores that are highly correlated with human scores, but is limited in its evaluation of content and other higher-order aspects of writing. The increased use of automated essay scoring in high-stakes testing underscores the need for human scoring that is focused on higher-order aspects of writing. This…

Descriptors: Scoring, Essay Tests, Reliability, High Stakes Tests

A Comparison of Reliability and Precision of Subscore Reporting Methods for a State English Language Proficiency Assessment

Peer reviewed

Longabach, Tanya; Peyton, Vicki – Language Testing, 2018

K-12 English language proficiency tests that assess multiple content domains (e.g., listening, speaking, reading, writing) often have subsections based on these content domains; scores assigned to these subsections are commonly known as subscores. Testing programs face increasing customer demands for the reporting of subscores in addition to the…

Descriptors: Comparative Analysis, Test Reliability, Second Language Learning, Language Proficiency

