ERIC - Search Results

Showing 1 to 15 of 149 results

Examining the Effect of Item Difficulty and Rater Leniency on Iranian Test Takers' Performance on WDCT and DSAT: A Comparative Study

Peer reviewed

Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025

The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…

Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction

Utilizing Large Language Models for EFL Essay Grading: An Examination of Reliability and Validity in Rubric-Based Assessments

Peer reviewed

Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025

This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

Effective Vocabulary Interventions for Young Emergent Bilinguals: A Best-Evidence Synthesis

Peer reviewed

Alain Bengochea; Sabrina F. Sembiante – Review of Education, 2024

This best-evidence synthesis appraises the design and outcome characteristics of vocabulary intervention studies conducted with preschool through 6th grade emergent bilingual (EB) children and spotlights rigorously designed studies for which effects could be better attributed to instructional features. Twenty-nine selected studies were analysed…

Descriptors: Bilingualism, Vocabulary Development, Intervention, Comparative Analysis

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

Estimating the Impact of Local Item Dependency in a Test of Second Language Reading Comprehension

Peer reviewed

Tim Stoeckel; Liang Ye Tan; Hung Tan Ha; Nam Thi Phuong Ho; Tomoko Ishii; Young Ae Kim; Chunmei Huang; Stuart McLean – Vocabulary Learning and Instruction, 2024

Local item dependency (LID) occurs when test-takers' responses to one test item are affected by their responses to another. It can be problematic if it causes inflated reliability estimates or distorted person and item measures. The cued-recall reading comprehension test in Hu and Nation's (2000) well-known and influential coverage--comprehension…

Descriptors: Reading Comprehension, English (Second Language), Second Language Instruction, Second Language Learning

The Intersection of AI and Language Assessment: A Study on the Reliability of ChatGPT in Grading IELTS Writing Task 2

Peer reviewed

Osama Koraishi – Language Teaching Research Quarterly, 2024

This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading Task 2 writing of the IELTS exam. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis encompassed a multifaceted approach, including a comparison of means and reliability…

Descriptors: Second Language Learning, English (Second Language), Language Tests, Artificial Intelligence

Meta-Analysis of Inter-Rater Agreement and Discrepancy Between Human and Automated English Essay Scoring

Peer reviewed

Jiyeo Yun – English Teaching, 2023

Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…

Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring

Depth-Perception-Based Representation in Holistic Rating on ESL Essay Writing

Peer reviewed

Lian Li; Jiehui Hu; Yu Dai; Ping Zhou; Wanhong Zhang – Reading & Writing Quarterly, 2024

This paper proposes to use depth perception to represent raters' decision in holistic evaluation of ESL essays, as an alternative medium to conventional form of numerical scores. The researchers verified the new method's accuracy and inter/intra-rater reliability by inviting 24 ESL teachers to perform different representations when rating 60…

Descriptors: Essays, Holistic Approach, Writing Evaluation, Accuracy

Rubric Rating with MFRM versus Randomly Distributed Comparative Judgment: A Comparison of Two Approaches to Second-Language Writing Assessment

Peer reviewed

Sims, Maureen E.; Cox, Troy L.; Eckstein, Grant T.; Hartshorn, K. James; Wilcox, Matthew P.; Hart, Judson M. – Educational Measurement: Issues and Practice, 2020

The purpose of this study is to explore the reliability of a potentially more practical approach to direct writing assessment in the context of ESL writing. Traditional rubric rating (RR) is a common yet resource-intensive evaluation practice when performed reliably. This study compared the traditional rubric model of ESL writing assessment and…

Descriptors: Scoring Rubrics, Item Response Theory, Second Language Learning, English (Second Language)

The Effects of Multimodal Teaching on English Vocabulary Knowledge of Thai Primary School Students

Peer reviewed

Kasikarn Bansong; Somkiet Poopatwiboon; Apisak Sukying – Journal of Education and Learning, 2023

It is increasingly prevalent in digital learning and teaching strategies for discerning a global perspective on creating the student learning experience. Multimodality is an emergent phenomenon that may influence how digital learning is designed, especially during the COVID-19 pandemic in which immersive learning environments, such as a virtual…

Descriptors: Elementary School Students, English (Second Language), Second Language Learning, Second Language Instruction

Impacts of ChatGPT-Assisted Writing for EFL English Majors: Feasibility and Challenges

Peer reviewed

Chung-You Tsai; Yi-Ti Lin; Iain Kelsall Brown – Education and Information Technologies, 2024

To determine the impacts of using ChatGPT to assist English as a foreign language (EFL) English college majors in revising essays and the possibility of leading to higher scores and potentially causing unfairness. A prospective, double-blinded, paired-comparison study was conducted in Feb. 2023. A total of 44 students provided 44 original essays…

Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, English (Second Language)

Can Didactic Audiovisual Translation Enhance Intercultural Learning through CALL? Validity and Reliability of a Students' Questionnaire

Peer reviewed

Pilar Rodríguez-Arancón; María Bobadilla-Pérez; Alberto Fernández-Costales – Journal for Multicultural Education, 2024

Purpose: This study aims to delve into the interplay between didactic audiovisual translation (DAT) and computer-assisted language learning (CALL), exploring their combined impact on the development of intercultural competence (IC) among learners of English as a foreign language (EFL). Design/methodology/approach: Using a quasi-experimental…

Descriptors: Translation, Teaching Methods, Second Language Learning, Second Language Instruction

Investigating the Impact of Rater Training on Rater Errors in the Process of Assessing Writing Skill

Peer reviewed

Sata, Mehmet; Karakaya, Ismail – International Journal of Assessment Tools in Education, 2022

In the process of measuring and assessing high-level cognitive skills, interference of rater errors in measurements brings about a constant concern and low objectivity. The main purpose of this study was to investigate the impact of rater training on rater errors in the process of assessing individual performance. The study was conducted with a…

Descriptors: Evaluators, Training, Comparative Analysis, Academic Language

Elicited Imitation as a Measure of L2 Proficiency: New Insights from a Comparison of Two L2 English Parallel Forms

Peer reviewed

Wu, Shu-Ling; Tio, Yee Pin; Ortega, Lourdes – Studies in Second Language Acquisition, 2022

Elicited imitation (EI), a short-cut measure of global proficiency in second language (L2) research, requires participants to listen to sentences and repeat them as closely as possible. To support instrument sharing and assessment of L2 proficiency for longitudinal and crosslinguistic research, we created a parallel form of an EI task (EIT) for L2…

Descriptors: Imitation, Second Language Learning, Second Language Instruction, Language Proficiency
