Showing all 15 results
Peer reviewed
Ping-Lin Chuang – Language Testing, 2025
This experimental study explores how source use features impact raters' judgment of argumentation in a second language (L2) integrated writing test. One hundred four experienced and novice raters were recruited to complete a rating task that simulated the scoring assignment of a local English Placement Test (EPT). Sixty written responses were…
Descriptors: Interrater Reliability, Evaluators, Information Sources, Primary Sources
Peer reviewed
Kevin Hirschi; Okim Kang – Language Teaching Research Quarterly, 2023
This paper extends the use of Generalizability Theory to the measurement of extemporaneous L2 speech through the lens of speech perception. Using six datasets of previous studies, it reports on "G studies"--a method of breaking down measurement variance--and "D studies"--a predictive study of the impact on reliability when…
Descriptors: Evaluators, Generalization, Evaluation Methods, Speech Communication
Peer reviewed
Saito, Kazuya; Macmillan, Konstantinos; Kachlicka, Magdalena; Kunihara, Takuya; Minematsu, Nobuaki – Studies in Second Language Acquisition, 2023
Whereas many scholars have emphasized the relative importance of "comprehensibility" as an ecologically valid goal for L2 speech training, testing, and development, eliciting listeners' judgments is time-consuming. Following calls for research on more efficient L2 speech rating methods in applied linguistics, and growing attention toward…
Descriptors: Second Language Learning, Second Language Instruction, Interrater Reliability, Speech Communication
Peer reviewed
Sumner, Josh – Research-publishing.net, 2021
Comparative Judgement (CJ) has emerged as a technique that typically makes use of holistic judgement to assess difficult-to-specify constructs such as production (speaking and writing) in Modern Foreign Languages (MFL). In traditional approaches, markers assess candidates' work one-by-one in an absolute manner, assigning scores to different…
Descriptors: Holistic Approach, Student Evaluation, Comparative Analysis, Decision Making
Peer reviewed
Thai, Thuy; Sheehan, Susan – Language Education & Assessment, 2022
In language performance tests, raters are important as their scoring decisions determine which aspects of performance the scores represent; however, raters are considered as one of the potential sources contributing to unwanted variability in scores (Davis, 2012). Although a great number of studies have been conducted to unpack how rater…
Descriptors: Rating Scales, Speech Communication, Second Language Learning, Second Language Instruction
Peer reviewed
Linlin, Cao – English Language Teaching, 2020
Through Many-Facet Rasch analysis, this study explores the rating differences between 1 computer automatic rater and 5 expert teacher raters on scoring 119 students in a computerized English listening-speaking test. Results indicate that both automatic and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…
Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning
Peer reviewed
Derrick, Deirdre J. – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2016
Second language (L2) researchers often have to develop or change the instruments they use to measure numerous constructs (Norris & Ortega, 2012). Given the prevalence of researcher-developed and -adapted data collection instruments, and given the profound effect instrumentation can have on results, thorough reporting of instrumentation is…
Descriptors: Second Language Learning, Language Research, Research Methodology, Interrater Reliability
Peer reviewed
Kuiken, Folkert; Vedder, Ineke – Language Testing, 2017
The importance of functional adequacy as an essential component of L2 proficiency has been observed by several authors (Pallotti, 2009; De Jong, Steinel, Florijn, Schoonen, & Hulstijn, 2012a, b). The rationale underlying the present study is that the assessment of writing proficiency in L2 is not fully possible without taking into account the…
Descriptors: Second Language Learning, Rating Scales, Computational Linguistics, Persuasive Discourse
Peer reviewed
Kabuto, Bobbie – Reading Horizons, 2016
Through the presentation of two bilingual reader profiles, this article will illustrate how miscue analysis can act as a culturally relevant assessment tool as it allows for the study of reading across different spoken and written languages. The research presented in this article integrates a socio-psycholinguistic perspective to reading and a…
Descriptors: Sociolinguistics, Psycholinguistics, Miscue Analysis, Code Switching (Language)
Peer reviewed
Coniam, David – ReCALL, 2009
This paper describes a study of the computer essay-scoring program BETSY. While the use of computers in rating written scripts has been criticised in some quarters for lacking transparency or lack of fit with how human raters rate written scripts, a number of essay rating programs are available commercially, many of which claim to offer comparable…
Descriptors: Writing Tests, Scoring, Foreign Countries, Interrater Reliability
Peer reviewed
Jafarpur, Abdoljavad – System, 1988
Investigation of non-native English speakers' ratings of other non-native English learners' oral proficiency. Results indicate that the judges' ratings significantly differed, and the average of three judges' ratings was a better appraisal of the testee's true ability than that of any single rating or pair of ratings. (Author/CB)
Descriptors: English (Second Language), Evaluation Methods, Foreign Countries, Interrater Reliability
Takala, Sauli – 1998
This paper discusses recent developments in language testing. It begins with a review of the traditional criteria that are applied to all measurement and outlines recent emphases that derive from the expanding range of stakeholders. Drawing on Alderson's seminal work, criteria are presented for evaluating communicative language tests. Developments…
Descriptors: Alternative Assessment, Communicative Competence (Languages), Comparative Analysis, Evaluation Criteria
Peer reviewed
Zechner, Klaus; Bejar, Isaac I.; Hemat, Ramin – ETS Research Report Series, 2007
The increasing availability and performance of computer-based testing has prompted more research on the automatic assessment of language and speaking proficiency. In this investigation, we evaluated the feasibility of using an off-the-shelf speech-recognition system for scoring speaking prompts from the LanguEdge field test of 2002. We first…
Descriptors: Role, Computer Assisted Testing, Language Proficiency, Oral Language
Carlson, Sybil B.; And Others – 1985
Four writing samples were obtained from 638 foreign college applicants who represented three major foreign language groups (Arabic, Chinese, and Spanish), and from 60 native English speakers. All four were scored holistically, two were also scored for sentence-level and discourse-level skills, and some were scored by the Writer's Workbench…
Descriptors: Arabic, Chinese, College Entrance Examinations, Computer Software
International Association for Development of the Information Society, 2012
The IADIS CELDA 2012 Conference intention was to address the main issues concerned with evolving learning processes and supporting pedagogies and applications in the digital age. There had been advances in both cognitive psychology and computing that have affected the educational arena. The convergence of these two disciplines is increasing at a…
Descriptors: Academic Achievement, Academic Persistence, Academic Support Services, Access to Computers