Showing 1 to 15 of 113 results
Peer reviewed
Download full text (PDF on ERIC)
Alan Huebner; Gustaf B. Skar; Mengchen Huang – Practical Assessment, Research & Evaluation, 2025
Generalizability theory is a modern and powerful framework for conducting reliability analyses. It is flexible enough to accommodate both random and fixed facets, yet the practical literature offers relatively little guidance on handling the fixed-facet case. This article aims to provide practitioners with a conceptual understanding and…
Descriptors: Generalizability Theory, Multivariate Analysis, Statistical Analysis, Writing Evaluation
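For orientation only (the article itself treats the harder fixed-facet case): a minimal sketch of the simplest fully random one-facet design, persons crossed with raters, where two-way ANOVA mean squares yield variance components and a relative generalizability coefficient. All data below are made up.

    import numpy as np

    # Hypothetical persons-by-raters score matrix: 20 essays, 4 raters.
    rng = np.random.default_rng(0)
    scores = rng.normal(3.0, 1.0, (20, 1)) + rng.normal(0.0, 0.5, (20, 4))

    n_p, n_r = scores.shape
    grand = scores.mean()
    ms_p = n_r * np.sum((scores.mean(axis=1) - grand) ** 2) / (n_p - 1)
    resid = (scores - scores.mean(axis=1, keepdims=True)
             - scores.mean(axis=0, keepdims=True) + grand)
    ms_e = np.sum(resid ** 2) / ((n_p - 1) * (n_r - 1))

    var_p = max((ms_p - ms_e) / n_r, 0.0)  # person (true-score) variance
    g_rel = var_p / (var_p + ms_e / n_r)   # relative G coefficient, n_r raters
    print(round(g_rel, 3))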
Peer reviewed
Direct link
Joan Li; Nikhil Kumar Jangamreddy; Ryuto Hisamoto; Ruchita Bhansali; Amalie Dyda; Luke Zaphir; Mashhuda Glencross – Australasian Journal of Educational Technology, 2024
Generative artificial intelligence technologies such as ChatGPT bring unprecedented change to education by leveraging the power of natural language processing and machine learning. Employing ChatGPT to assist with marking written assessment presents multiple advantages, including scalability, improved consistency, and the elimination of biases associated…
Descriptors: Higher Education, Artificial Intelligence, Grading, Scoring Rubrics
Peer reviewed
Direct link
Bouwer, Renske; Koster, Monica; van den Bergh, Huub – Assessment in Education: Principles, Policy & Practice, 2023
Assessing students' writing performance is essential to adequately monitor and promote individual writing development, but it is also a challenge. The present research investigates a benchmark rating procedure for assessing texts written by upper-elementary students. In two studies we examined whether a benchmark rating procedure (1) leads to…
Descriptors: Benchmarking, Writing Evaluation, Evaluation Methods, Elementary School Students
Peer reviewed
Direct link
Li, Wentao – Reading and Writing: An Interdisciplinary Journal, 2022
Scoring rubrics are known to be effective for assessing writing for both testing and classroom teaching purposes. How raters interpret the descriptors in a rubric can significantly impact the subsequent final score, and further, the descriptors may also color a rater's judgment of a student's writing quality. Little is known, however, about how…
Descriptors: Scoring Rubrics, Interrater Reliability, Writing Evaluation, Teaching Methods
Peer reviewed
Direct link
Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024
Automatic essay scoring (AES) is an essential educational application of natural language processing. Automating the process alleviates the grading burden while increasing the reliability and consistency of assessment. With advances in text-embedding libraries and neural network models, AES systems have achieved good results in terms of accuracy.…
Descriptors: Scoring, Essays, Writing Evaluation, Memory
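As a hedged illustration of the general recipe such systems share (this is not the review's own code or data): score prediction as supervised regression over text features, here TF-IDF with ridge regression via scikit-learn.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge
    from sklearn.pipeline import make_pipeline

    # Hypothetical training essays and human scores.
    essays = ["The author argues that...", "In my opinion the text...", "This essay shows..."]
    human_scores = [4.0, 2.5, 3.0]

    # Fit a simple feature-based scorer; real AES systems replace the
    # TF-IDF step with learned embeddings or neural models.
    scorer = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
    scorer.fit(essays, human_scores)
    print(scorer.predict(["A new, unscored essay..."]))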
Peer reviewed
Direct link
Huiying Cai; Xun Yan – Language Testing, 2024
Rater comments are typically analyzed qualitatively to show how raters apply rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The…
Descriptors: Natural Language Processing, Item Response Theory, Rating Scales, Writing Evaluation
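For reference, the standard textbook MFRM formulation such score analyses rely on (not necessarily the study's exact specification) models the log-odds of adjacent rating categories as additive facet effects:

    \log \frac{P_{nijk}}{P_{nij(k-1)}} = \theta_n - \delta_i - \alpha_j - \tau_k

where \theta_n is examinee ability, \delta_i task difficulty, \alpha_j rater severity, and \tau_k the step from category k-1 to k.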
Peer reviewed
Direct link
Pinot de Moira, Anne; Wheadon, Christopher; Christodoulou, Daisy – Research in Education, 2022
Writing is generally assessed internationally using rubric-based approaches, but there is a growing body of evidence to suggest that the reliability of such approaches is poor. In contrast, comparative judgement studies suggest that it is possible to assess open ended tasks such as writing with greater reliability. Many previous studies, however,…
Descriptors: Writing Evaluation, Classification, Accuracy, Scoring Rubrics
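A hedged sketch of the mechanics behind comparative judgement (illustrative, not the authors' implementation): judges' pairwise wins fitted with a basic Bradley-Terry model via fixed-point (MM) updates, producing a quality scale for the scripts.

    import numpy as np

    n = 4                      # hypothetical number of scripts
    wins = np.zeros((n, n))    # wins[i, j]: times script i beat script j
    judgements = [(0, 1), (0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3), (2, 0)]
    for winner, loser in judgements:
        wins[winner, loser] += 1

    p = np.ones(n)
    for _ in range(200):                        # MM fixed-point iterations
        total = wins + wins.T                   # comparisons per pair
        denom = (total / (p[:, None] + p[None, :])).sum(axis=1)
        p = wins.sum(axis=1) / denom
        p /= p.sum()                            # fix the arbitrary scale
    print(np.log(p))                            # quality estimates (logits)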
Peer reviewed
Download full text (PDF on ERIC)
Elif Sari – International Journal of Assessment Tools in Education, 2024
Employing G-theory and rater interviews, the study investigated how a high-stakes writing assessment procedure (i.e., a single-task, single-rater, and holistic scoring procedure) impacted the variability and reliability of its scores within the Turkish higher education context. Thirty-two essays written on two different writing tasks (i.e.,…
Descriptors: Foreign Countries, High Stakes Tests, Writing Evaluation, Scores
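For context, the standard D-study projection for a fully crossed persons x tasks x raters design (generic G-theory form, not the study's estimated variance components) shows why a single-task, single-rater procedure caps reliability:

    E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pt}/n'_t + \sigma^2_{pr}/n'_r + \sigma^2_{ptr,e}/(n'_t n'_r)}

With n'_t = n'_r = 1, every interaction and error term enters the denominator at full strength.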
Peer reviewed
Download full text (PDF on ERIC)
Paul Deane; Duanli Yan; Katherine Castellano; Yigal Attali; Michelle Lamar; Mo Zhang; Ian Blood; James V. Bruno; Chen Li; Wenju Cui; Chunyi Ruan; Colleen Appel; Kofi James; Rodolfo Long; Farah Qureshi – ETS Research Report Series, 2024
This paper presents a multidimensional model of variation in writing quality, register, and genre in student essays, trained and tested via confirmatory factor analysis of 1.37 million essay submissions to ETS' digital writing service, Criterion®. The model was also validated with several other corpora, which indicated that it provides a…
Descriptors: Writing (Composition), Essays, Models, Elementary School Students
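For reference, the generic confirmatory factor model underlying such an analysis, in textbook notation (not the report's specific loadings or factors):

    x = \Lambda \xi + \delta, \qquad \Sigma(\theta) = \Lambda \Phi \Lambda^{\top} + \Theta_{\delta}

Fit is judged by how closely the model-implied covariance matrix \Sigma(\theta) reproduces the sample covariance of the observed essay features.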
Peer reviewed
Direct link
Tahani I. Aldosemani; Hussein Assalahi; Areej Lhothali; Maram Albsisi – International Journal of Computer-Assisted Language Learning and Teaching, 2023
This paper explores the literature on AWE feedback, particularly its perceived impact on enhancing EFL students' writing proficiency. Prior research has highlighted the contribution of AWE to fostering learner autonomy and alleviating teacher workloads, with a substantial focus on student engagement with AWE feedback. This review strives to illuminate…
Descriptors: Automation, Student Evaluation, Writing Evaluation, English (Second Language)
Peer reviewed
Direct link
Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025
This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics
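One common way to quantify the agreement such a comparison turns on (an illustrative check with made-up ratings, not the study's data) is quadratic weighted kappa between human and LLM scores:

    from sklearn.metrics import cohen_kappa_score

    human = [4, 3, 5, 2, 4, 3, 1, 5]  # hypothetical instructor ratings (1-5 rubric)
    llm = [4, 3, 4, 2, 5, 3, 2, 5]    # hypothetical LLM ratings of the same essays

    # Quadratic weighting penalizes large disagreements more than near-misses.
    print(cohen_kappa_score(human, llm, weights="quadratic"))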
Peer reviewed
Direct link
Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025
Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction
Peer reviewed
Direct link
Romig, John Elwood; Olsen, Amanda A. – Reading & Writing Quarterly, 2021
Compared with other content areas, curriculum-based measurement of writing (CBM-W) has received relatively little research attention. This study conducted a conceptual replication examining the reliability, stability, and sensitivity to growth of slopes produced from CBM-W. Eighty-nine (N = 89) eighth-grade students responded to one CBM-W probe weekly for 11…
Descriptors: Curriculum Based Assessment, Writing Evaluation, Middle School Students, Grade 8
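The slope idea at the center of this design, sketched with invented data (not the study's): fit an ordinary least-squares line to each student's weekly probe scores and read growth off the slope.

    import numpy as np

    weeks = np.arange(11)                  # 11 weekly CBM-W probes
    rng = np.random.default_rng(1)
    # Five hypothetical students: individual baselines, growth rates, noise.
    scores = (20 + rng.normal(0, 3, (5, 1))
              + np.outer(rng.uniform(0.5, 2.0, 5), weeks)
              + rng.normal(0, 2, (5, 11)))

    slopes = [np.polyfit(weeks, s, 1)[0] for s in scores]  # units per week
    print(np.round(slopes, 2))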
Peer reviewed
Direct link
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
Peer reviewed
Direct link
Ramon-Casas, Marta; Nuño, Neus; Pons, Ferran; Cunillera, Toni – Assessment & Evaluation in Higher Education, 2019
This article presents an empirical evaluation of the validity and reliability of a peer-assessment activity to improve academic writing competences. Specifically, we explored a large group of psychology undergraduate students with different initial writing skills. Participants (n = 365) produced two different essays, which were evaluated by their…
Descriptors: Peer Evaluation, Validity, Reliability, Writing Skills