ERIC - Search Results

Showing 1 to 15 of 38 results

Scoring Rubric Reliability and Internal Validity in Rater-Mediated EFL Writing Assessment: Insights from Many-Facet Rasch Measurement

Peer reviewed

Li, Wentao – Reading and Writing: An Interdisciplinary Journal, 2022

Scoring rubrics are known to be effective for assessing writing for both testing and classroom teaching purposes. How raters interpret the descriptors in a rubric can significantly impact the subsequent final score, and further, the descriptors may also color a rater's judgment of a student's writing quality. Little is known, however, about how…

Descriptors: Scoring Rubrics, Interrater Reliability, Writing Evaluation, Teaching Methods

Examining the Effect of Item Difficulty and Rater Leniency on Iranian Test Takers' Performance on WDCT and DSAT: A Comparative Study

Peer reviewed

Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025

The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…

Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction

The Whole Is More than the Sum of Its Parts -- Assessing Writing Using the Consensual Assessment Technique

Peer reviewed

Zahn, Daniela; Canton, Ursula; Boyd, Victoria; Hamilton, Laura; Mamo, Josianne; McKay, Jane; Proudfoot, Linda; Telfer, Dickson; Williams, Kim; Wilson, Colin – Studies in Higher Education, 2021

Evaluating the impact of Academic Literacies teaching (Lea and Street [1998. "Student Writing in Higher Education: An Academic Literacies Approach." "Studies in Higher Education" 23 (2): 157-72. doi:10.1080/03075079812331380364]) is difficult, as it involves gauging whether writers: (1) gain better understanding of what…

Descriptors: Writing Evaluation, Evaluation Methods, Undergraduate Students, Foreign Countries

Making Each Point Count: Revising a Local Adaptation of the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE Rubric

Peer reviewed

Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024

In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…

Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)

Do You Mean What I Mean? Comparing Teacher Performance Self-Scores and Evaluator-Generated Scores

Peer reviewed

Hunter, Seth B. – Journal of Education Human Resources, 2023

Teacher performance scores inform education leaders' management of teacher human resources. However, prior research has implied that different interpretations of performance criteria between teachers and their evaluators suppress teacher development. Although research has examined teacher perceptions of performance scores and compared teacher…

Descriptors: Teacher Evaluation, Teacher Effectiveness, Self Evaluation (Individuals), Interrater Reliability

Using Rasch Analysis to Examine Raters' Expertise Turkish Teacher Candidates' Competency Levels in Writing Different Types of Test Items

Peer reviewed

Sayin, Ayfer; Sata, Mehmet – International Journal of Assessment Tools in Education, 2022

The aim of the present study was to examine Turkish teacher candidates' competency levels in writing different types of test items by utilizing Rasch analysis. In addition, the effect of the expertise of the raters scoring the items written by the teacher candidates was examined within the scope of the study. 84 Turkish teacher candidates…

Descriptors: Foreign Countries, Item Response Theory, Evaluators, Expertise

Fairness in Oral Language Assessment: Training Raters and Considering Examinees' Expectations

Peer reviewed

Doosti, Mehdi; Ahmadi Safa, Mohammad – International Journal of Language Testing, 2021

This study examined the effect of rater training on promoting inter-rater reliability in oral language assessment. It also investigated whether rater training and the consideration of the examinees' expectations by the examiners have any effect on test-takers' perceptions of being fairly evaluated. To this end, four raters scored 31 Iranian…

Descriptors: Oral Language, Language Tests, Interrater Reliability, Training

Assessing Transversal Competences in Professional Internships: The Role of Assessment Agents

Peer reviewed

Romeo, Marina; Yepes-Baldó, Montserrat; González, Vicenta; Burset, Silvia; Martín, Carolina; Bosch, Emma – International Journal of Instruction, 2022

The assessment process in higher education considers four aspects: assessment agents, procedure, content, and scoring. In this study, we delve into the who. We analyze the role of transversal competence assessment agents in the framework of professional internships in university master's degree programs, comparing the suitability of their…

Descriptors: Internship Programs, Higher Education, Evaluators, Masters Programs

The Effect of Workshop Training on Rater Variability in Children's Oral Narrative Assessment

Peer reviewed

Karusoo-Musumeci, Ava; Pearce, Wendy M.; Donaghy, Michelle – Child Language Teaching and Therapy, 2022

Oral narrative assessments are important for diagnosis of language disorders in school-age children so scoring needs to be reliable and consistent. This study explored the impact of training on the variability of story grammar scores in children's oral narrative assessments scored by multiple raters. Fifty-one speech pathologists and 19 final-year…

Descriptors: Oral Language, Speech Evaluation, Language Impairments, Elementary School Students

Low Inter-Rater Reliability of a High Stakes Performance Assessment of Teacher Candidates

Peer reviewed

Lyness, Scott A.; Peterson, Kent; Yates, Kenneth – Education Sciences, 2021

The Performance Assessment for California Teachers (PACT) is a high stakes summative assessment that was designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen's weighted kappa, the overall IRR estimate was 0.17 (poor strength of…

Descriptors: High Stakes Tests, Performance Based Assessment, Teacher Effectiveness, Academic Language

The Use of Semantic Similarity Tools in Automated Content Scoring of Fact-Based Essays Written by EFL Learners

Peer reviewed

Wang, Qiao – Education and Information Technologies, 2022

This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring

Writing Scale Effects on Raters: An Exploratory Study

Peer reviewed

Jeong, Heejeong – Language Testing in Asia, 2019

In writing assessment, finding a valid, reliable, and efficient scale is critical. Appropriate scales increase rater reliability and can also save time and money. This exploratory study compared the effects of a binary scale and an analytic scale across teacher raters and expert raters. The purpose of the study is to find out how different scale…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

For a Greater Good: Bias Analysis in Writing Assessment

Peer reviewed

Ahmadi Shirazi, Masoumeh – SAGE Open, 2019

Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…

Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests

Comparison of Automatic and Expert Teachers' Rating of Computerized English Listening-Speaking Test

Peer reviewed

Linlin, Cao – English Language Teaching, 2020

Through Many-Facet Rasch analysis, this study explores the rating differences between 1 computer automatic rater and 5 expert teacher raters on scoring 119 students in a computerized English listening-speaking test. Results indicate that both automatic and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…

Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning

Better Feedback for Better Teaching: A Practical Guide to Improving Classroom Observations

Archer, Jeff; Cantrell, Steve; Holtzman, Steven L.; Joe, Jilliam N.; Tocci, Cynthia M.; Wood, Jess – Bill & Melinda Gates Foundation, 2016

In this book the authors explain how to build, and over time improve, the elements of an observation system that equips all observers to identify and develop effective teaching. It is based on the collective knowledge of key partners in the Measures of Effective Teaching (MET) project--which carried out one of the largest-ever studies of classroom…

Descriptors: Feedback (Response), Teacher Effectiveness, Observation, Teacher Evaluation
