NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 8 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Chen, Michelle Y.; Liu, Yan; Zumbo, Bruno D. – Educational and Psychological Measurement, 2020
This study introduces a novel differential item functioning (DIF) method based on propensity score matching that tackles two challenges in analyzing performance assessment data, that is, continuous task scores and lack of a reliable internal variable as a proxy for ability or aptitude. The proposed DIF method consists of two main stages. First,…
Descriptors: Probability, Scores, Evaluation Methods, Test Items
Peer reviewed Peer reviewed
Direct linkDirect link
Wind, Stefanie A. – Language Testing, 2023
Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting…
Descriptors: Evaluators, Decision Making, Student Characteristics, Performance Based Assessment
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Eskin, Daniel – Studies in Applied Linguistics & TESOL, 2022
For agencies that deliver high-stakes Second Language (L2) proficiency exams, a research agenda has been undertaken for years to examine the role of rater, task, and rubric as sources of variability into their performance assessments (Lee, 2006; Sawaki & Sinharay, 2013; Xi, 2007; Xi & Mollaun, 2006). However, these challenges are more…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Student Placement
Peer reviewed Peer reviewed
Direct linkDirect link
Lim, Gad S. – Language Testing, 2011
Raters are central to writing performance assessment, and rater development--training, experience, and expertise--involves a temporal dimension. However, few studies have examined new and experienced raters' rating performance longitudinally over multiple time points. This study uses operational data from the writing section of the MELAB (n =…
Descriptors: Expertise, Writing Evaluation, Performance Based Assessment, Writing Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Eckes, Thomas – Language Testing, 2008
Research on rater effects in language performance assessments has provided ample evidence for a considerable degree of variability among raters. Building on this research, I advance the hypothesis that experienced raters fall into types or classes that are clearly distinguishable from one another with respect to the importance they attach to…
Descriptors: Performance Based Assessment, Language Tests, Measures (Individuals), Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Knoch, Ute – Language Testing, 2009
Alderson (2005) suggests that diagnostic tests should identify strengths and weaknesses in learners' use of language and focus on specific elements rather than global abilities. However, rating scales used in performance assessment have been repeatedly criticized for being imprecise and therefore often resulting in holistic marking by raters…
Descriptors: Feedback (Response), Language Usage, Performance Based Assessment, Performance Tests
Lim, Gad S. – ProQuest LLC, 2009
Performance assessments have become the norm for evaluating language learners' writing abilities in international examinations of English proficiency. Two aspects of these assessments are usually systematically varied: test takers respond to different prompts, and their responses are read by different raters. This raises the possibility of undue…
Descriptors: Performance Based Assessment, Language Tests, Performance Tests, Test Validity
Barton, Paul E.; Coley, Richard J. – 1994
This report provides a profile of state testing programs in 1992-93, as well as a view of classroom testing practices by state, school district, school, or individual teacher. Information, taken from a variety of sources, including the National Assessment of Educational Progress and a General Accounting Office study, indicates that the…
Descriptors: Ability, Accountability, Achievement Tests, Alternative Assessment