Showing 1 to 15 of 16 results
Peer reviewed
White, Mark; Ronfeldt, Matt – Educational Assessment, 2024
Standardized observation systems seek to reliably measure a specific conceptualization of teaching quality, managing rater error through mechanisms such as certification, calibration, validation, and double-scoring. These mechanisms both support high quality scoring and generate the empirical evidence used to support the scoring inference (i.e.,…
Descriptors: Interrater Reliability, Quality Control, Teacher Effectiveness, Error Patterns
Peer reviewed
Wind, Stefanie A.; Xu, Yangmeng – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Peer reviewed
Jensen, Bryant; Grajeda, Sara; Haertel, Edward – Educational Assessment, 2018
We trace the development and analyze the generalizability of the Classroom Assessment of Sociocultural Interactions (CASI), an observation system designed to measure cultural dimensions of classroom interactions. We establish CASI measurement properties by analyzing panoramic videos of 4th and 5th grade classrooms from the Measures of Effective…
Descriptors: Classroom Observation Techniques, Grade 4, Grade 5, Error of Measurement
Peer reviewed
Berger, Jean-Louis; Karabenick, Stuart A. – Educational Assessment, 2016
Despite their significant contributions to research on self-regulated learning, those favoring online and trace approaches have questioned the use of self-report to assess learners' use of learning strategies. An important rejoinder to such criticisms consists of examining the validity of self-report items. The present study was designed to assess…
Descriptors: Construct Validity, Metacognition, Learning Strategies, Self Disclosure (Individuals)
Peer reviewed
Dawadi, Saraswati; Shrestha, Prithvi N. – Educational Assessment, 2018
There has been a steady interest in investigating the validity of language tests in the last decades. Despite numerous studies on construct validity in language testing, there are not many studies examining the construct validity of a reading test. This paper reports on a study that explored the construct validity of the English reading test in…
Descriptors: Foreign Countries, Construct Validity, Reading Tests, English (Second Language)
Peer reviewed
Shermis, Mark D. – Educational Assessment, 2015
This study compared short-form constructed responses evaluated by both human raters and machine scoring algorithms. The context was a public competition in which both public competitors and commercial vendors vied to develop machine scoring algorithms that would match or exceed the performance of operational human raters in a summative high-stakes…
Descriptors: Test Scoring Machines, Responses, Interrater Reliability, High Stakes Tests
Peer reviewed
Wind, Stefanie A.; Engelhard, George, Jr.; Wesolowski, Brian – Educational Assessment, 2016
When good model-data fit is observed, the Many-Facet Rasch (MFR) model acts as a linking and equating model that can be used to estimate student achievement, item difficulties, and rater severity on the same linear continuum. Given sufficient connectivity among the facets, the MFR model provides estimates of student achievement that are equated to…
Descriptors: Evaluators, Interrater Reliability, Academic Achievement, Music Education
Peer reviewed
Morris, R. C.; Parker, Loran Carleton; Nelson, David; Pistilli, Matthew D.; Hagen, Adam; Levesque-Bristol, Chantal; Weaver, Gabriela – Educational Assessment, 2014
This study examines the development and implementation of a survey-based instrument assessing the effectiveness of a course redesign initiative focused on student centeredness at a large midwestern university in the United States. Given the scope of the reform initiative under investigation in this study, researchers developed an instrument called…
Descriptors: Curriculum Design, Curriculum Development, Educational Change, Program Effectiveness
Peer reviewed
Wyse, Adam E.; Viger, Steven G. – Educational Assessment, 2011
An important part of test development is ensuring alignment between test forms and content standards. One common way of measuring alignment is the Webb (1997, 2007) alignment procedure. This article investigates (a) how well item writers understand components of the definition of Depth of Knowledge (DOK) from the Webb alignment procedure and (b)…
Descriptors: Test Items, Difficulty Level, Test Construction, Alignment (Education)
Peer reviewed
Hill, Heather C.; Charalambous, Charalambos Y.; Blazar, David; McGinn, Daniel; Kraft, Matthew A.; Beisiegel, Mary; Humez, Andrea; Litke, Erica; Lynch, Kathleen – Educational Assessment, 2012
Measurement scholars have recently constructed validity arguments in support of a variety of educational assessments, including classroom observation instruments. In this article, we note that users must examine the robustness of validity arguments to variation in the implementation of these instruments. We illustrate how such an analysis might be…
Descriptors: Validity, Classroom Observation Techniques, Measures (Individuals), Teacher Effectiveness
Peer reviewed
Martinez, Jose Felipe; Stecher, Brian; Borko, Hilda – Educational Assessment, 2009
In this study we use data from the Early Childhood Longitudinal Survey third- and fifth-grade samples to investigate teacher judgments of student achievement, the extent to which they offer a similar picture of student mathematics achievement compared to standardized test scores, and whether classroom assessment practices moderate the relationship…
Descriptors: Mathematics Achievement, Standardized Tests, Grade 5, Student Evaluation
Peer reviewed
Martinez, Jose Felipe; Goldschmidt, Pete; Niemi, David; Baker, Eva L.; Sylvester, Roxanne M. – Educational Assessment, 2007
We conducted generalizability studies to examine the extent to which ratings of language arts performance assignments, administered in a large, diverse, urban district to students in second through ninth grades, result in reliable and precise estimates of true student performance. The results highlight three important points when considering the…
Descriptors: Assignments, Language Arts, Academic Achievement, Urban Areas
Peer reviewed
Frederiksen, John R.; Sipusic, Mike; Sherin, Miriam; Wolfe, Edward W. – Educational Assessment, 1998
Developed a video portfolio technique of teacher assessment and evaluated the technique through studies of six teachers and their raters. Results show that teachers are consistent in observing teaching functions and using their observations to evaluate teaching. (SLD)
Descriptors: Evaluation Methods, Interrater Reliability, Portfolio Assessment, Teacher Evaluation
Peer reviewed
Matsumura, Lindsay Clare; Garnier, Helen; Pascal, Jenny; Valdes, Rosa – Educational Assessment, 2002
This article reports the technical quality of a measure describing the quality of classroom assignments piloted in the Los Angeles Unified School District's proposed new accountability system. For this study, 181 teachers were sampled from 35 schools selected at random. Participating teachers submitted three language arts assignments with samples…
Descriptors: Achievement Tests, Accountability, Academic Achievement, Assignments
Peer reviewed
Supovitz, Jonathan A.; MacGowan, Andrew, III; Slattery, Jean – Educational Assessment, 1997
Reports on the interrater reliability of a language arts portfolio assessment in the primary grades of the Rochester (New York) school system. Results from approximately 400 primary grade portfolios rated by 2 raters show that teachers can assess their own students' work reliably. (SLD)
Descriptors: Evaluation Methods, Evaluators, Interrater Reliability, Portfolio Assessment