ERIC - Search Results

Showing all 13 results

A Unified Approach to Estimating the Intraclass Correlation Coefficient and Its Bias: An Exploratory Study

Kelvin Terrell Pompey – ProQuest LLC, 2021

Many methods are used to measure interrater reliability for studies where each target receives ratings by a different set of judges. The purpose of this study is to explore the use of hierarchical modeling for estimating interrater reliability using the intraclass correlation coefficient. This study provides a description of how the ICC can be…

Descriptors: Interrater Reliability, Evaluation Methods, Test Reliability, Correlation

The Utility of the Social Responsiveness Scale, Second Edition for Children Suspected to Have Autism: Examining the SRS-2 in Light of Children's Race and Ethnicity

Emma Healy – ProQuest LLC, 2024

The shortage of autism specialists and lack of culturally sensitive autism assessment tools are helping to perpetuate racial and ethnic disparities in autism identification and treatment. Using DisCrit as a framework, this quantitative study examined the utility of one autism assessment tool, the Social Responsiveness Scale, second edition (SRS-2)…

Descriptors: Autism Spectrum Disorders, Student Evaluation, Diagnostic Tests, Disability Identification

Developing a Validity Argument Case for Locally Developed University English Preparedness Testing from an Ethical Perspective

Lynsey Joohyun Lee – ProQuest LLC, 2021

Reliability and validity are two important topics that have been studied for many decades in the educational measurement field, including discussions of Writing Studies' subfield of writing assessment, since the establishment of the College Entrance Exam Board [CEEB] in 1899 (Huot et al., 2010). In recent years, scholarly conversations of fairness…

Descriptors: Writing Evaluation, Test Validity, Test Reliability, Case Studies

Exploring Rating Quality in the Context of High-Stakes Rater-Mediated Educational Assessments

Wenjing Guo – ProQuest LLC, 2021

Constructed response (CR) items are widely used in large-scale testing programs, including the National Assessment of Educational Progress (NAEP) and many district and state-level assessments in the United States. One unique feature of CR items is that they depend on human raters to assess the quality of examinees' work. The judgment of human…

Descriptors: National Competency Tests, Responses, Interrater Reliability, Error of Measurement

Grading in Chemistry: Variations in Instructors' Evaluation of Student Written Responses

Michelle Herridge – ProQuest LLC, 2021

Evaluation of student written work during summative assessments is an important and critical task for instructors at all educational levels. Nevertheless, few research studies exist that provide insights into how different instructors approach this task. Chemistry faculty (FIs) and graduate student instructors (GSIs) regularly engage in the…

Descriptors: Science Instruction, Chemistry, College Faculty, Teaching Assistants

Interrater Reliability of Curriculum-Based Measures in Reading (R-CBMs) for English Learners

Ashley Marinez – ProQuest LLC, 2020

The purpose of the current study was to determine interrater reliability (IRR) of Oral Reading Fluency (ORF) Curriculum-based Measures (R-CBM) when used with Spanish-speaking English Learner (EL) students. The ORF R-CBM probes obtained from AIMSweb are measures of a student's reading accuracy skills and reading fluency skills. Certified school…

Descriptors: Interrater Reliability, English (Second Language), Reading Fluency, Curriculum Based Assessment

The Effects of Primacy on Rater Cognition: An Eye-Tracking Study

Ballard, Laura – ProQuest LLC, 2017

Rater scoring has an impact on writing test reliability and validity. Thus, there has been a continued call for researchers to investigate issues related to rating (Crusan, 2015). Investigating the scoring process and understanding how raters arrive at particular scores are critical "because the score is ultimately what will be used in making…

Descriptors: Evaluators, Schemata (Cognition), Eye Movements, Scoring Rubrics

Measuring Multidimensional Science Learning: Item Design, Scoring, and Psychometric Considerations

Castle, Courtney – ProQuest LLC, 2018

The Next Generation Science Standards propose a multidimensional model of science learning, comprised of Core Disciplinary Ideas, Science and Engineering Practices, and Crosscutting Concepts (NGSS Lead States, 2013). Accordingly, there is a need for student assessment aligned with the new standards. Creating assessments that validly and reliably…

Descriptors: Science Education, Student Evaluation, Science Tests, Test Construction

Discrepancies between Students' and Teachers' Ratings of Instructional Practice: A Way to Measure Classroom Intuneness and Evaluate Teaching Quality

Dockterman, Daniel Milo – ProQuest LLC, 2017

Student surveys have gained prominence in recent years as a way to give students a voice in their learning process, and teacher self-reports have always been an effective instrument for revealing the planning, intentions, and expectations behind a given lesson. Though student and teacher surveys are widely used, extant research in education has…

Descriptors: Outcome Measures, Teacher Evaluation, Student Evaluation of Teacher Performance, Evaluation Methods

Development and Validation of a Musical Behavior Measure for Preschool Children

Yi, Gina Jisun – ProQuest LLC, 2013

The purpose of this study was to develop a measure for use in assessing musical behaviors of preschool children in the context of regular music instruction and to determine the validity and the reliability of the measure. The Early Childhood Musical Behavior Measure (ECMBM) was constructed for use with preschool-aged children to measure their…

Descriptors: Preschool Children, Child Behavior, Music, Behavior Rating Scales

An Intervention and Assessment to Improve Information Literacy

Scharf, Davida – ProQuest LLC, 2013

Purpose: The goal of the study was to test an intervention using a brief essay as an instrument for evaluating higher-order information literacy skills in college students, while accounting for prior conditions such as socioeconomic status and prior academic achievement, and identify other predictors of information literacy through an evaluation…

Descriptors: Information Literacy, Intervention, Student Evaluation, College Students

Estimating the Reliability of Concept Map Ratings Using a Scoring Rubric Based on Three Attributes

Laura Jimenez Snelson – ProQuest LLC, 2010

Concept maps provide a way to assess how well students have developed an organized understanding of how the concepts taught in a unit are interrelated and fit together. However, concept maps are challenging to score because of the idiosyncratic ways in which students organize their knowledge (McClure, Sonak, & Suen, 1999). The construct a map…

Descriptors: Concept Mapping, Scoring Rubrics, Test Reliability, Interrater Reliability

Prompt and Rater Effects in Second Language Writing Performance Assessment

Lim, Gad S. – ProQuest LLC, 2009

Performance assessments have become the norm for evaluating language learners' writing abilities in international examinations of English proficiency. Two aspects of these assessments are usually systematically varied: test takers respond to different prompts, and their responses are read by different raters. This raises the possibility of undue…

Descriptors: Performance Based Assessment, Language Tests, Performance Tests, Test Validity