Showing 1 to 15 of 40 results
Peer reviewed
Arielle Boguslav; Julie Cohen – Journal of Teacher Education, 2024
Teacher preparation programs are increasingly expected to use data on preservice teacher (PST) skills to drive program improvement and provide targeted supports. Observational ratings are especially vital, but also prone to measurement issues. Scores may be influenced by factors unrelated to PSTs' instructional skills, including rater standards.…
Descriptors: Preservice Teachers, Measures (Individuals), Evaluation Problems, Teaching Skills
Peer reviewed
Mark White; Matt Ronfeldt – Educational Assessment, 2024
Standardized observation systems seek to reliably measure a specific conceptualization of teaching quality, managing rater error through mechanisms such as certification, calibration, validation, and double-scoring. These mechanisms both support high quality scoring and generate the empirical evidence used to support the scoring inference (i.e.,…
Descriptors: Interrater Reliability, Quality Control, Teacher Effectiveness, Error Patterns
Michelle Herridge – ProQuest LLC, 2021
Evaluation of student written work during summative assessments is a critical task for instructors at all educational levels. Nevertheless, few research studies provide insights into how different instructors approach this task. Chemistry faculty instructors (FIs) and graduate student instructors (GSIs) regularly engage in the…
Descriptors: Science Instruction, Chemistry, College Faculty, Teaching Assistants
Peer reviewed
Joseph, Gail; Soderberg, Janet S.; Stull, Sara; Cummings, Kevin; McCutchen, Deborah; Han, Rachel J. – Early Education and Development, 2020
Research Findings: This study explores the inter-rater reliability of WaKIDS, Washington State's kindergarten entry assessment (KEA). Specifically, we analyze (1) the extent to which teachers' assessments are in agreement with a master code, (2) how often inaccurate assessment decisions lead to misidentification of school readiness, and (3)…
Descriptors: Interrater Reliability, School Readiness, Kindergarten, Evaluation Problems
Peer reviewed
Szafran, Robert F. – Practical Assessment, Research & Evaluation, 2017
Institutional assessment of student learning objectives has become a fact-of-life in American higher education and the Association of American Colleges and Universities' (AAC&U) VALUE Rubrics have become a widely adopted evaluation and scoring tool for student work. As faculty from a variety of disciplines, some less familiar with the…
Descriptors: Interrater Reliability, Case Studies, Scoring Rubrics, Behavioral Objectives
Peer reviewed
Parker, Richard I.; Vannest, Kimberly J.; Davis, John L. – Journal of School Psychology, 2013
The use of multi-category scales is increasing for the monitoring of IEP goals, classroom and school rules, and Behavior Improvement Plans (BIPs). Although they require greater inference than traditional data counting, little is known about the inter-rater reliability of these scales. This simulation study examined the performance of nine…
Descriptors: Rating Scales, Scaling, Interrater Reliability, Test Reliability
Peer reviewed
Csomay, Eniko; Pollard, Elizabeth; Bordelon, Suzanne; Beck, Audrey – Journal of General Education, 2015
Despite the desire of employers to hire those with the critical-thinking and communication skills a general education (GE) program can offer, the value of GE programs is often questioned due to concerns about four-year graduation rates, perceived low immediate economic payoff, and a dearth of evidence to support their efficacy. This article…
Descriptors: General Education, Critical Thinking, Communication Skills, Graduation Rate
Peer reviewed
Praetorius, Anna-Katharina; Lenske, Gerlinde; Helmke, Andreas – Learning and Instruction, 2012
Despite considerable interest in the topic of instructional quality in research as well as practice, little is known about the quality of its assessment. Using generalizability analysis as well as content analysis, the present study investigates how reliably and validly instructional quality is measured by observer ratings. Twelve trained raters…
Descriptors: Student Teachers, Interrater Reliability, Content Analysis, Observation
Moffett, David W.; Reid, Barbara K. – Online Submission, 2010
The investigators studied the scoring reliability of candidates' ten-day unit plans of instruction through prescribed action research projects across three academic years. Scoring of the projects in year one provided opportunities for further refinement of the action research evaluation methods in year two. Across three terms in years one and two…
Descriptors: Research Projects, Action Research, Student Evaluation, Mastery Learning
Peer reviewed
Fryer, Marilyn – Creativity Research Journal, 2012
This article explores a number of key issues with regard to the measurement of creativity in the course of conducting psychological research or when applying various evaluation measures. It is argued that, although creativity is a fuzzy concept, it is no more difficult to investigate than other fuzzy concepts people tend to take for granted. At…
Descriptors: Creativity, Educational Research, Psychological Studies, Evaluation Methods
Peer reviewed
Farmer, Sybil E.; Wood, Duncan; Swain, Ian D.; Pandyan, Anand D. – International Journal of Rehabilitation Research, 2012
Systematic reviews are used to inform practice, and develop guidelines and protocols. A questionnaire to quantify the risk of bias in systematic reviews, the review paper assessment (RPA) tool, was developed and tested. A search of electronic databases provided a data set of review articles that were then independently reviewed by two assessors…
Descriptors: Outcome Measures, Interrater Reliability, Questionnaires, Literature Reviews
Peer reviewed
Clarkeburn, Henriikka; Kettula, Kirsi – Teaching in Higher Education, 2012
This study examines the fairness of assessing learning journals, both in terms of creating a valid and robust marking process and in terms of how different student groups may be unfairly disadvantaged in reflective assessment tasks. The fairness of a marking process is discussed through reflection on the practical process and…
Descriptors: Student Evaluation, Reflection, Summative Evaluation, Formative Evaluation
Peer reviewed
Baker, Beverly A. – Assessing Writing, 2010
In high-stakes writing assessments, rater training in the use of a rating scale does not eliminate variability in grade attribution. This realisation has been accompanied by research that explores possible sources of rater variability, such as rater background or rating scale type. However, there has been little consideration thus far of…
Descriptors: Foreign Countries, Writing Evaluation, Writing Tests, Testing
Peer reviewed
McKenzie, Robert G. – Learning Disability Quarterly, 2009
The assessment procedures within Response to Intervention (RTI) models have begun to supplant the use of traditional, discrepancy-based frameworks for identifying students with specific learning disabilities (SLD). Many RTI proponents applaud this shift because of perceived shortcomings in utilizing discrepancy as an indicator of SLD. However,…
Descriptors: Intervention, Learning Disabilities, Error of Measurement, Psychometrics
Peer reviewed
Zhu, Weimo; Rink, Judy; Placek, Judith H.; Graber, Kim C.; Fox, Connie; Fisette, Jennifer L.; Dyson, Ben; Park, Youngsik; Avery, Marybell; Franck, Marian; Raynes, De – Measurement in Physical Education and Exercise Science, 2011
New testing theories, concepts, and psychometric methods (e.g., item response theory, test equating, and item bank) developed during the past several decades have many advantages over previous theories and methods. In spite of their introduction to the field, they have not been fully accepted by physical educators. Further, the manner in which…
Descriptors: Physical Education, Quality Control, Psychometrics, Item Response Theory