Publication Date
| In 2026 | 0 |
| Since 2025 | 58 |
| Since 2022 (last 5 years) | 284 |
| Since 2017 (last 10 years) | 780 |
| Since 2007 (last 20 years) | 2042 |
Descriptor
| Interrater Reliability | 3124 |
| Foreign Countries | 655 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 25 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Shepherd, J. Brad; Britton, Paula J.; Kress, Victoria E. – Australian Journal of Guidance and Counselling, 2008
The definition and measurement of counsellor trainee competency is an issue that has received increased attention yet lacks quantitative study. This research evaluates item responses, scale reliability and intercorrelations, interrater agreement, and criterion-related validity of the Professional Performance Fitness Evaluation/Professional…
Descriptors: Counselor Training, Trainees, Competence, Student Evaluation
Moffett, David W.; Reid, Barbara K.; Zhou, Yunfang – Online Submission, 2008
The unit determined that "Assessment 5: Effect on Student Learning" would be best measured by student teachers and interns utilizing an action research activity in their clinical experience. Twenty four action research projects were evaluated by the Director of Student Teaching. Interraters blind to the Director's scores evaluated the projects.…
Descriptors: Student Teaching, Student Teachers, Research Projects, Interrater Reliability
Johnson, Martin – Issues in Educational Research, 2008
This study investigated the cognitive strategies that underpin assessors' holistic judgments of a school-based vocationally-related portfolio performance. Using a portfolio already identified as containing borderline qualities, quantitative data were gathered about features that six assessors attended to as they holistically evaluated the…
Descriptors: Statistical Analysis, Holistic Approach, Portfolios (Background Materials), Vocational Education
The Functional Analytic Psychotherapy Rating Scale (FAPRS): A Behavioral Psychotherapy Coding System
Callaghan, Glenn M.; Follette, William C.; Ruckstuhl, L. E., Jr.; Linnerooth, Peter J. N. – Behavior Analyst Today, 2008
Many researchers and clinicians believe that the therapeutic relationship is essential in bringing about clinical change. Empirical research to support this contention is scarce in part due to the difficulty of specifying and measuring theoretically derived mechanisms of change and the important dimensions of the client-therapist relationship.…
Descriptors: Psychotherapy, Behavior Modification, Rating Scales, Behavior Change
Morris, Christopher – Developmental Medicine & Child Neurology, 2008
To address the need for a standardized system to classify the gross motor function of children with cerebral palsy, the authors developed a five-level classification system analogous to the staging and grading systems used in medicine. Nominal group process and Delphi survey consensus methods were used to examine content validity and revise the…
Descriptors: Psychomotor Skills, Children, Test Construction, Content Validity
Danov, Stacy E.; Symons, Frank J. – Behavior Modification, 2008
Visual inspection is the primary method used to analyze graphed behavioral data produced by functional analyses of problem behavior. The purpose of this study was to examine rater reliability of functional analysis graphs using visual inspection. Forty-three participants responded to a one-time anonymous survey (N = 454) mailed to graduate…
Descriptors: Graphs, Visual Discrimination, Behavior Problems, Functional Behavioral Assessment
Greatorex, Jackie; Bell, John F. – Research Papers in Education, 2008
It is particularly important that GCSE and A-level marking is valid and reliable as it affects the life chances of many young people in England. Current developments in marking technology are coinciding with potential changes in procedures to ensure valid and reliable marking. In this research the effectiveness of procedures to facilitate the…
Descriptors: Scripts, Intervention, Interrater Reliability, Examiners
Bryson, Susan E.; Zwaigenbaum, Lonnie; McDermott, Catherine; Rombough, Vicki; Brian, Jessica – Journal of Autism and Developmental Disorders, 2008
The Autism Observation Scale for Infants (AOSI) was developed to detect and monitor early signs of autism as they emerge in high-risk infants (all with an older sibling with an autistic spectrum disorder). Here we describe the scale and its development, and provide preliminary data on its reliability. Inter-rater reliability both for total scores…
Descriptors: Observation, Autism, Interrater Reliability, Infants
Cocchio, Kathy L. – Online Submission, 2009
The study sought to develop consensus opinion on the core competencies required to succeed as a female executive in the C-Suites of Alberta, Canada. The study was prompted by the significant under-representation of women in Canadian corporate executive positions and by a post-secondary institution's interest in determining whether a market exists…
Descriptors: Delphi Technique, Foreign Countries, Competence, Womens Studies
Lepkowski, William J.; Packman, Jill; Smaby, Marlowe H.; Maddux, Cleborne – Education, 2009
Counselor ability to accurately self-assess their competence is important to ethical practice. However, research indicates that people in general are not reliable in judging their own competence. This study compared the self-assessments of skills of 69 counselors-in-training to the skill ratings of trained expert-raters at three points during…
Descriptors: Counselor Training, Ethics, Pretests Posttests, Self Evaluation (Individuals)
Ehrenreich, Jill T.; Micco, Jamie A.; Fisher, Paige H.; Warner, Carrie Masia – Child Psychiatry and Human Development, 2009
Objective: Research on child and adolescent anxiety disorders has seen a surge in investigations of parenting factors potentially associated with their etiology. However, many of the well-established parenting measures are limited by over-reliance on self-report or lengthy behavioral observation procedures. Such measures may not assess factors…
Descriptors: Test Validity, Child Rearing, Interrater Reliability, Adolescents
Shahvali, M.; Poursaeed, A.; Sharifzadeh, M. – Journal of Natural Resources and Life Sciences Education, 2009
This study investigated the effects of workshop and lecture methods on pastoralists' learning in Ilam Province, west of Iran. A quasi-experimental research method and non-equivalent control group design was used. Sixty pastoralists participated in this study. An open-ended questionnaire was used as the instrument of the study and found to have…
Descriptors: Control Groups, Content Validity, Validity, Interrater Reliability
Ridley, Charles R.; Shaw-Ridley, Mary – Counseling Psychologist, 2009
Clinical judgment is foundational to psychological practice. Accurate judgment forms the basis for establishing reasonable goals and selecting appropriate treatments, which in turn are essential in achieving positive therapeutic outcomes. Therefore, Spengler and colleagues' meta-analytic finding--clinical judgment accuracy improves marginally with…
Descriptors: Medical Evaluation, Clinical Experience, Inferences, Therapy
Porter, Jennifer Marie – ProQuest LLC, 2010
This research evaluated the inter-rater reliability of the Performance Assessment for California Teachers (PACT). Multiple methods for estimating overall rater consistency include percent agreement and Cohen's Kappa (1960), which yielded discrepancies between rater agreement in terms of whether candidates passed or failed particular PACT rubrics.…
Descriptors: Interrater Reliability, Program Effectiveness, Scoring Rubrics, Item Analysis
Gerlick, Robert Edward – ProQuest LLC, 2010
The research presented in this manuscript was focused on the development of assessments for engineering design outcomes. The primary goal was to support efforts by the Transferrable Integrated Design Engineering Education (TIDEE) consortium in developing assessment instruments for multidisciplinary engineering capstone courses. Research conducted…
Descriptors: Engineering Education, Student Evaluation, Formative Evaluation, Testing

