Mainkar, Avinash V. – Journal of Management Education, 2008
Although practiced widely in management education, grading of student participation in class discussions has been criticized by researchers: The instructor simultaneously adopts two incompatible tasks of facilitating class discussion and evaluating student participation; students play for points instead of focusing on learning; and common,…
Descriptors: Administrator Education, Teaching Styles, Student Evaluation, Student Participation
Wang, Wen-Chung; Wilson, Mark – Applied Psychological Measurement, 2005
The random-effects facet model that deals with local item dependence in many-facet contexts is presented. It can be viewed as a special case of the multidimensional random coefficients multinomial logit model (MRCMLM) so that the estimation procedures for the MRCMLM can be directly applied. Simulations were conducted to examine parameter recovery…
Descriptors: Test Reliability, Item Response Theory, Interrater Reliability, Rating Scales
Hardison, Chaitra M.; Vilamovska, Anna-Marie – RAND Corporation, 2009
The Collegiate Learning Assessment (CLA) is a measure of how much students' critical thinking improves after attending college or university. This report illustrates how institutions can set their own standards on the CLA using a method that is appropriate for the CLA's unique characteristics. The authors examined evidence of reliability and…
Descriptors: Standard Setting, Evaluation Methods, Research Reports, Critical Thinking
Moffett, David W.; Zhou, Yunfang; Reid, Barbara K. – Online Submission, 2009
The Investigators studied effects of Candidates' 10-day unit plans of instruction through prescribed action research projects, across academic years 2007-2008 and 2008-2009. Results of the spring term '07-'08 Action Research projects informed the Unit in such a way that modifications were possible and made across programs. This resulted in…
Descriptors: Student Teachers, Research Projects, Action Research, Interrater Reliability
Coniam, David – Educational Research and Evaluation, 2009
This paper describes a study comparing paper-based marking (PBM) and onscreen marking (OSM) in Hong Kong utilising English language essay scripts drawn from the live 2007 Hong Kong Certificate of Education Examination (HKCEE) Year 11 English Language Writing Paper. In the study, 30 raters from the 2007 HKCEE Writing Paper marked on paper 100…
Descriptors: Student Attitudes, Foreign Countries, Essays, Comparative Analysis
Mawhinney, Thomas C. – Journal of Organizational Behavior Management, 2009
It is possible to define an organization's culture in terms of its dominant behavioral practices and their molar consequences, from the shop floor to the executive suite (Redmon & Mason, 2001). Dysfunctional and potentially deadly practices (for the organization as a whole) can be "latent." They often go undetected until their…
Descriptors: Foreign Countries, Banking, Leadership Effectiveness, Organizational Culture
Martinez, Jose Felipe; Stecher, Brian; Borko, Hilda – Educational Assessment, 2009
In this study we use data from the Early Childhood Longitudinal Survey third- and fifth-grade samples to investigate teacher judgments of student achievement, the extent to which they offer a similar picture of student mathematics achievement compared to standardized test scores, and whether classroom assessment practices moderate the relationship…
Descriptors: Mathematics Achievement, Standardized Tests, Grade 5, Student Evaluation
Webb, Noreen M.; Herman, Joan L.; Webb, Norman L. – Educational Measurement: Issues and Practice, 2007
This article examines the role of reviewer agreement in judgments about alignment between tests and standards. We used case data from three state alignment studies to explore how different approaches to incorporating reviewer agreement change alignment conclusions. The three case studies showed varying degrees of reviewer agreement about…
Descriptors: Test Items, Case Studies, Mathematics, Interrater Reliability
Hedge, Jerry W.; Borman, Walter C.; Kubisiak, U. Christean; Bourne, Mark J. – Performance Improvement, 2007
The goal of the project examined here was to establish performance standards for Navy aerographer's mate (AG) enlisted sailors at three skill levels. We used an online expert judgment task and consensus workshop methodology to gather information from subject matter experts on minimal proficiency requirements for each task within each skill level.…
Descriptors: Workshops, Task Analysis, Cutting Scores, Military Personnel
Mclaughlin, Kevin; Coderre, Sylvain; Mortis, Garth; Fick, Gordon; Mandin, Henry – Advances in Health Sciences Education, 2007
Context: Evolution from novice to expert is associated with the development of expert-type knowledge structure. The objectives of this study were to examine reliability and validity of concept sorting (ConSort[C]) as a measure of static knowledge structure and to determine the relationship between concepts in static knowledge structure and…
Descriptors: Medical Students, Protocol Analysis, Construct Validity, Validity
Ideishi, Roger I.; O'Neil, Margaret E.; Chiarello, Lisa A.; Nixon-Cave, Kim – Physical & Occupational Therapy in Pediatrics, 2010
This study explored perspectives on the therapist's role in care coordination between early intervention (EI) and medical services, and identified strategies for improving service delivery. Fifty adults participated in one of six focus groups. Participants included parents, pediatricians, and therapists working in hospital and EI programs. Structured…
Descriptors: Grounded Theory, Medical Services, Delivery Systems, Early Intervention
DeCarlo, Lawrence T.; Kim, YoungKoung – College Board, 2008
[Slides] presented at the American Educational Research Association (AERA) Conference in New York in March 2008. This presentation explores what cues are used as a deciding factor in essay scoring by the essay grader.
Descriptors: Essays, Grading, Evaluation Criteria, Scoring Rubrics
Dymond, Stacy K.; Renzaglia, Adelle; Halle, James W.; Chadsey, Janis; Bentz, Johnell L. – Teacher Education and Special Education, 2008
In this study, the authors determine the efficacy of videoconferencing to supervise pre-service special education teachers. Efficacy is determined by (a) assessing interobserver reliability between on-site and off-site observers and (b) evaluating the feasibility and practicality of the videoconferencing technology. Data are collected in two…
Descriptors: Practicums, Practicum Supervision, Interrater Reliability, Disabilities
Nasstrom, Gunilla; Henriksson, Widar – Electronic Journal of Research in Educational Psychology, 2008
Introduction: In a standards-based school system, alignment of policy documents with standards and assessment is important. To be able to evaluate whether schools and students have reached the standards, the assessment should focus on the standards. Different models and methods can be used for measuring alignment, i.e., the correspondence between…
Descriptors: Curriculum Development, Interrater Reliability, Classification, Foreign Countries
Pasco, Greg; Gordon, Rosanna K.; Howlin, Patricia; Charman, Tony – Journal of Autism and Developmental Disorders, 2008
The Classroom Observation Schedule to Measure Intentional Communication (COSMIC) was devised to provide ecologically valid outcome measures for a communication-focused intervention trial. Ninety-one children with autism spectrum disorder aged 6 years 10 months (SD 16 months) were videoed during their everyday snack, teaching and free play…
Descriptors: Play, Observation, Autism, Interrater Reliability

