Showing 1,936 to 1,950 of 3,124 results
Peer reviewed
Pfeiffer, Steven; Petscher, Yaacov; Kumtepe, Alper – Roeper Review, 2008
This study examined the internal consistency and validity of a new rating scale to identify gifted students, the Gifted Rating Scales-School Form (GRS-S). The study explored the effect of gender, race/ethnicity, age, and rater familiarity on GRS-S ratings. One hundred twenty-two students in first to eighth grade from elementary and middle schools…
Descriptors: Ethnicity, Middle Schools, Academically Gifted, Talent
Peer reviewed
Roberts, Felicia; Cimasko, Tony – Journal of Second Language Writing, 2008
This study addresses the response of social science and engineering science faculty to a naturally occurring sample of second language writing. Using a matched-guise protocol, faculty participants were led to believe that the one-page essay was produced by an international student whose first language was either Chinese or Spanish. The faculty…
Descriptors: Foreign Students, Writing (Composition), Semantics, Social Sciences
Peer reviewed
Grainger, Peter; Purnell, Ken; Zipf, Reyna – Assessment & Evaluation in Higher Education, 2008
Decisions by markers about quality in student work remain confusing to most students and markers. This may in part be due to the relatively subjective nature of what constitutes a quality response to an assessment task. This paper reports on an experiment that documented the process of decision-making by multiple markers at a university who…
Descriptors: Student Evaluation, Educational Quality, Achievement Rating, Interrater Reliability
Peer reviewed
Yang, Ya-Ting C.; Chan, Chia-Ying – Computers & Education, 2008
This study aimed to develop a set of evaluation criteria for English learning websites. These criteria can assist English teachers/web designers in designing effective websites for their English courses and can also guide English learners in screening for appropriate and reliable websites to use in increasing their English ability. To fulfill our…
Descriptors: Speech Communication, Content Validity, Interrater Reliability, Second Language Learning
Peer reviewed
D'Eon, Marcel; Sadownik, Leslie; Harrison, Alexandra; Nation, Jill – American Journal of Evaluation, 2008
An accepted gold standard for measuring change in participant behavior is third-party observation. This method is highly resource intensive, and many small-scale evaluations may not be in a position to use this approach. This study was designed to assess the validity and reliability of aggregated group self-assessments as one way to measure workshop…
Descriptors: Program Effectiveness, Workshops, Feedback (Response), Self Evaluation (Groups)
Peer reviewed
Cook, David A.; Beckman, Thomas J. – Advances in Health Sciences Education, 2009
Educators must often decide how many points to use in a rating scale. No studies have compared interrater reliability for different-length scales, and few have evaluated accuracy. This study sought to evaluate the interrater reliability and accuracy of mini-clinical evaluation exercise (mini-CEX) scores, comparing the traditional mini-CEX…
Descriptors: Interrater Reliability, Rating Scales, Internal Medicine, Test Validity
Peer reviewed
Quigg, Mark; Lado, Fred A. – Journal of Continuing Education in the Health Professions, 2009
Introduction: The Accreditation Council for Continuing Medical Education (ACCME) provides guidelines for continuing medical education (CME) materials to mitigate problems in the independence or validity of content in certified activities; however, the process of peer review of materials appears largely unstudied and the reproducibility of…
Descriptors: Medical Education, Physicians, Conflict of Interest, Interrater Reliability
Peer reviewed
Newman, Jody L.; Fuqua, Dale R. – Counselor Education and Supervision, 1986
Examined the effects of order of stimulus presentation on observer ratings of counseling performance. Results revealed a statistically significant interaction between quality of performance and the order in which the performances were rated. (Author/ABB)
Descriptors: Counselor Evaluation, Counselor Performance, Interrater Reliability, Observation
Peer reviewed
Ansorge, Charles J.; Scheer, John K. – Research Quarterly for Exercise and Sport, 1988
Analysis of gymnastics judges' scores of their own and other countries' gymnasts' performances during the 1984 Olympic Games indicated that the judges were biased in favor of their own country's gymnasts. (Author/CB)
Descriptors: Bias, Competition, Gymnastics, International Relations
Peer reviewed
Kane, Robert L.; And Others – Journal of Consulting and Clinical Psychology, 1987
Three experienced neuropsychologists rated brain-damaged and control subjects for brain damage using the Halstead-Reitan Battery and the Luria-Nebraska Neuropsychological Battery. Using either battery, raters were accurate in judging the presence of brain damage. There was a high degree of consistency between raters and test batteries when both…
Descriptors: Interrater Reliability, Neurological Impairments, Psychological Testing, Psychometrics
Peer reviewed
Cicchetti, Domenic V.; And Others – Educational and Psychological Measurement, 1984
This program computes multiple-judge reliability levels under the following conditions: (1) different sets of judges perform the ratings; (2) the number of judges is a constant; and (3) the scale of measurement is nominal. (Author)
Descriptors: Computer Software, Interrater Reliability, Judgment Analysis Technique, Test Reliability
Peer reviewed
Vance, B.; And Others – Psychology in the Schools, 1983
Investigated the interscorer reliability between a novice and a professional psychologist for the Minnesota Percepto-Diagnostic Test-Revised (MPDT-R), using a sample of 30 individuals. Results indicated that for three of the four MPDT-R scores there was a significant positive correlation between expert and novice scoring criteria. (JAC)
Descriptors: Experimenter Characteristics, Interrater Reliability, Psychological Evaluation, Psychologists
Randolph, Justus J. – Online Submission, 2005
Fleiss' popular multirater kappa is known to be influenced by prevalence and bias, which can lead to the paradox of high agreement but low kappa. It also assumes that raters are restricted in how they can distribute cases across categories, which is not a typical feature of many agreement studies. In this article, a free-marginal, multirater…
Descriptors: Multivariate Analysis, Statistical Distributions, Statistical Bias, Interrater Reliability
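The prevalence paradox mentioned in this abstract can be illustrated with a small sketch. It assumes the commonly cited forms of the two statistics: Fleiss' kappa estimates chance agreement from the observed category proportions, while the free-marginal variant fixes chance agreement at 1/k for k categories. The function names and the rating counts below are hypothetical and are not taken from the article, whose abstract is truncated here.

```python
# Illustrative sketch only: function names and data are made up for this example.
# Each row is one case; each column is a category; entries count how many
# raters assigned that case to that category. Every case has the same number of raters.

def observed_agreement(counts):
    """Mean proportion of agreeing rater pairs per case (shared by both kappas)."""
    n = sum(counts[0])  # raters per case
    per_case = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    return sum(per_case) / len(per_case)

def fleiss_kappa(counts):
    """Fixed-marginal kappa: chance agreement comes from observed category proportions."""
    n = sum(counts[0])
    n_cases = len(counts)
    n_cats = len(counts[0])
    p_o = observed_agreement(counts)
    props = [sum(row[j] for row in counts) / (n_cases * n) for j in range(n_cats)]
    p_e = sum(p * p for p in props)
    return (p_o - p_e) / (1 - p_e)

def free_marginal_kappa(counts):
    """Free-marginal kappa: chance agreement is assumed to be 1/k for k categories."""
    p_o = observed_agreement(counts)
    p_e = 1 / len(counts[0])
    return (p_o - p_e) / (1 - p_e)

# Three raters, two categories, ten cases: nine unanimous, one split 2-1.
ratings = [[3, 0]] * 9 + [[2, 1]]

print(round(observed_agreement(ratings), 4))   # high raw agreement
print(round(fleiss_kappa(ratings), 4))         # near zero despite high agreement
print(round(free_marginal_kappa(ratings), 4))  # stays high under the 1/k assumption
```

With one category dominating the ratings, the fixed-marginal chance term is almost as large as the observed agreement, so Fleiss' kappa collapses toward zero; the free-marginal version does not depend on the category proportions.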
Peer reviewed
Bartfay, Emma – International Journal of Testing, 2003
Used Monte Carlo simulation to compare the properties of a goodness-of-fit (GOF) procedure and a test statistic developed by E. Bartfay and A. Donner (2001) to the likelihood ratio test in assessing the existence of extra variation. Results show the GOF procedure possesses a satisfactory Type I error rate and power. (SLD)
Descriptors: Goodness of Fit, Interrater Reliability, Monte Carlo Methods, Simulation
Peer reviewed
VanLeeuwen, Dawn M. – Journal of Agricultural Education, 1997
Generalizability Theory can be used to assess reliability in the presence of multiple sources and different types of error. It provides a flexible alternative to Classical Theory and can handle estimation of interrater reliability with any number of raters. (SK)
Descriptors: Error of Measurement, Generalizability Theory, Interrater Reliability, Measurement Techniques