Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Ligon, Glynn; Ellis, John – 1986
For Texas's Career Ladder System of rewarding good teachers, teachers' performance evaluations from 1981 to 1984 were used to rank teachers in the Austin Independent School District. Significant biases were noted between raters, between years, and between elementary and secondary teacher ratings. To adjust for these biases, each teacher's raw…
Descriptors: Bias, Career Ladders, Elementary Secondary Education, Equated Scores
Barter, Alice K.; And Others – 1980
A follow-up study of two instruments for evaluating college writing was conducted. The experimental scale (E Scale) was developed in 1976 and revised for this study. The control scale (C Scale) was described in the literature in 1977. Ten English majors graded ten essays from diagnostic entrance exams. Both the E Scale and the C Scale were used,…
Descriptors: College Entrance Examinations, Comparative Testing, Essay Tests, Evaluation Criteria
Peer reviewed: Hoogeveen, Kim; Gutkin, Terry B. – American Educational Research Journal, 1986
This study, conducted in three elementary schools, examined several merit pay issues: (1) the congruence of teachers' ratings of themselves and their peers with principals' ratings; (2) teacher confidence in the ratings by themselves, their peers, and their principals; and (3) the relationship of teaching experience, teacher rating, and teacher…
Descriptors: Attitude Measures, Correlation, Elementary Education, Elementary School Teachers
Idiographic Measurement Strategies for Personality and Prediction: Some Unredeemed Promissory Notes.
Peer reviewed: Paunonen, Sampo V.; Jackson, Douglas N. – Psychological Review, 1985
This article (1) examines the role of idiographic theory in contemporary personality assessment and behavioral prediction methodology, (2) reviews critically empirical studies designed to evaluate the idiographic theory of individual differences in predictability, and (3) presents new data on conclusions that have been advanced toward an…
Descriptors: Behavior Rating Scales, Higher Education, Individual Differences, Interrater Reliability
Van Velsor, Ellen; Leslie, Jean Brittain; Fleenor, John W. – 1997
This book presents a nontechnical, step-by-step process that shows how to evaluate any 360-degree-feedback instrument intended for management or leadership development. The 360-degree-feedback instruments collect information from different sources about a target manager's performance, and they offer multiple perspectives. The 16 steps in…
Descriptors: Administrator Characteristics, Evaluation Methods, Feedback, Interrater Reliability
Peer reviewed: Johnson, Happy; Brady, Sharon J.; Shenkle, Ann; Amidon, Edmund – Journal of Special Education Technology, 1997
Presents the Verbal Interaction Analysis System (VIAS), a computer-based system designed to analyze verbal interactions, and describes application of the VIAS in analyzing family/professional interactions during the Individualized Family Service Plan process. The results of a pilot test of the VIAS found high interobserver reliability. (Author/CR)
Descriptors: Disabilities, Early Childhood Education, Educational Planning, Evaluation Methods
Peer reviewed: Edwards, Alison L. – Modern Language Journal, 1996
Examined the validity of the pragmatic approach to test difficulty put forward by Child (1987). This study investigated whether the Child discourse-type hierarchy predicts text difficulty for second-language readers. Results suggested that this hierarchy may provide a sound basis for developing foreign-language tests when it is applied by trained…
Descriptors: Adult Students, Analysis of Variance, French, Interrater Reliability
Peer reviewed: Lyons, Peter; And Others – Social Work Research, 1996
Despite the increasing popularity of systematic risk assessment models by child protective services agencies, relatively few are empirically based. Reviews the empirical literature on 10 risk assessment models and concludes that although each model contains generally sound psychometric properties, there is still a need for further model…
Descriptors: Agency Role, At Risk Persons, Caseworker Approach, Child Welfare
Peer reviewed: Eley, Malcolm G.; Stecher, Erica J. – Assessment & Evaluation in Higher Education, 1997
Three studies compared the common Likert agree/disagree question form to a behavioral observation form for faculty evaluation. The Likert-type format prompted global, impressionistic responses; the behavioral observation form prompted more objective responses. Results suggest use of behavioral observation rather than agree/disagree questions can…
Descriptors: Behavior Rating Scales, College Faculty, Faculty Evaluation, Higher Education
Peer reviewed: Comm, Clare L.; LaBay, Duncan G. – Journal of Marketing for Higher Education, 1996
A survey of 65 freshman and 195 junior business administration majors in one metropolitan state university investigated student perceptions of institutional attributes salient in college choice. Results suggest standards of institutional quality are difficult to establish, since evaluations of institutional performance are not consistent, even…
Descriptors: Business Administration Education, College Administration, College Choice, College Students
Peer reviewed: Levy, Howard B.; And Others – Journal of Interpersonal Violence, 1995
Examined interrater reliability of information obtained during child sexual abuse assessments using a clinical assessment interview protocol featuring anatomic dolls and patterns of disclosure and doll demonstration across subject's age, gender, and case outcome. Results suggest specific areas that tend to be ambiguous and areas that may be more…
Descriptors: Child Abuse, Counselor Evaluation, Elementary Education, Evaluation Methods
Krishnan, Lakshmy; Murphy, Noela – Guidelines, 1993
In a Singapore study of the evaluation of engineering students' writing, engineering lecturers and language lecturers marked scripts differently. Analysis showed that engineering lecturers placed more emphasis on students' understanding of a process rather than their expression of that understanding. (Contains 13 references.) (LB)
Descriptors: Engineering Education, English for Science and Technology, Foreign Countries, Interrater Reliability
Peer reviewed: Cornell, Dewey G.; And Others – Journal for the Education of the Gifted, 1994
Gifted (n=675) and regular education (n=322) students in grades two and three were compared for incidence of behavior problems as rated by parents and teachers. After controlling for grade and minority status, no significant differences were found between groups in incidence or type of behavior problems. Agreement between parent and teacher…
Descriptors: Behavior Problems, Behavior Rating Scales, Elementary School Students, Emotional Adjustment
Peer reviewed: Lane, Suzanne; And Others – Journal of Educational Measurement, 1996
Evidence from test results of 3,604 sixth and seventh graders is provided for the generalizability and validity of the Quantitative Understanding: Amplifying Student Achievement and Reasoning (QUASAR) Cognitive Assessment Instrument, which is designed to measure program outcomes and growth in mathematics. (SLD)
Descriptors: Achievement Tests, Cognitive Processes, Elementary Education, Elementary School Students
Burns, Matthew K.; Haight, Sherrel Lee – Teacher Education and Special Education, 2005
Standardized portfolios are being used by teachers and teacher candidates to demonstrate subject matter knowledge for certification or to be considered highly qualified; however, the psychometric adequacy of data used for this purpose has not been evaluated. In the current study, we examined the interscorer reliability, concurrent, predictive, and…
Descriptors: Portfolios (Background Materials), Psychometrics, Validity, Teacher Education Programs

