Baker, David B. – 1989
A preliminary study examined the interrater reliability, factor structure, and convergent validity of the Haak Sentence Completion (in which the respondent is given a sentence stem and asked to provide the remainder of the sentence) using a three-point scoring system. The subjects were 100 boys ranging in age from 6 to 11 years who were randomly…
Descriptors: Behavior Disorders, Cloze Procedure, Elementary Education, Emotional Disturbances
Stewart, Norman R.; Johnson, Richard G. – 1986
The purpose of this study was to evaluate the quality of experimental research in counseling and counselor education published from 1976 through 1984. The focus of the study was the methodology and reporting of the research rather than its relevance. Time was the independent variable of interest and three 3-year spans were chosen for examination:…
Descriptors: Counseling, Counselor Training, Educational Psychology, Educational Research
Cope, Ronald T. – 1987
This study used generalizability theory and other statistical concepts to assess the application of the Angoff method to setting cutoff scores on two professional certification tests. A panel of ten judges gave pre- and post-feedback Angoff probability ratings of items of two forms of a professional certification test, and another panel of nine…
Descriptors: Certification, Correlation, Cutting Scores, Error of Measurement
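As a rough illustration of the Angoff procedure mentioned in this record, the sketch below computes a panel cutoff score in its most common textbook form: each judge estimates, per item, the probability that a minimally competent candidate answers correctly; a judge's cutoff is the sum of those probabilities, and the panel cutoff is the mean across judges. The function name `angoff_cutoff` and the ratings are hypothetical, and the study's own data and generalizability analysis are not reproduced here.

```python
from statistics import mean


def angoff_cutoff(ratings):
    """Return (panel_cutoff, per_judge_cutoffs).

    ratings[j][i] is judge j's estimated probability that a minimally
    competent candidate answers item i correctly (illustrative data only).
    """
    # Each judge's implied cutoff is the sum of that judge's item probabilities.
    per_judge = [sum(judge) for judge in ratings]
    # The panel cutoff is the mean of the judges' cutoffs.
    return mean(per_judge), per_judge


if __name__ == "__main__":
    # Hypothetical ratings: 3 judges x 4 items.
    example = [
        [0.5, 0.75, 0.25, 1.0],
        [0.5, 0.5, 0.5, 0.5],
        [0.75, 0.75, 0.25, 0.25],
    ]
    cutoff, per_judge = angoff_cutoff(example)
    print("per-judge cutoffs:", per_judge)
    print("panel cutoff:", cutoff)
```

The spread of the per-judge cutoffs gives a first look at how much judges disagree, which is the kind of variation a generalizability analysis would quantify more formally.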
Cason, Carolyn L.; And Others – 1986
Cason and Cason's model of performance rating was used to determine the extent to which variation in reviewer standards affected the reliability and validity of the program review process used to select papers for inclusion in the annual program. Data analyzed were the overall recommendation for acceptance and ratings on seven quality criteria…
Descriptors: Conference Papers, Data Analysis, Educational Research, Evaluation Criteria
Cason, Gerald J.; Cason, Carolyn L. – 1985
A more familiar and efficient method for estimating the parameters of Cason and Cason's model was examined. Using a two-step analysis based on linear regression, rather than the direct search iterative procedure, gave about equally good results while providing a 33 to 1 computer processing time advantage, across 14 cohorts of junior medical…
Descriptors: Clinical Experience, Computers, Efficiency, Estimation (Mathematics)
Busch, Katharine Mitchell – 1985
A study explored (1) whether children demonstrated growth in writing ability through the experience of writing two or three times a week without direct teacher instruction; (2) what changes could be observed over time in the writing samples of second and fourth graders; (3) whether written language growth was a continuous progression of upward…
Descriptors: Elementary Education, Evaluation Criteria, Grade 2, Grade 4
Vinsonhaler, John F.; And Others – 1983
While diagnosis is generally considered a vital element in reading clinicians' expertise, research has revealed that even degreed, experienced reading clinicians display little personal consistency or agreement with one another when diagnosing simulated cases of reading difficulty. Three studies were conducted to determine if systematizing the…
Descriptors: Clinical Diagnosis, Elementary Secondary Education, Individualized Instruction, Interrater Reliability
Webber, Larry; And Others – 1986
Generalizability theory, which subsumes classical measurement theory as a special case, provides a general model for estimating the reliability of observational rating data by estimating the variance components of the measurement design. Research data from the "Heart Smart" health intervention program were analyzed as a heuristic tool.…
Descriptors: Behavior Rating Scales, Cardiovascular System, Error of Measurement, Generalizability Theory
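To make the idea of estimating variance components concrete, here is a minimal sketch for the simplest generalizability design: every person is rated once by every rater (a fully crossed persons-by-raters design). It is an assumption chosen for illustration, not the design used in the "Heart Smart" analysis, and the function name `g_study` and the scores are hypothetical. Variance components are derived from the usual two-way ANOVA mean squares, and the relative generalizability coefficient is person variance divided by person variance plus residual variance averaged over raters.

```python
def g_study(scores):
    """Variance components for a crossed persons x raters design.

    scores[p][r] is the rating given to person p by rater r.
    Returns a dict with var_person, var_rater, var_residual, and the
    relative G coefficient for the observed number of raters.
    """
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(scores[p][r] for p in range(n_p)) / n_p for r in range(n_r)]

    # Sums of squares for a two-way layout with one observation per cell.
    ss_p = n_r * sum((m - grand) ** 2 for m in p_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in r_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_r

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Expected-mean-square solutions; negative estimates are truncated at 0.
    var_p = max((ms_p - ms_res) / n_r, 0.0)
    var_r = max((ms_r - ms_res) / n_p, 0.0)
    var_res = max(ms_res, 0.0)

    denom = var_p + var_res / n_r
    g_rel = var_p / denom if denom > 0 else 0.0
    return {"var_person": var_p, "var_rater": var_r,
            "var_residual": var_res, "g_relative": g_rel}


if __name__ == "__main__":
    # Hypothetical ratings: 4 persons x 3 raters.
    example = [[4, 5, 4], [2, 3, 3], [5, 5, 4], [1, 2, 2]]
    for name, value in g_study(example).items():
        print(f"{name}: {value:.3f}")
```

In this one-facet case the relative coefficient corresponds to the classical reliability of the mean rating across raters, which is one way to see how classical measurement theory appears as a special case.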
Rock, D. A.; And Others – 1980
An experiment was designed that varied cutting score procedures, instructions, and types of judges in order to address the following questions concerning the Real Estate Licensing Examination: (1) Will the cutting score levels produced by groups of judges from differing backgrounds (academicians vs. practitioners vs. lawyers) using the same method…
Descriptors: Competence, Content Analysis, Criterion Referenced Tests, Cutting Scores
Takashima, Hideyuki – British Journal of Language Teaching, 1987 (peer reviewed)
Two native and one non-native (Japanese) instructors of English-as-a-foreign-language (EFL) corrected free compositions written by a Japanese college graduate with a degree in English. Analysis of the corrections revealed marked differences in type and number, with the non-native speaker most frequently indicating difficulty with articles, word…
Descriptors: Case Studies, English (Second Language), Error Analysis (Language), Interrater Reliability
Bensberg, Gerard J.; Irons, Thomas – Education and Training of the Mentally Retarded, 1986
The Vineland Adaptive Behavior Scale Classroom Edition and the American Association on Mental Deficiency Adaptive Behavior Scale School Edition were completed by teachers of 44 moderately and severely mentally retarded students; their parents (N=37) completed the Vineland Interview Edition-Survey Form. Comparison between the scales and respondents…
Descriptors: Adaptive Behavior (of Disabled), Behavior Rating Scales, Comparative Analysis, Elementary Secondary Education
Hamada, Roger S.; Tomikawa, Sandra – Educational and Psychological Measurement, 1986 (peer reviewed)
The Windward Rating Scale (WRS), a locally-developed teacher rating scale of student behavior, was evaluated for potential use as a screening measure. Pre-certification ratings of 720 learning disabled students and non-special education students in grades K-6 were analyzed. Psychometric properties and diagnostic efficiency of the WRS were…
Descriptors: Concurrent Validity, Construct Validity, Diagnostic Tests, Educational Diagnosis
Craig-Bray, Laura; Adams, Gerald R. – Journal of Youth and Adolescence, 1986 (peer reviewed)
This article studies the convergent-divergent validity and reliability estimates for clinical interview and self-report measures of ego identity. The findings suggest that the two measures may be: (1) assessing relatively distinct forms of ego identity; or (2) that the ego-identity construct as measured by the process and outcome dimensions needs…
Descriptors: College Students, Higher Education, Interpersonal Competence, Interrater Reliability
The Attending Round Observation System: A Procedure for Describing Teaching During Attending Rounds.
Weinholtz, Donn; And Others – Evaluation and the Health Professions, 1986 (peer reviewed)
Two separate reliability studies were conducted on an observational instrument derived from previous qualitative research and designed for collecting data on teaching behaviors during attending rounds. The reliability estimates from both studies were quite high, indicating that the instrument shows promise for use in both research and evaluation…
Descriptors: Clinical Teaching (Health Professions), Graduate Medical Education, Higher Education, Interrater Reliability
Epstein, Michael H.; Nieminen, Gayla S. – School Psychology Review, 1983 (peer reviewed)
Teachers and classroom aides of learning disabled students completed the Conners Abbreviated Teacher Rating Scale (CATRS) on two separate occasions. The study investigated the inter-rater and intra-rater reliability of this instrument. CATRS appeared to have sufficient reliability to recommend its continued frequent use. (Author/DWH)
Descriptors: Behavior Rating Scales, Elementary Education, Elementary School Students, Hyperactivity


