Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Surface, Eric A.; Dierdorff, Erich C. – Foreign Language Annals, 2003
The reliability of the ACTFL Oral Proficiency Interview (OPI) has not been reported since ACTFL revised its speaking proficiency guidelines in 1999. Reliability data for assessments should be reported periodically to provide users with enough information to evaluate the psychometric characteristics of the assessment. This study provided the most…
Descriptors: Language Tests, Interrater Reliability, Program Effectiveness, Psychometrics
Center for Innovation in Assessment (NJ1), 2007
Research was conducted to evaluate how well the "Indiana Reading Assessment--Kindergarten" evaluates various reading skills of kindergarten students. Multiple analyses were conducted; while the results of all the analyses were encouraging, the results derived from the concurrent validity study were most significant. All correlations were…
Descriptors: Reading Tests, Test Validity, Test Reliability, Interrater Reliability
Muyskens, Paul; Marston, Doug; Reschly, Amy L. – California School Psychologist, 2007
Behavioral difficulties of school-aged students are typically dealt with in a reactive, rather than preventative, manner. This article examines a proactive approach, consistent with the Response-to-Intervention model, using a screening measure designed to identify students at risk for behavior difficulties and targeting these students for early…
Descriptors: Early Intervention, At Risk Students, Teacher Attitudes, Academic Achievement
Tillema, Harm; Smith, Kari – Teaching and Teacher Education: An International Journal of Research and Studies, 2007
Two inherently contradictory forces are pushing for reform in portfolio assessment. On the one hand there is a felt need for creating more rigid standards that operate to promote uniformity of ratings in appraisal practice to certify achievement. However, on the other hand, critical questions are being raised about separating acclaimed portfolio…
Descriptors: Portfolios (Background Materials), Educational Change, Evaluation Criteria, Teacher Educators
Chen, H. Julie – 1995
A study investigated 42 native English-speakers' (NSs) perceptions of the pragmatic appropriateness of refusal statements. The NSs rated the appropriateness of 24 written statements in 4 different refusal scenarios, which were collected from both native speakers and non-native speakers. Four weeks later, as a reliability check, the subjects rated…
Descriptors: Attitudes, Comparative Analysis, English (Second Language), Interrater Reliability
Parkes, Jay; Suen, Hoi K. – 1995
This study demonstrates the advantages of using a constrained optimization algorithm to explore the optimal number of prompts, modes of discourse, and raters for achieving an acceptable level of reliability during a direct writing assessment. Writing samples elicited from 50 college students were rated by 3 graduate students and the scores…
Descriptors: Algorithms, College Students, Educational Assessment, Generalizability Theory
Brody, Leslie R.; Hay, Deborah H. – 1991
This paper reports on evaluations of a projective measure of self-esteem adapted from the Tasks of Emotional Development (TED). The evaluations were conducted in 7 studies with a total sample of 416 children and adults. The revised TED uses a five-point scoring system ranging from negative to positive self-esteem. Interrater reliability in the…
Descriptors: Adults, Children, Interrater Reliability, Measurement Techniques
Engelhard, George, Jr. – 1991
A many-faceted Rasch model (FACETS) is presented for the measurement of writing ability. The FACETS model is a multivariate extension of Rasch measurement models that can be used to provide a framework for calibrating both raters and writing tasks within the context of writing assessment. A FACETS model is described based on the current procedures…
Descriptors: Grade 8, Holistic Evaluation, Interrater Reliability, Item Response Theory
Fitz, Don – 1984
The Client Observation Checklist (COC) was developed to evaluate Project ADAPT's intervention in three behavioral areas: bathing; dressing; and socialization. Project ADAPT is designed to provide services to meet the needs of chronically mentally ill residents of nursing homes. Specifically, the project provides staff trained to work with the…
Descriptors: Client Characteristics (Human Services), Hygiene, Institutionalized Persons, Interrater Reliability
Gregory, Kemp – 1991
A balanced appraisal of holistic scoring of writing is presented via: examination of the present popularity of holistic scoring; analysis of several weaknesses associated with the holistic scoring method; and recommendations for remedying these weaknesses. Six reasons for the popularity of holistic scoring are: (1) relative lack of expense; (2)…
Descriptors: Child Development, Cost Effectiveness, Elementary Secondary Education, Holistic Evaluation
Merrill, Beverly; Peterson, Sarah – 1986
When the Mesa, Arizona Public Schools initiated an ambitious writing instruction program in 1978, two assessments based on student writing samples were developed. The first is based on a ninth grade proficiency test. If the student does not pass the test, high school remediation is provided. After 1987, students must pass this test in order to…
Descriptors: Computer Assisted Testing, Elementary Secondary Education, Graduation Requirements, Holistic Evaluation
McIntyre, Kenneth E. – 1986
This paper dealt with the use of classroom observation data for formative evaluation purposes, and with a research project in which scores based on observed performance of teachers in secondary school algebra and English classes were compared with efficiency scores based on an input-output model. The model, using Data Envelopment Analysis (DEA)…
Descriptors: Algebra, Classroom Observation Techniques, Classroom Research, Evaluation Methods
Humes, Ann – 1983
This paper, as an illustration of the procedures involved in a cooperative effort, describes a project in which the Southwest Regional Laboratory (SWRL) designed and developed a minimum standards test in collaboration with a large urban school district in California. The activity described focuses on the writing sample included in the test. The…
Descriptors: High Schools, Institutional Cooperation, Interrater Reliability, Minimum Competency Testing
Mitchell, Karen J.; Anderson, Judith A. – 1986
A pilot essay was included in the 1985 Spring and Fall administrations of the Medical College Admission Test. A sample of 320 of the essays written by Fall examinees who had expressed an interest in allopathic medicine was used to calculate interrater reliability estimates. Sixteen of 20 readers who had been trained by White's suggestions for…
Descriptors: Analysis of Variance, College Entrance Examinations, Essay Tests, Higher Education
Cloud-Silva, Connie; Denton, Jon J. – 1988
A prototype low inference observation instrument to measure minimal teaching competencies of teaching candidates was deductively developed. Focus is on determining if observers could be trained to use the observation instrument with a high degree of reliability and validity. The instrument, entitled Classroom Observation and Assessment Scale for…
Descriptors: Classroom Observation Techniques, Elementary Secondary Education, Evaluation Methods, Interrater Reliability

