Publication Date
| In 2026 | 0 |
| Since 2025 | 58 |
| Since 2022 (last 5 years) | 284 |
| Since 2017 (last 10 years) | 780 |
| Since 2007 (last 20 years) | 2042 |
Descriptor
| Interrater Reliability | 3124 |
| Foreign Countries | 655 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 25 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Schutz, Aaron; Moss, Pamela A. – Education Policy Analysis Archives, 2004
A central dilemma of portfolio assessment is that as the richness of the data available to readers increases, so do the challenges involved in ensuring acceptable reliability among readers. Drawing on empirical and theoretical work in discourse analysis, ethnomethodology, and other fields, we argue that this dilemma results, in part, from the fact…
Descriptors: Portfolios (Background Materials), Teacher Effectiveness, Portfolio Assessment, Interrater Reliability
Hommersen, Paul; Murray, Candice; Ohan, Jeneva L.; Johnston, Charlotte – Journal of Emotional and Behavioral Disorders, 2006
In this article, the authors report the psychometric properties of a parent-completed rating scale based on the criteria for oppositional defiant disorder (ODD) in the "Diagnostic and Statistical Manual of Mental Disorders-Fourth Edition-Text Revision" ("DSM-IV-TR"). Mothers of 294 boys and 48 girls with…
Descriptors: Psychometrics, Child Behavior, Check Lists, Behavior Problems
Turkstra, Lyn S. – Journal of Speech, Language, and Hearing Research, 2005
Purpose: The purpose of this study was to address the lack of quantitative data on eye-to-face gaze (also known as eye contact) in the literature on pragmatic communication. The study focused on adolescents and young adults with traumatic brain injury (TBI), as gaze often is included in social skills intervention in this population. Method: Gaze…
Descriptors: Speech Communication, Intervention, Young Adults, Adolescents
Grant, Leslie – 1995
Many states currently offer bilingual certification or endorsement, encouraging both practicing teachers and prospective teachers to complete their requirements necessary to add this certification on to their regular teaching license. Although these requirements routinely include courses in bilingual education, the "second" language…
Descriptors: Bilingual Education, Educational Policy, Factor Analysis, Interrater Reliability
Reckase, Mark D.; And Others – 1995
The research reported in this paper was conducted to gain information to guide the selection of standard setting procedures for use with polytomous items to set achievement levels on the National Assessment of Educational Progress (NAEP) assessments in U.S. History and geography. Standard-setting procedures were evaluated to determine the relative…
Descriptors: Academic Achievement, Educational Assessment, Elementary Secondary Education, Evaluation Methods
Crehan, Kevin D.; And Others – 1994
The Clark County (Nevada) School District introduced performance assessments into its assessment program for grades one through six as part of an effort to bring the assessment program in line with the revised curriculum. To study the effectiveness of these assessments, reading performance assessment results for grades three through five were…
Descriptors: Cost Effectiveness, Curriculum Development, Educational Assessment, Elementary Education
Salies, Tania Gastao – 1998
A discussion of the evaluation of writing, particularly in English as a Second Language, argues for a communicative approach reflecting the current approach to language teaching and learning. The movement toward more communication-oriented and more valid language testing is examined briefly, and direct assessment is chosen as the preferred format…
Descriptors: Communicative Competence (Languages), English (Second Language), Evaluation Criteria, Foreign Countries
Hamp-Lyons, Liz – 1986
A study investigated whether transfer from native to second language in writing occurs, and if so, whether the different rhetorical structures that student writers from other cultures bring to the task of writing in English affect their writing in ways that may affect the grades assigned by experienced raters. To do so, the processes used by essay…
Descriptors: Classroom Techniques, English for Academic Purposes, English (Second Language), Foreign Countries
Marsh, David – 1987
The speaking ability of 40 Finnish college students of English as a Second Language was assessed in tests of interactional and transactional language function. In the interactional test, the learner introduced a topic of his choice and attempted to converse with two native speakers of English. In the transactional test, the learner watched a video…
Descriptors: Applied Linguistics, Code Switching (Language), College Students, English (Second Language)
Janopoulos, Michael – 1991
Holistic scoring is widely used to assess writing proficiency in English-as-a-Second-Language (ESL) composition. Written recall protocols have recently been used to investigate the relationship between how much holistic scorers comprehend of a given text and how high they rate the quality of that text. This study investigated whether readers…
Descriptors: College Faculty, English (Second Language), Higher Education, Holistic Approach
Marsh, Herbert W.; Ireland, Robert – 1984
To test the applicability of multidimensional ratings of writing effectiveness that are amenable to normal classroom usage, all grade 7 students (N=139) from one suburban school (Sydney, Australia) wrote a brief essay. Master and student teachers evaluated all the essays according to overall effectiveness of written expression and according to…
Descriptors: Correlation, Essay Tests, Foreign Countries, Grade 7
Shann, Mary H. – 1985
This document provides a report on the evaluation of STEPS (Surviving Today's Experiences and Problems Successfully), the Waters Foundation Curriculum for teaching thinking skills and expository writing. Data were collected during the 1983-84 academic year, using a junior high school as a field site. Four experimental classes were taught STEPS two…
Descriptors: Cognitive Processes, Curriculum Evaluation, Instructional Effectiveness, Interrater Reliability
Oscarson, Mats – 1990
This report describes the results of adult education students and upper secondary school students on two recently introduced standardized English tests in Sweden. Comparisons of the results are made between these two categories of students because they are entitled to compete, on an equal basis, for admission into restricted intake programs of…
Descriptors: Adult Education, Adult Students, Comparative Analysis, English (Second Language)
An Observational Study of the Lecture Delivery Style Characteristics of High and Low Rated Lectures.
Albanese, Mark A.; And Others – 1986
This study identifies distinguishing differences in lecture delivery styles of lecturers rated by students in a large multi-instructor course: the Introduction to Clinical Medicine Course (ICM). The 20 lowest- and highest-rated lecturers of the 1982 and 1983 ICM courses served as the target group. Non-student raters observing the 1984 lectures…
Descriptors: Analysis of Variance, Behavior Rating Scales, Higher Education, Interrater Reliability
Robertson, Gary; And Others – 1989
A study was conducted to refine the writing scale incorporated into the Peabody Individual Achievement Test-Revised. The test uses a single scale for judging writing samples from students in grades 2 through 12. It was questioned whether a single, relatively brief rating scale could have the sensitivity required to discriminate among the range of…
Descriptors: Elementary School Students, Elementary Secondary Education, Interrater Reliability, Latent Trait Theory

