Publication Date
| In 2026 | 0 |
| Since 2025 | 58 |
| Since 2022 (last 5 years) | 284 |
| Since 2017 (last 10 years) | 780 |
| Since 2007 (last 20 years) | 2042 |
Descriptor
| Interrater Reliability | 3124 |
| Foreign Countries | 655 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 25 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
De Houwer, Annick; Bornstein, Marc H.; Leach, Diane B. – Journal of Child Language, 2005
Thirty middle- to upper middle-class monolingual Dutch-speaking families consisting of at least a mother and a father completed the Infant Form "Words and Gestures" of the Dutch adaptation of the MacArthur Communicative Development Inventory for the same child at 1;1. Considerable inter- and intrafamily variation emerged in how two (or three)…
Descriptors: Monolingualism, Indo European Languages, Language Acquisition, Communicative Competence (Languages)
Oreck, Barry A.; Owen, Steven V.; Baum, Susan M. – Journal for the Education of the Gifted, 2003
The lack of valid, research-based methods to identify potential artistic talent hampers the inclusion of the arts in programs for the gifted and talented. The Talent Assessment Process in Dance, Music, and Theater (D/M/T TAP) was designed to identify potential performing arts talent in diverse populations, including bilingual and special education…
Descriptors: Talent, Art Education, Prediction, Construct Validity
Dike, Shelley E.; Kochan, Frances K.; Reed, Cynthia; Ross, Margaret – International Journal of Leadership in Education, 2006
This study gathered information regarding professional military educators' perceptions of the concept of critical thinking to determine whether there was a common definition and a shared meaning of the concept among them. Although there did not appear to be a common definition of critical thinking among this group, 10 categories and four themes…
Descriptors: Critical Thinking, Concept Formation, Military Personnel, Higher Education
Kang, Sang-Jo; Kang, Minsoo – Measurement in Physical Education and Exercise Science, 2006
In many countries, an athlete's performance at sporting competitions is often used as part of the selection criteria for entry into college. These criteria could be biased depending upon the procedures utilized by the authorities in a particular country. The purpose of this study was to calibrate, by using the Rasch rating scale model, the…
Descriptors: Athletes, Rating Scales, Weighted Scores, Judges
Noreau, Luc; Lepage, Celine; Boissiere, Lucie; Picard, Roger; Fougeyrollas, Patrick; Mathieu, Jean; Desmarais, Gilbert; Nadeau, Line – Developmental Medicine & Child Neurology, 2007
The objectives of this study were: (1) to examine the psychometric properties of the Assessment of Life Habits (LIFE-H) for children; and (2) to draw a profile of the level of participation among children of 5 to 13 years of age with various impairments. The research team adapted the adult version of the LIFE-H in order to render it more…
Descriptors: Genetic Disorders, Head Injuries, Neurological Impairments, Measurement Techniques
Meier, Anne; Spada, Hans; Rummel, Nikol – International Journal of Computer-Supported Collaborative Learning, 2007
The analysis of the process of collaboration is a central topic in current CSCL research. However, defining process characteristics relevant for collaboration quality and developing instruments capable of assessing these characteristics are no trivial tasks. In the assessment method presented in this paper, nine qualitatively defined dimensions of…
Descriptors: Interrater Reliability, Cooperation, Content Analysis, Cognitive Processes
Murdock, Linda C.; Cost, Hollie C.; Tieso, Carol – Focus on Autism and Other Developmental Disabilities, 2007
The "Social-Communication Assessment Tool" (S-CAT) was created as a direct observation instrument to quantify specific social and communication deficits of children with autism spectrum disorders (ASD) within educational settings. In this pilot study, the instrument's content validity and interrater reliability were investigated to determine the…
Descriptors: Nonverbal Communication, Autism, Content Validity, Test Validity
Erkens, Gijsbert; Janssen, Jeroen – International Journal of Computer-Supported Collaborative Learning, 2008
Although protocol analysis can be an important tool for researchers to investigate the process of collaboration and communication, the use of this method of analysis can be time consuming. Hence, an automatic coding procedure for coding dialogue acts was developed. This procedure helps to determine the communicative function of messages in online…
Descriptors: Protocol Analysis, Validity, Cooperation, Coding
Du, Yi; And Others – 1997
The FACETS equating model meets the complex requirements for equating writing performance assessment across both raters and prompts. This study is based on an equating of the 1996 writing performance assessment in the Minneapolis Public Schools (Minnesota). Raters and prompts were equated simultaneously using the FACETS model. About 3,000 fifth…
Descriptors: Elementary Education, Elementary School Students, Equated Scores, Grade 5
Raymond, Mark R.; Viswesvaran, Chockalingam – 1991
This study illustrates the use of three least-squares models to control for rater effects in performance evaluation: (1) ordinary least squares (OLS); (2) weighted least squares (WLS); and (3) OLS subsequent to applying a logistic transformation to observed ratings (LOG-OLS). The three models were applied to ratings obtained from four…
Descriptors: Evaluators, Higher Education, Interrater Reliability, Least Squares Statistics
Naizer, Gilbert – 1992
A measurement approach called generalizability theory (G-theory) is an important alternative to the more familiar classical measurement theory that yields less useful coefficients such as alpha or the KR-20 coefficient. G-theory is a theory about the dependability of behavioral measurements that allows the simultaneous estimation of multiple…
Descriptors: Error of Measurement, Estimation (Mathematics), Generalizability Theory, Higher Education
Bachman, Lyle F.; And Others – 1993
This paper outlines the development of a performance assessment measure of language speaking ability, the Language Ability Assessment System (LAAS), which is highly reliable and can be examined for reliability through modern measurement theories, such as generalizability theory (G-theory) and the many-facet Rasch theory. LAAS was developed to…
Descriptors: College Students, Higher Education, Interrater Reliability, Language Proficiency
Collins, Angelo – 1990
Since 1986, the Teacher Assessment Project (TAP) at Stanford University (California) has been exploring performance-based modes of assessment that capture the complexity of the practice of teaching. After a brief description of the rating procedures, the raters, and the situated-performances designed by the TAP for assessment, this paper describes…
Descriptors: Biology, Comparative Analysis, Evaluators, High Schools
Llabre, Maria M.; Forgan, Harry W. – Florida Journal of Educational Research, 1985
The interrater reliability and factor structure of colleague ratings of university faculty were studied for 46 faculty members from 4 departments within the School of Education and Allied Professions at the University of Miami (Florida). Within each department, each faculty member rated every other faculty member using two methods: (1) a global…
Descriptors: College Faculty, Evaluation Methods, Factor Analysis, Higher Education
Clark, John L. D. – 1986
A study of the reliability of the proficiency ratings scale and techniques used by three federal government agencies--the Central Intelligence Agency, the Defense Language Institute, and the Foreign Service Institute (FSI)--to test employees' oral language proficiency in French and German had two randomly selected two-person teams of testers from…
Descriptors: Comparative Analysis, Federal Government, French, German

