Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Chuang, Tsung-Yen; Huang, Yun-Hsuan – Creativity Research Journal, 2015
Mobile technology has rapidly made digital games a popular entertainment to this digital generation, and thus digital game design received considerable attention in both the game industry and design education. Digital game design involves diverse dimensions in which digital game story design (DGSD) particularly attracts our interest, as the…
Descriptors: Creativity, Interrater Reliability, Construct Validity, Creativity Tests
Raczynski, Kevin R.; Cohen, Allan S.; Engelhard, George, Jr.; Lu, Zhenqiu – Journal of Educational Measurement, 2015
There is a large body of research on the effectiveness of rater training methods in the industrial and organizational psychology literature. Less has been reported in the measurement literature on large-scale writing assessments. This study compared the effectiveness of two widely used rater training methods--self-paced and collaborative…
Descriptors: Interrater Reliability, Writing Evaluation, Training Methods, Pacing
Anderson, Daniel; Irvin, Shawn; Alonzo, Julie; Tindal, Gerald A. – Educational Measurement: Issues and Practice, 2015
The alignment of test items to content standards is critical to the validity of decisions made from standards-based tests. Generally, alignment is determined based on judgments made by a panel of content experts with either ratings averaged or via a consensus reached through discussion. When the pool of items to be reviewed is large, or the…
Descriptors: Test Items, Alignment (Education), Standards, Online Systems
Stipancic, Kaila L.; Tjaden, Kris; Wilding, Gregory – Journal of Speech, Language, and Hearing Research, 2016
Purpose: This study obtained judgments of sentence intelligibility using orthographic transcription for comparison with previously reported intelligibility judgments obtained using a visual analog scale (VAS) for individuals with Parkinson's disease and multiple sclerosis and healthy controls (K. Tjaden, J. E. Sussman, & G. E. Wilding, 2014).…
Descriptors: Diseases, Neurological Impairments, Sentences, Measures (Individuals)
Derrick, Deirdre J. – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2016
Second language (L2) researchers often have to develop or change the instruments they use to measure numerous constructs (Norris & Ortega, 2012). Given the prevalence of researcher-developed and -adapted data collection instruments, and given the profound effect instrumentation can have on results, thorough reporting of instrumentation is…
Descriptors: Second Language Learning, Language Research, Research Methodology, Interrater Reliability
McGrane, Joshua Aaron; Humphry, Stephen Mark; Heldsinger, Sandra – Applied Measurement in Education, 2018
National standardized assessment programs have increasingly included extended written performances, amplifying the need for reliable, valid, and efficient methods of assessment. This article examines a two-stage method using comparative judgments and calibrated exemplars as a complement and alternative to existing methods of assessing writing.…
Descriptors: Standardized Tests, Foreign Countries, Writing Tests, Writing Evaluation
Cankoy, Osman; Özder, Hasan – EURASIA Journal of Mathematics, Science & Technology Education, 2017
The aim of this study is to develop a scoring rubric to assess primary school students' problem posing skills. The rubric including five dimensions namely solvability, reasonability, mathematical structure, context and language was used. The raters scored the students' problem posing skills both with and without the scoring rubric to test the…
Descriptors: Generalizability Theory, Elementary School Students, Foreign Countries, Problem Solving
Garte, Rebecca – International Journal of Progressive Education, 2017
This paper provides a historical analysis of the past century of progressive education, within the general socio-political context of schooling within the US. The purpose of this review is to create a social, historical and philosophical context for understanding the current narrative of progressive education that exists in educational policy…
Descriptors: Progressive Education, Educational History, Educational Practices, Philosophy
Wan, Ming Wai; Brooks, Ami; Green, Jonathan; Abel, Kathryn; Elmadih, Alya – International Journal of Behavioral Development, 2017
This study investigated the psychometrics of a recently developed global rating measure of videotaped parent-infant interaction, the "Manchester Assessment of Caregiver-Infant Interaction" (MACI), in a normative sample. Inter-rater reliability, stability over time, and convergent and discriminant validity were tested. Six-minute play…
Descriptors: Rating Scales, Parent Child Relationship, Infants, Interaction
Nehring, Andreas; Päßler, Andreas; Tiemann, Rüdiger – International Journal of Science and Mathematics Education, 2017
With regard to the moderate performance of German students in international large-scale assessments, one branch of German science education research is concerned with the construction and evaluation of competence models. Based on the theory-driven definition of competence levels, these models imply a correlation between the complexity of a…
Descriptors: Foreign Countries, Science Education, Chemistry, Science Teachers
Roberts, William L.; Boulet, John; Sandella, Jeanne – Advances in Health Sciences Education, 2017
When the safety of the public is at stake, it is particularly relevant for licensing and credentialing exam agencies to use defensible standard setting methods to categorize candidates into competence categories (e.g., pass/fail). The aim of this study was to gather evidence to support change to the Comprehensive Osteopathic Medical Licensing-USA…
Descriptors: Standard Setting, Comparative Analysis, Clinical Experience, Skill Analysis
Lorenz, Kent A.; van der Mars, Hans; Kulinna, Pamela H.; Ainsworth, Barbara E.; Hovell, Melbourne F. – Journal of School Health, 2017
Background: Behavioral support may be effective in increasing physical activity (PA) in school settings. However, there are no data collection systems to concurrently record PA and behavioral support. This paper describes the development and validation of the System for Observing Behavioral Ecology for Youth in Schools (SOBEYS)--an instrument used…
Descriptors: Physical Activities, Educational Environment, Observation, Validity
Özdas, Faysal; Batdi, Veli – Journal of Education and Training Studies, 2017
This thematic-based meta-analytic study aims to examine the effect of creativity on the academic success and learning retention scores of students. In the context of this aim, 18 out of 225 studies regarding creativity that were carried out between 2001 and 2011 have been obtained from certain national and international databases. The studies…
Descriptors: Meta Analysis, Creativity, Scores, Retention (Psychology)
Price, Keith; Coleman, Susan; Byrd, Gary R. – Administrative Issues Journal: Connecting Education, Practice, and Research, 2014
The study of capital juries remains a subject of critical interest for the public and for legislative and judicial policy makers as well as legal scholars and social scientists. Cowan, Thompson, and Ellsworth established one of the standard methodologies for examination of this topic in their 1984 seminal study by observing the subjects' debate…
Descriptors: Court Litigation, Death, Punishment, Bias
Karimi, Hamid; O'Brian, Sue; Onslow, Mark; Jones, Mark – Journal of Speech, Language, and Hearing Research, 2014
Purpose: Percentage of syllables stuttered (%SS) and severity rating (SR) scales are measures in common use to quantify stuttering severity and its changes during basic and clinical research conditions. However, their reliability has not been assessed with indices measuring both relative and absolute reliability. This study was designed to provide…
Descriptors: Reliability, Syllables, Stuttering, Severity (of Disability)
