Publication Date
| In 2026 | 0 |
| Since 2025 | 58 |
| Since 2022 (last 5 years) | 284 |
| Since 2017 (last 10 years) | 780 |
| Since 2007 (last 20 years) | 2042 |
Descriptor
| Interrater Reliability | 3124 |
| Foreign Countries | 655 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 25 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Marshall, Seth J.; Wodrich, David L.; Gorin, Joanna S. – Educational and Psychological Measurement, 2009
This study examined psychometric properties of the Tempe Sorting Task (TST), a new measure of executive function (EF) for children. To increase the meaningfulness of test score interpretations, an age-appropriate construct was employed to incorporate Denckla's description of EF. Multiple measures of EF, including the TST, were collected for…
Descriptors: Cognitive Tests, Cognitive Processes, Children, Attention Deficit Hyperactivity Disorder
Royal-Dawson, Lucy; Baird, Jo-Anne – Educational Measurement: Issues and Practice, 2009
Hundreds of thousands of raters are recruited internationally to score examinations, but little research has been conducted on the selection criteria for these raters. Many countries insist upon teaching experience as a selection criterion and this has frequently become embedded in the cultural expectations surrounding the tests. Shortages in…
Descriptors: National Curriculum, Scoring, Foreign Countries, Teaching Experience
McCluskey, Annie; Bishop, Bianca – Journal of Continuing Education in the Health Professions, 2009
Introduction: Health educators who teach professionals about evidence-based practice (EBP) need instruments to measure change in skills and knowledge. This study aimed to develop and evaluate the interrater reliability, internal consistency, and responsiveness of the Adapted Fresno Test (AFT) of competence in EBP. Methods: Reliability testing…
Descriptors: Interrater Reliability, Correlation, Psychometrics, Occupational Therapy
Katz, Nolan; Petscher, Yaacov; Welles, Theresa – Journal of Attention Disorders, 2009
Objective: Formal criteria for the use of informant-ratings of adult ADHD symptoms have not been established, yet they are commonplace in standard assessment batteries. Method: The current study explores the relationship between self- and informant-ratings and the impact of requiring interrater agreement in a sample comprised of 190 self-referred…
Descriptors: College Students, Hyperactivity, Interrater Reliability, Attention Deficit Disorders
Lee, Alice; Whitehill, Tara L.; Ciocca, Valter – Clinical Linguistics & Phonetics, 2009
Reliable perceptual judgement is important for documenting the severity of hypernasality, but high reliability can be difficult to obtain. This study investigated the effect of practice and feedback on intra-judge and inter-judge reliability of hypernasality judgements. The judges were 36 speech-language therapy students, who were randomly…
Descriptors: Auditory Perception, Listening, Speech Evaluation, Interrater Reliability
Diefenbach, Gretchen J.; Tolin, David F.; Meunier, Suzanne A.; Gilliam, Christina M. – Gerontologist, 2009
Purpose: This study determined the psychometric properties of a variety of anxiety measures administered to older adults receiving home care services. Design and Methods: Data were collected from 66 adults aged 65 years and older who were receiving home care services. Participants completed self-report and clinician-rated measures of anxiety and…
Descriptors: Mental Disorders, Interrater Reliability, Geriatrics, Psychometrics
McIver, Kerry L.; Brown, William H.; Pfeiffer, Karin A.; Dowda, Marsha; Pate, Russell R. – Journal of Applied Behavior Analysis, 2009
The present study describes the development and pilot testing of the Observation System for Recording Physical Activity in Children-Home version. This system was developed to document physical activity and related physical and social contexts while children are at home. An analysis of interobserver agreement and a description of children's…
Descriptors: Physical Activities, Observation, Family Environment, Physical Activity Level
Gomez-Garcia, Maria – ProQuest LLC, 2011
The design and validation of a classroom observation instrument to provide formative feedback for teachers of EFL in Spain is the overarching purpose of this study. This study proposes that a valid and reliable classroom observation instrument, based on effective practice in teaching EFL, can be developed and used in Spain to enable teachers to…
Descriptors: Expertise, Feedback (Response), Classroom Observation Techniques, Formative Evaluation
Zhu, Weimo; Rink, Judy; Placek, Judith H.; Graber, Kim C.; Fox, Connie; Fisette, Jennifer L.; Dyson, Ben; Park, Youngsik; Avery, Marybell; Franck, Marian; Raynes, De – Measurement in Physical Education and Exercise Science, 2011
New testing theories, concepts, and psychometric methods (e.g., item response theory, test equating, and item bank) developed during the past several decades have many advantages over previous theories and methods. In spite of their introduction to the field, they have not been fully accepted by physical educators. Further, the manner in which…
Descriptors: Physical Education, Quality Control, Psychometrics, Item Response Theory
Dixon-Krauss, Lisbeth; Januszka, Cynthia M.; Chae, Chan-Ho – Journal of Research in Childhood Education, 2010
This study reports the construction of the Dialogic Reading Inventory (DRI), a tool for assessing a parent and child's storybook reading behaviors. Twenty-three parent-child dyads participated in the study. The Adult-Child Interactive Reading Inventory (DeBruin-Parecki, 1999) items were grouped into four categories and revised to reflect current…
Descriptors: Graduate Students, Test Items, Early Reading, Phonological Awareness
Goldston, M. Jenice; Day, Jeanelle Bland; Sundberg, Cheryl; Dantzler, John – International Journal of Science and Mathematics Education, 2010
The purpose of this paper is to describe the procedures and the analysis of an instrument designed to measure preservice teachers' ability to develop appropriate 5E learning cycle lesson plans. The 5E "inquiry lesson plan" (ILP) rubric is comprised of 12 items with a scoring range of zero to four points per item. Content validity was…
Descriptors: Preservice Teachers, Lesson Plans, Construct Validity, Factor Analysis
Stewart, Tracie L.; Myers, Ashley C.; Culley, Marci R. – Teaching of Psychology, 2010
We assessed the benefits of employing microthemes--short in-class writing assignments designed to facilitate active learning--as pedagogical tools in psychology courses. Students in target course sections completed 10 in-class microthemes during a semester. We designed the microthemes to serve as active learning assignments that would enhance…
Descriptors: Feedback (Response), Writing Assignments, Active Learning, Psychology
Psychometric Evaluation of the Dutch Version of the Mood, Interest and Pleasure Questionnaire (MIPQ)
Petry, Katja; Kuppens, Sofie; Vos, Pieter; Maes, Bea – Research in Developmental Disabilities: A Multidisciplinary Journal, 2010
Recently, several instruments have been developed to measure the subjective component of the quality of life (QOL) of people with profound intellectual and multiple disabilities (PIMD). A next step, however, must be the further validation of these instruments. The present study aimed at evaluating the psychometric properties of one of these…
Descriptors: Check Lists, Multiple Disabilities, Severe Mental Retardation, Quality of Life
Gebril, Atta – Assessing Writing, 2010
Integrated tasks are currently employed in a number of L2 exams since they are perceived as an addition to the writing-only task type. Given this trend, the current study investigates composite score generalizability of both reading-to-write and writing-only tasks. For this purpose, a multivariate generalizability analysis is used to investigate…
Descriptors: Scoring, Scores, Second Language Instruction, Writing Evaluation
Incikabi, Lutfi; Sancar Tokmak, Hatice – Educational Media International, 2012
This case study examined the educational software evaluation processes of pre-service teachers who attended either expertise-based training (XBT) or traditional training in conjunction with a Software-Evaluation checklist. Forty-three mathematics teacher candidates and three experts participated in the study. All participants evaluated educational…
Descriptors: Foreign Countries, Novices, Check Lists, Mathematics Education

