Publication Date
| In 2026 | 0 |
| Since 2025 | 13 |
| Since 2022 (last 5 years) | 48 |
| Since 2017 (last 10 years) | 151 |
| Since 2007 (last 20 years) | 301 |
Descriptor
| Interrater Reliability | 503 |
| Test Reliability | 503 |
| Test Validity | 260 |
| Test Construction | 106 |
| Foreign Countries | 103 |
| Psychometrics | 91 |
| Evaluation Methods | 90 |
| Scores | 67 |
| Correlation | 62 |
| Scoring | 61 |
| Rating Scales | 58 |
| More ▼ | |
Source
Author
| Epstein, Michael H. | 7 |
| Johnson, Evelyn S. | 4 |
| Matson, Johnny L. | 4 |
| Tasse, Marc J. | 4 |
| Aman, Michael G. | 3 |
| Canivez, Gary L. | 3 |
| Capie, William | 3 |
| Conroy, Maureen A. | 3 |
| Crawford, Angela R. | 3 |
| Lecavalier, Luc | 3 |
| McLeod, Bryce D. | 3 |
| More ▼ | |
Publication Type
Education Level
Audience
| Researchers | 41 |
| Practitioners | 8 |
| Administrators | 3 |
| Teachers | 3 |
| Counselors | 1 |
Location
| Turkey | 11 |
| Canada | 10 |
| Australia | 9 |
| United Kingdom | 9 |
| Pennsylvania | 7 |
| Florida | 6 |
| Netherlands | 6 |
| Sweden | 5 |
| United Kingdom (England) | 5 |
| China | 4 |
| Illinois | 4 |
| More ▼ | |
Laws, Policies, & Programs
| Individuals with Disabilities… | 2 |
| No Child Left Behind Act 2001 | 1 |
| Pell Grant Program | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
McIntosh, Kent; Massar, Michelle M.; Algozzine, Robert F.; George, Heather Peshak; Horner, Robert H.; Lewis, Timothy J.; Swain-Bradway, Jessica – Journal of Positive Behavior Interventions, 2017
Full and durable implementation of school-based interventions is supported by regular evaluation of fidelity of implementation. Multiple assessments have been developed to evaluate the extent to which schools are applying the core features of school-wide positive behavioral interventions and supports (SWPBIS). The "SWPBIS Tiered Fidelity…
Descriptors: Positive Behavior Supports, Fidelity, Program Implementation, Program Evaluation
Bartelink, Cora; de Kwaadsteniet, Leontien; ten Berge, Ingrid J.; Witteman, Cilia L. M. – Child & Youth Care Forum, 2017
Background: The LIRIK, an instrument for the assessment of child safety and risk, is designed to improve assessments by guiding professionals through a structured evaluation of relevant signs, risk factors, and protective factors. Objective: We aimed to assess the interrater agreement and the predictive validity of professionals' judgments made…
Descriptors: Child Safety, Test Validity, Test Reliability, Risk
Lambie, Glenn W.; Mullen, Patrick R.; Swank, Jacqueline M.; Blount, Ashley – Measurement and Evaluation in Counseling and Development, 2018
Supervisors evaluated counselors-in-training at multiple points during their practicum experience using the Counseling Competencies Scale (CCS; N = 1,070). The CCS evaluations were randomly split to conduct exploratory factor analysis and confirmatory factor analysis, resulting in a 2-factor model (61.5% of the variance explained).
Descriptors: Counselor Training, Counseling, Measures (Individuals), Competence
Uzun, N. Bilge; Aktas, Mehtap; Asiret, Semih; Yormaz, Seha – Asian Journal of Education and Training, 2018
The goal of this study is to determine the reliability of the performance points of dentistry students regarding communication skills and to examine the scoring reliability by generalizability theory in balanced random and fixed facet (mixed design) data, considering also the interactions of student, rater and duty. The study group of the research…
Descriptors: Foreign Countries, Generalizability Theory, Scores, Test Reliability
Benton, Stephen L.; Li, Dan – IDEA Center, Inc., 2018
This technical report describes the results of analyses performed on data collected from 2013 to 2017, using the IDEA Feedback System for Administrators (FSA). The FSA is used to gather impressions from core constituents about an administrator's performance of relevant administrative roles, as well as her/his leadership style, interpersonal…
Descriptors: Feedback (Response), Administrators, Administrator Attitudes, Administrator Role
Maxwell, Bruce; Boon, Helen; Tanchuk, Nicolas; Rauwerda, Bryan – Journal of Moral Education, 2021
This article documents the adaptation, piloting and validation of a measure of teachers' ethical sensitivity. To create the test, we modified a measure from dentistry drawing on literature in teacher professional ethics and drew on the expertise of professional ethics scholars and practitioners. Based on the results of Rasch analysis combined with…
Descriptors: Ethics, Moral Values, Scores, Teacher Education Programs
Li, Zijia; Gooden, Caroline; Toland, Michael D. – Journal of Early Intervention, 2019
This study provides preliminary evidence for reliability and validity of the Hawaii Early Learning Profile Strands 0-3 (HELP Strands 0-3), an assessment instrument for young children. First, the degree of interobserver agreement for a sample of representative HELP items was examined; results indicated that HELP scoring was dependable and…
Descriptors: Measures (Individuals), Psychometrics, Early Childhood Education, Test Reliability
Massar, Michelle M.; McIntosh, Kent; Mercer, Sterett H. – Remedial and Special Education, 2019
Assessing fidelity of implementation of school-based interventions is a critical factor in successful implementation and sustainability. The Tiered Fidelity Inventory (TFI) was developed as a comprehensive measure of all three tiers of School-Wide Positive Behavioral Interventions and Supports (SWPBIS) and is intended to measure the extent to…
Descriptors: Fidelity, Intervention, Program Implementation, Positive Behavior Supports
Smolinsky, Lawrence; Marx, Brian D.; Olafsson, Gestur; Ma, Yanxia A. – Journal of Educational Computing Research, 2020
Computer-based testing is an expanding use of technology offering advantages to teachers and students. We studied Calculus II classes for science, technology, engineering, and mathematics majors using different testing modes. Three sections with 324 students employed: paper-and-pencil testing, computer-based testing, and both. Computer tests gave…
Descriptors: Test Format, Computer Assisted Testing, Paper (Material), Calculus
van Kernebeek, Willem G.; de Schipper, Antoine W.; Savelsbergh, Geert J. P.; Toussaint, Huub M. – Measurement in Physical Education and Exercise Science, 2018
In The Netherlands, the 4-Skills Scan is an instrument for physical education teachers to assess gross motor skills of elementary school children. Little is known about its reliability. Therefore, in this study the test-retest and inter-rater reliability was determined. Respectively, 624 and 557 Dutch 6- to 12-year-old children were analyzed for…
Descriptors: Foreign Countries, Interrater Reliability, Pretests Posttests, Psychomotor Skills
Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, a), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
Ballard, Laura – ProQuest LLC, 2017
Rater scoring has an impact on writing test reliability and validity. Thus, there has been a continued call for researchers to investigate issues related to rating (Crusan, 2015). Investigating the scoring process and understanding how raters arrive at particular scores are critical "because the score is ultimately what will be used in making…
Descriptors: Evaluators, Schemata (Cognition), Eye Movements, Scoring Rubrics
Ziegler, Wolfram; Staiger, Anja; Schölderle, Theresa; Vogel, Mathias – Journal of Speech, Language, and Hearing Research, 2017
Purpose: Standardized clinical assessment of dysarthria is essential for management and research. We present a new, fully standardized dysarthria assessment, the Bogenhausen Dysarthria Scales (BoDyS). The measurement model of the BoDyS is based on auditory evaluations of connected speech using 9 scales (traits) assessed by 4 elicitation methods.…
Descriptors: Auditory Evaluation, Test Reliability, Test Validity, Rating Scales
Roseth, Nicholas E. – Research and Issues in Music Education, 2019
The primary purpose of this study was to design a reliable and valid continuous-time coding tool for measuring teacher use of space and teacher interactions based on prior research (Hesler, 1972; Martin, 2002). The tool captured teachers' use of space as they moved through 14 identified areas of the large instrumental ensemble classroom and…
Descriptors: Space Utilization, Measures (Individuals), Music Teachers, Test Reliability
Robinson, Anna; Elliott, Robert – Journal of Autism and Developmental Disorders, 2016
People with autism spectrum disorder (ASD), can have difficulties in emotion processing, including recognising their own and others' emotions, leading to problems in emotion regulation and interpersonal relating. This study reports the development and piloting of the Client Emotional Processing Scale-Autism Spectrum (CEPS-AS), a new observer…
Descriptors: Empathy, Autism, Pervasive Developmental Disorders, Measures (Individuals)

Peer reviewed
Direct link
