Publication Date
| In 2026 | 0 |
| Since 2025 | 13 |
| Since 2022 (last 5 years) | 48 |
| Since 2017 (last 10 years) | 151 |
| Since 2007 (last 20 years) | 301 |
Descriptor
| Interrater Reliability | 503 |
| Test Reliability | 503 |
| Test Validity | 260 |
| Test Construction | 106 |
| Foreign Countries | 103 |
| Psychometrics | 91 |
| Evaluation Methods | 90 |
| Scores | 67 |
| Correlation | 62 |
| Scoring | 61 |
| Rating Scales | 58 |
Author
| Epstein, Michael H. | 7 |
| Johnson, Evelyn S. | 4 |
| Matson, Johnny L. | 4 |
| Tasse, Marc J. | 4 |
| Aman, Michael G. | 3 |
| Canivez, Gary L. | 3 |
| Capie, William | 3 |
| Conroy, Maureen A. | 3 |
| Crawford, Angela R. | 3 |
| Lecavalier, Luc | 3 |
| McLeod, Bryce D. | 3 |
Audience
| Researchers | 41 |
| Practitioners | 8 |
| Administrators | 3 |
| Teachers | 3 |
| Counselors | 1 |
Location
| Turkey | 11 |
| Canada | 10 |
| Australia | 9 |
| United Kingdom | 9 |
| Pennsylvania | 7 |
| Florida | 6 |
| Netherlands | 6 |
| Sweden | 5 |
| United Kingdom (England) | 5 |
| China | 4 |
| Illinois | 4 |
Laws, Policies, & Programs
| Individuals with Disabilities… | 2 |
| No Child Left Behind Act 2001 | 1 |
| Pell Grant Program | 1 |
Weaver, R. Glenn; Webster, Collin A.; Erwin, Heather; Beighle, Aaron; Beets, Michael W.; Choukroun, Hadrien; Kaysing, Nicole – Measurement in Physical Education and Exercise Science, 2016
The System for Observing Fitness Instruction Time (SOFIT) is commonly used to measure variables related to physical activity during physical education (PE). However, SOFIT does not yield detailed information about teacher practices related to children's moderate-to-vigorous physical activity (MVPA). This study describes the modification of SOFIT…
Descriptors: Physical Education, Observation, Physical Activity Level, Teaching Methods
Iberri-Shea, Gina – Cogent Education, 2017
Prominent spoken language assessments such as the Oral Proficiency Interview and the Test of Spoken English have been primarily concerned with speaking ability as it relates to conversation. This paper looks at an additional aspect of spoken language ability, namely public speaking. This study used an adapted form of a public speaking rating scale…
Descriptors: Public Speaking, Rating Scales, Adoption (Ideas), English Instruction
Neal, Daniene; Matson, Johnny L.; Belva, Brian C. – Research in Autism Spectrum Disorders, 2013
The "autism spectrum disorder observation for children" ("ASD-OC") is a newly created 54-item observation measure for autism spectrum disorders (ASD). Due to the fact that many of the ASD observation measures currently available do not have established psychometric properties and require extensive time and training to administer, the "ASD-OC"…
Descriptors: Measures (Individuals), Autism, Observation, Psychometrics
Semmelroth, Carrie Lisa; Johnson, Evelyn – Assessment for Effective Intervention, 2014
This study used generalizability theory to measure reliability on the Recognizing Effective Special Education Teachers (RESET) observation tool designed to evaluate special education teacher effectiveness. At the time of this study, the RESET tool included three evidence-based instructional practices (direct, explicit instruction; whole-group…
Descriptors: Observation, Special Education Teachers, Teacher Effectiveness, Teacher Evaluation
Sointu, Erkko T.; Geležiniene, Renata; Lambert, Matthew C.; Nordness, Philip D. – International Journal of School & Educational Psychology, 2015
Educational professionals need assessments that yield psychometrically sound scores to assess students' behavioral and emotional functioning in order to guide data-driven decision-making processes. Rating scales have been found to be effective and economical, and often multiple informant perspectives can be obtained. The agreement between multiple…
Descriptors: Behavior Rating Scales, Indo European Languages, Translation, Test Reliability
Strang, Kenneth David – Journal of Information Technology Education: Innovations in Practice, 2015
An online Moodle Workshop was evaluated for peer assessment effectiveness. A quasi-experiment was designed using a Seminar in Professionalism course taught in face-to-face mode to undergraduate students across two campuses. The first goal was to determine if Moodle Workshop awarded a fair peer grader grade. The second objective was to estimate if…
Descriptors: Workshops, Online Courses, Program Effectiveness, Educational Practices
Consistency of Supervisor and Peer Ratings of Assessment Interviews Conducted by Psychology Trainees
Gonsalvez, Craig J.; Deane, Frank P.; Caputi, Peter – British Journal of Guidance & Counselling, 2016
Observation of counsellor skills through a one-way mirror, video or audio recording followed by supervisor and peer feedback is common in counsellor training. The nature and extent of agreement between supervisor-peer dyads are unclear. Using a standard scale, supervisors and peers rated 32 interviews by psychology trainees observed through a…
Descriptors: Interviews, Supervisory Methods, Trainees, Minimum Competency Testing
Brehmer-Rinderer, Barbara; Zeilinger, Elisabeth Lucia; Radaljevic, Ana; Weber, Germain – Research in Developmental Disabilities: A Multidisciplinary Journal, 2013
Frailty is a theoretical concept used to track individual age-related declines. Persons with intellectual disabilities (ID) often present with pre-existing deficits that would be considered frailty markers in the general population. The previously developed Vienna Frailty Questionnaire for Persons with ID (VFQ-ID) was aimed at assessing frailty in…
Descriptors: Questionnaires, Psychometrics, Mental Retardation, Factor Structure
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Wright, Heather Harris; Capilouto, Gilson J.; Koutsoftas, Anthony – International Journal of Language & Communication Disorders, 2013
Background: Discourse coherence is a reflection of the listener's ability to interpret the overall meaning conveyed by the speaker. Measuring global coherence (maintenance of thematic unity of the discourse) is useful for quantifying communication impairments at the discourse level in clinical populations and for measuring response to…
Descriptors: Measures (Individuals), Feasibility Studies, Test Reliability, Construct Validity
Aitken, Madison; Martinussen, Rhonda; Wolfe, Richard G.; Tannock, Rosemary – Assessment for Effective Intervention, 2015
The Strengths and Difficulties Questionnaire (SDQ) is a 25-item screening measure for emotional and behavioral problems in children and adolescents aged 4 to 16. Structural equation modeling was used to test the five-factor structure of teacher and parent ratings on the British version of the SDQ in a community sample of 501 Canadian children aged…
Descriptors: Foreign Countries, Factor Structure, Questionnaires, Elementary School Students
Glad, Johan; Kottorp, Anders; Jergeby, Ulla; Gustafsson, Carina; Sonnander, Karin – Research on Social Work Practice, 2014
Objectives: The aim of this pilot study was to explore psychometric properties of two versions of the Home Observation for Measurement of the Environment Inventory in a Swedish social service sample. Method: Social workers employed at 22 Swedish child protection agencies participated in the data collection. Both classic test theory approaches and…
Descriptors: Psychometrics, Item Response Theory, Foreign Countries, Social Services
Larsen, Linda; Kohnen, Saskia; Nickels, Lyndsey; McArthur, Genevieve – Australian Journal of Learning Difficulties, 2015
Children who have difficulty learning to read are at increased risk for academic failure, poor self-esteem, anxiety and depression, and unemployment. To help reduce these risks, it is important to identify and treat weaknesses in a child's reading as early as possible. The aim of this study was to develop a valid and reliable comprehensive…
Descriptors: Phoneme Grapheme Correspondence, Reading Tests, Standardized Tests, Test Reliability
Camilleri, Bernard; Botting, Nicola – International Journal of Language & Communication Disorders, 2013
Background: Children's low scores on vocabulary tests are often erroneously interpreted as reflecting poor cognitive and/or language skills. It may be necessary to incorporate the measurement of word-learning ability in estimating children's lexical abilities. Aims: To explore the reliability and validity of the Dynamic Assessment of…
Descriptors: Receptive Language, Vocabulary, Language Tests, Test Reliability
Srsen, Katja Groleger; Vidmar, Gaj; Pikl, Masa; Vrecar, Irena; Burja, Cirila; Krusec, Klavdija – International Journal of Rehabilitation Research, 2012
The Halliwick concept is widely used in different settings to promote joyful movement in water and swimming. To assess the swimming skills and progression of an individual swimmer, a valid and reliable measure should be used. The Halliwick-concept-based Swimming with Independent Measure (SWIM) was introduced for this purpose. We aimed to determine…
Descriptors: Content Validity, Interrater Reliability, Test Reliability, Test Validity

