Publication Date
| In 2026 | 7 |
| Since 2025 | 690 |
| Since 2022 (last 5 years) | 3191 |
| Since 2017 (last 10 years) | 7432 |
| Since 2007 (last 20 years) | 15070 |
Descriptor
| Test Reliability | 15055 |
| Test Validity | 10290 |
| Reliability | 9763 |
| Foreign Countries | 7150 |
| Test Construction | 4828 |
| Validity | 4192 |
| Measures (Individuals) | 3880 |
| Factor Analysis | 3826 |
| Psychometrics | 3532 |
| Interrater Reliability | 3126 |
| Correlation | 3040 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1329 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 253 |
| Taiwan | 234 |
| Netherlands | 224 |
| Spain | 218 |
| California | 215 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Abdullah Faruk Kiliç; Meltem Acar Güvendir; Gül Güler; Tugay Kaçak – Measurement: Interdisciplinary Research and Perspectives, 2025
In this study, the extent to which wording effects impact structure and factor loadings, internal consistency and measurement invariance was outlined. The modified form, which includes items that are semantically reversed, explains 21.5% more variance than the original form. Also, reversed items' factor loadings are higher. As a result of CFA, indexes…
Descriptors: Test Items, Factor Structure, Test Reliability, Semantics
Alexandra Jackson; Cheryl Bodnar; Elise Barrella; Juan Cruz; Krista Kecskemety – Journal of STEM Education: Innovations and Research, 2025
Recent curricular interventions in engineering education have focused on encouraging students to develop an entrepreneurial mindset (EM) to equip them with the skills needed to generate innovative ideas and address complex global problems upon entering the workforce. Methods to evaluate these interventions have been inconsistent due to the lack of…
Descriptors: Engineering Education, Entrepreneurship, Concept Mapping, Student Evaluation
Betul Aydin; Suleyman Sadi Seferoglu – Turkish Online Journal of Distance Education, 2025
This study aims to examine the validity and reliability of a measurement tool developed to determine university students' levels of digital risk taking. A total of 646 undergraduate students from 8 different universities voluntarily participated in the study. Exploratory and confirmatory factor analyses were conducted to reveal the factor structure of the…
Descriptors: Undergraduate Students, Measures (Individuals), Risk, Test Validity
Jadiane Dionisio; Cristina dos Santos Cardoso de Sá; Carlos Luz; Bruno Silva; Luis Paulo Rodrigues; Rita Cordovil – Journal of Motor Learning and Development, 2025
This study aims to adapt the Motor Competence Assessment (MCA) instrument for autistic children. The adaptation was carried out in three stages: (a) a pilot test, (b) MCA assessment of 45 children with autism (Levels I and II) aged 5-11 years with documentation of the difficulties of each MCA test, and (c) adaptations to the original MCA tests were…
Descriptors: Psychomotor Skills, Children, Preadolescents, Autism Spectrum Disorders
Christopher J. Anthony; Stephen N. Elliott – School Mental Health, 2025
Stress is a complex construct that is related to resilience and general health starting in childhood. Despite its importance for student health and well-being, there are few measures of stress designed for school-based applications. In this study, we developed and initially validated a Stress Indicators Scale using five samples of teachers,…
Descriptors: Test Construction, Stress Variables, Test Validity, Test Items
Danwei Cai; Ben Naismith; Maria Kostromitina; Zhongwei Teng; Kevin P. Yancey; Geoffrey T. LaFlair – Language Learning, 2025
Globalization and increases in the numbers of English language learners have led to a growing demand for English proficiency assessments of spoken language. In this paper, we describe the development of an automatic pronunciation scorer built on state-of-the-art deep neural network models. The model is trained on a bespoke human-rated dataset that…
Descriptors: Automation, Scoring, Pronunciation, Speech Tests
Berna Kiliç; Mahmut Selvi – International Journal of Assessment Tools in Education, 2025
It is important to determine the level of pedagogical content knowledge of teachers regarding skills. The aim of this study is to establish the theoretical framework of skill-specific pedagogical content knowledge and to develop a reliable and valid scale to measure teachers' entrepreneurship pedagogical content knowledge. The draft scale form was…
Descriptors: Entrepreneurship, Pedagogical Content Knowledge, Teacher Competency Testing, Test Reliability
Janika Saretzki; Rosalie Andrae; Boris Forthmann; Mathias Benedek – Journal of Creative Behavior, 2025
Divergent thinking (DT) ability is widely regarded as a central cognitive capacity underlying creativity, but its assessment is challenged by the fact that DT tasks yield a variable number of responses. Various approaches for the scoring of DT tasks have been proposed, which differ in how responses are evaluated and aggregated within a task. The…
Descriptors: Creative Thinking, Creativity Tests, Scoring, Metacognition
Takashi Mori; Nami Ogawa; Ichiro Fujishima; Hidetaka Wakabayashi; Keishi Okamoto; Yuto Kameyama; Ai Hirano; Fumiko Oshima; Masataka Itoda; Sumito Ogawa; Tomohisa Ohno; Minoru Yamada; Kenjiro Kunieda; Takashi Shigematsu; Shinta Nishioka; Kazuki Fukuma; Akio Shimizu; Yoichiro Sugiyama – International Journal of Language & Communication Disorders, 2025
Purpose: Measurement of swallowing muscle mass is important in determining sarcopenic dysphagia. Ultrasound equipment can measure the cross-sectional area of the swallowing muscles, but the inter-instrument reliability is unknown. In this study, the inter-instrument reliability was investigated. Methods: Three ultrasound devices were used to…
Descriptors: Human Body, Motor Reactions, Diagnostic Tests, Acoustics
Guy B. deBrun – Journal of Outdoor Recreation, Education, and Leadership, 2025
Discussions of what it means to be an effective outdoor leader are common in outdoor education literature (Martin et al., 2025; Smith, 2021). Research has identified core competencies (Martin et al., 2025), conceptual frameworks (Pomfret et al., 2023), and course curricula/qualifications for effective leadership (Baker & O'Brien, 2019; Seaman…
Descriptors: Outdoor Leadership, Leadership Effectiveness, Evaluation Methods, Scoring Rubrics
Jennifer Manning; Jeffrey Baldwin; Natasha Powell – Innovations in Education and Teaching International, 2025
As ChatGPT continues to reshape student engagement and instructional design, it is crucial to examine its practical implications. This study aims to evaluate the effectiveness of ChatGPT3.5 and ChatGPT4 as potential automated essay scoring (AES) systems. Fifty authentic, student-written annotated bibliographies were evaluated by three human raters…
Descriptors: Foreign Countries, Essays, Writing Evaluation, Artificial Intelligence
Esma Esen Çiftçi; Esra Dasçi; Cansu Ayan; Zeynep Uludag – International Journal of Assessment Tools in Education, 2025
Men's participation is an important indicator for achieving gender equality. The purpose of this study is to adapt the "Support for Gender Equality among Men Scale (SGEMS)" developed by Sudkamper et al. (2020) to Turkish. The scale examines men's support for gender equality in two sub-dimensions: public space and household. In the study,…
Descriptors: Foreign Countries, Males, Gender Bias, Community
Yurdagül Aydinyer; Behiye Ubuz – International Journal of Science and Mathematics Education, 2025
This mixed methods study aimed to develop a test assessing middle school students' predominantly deep procedural knowledge of angles and polygons and then to determine its validity based on test content (expert evaluation and developmental field testing) and internal structure (dimensionality and internal consistency). "Deep procedural…
Descriptors: Geometric Concepts, Test Construction, Test Validity, Middle School Students
Jack D. Brett; David A. Preece; Rodrigo Becerra; Andrew Whitehouse; Murray T. Maybery – Journal of Autism and Developmental Disorders, 2025
Purpose: There is a common mischaracterisation that autistic individuals have reduced or absent empathy. Measurement issues may have influenced existing findings on the relationships between autism and empathy, and the structure of the empathy construct in autism remains unclear. Methods: The present study sought to address these gaps by examining…
Descriptors: Empathy, Autism Spectrum Disorders, Affective Measures, Psychometrics
Brogan L. Barr; Virginia V. W. McIntosh; Eileen F. Britt; Jennifer Jordan; Janet D. Carter – Measurement: Interdisciplinary Research and Perspectives, 2024
Even when raters demonstrate agreement in the use of a measure, limited score variability or violation of often-ignored statistical assumptions can result in lower reliability estimates than intuitively expected. This article uses data drawn from two randomized controlled trials of schema therapy and cognitive behavioral therapy for the treatment…
Descriptors: Evaluators, Interrater Reliability, Reliability, Measurement Techniques

