Publication Date
| In 2026 | 0 |
| Since 2025 | 56 |
| Since 2022 (last 5 years) | 282 |
| Since 2017 (last 10 years) | 778 |
| Since 2007 (last 20 years) | 2040 |
Descriptor
| Interrater Reliability | 3122 |
| Foreign Countries | 654 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 24 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Hong, Szu-Wei; Chan, Roger W. – Journal of Speech, Language, and Hearing Research, 2022
Purpose: This study examined the acoustic properties of Taiwanese (Southern Min) lexical tones produced in esophageal speech (ES) and pneumatic artificial laryngeal speech (PAL), including onset fundamental frequency (F0), slope of F0 contour, duration, and amplitude (intensity) of the vowel portion of syllables carrying seven Taiwanese tones.…
Descriptors: Acoustics, Speech Communication, Intonation, Vowels
Frey, T. Kody; Tatum, Nicholas T. – Communication Education, 2022
Three studies (N = 1,346) detail the development of three theoretically grounded instruments operationalizing "instructor strictness." Using open-ended questionnaire data (n = 427), study 1 inductively derives an understanding of the instructor behaviors that students perceive as strict. These patterns of behavior are then condensed into…
Descriptors: Teaching Methods, Teacher Behavior, Student Attitudes, Behavior Patterns
The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Mantzicopoulos, Panayota; French, Brian F.; Patrick, Helen – Grantee Submission, 2018
Research Findings: We evaluated the score stability of the Mathematical Quality of Instruction (MQI), an observational measure of mathematics instruction. Three raters each scored, independently, 100 video-recorded lessons taught by 20 kindergarten teachers in the spring. Using generalizability theory analyses, we decomposed the MQI's score…
Descriptors: Kindergarten, Mathematics Instruction, Educational Quality, Classroom Observation Techniques
Moeller, Julia; Viljaranta, Jaana; Kracke, Bärbel; Dietrich, Julia – Frontline Learning Research, 2020
This article proposes a study design developed to disentangle the objective characteristics of a learning situation from individuals' subjective perceptions of that situation. The term objective characteristics refers to the agreement across students, whereas subjective perceptions refers to inter-individual heterogeneity. We describe a novel…
Descriptors: Student Attitudes, College Students, Lecture Method, Student Interests
Holly Lee Allen – ProQuest LLC, 2020
Development of the Jordan Performance Appraisal System (JPAS) was completed in 1996. This study examined the factor structure of the classroom observation instrument used in the JPAS. Using observed classroom instructional quality ratings of 1220 elementary teachers of Grades 1-6 in the Jordan School District, this study estimated the factor…
Descriptors: Foreign Countries, Performance Based Assessment, Factor Structure, Classroom Observation Techniques
Sumner, Josh – Research-publishing.net, 2021
Comparative Judgement (CJ) has emerged as a technique that typically makes use of holistic judgement to assess difficult-to-specify constructs such as production (speaking and writing) in Modern Foreign Languages (MFL). In traditional approaches, markers assess candidates' work one-by-one in an absolute manner, assigning scores to different…
Descriptors: Holistic Approach, Student Evaluation, Comparative Analysis, Decision Making
Brocken, Johanna E. A.; van der Kamp, John; Wormhoudt, Rene; Lenoir, Matthieu L.; Savelsbergh, Geert J. P. – Journal of Teaching in Physical Education, 2023
Purpose: The aim of this study is to measure the concurrent validity of the Athletic Skills Track (AST) by examining whether its outcome score correlates with the holistic judgments of experts about the quality of movement. Method: Video recordings of children performing the AST were shown to physical education teachers who independently gave a…
Descriptors: Validity, Athletics, Correlation, Scores
Kang, Veronica Y.; Kim, Sunyoung – Journal of Early Intervention, 2023
Teaching vocabularies to young children is critical as vocabulary is related to long-term language, literacy, and academic skills. The current study used a multiple probe design to examine the effects of enhanced milieu teaching with book reading on the use of word approximations in four 2- to 4-year-old females with language delay. The first…
Descriptors: Teaching Methods, Word Frequency, Vocabulary Development, Story Reading
McLeod, Bryce D.; Sutherland, Kevin S.; Broda, Michael; Granger, Kristen L.; Martinez, Ruben G.; Conroy, Maureen A.; Snyder, Patricia A.; Southam-Gerow, Michael A. – Prevention Science, 2022
Though treatment integrity measurement is important for research intended to promote social and behavioral outcomes of children at risk for emotional and behavioral disorders (EBDs) in early childhood settings, measurement gaps exist in the field. This paper reports on the development and preliminary psychometric assessment of the treatment…
Descriptors: Psychometrics, Measures (Individuals), Fidelity, Integrity
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Grantee Submission, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Guo, Daibao; Zimmer, Wendi; Matthews, Sharon D.; McTigue, Erin M. – Journal of Visual Literacy, 2019
Situated on the overlap between visual literacy and content area literacy, we conducted a systematic literature review regarding the impact of visuals for learning in K-12 content area classrooms. The purpose was to critically analyse the methodological rigor of this research base and provide direction for future research. Additionally, we provide…
Descriptors: Visual Literacy, Outcomes of Education, Content Area Reading, Elementary Secondary Education
Lambert, Matthew C.; Sointu, Erkko T.; Epstein, Michael H. – International Journal of School & Educational Psychology, 2019
Child assessment practices have undergone, and are continuing to undergo, significant changes. Among the most prominent changes is the movement toward measuring child well-being, in general, and emotional and behavioral strengths, in particular. The Behavioral and Emotional Rating Scale (BERS) is a strength-based instrument which is widely used in…
Descriptors: Behavior Rating Scales, Translation, Psychometrics, Scores
McGee, Monnie – Journal of Statistics Education, 2019
In several sporting events, the winner is chosen on the basis of a subjective score. These sports include gymnastics, ice skating, and diving. Unlike for other subjectively judged sports, diving competitions consist of multiple rounds in quick succession on the same apparatus. These multiple rounds lead to an extra layer of complexity in the data,…
Descriptors: Data Use, Visualization, Interrater Reliability, Introductory Courses
Sutherland, Rebecca; Trembath, David; Hodge, Marie Antoinette; Rose, Veronica; Roberts, Jacqueline – International Journal of Language & Communication Disorders, 2019
Background: Access to timely and appropriate speech-language pathology (SLP) services is a significant challenge for many families. Telehealth has been used successfully to treat a range of communication disorders in children and adults. Research examining the use of telehealth for children with autism has focused largely on diagnosis,…
Descriptors: Autism, Pervasive Developmental Disorders, Children, Reliability

