Publication Date
| In 2026 | 0 |
| Since 2025 | 1 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 9 |
| Since 2007 (last 20 years) | 31 |
Descriptor
| Interrater Reliability | 61 |
| Standardized Tests | 61 |
| Test Reliability | 25 |
| Evaluation Methods | 17 |
| Scoring | 15 |
| Test Construction | 15 |
| Test Validity | 15 |
| Scores | 12 |
| Writing Evaluation | 12 |
| Evaluators | 11 |
| Foreign Countries | 11 |
Education Level
| Higher Education | 7 |
| Early Childhood Education | 6 |
| Elementary Education | 5 |
| Postsecondary Education | 4 |
| Secondary Education | 4 |
| High Schools | 3 |
| Primary Education | 3 |
| Grade 3 | 2 |
| Grade 5 | 2 |
| Kindergarten | 2 |
| Preschool Education | 2 |
Audience
| Practitioners | 3 |
| Teachers | 2 |
| Counselors | 1 |
| Researchers | 1 |
Location
| Australia | 2 |
| California | 2 |
| United Kingdom | 2 |
| California (Los Angeles) | 1 |
| Canada | 1 |
| China | 1 |
| Illinois | 1 |
| India | 1 |
| Kansas | 1 |
| Michigan | 1 |
| Netherlands | 1 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 1 |
Kemi, Ole J. – Advances in Physiology Education, 2025
Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…
Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards
Todaro, Francesca; Pizzorni, Nicole; Scarponi, Letizia; Ronzoni, Clara; Huckabee, Maggie-Lee; Schindler, Antonio – International Journal of Language & Communication Disorders, 2021
Background: The Test of Masticating and Swallowing Solids (TOMASS) is an international standardized swallowing assessment tool. However, its psychometric characteristics have not been analysed in patients with dysphagia. Aims: To analyse TOMASS's (1) inter- and intra-rater reliability in a clinical population of patients with dysphagia, (2)…
Descriptors: Physical Disabilities, Test Reliability, Test Validity, Standardized Tests
Jones, Nathan; Bell, Courtney; Qi, Yi; Lewis, Jennifer; Kirui, David; Stickler, Leslie; Redash, Amanda – ETS Research Report Series, 2021
The observation systems being used in all 50 states require administrators to learn to accurately and reliably score their teachers' instruction using standardized observation systems. Although the literature on observation systems is growing, relatively few studies have examined the outcomes of trainings focused on developing administrators'…
Descriptors: Observation, Standardized Tests, Teacher Evaluation, Test Reliability
Lichtenstein, Robert – Communique, 2020
Appropriate interpretation of assessment data requires an appreciation that tools are subject to measurement error. School psychologists recognize, at least on an intellectual level, that measures are imperfect--that test scores and other quantitative measures (e.g., rating scales, systematic behavioral observations) are best estimates of…
Descriptors: Error of Measurement, Test Reliability, Pretests Posttests, Standardized Tests
Burkhardt, Amy; Lottridge, Susan; Woolf, Sherri – Educational Measurement: Issues and Practice, 2021
For some students, standardized tests serve as a conduit to disclose sensitive issues of harm or distress that may otherwise go unreported. By detecting this writing, known as "crisis papers," testing programs have a unique opportunity to assist in mitigating the risk of harm to these students. The use of machine learning to…
Descriptors: Scoring Rubrics, Identification, At Risk Students, Standardized Tests
Pentimonti, Jill M.; Bowles, Ryan P.; Zucker, Tricia A.; Tambyraja, Sherine R.; Justice, Laura M. – Grantee Submission, 2021
Measuring the quality of classroom-based interactive shared book reading within the early childhood classroom represents a specific dimension of teacher-child interactions that is of great interest to researchers. This interest reflects decades of research demonstrating the benefit of reading to young children in both the home and the classroom.…
Descriptors: Standardized Tests, Test Construction, Construct Validity, Predictive Validity
Isbell, Daniel R.; Son, Young-A – Studies in Second Language Acquisition, 2022
Elicited Imitation Tests (EITs) are commonly used in second language acquisition (SLA)/bilingualism research contexts to assess the general oral proficiency of study participants. While previous studies have provided valuable EIT construct-related validity evidence, some key gaps remain. This study uses an integrative data analysis to further…
Descriptors: Bilingualism, Imitation, Language Tests, Second Language Learning
Mailend, Marja-Liisa; Plante, Elena; Anderson, Michele A.; Applegate, E. Brooks; Nelson, Nickola W. – International Journal of Language & Communication Disorders, 2016
Background: As new standardized tests become commercially available, it is critical that clinicians have access to the information about a test's psychometric properties, including aspects of reliability. Aims: The purpose of the three studies reported in this article was to investigate the reliability of a new test, the Test of Integrated…
Descriptors: Standardized Tests, Psychometrics, Reliability, Language Skills
He, Peng; Liu, Xiufeng; Zheng, Changlong; Jia, Mengying – Chemistry Education Research and Practice, 2016
This study intends to develop a standardized instrument for measuring classroom teaching and learning in secondary chemistry lessons. Based on previous studies and interviews with expert teachers, the progression of five quality levels was constructed hypothetically to represent the quality of chemistry lessons in Chinese secondary schools. The…
Descriptors: Foreign Countries, Secondary School Science, Science Instruction, Chemistry
Raadt, Jay Schyler – ProQuest LLC, 2020
In response to concerns about using only standardized multiple-choice assessments, some school districts have moved to using alternative ratings of student achievement with authentic assessments. However, such assessments are often limited in terms of the psychometric validity data supporting their use. The present study mixed quantitative and…
Descriptors: Performance Based Assessment, Middle School Students, Scoring Rubrics, Content Validity
Larsen, Linda; Kohnen, Saskia; Nickels, Lyndsey; McArthur, Genevieve – Australian Journal of Learning Difficulties, 2015
Children who have difficulty learning to read are at increased risk for academic failure, poor self-esteem, anxiety and depression, and unemployment. To help reduce these risks, it is important to identify and treat weaknesses in a child's reading as early as possible. The aim of this study was to develop a valid and reliable comprehensive…
Descriptors: Phoneme Grapheme Correspondence, Reading Tests, Standardized Tests, Test Reliability
McGrane, Joshua Aaron; Humphry, Stephen Mark; Heldsinger, Sandra – Applied Measurement in Education, 2018
National standardized assessment programs have increasingly included extended written performances, amplifying the need for reliable, valid, and efficient methods of assessment. This article examines a two-stage method using comparative judgments and calibrated exemplars as a complement and alternative to existing methods of assessing writing.…
Descriptors: Standardized Tests, Foreign Countries, Writing Tests, Writing Evaluation
Attali, Yigal – Language Testing, 2016
A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session and 14 trainees who passed an…
Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators
Pentimonti, Jill M.; Zucker, Tricia A.; Justice, Laura M.; Petscher, Yaacov; Piasta, Shayne B.; Kaderavek, Joan N. – Early Childhood Research Quarterly, 2012
Participation in shared-reading experiences is associated with children's language and literacy outcomes, yet few standardized assessments of shared-reading quality exist. The purpose of this study was to describe the psychometric characteristics of the Systematic Assessment of Book Reading (SABR), an observational tool designed to characterize…
Descriptors: Test Validity, Construct Validity, Interrater Reliability, Factor Structure
Hasson, Natalie; Dodd, Barbara; Botting, Nicola – International Journal of Language & Communication Disorders, 2012
Background: Sentence construction and syntactic organization are known to be poor in children with specific language impairments (SLI), but little is known about the way in which children with SLI approach language tasks, and static standardized tests contribute little to the differentiation of skills within the population of children with…
Descriptors: Alternative Assessment, Sentence Structure, Syntax, Language Processing

