Hampton, Lauren H.; Curtis, Philip R.; Roberts, Megan Y. – Autism: The International Journal of Research and Practice, 2019
Borrowing from a clinical psychology observational methodology, thin-slice observations were used to assess autism characteristics in toddlers. Thin-slices are short observations taken from a longer behavior stream which are assigned ratings by multiple raters using a 5-point scale. The raters' observations are averaged together to assign a…
Descriptors: Autism, Pervasive Developmental Disorders, Observation, Toddlers
Tengberg, Michael – Language Assessment Quarterly, 2018
Reading comprehension is often treated as a multidimensional construct. In many reading tests, items are distributed over reading process categories to represent the subskills expected to constitute comprehension. This study explores (a) the extent to which specified subskills of reading comprehension tests are conceptually conceivable to…
Descriptors: Reading Tests, Reading Comprehension, Scores, Test Results
Berger, Jean-Louis; Karabenick, Stuart A. – Educational Assessment, 2016
Despite their significant contributions to research on self-regulated learning, those favoring online and trace approaches have questioned the use of self-report to assess learners' use of learning strategies. An important rejoinder to such criticisms consists of examining the validity of self-report items. The present study was designed to assess…
Descriptors: Construct Validity, Metacognition, Learning Strategies, Self Disclosure (Individuals)
Getman, Edward Paul – Online Submission, 2020
Despite calls for engaging assessments targeting young language learners (YLLs) between 8 and 13 years old, what makes assessment tasks engaging and how such task characteristics affect measurement quality have not been well studied empirically. Furthermore, there has been a dearth of validity research about technology-enhanced speaking tests for…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Learner Engagement
Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…
Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment
Rindermann, Heiner; Baumeister, Antonia E. E. – International Journal of Testing, 2015
Scholastic tests regard cognitive abilities to be domain-specific competences. However, high correlations between competences indicate either high task similarity or a dependence on common factors. The present rating study examined the validity of 12 Programme for International Student Assessment (PISA) and Third or Trends in International…
Descriptors: Test Validity, Test Interpretation, Competence, Reading Tests
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Slepkov, Aaron D.; Shiell, Ralph C. – Physical Review Special Topics - Physics Education Research, 2014
Constructed-response (CR) questions are a mainstay of introductory physics textbooks and exams. However, because of the time, cost, and scoring reliability constraints associated with this format, CR questions are being increasingly replaced by multiple-choice (MC) questions in formal exams. The integrated testlet (IT) is a recently developed…
Descriptors: Science Tests, Physics, Responses, Multiple Choice Tests
Taubner, Svenja; Horz, Susanne; Fischer-Kern, Melitta; Doering, Stephan; Buchheim, Anna; Zimmermann, Johannes – Psychological Assessment, 2013
The Reflective Functioning Scale (RFS) was developed to assess individual differences in the ability to mentalize attachment relationships. The RFS assesses mentalization from transcripts of the Adult Attachment Interview (AAI). A global score is given by trained coders on an 11-point scale ranging from antireflective to exceptionally reflective.…
Descriptors: Measures (Individuals), Attachment Behavior, Individual Differences, Adults
Williams, Lunetta M.; Hall, Katrina W.; Hedrick, Wanda B.; Lamkin, Marcia; Abendroth, Jennifer – Journal of Language and Literacy Education, 2013
The purpose of the present study was to develop an instrument to measure reading during in-school independent reading (ISIR). Procedures to establish validity and reliability of the instrument included videotaping and observing students during ISIR, gathering feedback from literacy experts, establishing interrater reliability, crosschecking…
Descriptors: Test Construction, Test Validity, Test Reliability, Video Technology
Hough, Heather J.; Kerbow, David; Bryk, Anthony; Pinnell, Gay Su; Rodgers, Emily; Dexter, Emily; Hung, Carrie; Scharer, Patricia L.; Fountas, Irene – School Effectiveness and School Improvement, 2013
In this paper, we report on 2 studies developing, testing, and using an observation tool for measuring primary literacy instruction, the Developing Language and Literacy Teaching (DLLT) rubrics. In Study 1 (an instrumentation study), we show that the DLLT has a high level of internal consistency, that there are high levels of inter-rater…
Descriptors: Literacy Education, Teacher Evaluation, Observation, Scoring Rubrics
Rufino, Katrina A.; Boccaccini, Marcus T.; Guy, Laura S. – Assessment, 2011
Although reliability is essential to validity, most research on violence risk assessment tools has paid little attention to strategies for improving rater agreement. The authors evaluated the degree to which perceived subjectivity in scoring guidelines for items from two measures--the Psychopathy Checklist-Revised (PCL-R) and the Historical,…
Descriptors: Risk Management, Predictive Validity, Interrater Reliability, Scoring
Hasson, Natalie; Dodd, Barbara; Botting, Nicola – International Journal of Language & Communication Disorders, 2012
Background: Sentence construction and syntactic organization are known to be poor in children with specific language impairments (SLI), but little is known about the way in which children with SLI approach language tasks, and static standardized tests contribute little to the differentiation of skills within the population of children with…
Descriptors: Alternative Assessment, Sentence Structure, Syntax, Language Processing
Sood, Vishal – Journal on Educational Psychology, 2013
To identify children with four major kinds of verbal learning disabilities (reading disability, speech and language comprehension disability, writing disability, and mathematics disability), the present study was undertaken to construct and standardize a verbal learning disabilities checklist. This checklist was developed by keeping in view the…
Descriptors: Verbal Learning, Learning Disabilities, Children, Disability Identification
OECD Publishing (NJ1), 2009
The Organisation for Economic Cooperation and Development's (OECD's) Programme for International Student Assessment (PISA) surveys, which take place every three years, have been designed to collect information about 15-year-old students in participating countries. PISA examines how well students are prepared to meet the challenges of the future,…
Descriptors: Policy Formation, Scaling, Academic Achievement, Interrater Reliability
