Wind, Stefanie A.; Xu, Yangmeng – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Berk, Ronald A. – Journal of Faculty Development, 2016
Recently, student outcomes have bubbled to the top of debates about how to evaluate teaching in community and liberal arts colleges, universities, and professional schools, but even more international attention has been riveted on how outcomes are being used to evaluate teachers and administrators K-12 (Harris, 2012; Rowen & Raudenbush, 2016;…
Descriptors: Value Added Models, Academic Achievement, Outcomes of Education, Teacher Evaluation
Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Multivariate Behavioral Research, 2010
Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…
Descriptors: Educational Testing, Scores, Reports, Psychometrics
Sinharay, Sandip; Haberman, Shelby J. – Measurement: Interdisciplinary Research and Perspectives, 2009
In this commentary, the authors discuss some of the issues regarding the use of diagnostic classification models that practitioners should keep in mind. In the authors' experience, these issues are not as well known as they should be. The authors then provide recommendations on diagnostic scoring.
Descriptors: Scoring, Reliability, Validity, Classification
Schutz, Richard E. – Educational Evaluation and Policy Analysis, 1985 (peer reviewed)
This paper updates the concept of test validity. The new conception comprises 10 categories grouped in pairs: curriculum and instructional validity, statutory and forensic validity, media and journalistic validity, political and legislative validity, and partisan and activist validity. (Author/DWH)
Descriptors: Educational Testing, Politics of Education, Predictive Validity, Psychometrics
Wilcox, Rand R. – 1982
This document contains three papers from the Methodology Project of the Center for the Study of Evaluation. Methods for characterizing test accuracy are reported in the first two papers. "Bounds on the K Out of N Reliability of a Test, and an Exact Test for Hierarchically Related Items" describes and illustrates how an extension of a…
Descriptors: Educational Testing, Evaluation Methods, Guessing (Tests), Latent Trait Theory
Lyman, Howard B. – 1998
The first edition of this book was written to give information about testing to people whose work gave them access to test results, but whose training included little or nothing about the use and interpretation of tests. Later editions have been intended for a broader audience as the need for understanding what test scores really mean has…
Descriptors: Educational Testing, Norm Referenced Tests, Performance Based Assessment, Psychometrics
Raggio, Donald J.; Whitten, Janice M. – 1994
The Raggio Evaluation of Attention Deficit Disorder (READD) is an objective measure for the diagnosis and management of attention deficit disorder (ADD) in children. Extensive research has been conducted on its clinical and psychometric properties, as described in Chapter 3, "Development and Standardization." The READD is a microcomputer…
Descriptors: Attention Deficit Disorders, Behavior Disorders, Children, Clinical Diagnosis
Koch, William R.; Reckase, Mark D. – 1979
Tailored testing procedures for achievement testing were applied in a situation that failed to meet some of the specifications generally considered to be necessary for tailored testing. Discrepancies from the appropriate conditions included the use of small samples for calibrating items, and the use of an item pool that was not designed to be…
Descriptors: Achievement Tests, Adaptive Testing, Educational Testing, Higher Education
Sigel, Irving E. – 1978
This paper provides a theoretical discussion of educational program evaluation. Psychometric theory and developmental psychology are compared as they pertain to the testing of children. The nature of change in childhood makes it necessary to examine the assumptions and goals related to the testing of children as a means of evaluating educational…
Descriptors: Child Development, Cognitive Measurement, Developmental Psychology, Developmental Stages
Luecht, Richard M. – Journal of Applied Testing Technology, 2005
Computer-based testing (CBT) is typically implemented using one of three general test delivery models: (1) multiple fixed testing (MFT); (2) computer-adaptive testing (CAT); or (3) multistage testing (MST). This article reviews some of the real cost drivers associated with CBT implementation--focusing on item production costs, the costs…
Descriptors: Adaptive Testing, Computer Assisted Testing, Quality Control, Costs

