ERIC - Search Results

Publication Date

In 2026	0
Since 2025	0
Since 2022 (last 5 years)	1
Since 2017 (last 10 years)	1
Since 2007 (last 20 years)	4

Descriptor

Educational Testing	11
Psychometrics	11
Test Reliability	6
Evaluation Methods	5
Test Construction	5
Reliability	4
Test Validity	4
Validity	4
Models	3
Scores	3
Adaptive Testing	2
Computer Assisted Testing	2
Goodness of Fit	2
Item Response Theory	2
Mathematical Models	2
Multiple Choice Tests	2
Program Evaluation	2
Scoring	2
Test Interpretation	2
Test Use	2
Academic Achievement	1
Accountability	1
Achievement Tests	1
Attention Deficit Disorders	1
Behavior Disorders	1

Source

Educational Assessment	1
Educational Evaluation and…	1
Journal of Applied Testing…	1
Journal of Faculty Development	1
Measurement:…	1
Multivariate Behavioral…	1

Author

Haberman, Shelby J.	2
Sinharay, Sandip	2
Berk, Ronald A.	1
Koch, William R.	1
Luecht, Richard M.	1
Lyman, Howard B.	1
Puhan, Gautam	1
Raggio, Donald J.	1
Reckase, Mark D.	1
Schutz, Richard E.	1
Sigel, Irving E.	1
Stefanie A. Wind	1
Whitten, Janice M.	1
Wilcox, Rand R.	1
Yangmeng Xu	1

Publication Type

Journal Articles	6
Reports - Research	4
Speeches/Meeting Papers	3
Guides - Non-Classroom	2
Opinion Papers	2
Reports - Evaluative	2
Books	1
Reports - Descriptive	1
Reports - General	1

Education Level

Elementary Secondary Education

Audience

Community	1
Practitioners	1

Laws, Policies, & Programs

Race to the Top

Assessments and Surveys

Continuous Performance Test

Showing all 11 results

Resolving and Re-Scoring Constructed Response Items in Mixed-Format Assessments: An Exploration of Three Approaches

Peer reviewed

Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024

We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…

Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners

Value of Value-Added Models Based on Student Outcomes to Evaluate Teaching

Peer reviewed

Berk, Ronald A. – Journal of Faculty Development, 2016

Recently, student outcomes have bubbled to the top of debates about how to evaluate teaching in community and liberal arts colleges, universities, and professional schools, but even more international attention has been riveted on how outcomes are being used to evaluate teachers and administrators K-12 (Harris, 2012; Rowen & Raudenbush, 2016;…

Descriptors: Value Added Models, Academic Achievement, Outcomes of Education, Teacher Evaluation

Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

Peer reviewed

Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Multivariate Behavioral Research, 2010

Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…

Descriptors: Educational Testing, Scores, Reports, Psychometrics

How Much Can We Reliably Know about What Examinees Know?

Peer reviewed

Sinharay, Sandip; Haberman, Shelby J. – Measurement: Interdisciplinary Research and Perspectives, 2009

In this commentary, the authors discuss some of the issues regarding the use of diagnostic classification models that practitioners should keep in mind. In the authors' experience, these issues are not as well known as they should be. The authors then provide recommendations on diagnostic scoring.

Descriptors: Scoring, Reliability, Validity, Classification

New Faces of Validity of Educational Tests.

Peer reviewed

Schutz, Richard E. – Educational Evaluation and Policy Analysis, 1985

This paper updates the concept of test validity. This new conception entails a set of 10 categories combined together in pairs: curriculum and instructional validity, statutory and forensic validity, media and journalistic validity, political and legislative validity, and partisan and activist validity. (Author/DWH)

Descriptors: Educational Testing, Politics of Education, Predictive Validity, Psychometrics

R. & D. in Psychometrics: Technical Reports on Latent Structure Models.

Wilcox, Rand R. – 1982

This document contains three papers from the Methodology Project of the Center for the Study of Evaluation. Methods for characterizing test accuracy are reported in the first two papers. "Bounds on the K Out of N Reliability of a Test, and an Exact Test for Hierarchically Related Items" describes and illustrates how an extension of a…

Descriptors: Educational Testing, Evaluation Methods, Guessing (Tests), Latent Trait Theory

Test Scores and What They Mean. Sixth Edition.

Lyman, Howard B. – 1998

The first edition of this book was written to give information about testing to people whose work gave them access to test results, but whose training included little or nothing about the use and interpretation of tests. Later editions have been intended for a broader audience as the need for understanding what test scores really mean has…

Descriptors: Educational Testing, Norm Referenced Tests, Performance Based Assessment, Psychometrics

Raggio Evaluation of Attention Deficit Disorder (READD).

Raggio, Donald J.; Whitten, Janice M. – 1994

The Raggio Evaluation of Attention Deficit Disorder (READD) is an objective measure for the diagnosis and management of attention deficit disorder (ADD) in children. Extensive research has been conducted on its clinical and psychometric properties, as described in Chapter 3, "Development and Standardization." The READD is a microcomputer…

Descriptors: Attention Deficit Disorders, Behavior Disorders, Children, Clinical Diagnosis

Problems in Application of Latent Trait Models to Tailored Testing.

Koch, William R.; Reckase, Mark D. – 1979

Tailored testing procedures for achievement testing were applied in a situation that failed to meet some of the specifications generally considered to be necessary for tailored testing. Discrepancies from the appropriate conditions included the use of small samples for calibrating items, and the use of an item pool that was not designed to be…

Descriptors: Achievement Tests, Adaptive Testing, Educational Testing, Higher Education

A Developmental Perspective in Evaluating Educational Programs.

Sigel, Irving E. – 1978

This paper provides a theoretical discussion of educational program evaluation. Psychometric theory and developmental psychology are compared as they pertain to the testing of children. The nature of change in childhood makes it necessary to examine the assumptions and goals related to the testing of children as a means of evaluating educational…

Descriptors: Child Development, Cognitive Measurement, Developmental Psychology, Developmental Stages

Some Useful Cost-Benefit Criteria for Evaluating Computer-Based Test Delivery Models and Systems

Peer reviewed

Luecht, Richard M. – Journal of Applied Testing Technology, 2005

Computer-based testing (CBT) is typically implemented using one of three general test delivery models: (1) multiple fixed testing (MFT); (2) computer-adaptive testing (CAT); or (3) multistage testing (MST). This article reviews some of the real cost drivers associated with CBT implementation--focusing on item production costs, the costs…

Descriptors: Adaptive Testing, Computer Assisted Testing, Quality Control, Costs