Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 0 |
Since 2006 (last 20 years) | 5 |
Descriptor
Data | 6 |
Scores | 3 |
Bayesian Statistics | 2 |
Comparative Analysis | 2 |
Correlation | 2 |
Goodness of Fit | 2 |
Item Response Theory | 2 |
Methods | 2 |
Models | 2 |
Psychometrics | 2 |
Simulation | 2 |
More ▼ |
Source
Educational Measurement:… | 2 |
ETS Research Report Series | 1 |
Educational Testing Service | 1 |
Journal of Educational… | 1 |
Journal of Educational and… | 1 |
Author
Sinharay, Sandip | 6 |
Haberman, Shelby J. | 2 |
Blew, Edwin O. | 1 |
Dorans, Neil J. | 1 |
Grant, Mary C. | 1 |
Holland, Paul W. | 1 |
Puhan, Gautam | 1 |
Publication Type
Journal Articles | 5 |
Reports - Research | 3 |
Reports - Evaluative | 2 |
Reports - Descriptive | 1 |
Education Level
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
Pre Professional Skills Tests | 1 |
What Works Clearinghouse Rating
Sinharay, Sandip; Haberman, Shelby J. – Educational Measurement: Issues and Practice, 2014
Standard 3.9 of the Standards for Educational and Psychological Testing ([, 1999]) demands evidence of model fit when item response theory (IRT) models are employed to data from tests. Hambleton and Han ([Hambleton, R. K., 2005]) and Sinharay ([Sinharay, S., 2005]) recommended the assessment of practical significance of misfit of IRT models, but…
Descriptors: Item Response Theory, Goodness of Fit, Models, Tests
Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Educational Measurement: Issues and Practice, 2011
The purpose of this ITEMS module is to provide an introduction to subscores. First, examples of subscores from an operational test are provided. Then, a review of methods that can be used to examine if subscores have adequate psychometric quality is provided. It is demonstrated, using results from operational and simulated data, that subscores…
Descriptors: Scores, Psychometrics, Tests, Data
Sinharay, Sandip – Journal of Educational Measurement, 2010
Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman suggested a method based on classical test theory to determine whether subscores have added value over total scores. In this article I first provide a rich collection of results regarding when subscores were found to have added…
Descriptors: Scores, Test Theory, Simulation, Reliability
Sinharay, Sandip; Dorans, Neil J.; Grant, Mary C.; Blew, Edwin O. – Journal of Educational and Behavioral Statistics, 2009
Test administrators often face the challenge of detecting differential item functioning (DIF) with samples of size smaller than that recommended by experts. A Bayesian approach can incorporate, in the form of a prior distribution, existing information on the inference problem at hand, which yields more stable estimation, especially for small…
Descriptors: Test Bias, Computation, Bayesian Statistics, Data
Sinharay, Sandip; Holland, Paul W. – Educational Testing Service, 2008
The nonequivalent groups with anchor test (NEAT) design involves missing data that are missing by design. Three popular equating methods that can be used with a NEAT design are the poststratification equating method, the chain equipercentile equating method, and the item-response-theory observed-score-equating method. These three methods each…
Descriptors: Equated Scores, Test Items, Item Response Theory, Data
Sinharay, Sandip – ETS Research Report Series, 2004
Assessing fit of psychometric models has always been an issue of enormous interest, but there exists no unanimously agreed upon item fit diagnostic for the models. Bayesian networks, frequently used in educational assessments (see, for example, Mislevy, Almond, Yan, & Steinberg, 2001) primarily for learning about students' knowledge and…
Descriptors: Bayesian Statistics, Networks, Models, Goodness of Fit