ERIC - Search Results

Showing 1 to 15 of 63 results

The Reliability and Precision of Total Scores and IRT Estimates as a Function of Polytomous IRT Parameters and Latent Trait Distribution

Peer reviewed

Culpepper, Steven Andrew – Applied Psychological Measurement, 2013

A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…

Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement

An Empirical Evaluation of the Slip Correction in the Four Parameter Logistic Models with Computerized Adaptive Testing

Peer reviewed

Yen, Yung-Chin; Ho, Rong-Guey; Laio, Wen-Wei; Chen, Li-Ju; Kuo, Ching-Chin – Applied Psychological Measurement, 2012

In a selected response test, aberrant responses such as careless errors and lucky guesses might cause error in ability estimation because these responses do not actually reflect the knowledge that examinees possess. In a computerized adaptive test (CAT), these aberrant responses could further cause serious estimation error due to dynamic item…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Response Style (Tests)

Recovery of Graded Response Model Parameters: A Comparison of Marginal Maximum Likelihood and Markov Chain Monte Carlo Estimation

Peer reviewed

Kieftenbeld, Vincent; Natesan, Prathiba – Applied Psychological Measurement, 2012

Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…

Descriptors: Test Length, Markov Processes, Item Response Theory, Monte Carlo Methods

Observed Score and True Score Equating Procedures for Multidimensional Item Response Theory

Peer reviewed

Brossman, Bradley G.; Lee, Won-Chan – Applied Psychological Measurement, 2013

The purpose of this research was to develop observed score and true score equating procedures to be used in conjunction with the multidimensional item response theory (MIRT) framework. Three equating procedures--two observed score procedures and one true score procedure--were created and described in detail. One observed score procedure was…

Descriptors: Equated Scores, True Scores, Item Response Theory, Mathematics Tests

Confirming Testlet Effects

Peer reviewed

DeMars, Christine E. – Applied Psychological Measurement, 2012

A testlet is a cluster of items that share a common passage, scenario, or other context. These items might measure something in common beyond the trait measured by the test as a whole; if so, the model for the item responses should allow for this testlet trait. But modeling testlet effects that are negligible makes the model unnecessarily…

Descriptors: Test Items, Item Response Theory, Comparative Analysis, Models

Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating

Peer reviewed

He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei – Applied Psychological Measurement, 2013

Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…

Descriptors: Regression (Statistics), Item Response Theory, Test Items, Equated Scores

A Comparison of Four Methods of IRT Subscoring

Peer reviewed

de la Torre, Jimmy; Song, Hao; Hong, Yuan – Applied Psychological Measurement, 2011

Lack of sufficient reliability is the primary impediment for generating and reporting subtest scores. Several current methods of subscore estimation do so either by incorporating the correlational structure among the subtest abilities or by using the examinee's performance on the overall test. This article conducted a systematic comparison of four…

Descriptors: Item Response Theory, Scoring, Methods, Comparative Analysis

Iterative Linking with the Differential Functioning of Items and Tests (DFIT) Method: Comparison of Testwide and Item Parameter Replication (IPR) Critical Values

Peer reviewed

Seybert, Jacob; Stark, Stephen – Applied Psychological Measurement, 2012

A Monte Carlo study was conducted to examine the accuracy of differential item functioning (DIF) detection using the differential functioning of items and tests (DFIT) method. Specifically, the performance of DFIT was compared using "testwide" critical values suggested by Flowers, Oshima, and Raju, based on simulations involving large numbers of…

Descriptors: Test Bias, Monte Carlo Methods, Form Classes (Languages), Simulation

DIF Testing for Ordinal Items with Poly-SIBTEST, the Mantel and GMH Tests, and IRT-LR-DIF when the Latent Distribution Is Nonnormal for Both Groups

Peer reviewed

Woods, Carol M. – Applied Psychological Measurement, 2011

Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another. One way to test items with ordinal response scales for DIF is likelihood ratio (LR) testing using item response theory (IRT), or IRT-LR-DIF. Despite the various advantages of…

Descriptors: Test Bias, Test Items, Item Response Theory, Nonparametric Statistics

Fitting IRT Models to Dichotomous and Polytomous Data: Assessing the Relative Model-Data Fit of Ideal Point and Dominance Models

Peer reviewed

Tay, Louis; Ali, Usama S.; Drasgow, Fritz; Williams, Bruce – Applied Psychological Measurement, 2011

This study investigated the relative model-data fit of an ideal point item response theory (IRT) model (the generalized graded unfolding model [GGUM]) and dominance IRT models (e.g., the two-parameter logistic model [2PLM] and Samejima's graded response model [GRM]) to simulated dichotomous and polytomous data generated from each of these models.…

Descriptors: Item Response Theory, Data, Models, Goodness of Fit

Curtailment and Stochastic Curtailment to Shorten the CES-D

Peer reviewed

Finkelman, Matthew D.; Smits, Niels; Kim, Wonsuk; Riley, Barth – Applied Psychological Measurement, 2012

The Center for Epidemiologic Studies-Depression (CES-D) scale is a well-known self-report instrument that is used to measure depressive symptomatology. Respondents who take the full-length version of the CES-D are administered a total of 20 items. This article investigates the use of curtailment and stochastic curtailment (SC), two sequential…

Descriptors: Measures (Individuals), Depression (Psychology), Test Length, Computer Assisted Testing

A Parametric Cumulative Sum Statistic for Person Fit

Peer reviewed

Armstrong, Ronald D.; Shi, Min – Applied Psychological Measurement, 2009

This article develops a new cumulative sum (CUSUM) statistic to detect aberrant item response behavior. Shifts in behavior are modeled with quadratic functions and a series of likelihood ratio tests are used to detect aberrancy. The new CUSUM statistic is compared against another CUSUM approach as well as traditional person-fit statistics. A…

Descriptors: Simulation, Item Response Theory, Personality Theories, High Stakes Tests

The Comparative Performance of Conditional Independence Indices

Peer reviewed

Kim, Doyoung; De Ayala, R. J.; Ferdous, Abdullah A.; Nering, Michael L. – Applied Psychological Measurement, 2011

To realize the benefits of item response theory (IRT), one must have model-data fit. One facet of a model-data fit investigation involves assessing the tenability of the conditional item independence (CII) assumption. In this Monte Carlo study, the comparative performance of 10 indices for identifying conditional item dependence is assessed. The…

Descriptors: Item Response Theory, Monte Carlo Methods, Error of Measurement, Statistical Analysis

Model Selection Indices for Polytomous Items

Peer reviewed

Kang, Taehoon; Cohen, Allan S.; Sung, Hyun-Jung – Applied Psychological Measurement, 2009

This study examines the utility of four indices for use in model selection with nested and nonnested polytomous item response theory (IRT) models: a cross-validation index and three information-based indices. Four commonly used polytomous IRT models are considered: the graded response model, the generalized partial credit model, the partial credit…

Descriptors: Item Response Theory, Models, Selection, Simulation

An Extension of Least Squares Estimation of IRT Linking Coefficients for the Graded Response Model

Peer reviewed

Kim, Seonghoon – Applied Psychological Measurement, 2010

The three types (generalized, unweighted, and weighted) of least squares methods, proposed by Ogasawara, for estimating item response theory (IRT) linking coefficients under dichotomous models are extended to the graded response model. A simulation study was conducted to confirm the accuracy of the extended formulas, and a real data study was…

Descriptors: Least Squares Statistics, Computation, Item Response Theory, Models
