Publication Date

| Date Range | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 0 |
| Since 2007 (last 20 years) | 11 |

Descriptor

| Descriptor | Records |
| --- | --- |
| Data Analysis | 11 |
| Item Response Theory | 9 |
| Computation | 7 |
| Models | 6 |
| Correlation | 5 |
| Simulation | 5 |
| Test Items | 4 |
| Bayesian Statistics | 3 |
| Evaluation Methods | 3 |
| Markov Processes | 3 |
| Monte Carlo Methods | 3 |

(additional descriptors not shown)

Source

| Source | Records |
| --- | --- |
| Applied Psychological Measurement | 11 |

Author

| Author | Records |
| --- | --- |
| de la Torre, Jimmy | 3 |
| Chan, Tsze | 1 |
| Chen, Cheng-Te | 1 |
| Cohen, Jon | 1 |
| Fukuhara, Hirotaka | 1 |
| Green, Bert F. | 1 |
| Gu, Fei | 1 |
| Hong, Yuan | 1 |
| Hoyle, Larry | 1 |
| Jiang, Tao | 1 |
| Kamata, Akihito | 1 |

(additional authors not shown)

Publication Type

| Publication Type | Records |
| --- | --- |
| Journal Articles | 11 |
| Reports - Research | 9 |
| Reports - Evaluative | 2 |

Education Level

| Education Level | Records |
| --- | --- |
| Elementary Secondary Education | 1 |
| Grade 4 | 1 |
| High Schools | 1 |

Audience

| Audience | Records |
| --- | --- |
| Researchers | 1 |
Assessments and Surveys

| Assessment | Records |
| --- | --- |
| SAT (College Admission Test) | 1 |
Lei, Pui-Wa; Li, Hongli – Applied Psychological Measurement, 2013
Minimum sample sizes of about 200 to 250 per group are often recommended for differential item functioning (DIF) analyses. However, there are times when sample sizes for one or both groups of interest are smaller than 200 due to practical constraints. This study attempts to examine the performance of the Simultaneous Item Bias Test (SIBTEST),…
Descriptors: Sample Size, Test Bias, Computation, Accuracy
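As background, DIF means that examinees with the same ability but from different groups have different success probabilities on an item. A minimal sketch of uniform DIF in a two-parameter logistic (2-PL) model, where the group indicator g and the DIF parameter β are illustrative notation rather than the article's:

```latex
P(Y_{ij}=1 \mid \theta_i, g_i)
= \frac{1}{1+\exp\!\left[-a_j\left(\theta_i - b_j - \beta_j g_i\right)\right]},
\qquad g_i \in \{0, 1\},
```

where β_j ≠ 0 shifts item j's difficulty for the focal group; SIBTEST tests for such shifts nonparametrically.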
Fukuhara, Hirotaka; Kamata, Akihito – Applied Psychological Measurement, 2011
A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…
Descriptors: Item Response Theory, Test Bias, Test Items, Bayesian Statistics
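In a bifactor testlet formulation, each item loads on a general ability and on one testlet-specific factor, and DIF can be expressed as a group effect on the item's difficulty. A hedged sketch in generic notation, not necessarily the authors' exact parameterization:

```latex
P(Y_{ij}=1 \mid \theta_i, \gamma_{i,d(j)}, g_i)
= \frac{1}{1+\exp\!\left[-\left(a_j \theta_i + a_j^{*}\, \gamma_{i,d(j)} - b_j - \beta_j g_i\right)\right]},
```

where d(j) maps item j to its testlet, γ absorbs within-testlet dependence, and β_j is the DIF effect.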
Zhang, Jinming – Applied Psychological Measurement, 2012
It is common to assume during a statistical analysis of a multiscale assessment that the assessment is composed of several unidimensional subtests or that it has simple structure. Under this assumption, the unidimensional and multidimensional approaches can be used to estimate item parameters. These two approaches are equivalent in parameter…
Descriptors: Simulation, Computation, Models, Statistical Analysis
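"Simple structure" here means each item measures exactly one latent dimension, so the discrimination (loading) matrix has one nonzero entry per row; for a four-item, two-subtest instrument, for example:

```latex
A = \begin{pmatrix} a_1 & 0 \\ a_2 & 0 \\ 0 & a_3 \\ 0 & a_4 \end{pmatrix},
```

so each subtest can be treated as a unidimensional test in its own right.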
Gu, Fei; Skorupski, William P.; Hoyle, Larry; Kingston, Neal M. – Applied Psychological Measurement, 2011
Ramsay-curve item response theory (RC-IRT) is a nonparametric procedure that estimates the latent trait using splines, and no distributional assumption about the latent trait is required. For item parameters of the two-parameter logistic (2-PL), three-parameter logistic (3-PL), and polytomous IRT models, RC-IRT can provide more accurate estimates…
Descriptors: Intervals, Item Response Theory, Models, Evaluation Methods
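For reference, the item response functions at issue are the standard ones; RC-IRT changes how the latent-trait distribution is modeled, not these curves:

```latex
\text{2-PL:}\quad P_j(\theta) = \frac{1}{1+\exp[-a_j(\theta-b_j)]},
\qquad
\text{3-PL:}\quad P_j(\theta) = c_j + \frac{1-c_j}{1+\exp[-a_j(\theta-b_j)]}.
```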
Green, Bert F. – Applied Psychological Measurement, 2011
This article refutes a recent claim that computer-based tests produce biased scores for very proficient test takers who make mistakes on one or two initial items and that the "bias" can be reduced by using a four-parameter IRT model. Because the same effect occurs with pattern scores on nonadaptive tests, the effect results from IRT scoring, not…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Bias, Item Response Theory
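The four-parameter model referred to adds an upper asymptote d_j < 1 to the 3-PL, so that even very proficient examinees have a success probability below 1 and an early slip is penalized less:

```latex
P_j(\theta) = c_j + \frac{d_j - c_j}{1+\exp[-a_j(\theta-b_j)]},
\qquad c_j < d_j < 1.
```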
de la Torre, Jimmy; Hong, Yuan – Applied Psychological Measurement, 2010
Sample size ranks as one of the most important factors affecting the item calibration task. However, due to practical concerns (e.g., item exposure), items are typically calibrated with much smaller samples than is desired. To address the need for a more flexible framework that can be used in small-sample item calibration, this article…
Descriptors: Sample Size, Markov Processes, Tests, Data Analysis
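The article works within a Markov chain Monte Carlo framework for item calibration. As a hedged illustration of that general machinery (not the authors' specific algorithm), the sketch below calibrates a single 2-PL item from a small sample with a random-walk Metropolis-Hastings sampler, treating abilities as known and placing weakly informative priors on the item parameters:

```python
# Hedged sketch: random-walk Metropolis-Hastings calibration of one 2-PL
# item from a small sample, with person abilities treated as known.
import numpy as np

rng = np.random.default_rng(0)

# Simulate a small calibration sample (n = 100) for a single item.
n = 100
theta = rng.normal(0.0, 1.0, n)        # fixed person abilities
a_true, b_true = 1.2, 0.5
y = rng.binomial(1, 1 / (1 + np.exp(-a_true * (theta - b_true))))

def log_post(a, b):
    """Bernoulli log likelihood plus weakly informative priors:
    a ~ Normal(1, 1) truncated to a > 0, b ~ Normal(0, 1)."""
    if a <= 0:
        return -np.inf
    p = 1 / (1 + np.exp(-a * (theta - b)))
    p = np.clip(p, 1e-12, 1 - 1e-12)   # numerical safety
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    return loglik - 0.5 * (a - 1.0) ** 2 - 0.5 * b ** 2

a, b, draws = 1.0, 0.0, []
for t in range(6000):
    a_prop, b_prop = a + rng.normal(0, 0.1), b + rng.normal(0, 0.1)
    if np.log(rng.uniform()) < log_post(a_prop, b_prop) - log_post(a, b):
        a, b = a_prop, b_prop          # accept the proposal
    if t >= 1000:                      # keep post-burn-in draws
        draws.append((a, b))

a_hat, b_hat = np.mean(draws, axis=0)
print(f"posterior means: a ~ {a_hat:.2f}, b ~ {b_hat:.2f}")
```

The Bayesian machinery is what makes small-sample calibration workable: the priors stabilize estimates where the likelihood alone is too flat.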
de la Torre, Jimmy; Song, Hao – Applied Psychological Measurement, 2009
Assessments consisting of different domains (e.g., content areas, objectives) are typically multidimensional in nature but are commonly assumed to be unidimensional for estimation purposes. The different domains of these assessments are further treated as multi-unidimensional tests for the purpose of obtaining diagnostic information. However, when…
Descriptors: Ability, Tests, Item Response Theory, Data Analysis
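One common way to retain diagnostic domain scores while respecting the correlation among domains is a higher-order structure, sketched here in generic notation, in which each domain ability is generated by an overall ability:

```latex
\theta_i^{(d)} = \lambda_d\, \theta_i + \varepsilon_{id},
\qquad \varepsilon_{id} \sim \mathcal{N}\!\left(0,\; 1-\lambda_d^2\right),
```

so that the estimate for one domain borrows strength from responses in the others.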
Effects of Ignoring Item Interaction on Item Parameter Estimation and Detection of Interacting Items
Chen, Cheng-Te; Wang, Wen-Chung – Applied Psychological Measurement, 2007
This study explores the effects of ignoring item interaction on item parameter estimation and the efficiency of using the local dependence index Q₃ and the SAS NLMIXED procedure to detect item interaction under the three-parameter logistic model and the generalized partial credit model. Through simulations, it was found that ignoring…
Descriptors: Models, Item Response Theory, Simulation, Generalization
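Yen's Q₃ is simply the correlation between two items' residuals after the fitted model's expected scores are removed; values far from zero flag locally dependent (interacting) pairs. A minimal sketch:

```python
# Minimal sketch of Yen's Q3 local-dependence index: correlate the
# residuals y_ij - P_ij(theta_i) across item pairs.
import numpy as np

def q3_matrix(y, p):
    """y: (persons x items) 0/1 responses; p: model-implied probabilities."""
    resid = y - p
    return np.corrcoef(resid, rowvar=False)

# Toy usage with fabricated, locally independent data: off-diagonal
# entries should hover near zero.
rng = np.random.default_rng(1)
p = np.full((500, 4), 0.6)
y = rng.binomial(1, p)
print(q3_matrix(y, p).round(2))
```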
Zhang, Bo; Walker, Cindy M. – Applied Psychological Measurement, 2008
The purpose of this research was to examine the effects of missing data on person-model fit and person trait estimation in tests with dichotomous items. Under the missing-completely-at-random framework, four missing data treatment techniques were investigated, including pairwise deletion, coding missing responses as incorrect, hot-deck imputation,…
Descriptors: Item Response Theory, Computation, Goodness of Fit, Test Items
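Two of the named treatments are easy to make concrete on a dichotomous response matrix with NaN marking omitted responses; a hedged sketch on illustrative data, not the study's:

```python
# Sketch of two missing-data treatments on a 0/1 response matrix
# where NaN marks an omitted response.
import numpy as np

y = np.array([[1.0, np.nan, 0.0, 1.0],
              [0.0, 1.0, np.nan, 1.0],
              [1.0, 1.0, 0.0, np.nan]])

# (1) Code missing responses as incorrect.
y_as_incorrect = np.nan_to_num(y, nan=0.0)

# (2) Pairwise deletion: each pairwise statistic uses only the persons
# observed on both items involved.
def pairwise_cov(a, b):
    mask = ~np.isnan(a) & ~np.isnan(b)
    return np.cov(a[mask], b[mask])[0, 1]

print(y_as_incorrect)
print(pairwise_cov(y[:, 0], y[:, 1]))
```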
Cohen, Jon; Chan, Tsze; Jiang, Tao; Seburn, Mary – Applied Psychological Measurement, 2008
U.S. state educational testing programs administer tests to track student progress and hold schools accountable for educational outcomes. Methods from item response theory, especially Rasch models, are usually used to equate different forms of a test. The most popular method for estimating Rasch models yields inconsistent estimates and relies on…
Descriptors: Testing Programs, Educational Testing, Item Response Theory, Computation
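For reference, the Rasch model gives each person one ability θ_i and each item one difficulty b_j:

```latex
P(Y_{ij}=1 \mid \theta_i, b_j) = \frac{\exp(\theta_i - b_j)}{1+\exp(\theta_i - b_j)}.
```

The "most popular method" alluded to is presumably joint maximum likelihood, whose item-parameter estimates are known to be inconsistent when test length is fixed as the examinee sample grows; conditional and marginal maximum likelihood avoid this.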
de la Torre, Jimmy – Applied Psychological Measurement, 2008
Recent work has shown that multidimensionally scoring responses from different tests can provide better ability estimates. For educational assessment data, applications of this approach have been limited to binary scores. Of the different variants, the de la Torre and Patz model is considered more general because implementing the scoring procedure…
Descriptors: Markov Processes, Scoring, Data Analysis, Item Response Theory
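The advantage of multidimensional scoring comes from modeling the abilities measured by different tests as correlated, so responses on one test sharpen the score on another; generically, the ability vector receives a multivariate normal prior with a non-diagonal covariance:

```latex
\boldsymbol{\theta}_i \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma}),
\qquad \Sigma_{dd'} \neq 0 \ \text{for}\ d \neq d'.
```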

