ERIC Number: ED490464
Record Type: Non-Journal
Publication Date: 2004-Jan
Pages: 35
Abstractor: Author
ISBN: N/A
ISSN: N/A
EISSN: N/A
Available Date: N/A
Evaluating 1-, 2- and 3- Parameter Logistic Models Using Model-Based and Empirically-Based Simulations under Homogeneous and Heterogeneous Set Conditions
Rizavi, Saba; Way, Walter D.; Lu, Ying; Pitoniak, Mary; Steffen, Manfred
Online Submission, Paper presented at the Annual Meeting of the National Council on Measurement in Education (Chicago, IL, Apr 10, 2003)
The purpose of this study was to use realistically simulated data to evaluate various computerized adaptive testing (CAT) designs for use with the verbal reasoning measure of the Medical College Admission Test (MCAT). Factors such as item pool depth, content constraints, and item formats often cause an item to be administered repeatedly at ability levels that are not matched to its difficulty. In that case, any model-data misfit might contribute to bias in examinees' final ability estimates, since data generated from the model might not represent the real response patterns of examinees. By incorporating model-data misfit into adaptive testing simulations, this study introduced a simulation methodology in a real-world situation to address this important issue. The CAT simulations that were carried out suggested that measurement precision equivalent to the current paper-and-pencil (P&P) MCAT Verbal Reasoning test could be achieved with a 32-item adaptive test based on the 2-PL or 3-PL models. Although the 2-PL and 3-PL simulations made slightly less uniform use of the item pools, the differences between these models and the 1-PL model were surprisingly small. The results showed that when there was a considerable amount of model-data misfit, the model-based simulations gave misleadingly small conditional standard errors of measurement (CSEMs) at certain ability levels. The empirically-based simulations provided a more reliable way of evaluating a CAT design before its implementation. That a 2-PL CAT can achieve reliability comparable to that of the P&P test at almost half the P&P test length is a very positive finding of the study. The use of a 2-PL instead of a 3-PL model is recommended because of the simplicity of the 2-PL model. (Contains 37 tables.) [This paper was also presented at the Graduate Student Research Panel of the Association of American Medical Colleges (Washington, DC, October 11, 2002). Research finalized in 2004.]
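For readers unfamiliar with the 1-, 2- and 3-PL models named in the abstract, the sketch below shows the standard textbook form of the logistic item response function. It is an illustration only: the record does not state the parameterization the authors used, and the function name `irf`, the parameter names, and the scaling constant `D = 1.7` are assumptions rather than details from the paper.

```python
import math


def irf(theta, b, a=1.0, c=0.0, D=1.7):
    """Probability of a correct response under a logistic IRT model.

    theta: examinee ability
    b:     item difficulty
    a:     item discrimination (fixed across items in the 1-PL)
    c:     lower asymptote, or "guessing" parameter (0 in the 1-PL and 2-PL)
    D:     scaling constant; 1.7 is a common convention, assumed here
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# 1-PL: only difficulty varies across items.
p_1pl = irf(theta=0.5, b=0.0)
# 2-PL: difficulty and discrimination vary.
p_2pl = irf(theta=0.5, b=0.0, a=1.4)
# 3-PL: adds a nonzero lower asymptote.
p_3pl = irf(theta=0.5, b=0.0, a=1.4, c=0.2)
```

Under this form the 2-PL is the 3-PL with `c = 0`, which is one sense in which it is the simpler of the two models.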
Publication Type: Numerical/Quantitative Data; Reports - Research; Speeches/Meeting Papers
Education Level: Higher Education
Audience: N/A
Language: English
Sponsor: Association of American Medical Colleges, Washington, DC.; Educational Testing Service, Princeton, NJ.
Authoring Institution: N/A
Identifiers - Assessments and Surveys: Medical College Admission Test; California Achievement Tests
Grant or Contract Numbers: N/A
Author Affiliations: N/A