Showing all 15 results
Peer reviewed
Kim, Sohee; Cole, Ki Lynn – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
Peer reviewed
Kim, Kyung Yong; Lim, Euijin; Lee, Won-Chan – International Journal of Testing, 2019
For passage-based tests, items that belong to a common passage often violate the local independence assumption of unidimensional item response theory (UIRT). In this case, ignoring local item dependence (LID) and estimating item parameters using a UIRT model could be problematic because doing so might result in inaccurate parameter estimates,…
Descriptors: Item Response Theory, Equated Scores, Test Items, Models
Peer reviewed
Shin, Jinnie; Gierl, Mark J. – International Journal of Testing, 2022
Over the last five years, tremendous strides have been made in advancing the AIG methodology required to produce items in diverse content areas. However, the one content area where enormous problems remain unsolved is language arts, generally, and reading comprehension, more specifically. While reading comprehension test items can be created using…
Descriptors: Reading Comprehension, Test Construction, Test Items, Natural Language Processing
Peer reviewed
Moon, Jung Aa; Sinharay, Sandip; Keehner, Madeleine; Katz, Irvin R. – International Journal of Testing, 2020
The current study examined the relationship between test-taker cognition and psychometric item properties in multiple-selection multiple-choice and grid items. In a study with content-equivalent mathematics items in alternative item formats, adult participants' tendency to respond to an item was affected by the presence of a grid and variations of…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Test Wiseness, Psychometrics
Peer reviewed
Kim, Sohee; Cole, Ki Lynn; Mwavita, Mwarumba – International Journal of Testing, 2018
This study investigated the effects of linking potentially multidimensional test forms using the fixed item parameter calibration. Forms had equal or unequal total test difficulty with and without confounding difficulty. The mean square errors and bias of estimated item and ability parameters were compared across the various confounding tests. The…
Descriptors: Test Items, Item Response Theory, Test Format, Difficulty Level
Peer reviewed
Kim, Sooyeon; Moses, Tim – International Journal of Testing, 2013
The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…
Descriptors: Scoring, Test Format, Licensing Examinations (Professions), Test Items
Peer reviewed
Moshinsky, Avital; Ziegler, David; Gafni, Naomi – International Journal of Testing, 2017
Many medical schools have adopted multiple mini-interviews (MMI) as an advanced selection tool. MMIs are expensive and used to test only a few dozen candidates per day, making it infeasible to develop a different test version for each test administration. Therefore, some items are reused both within and across years. This study investigated the…
Descriptors: Interviews, Medical Schools, Test Validity, Test Reliability
Peer reviewed
Baghaei, Purya; Aryadoust, Vahid – International Journal of Testing, 2015
Research shows that test method can exert a significant impact on test takers' performance and thereby contaminate test scores. We argue that common test method can exert the same effect as common stimuli and violate the conditional independence assumption of item response theory models because, in general, subsets of items which have a shared…
Descriptors: Test Format, Item Response Theory, Models, Test Items
Peer reviewed
Hess, Brian J.; Johnston, Mary M.; Lipner, Rebecca S. – International Journal of Testing, 2013
Current research on examination response time has focused on tests comprised of traditional multiple-choice items. Consequently, the impact of other innovative or complex item formats on examinee response time is not understood. The present study used multilevel growth modeling to investigate examinee characteristics associated with response time…
Descriptors: Test Items, Test Format, Reaction Time, Individual Characteristics
Peer reviewed
Talento-Miller, Eileen; Guo, Fanmin; Han, Kyung T. – International Journal of Testing, 2013
When power tests include a time limit, it is important to assess the possibility of speededness for examinees. Past research on differential speededness has examined gender and ethnic subgroups in the United States on paper and pencil tests. When considering the needs of a global audience, research regarding different native language speakers is…
Descriptors: Adaptive Testing, Computer Assisted Testing, English, Scores
Peer reviewed
He, Wei; Wolfe, Edward W. – International Journal of Testing, 2010
This article reports the results of a study of potential sources of item nonequivalence between English and Chinese language versions of a cognitive development test for preschool-aged children. Items were flagged for potential nonequivalence through statistical and judgment-based procedures, and the relationship between flag status and item…
Descriptors: Preschool Children, Mandarin Chinese, Cognitive Development, Item Analysis
Peer reviewed
Allalouf, Avi; Rapp, Joel; Stoller, Reuven – International Journal of Testing, 2009
When a test is adapted from a source language (SL) into a target language (TL), the two forms are usually not psychometrically equivalent. If linking between test forms is necessary, those items that have had their psychometric characteristics altered by the translation (differential item functioning [DIF] items) should be eliminated from the…
Descriptors: Test Items, Test Format, Verbal Tests, Psychometrics
Peer reviewed
Papanastasiou, Elena C.; Reckase, Mark D. – International Journal of Testing, 2007
Because of the increased popularity of computerized adaptive testing (CAT), many admissions tests, as well as certification and licensure examinations, have been transformed from their paper-and-pencil versions to computerized adaptive versions. A major difference between paper-and-pencil tests and CAT from an examinee's point of view is that in…
Descriptors: Simulation, Adaptive Testing, Computer Assisted Testing, Test Items
Peer reviewed
Ercikan, Kadriye – International Journal of Testing, 2002
Disentangled sources of differential item functioning (DIF) in a multilanguage assessment for which multiple factors were expected to be causing DIF. Data for the Third International Mathematics and Science study for four countries and two languages (3,000 to 11,000 cases in each comparison group) reveal amounts and sources of DIF. (SLD)
Descriptors: Cross Cultural Studies, English, French, International Studies
Peer reviewed
Osterlind, Steven J.; Miao, Danmin; Sheng, Yanyan; Chia, Rosina C. – International Journal of Testing, 2004
This study investigated the interaction between different cultural groups and item type, and the ensuing effect on construct validity for a psychological inventory, the Myers-Briggs Type Indicator (MBTI, Form G). The authors analyzed 94 items from 2 Chinese-translated versions of the MBTI (Form G) for factorial differences among groups of…
Descriptors: Test Format, Undergraduate Students, Cultural Differences, Test Validity