Showing 1 to 15 of 21 results
Peer reviewed
Tang, Xiaodan; Karabatsos, George; Chen, Haiqin – Applied Measurement in Education, 2020
In applications of item response theory (IRT) models, it is known that empirical violations of the local independence (LI) assumption can significantly bias parameter estimates. To address this issue, we propose a threshold-autoregressive item response theory (TAR-IRT) model that additionally accounts for order dependence among the item responses…
Descriptors: Item Response Theory, Test Items, Models, Computation
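A standard empirical check for the local independence (LI) assumption mentioned above is Yen's Q3 statistic, which correlates item-pair residuals after fitting a conventional IRT model; large off-diagonal values flag possible local dependence. This is a generic diagnostic, not the TAR-IRT model the article proposes. A minimal sketch under a Rasch model with simulated, hypothetical data:

```python
import numpy as np

def rasch_prob(theta, b):
    """P(correct) under the Rasch model: 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def q3_matrix(responses, theta, b):
    """Yen's Q3: correlations of item residuals (observed - expected).

    responses: (n_persons, n_items) 0/1 matrix
    theta: (n_persons,) ability estimates
    b: (n_items,) item difficulty estimates
    Off-diagonal values far from 0 suggest local dependence.
    """
    expected = rasch_prob(theta[:, None], b[None, :])
    residuals = responses - expected
    return np.corrcoef(residuals, rowvar=False)

# Hypothetical example: 500 simulated examinees, 4 locally independent items
rng = np.random.default_rng(0)
theta = rng.normal(size=500)
b = np.array([-1.0, 0.0, 0.5, 1.0])
resp = (rng.random((500, 4)) < rasch_prob(theta[:, None], b[None, :])).astype(float)
q3 = q3_matrix(resp, theta, b)
# Off-diagonal entries near 0 are consistent with local independence.
```

Because the data here are generated under LI, the off-diagonal Q3 values stay close to zero; order-dependent responses of the kind the TAR-IRT model targets would inflate them.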
Peer reviewed
Liu, Chunyan; Jurich, Daniel; Morrison, Carol; Grabovsky, Irina – Applied Measurement in Education, 2021
The existence of outliers in the anchor items can be detrimental to the estimation of examinee ability and undermine the validity of score interpretation across forms. However, in practice, anchor item performance can become distorted for various reasons. This study compares the performance of modified "INFIT" and "OUTFIT"…
Descriptors: Equated Scores, Test Items, Item Response Theory, Difficulty Level
Peer reviewed
Lozano, José H.; Revuelta, Javier – Applied Measurement in Education, 2021
The present study proposes a Bayesian approach for estimating and testing the operation-specific learning model, a variant of the linear logistic test model that allows for the measurement of the learning that occurs during a test as a result of the repeated use of the operations involved in the items. The advantages of using a Bayesian framework…
Descriptors: Bayesian Statistics, Computation, Learning, Testing
Peer reviewed
Bjermo, Jonas; Miller, Frank – Applied Measurement in Education, 2021
In recent years, interest in measuring growth in student ability in various subjects between different grades in school has increased, making good precision in the estimated growth important. This paper compares estimation methods and test designs with respect to the precision and bias of the estimated growth of mean ability…
Descriptors: Scaling, Ability, Computation, Test Items
Peer reviewed
Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020
The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…
Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level
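Under the random groups design assumed in the study above, the simplest equating method is mean equating, which maps a Form X score to the Form Y scale by shifting it by the difference in form means. A minimal sketch with hypothetical scores (the article itself evaluates a range of equating methods for subtests):

```python
import numpy as np

def mean_equate(x_scores, y_scores):
    """Mean equating under the random groups design:
    a Form X raw score x is equated to Form Y via
    e_Y(x) = x + (mean(Y) - mean(X))."""
    shift = np.mean(y_scores) - np.mean(x_scores)
    return lambda x: x + shift

# Hypothetical example: Form Y is 2 points harder on average
form_x = np.array([30, 35, 40, 45, 50], dtype=float)
form_y = np.array([28, 33, 38, 43, 48], dtype=float)
equate = mean_equate(form_x, form_y)
print(equate(40.0))  # a Form X score of 40 maps to 38.0 on the Form Y scale
```

Short subtests make the form means noisier, which is one reason subscore equating is harder than total-score equating.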
Peer reviewed
Ferrara, Steve; Steedle, Jeffrey T.; Frantz, Roger S. – Applied Measurement in Education, 2022
Item difficulty modeling studies involve (a) hypothesizing item features, or item response demands, that are likely to predict item difficulty with some degree of accuracy; and (b) entering the features as independent variables into a regression equation or other statistical model to predict difficulty. In this review, we report findings from 13…
Descriptors: Reading Comprehension, Reading Tests, Test Items, Item Response Theory
Peer reviewed
Pham, Duy N.; Wells, Craig S.; Bauer, Malcolm I.; Wylie, E. Caroline; Monroe, Scott – Applied Measurement in Education, 2021
Assessments built on a theory of learning progressions are promising formative tools to support learning and teaching. The quality and usefulness of those assessments depend, in large part, on the validity of the theory-informed inferences about student learning made from the assessment results. In this study, we introduced an approach to address…
Descriptors: Formative Evaluation, Mathematics Instruction, Mathematics Achievement, Middle School Students
Peer reviewed
Guo, Hongwen; Rios, Joseph A.; Haberman, Shelby; Liu, Ou Lydia; Wang, Jing; Paek, Insu – Applied Measurement in Education, 2016
Rapid guessing by unmotivated test takers can negatively affect validity studies and evaluations of teacher and institution performance, making it critical to identify these test takers. The authors propose a new nonparametric method for finding response-time thresholds for flagging item responses that result from rapid-guessing…
Descriptors: Guessing (Tests), Reaction Time, Nonparametric Statistics, Models
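The general idea of response-time thresholds can be illustrated with a widely used baseline: the normative threshold method, which flags any response faster than a fixed fraction (commonly 10%) of the item's median response time. This is a sketch of that baseline, not the nonparametric method Guo et al. propose, and all numbers below are hypothetical:

```python
import numpy as np

def flag_rapid_guesses(times, fraction=0.10):
    """Flag responses faster than a normative per-item threshold.

    times: (n_persons, n_items) response times in seconds.
    The threshold is `fraction` of each item's median response time --
    a common baseline method, not the method proposed in the article.
    """
    thresholds = fraction * np.median(times, axis=0)  # one threshold per item
    return times < thresholds  # boolean flags, same shape as times

# Hypothetical example: 4 examinees, 3 items
times = np.array([
    [12.0, 20.0, 30.0],
    [ 0.5, 18.0, 28.0],   # first response suspiciously fast
    [11.0,  1.0, 31.0],   # second response suspiciously fast
    [13.0, 22.0, 29.0],
])
flags = flag_rapid_guesses(times)  # flags the 0.5 s and 1.0 s responses
```

Flagged responses are then typically removed or down-weighted before validity or performance analyses.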
Peer reviewed
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
Peer reviewed
Finch, Holmes; French, Brian F. – Applied Measurement in Education, 2019
The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard 3 parameter logistic model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors impact…
Descriptors: Item Response Theory, Accuracy, Test Items, Difficulty Level
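The 3PL model named in the abstract has a standard closed form combining the three item parameters and the person ability parameter. A minimal sketch with illustrative parameter values:

```python
import math

def p_3pl(theta, a, b, c):
    """Three-parameter logistic (3PL) probability of a correct response:
    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    theta: person ability; a: discrimination; b: difficulty;
    c: pseudo-chance (lower asymptote)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta == b, the probability sits halfway between c and 1
print(p_3pl(theta=0.0, a=1.2, b=0.0, c=0.2))  # ≈ 0.6
```

Estimation accuracy for a, b, and c depends on factors such as sample size and the match between the ability distribution and item difficulties, which is the kind of interplay the study examines.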
Peer reviewed
Nijlen, Daniel Van; Janssen, Rianne – Applied Measurement in Education, 2015
This study investigates the extent to which contextualized and non-contextualized mathematics test items have a differential impact on examinee effort. Mixture item response theory (IRT) models are applied to two subsets of items from a national assessment of mathematics in the second grade of the pre-vocational track in secondary education in…
Descriptors: Mathematics Tests, Measurement, Item Response Theory, Test Items
Peer reviewed
Wyse, Adam E. – Applied Measurement in Education, 2018
This article discusses regression effects that are commonly observed in Angoff ratings, where panelists tend to judge hard items as easier, and easy items as harder, than their estimated item difficulties indicate. Analyses of data from two credentialing exams illustrate these regression effects and the…
Descriptors: Regression (Statistics), Test Items, Difficulty Level, Licensing Examinations (Professions)
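The regression effect described above can be seen by regressing panelist ratings on empirical item difficulties: a slope below 1 means ratings are pulled toward the mean, so hard items are over-rated and easy items under-rated. A minimal simulation with hypothetical numbers (not the credentialing-exam data from the article):

```python
import numpy as np

rng = np.random.default_rng(42)

# Empirical item difficulties as p-values (proportion correct), 50 items
p_values = rng.uniform(0.2, 0.9, size=50)

# Hypothetical panel whose ratings regress halfway toward the overall
# mean difficulty, plus rating noise -- a stylized regression effect
ratings = 0.5 * p_values + 0.5 * p_values.mean() + rng.normal(0, 0.03, size=50)

slope, intercept = np.polyfit(p_values, ratings, deg=1)
# slope < 1 is the signature of the regression effect:
# hard items (low p) get ratings above their difficulty,
# easy items (high p) get ratings below it
```

In this stylized setup the fitted slope lands near 0.5, well below the slope of 1 that perfectly calibrated ratings would produce.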
Peer reviewed
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory
Peer reviewed
Koziol, Natalie A. – Applied Measurement in Education, 2016
Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…
Descriptors: Classification, Accuracy, Comparative Analysis, Models
Peer reviewed
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
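One common way to quantify the sampling variability behind a standard error of equating is to bootstrap examinees and recompute the equating function on each resample. The sketch below uses simple mean equating under a random groups design with hypothetical scores; the article concerns common-item designs, where the resampled quantity is the common-item parameter estimates rather than a mean difference:

```python
import numpy as np

def bootstrap_se_mean_equating(x_scores, y_scores, n_boot=2000, seed=0):
    """Bootstrap standard error of the mean-equating shift
    (mean(Y) - mean(X)), resampling examinees with replacement."""
    rng = np.random.default_rng(seed)
    shifts = np.empty(n_boot)
    for i in range(n_boot):
        xb = rng.choice(x_scores, size=len(x_scores), replace=True)
        yb = rng.choice(y_scores, size=len(y_scores), replace=True)
        shifts[i] = yb.mean() - xb.mean()
    return shifts.std(ddof=1)

# Hypothetical score samples for two forms, 200 examinees each
rng = np.random.default_rng(1)
form_x = rng.normal(40, 8, size=200)
form_y = rng.normal(38, 8, size=200)
se = bootstrap_se_mean_equating(form_x, form_y)
# Analytic check: SE ≈ sqrt(8**2/200 + 8**2/200) = 0.8
```

The bootstrap estimate tracks the analytic standard error here; treating common items as fixed, as the abstract notes, leaves this examinee-sampling component as the only source of variability considered.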