NotesFAQContact Us
Collection
Advanced
Search Tips
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 76 to 90 of 172 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Seo, Dong Gi; Weiss, David J. – Educational and Psychological Measurement, 2013
The usefulness of the l[subscript z] person-fit index was investigated with achievement test data from 20 exams given to more than 3,200 college students. Results for three methods of estimating ? showed that the distributions of l[subscript z] were not consistent with its theoretical distribution, resulting in general overfit to the item response…
Descriptors: Achievement Tests, College Students, Goodness of Fit, Item Response Theory
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Qian, Xiaoyu; Nandakumar, Ratna; Glutting, Joseoph; Ford, Danielle; Fifield, Steve – ETS Research Report Series, 2017
In this study, we investigated gender and minority achievement gaps on 8th-grade science items employing a multilevel item response methodology. Both gaps were wider on physics and earth science items than on biology and chemistry items. Larger gender gaps were found on items with specific topics favoring male students than other items, for…
Descriptors: Item Analysis, Gender Differences, Achievement Gap, Grade 8
Md Desa, Zairul Nor Deana – ProQuest LLC, 2012
In recent years, there has been increasing interest in estimating and improving subscore reliability. In this study, the multidimensional item response theory (MIRT) and the bi-factor model were combined to estimate subscores, to obtain subscores reliability, and subscores classification. Both the compensatory and partially compensatory MIRT…
Descriptors: Item Response Theory, Computation, Reliability, Classification
Peer reviewed Peer reviewed
Direct linkDirect link
Wauters, Kelly; Desmet, Piet; Van Den Noortgate, Wim – Computers & Education, 2012
The evolution from static to dynamic electronic learning environments has stimulated the research on adaptive item sequencing. A prerequisite for adaptive item sequencing, in which the difficulty of the item is constantly matched to the ability level of the learner, is to have items with a known difficulty level. The difficulty level can be…
Descriptors: Expertise, Electronic Learning, Feedback (Response), Sample Size
Peer reviewed Peer reviewed
Direct linkDirect link
Lee, Jihyun; Corter, James E. – Applied Psychological Measurement, 2011
Diagnosis of misconceptions or "bugs" in procedural skills is difficult because of their unstable nature. This study addresses this problem by proposing and evaluating a probability-based approach to the diagnosis of bugs in children's multicolumn subtraction performance using Bayesian networks. This approach assumes a causal network relating…
Descriptors: Misconceptions, Probability, Children, Subtraction
Peer reviewed Peer reviewed
Direct linkDirect link
He, Wei; Wolfe, Edward W. – Educational and Psychological Measurement, 2012
In administration of individually administered intelligence tests, items are commonly presented in a sequence of increasing difficulty, and test administration is terminated after a predetermined number of incorrect answers. This practice produces stochastically censored data, a form of nonignorable missing data. By manipulating four factors…
Descriptors: Individual Testing, Intelligence Tests, Test Items, Test Length
Peer reviewed Peer reviewed
Direct linkDirect link
Babcock, Ben – Applied Psychological Measurement, 2011
Relatively little research has been conducted with the noncompensatory class of multidimensional item response theory (MIRT) models. A Monte Carlo simulation study was conducted exploring the estimation of a two-parameter noncompensatory item response theory (IRT) model. The estimation method used was a Metropolis-Hastings within Gibbs algorithm…
Descriptors: Item Response Theory, Sampling, Computation, Statistical Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Maguire, Angela M.; Humphreys, Michael S.; Dennis, Simon; Lee, Michael D. – Journal of Memory and Language, 2010
This paper addresses two Global Matching predictions in embedded-category designs: the within-category choice advantage in forced-choice recognition (superior discrimination for test choices comprising a same-category distractor); and the category length effect in forced-choice and old/new recognition (a loss in discriminability with increases in…
Descriptors: Bayesian Statistics, Models, Prediction, Classification
Peer reviewed Peer reviewed
Direct linkDirect link
Frederickx, Sofie; Tuerlinckx, Francis; De Boeck, Paul; Magis, David – Journal of Educational Measurement, 2010
In this paper we present a new methodology for detecting differential item functioning (DIF). We introduce a DIF model, called the random item mixture (RIM), that is based on a Rasch model with random item difficulties (besides the common random person abilities). In addition, a mixture model is assumed for the item difficulties such that the…
Descriptors: Test Bias, Models, Test Items, Difficulty Level
Peer reviewed Peer reviewed
Direct linkDirect link
Huang, Hung-Yu; Wang, Wen-Chung – Educational and Psychological Measurement, 2013
Both testlet design and hierarchical latent traits are fairly common in educational and psychological measurements. This study aimed to develop a new class of higher order testlet response models that consider both local item dependence within testlets and a hierarchy of latent traits. Due to high dimensionality, the authors adopted the Bayesian…
Descriptors: Item Response Theory, Models, Bayesian Statistics, Computation
Peer reviewed Peer reviewed
Direct linkDirect link
Wu, Huey-Min; Kuo, Bor-Chen; Yang, Jinn-Min – Educational Technology & Society, 2012
In recent years, many computerized test systems have been developed for diagnosing students' learning profiles. Nevertheless, it remains a challenging issue to find an adaptive testing algorithm to both shorten testing time and precisely diagnose the knowledge status of students. In order to find a suitable algorithm, four adaptive testing…
Descriptors: Adaptive Testing, Test Items, Computer Assisted Testing, Mathematics
Peer reviewed Peer reviewed
Direct linkDirect link
Ip, Edward H. – Applied Psychological Measurement, 2010
The testlet response model is designed for handling items that are clustered, such as those embedded within the same reading passage. Although the testlet is a powerful tool for handling item clusters in educational and psychological testing, the interpretations of its item parameters, the conditional correlation between item pairs, and the…
Descriptors: Item Response Theory, Models, Test Items, Correlation
Peer reviewed Peer reviewed
Direct linkDirect link
Hooker, Giles; Finkelman, Matthew – Psychometrika, 2010
Hooker, Finkelman, and Schwartzman ("Psychometrika," 2009, in press) defined a paradoxical result as the attainment of a higher test score by changing answers from correct to incorrect and demonstrated that such results are unavoidable for maximum likelihood estimates in multidimensional item response theory. The potential for these results to…
Descriptors: Models, Scores, Item Response Theory, Psychometrics
Kim, Hyun Seok John – ProQuest LLC, 2011
Cognitive diagnostic assessment (CDA) is a new theoretical framework for psychological and educational testing that is designed to provide detailed information about examinees' strengths and weaknesses in specific knowledge structures and processing skills. During the last three decades, more than a dozen psychometric models have been developed…
Descriptors: Cognitive Measurement, Diagnostic Tests, Bayesian Statistics, Statistical Inference
Peer reviewed Peer reviewed
Direct linkDirect link
DeCarlo, Lawrence T. – Applied Psychological Measurement, 2011
Cognitive diagnostic models (CDMs) attempt to uncover latent skills or attributes that examinees must possess in order to answer test items correctly. The DINA (deterministic input, noisy "and") model is a popular CDM that has been widely used. It is shown here that a logistic version of the model can easily be fit with standard software for…
Descriptors: Bayesian Statistics, Computation, Cognitive Tests, Diagnostic Tests
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  12