ERIC - Search Results

Showing all 6 results

Confirming Testlet Effects

Peer reviewed

DeMars, Christine E. – Applied Psychological Measurement, 2012

A testlet is a cluster of items that share a common passage, scenario, or other context. These items might measure something in common beyond the trait measured by the test as a whole; if so, the model for the item responses should allow for this testlet trait. But modeling testlet effects that are negligible makes the model unnecessarily…

Descriptors: Test Items, Item Response Theory, Comparative Analysis, Models

The Performance of IRT Model Selection Methods with Mixed-Format Tests

Peer reviewed

Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G. – Applied Psychological Measurement, 2012

When tests consist of multiple-choice and constructed-response items, researchers are confronted with the question of which item response theory (IRT) model combination will appropriately represent the data collected from these mixed-format tests. This simulation study examined the performance of six model selection criteria, including the…

Descriptors: Item Response Theory, Models, Selection, Criteria

Estimating a Noncompensatory IRT Model Using Metropolis within Gibbs Sampling

Peer reviewed

Babcock, Ben – Applied Psychological Measurement, 2011

Relatively little research has been conducted with the noncompensatory class of multidimensional item response theory (MIRT) models. A Monte Carlo simulation study was conducted exploring the estimation of a two-parameter noncompensatory item response theory (IRT) model. The estimation method used was a Metropolis-Hastings within Gibbs algorithm…

Descriptors: Item Response Theory, Sampling, Computation, Statistical Analysis

On the Analysis of Fraction Subtraction Data: The DINA Model, Classification, Latent Class Sizes, and the Q-Matrix

Peer reviewed

DeCarlo, Lawrence T. – Applied Psychological Measurement, 2011

Cognitive diagnostic models (CDMs) attempt to uncover latent skills or attributes that examinees must possess in order to answer test items correctly. The DINA (deterministic input, noisy "and") model is a popular CDM that has been widely used. It is shown here that a logistic version of the model can easily be fit with standard software for…

Descriptors: Bayesian Statistics, Computation, Cognitive Tests, Diagnostic Tests

Variations on Stochastic Curtailment in Sequential Mastery Testing

Peer reviewed

Finkelman, Matthew David – Applied Psychological Measurement, 2010

In sequential mastery testing (SMT), assessment via computer is used to classify examinees into one of two mutually exclusive categories. Unlike paper-and-pencil tests, SMT has the capability to use variable-length stopping rules. One approach to shortening variable-length tests is stochastic curtailment, which halts examination if the probability…

Descriptors: Mastery Tests, Computer Assisted Testing, Adaptive Testing, Test Length

Tolerance Intervals: Alternatives to Credibility Intervals in Validity Generalization Research

Peer reviewed

Millsap, Roger E. – Applied Psychological Measurement, 1988

Two new methods for constructing a credibility interval (CI), an interval containing a specified proportion of the true validity distribution, are discussed from a frequentist perspective. Tolerance intervals, unlike the current method of constructing the CI, have performance characteristics across repeated applications and may be useful in validity…

Descriptors: Bayesian Statistics, Meta Analysis, Statistical Analysis, Test Reliability