ERIC - Search Results

Publication Date

In 2026	0
Since 2025	1
Since 2022 (last 5 years)	2
Since 2017 (last 10 years)	5
Since 2007 (last 20 years)	19

Descriptor

Bayesian Statistics	31
Difficulty Level	31
Test Items	31
Item Response Theory	20
Computation	10
Models	10
Comparative Analysis	9
Maximum Likelihood Statistics	8
Monte Carlo Methods	6
Simulation	6
Statistical Analysis	6
Test Construction	6
Achievement Tests	5
Estimation (Mathematics)	5
Item Analysis	5
Latent Trait Theory	4
Sample Size	4
Statistical Inference	4
Test Bias	4
Ability	3
Accuracy	3
Adaptive Testing	3
Classification	3
Computer Assisted Testing	3
Goodness of Fit	3

Source

Educational and Psychological…	6
Journal of Educational…	3
ProQuest LLC	3
Applied Measurement in…	2
ETS Research Report Series	2
Psychometrika	2
Applied Psychological…	1
Assessment & Evaluation in…	1
Computers & Education	1
International Working Group…	1
Measurement:…	1

Publication Type

Journal Articles	19
Reports - Research	18
Reports - Evaluative	7
Speeches/Meeting Papers	5
Dissertations/Theses -…	3
Reports - Descriptive	2
Collected Works - Proceedings	1
Information Analyses	1
Numerical/Quantitative Data	1

Education Level

Higher Education	4
Postsecondary Education	3
Grade 8	1
Secondary Education	1

Location

Germany (Berlin)	1
Netherlands	1

Assessments and Surveys

California Achievement Tests	1
Graduate Record Examinations	1
Michigan Test of English…	1
Program for International…	1
Trends in International…	1

Showing 1 to 15 of 31 results

Is Effort Moderated Scoring Robust to Multidimensional Rapid Guessing?

Peer reviewed

Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025

To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…

Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory

Test Fraud: Practical Applications and Operational Considerations for the Detection of Item Preknowledge and Compromised Content with Real Data

Ross, Linette P. – ProQuest LLC, 2022

One of the most serious forms of cheating occurs when examinees have item preknowledge and prior access to secure test material before taking an exam for the purpose of obtaining an inflated test score. Examinees that cheat and have prior knowledge of test content before testing may have an unfair advantage over examinees that do not cheat. Item…

Descriptors: Testing, Deception, Cheating, Identification

Bayesian Estimation and Testing of a Linear Logistic Test Model for Learning during the Test

Peer reviewed

Lozano, José H.; Revuelta, Javier – Applied Measurement in Education, 2021

The present study proposes a Bayesian approach for estimating and testing the operation-specific learning model, a variant of the linear logistic test model that allows for the measurement of the learning that occurs during a test as a result of the repeated use of the operations involved in the items. The advantages of using a Bayesian framework…

Descriptors: Bayesian Statistics, Computation, Learning, Testing

Careful with Those Priors: A Note on Bayesian Estimation in Two-Parameter Logistic Item Response Theory Models

Peer reviewed

Marcoulides, Katerina M. – Measurement: Interdisciplinary Research and Perspectives, 2018

This study examined the use of Bayesian analysis methods for the estimation of item parameters in a two-parameter logistic item response theory model. Using simulated data under various design conditions with both informative and non-informative priors, the parameter recovery of Bayesian analysis methods was examined. Overall results showed that…

Descriptors: Bayesian Statistics, Item Response Theory, Probability, Difficulty Level

Multidimensional Classification of Examinees Using the Mixture Random Weights Linear Logistic Test Model

Peer reviewed

Choi, In-Hee; Wilson, Mark – Educational and Psychological Measurement, 2015

An essential feature of the linear logistic test model (LLTM) is that item difficulties are explained using item design properties. By taking advantage of this explanatory aspect of the LLTM, in a mixture extension of the LLTM, the meaning of latent classes is specified by how item properties affect item difficulties within each class. To improve…

Descriptors: Classification, Test Items, Difficulty Level, Statistical Analysis

Reweighting Data in the Spirit of Tukey: Using Bayesian Posterior Probabilities as Rasch Residuals for Studying Misfit

Peer reviewed

Dardick, William R.; Mislevy, Robert J. – Educational and Psychological Measurement, 2016

A new variant of the iterative "data = fit + residual" data-analytical approach described by Mosteller and Tukey is proposed and implemented in the context of item response theory psychometric models. Posterior probabilities from a Bayesian mixture model of a Rasch item response theory model and an unscalable latent class are expressed…

Descriptors: Bayesian Statistics, Probability, Data Analysis, Item Response Theory

Rasch Model Parameter Estimation in the Presence of a Nonnormal Latent Trait Using a Nonparametric Bayesian Approach

Peer reviewed

Finch, Holmes; Edwards, Julianne M. – Educational and Psychological Measurement, 2016

Standard approaches for estimating item response theory (IRT) model parameters generally work under the assumption that the latent trait being measured by a set of items follows the normal distribution. Estimation of IRT parameters in the presence of nonnormal latent traits has been shown to generate biased person and item parameter estimates. A…

Descriptors: Item Response Theory, Computation, Nonparametric Statistics, Bayesian Statistics

Parameter Recovery and Classification Accuracy under Conditions of Testlet Dependency: A Comparison of the Traditional 2PL, Testlet, and Bi-Factor Models

Peer reviewed

Koziol, Natalie A. – Applied Measurement in Education, 2016

Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…

Descriptors: Classification, Accuracy, Comparative Analysis, Models

Optimal Designs for the Rasch Model

Peer reviewed

Grasshoff, Ulrike; Holling, Heinz; Schwabe, Rainer – Psychometrika, 2012

In this paper, optimal designs will be derived for estimating the ability parameters of the Rasch model when difficulty parameters are known. It is well established that a design is locally D-optimal if the ability and difficulty coincide. But locally optimal designs require that the ability parameters to be estimated are known. To attenuate this…

Descriptors: Item Response Theory, Test Items, Psychometrics, Statistical Analysis

The Performance of the Linear Logistic Test Model When the Q-Matrix Is Misspecified: A Simulation Study

MacDonald, George T. – ProQuest LLC, 2014

A simulation study was conducted to explore the performance of the linear logistic test model (LLTM) when the relationships between items and cognitive components were misspecified. Factors manipulated included percent of misspecification (0%, 1%, 5%, 10%, and 15%), form of misspecification (under-specification, balanced misspecification, and…

Descriptors: Simulation, Item Response Theory, Models, Test Items

Item Pool Design for an Operational Variable-Length Computerized Adaptive Test

Peer reviewed

He, Wei; Reckase, Mark D. – Educational and Psychological Measurement, 2014

For computerized adaptive tests (CATs) to work well, they must have an item pool with sufficient numbers of good quality items. Many researchers have pointed out that, in developing item pools for CATs, not only is the item pool size important but also the distribution of item parameters and practical considerations such as content distribution…

Descriptors: Item Banks, Test Length, Computer Assisted Testing, Adaptive Testing

Assessing Scientific Reasoning: A Comprehensive Evaluation of Item Features That Affect Item Difficulty

Peer reviewed

Stiller, Jurik; Hartmann, Stefan; Mathesius, Sabrina; Straube, Philipp; Tiemann, Rüdiger; Nordmeier, Volkhard; Krüger, Dirk; Upmeier zu Belzen, Annette – Assessment & Evaluation in Higher Education, 2016

The aim of this study was to improve the criterion-related test score interpretation of a text-based assessment of scientific reasoning competencies in higher education by evaluating factors which systematically affect item difficulty. To provide evidence about the specific demands which test items of various difficulty make on pre-service…

Descriptors: Logical Thinking, Scientific Concepts, Difficulty Level, Test Items

l[subscript z] Person-Fit Index to Identify Misfit Students with Achievement Test Data

Peer reviewed

Seo, Dong Gi; Weiss, David J. – Educational and Psychological Measurement, 2013

The usefulness of the l[subscript z] person-fit index was investigated with achievement test data from 20 exams given to more than 3,200 college students. Results for three methods of estimating θ showed that the distributions of l[subscript z] were not consistent with its theoretical distribution, resulting in general overfit to the item response…

Descriptors: Achievement Tests, College Students, Goodness of Fit, Item Response Theory

Gender and Minority Achievement Gaps in Science in Eighth Grade: Item Analyses of Nationally Representative Data. Research Report. ETS RR-17-36

Peer reviewed

Qian, Xiaoyu; Nandakumar, Ratna; Glutting, Joseph; Ford, Danielle; Fifield, Steve – ETS Research Report Series, 2017

In this study, we investigated gender and minority achievement gaps on 8th-grade science items employing a multilevel item response methodology. Both gaps were wider on physics and earth science items than on biology and chemistry items. Larger gender gaps were found on items with specific topics favoring male students than other items, for…

Descriptors: Item Analysis, Gender Differences, Achievement Gap, Grade 8

Item Difficulty Estimation: An Auspicious Collaboration between Data and Judgment

Peer reviewed

Wauters, Kelly; Desmet, Piet; Van Den Noortgate, Wim – Computers & Education, 2012

The evolution from static to dynamic electronic learning environments has stimulated the research on adaptive item sequencing. A prerequisite for adaptive item sequencing, in which the difficulty of the item is constantly matched to the ability level of the learner, is to have items with a known difficulty level. The difficulty level can be…

Descriptors: Expertise, Electronic Learning, Feedback (Response), Sample Size

Author

Mislevy, Robert J.	2
Revuelta, Javier	2
Abdel-fattah, Abdel-fattah A.	1
Bejar, Isaac I.	1
Calders, Toon	1
Choi, In-Hee	1
Conati, Cristina	1
Dardick, William R.	1
De Ayala, R. J.	1
De Boeck, Paul	1
Desmet, Piet	1
Edwards, Julianne M.	1
Fifield, Steve	1
Finch, Holmes	1
Ford, Danielle	1
Frederickx, Sofie	1
Glutting, Joseph	1
Grasshoff, Ulrike	1
Haladyna, Tom	1
Hartmann, Stefan	1
He, Wei	1
Holling, Heinz	1
Hsu, Tse-Chi	1
Jiayi Deng	1