ERIC - Search Results

Showing 1 to 15 of 20 results

A New Stopping Criterion for Rasch Trees Based on the Mantel-Haenszel Effect Size Measure for Differential Item Functioning

Peer reviewed

Henninger, Mirka; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023

To detect differential item functioning (DIF), Rasch trees search for optimal split-points in covariates and identify subgroups of respondents in a data-driven way. To determine whether and in which covariate a split should be performed, Rasch trees use statistical significance tests. Consequently, Rasch trees are more likely to label small DIF…

Descriptors: Item Response Theory, Test Items, Effect Size, Statistical Significance

Comparing Drift Detection Methods for Accurate Rasch Equating in Different Sample Sizes

Peer reviewed

Alahmadi, Sarah; Jones, Andrew T.; Barry, Carol L.; Ibáñez, Beatriz – Applied Measurement in Education, 2023

Rasch common-item equating is often used in high-stakes testing to maintain equivalent passing standards across test administrations. If unaddressed, item parameter drift poses a major threat to the accuracy of Rasch common-item equating. We compared the performance of well-established and newly developed drift detection methods in small and large…

Descriptors: Equated Scores, Item Response Theory, Sample Size, Test Items

An Investigation of Item Calibration Methods in Multistage Testing

Peer reviewed

Cai, Liuhan; Albano, Anthony D.; Roussos, Louis A. – Measurement: Interdisciplinary Research and Perspectives, 2021

Multistage testing (MST), an adaptive test delivery mode that involves algorithmic selection of predefined item modules rather than individual items, offers a practical alternative to linear and fully computerized adaptive testing. However, interactions across stages between item modules and examinee groups can lead to challenges in item…

Descriptors: Adaptive Testing, Test Items, Item Response Theory, Test Construction

An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models

Peer reviewed

Sen, Sedat; Cohen, Allan S. – Educational and Psychological Measurement, 2024

A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…

Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification

Nonparametric Classification Method for Multiple-Choice Items in Cognitive Diagnosis

Peer reviewed

Wang, Yu; Chiu, Chia-Yi; Köhn, Hans Friedrich – Journal of Educational and Behavioral Statistics, 2023

The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow for collecting richer diagnostic information. The effectiveness and economy of administering MC items may have further contributed to their popularity not just in educational assessment. The MC item format…

Descriptors: Multiple Choice Tests, Nonparametric Statistics, Test Format, Educational Assessment

Investigation of the Effect of Parameter Estimation and Classification Accuracy in Mixture IRT Models under Different Conditions

Peer reviewed

Saatcioglu, Fatima Munevver; Atar, Hakan Yavuz – International Journal of Assessment Tools in Education, 2022

This study aims to examine the effects of mixture item response theory (IRT) models on item parameter estimation and classification accuracy under different conditions. The manipulated variables of the simulation study are set as mixture IRT models (Rasch, 2PL, 3PL); sample size (600, 1000); the number of items (10, 30); the number of latent…

Descriptors: Accuracy, Classification, Item Response Theory, Programming Languages

Investigating the Classification Accuracy of Rasch and Nominal Weights Mean Equating with Very Small Samples

Peer reviewed

Furter, Robert T.; Dwyer, Andrew C. – Applied Measurement in Education, 2020

Maintaining equivalent performance standards across forms is a psychometric challenge exacerbated by small samples. In this study, the accuracy of two equating methods (Rasch anchored calibration and nominal weights mean) and four anchor item selection methods were investigated in the context of very small samples (N = 10). Overall, nominal…

Descriptors: Classification, Accuracy, Item Response Theory, Equated Scores

Evaluating the Effectiveness of the Expectation-Maximization (EM) Algorithm for Bayesian Network Calibration

Tingir, Seyfullah – ProQuest LLC, 2019

Educators use various statistical techniques to explain relationships between latent and observable variables. One way to model these relationships is to use Bayesian networks as a scoring model. However, adjusting the conditional probability tables (CPT-parameters) to fit a set of observations is still a challenge when using Bayesian networks. A…

Descriptors: Bayesian Statistics, Statistical Analysis, Scoring, Probability

The Impact of Different Missing Data Handling Methods on DINA Model

Peer reviewed

Sünbül, Seçil Ömür – International Journal of Evaluation and Research in Education, 2018

This study investigated the impact of different missing data handling methods on DINA model parameter estimation and classification accuracy. Simulated data were generated by manipulating the number of items and sample size. In the generated data, two different missing data mechanisms…

Descriptors: Data, Test Items, Sample Size, Statistical Analysis

Diagnostic Classification Models: Recent Developments, Practical Issues, and Prospects

Peer reviewed

Ravand, Hamdollah; Baghaei, Purya – International Journal of Testing, 2020

More than three decades after their introduction, diagnostic classification models (DCM) do not seem to have been implemented in educational systems for the purposes they were devised. Most DCM research is either methodological for model development and refinement or retrofitting to existing nondiagnostic tests and, in the latter case, basically…

Descriptors: Classification, Models, Diagnostic Tests, Test Construction

Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing

Peer reviewed

Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022

When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…

Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis

Fitting Large Factor Analysis Models with Ordinal Data

Peer reviewed

DiStefano, Christine; McDaniel, Heather L.; Zhang, Liyun; Shi, Dexin; Jiang, Zhehan – Educational and Psychological Measurement, 2019

A simulation study was conducted to investigate the model size effect when confirmatory factor analysis (CFA) models include many ordinal items. CFA models including between 15 and 120 ordinal items were analyzed with mean- and variance-adjusted weighted least squares to determine how varying sample size, number of ordered categories, and…

Descriptors: Factor Analysis, Effect Size, Data, Sample Size

Effect Size Measures for Differential Item Functioning in a Multidimensional IRT Model

Peer reviewed

Suh, Youngsuk – Journal of Educational Measurement, 2016

This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…

Descriptors: Effect Size, Goodness of Fit, Statistical Analysis, Statistical Significance

Effectiveness of Combining Statistical Tests and Effect Sizes When Using Logistic Discriminant Function Regression to Detect Differential Item Functioning for Polytomous Items

Peer reviewed

Gómez-Benito, Juana; Hidalgo, Maria Dolores; Zumbo, Bruno D. – Educational and Psychological Measurement, 2013

The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R² and…

Descriptors: Item Analysis, Test Items, Effect Size, Statistical Analysis

Assessing Dimensionality of Noncompensatory Multidimensional Item Response Theory with Complex Structures

Peer reviewed

Svetina, Dubravka – Educational and Psychological Measurement, 2013

The purpose of this study was to investigate the effect of complex structure on dimensionality assessment in noncompensatory multidimensional item response models using dimensionality assessment procedures based on DETECT (dimensionality evaluation to enumerate contributing traits) and NOHARM (normal ogive harmonic analysis robust method). Five…

Descriptors: Item Response Theory, Statistical Analysis, Computation, Test Length
