ERIC - Search Results

Showing 1 to 15 of 80 results

Effects of the Quantity and Magnitude of Cross-Loading and Model Specification on MIRT Item Parameter Recovery

Peer reviewed

Mostafa Hosseinzadeh; Ki Lynn Matlock Cole – Educational and Psychological Measurement, 2024

In real-world situations, multidimensional data may appear on large-scale tests or psychological surveys. The purpose of this study was to investigate the effects of the quantity and magnitude of cross-loadings and model specification on item parameter recovery in multidimensional Item Response Theory (MIRT) models, especially when the model was…

Descriptors: Item Response Theory, Models, Maximum Likelihood Statistics, Algorithms

Online Calibration in Multidimensional Computerized Adaptive Testing with Polytomously Scored Items

Peer reviewed

Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023

Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…

Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items

Identifying Problematic Item Characteristics with Small Samples Using Mokken Scale Analysis

Peer reviewed

Wind, Stefanie A. – Educational and Psychological Measurement, 2022

Researchers frequently use Mokken scale analysis (MSA), which is a nonparametric approach to item response theory, when they have relatively small samples of examinees. Researchers have provided some guidance regarding the minimum sample size for applications of MSA under various conditions. However, these studies have not focused on item-level…

Descriptors: Nonparametric Statistics, Item Response Theory, Sample Size, Test Items

There Are Many Greater Lower Bounds than Cronbach's [alpha]: A Monte Carlo Simulation Study

Peer reviewed

Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023

A Monte Carlo simulation study was conducted to examine the performance of [alpha], [lambda]2, [lambda][subscript 4], [lambda][subscript 2], [omega][subscript T], GLB[subscript MRFA], and GLB[subscript Algebraic] coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…

Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation

A New Stopping Criterion for Rasch Trees Based on the Mantel-Haenszel Effect Size Measure for Differential Item Functioning

Peer reviewed

Henninger, Mirka; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023

To detect differential item functioning (DIF), Rasch trees search for optimal split-points in covariates and identify subgroups of respondents in a data-driven way. To determine whether and in which covariate a split should be performed, Rasch trees use statistical significance tests. Consequently, Rasch trees are more likely to label small DIF…

Descriptors: Item Response Theory, Test Items, Effect Size, Statistical Significance

Investigating Confidence Intervals of Item Parameters When Some Item Parameters Take Priors in the 2PL and 3PL Models

Peer reviewed

Paek, Insu; Lin, Zhongtian; Chalmers, Robert Philip – Educational and Psychological Measurement, 2023

To reduce the chance of Heywood cases or nonconvergence in estimating the 2PL or the 3PL model in the marginal maximum likelihood with the expectation-maximization (MML-EM) estimation method, priors for the item slope parameter in the 2PL model or for the pseudo-guessing parameter in the 3PL model can be used and the marginal maximum a posteriori…

Descriptors: Models, Item Response Theory, Test Items, Intervals

An Investigation of Item Calibration Methods in Multistage Testing

Peer reviewed

Cai, Liuhan; Albano, Anthony D.; Roussos, Louis A. – Measurement: Interdisciplinary Research and Perspectives, 2021

Multistage testing (MST), an adaptive test delivery mode that involves algorithmic selection of predefined item modules rather than individual items, offers a practical alternative to linear and fully computerized adaptive testing. However, interactions across stages between item modules and examinee groups can lead to challenges in item…

Descriptors: Adaptive Testing, Test Items, Item Response Theory, Test Construction

The Study of the Effect of Item Parameter Drift on Ability Estimation Obtained from Adaptive Testing under Different Conditions

Peer reviewed
PDF on ERIC

Sahin Kursad, Merve; Cokluk Bokeoglu, Omay; Cikrikci, Rahime Nukhet – International Journal of Assessment Tools in Education, 2022

Item parameter drift (IPD) is the systematic change in item parameter values over time for various reasons. If it occurs in computer adaptive tests (CAT), it causes errors in the estimation of item and ability parameters. Identification of the underlying conditions of this situation in CAT is important for estimating item and…

Descriptors: Item Analysis, Computer Assisted Testing, Test Items, Error of Measurement

The Recovery of Correlation between Latent Abilities Using Compensatory and Noncompensatory Multidimensional IRT Models

Peer reviewed

Fu, Yanyan; Strachan, Tyler; Ip, Edward H.; Willse, John T.; Chen, Shyh-Huei; Ackerman, Terry – International Journal of Testing, 2020

This research examined correlation estimates between latent abilities when using the two-dimensional and three-dimensional compensatory and noncompensatory item response theory models. Simulation study results showed that the recovery of the latent correlation was best when the test contained 100% of simple structure items for all models and…

Descriptors: Item Response Theory, Models, Test Items, Simulation

Examining of Internal Consistency Coefficients in Mixed-Format Tests in Different Simulation Conditions

Peer reviewed
PDF on ERIC

Gurdil Ege, Hatice; Demir, Ergul – Eurasian Journal of Educational Research, 2020

Purpose: The present study aims to evaluate how the reliabilities computed using [alpha], Stratified [alpha], Angoff-Feldt, and Feldt-Raju estimators may differ when sample size (500, 1000, and 2000) and item type ratio of dichotomous to polytomous items (2:1, 1:1, 1:2) included in the scale are varied. Research Methods: In this study, Cronbach's [alpha],…

Descriptors: Test Format, Simulation, Test Reliability, Sample Size

Can Auxiliary Information Improve Rasch Estimation at Small Sample Sizes?

Derek Sauder – ProQuest LLC, 2020

The Rasch model is commonly used to calibrate multiple choice items. However, the sample sizes needed to estimate the Rasch model can be difficult to attain (e.g., consider a small testing company trying to pretest new items). With small sample sizes, auxiliary information besides the item responses may improve estimation of the item parameters.…

Descriptors: Item Response Theory, Sample Size, Computation, Test Length

Closed Formula of Test Length Required for Adaptive Testing with Medium Probability of Solution

Peer reviewed

Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023

Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…

Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level

A Regression Discontinuity Design Framework for Controlling Selection Bias in Evaluations of Differential Item Functioning

Peer reviewed

Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022

Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…

Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations

Investigation of the Effect of Parameter Estimation and Classification Accuracy in Mixture IRT Models under Different Conditions

Peer reviewed
PDF on ERIC

Saatcioglu, Fatima Munevver; Atar, Hakan Yavuz – International Journal of Assessment Tools in Education, 2022

This study aims to examine the effects of mixture item response theory (IRT) models on item parameter estimation and classification accuracy under different conditions. The manipulated variables of the simulation study are set as mixture IRT models (Rasch, 2PL, 3PL); sample size (600, 1000); the number of items (10, 30); the number of latent…

Descriptors: Accuracy, Classification, Item Response Theory, Programming Languages

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed
PDF on ERIC

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
