ERIC - Search Results

Showing 31 to 45 of 107 results

Effect of Differential Item Functioning on Test Equating

Peer reviewed

Kabasakal, Kübra Atalay; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2015

This study examines the effect of differential item functioning (DIF) items on test equating through multilevel item response models (MIRMs) and traditional IRMs. The performances of three different equating models were investigated under 24 different simulation conditions, and the variables whose effects were examined included sample size, test…

Descriptors: Test Bias, Equated Scores, Item Response Theory, Simulation

A Nonparametric Approach to Estimate Classification Accuracy and Consistency

Peer reviewed

Lathrop, Quinn N.; Cheng, Ying – Journal of Educational Measurement, 2014

When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…

Descriptors: Cutting Scores, Classification, Computation, Nonparametric Statistics

Comparing Three Estimation Methods for the Three-Parameter Logistic IRT Model

Lamsal, Sunil – ProQuest LLC, 2015

Different estimation procedures have been developed for the unidimensional three-parameter item response theory (IRT) model. These techniques include the marginal maximum likelihood estimation, the fully Bayesian estimation using Markov chain Monte Carlo simulation techniques, and the Metropolis-Hastings Robbins-Monro estimation. With each…

Descriptors: Item Response Theory, Monte Carlo Methods, Maximum Likelihood Statistics, Markov Processes

Two Approaches to Estimation of Classification Accuracy Rate under Item Response Theory

Peer reviewed

Lathrop, Quinn N.; Cheng, Ying – Applied Psychological Measurement, 2013

Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores, the other based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…

Descriptors: Item Response Theory, Accuracy, Classification, Computation

The Influence of Item Calibration Error on Variable-Length Computerized Adaptive Testing

Peer reviewed

Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi – Applied Psychological Measurement, 2013

Variable-length computerized adaptive testing (VL-CAT) allows both items and test length to be "tailored" to examinees, thereby achieving the measurement goal (e.g., scoring precision or classification) with as few items as possible. Several popular test termination rules depend on the standard error of the ability estimate, which in turn depends…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Length, Ability

An Assessment of the Nonparametric Approach for Evaluating the Fit of Item Response Models

Peer reviewed

Liang, Tie; Wells, Craig S.; Hambleton, Ronald K. – Journal of Educational Measurement, 2014

As item response theory has been more widely applied, investigating the fit of a parametric model becomes an important part of the measurement process. There is a lack of promising solutions to the detection of model misfit in IRT. Douglas and Cohen introduced a general nonparametric approach, RISE (Root Integrated Squared Error), for detecting…

Descriptors: Item Response Theory, Measurement Techniques, Nonparametric Statistics, Models

Examination of the Parameter Estimate Bias When Violating the Orthogonality Assumption of the Bifactor Model

Zheng, Chunmei – ProQuest LLC, 2013

Educational and psychological constructs are normally measured by multifaceted dimensions. The measured construct is defined and measured by a set of related subdomains. A bifactor model can accurately describe such data with both the measured construct and the related subdomains. However, a limitation of the bifactor model is the orthogonality…

Descriptors: Educational Testing, Measurement Techniques, Test Items, Models

Minimum Sample Size Requirements for Mokken Scale Analysis

Peer reviewed

Straat, J. Hendrik; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2014

An automated item selection procedure in Mokken scale analysis partitions a set of items into one or more Mokken scales, if the data allow. Two algorithms are available that pursue the same goal of selecting Mokken scales of maximum length: Mokken's original automated item selection procedure (AISP) and a genetic algorithm (GA). Minimum…

Descriptors: Sampling, Test Items, Effect Size, Scaling

Deriving Stopping Rules for Multidimensional Computerized Adaptive Testing

Peer reviewed

Wang, Chun; Chang, Hua-Hua; Boughton, Keith A. – Applied Psychological Measurement, 2013

Multidimensional computerized adaptive testing (MCAT) is able to provide a vector of ability estimates for each examinee, which could be used to provide a more informative profile of an examinee's performance. The current literature on MCAT focuses on the fixed-length tests, which can generate less accurate results for those examinees whose…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Length, Item Banks

Comparing Performances (Type I Error and Power) of IRT Likelihood Ratio, SIBTEST, and Mantel-Haenszel Methods in the Determination of Differential Item Functioning

Peer reviewed

Atalay Kabasakal, Kübra; Arsan, Nihan; Gök, Bilge; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2014

This simulation study compared the performances (Type I error and power) of Mantel-Haenszel (MH), SIBTEST, and item response theory-likelihood ratio (IRT-LR) methods under certain conditions. Manipulated factors were sample size, ability differences between groups, test length, the percentage of differential item functioning (DIF), and underlying…

Descriptors: Comparative Analysis, Item Response Theory, Statistical Analysis, Test Bias

The Random-Threshold Generalized Unfolding Model and Its Application of Computerized Adaptive Testing

Peer reviewed

Wang, Wen-Chung; Liu, Chen-Wei; Wu, Shiu-Lien – Applied Psychological Measurement, 2013

The random-threshold generalized unfolding model (RTGUM) was developed by treating the thresholds in the generalized unfolding model as random effects rather than fixed effects to account for the subjective nature of the selection of categories in Likert items. The parameters of the new model can be estimated with the JAGS (Just Another Gibbs…

Descriptors: Computer Assisted Testing, Adaptive Testing, Models, Bayesian Statistics

Comparing the Performance of Five Multidimensional CAT Selection Procedures with Different Stopping Rules

Peer reviewed

Yao, Lihua – Applied Psychological Measurement, 2013

Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection

Equating Multidimensional Tests under a Random Groups Design: A Comparison of Various Equating Procedures

Lee, Eunjung – ProQuest LLC, 2013

The purpose of this research was to compare the equating performance of various equating procedures for the multidimensional tests. To examine the various equating procedures, simulated data sets were used that were generated based on a multidimensional item response theory (MIRT) framework. Various equating procedures were examined, including…

Descriptors: Equated Scores, Tests, Comparative Analysis, Item Response Theory

Cognitive Diagnostic Analysis Using Hierarchically Structured Skills

Su, Yu-Lan – ProQuest LLC, 2013

This dissertation proposes two modified cognitive diagnostic models (CDMs), the deterministic inputs, noisy "and" gate with hierarchy (DINA-H) model and the deterministic inputs, noisy "or" gate with hierarchy (DINO-H) model. Both models incorporate the hierarchical structures of the cognitive skills in the model estimation…

Descriptors: Models, Diagnostic Tests, Cognitive Processes, Thinking Skills

Adjusting the Adjusted χ²/df Ratio Statistic for Dichotomous Item Response Theory Analyses: Does the Model Fit?

Peer reviewed

Tay, Louis; Drasgow, Fritz – Educational and Psychological Measurement, 2012

Two Monte Carlo simulation studies investigated the effectiveness of the mean adjusted χ²/df statistic proposed by Drasgow and colleagues and, because of problems with the method, a new approach for assessing the goodness of fit of an item response theory model was developed. It has been previously recommended that mean adjusted…

Descriptors: Test Length, Monte Carlo Methods, Goodness of Fit, Item Response Theory
