Showing 91 to 105 of 139 results
Peer reviewed
Finch, Holmes – Applied Psychological Measurement, 2010
The accuracy of item parameter estimates in the multidimensional item response theory (MIRT) context is an area that has not been researched in great detail. This study examines the ability of two confirmatory factor analysis models specifically for dichotomous data to properly estimate item parameters using common formulae for converting factor…
Descriptors: Item Response Theory, Computation, Factor Analysis, Models
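The "common formulae for converting factor" parameters that the Finch (2010) abstract trails off on are presumably the standard relations between the factor-analytic parameters of a dichotomous item (loading \lambda_j, threshold \tau_j) and normal-ogive IRT parameters; as a hedged sketch of that conversion:
\[ a_j = \frac{\lambda_j}{\sqrt{1 - \lambda_j^{2}}}, \qquad b_j = \frac{\tau_j}{\lambda_j}, \]
so that stronger loadings map to higher discriminations, and the threshold, rescaled by the loading, plays the role of difficulty.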
Wei, Youhua – ProQuest LLC, 2008
Scale linking is the process of developing the connection between scales of two or more sets of parameter estimates obtained from separate test calibrations. It is the prerequisite for many applications of IRT, such as test equating and differential item functioning analysis. Unidimensional scale linking methods have been studied and applied…
Descriptors: Test Length, Test Items, Sample Size, Simulation
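For context on the unidimensional scale linking methods the Wei (2008) abstract refers to: the standard linear linking step places one calibration on the scale of another via
\[ \theta^{*} = A\theta + B, \qquad a^{*} = \frac{a}{A}, \qquad b^{*} = Ab + B, \]
where the moment methods (mean/mean, mean/sigma) and the characteristic-curve methods (Haebara, Stocking-Lord) differ only in how the constants A and B are estimated.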
Peer reviewed
Woods, Carol M. – Applied Psychological Measurement, 2008
In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters of a unidimensional item response model using marginal maximum likelihood estimation. This study evaluates RC-IRT for the three-parameter logistic (3PL) model with comparisons to the normal model and to the empirical…
Descriptors: Test Length, Computation, Item Response Theory, Maximum Likelihood Statistics
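The three-parameter logistic (3PL) model evaluated in the Woods (2008) study has the standard form
\[ P_i(\theta) = c_i + \frac{1 - c_i}{1 + \exp\{-D a_i(\theta - b_i)\}}, \]
with discrimination a_i, difficulty b_i, lower asymptote (pseudo-guessing) c_i, and scaling constant D (1.7 or 1.0).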
Peer reviewed
Glas, Cees A. W.; Pimentel, Jonald L. – Educational and Psychological Measurement, 2008
In tests with time limits, items at the end are often not reached. Usually, the pattern of missing responses depends on the ability level of the respondents; therefore, missing data are not ignorable in statistical inference. This study models data using a combination of two item response theory (IRT) models: one for the observed response data and…
Descriptors: Intelligence Tests, Statistical Inference, Item Response Theory, Modeling (Psychology)
Peer reviewed
Wells, Craig S.; Bolt, Daniel M. – Applied Measurement in Education, 2008
Tests of model misfit are often performed to validate the use of a particular model in item response theory. Douglas and Cohen (2001) introduced a general nonparametric approach for detecting misfit under the two-parameter logistic model. However, the statistical properties of their approach, and empirical comparisons to other methods, have not…
Descriptors: Test Length, Test Items, Monte Carlo Methods, Nonparametric Statistics
Peer reviewed
Woods, Carol M. – Applied Psychological Measurement, 2007
Ramsay curve item response theory (RC-IRT) was recently developed to detect and correct for nonnormal latent variables when unidimensional IRT models are fitted to data using maximum marginal likelihood estimation. The purpose of this research is to evaluate the performance of RC-IRT for Likert-type item responses with varying test lengths, sample…
Descriptors: Test Length, Item Response Theory, Sample Size, Comparative Analysis
Peer reviewed
De Champlain, Andre; Gessaroli, Marc E. – Applied Measurement in Education, 1998
Type I error rates and rejection rates for three dimensionality-assessment procedures were studied with data sets simulated to reflect short tests and small samples. Results show that the G-squared difference test (D. Bock, R. Gibbons, and E. Muraki, 1988) suffered from a severely inflated Type I error rate under all conditions simulated. (SLD)
Descriptors: Item Response Theory, Matrices, Sample Size, Simulation
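The G-squared difference test referred to in the De Champlain and Gessaroli (1998) abstract is, in general form, a likelihood-ratio comparison of nested full-information factor models, e.g., one versus two dimensions:
\[ \Delta G^{2} = G^{2}_{(1)} - G^{2}_{(2)} \;\sim\; \chi^{2}_{\,df_{(1)} - df_{(2)}}, \]
with a large value leading to rejection of the unidimensional model; the reported finding is that this test rejects far too often under the simulated short-test, small-sample conditions.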
Peer reviewed
Monahan, Patrick O.; Stump, Timothy E.; Finch, Holmes; Hambleton, Ronald K. – Applied Psychological Measurement, 2007
DETECT is a nonparametric "full" dimensionality assessment procedure that clusters dichotomously scored items into dimensions and provides a DETECT index of magnitude of multidimensionality. Four factors (test length, sample size, item response theory [IRT] model, and DETECT index) were manipulated in a Monte Carlo study of bias, standard error,…
Descriptors: Test Length, Sample Size, Monte Carlo Methods, Geometric Concepts
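As a rough sketch (an assumption about the usual formulation, not taken from the Monahan et al. article), the DETECT index for an item partition P averages signed conditional covariances,
\[ D(P) \;=\; \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} \delta_{ij}\, \widehat{\mathrm{Cov}}\!\left(X_i, X_j \mid \text{rest score}\right), \]
with \delta_{ij} = +1 when items i and j fall in the same cluster of P and -1 otherwise; implementations typically report the value multiplied by 100, and the partition maximizing D(P) defines the DETECT clusters.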
Kang, Taehoon; Chen, Troy T. – ACT, Inc., 2007
Orlando and Thissen (2000, 2003) proposed an item-fit index, S-X², for dichotomous item response theory (IRT) models, which has performed better than traditional item-fit statistics such as Yen's (1981) Q₁ and McKinley and Mills' (1985) G². This study extends the utility of S-X² to polytomous…
Descriptors: Item Response Theory, Models, Computer Software, Statistical Analysis
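The Orlando-Thissen statistic that Kang and Chen extend compares observed and model-implied proportions correct within summed-score groups; for dichotomous item i on an n-item test,
\[ S\text{-}X^{2}_{i} \;=\; \sum_{k=1}^{n-1} N_k\, \frac{\left(O_{ik} - E_{ik}\right)^{2}}{E_{ik}\left(1 - E_{ik}\right)}, \]
where N_k is the number of examinees with summed score k and O_{ik}, E_{ik} are the observed and expected proportions answering item i correctly in that score group.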
De Champlain, Andre F.; Gessaroli, Marc E.; Tang, K. Linda; De Champlain, Judy E. – 1998
The empirical Type I error rates of Poly-DIMTEST (H. Li and W. Stout, 1995) and the LISREL8 chi square fit statistic (K. Joreskog and D. Sorbom, 1993) were compared with polytomous unidimensional data sets simulated to vary as a function of test length and sample size. The rejection rates for both statistics were also studied with two-dimensional…
Descriptors: Chi Square, Goodness of Fit, Item Response Theory, Sample Size
Peer reviewed
De Ayala, R. J. – Applied Psychological Measurement, 1994
Previous work on the effects of dimensionality on parameter estimation for dichotomous models is extended to the graded response model. Datasets are generated that differ in the number of latent factors as well as their interdimensional association, number of test items, and sample size. (SLD)
Descriptors: Estimation (Mathematics), Item Response Theory, Maximum Likelihood Statistics, Sample Size
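The graded response model to which De Ayala extends the dimensionality results is Samejima's model, in which category probabilities are differences of adjacent cumulative (boundary) curves:
\[ P^{*}_{ik}(\theta) = \frac{1}{1 + \exp\{-a_i(\theta - b_{ik})\}}, \qquad P_{ik}(\theta) = P^{*}_{ik}(\theta) - P^{*}_{i,k+1}(\theta), \]
with the conventions P^{*}_{i0}(\theta) = 1 and P^{*}_{i,m_i+1}(\theta) = 0 for an item scored in categories 0 through m_i.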
Peer reviewed
Fitzpatrick, Anne R.; Yen, Wendy M. – Applied Measurement in Education, 2001
Examined the effects of test length and sample size on the alternate forms reliability and equating of simulated mathematics tests composed of constructed response items scaled using the two-parameter partial credit model. Results suggest that, to obtain acceptable reliabilities and accurate equated scores, tests should have at least eight 6-point…
Descriptors: Constructed Response, Equated Scores, Mathematics Tests, Reliability
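The two-parameter partial credit model used in the Fitzpatrick and Yen simulations is, up to parameterization, Muraki's generalized partial credit model; a sketch in that form (an assumption, since Yen's 2PPC notation differs slightly):
\[ P_{ik}(\theta) = \frac{\exp\!\Big(\sum_{v=0}^{k} a_i(\theta - b_{iv})\Big)}{\sum_{c=0}^{m_i} \exp\!\Big(\sum_{v=0}^{c} a_i(\theta - b_{iv})\Big)}, \qquad k = 0, \ldots, m_i, \]
with the v = 0 term of each inner sum defined to be zero, so an item with m_i + 1 score categories contributes one discrimination a_i and m_i step parameters.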
Patsula, Liane N.; Gessaroli, Marc E. – 1995
Among the most popular techniques used to estimate item response theory (IRT) parameters are those used in the LOGIST and BILOG computer programs. Because of its accuracy with smaller sample sizes or differing test lengths, BILOG has become the standard to which new estimation programs are compared. However, BILOG is still complex and…
Descriptors: Comparative Analysis, Effect Size, Estimation (Mathematics), Item Response Theory
Peer reviewed
Whitmore, Marjorie L.; Schumacker, Randall E. – Educational and Psychological Measurement, 1999
Compared differential item functioning detection rates for logistic regression and analysis of variance for dichotomously scored items using simulated data and varying test length, sample size, discrimination rate, and underlying ability. Explains why the logistic regression method is recommended for most applications. (SLD)
Descriptors: Ability, Analysis of Variance, Comparative Analysis, Item Bias
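The logistic regression procedure that the Whitmore and Schumacker comparison favors is typically run as a pair of nested-model likelihood-ratio tests: a group main effect for uniform DIF and an ability-by-group interaction for nonuniform DIF. A minimal sketch under that assumption, with simulated data and illustrative variable names rather than the authors' code:

import numpy as np
from scipy.stats import chi2
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
ability = rng.normal(size=n)            # matching criterion (e.g., total test score)
group = rng.integers(0, 2, size=n)      # 0 = reference group, 1 = focal group
# Simulate one dichotomous item with uniform DIF against the focal group.
true_logit = 1.2 * ability - 0.5 * group
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

def fit(*cols):
    # Logistic regression of the item response on the supplied predictors.
    X = sm.add_constant(np.column_stack(cols))
    return sm.Logit(y, X).fit(disp=0)

m_base = fit(ability)                             # ability only
m_unif = fit(ability, group)                      # + group main effect (uniform DIF)
m_nonu = fit(ability, group, ability * group)     # + interaction (nonuniform DIF)

# Likelihood-ratio chi-square tests between nested models (1 df each).
lr_unif = 2.0 * (m_unif.llf - m_base.llf)
lr_nonu = 2.0 * (m_nonu.llf - m_unif.llf)
print(f"uniform DIF:    LR = {lr_unif:.2f}, p = {chi2.sf(lr_unif, 1):.4f}")
print(f"nonuniform DIF: LR = {lr_nonu:.2f}, p = {chi2.sf(lr_nonu, 1):.4f}")

In practice an effect-size measure (such as the change in pseudo-R-squared between models) is usually examined alongside these significance tests, since with large samples the likelihood-ratio test flags even trivial amounts of DIF.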
Peer reviewed
Wollack, James A. – Applied Measurement in Education, 2006
Many of the currently available statistical indexes to detect answer copying lack sufficient power at small α levels or when the amount of copying is relatively small. Furthermore, there is no one index that is uniformly best. Depending on the type or amount of copying, certain indexes are better than others. The purpose of this article was…
Descriptors: Statistical Analysis, Item Analysis, Test Length, Sample Size