Showing all 11 results
Peer reviewed
Livingston, Samuel A.; Lewis, Charles – Journal of Educational Measurement, 1995
A method is presented for estimating the accuracy and consistency of classifications based on test scores. The reliability of the score is used to estimate effective test length in terms of discrete items. The true-score distribution is estimated by fitting a four-parameter beta model. (SLD)
Descriptors: Classification, Estimation (Mathematics), Scores, Statistical Distributions
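The core idea in Livingston and Lewis's approach can be illustrated with a simulation: draw true proportion-correct scores from a beta distribution, add binomial measurement error on two hypothetical parallel forms, and count how often the pass/fail decision agrees. This is a minimal sketch, not their estimation procedure; it assumes a two-parameter rather than four-parameter beta, and the function name and parameter values are illustrative.

```python
import numpy as np

def classification_consistency(n_items, cut_score, alpha, beta_p,
                               n_sim=100_000, seed=0):
    """Decision consistency under a beta true-score, binomial-error model.

    True proportion-correct scores are drawn from Beta(alpha, beta_p); two
    hypothetical parallel forms score each examinee with binomial error.
    Consistency is the proportion classified the same way (pass/fail) on both.
    """
    rng = np.random.default_rng(seed)
    true_p = rng.beta(alpha, beta_p, size=n_sim)   # true-score distribution
    form_a = rng.binomial(n_items, true_p)         # observed score, form A
    form_b = rng.binomial(n_items, true_p)         # observed score, form B
    return np.mean((form_a >= cut_score) == (form_b >= cut_score))

print(round(classification_consistency(n_items=40, cut_score=24,
                                        alpha=8, beta_p=4), 3))
```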
Pommerich, Mary; And Others – 1995
The Mantel-Haenszel (MH) statistic for identifying differential item functioning (DIF) commonly conditions on the observed test score as a surrogate for conditioning on latent ability. When the comparison group distributions are not completely overlapping (i.e., are incongruent), the observed score represents different levels of latent ability…
Descriptors: Ability, Comparative Analysis, Difficulty Level, Item Bias
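The conditioning the abstract describes is visible in the Mantel-Haenszel statistic itself: examinees are stratified by observed total score, and a common odds ratio is pooled across strata. A minimal sketch of the MH common odds ratio (the example arrays and function name are illustrative):

```python
import numpy as np

def mh_odds_ratio(correct, group, matching_score):
    """Mantel-Haenszel common odds ratio for a studied item.

    correct: 0/1 responses to the studied item
    group:   0 = reference group, 1 = focal group
    matching_score: observed total test score used as the matching variable
    """
    num = den = 0.0
    for k in np.unique(matching_score):
        at_k = matching_score == k
        a = np.sum((group == 0) & (correct == 1) & at_k)  # reference, correct
        b = np.sum((group == 0) & (correct == 0) & at_k)  # reference, incorrect
        c = np.sum((group == 1) & (correct == 1) & at_k)  # focal, correct
        d = np.sum((group == 1) & (correct == 0) & at_k)  # focal, incorrect
        t = a + b + c + d
        if t > 0:
            num += a * d / t
            den += b * c / t
    return num / den  # ~1.0 indicates no DIF; ETS delta scale: -2.35 * ln(ratio)
```

When the two groups' ability distributions do not overlap, some score strata contain members of only one group and contribute nothing to the pooled ratio, which is one way the incongruence problem manifests.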
Abdel-fattah, Abdel-fattah A. – 1994
The accuracy of estimation procedures in item response theory was studied using Monte Carlo methods and varying sample size, number of subjects, and distribution of ability parameters for: (1) joint maximum likelihood as implemented in the computer program LOGIST; (2) marginal maximum likelihood; and (3) marginal Bayesian procedures as implemented…
Descriptors: Ability, Bayesian Statistics, Estimation (Mathematics), Maximum Likelihood Statistics
Peer reviewed
Noonan, Brian W.; And Others – Applied Psychological Measurement, 1992
A Monte Carlo study examined the extent to which three appropriateness indexes, Z3, ECIZ4, and W, are well standardized. The ECIZ4 most closely approximated a normal distribution, and its skewness and kurtosis were more stable and less affected by test length and item response theory model than those of the other indexes. (SLD)
Descriptors: Comparative Analysis, Item Response Theory, Mathematical Models, Maximum Likelihood Statistics
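The Z3 index is essentially the standardized log-likelihood of a response pattern: the observed log-likelihood minus its expectation, divided by its standard deviation. A minimal sketch under the two-parameter logistic model (the study also covers other indexes and models; the function name is illustrative):

```python
import numpy as np

def lz_person_fit(responses, a, b, theta):
    """Standardized log-likelihood person-fit index (lz / Z3) under the 2PL.

    responses: 0/1 item responses; a, b: item discriminations and difficulties;
    theta: the examinee's ability estimate.
    """
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))  # 2PL correct-response probs
    loglik = np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    variance = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (loglik - expected) / np.sqrt(variance)
```

Large negative values flag response patterns less likely than expected given theta (possible aberrance); values near zero indicate a typical pattern. Whether the index is "well standardized" is exactly the question of how closely its null distribution matches N(0, 1).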
Seong, Tae-Je; And Others – 1997
This study was designed to compare the accuracy of three commonly used ability estimation procedures under the graded response model. The three methods, maximum likelihood (ML), expected a posteriori (EAP), and maximum a posteriori (MAP), were compared using a recovery study design for two sample sizes, two underlying ability distributions, and…
Descriptors: Ability, Comparative Analysis, Difficulty Level, Estimation (Mathematics)
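The EAP method compared in this study is the posterior mean of ability, typically computed by quadrature. A minimal sketch for the dichotomous 2PL with a standard normal prior (the study used the graded response model, but the mechanics are the same; the grid and function name are illustrative):

```python
import numpy as np

def eap_estimate(responses, a, b, n_quad=61):
    """Expected a posteriori (EAP) ability estimate under the 2PL, N(0,1) prior."""
    theta = np.linspace(-4, 4, n_quad)                 # quadrature grid
    prior = np.exp(-0.5 * theta**2)                    # unnormalized N(0,1) density
    z = a[None, :] * (theta[:, None] - b[None, :])     # (grid, items) logits
    p = 1.0 / (1.0 + np.exp(-z))
    lik = np.prod(np.where(responses == 1, p, 1 - p), axis=1)
    post = prior * lik                                 # unnormalized posterior
    return np.sum(theta * post) / np.sum(post)         # posterior mean
```

Unlike ML, the EAP estimate exists for perfect and zero scores (the prior keeps the posterior proper), which is one reason recovery studies often favor Bayesian estimators at short test lengths.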
Junker, Brian W. – 1992
A simple scheme is proposed for smoothly approximating the ability distribution for relatively long tests, assuming that the item characteristic curves (ICCs) are known or well estimated. The scheme works for a general class of ICCs and is guaranteed to completely recover the theta distribution as the test length increases. The proposed method of…
Descriptors: Computer Simulation, Equations (Mathematics), Estimation (Mathematics), Item Bias
PDF pending restoration
Bush, M. Joan; Schumacker, Randall E. – 1993
The feasibility of quick norms derived by the procedure described by B. D. Wright and M. H. Stone (1979) was investigated. Norming differences between traditionally calculated means and Rasch "quick" means were examined for simulated data sets of varying sample size, test length, and type of distribution. A 5 by 5 by 2 design with a…
Descriptors: Computer Simulation, Item Response Theory, Norm Referenced Tests, Sample Size
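The "quick" norming idea rests on the Rasch relationship between raw score and logit measure: a person's measure is the log-odds of their proportion correct, offset by item difficulties. A minimal sketch, assuming item difficulties centered at zero and ignoring Wright and Stone's variance-expansion correction (function name illustrative):

```python
import numpy as np

def quick_logit_norm(raw_scores, test_length):
    """'Quick' Rasch-style norm: mean of log-odds person measures.

    Perfect and zero scores are excluded because their log-odds are undefined.
    """
    r = np.asarray(raw_scores, dtype=float)
    r = r[(r > 0) & (r < test_length)]
    logits = np.log(r / (test_length - r))  # log-odds of success
    return logits.mean()
```

A norming comparison like the one described would contrast this quick mean with the mean of measures from a full Rasch calibration across sample sizes, test lengths, and score distributions.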
Ankenmann, Robert D.; Stone, Clement A. – 1992
Effects of test length, sample size, and assumed ability distribution were investigated in a multiple-replication Monte Carlo study under the 1-parameter (1P) and 2-parameter (2P) logistic graded models with five score levels. Accuracy and variability of item parameter and ability estimates were examined. Monte Carlo methods were used to evaluate…
Descriptors: Computer Simulation, Estimation (Mathematics), Item Bias, Mathematical Models
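The recovery-study design these simulations share has a common skeleton: generate responses from known parameters, re-estimate, and summarize error across replications. A minimal sketch for ability recovery under the dichotomous 2PL with grid-search ML (the study used graded models and examined item parameters as well; all settings here are illustrative):

```python
import numpy as np

def recovery_study(n_persons=500, n_items=20, n_reps=10, seed=0):
    """Monte Carlo recovery of ability estimates under the 2PL.

    Generates data from known parameters, re-estimates theta by grid-search
    ML, and returns the mean root mean squared error across replications.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(-4, 4, 161)
    rmses = []
    for _ in range(n_reps):
        a = rng.uniform(0.8, 2.0, n_items)               # discriminations
        b = rng.normal(0.0, 1.0, n_items)                # difficulties
        theta = rng.normal(0.0, 1.0, n_persons)          # true abilities
        p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))  # response probabilities
        x = (rng.random((n_persons, n_items)) < p).astype(int)
        pg = 1 / (1 + np.exp(-a * (grid[:, None] - b)))  # probs at grid points
        loglik = x @ np.log(pg).T + (1 - x) @ np.log(1 - pg).T
        theta_hat = grid[np.argmax(loglik, axis=1)]      # ML estimate per person
        rmses.append(np.sqrt(np.mean((theta_hat - theta) ** 2)))
    return float(np.mean(rmses))
```

Varying `n_items`, `n_persons`, and the generating distribution of `theta` reproduces the factors manipulated in studies of this kind.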
Peer reviewed
Wainer, Howard; And Others – Journal of Educational Measurement, 1992
Computer simulations were run to measure the relationship between testlet validity and factors of item pool size and testlet length for both adaptive and linearly constructed testlets. Making a testlet adaptive yields only modest increases in aggregate validity because of the peakedness of the typical proficiency distribution. (Author/SLD)
Descriptors: Adaptive Testing, Comparative Testing, Computer Assisted Testing, Computer Simulation
Nandakumar, Ratna; Yu, Feng – 1994
DIMTEST is a statistical test procedure for assessing essential unidimensionality of binary test item responses. The test statistic T used for testing the null hypothesis of essential unidimensionality is a nonparametric statistic. That is, there is no particular parametric distribution assumed for the underlying ability distribution or for the…
Descriptors: Ability, Content Validity, Correlation, Nonparametric Statistics
Kim, Seock-Ho; And Others – 1992
Hierarchical Bayes procedures were compared for estimating item and ability parameters in item response theory. Simulated data sets from the two-parameter logistic model were analyzed using three different hierarchical Bayes procedures: (1) the joint Bayesian with known hyperparameters (JB1); (2) the joint Bayesian with information hyperpriors…
Descriptors: Ability, Bayesian Statistics, Comparative Analysis, Equations (Mathematics)