ERIC - Search Results

Publication Date

In 2026	0
Since 2025	3
Since 2022 (last 5 years)	12
Since 2017 (last 10 years)	27
Since 2007 (last 20 years)	50

Descriptor

Error of Measurement	78
Test Length	78
Test Items	41
Item Response Theory	36
Sample Size	30
Test Reliability	20
Models	18
Comparative Analysis	17
Simulation	17
Scores	16
Monte Carlo Methods	15
Computation	14
Computer Assisted Testing	14
Statistical Analysis	13
Adaptive Testing	12
Test Bias	11
Estimation (Mathematics)	10
Item Analysis	10
Statistical Bias	10
Goodness of Fit	8
Ability	7
Accuracy	7
Foreign Countries	7
Probability	7
Sampling	7
More ▼

Publication Type

Journal Articles	58
Reports - Research	53
Reports - Evaluative	16
Dissertations/Theses -…	4
Speeches/Meeting Papers	4
Reports - Descriptive	2

Education Level

Grade 3	2
Higher Education	2
Postsecondary Education	2
Secondary Education	2
Early Childhood Education	1
Elementary Education	1
Elementary Secondary Education	1
High Schools	1
Primary Education	1

Audience

Researchers

Location

Taiwan	2
Turkey	2
Iran	1
Japan	1

Laws, Policies, & Programs

Assessments and Surveys

ACT Assessment	1
Advanced Placement…	1
Armed Forces Qualification…	1
California Psychological…	1
Comprehensive Tests of Basic…	1
National Assessment of…	1
National Longitudinal Study…	1
Program for International…	1
Test of English as a Foreign…	1
Trends in International…	1
Wechsler Adult Intelligence…	1
More ▼

What Works Clearinghouse Rating

Error of Measurement X

Showing 61 to 75 of 78 results Save | Export

On the Theory of a Set of Tests Which Differ Only in Length

Peer reviewed

Kristof, Walter – Psychometrika, 1971

Descriptors: Cognitive Measurement, Error of Measurement, Mathematical Models, Psychological Testing

The Use of Moment Estimators for Mixtures of Two Binomials with One Known Success Parameter.

Peer reviewed

Van Der Linden, Wim J. – Educational and Psychological Measurement, 1983

This paper focuses on mixtures of two binomials with one known success parameter. It is shown how moment estimators can be obtained for the remaining unknown parameters of such mixtures, and results are presented from a Monte Carlo study carried out to explore the statistical properties of these estimators. (PN)

Descriptors: Educational Testing, Error of Measurement, Estimation (Mathematics), Guessing (Tests)

Markov Chain Monte Carlo Estimation of Item Parameters for the Generalized Graded Unfolding Model

Peer reviewed

Direct link

de la Torre, Jimmy; Stark, Stephen; Chernyshenko, Oleksandr S. – Applied Psychological Measurement, 2006

The authors present a Markov Chain Monte Carlo (MCMC) parameter estimation procedure for the generalized graded unfolding model (GGUM) and compare it to the marginal maximum likelihood (MML) approach implemented in the GGUM2000 computer program, using simulated and real personality data. In the simulation study, test length, number of response…

Descriptors: Computation, Monte Carlo Methods, Markov Processes, Item Response Theory

The Standard Errors of the Feldt-Gilmer Congeneric Reliability Coefficients: Iowa Testing Programs Occasional Papers. Number 31.

PDF pending restoration

Gilmer, Jerry S.; Feldt, Leonard S. – 1982

The Feldt-Gilmer congeneric reliability coefficients make it possible to estimate the reliability of a test composed of parts of unequal, unknown length. The approximate standard errors of the Feldt-Gilmer coefficients are derived via a method using the multivariate Taylor's expansion. Monte Carlo simulation is employed to corroborate the…

Descriptors: Educational Testing, Error of Measurement, Mathematical Formulas, Mathematical Models

Multiple Choice and True/False Tests: Reliability Measures and Some Implications of Negative Marking

Peer reviewed

Direct link

Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004

The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…

Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items

A Comparison of Two Item Selection Procedures for Building Criterion-Referenced Tests.

Download full text

Haladyna, Tom; Roid, Gale – 1981

Two approaches to criterion-referenced test construction are compared. Classical test theory is based on the practice of random sampling from a well-defined domain of test items; latent trait theory suggests that the difficulty of the items should be matched to the achievement level of the student. In addition to these two methods of test…

Descriptors: Criterion Referenced Tests, Error of Measurement, Latent Trait Theory, Test Construction

Consideration for Sample Size in Reliability Studies for Mastery Tests. Publication Series in Mastery Testing.

Download full text

Saunders, Joseph C.; Huynh, Huynh – 1980

In most reliability studies, the precision of a reliability estimate varies inversely with the number of examinees (sample size). Thus, to achieve a given level of accuracy, some minimum sample size is required. An approximation for this minimum size may be made if some reasonable assumptions regarding the mean and standard deviation of the test…

Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests

The MIMIC Model as a Method for Detecting DIF: Comparison With Mantel-Haenszel, SIBTEST, and the IRT Likelihood Ratio

Peer reviewed

Direct link

Finch, Holmes – Applied Psychological Measurement, 2005

This study compares the ability of the multiple indicators, multiple causes (MIMIC) confirmatory factor analysis model to correctly identify cases of differential item functioning (DIF) with more established methods. Although the MIMIC model might have application in identifying DIF for multiple grouping variables, there has been little…

Descriptors: Identification, Factor Analysis, Test Bias, Models

Factors Influencing the Mantel and Generalized Mantel-Haenszel Methods for the Assessment of Differential Item Functioning in Polytomous Items

Peer reviewed

Direct link

Wang, Wen-Chung; Su, Ya-Hui – Applied Psychological Measurement, 2004

Eight independent variables (differential item functioning [DIF] detection method, purification procedure, item response model, mean latent trait difference between groups, test length, DIF pattern, magnitude of DIF, and percentage of DIF items) were manipulated, and two dependent variables (Type I error and power) were assessed through…

Descriptors: Test Length, Test Bias, Simulation, Item Response Theory

Item Parameter Recovery, Standard Error Estimates, and Fit Statistics of the Winsteps Program for the Family of Rasch Models

Peer reviewed

Direct link

Wang, Wen-Chung; Chen, Cheng-Te – Educational and Psychological Measurement, 2005

This study investigates item parameter recovery, standard error estimates, and fit statistics yielded by the WINSTEPS program under the Rasch model and the rating scale model through Monte Carlo simulations. The independent variables were item response model, test length, and sample size. WINSTEPS yielded practically unbiased estimates for the…

Descriptors: Statistics, Test Length, Rating Scales, Item Response Theory

The Standardized Mean Difference within the Framework of Item Response Theory

Peer reviewed

Direct link

Wang, Wen-Chung; Chen, Hsueh-Chu – Educational and Psychological Measurement, 2004

As item response theory (IRT) becomes popular in educational and psychological testing, there is a need of reporting IRT-based effect size measures. In this study, we show how the standardized mean difference can be generalized into such a measure. A disattenuation procedure based on the IRT test reliability is proposed to correct the attenuation…

Descriptors: Test Reliability, Rating Scales, Sample Size, Error of Measurement

A Method for Determining the Length of Criterion-Referenced Tests Using Reliability and Validity Indices.

Download full text

Mills, Craig N.; Simon, Robert – 1981

When criterion-referenced tests are used to assign examinees to states reflecting their performance level on a test, the better known methods for determining test length, which consider relationships among domain scores and errors of measurement, have their limitations. The purpose of this paper is to present a computer system named TESTLEN, which…

Descriptors: Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores, Error of Measurement

Relationship Among the Number of Sub-Tests; Skewness, Kurtosis, and Size of Population; And Magnitude of Errors of Estimate in Multiple Matrix Sampling. (Revised Version).

PDF pending restoration

Misanchuk, Earl R. – 1978

Multiple matrix sampling of three subscales of the California Psychological Inventory was used to investigate the effects of four variables on error estimates of the mean (EEM) and variance (EEV). The four variables were examinee population size (600, 450, 300, 150, 100, and 75); number of subtests, (2, 3, 4, 5, 6, and 7), hence the number of…

Descriptors: Adults, Analysis of Variance, Error of Measurement, Item Sampling

The Influence of Dimensionality on CAT Ability Estimation.

Peer reviewed

De Ayala, R. J. – Educational and Psychological Measurement, 1992

Effects of dimensionality on ability estimation of an adaptive test were examined using generated data in Bayesian computerized adaptive testing (CAT) simulations. Generally, increasing interdimensional difficulty association produced a slight decrease in test length and an increase in accuracy of ability estimation as assessed by root mean square…

Descriptors: Adaptive Testing, Bayesian Statistics, Computer Assisted Testing, Computer Simulation

An Investigation of Methods for Reducing Sampling Error in Certain IRT Procedures.

Download full text

Wingersky, Marilyn S.; Lord, Frederic M. – 1983

The sampling errors of maximum likelihood estimates of item-response theory parameters are studied in the case where both people and item parameters are estimated simultaneously. A check on the validity of the standard error formulas is carried out. The effect of varying sample size, test length, and the shape of the ability distribution is…

Descriptors: Error of Measurement, Estimation (Mathematics), Item Banks, Latent Trait Theory

« Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6

Educational and Psychological…	13
Applied Psychological…	9
ETS Research Report Series	7
Journal of Educational…	6
International Journal of…	5
Applied Measurement in…	4
ProQuest LLC	4
International Journal of…	3
Psychometrika	3
Educational Sciences: Theory…	2
Journal of Educational and…	2
ACT Education Corp.	1
Assessment & Evaluation in…	1
Education and Information…	1
Grantee Submission	1
Journal of Psychoeducational…	1
Physical Review Physics…	1
Psychological Assessment	1
Psychological Methods	1
More ▼

Sijtsma, Klaas	3
Wang, Wen-Chung	3
DeMars, Christine E.	2
Emons, Wilco H. M.	2
Finch, Holmes	2
Gu, Lixiong	2
Kilic, Abdullah Faruk	2
Lee, Won-Chan	2
Lee, Yi-Hsuan	2
Livingston, Samuel A.	2
Stark, Stephen	2
Wingersky, Marilyn S.	2
Yao, Lihua	2
Zhang, Jinming	2
A. Corinne Huggins-Manley	1
Abad, Francisco J.	1
Allison, Paul A.	1
Andersson, Björn	1
Arsan, Nihan	1
Atalay Kabasakal, Kübra	1
Atar, Burcu	1
Axelrod, Bradley N.	1
Ayse Bilicioglu Gunes	1
Ban, Jae-Chun	1
More ▼