Showing all 14 results
Peer reviewed
Smith, Richard M. – Educational and Psychological Measurement, 1994
Rasch model total-fit statistics and between-item fit statistics were compared for their ability to detect measurement disturbances through the use of simulated data. Results indicate that the between-fit statistic appears more sensitive to systematic measurement disturbances and the total-fit statistic is more sensitive to random measurement…
Descriptors: Comparative Analysis, Goodness of Fit, Item Response Theory, Measurement Techniques
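The total-fit statistics compared in this study are mean-square residual fit statistics. A minimal sketch of the unweighted (outfit) and information-weighted (infit) mean squares for a single dichotomous Rasch item; the function names and single-item setup are illustrative assumptions, not code from the study:

```python
import numpy as np

def rasch_prob(theta, b):
    """Rasch model probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def fit_mean_squares(x, theta, b):
    """Unweighted (outfit) and weighted (infit) mean-square fit for one item.

    x: 0/1 responses, theta: person abilities, b: item difficulty.
    """
    p = rasch_prob(theta, b)
    info = p * (1.0 - p)                      # binomial variance of each response
    sq_resid = (x - p) ** 2
    outfit = float(np.mean(sq_resid / info))  # mean of squared standardized residuals
    infit = float(np.sum(sq_resid) / np.sum(info))  # information-weighted mean square
    return outfit, infit
```

Both statistics have expectation near 1 when the data fit the model; values well above 1 flag misfit.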
Peer reviewed
Cohen, Allan S.; And Others – Applied Psychological Measurement, 1993
Three measures of differential item functioning for the dichotomous response model are extended to include Samejima's graded response model. Two are based on area differences between item true score functions, and one is a chi-square statistic for comparing differences in item parameters. (SLD)
Descriptors: Chi Square, Comparative Analysis, Identification, Item Bias
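Two of the three measures rest on the area between the two groups' item true score functions. A minimal numerical sketch for the dichotomous (2PL) case, using trapezoidal integration and hypothetical parameter values; this is an illustration of the area idea, not the authors' code:

```python
import numpy as np

def icc_2pl(theta, a, b):
    """Two-parameter logistic item characteristic curve (scaling constant D = 1.7)."""
    return 1.0 / (1.0 + np.exp(-1.7 * a * (theta - b)))

def unsigned_dif_area(a_ref, b_ref, a_foc, b_foc, lo=-4.0, hi=4.0, n=2001):
    """Unsigned area between reference- and focal-group ICCs over [lo, hi]."""
    theta = np.linspace(lo, hi, n)
    gap = np.abs(icc_2pl(theta, a_ref, b_ref) - icc_2pl(theta, a_foc, b_foc))
    # trapezoidal rule on a uniform grid
    return float(np.sum((gap[:-1] + gap[1:]) / 2.0) * (theta[1] - theta[0]))
```

When the discriminations are equal, the unsigned area reduces to the difficulty difference |b_ref - b_foc| (up to truncation of the theta range).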
Meisner, Richard; And Others – 1993
This paper presents a study on the generation of mathematics test items using algorithmic methods. The history of this approach is briefly reviewed and is followed by a survey of the research to date on the statistical parallelism of algorithmically generated mathematics items. Results are presented for 8 parallel test forms generated using 16…
Descriptors: Algorithms, Comparative Analysis, Computer Assisted Testing, Item Banks
Peer reviewed
Nandakumar, Ratna – Journal of Educational Measurement, 1994
Using simulated and real data, this study compares the performance of three methodologies for assessing unidimensionality: (1) DIMTEST; (2) the approach of Holland and Rosenbaum; and (3) nonlinear factor analysis. All three methods correctly confirm unidimensionality, but they differ in their ability to detect the lack of unidimensionality.…
Descriptors: Ability, Comparative Analysis, Evaluation Methods, Factor Analysis
Woodruff, David – 1993
Two analyses of variance (ANOVA) models for item scores are compared. The first is an items by subject random effect ANOVA. The second is a mixed effects ANOVA with items fixed and subjects random. Comparisons regarding reliability, Cronbach's alpha coefficient, psychometric inference, and inter-item covariance structure are made between the…
Descriptors: Analysis of Covariance, Analysis of Variance, Comparative Analysis, Factor Analysis
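One quantity compared across the two ANOVA models is Cronbach's alpha. A self-contained sketch of its computation from a subjects-by-items score matrix (illustrative code, not the paper's):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_subjects, n_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # per-item variances
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return float((k / (k - 1)) * (1.0 - item_vars.sum() / total_var))
```

Alpha reaches 1 for perfectly parallel items and falls toward 0 as inter-item covariances vanish.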
Schumacker, Randall E.; And Others – 1994
Rasch between and total weighted and unweighted fit statistics were compared using varying test lengths and sample sizes. Two test lengths (20 and 50 items) and three sample sizes (150, 500, and 1,000) were crossed. Each of the six combinations was replicated 100 times. In addition, power comparisons were made. Results indicated that there were no…
Descriptors: Comparative Analysis, Goodness of Fit, Item Response Theory, Power (Statistics)
Peer reviewed
Huynh, Huynh; Ferrara, Steven – Journal of Educational Measurement, 1994
Equal percentile (EP) and partial credit (PC) equatings for raw scores from performance-based assessments with free-response items are compared through the use of data from the Maryland School Performance Assessment Program. Results suggest that EP and PC methods do not give equivalent results when distributions are markedly skewed. (SLD)
Descriptors: Comparative Analysis, Equated Scores, Mathematics Tests, Performance Based Assessment
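The equal percentile (EP) method maps a raw score on one form to the score with the same percentile rank on the other form. A minimal sketch, assuming the common midpoint convention for percentile ranks and linear interpolation between observed scores; names and conventions are illustrative, not taken from the study:

```python
import numpy as np

def percentile_rank(scores, x):
    """Midpoint percentile rank of raw score x in a score distribution."""
    scores = np.asarray(scores, dtype=float)
    return 100.0 * (np.mean(scores < x) + 0.5 * np.mean(scores == x))

def ep_equate(x_scores, y_scores, x):
    """Form-Y score with the same percentile rank that x has on form X."""
    return float(np.percentile(y_scores, percentile_rank(x_scores, x)))
```

With markedly skewed distributions, this empirical mapping can diverge from a model-based (e.g., partial credit) equating, which is the comparison the study reports.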
Wang, Tianyou; Kolen, Michael J. – 1994
In this paper a quadratic curve equating method for different test forms under a random-group data-collection design is proposed. Procedures for implementing this method and related issues are described and discussed. The quadratic-curve method was evaluated with real test data (from two 30-item subtests for a professional licensure examination…
Descriptors: Comparative Analysis, Data Collection, Equated Scores, Goodness of Fit
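As a generic illustration of the quadratic-curve idea (not the authors' procedure), one can fit a quadratic through matched quantiles of the two forms' score distributions and use it as the equating function:

```python
import numpy as np

def quadratic_equate(x_scores, y_scores, x_new, ps=np.linspace(5, 95, 19)):
    """Fit y = c2*x^2 + c1*x + c0 through matched quantiles of two forms."""
    x_q = np.percentile(x_scores, ps)      # form-X quantiles
    y_q = np.percentile(y_scores, ps)      # matching form-Y quantiles
    coef = np.polyfit(x_q, y_q, deg=2)     # least-squares quadratic fit
    return float(np.polyval(coef, x_new))
```

The quantile levels `ps` and the least-squares fit are assumptions of this sketch; the appeal of a quadratic over a linear equating is that it can absorb mild curvature in the score correspondence.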
Mills, Craig N.; Melican, Gerald J. – 1987
The study compares three methods for establishing cut-off scores that effect a compromise between absolute cut-offs based on item difficulty and relative cut-offs based on expected passing rates. Each method coordinates these two types of information differently. The Beuk method obtains judges' estimates of an absolute cut-off and an expected…
Descriptors: Academic Standards, Certification, Comparative Analysis, Cutting Scores
Muraki, Eiji – 1984
The TESTFACT computer program and full-information factor analysis of test items were used in a computer simulation conducted to correct for the guessing effect. Full-information factor analysis also corrects for omitted items. The present version of TESTFACT handles up to five factors and 150 items. A preliminary smoothing of the tetrachoric…
Descriptors: Comparative Analysis, Computer Simulation, Computer Software, Correlation
Peer reviewed
Shepard, Lorrie; And Others – Journal of Educational Statistics, 1984
Item response theory bias detection procedures were applied to data from Black and White seniors on the High School and Beyond data files. Overall, the sums-of-squares statistics (weighted by the inverse of the variance errors) were the best indices for quantifying item characteristic curve differences between groups. (Author/BW)
Descriptors: Achievement Tests, Black Students, Comparative Analysis, Evaluation Methods
Hambleton, Ronald K.; And Others – 1987
The study compared two promising item response theory (IRT) item-selection methods, optimal and content-optimal, with two non-IRT item selection methods, random and classical, for use in fixed-length certification exams. The four methods were used to construct 20-item exams from a pool of approximately 250 items taken from a 1985 certification…
Descriptors: Comparative Analysis, Content Validity, Cutting Scores, Difficulty Level
Wang, Yu-Chung Lawrence; Hocevar, Dennis – 1994
The major goal of this study is to apply the essential unidimensionality statistic of W. Stout and the corresponding computer program (DIMTEST) to a hierarchical level mathematics achievement data set and to determine the extent to which the unidimensionality assumption can be accurately applied to mathematics achievement data. The study also…
Descriptors: Ability, Comparative Analysis, Elementary Education, Elementary School Students
Sarvela, Paul D. – 1986
Four discrimination indices were compared, using score distributions which were normal, bimodal, and negatively skewed. The score distributions were systematically varied to represent the common circumstances of a military training situation using criterion-referenced mastery tests. Three 20-item tests were administered to 110 simulated subjects.…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Item Analysis, Mastery Tests
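One classical discrimination index of the kind compared here is the upper-lower D index: the proportion correct in the top-scoring group minus the proportion correct in the bottom-scoring group. A minimal sketch, with the 27% group fraction as a conventional default rather than a value taken from the study:

```python
import numpy as np

def upper_lower_discrimination(item, total, frac=0.27):
    """Classical D index: p(correct | upper group) - p(correct | lower group)."""
    item = np.asarray(item, dtype=float)
    order = np.argsort(total)                  # rank examinees by total score
    n = max(1, int(round(frac * len(total))))  # size of each extreme group
    return float(item[order[-n:]].mean() - item[order[:n]].mean())
```

D near 1 means the item sharply separates high and low scorers; values near 0 (or negative) flag weak or reversed discrimination, which matters for the mastery-test settings the study examines.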