Showing all 9 results
Peer reviewed
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family comprises DIF analyses based on observed scores, such as the Mantel-Haenszel (MH) and standardized proportion-correct DIF procedures; the other comprises analyses based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
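The MH procedure named in this abstract has a standard closed form; as a rough illustration (not the report's own implementation), here is a minimal numpy sketch of the MH D-DIF statistic computed from hypothetical 0/1 item scores, raw total scores as the matching variable, and reference/focal group labels:

```python
# Minimal Mantel-Haenszel DIF sketch for one studied item.
# item: 0/1 scores; total: raw total scores (matching variable);
# group: "ref" or "focal" labels. All inputs are hypothetical.
import numpy as np

def mh_ddif(item, total, group):
    """MH D-DIF = -2.35 * ln(alpha_MH), stratified on total score."""
    num, den = 0.0, 0.0
    for k in np.unique(total):
        s = total == k
        a = np.sum((group[s] == "ref") & (item[s] == 1))    # ref correct
        b = np.sum((group[s] == "ref") & (item[s] == 0))    # ref incorrect
        c = np.sum((group[s] == "focal") & (item[s] == 1))  # focal correct
        d = np.sum((group[s] == "focal") & (item[s] == 0))  # focal incorrect
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n
        den += b * c / n
    alpha = num / den            # MH common odds ratio across strata
    return -2.35 * np.log(alpha) # ETS delta scale
```

On this delta scale, values near 0 indicate negligible DIF; by the usual ETS convention, items with |MH D-DIF| of roughly 1.5 or more are flagged as showing large DIF.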
Bramley, Tom – Research Matters, 2020
The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy of a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…
Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy
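Chained linear equating, the small-sample method this study compares against, composes two linear functions through a common anchor test. A minimal sketch under a common-item design, with all score arrays hypothetical:

```python
# Chained linear equating sketch: form X -> anchor (group 1),
# then anchor -> form Y (group 2). x, a1 come from the group taking
# form X; y, a2 from the group taking form Y (a1/a2 = anchor scores).
import numpy as np

def linear_fn(m_from, s_from, m_to, s_to):
    """Linear equating: match means and standard deviations."""
    return lambda v: m_to + (s_to / s_from) * (v - m_from)

def chained_linear(x, a1, y, a2):
    x_to_anchor = linear_fn(x.mean(), x.std(ddof=1), a1.mean(), a1.std(ddof=1))
    anchor_to_y = linear_fn(a2.mean(), a2.std(ddof=1), y.mean(), y.std(ddof=1))
    return lambda score: anchor_to_y(x_to_anchor(score))
```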
Peer reviewed
Batanero, Carmen; Begué, Nuria; Borovcnik, Manfred; Gea, María M. – Statistics Education Research Journal, 2020
In Spain, curricular guidelines as well as the university-entrance tests for social-science high-school students (17-18 years old) include sampling distributions. To analyse the understanding of this concept we investigated a sample of 234 students. We administered a questionnaire to them and asked half of them to justify their answers. The…
Descriptors: High School Students, Adolescents, Statistics Education, Sampling
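As a side note, the sampling-distribution concept the questionnaire probes is easy to demonstrate by simulation; a minimal sketch with hypothetical numbers:

```python
# Sampling distribution of a sample proportion: repeated samples of
# size n from a population with true proportion p.
import numpy as np

rng = np.random.default_rng(0)
p, n, reps = 0.5, 40, 10_000
props = rng.binomial(n, p, size=reps) / n
print(props.mean(), props.std())  # centers near p; spread ~ sqrt(p*(1-p)/n)
```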
Soysal, Sümeyra; Arikan, Çigdem Akin; Inal, Hatice – Online Submission, 2016
This study aims to investigate the effect of methods for dealing with missing data on item difficulty estimation under different test-length conditions and sample sizes. To this end, data sets of 10, 20 and 40 items with sample sizes of 100 and 5,000 were prepared. The deletion process was applied at rates of 5%, 10% and 20% under conditions…
Descriptors: Research Problems, Data Analysis, Item Response Theory, Test Items
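The abstract does not name the specific missing-data treatments, but two common options in this literature are omitting missing responses and scoring them as incorrect; a small sketch of how the two pull a classical difficulty (proportion-correct) estimate apart, on hypothetical data:

```python
# Two simple missing-data treatments and their effect on a classical
# item-difficulty estimate (proportion correct). Data are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n = 100
resp = (rng.random(n) < 0.7).astype(float)   # 1 = correct, 0 = incorrect
resp[rng.random(n) < 0.10] = np.nan          # ~10% missing at random

p_omit = np.nanmean(resp)                    # ignore missing responses
p_zero = np.where(np.isnan(resp), 0, resp).mean()  # score missing as 0
print(p_omit, p_zero)                        # scoring-as-0 biases p downward
```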
Wu, Yi-Fang – ProQuest LLC, 2015
Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT (Birnbaum, 1968; Lord, 1980) model, the current study examines the accuracy and…
Descriptors: Item Response Theory, Test Items, Accuracy, Computation
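For reference, the 3PL model the study focuses on has the standard form P(correct | theta) = c + (1 - c) / (1 + exp(-a(theta - b))); a minimal simulation sketch with hypothetical parameter values:

```python
# Simulating responses under the three-parameter logistic (3PL) model.
import numpy as np

def p3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

rng = np.random.default_rng(2)
theta = rng.normal(size=1000)        # examinee abilities
a, b, c = 1.2, 0.0, 0.2              # discrimination, difficulty, guessing
responses = (rng.random(theta.size) < p3pl(theta, a, b, c)).astype(int)
```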
Peer reviewed
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
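One way to make the common-item sampling variability concrete (a sketch, not the authors' procedure) is to bootstrap the anchor-item set and watch a simple mean-equating constant vary; the 0/1 response matrices below are hypothetical:

```python
# Treating the common-item set as sampled rather than fixed: resample
# anchor items and recompute a mean-equating shift each time.
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical 0/1 matrices (examinees x 20 common items), two groups
item_scores_x = (rng.random((500, 20)) < 0.60).astype(int)
item_scores_y = (rng.random((500, 20)) < 0.55).astype(int)

n_items = item_scores_x.shape[1]
shifts = []
for _ in range(1000):
    idx = rng.integers(0, n_items, n_items)   # bootstrap the anchor set
    mx = item_scores_x[:, idx].sum(axis=1).mean()
    my = item_scores_y[:, idx].sum(axis=1).mean()
    shifts.append(my - mx)                    # mean-equating constant
print(np.std(shifts))  # variability attributable to the choice of anchors
```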
Sunnassee, Devdass – ProQuest LLC, 2011
Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…
Descriptors: Test Length, Test Format, Sample Size, Simulation
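The abstract does not list the seven methods, but small-sample comparisons of this kind typically include identity, mean, and linear equating as baselines; a hedged sketch against a hypothetical "true" linking function:

```python
# Comparing identity, mean, and linear equating at a small sample size.
# The "true" link and all scores are hypothetical.
import numpy as np

rng = np.random.default_rng(4)
true_link = lambda s: 1.05 * s + 2.0                  # hypothetical truth
x = rng.normal(50, 10, 25)                            # small sample, form X
y = true_link(rng.normal(50, 10, 25)) + rng.normal(0, 1, 25)  # form Y

identity = lambda s: s
mean_eq  = lambda s: s + (y.mean() - x.mean())
linear   = lambda s: y.mean() + y.std(ddof=1) / x.std(ddof=1) * (s - x.mean())

grid = np.linspace(30, 70, 9)
for name, f in [("identity", identity), ("mean", mean_eq), ("linear", linear)]:
    rmse = np.sqrt(np.mean((f(grid) - true_link(grid)) ** 2))
    print(name, round(rmse, 2))   # at n = 25, identity can beat linear
```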
Kolen, Michael J.; Whitney, Douglas R. – 1978
The application of latent trait theory to classroom tests necessitates the use of small sample sizes for parameter estimation. Computer-generated data were used to assess the accuracy of estimation of the slope and location parameters in the two-parameter logistic model with fixed abilities and varying small sample sizes. The maximum likelihood…
Descriptors: Difficulty Level, Item Analysis, Latent Trait Theory, Mathematical Models
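The estimation problem described here is easy to reproduce in outline: simulate 2PL responses for a small fixed-ability sample and maximize the likelihood over the slope and location of a single item. A minimal scipy sketch, not the study's original program:

```python
# ML estimation of one 2PL item's slope (a) and location (b),
# with abilities treated as known and a deliberately small sample.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
theta = rng.normal(size=30)                      # small fixed-ability sample
a_true, b_true = 1.0, 0.5
p = 1 / (1 + np.exp(-a_true * (theta - b_true)))
u = (rng.random(theta.size) < p).astype(int)     # 0/1 responses

def negloglik(params):
    a, b = params
    q = 1 / (1 + np.exp(-a * (theta - b)))
    return -np.sum(u * np.log(q) + (1 - u) * np.log(1 - q))

est = minimize(negloglik, x0=[1.0, 0.0], method="Nelder-Mead")
print(est.x)  # a-hat, b-hat; unstable at small n, as the study examines
```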
Farish, Stephen J. – 1984
The stability of Rasch test item difficulty parameters was investigated under varying conditions. Data were taken from a mathematics achievement test administered to over 2,000 Australian students. The experiments included: (1) relative stability of the Rasch, traditional, and z-item difficulty parameters using different sample sizes and designs;…
Descriptors: Achievement Tests, Difficulty Level, Estimation (Mathematics), Foreign Countries
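As a rough stand-in for Rasch difficulty estimation (the study's actual estimation procedure is not given in the abstract), centered log-odds of incorrect responses can be compared across subsample sizes on simulated data:

```python
# Stability of simple Rasch-style difficulty estimates across
# subsample sizes. The response matrix is simulated, not the study's data.
import numpy as np

rng = np.random.default_rng(6)
theta = rng.normal(size=2000)                     # examinees
b = np.linspace(-1.5, 1.5, 10)                    # true item difficulties
p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
resp = (rng.random(p.shape) < p).astype(int)

def difficulties(r):
    pc = r.mean(axis=0).clip(0.01, 0.99)          # proportion correct
    d = np.log((1 - pc) / pc)                     # log-odds of incorrect
    return d - d.mean()                           # center, Rasch convention

for n in (100, 500, 2000):
    print(n, np.round(difficulties(resp[:n]), 2))  # estimates settle as n grows
```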