Dogruöz, Ebru; Kelecioglu, Hülya – International Journal of Assessment Tools in Education, 2024
In this research, multistage adaptive tests (MST) were compared according to sample size, panel pattern and module length for top-down and bottom-up test assembly methods. Within the scope of the research, data from PISA 2015 were used and simulation studies were conducted according to the parameters estimated from these data. Analysis results for…
Descriptors: Adaptive Testing, Test Construction, Foreign Countries, Achievement Tests
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
Patton, Jeffrey M.; Cheng, Ying; Hong, Maxwell; Diao, Qi – Journal of Educational and Behavioral Statistics, 2019
In psychological and survey research, the prevalence and serious consequences of careless responses from unmotivated participants are well known. In this study, we propose to iteratively detect careless responders and cleanse the data by removing their responses. The careless responders are detected using person-fit statistics. In two simulation…
Descriptors: Test Items, Response Style (Tests), Identification, Computation
Paek, Insu – Educational and Psychological Measurement, 2016
The effect of guessing on the point estimate of coefficient alpha has been studied in the literature, but the impact of guessing and its interactions with other test characteristics on the interval estimators for coefficient alpha has not been fully investigated. This study examined the impact of guessing and its interactions with other test…
Descriptors: Guessing (Tests), Computation, Statistical Analysis, Test Length
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time-consuming when the test contains a large number of items. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
Straat, J. Hendrik; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2014
An automated item selection procedure in Mokken scale analysis partitions a set of items into one or more Mokken scales, if the data allow. Two algorithms are available that pursue the same goal of selecting Mokken scales of maximum length: Mokken's original automated item selection procedure (AISP) and a genetic algorithm (GA). Minimum…
Descriptors: Sampling, Test Items, Effect Size, Scaling
Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability
Oranje, Andreas; Li, Deping; Kandathil, Mathew – ETS Research Report Series, 2009
Several complex sample standard error estimators based on linearization and resampling for the latent regression model of the National Assessment of Educational Progress (NAEP) are studied with respect to design choices such as number of items, number of regressors, and the efficiency of the sample. This paper provides an evaluation of the extent…
Descriptors: Error of Measurement, Computation, Regression (Statistics), National Competency Tests
Mayer, John D. – Perceptual and Motor Skills, 1983 (peer reviewed)
Kelly's formula estimates the sampling variance of a correlation corrected for attenuation by using split-half reliabilities. In some cases, the coefficient alpha estimate of reliability is preferable. A simulation study suggests that a variation of Kelly's formula can be used appropriately with coefficient alpha. Kelly's formula is modified to accept…
Descriptors: Correlation, Measurement Techniques, Reliability, Sampling
Sudman, Seymour; Bradburn, Norman – New Directions for Program Evaluation, 1984 (peer reviewed)
Situations in which mailed questionnaires are most appropriate are identified. Population variables, characteristics of questionnaires, and social desirability variables are examined in depth. (Author)
Descriptors: Attitude Measures, Evaluation Methods, Program Evaluation, Research Methodology
Berk, Ronald A. – Journal of Experimental Education, 1980 (peer reviewed)
A sampling methodology is proposed for determining lengths of tests designed to assess the comprehension of written discourse. It is based on Bormuth's transformational analysis, within a domain-referenced framework. Guidelines are provided for computing sample size and selecting sentences to which the transformational rules can be applied.…
Descriptors: Reading Comprehension, Reading Tests, Sampling, Test Construction
Wang, Wen-Chung – Educational and Psychological Measurement, 2004
The Pearson correlation is used to depict effect sizes in the context of item response theory. A multidimensional Rasch model is used to directly estimate the correlation between latent traits. Monte Carlo simulations were conducted to investigate whether the population correlation could be accurately estimated and whether the bootstrap method…
Descriptors: Test Length, Sampling, Effect Size, Correlation

