Publication Date
| Date range | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 20 |
| Since 2007 (last 20 years) | 44 |
Descriptor
| Descriptor | Count |
| --- | --- |
| Statistical Analysis | 71 |
| Test Length | 71 |
| Item Response Theory | 30 |
| Test Items | 29 |
| Sample Size | 28 |
| Comparative Analysis | 16 |
| Test Reliability | 16 |
| Correlation | 15 |
| Error of Measurement | 13 |
| Scores | 13 |
| Computation | 12 |
Author
| Author | Count |
| --- | --- |
| Bulut, Okan | 2 |
| Cohen, Allan S. | 2 |
| Huggins-Manley, Anne Corinne | 2 |
| Paek, Insu | 2 |
| Svetina, Dubravka | 2 |
| Tay, Louis | 2 |
| Wang, Wen-Chung | 2 |
| Weiss, David J. | 2 |
| Yormaz, Seha | 2 |
| de Jong, John H. A. L. | 2 |
| Abad, Francisco J. | 1 |
Publication Type
| Publication type | Count |
| --- | --- |
| Reports - Research | 55 |
| Journal Articles | 47 |
| Reports - Evaluative | 9 |
| Speeches/Meeting Papers | 5 |
| Dissertations/Theses -… | 3 |
| Tests/Questionnaires | 2 |
| Information Analyses | 1 |
| Numerical/Quantitative Data | 1 |
Education Level
| Education level | Count |
| --- | --- |
| Higher Education | 5 |
| Postsecondary Education | 4 |
| Secondary Education | 3 |
| Elementary Education | 1 |
| Elementary Secondary Education | 1 |
| Grade 3 | 1 |
| High Schools | 1 |
Audience
| Audience | Count |
| --- | --- |
| Researchers | 1 |
Assessments and Surveys
| Assessment | Count |
| --- | --- |
| SAT (College Admission Test) | 2 |
| California Psychological… | 1 |
| Program for International… | 1 |
| Stanford Binet Intelligence… | 1 |
| Test of English as a Foreign… | 1 |
| Wechsler Adult Intelligence… | 1 |
| Wechsler Individual… | 1 |
Huang, Hung-Yu – Educational and Psychological Measurement, 2017
Mixture item response theory (IRT) models have been suggested as an efficient method of detecting the different response patterns derived from latent classes when developing a test. In testing situations, multiple latent traits measured by a battery of tests can exhibit a higher-order structure, and mixtures of latent classes may occur on…
Descriptors: Item Response Theory, Models, Bayesian Statistics, Computation
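For orientation, a mixture IRT model assumes the examinee population is a mixture of G latent classes, each with its own item parameters. A standard mixture Rasch formulation (not necessarily the exact higher-order specification studied here) is:

```latex
P(X_{ij} = 1) \;=\; \sum_{g=1}^{G} \pi_g \,
  \frac{\exp(\theta_{jg} - b_{ig})}{1 + \exp(\theta_{jg} - b_{ig})},
\qquad \sum_{g=1}^{G} \pi_g = 1,
```

where $\pi_g$ is the proportion of examinees in latent class $g$, $\theta_{jg}$ is examinee $j$'s ability within class $g$, and $b_{ig}$ is the class-specific difficulty of item $i$.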
Svetina, Dubravka; Levy, Roy – Journal of Experimental Education, 2016
This study investigated the effect of complex structure on dimensionality assessment in compensatory multidimensional item response models using DETECT- and NOHARM-based methods. The performance was evaluated via the accuracy of identifying the correct number of dimensions and the ability to accurately recover item groupings using a simple…
Descriptors: Item Response Theory, Accuracy, Correlation, Sample Size
Paek, Insu – Educational and Psychological Measurement, 2016
The effect of guessing on the point estimate of coefficient alpha has been studied in the literature, but the impact of guessing and its interactions with other test characteristics on the interval estimators for coefficient alpha has not been fully investigated. This study examined the impact of guessing and its interactions with other test…
Descriptors: Guessing (Tests), Computation, Statistical Analysis, Test Length
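For reference, the point estimate of coefficient alpha that guessing perturbs is a direct computation from an examinee-by-item score matrix; a minimal sketch (variable names are illustrative, not from the paper):

```python
import numpy as np

def coefficient_alpha(scores: np.ndarray) -> float:
    """Cronbach's coefficient alpha for an examinee-by-item score matrix."""
    k = scores.shape[1]                         # test length (number of items)
    item_vars = scores.var(axis=0, ddof=1)      # per-item score variances
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Toy usage: 500 examinees, 20 dichotomous items.
rng = np.random.default_rng(0)
responses = rng.integers(0, 2, size=(500, 20))
print(coefficient_alpha(responses))
```

Random guessing on multiple-choice items injects noise into the item scores, which attenuates this point estimate; the study's question is how that noise, together with test length and other characteristics, also distorts the interval estimators built around it.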
Lee, HyeSun; Geisinger, Kurt F. – Educational and Psychological Measurement, 2016
The current study investigated the impact of matching criterion purification on the accuracy of differential item functioning (DIF) detection in large-scale assessments. The three matching approaches for DIF analyses (block-level matching, pooled booklet matching, and equated pooled booklet matching) were employed with the Mantel-Haenszel…
Descriptors: Test Bias, Measurement, Accuracy, Statistical Analysis
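All three matching approaches ultimately feed strata into the same Mantel-Haenszel machinery; a hedged sketch of the core statistic, with the matching criterion left abstract:

```python
import numpy as np

def mh_common_odds_ratio(item, group, match):
    """Mantel-Haenszel common odds ratio for one studied item.

    item  : 0/1 responses to the studied item
    group : 0 = reference group, 1 = focal group
    match : matching criterion (e.g., total score); one stratum per value
    """
    num = den = 0.0
    for s in np.unique(match):
        idx = match == s
        n_s = idx.sum()
        a = np.sum(item[idx & (group == 0)] == 1)  # reference, correct
        b = np.sum(item[idx & (group == 0)] == 0)  # reference, incorrect
        c = np.sum(item[idx & (group == 1)] == 1)  # focal, correct
        d = np.sum(item[idx & (group == 1)] == 0)  # focal, incorrect
        num += a * d / n_s
        den += b * c / n_s
    return num / den  # values near 1.0 indicate no DIF
```

Purification then removes items flagged as DIF from the matching criterion and re-runs the procedure; the effect of that step on detection accuracy is what the study isolates.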
Cao, Mengyang; Tay, Louis; Liu, Yaowu – Educational and Psychological Measurement, 2017
This study examined the performance of a proposed iterative Wald approach for detecting differential item functioning (DIF) between two groups when preknowledge of anchor items is absent. The iterative approach uses the Wald-2 test to identify anchor items and then iteratively tests the remaining items for DIF with the Wald-1 test. Monte Carlo…
Descriptors: Monte Carlo Methods, Test Items, Test Bias, Error of Measurement
Tay, Louis; Huang, Qiming; Vermunt, Jeroen K. – Educational and Psychological Measurement, 2016
In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…
Descriptors: Item Response Theory, Test Bias, Simulation, College Entrance Examinations
Li, Feifei – ETS Research Report Series, 2017
An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…
Descriptors: Item Response Theory, Generalizability Theory, Tests, Error of Measurement
Lu, Ying – ETS Research Report Series, 2017
For standard- or criterion-based assessments, the use of cut scores to indicate mastery, nonmastery, or different levels of skill mastery is very common. As part of performance summary, it is of interest to examine the percentage of examinees at or above the cut scores (PAC) and how PAC evolves across administrations. This paper shows that…
Descriptors: Cutting Scores, Evaluation Methods, Mastery Learning, Performance Based Assessment
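PAC itself is a one-line computation once the cut scores are fixed; a minimal sketch:

```python
import numpy as np

def pac(scores, cut):
    """Percentage of examinees at or above a cut score (PAC)."""
    return 100.0 * np.mean(np.asarray(scores) >= cut)

# Hypothetical administration with a cut score of 250.
print(pac([210, 250, 265, 280, 300], cut=250))  # 80.0
```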
Gelfand, Jessica T.; Christie, Robert E.; Gelfand, Stanley A. – Journal of Speech, Language, and Hearing Research, 2014
Purpose: Speech recognition may be analyzed in terms of recognition probabilities for perceptual wholes (e.g., words) and parts (e.g., phonemes), where j or the j-factor reveals the number of independent perceptual units required for recognition of the whole (Boothroyd, 1968b; Boothroyd & Nittrouer, 1988; Nittrouer & Boothroyd, 1990). For…
Descriptors: Phonemes, Word Recognition, Vowels, Syllables
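The j-factor logic referenced here is compact enough to state: if recognizing a whole requires recognizing j independent parts, each recognized with probability $P_p$, then

```latex
P_w = P_p^{\,j}
\quad\Longrightarrow\quad
j = \frac{\log P_w}{\log P_p},
```

where $P_w$ and $P_p$ are the recognition probabilities for wholes (e.g., words) and parts (e.g., phonemes). A j close to the number of parts indicates the parts are perceived independently; smaller values indicate that context, such as lexical knowledge, reduces the number of effectively independent perceptual units.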
Anthony, Christopher James; DiPerna, James Clyde – School Psychology Quarterly, 2017
The Academic Competence Evaluation Scales-Teacher Form (ACES-TF; DiPerna & Elliott, 2000) was developed to measure student academic skills and enablers (interpersonal skills, engagement, motivation, and study skills). Although ACES-TF scores have demonstrated psychometric adequacy, the length of the measure may be prohibitive for certain…
Descriptors: Test Items, Efficiency, Item Response Theory, Test Length
Goegan, Lauren D.; Harrison, Gina L. – Learning Disabilities: A Contemporary Journal, 2017
The effects of extended time on the writing performance of university students with learning disabilities (LD) were examined. Thirty-eight students (19 LD; 19 non-LD) completed a collection of cognitive, linguistic, and literacy measures and wrote essays under regular and extended time conditions. Limited evidence was found to support the…
Descriptors: Foreign Countries, Undergraduate Students, Testing Accommodations, Learning Disabilities
Atalay Kabasakal, Kübra; Arsan, Nihan; Gök, Bilge; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2014
This simulation study compared the performances (Type I error and power) of Mantel-Haenszel (MH), SIBTEST, and item response theory-likelihood ratio (IRT-LR) methods under certain conditions. Manipulated factors were sample size, ability differences between groups, test length, the percentage of differential item functioning (DIF), and underlying…
Descriptors: Comparative Analysis, Item Response Theory, Statistical Analysis, Test Bias
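In simulation comparisons like this one, Type I error is the rejection rate on items simulated without DIF and power is the rejection rate on items simulated with DIF; a minimal tallying sketch (names are illustrative):

```python
import numpy as np

def rejection_rates(p_values, dif_simulated, alpha=0.05):
    """Empirical Type I error and power from a DIF simulation study.

    p_values      : p-value for each item-by-replication test
    dif_simulated : True where DIF was actually generated
    """
    reject = np.asarray(p_values) < alpha
    dif = np.asarray(dif_simulated, dtype=bool)
    type_i_error = reject[~dif].mean()  # false positives on DIF-free items
    power = reject[dif].mean()          # true positives on DIF items
    return type_i_error, power
```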
Svetina, Dubravka – Educational and Psychological Measurement, 2013
The purpose of this study was to investigate the effect of complex structure on dimensionality assessment in noncompensatory multidimensional item response models using dimensionality assessment procedures based on DETECT (dimensionality evaluation to enumerate contributing traits) and NOHARM (normal ogive harmonic analysis robust method). Five…
Descriptors: Item Response Theory, Statistical Analysis, Computation, Test Length
Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Descriptors: Sample Size, Test Length, Correlation, Test Format
Liu, Qian – ProQuest LLC, 2011
For this dissertation, four item purification procedures were implemented with the generalized linear mixed model for differential item functioning (DIF) analysis, and their performance was investigated through a series of simulations. Among the four procedures, forward and generalized linear mixed model (GLMM)…
Descriptors: Test Bias, Test Items, Statistical Analysis, Models
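As a rough illustration of what iterative item purification does (using a logistic-regression DIF test as a simple stand-in for the GLMM, so this is a sketch of the general idea rather than Liu's exact procedures):

```python
import numpy as np
import statsmodels.api as sm

def purified_dif_flags(X, group, alpha=0.05, max_iter=10):
    """Iteratively purify the matching score while screening items for DIF.

    X     : examinee-by-item matrix of 0/1 responses
    group : 0/1 group membership for each examinee
    """
    n_items = X.shape[1]
    flagged = np.zeros(n_items, dtype=bool)
    for _ in range(max_iter):
        # Matching score built only from items not currently flagged as DIF.
        match = X[:, ~flagged].sum(axis=1)
        new_flags = np.zeros(n_items, dtype=bool)
        for i in range(n_items):
            design = sm.add_constant(np.column_stack([match, group]))
            fit = sm.Logit(X[:, i], design).fit(disp=0)
            new_flags[i] = fit.pvalues[2] < alpha  # significant group effect
        if np.array_equal(new_flags, flagged):     # purification has converged
            break
        flagged = new_flags
    return flagged
```

The design choice the dissertation compares is precisely *how* the flagged set is updated (e.g., forward selection versus all-at-once re-flagging), which this sketch collapses into a single re-flagging step.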