Showing 286 to 300 of 636 results
Peer reviewed
Sijtsma, Klaas – International Journal of Testing, 2009
This article reviews three topics from test theory that continue to raise discussion and controversy and capture test theorists' and constructors' interest. The first topic concerns the discussion of the methodology of investigating and establishing construct validity; the second topic concerns reliability and its misuse, alternative definitions…
Descriptors: Construct Validity, Reliability, Classification, Test Theory
Peer reviewed
Bulut, Okan; Kan, Adnan – Eurasian Journal of Educational Research, 2012
Problem Statement: Computerized adaptive testing (CAT) is a sophisticated and efficient way of delivering examinations. In CAT, items for each examinee are selected from an item bank based on the examinee's responses to the items. In this way, the difficulty level of the test is adjusted based on the examinee's ability level. Instead of…
Descriptors: Adaptive Testing, Computer Assisted Testing, College Entrance Examinations, Graduate Students
Huo, Yan – ProQuest LLC, 2009
Variable-length computerized adaptive testing (CAT) can provide examinees with tailored test lengths. With the fixed standard error of measurement ("SEM") termination rule, variable-length CAT can achieve predetermined measurement precision by using relatively shorter tests compared to fixed-length CAT. To explore the application of…
Descriptors: Test Length, Test Items, Adaptive Testing, Item Analysis
Peer reviewed
DeMars, Christine E. – Journal of Educational and Behavioral Statistics, 2009
The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…
Descriptors: Regression (Statistics), Test Bias, Error of Measurement, True Scores
Peer reviewed
Cheng, Ying-Yao; Wang, Wen-Chung; Ho, Yi-Hui – Educational and Psychological Measurement, 2009
Educational and psychological tests are often composed of multiple short subtests, each measuring a distinct latent trait. Unfortunately, short subtests suffer from low measurement precision, which makes the bandwidth-fidelity dilemma inevitable. In this study, the authors demonstrate how a multidimensional Rasch analysis can be employed to take…
Descriptors: Item Response Theory, Measurement, Correlation, Measures (Individuals)
Peer reviewed
Fitzpatrick, Anne R. – Educational Measurement: Issues and Practice, 2008
Examined in this study were the effects of reducing anchor test length on student proficiency rates for 12 multiple-choice tests administered in an annual, large-scale, high-stakes assessment. The anchor tests contained 15 items, 10 items, or five items. Five content representative samples of items were drawn at each anchor test length from a…
Descriptors: Test Length, Multiple Choice Tests, Item Sampling, Student Evaluation
Peer reviewed
Liang, Tie; Wells, Craig S. – Educational and Psychological Measurement, 2009
Investigating the fit of a parametric model is an important part of the measurement process when implementing item response theory (IRT), but research examining it is limited. A general nonparametric approach for detecting model misfit, introduced by J. Douglas and A. S. Cohen (2001), has exhibited promising results for the two-parameter logistic…
Descriptors: Sample Size, Nonparametric Statistics, Item Response Theory, Goodness of Fit
Peer reviewed
de la Torre, Jimmy; Song, Hao – Applied Psychological Measurement, 2009
Assessments consisting of different domains (e.g., content areas, objectives) are typically multidimensional in nature but are commonly assumed to be unidimensional for estimation purposes. The different domains of these assessments are further treated as multi-unidimensional tests for the purpose of obtaining diagnostic information. However, when…
Descriptors: Ability, Tests, Item Response Theory, Data Analysis
Peer reviewed
Ackerman, Phillip L.; Kanfer, Ruth – Journal of Experimental Psychology: Applied, 2009
Person and situational determinants of cognitive ability test performance and subjective reactions were examined in the context of tests with different time-on-task requirements. Two hundred thirty-nine first-year university students participated in a within-participant experiment, with completely counterbalanced treatment conditions and test…
Descriptors: Test Length, Fatigue (Biology), Cognitive Ability, College Students
Peer reviewed
Willse, John T.; Goodman, Joshua T. – Educational and Psychological Measurement, 2008
This research provides a direct comparison of effect size estimates based on structural equation modeling (SEM), item response theory (IRT), and raw scores. Differences between the SEM, IRT, and raw score approaches are examined under a variety of data conditions (IRT models underlying the data, test lengths, magnitude of group differences, and…
Descriptors: Test Length, Structural Equation Models, Effect Size, Raw Scores
Peer reviewed
Klockars, Alan J.; Lee, Yoonsun – Journal of Educational Measurement, 2008
Monte Carlo simulations with 20,000 replications are reported to estimate the probability of rejecting the null hypothesis regarding DIF using SIBTEST when there is DIF present and/or when impact is present due to differences on the primary dimension to be measured. Sample sizes are varied from 250 to 2000 and test lengths from 10 to 40 items.…
Descriptors: Test Bias, Test Length, Reference Groups, Probability
Peer reviewed
Choi, Namok; Fuqua, Dale R.; Newman, Jody L. – Educational and Psychological Measurement, 2009
The short form of the Bem Sex Role Inventory (BSRI) contains half as many items as the long form and yet has often demonstrated better reliability and validity. This study uses exploratory and confirmatory factor analytic methods to examine the structure of the short form of the BSRI. A structure noted elsewhere also emerged here, consisting of…
Descriptors: Sex Role, Measures (Individuals), Test Length, Gender Differences
Peer reviewed
Finch, Holmes – Applied Psychological Measurement, 2010
The accuracy of item parameter estimates in the multidimensional item response theory (MIRT) model context has not been researched in great detail. This study examines the ability of two confirmatory factor analysis models specifically for dichotomous data to properly estimate item parameters using common formulae for converting factor…
Descriptors: Item Response Theory, Computation, Factor Analysis, Models
Pennsylvania Department of Education, 2010
This handbook describes the responsibilities of district and school assessment coordinators in the administration of the Pennsylvania System of School Assessment (PSSA). This updated guidebook contains the following sections: (1) General Assessment Guidelines for All Assessments; (2) Writing Specific Guidelines; (3) Reading and Mathematics…
Descriptors: Guidelines, Guides, Educational Assessment, Writing Tests
Peer reviewed
Finkelman, Matthew – Journal of Educational and Behavioral Statistics, 2008
Sequential mastery testing (SMT) has been researched as an efficient alternative to paper-and-pencil testing for pass/fail examinations. One popular method for determining when to cease examination in SMT is the truncated sequential probability ratio test (TSPRT). This article introduces the application of stochastic curtailment in SMT to shorten…
Descriptors: Mastery Tests, Sequential Approach, Computer Assisted Testing, Adaptive Testing