Showing 316 to 330 of 636 results
Peer reviewed
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology
Peer reviewed
Kaplan, Robert; Rothkopf, Ernst Z. – Journal of Educational Psychology, 1974
The effects of number of objective-relevant sentences on learning from texts ranging in length from about 800 to 3,000 words were investigated in two related experiments. (Author)
Descriptors: Learning, Objectives, Sentences, Test Length
Peer reviewed
Liou, Michelle – Applied Psychological Measurement, 1994
A recursive equation is proposed for computing higher order derivatives of elementary symmetric functions in the Rasch model. A simulation study indicates a small loss in accuracy for the proposed formula compared to Gustafsson's sum algorithm (1980) for computing higher order derivatives when tests contain 60 items or fewer. (SLD)
Descriptors: Algorithms, Computation, Item Response Theory, Simulation
Peer reviewed
Netemeyer, Richard G.; Williamson, Donald A.; Burton, Scot; Biswas, Dipayan; Jindal, Supriya; Landreth, Stacy; Mills, Gregory; Primeaux, Sonya – Educational and Psychological Measurement, 2002
Derived shortened versions of the Automatic Thoughts Questionnaire (ATQ) (S. Hollon and P. Kendall, 1980) using samples of 434 and 419 adults. Cross-validation with samples of 163 and 91 adults showed support for the shortened versions. Overall, results suggest that these short forms are useful in measuring cognitions associated with depression.…
Descriptors: Adults, Depression (Psychology), Psychometrics, Test Format
Peer reviewed
Chang, Yuan-chin Ivan – Psychometrika, 2005
In this paper, we apply sequential one-sided confidence interval estimation procedures with beta-protection to adaptive mastery testing. The procedures of fixed-width and fixed proportional accuracy confidence interval estimation can be viewed as extensions of one-sided confidence interval procedures. It can be shown that the adaptive mastery…
Descriptors: Mastery Tests, Probability, Intervals, Testing
Pommerich, Mary – Journal of Technology, Learning, and Assessment, 2007
Computer administered tests are becoming increasingly prevalent as computer technology becomes more readily available on a large scale. For testing programs that utilize both computer and paper administrations, mode effects are problematic in that they can result in examinee scores that are artificially inflated or deflated. As such, researchers…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Format, Scores
Peer reviewed
Silverstein, A. B. – Journal of Clinical Psychology, 1985
The findings of research on short forms of the Wechsler Adult Intelligence Scales-Revised are used to illustrate points about three criteria for evaluating the usefulness of a short form. Results indicate there is little justification for regarding the three criteria as criteria. (Author/BL)
Descriptors: Correlation, Evaluation Criteria, Test Format, Test Interpretation
Peer reviewed
Axelrod, Bradley N.; Abraham, Elizabeth; Paolo, Anthony M. – Assessment, 1997
The full version of the Wisconsin Card Sorting Test (WCST) (S. Heaton and others, 1993) was compared to a 64-card version (WCST-64) using normative data and results from 350 neuropsychological patients. The WCST-64 was found useful only for respondents obtaining five or more categories by the end of the first deck. (SLD)
Descriptors: Diagnostic Tests, Neuropsychology, Norms, Patients
Peer reviewed
Guilmette, Thomas J.; Kennedy, Mary Lynne – Assessment, 1997
The Wide Range Assessment of Memory and Learning (WRAML) (D. Sheslow and W. Adams, 1990) was given to 51 children. The General Memory Index (GMI) of the WRAML was compared with a short form of the WRAML, the Memory Screening Index (MSI). The MSI was higher than the GMI in 41 of 51 cases. (SLD)
Descriptors: Children, Cognitive Tests, Learning, Memory
Peer reviewed
Feldt, Leonard S. – Applied Measurement in Education, 2002
Considers the degree of bias in testlet-based alpha (internal consistency reliability) through hypothetical examples and real test data from four tests of the Iowa Tests of Basic Skills. Presents a simple formula for computing a testlet-based congeneric coefficient. (SLD)
Descriptors: Estimation (Mathematics), Reliability, Statistical Bias, Test Format
Peer reviewed
Campbell, Suzann K.; Wright, Benjamin D.; Linacre, J. Michael – Journal of Applied Measurement, 2002
Conducted a Rasch analysis of the psychometric qualities of the Test of Infant Motor Performance (TIMP; G. Girolami and S. Campbell, 1994) for the purpose of reducing the length of the test while maintaining its precision as a measurement device. Using scores from 1,732 tests, the TIMP was reduced to 42 items to form a functional motor scale for…
Descriptors: Infants, Measures (Individuals), Motion, Psychometrics
Peer reviewed
Kingsbury, G. Gage; Zara, Anthony R. – Applied Measurement in Education, 1989
Several classical approaches and alternative approaches to item selection for computerized adaptive testing (CAT) are reviewed and compared. The study also describes procedures for constrained CAT that may be added to classical item selection approaches to allow them to be used for applied testing. (TJH)
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Construction, Test Length
Peer reviewed
De Champlain, Andre; Gessaroli, Marc E. – Applied Measurement in Education, 1998
Type I error rates and rejection rates for three dimensionality assessment procedures were studied with data sets simulated to reflect short tests and small samples. Results show that the G-squared difference test (D. Bock, R. Gibbons, and E. Muraki, 1988) suffered from a severely inflated Type I error rate under all conditions simulated. (SLD)
Descriptors: Item Response Theory, Matrices, Sample Size, Simulation
Peer reviewed
Sanders, Piet F.; Verschoor, Alfred J. – Applied Psychological Measurement, 1998
Presents minimization and maximization models for parallel test construction under constraints. The minimization model constructs weakly and strongly parallel tests of minimum length, while the maximization model constructs weakly and strongly parallel tests with maximum test reliability. (Author/SLD)
Descriptors: Algorithms, Models, Reliability, Test Construction
Peer reviewed
Monahan, Patrick O.; Stump, Timothy E.; Finch, Holmes; Hambleton, Ronald K. – Applied Psychological Measurement, 2007
DETECT is a nonparametric "full" dimensionality assessment procedure that clusters dichotomously scored items into dimensions and provides a DETECT index of magnitude of multidimensionality. Four factors (test length, sample size, item response theory [IRT] model, and DETECT index) were manipulated in a Monte Carlo study of bias, standard error,…
Descriptors: Test Length, Sample Size, Monte Carlo Methods, Geometric Concepts