Kang, Taehoon; Chen, Troy T. – ACT, Inc., 2007
Orlando and Thissen (2000, 2003) proposed an item-fit index, S-X², for dichotomous item response theory (IRT) models, which has performed better than traditional item-fit statistics such as Yen's (1981) Q₁ and McKinley and Mills' (1985) G². This study extends the utility of S-X² to polytomous…
Descriptors: Item Response Theory, Models, Computer Software, Statistical Analysis
Bay, Luz – 1995
An index is proposed to detect cheating on multiple-choice examinations, and its use is evaluated through simulations. The proposed index is based on the compound binomial distribution. In total, 360 simulated data sets reflecting 12 different cheating (copying) situations were obtained and used for the study of the sensitivity of the index in…
Descriptors: Cheating, Class Size, Identification, Multiple Choice Tests
Peer reviewed: Conger, Anthony J. – Educational and Psychological Measurement, 1983
A paradoxical phenomenon of decreases in reliability as the number of elements averaged over increases is shown to be possible in multifacet reliability procedures (intraclass correlations or generalizability coefficients). Conditions governing this phenomenon are presented along with implications and cautions. (Author)
Descriptors: Generalizability Theory, Test Construction, Test Items, Test Length
Peer reviewed: Allison, Paul A. – Psychometrika, 1976
A direct proof is given for the generalized Spearman-Brown formula for any real multiple of test length. (Author)
Descriptors: Correlation, Error of Measurement, Raw Scores, Test Length
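For orientation, the generalized Spearman-Brown formula referred to in this record is usually written as ρ_k = kρ / (1 + (k − 1)ρ), where ρ is the reliability of the original test and k is the factor by which its length is multiplied. The sketch below is illustrative only and is not taken from the article; the function name `spearman_brown` and its parameter names are assumptions, and k is allowed to be any positive real number, matching the "any real multiple of test length" wording of the abstract.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predict reliability after changing test length by `length_factor`.

    Hypothetical helper (not from the cited article). Uses the standard
    Spearman-Brown prophecy formula: k * rho / (1 + (k - 1) * rho).
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    k, rho = length_factor, reliability
    return k * rho / (1 + (k - 1) * rho)


if __name__ == "__main__":
    # Doubling a test whose reliability is 0.60
    print(spearman_brown(0.60, 2))
    # Halving a test (non-integer k) whose reliability is 0.75
    print(spearman_brown(0.75, 0.5))
```

Under the formula's assumption of parallel items, doubling a test with reliability 0.60 gives a predicted reliability of 0.75, and halving that longer test returns 0.60.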
De Champlain, Andre F.; Gessaroli, Marc E.; Tang, K. Linda; De Champlain, Judy E. – 1998
The empirical Type I error rates of Poly-DIMTEST (H. Li and W. Stout, 1995) and the LISREL8 chi square fit statistic (K. Joreskog and D. Sorbom, 1993) were compared with polytomous unidimensional data sets simulated to vary as a function of test length and sample size. The rejection rates for both statistics were also studied with two-dimensional…
Descriptors: Chi Square, Goodness of Fit, Item Response Theory, Sample Size
Peer reviewed: Silverstein, A. B. – Perceptual and Motor Skills, 1983
Formulas for estimating the validity of random short forms were applied to the standardization data for the Wechsler Adult Intelligence Scale-Revised, the Minnesota Multiphasic Personality Inventory, and the Marlowe-Crowne Social Desirability Scale. These formulas demonstrated how much "better than random" the best short forms of these…
Descriptors: Comparative Analysis, Intelligence Tests, Measures (Individuals), Test Format
Peer reviewed: Stern, Paul C.; Guagnano, Gregory A.; Dietz, Thomas – Educational and Psychological Measurement, 1998
A brief version of the instrument developed by S. Schwartz (1992, 1994) to measure the structure and content of human values was developed. Studies with 199 adults and 420 adults support the reliability of scores produced by the brief inventory's four three-item scales. Uses of the brief form are discussed. (SLD)
Descriptors: Adults, Reliability, Scores, Test Construction
Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Psychological Methods, 2007
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level,…
Descriptors: Psychiatry, Patients, Error of Measurement, Test Length
Peer reviewed: Modjeski, Richard B.; Michael, William B. – Educational and Psychological Measurement, 1978
The General Education Performance Index (GEPI) is a comparatively short test covering the same content as the General Educational Development Test (GED), which takes ten hours to administer. Correlations of the subtests of the GEPI with the GED ranged from .28 to .57. (JKS)
Descriptors: Correlation, Equivalency Tests, Military Personnel, Statistical Data
Kennedy, Robert L.; McCallister, Corliss J. – 2000
The purpose of this study was to investigate the relationship between the scores students earned on their statistics final examinations and the number of minutes students required to complete the exams. In a previous study, K. Bridges (1985) extended the range of interest in this relationship from a single study to a course-based series, examining…
Descriptors: College Students, Higher Education, Scores, Statistics
Peer reviewed: Eisenstein, Norman; Engelhart, Charles I. – Psychological Assessment, 1997
The Kaufman Brief Intelligence Test (K-BIT) (A. S. Kaufman and N. L. Kaufman, 1990) was compared with short forms of the Wechsler Adult Intelligence Scale-Revised (WAIS-R) using results from 64 referrals to a neuropsychology service. Advantages of each test are noted and their use discussed. (SLD)
Descriptors: Adults, Comparative Analysis, Intelligence Tests, Neuropsychology
Peer reviewed: Sunathong, Surintorn; Schumacker, Randall E.; Beyerlein, Michael M. – Journal of Applied Measurement, 2000
Studied five factors that can affect the equating of scores from two tests onto a common score scale through the simulation and equating of 4,860 item data sets. Findings indicate three statistically significant two-way interactions for common item length and test length, item difficulty standard deviation and item distribution type, and item…
Descriptors: Difficulty Level, Equated Scores, Interaction, Item Response Theory
Peer reviewed: Lewis, Charles; Sheehan, Kathleen – Machine-Mediated Learning, 1988
Introduces a theoretical framework for mastery testing, using Item Response Theory and Bayesian Decision Theory. The idea of sequential testing is developed, with the goal of providing longer or shorter tests as needed, and a computerized application to a hypothetical professional knowledge examination is discussed. (Author/LRW)
Descriptors: Computer Assisted Testing, Licensing Examinations (Professions), Mastery Tests, Psychometrics
Peer reviewed: Colliver, Jerry A.; And Others – Academic Medicine, 1992
A study investigated optimal length of screening tests used to sort out medical students needing to take a full-length performance-based standardized-patient test from those not needing it. Receiver operating characteristic analysis determined a good length is one-third the full test, with cutoff just above the mean case pass level. (Author/MSE)
Descriptors: Higher Education, Medical Education, Patients, Professional Education
Peer reviewed: Thompson, Anthony; Browne, Janet; Schmidt, Fred; Boer, Marian – Assessment, 1997
The validity of a four-subtest short form of the third edition of the Wechsler Intelligence Scale for Children (WISC-III) and the Kaufman Brief Intelligence Test (K-BIT) was evaluated with 42 adolescent offenders. Findings support the clinical use of the short form as a good estimate of WISC-III full-scale IQ. (SLD)
Descriptors: Adolescents, Criminals, Delinquency, Intelligence Quotient

