Walsh, D. M.; Finwall, J.; Touchette, P. E.; McGregor, M. R.; Fernandez, G. E.; Lott, I. T.; Sandman, C. A. – Journal of Intellectual Disability Research, 2007
Background: Most standardized intelligence tests require more than 1 hour for administration, which is problematic when evaluating individuals with intellectual disabilities and developmental disabilities (IDDD), because a significant proportion of these individuals cannot tolerate lengthy evaluations. Furthermore, most standardized intelligence…
Descriptors: Cognitive Ability, Standardized Tests, Developmental Disabilities, Mental Retardation
Wollack, James A. – Applied Measurement in Education, 2006
Many of the currently available statistical indexes to detect answer copying lack sufficient power at small [alpha] levels or when the amount of copying is relatively small. Furthermore, there is no one index that is uniformly best. Depending on the type or amount of copying, certain indexes are better than others. The purpose of this article was…
Descriptors: Statistical Analysis, Item Analysis, Test Length, Sample Size
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2006
Many academic tests (e.g. short-answer and multiple-choice) sample required knowledge with questions scoring 0 or 1 (dichotomous scoring). Few textbooks give useful guidance on the length of test needed to do this reliably. Posey's binomial error model of 1932 provides the best starting point, but allows neither for heterogeneity of question…
Descriptors: Item Sampling, Tests, Test Length, Test Reliability
Livingston, Samuel A.; Lewis, Charles – 1993
This paper presents a method for estimating the accuracy and consistency of classifications based on test scores. The scores can be produced by any scoring method, including the formation of a weighted composite. The estimates use data from a single form. The reliability of the score is used to estimate its effective test length in terms of…
Descriptors: Classification, Error of Measurement, Estimation (Mathematics), Reliability
Gershon, Richard C.; Bergstrom, Betty – 1991
The relationship of several individual differences variables to Computer Adaptive Testing (CAT) as compared with traditional written tests is explored. Seven hundred sixty-five examinees took a Computer Adaptive Test and two fixed-length written tests. Each examinee also answered a computer literacy inventory, a satisfaction questionnaire, and a…
Descriptors: Adaptive Testing, Adults, Computer Assisted Testing, Computer Literacy
Wingersky, Marilyn S. – 1989
In a variable-length adaptive test with a stopping rule that relied on the asymptotic standard error of measurement of the examinee's estimated true score, M. S. Stocking (1987) discovered that it was sufficient to know the examinee's true score and the number of items administered to predict with some accuracy whether an examinee's true score was…
Descriptors: Adaptive Testing, Bayesian Statistics, Error of Measurement, Estimation (Mathematics)
Livingston, Samuel A. – 1981
The standard error of measurement (SEM) is a measure of the inconsistency in the scores of a particular group of test-takers. It is largest for test-takers with scores ranging in the 50 percent correct bracket; with nearly perfect scores, it is smaller. On tests used to make pass/fail decisions, the test-takers' scores tend to cluster in the range…
Descriptors: Error of Measurement, Estimation (Mathematics), Mathematical Formulas, Pass Fail Grading
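The pattern described in the abstract above (error largest near 50 percent correct, smaller near perfect scores) can be illustrated with a minimal sketch. It assumes Lord's binomial-error formula for the conditional SEM of a raw score, sqrt(x(n - x)/(n - 1)); this is a swapped-in textbook formula and not necessarily the one used in the paper. The function name `binomial_sem` and the 50-item test are hypothetical.

```python
import math


def binomial_sem(num_correct: int, num_items: int) -> float:
    """Conditional SEM (raw-score units) under the binomial error model.

    Assumption: Lord's formula sqrt(x * (n - x) / (n - 1)), where x is the
    number-correct score and n is the number of dichotomously scored items.
    """
    if num_items < 2:
        raise ValueError("num_items must be at least 2")
    if not 0 <= num_correct <= num_items:
        raise ValueError("num_correct must lie between 0 and num_items")
    return math.sqrt(num_correct * (num_items - num_correct) / (num_items - 1))


if __name__ == "__main__":
    n = 50  # hypothetical test length
    for x in (25, 40, 48, 50):
        # The SEM shrinks as the score moves away from 50 percent correct.
        print(f"{x}/{n} correct: SEM = {binomial_sem(x, n):.2f}")
```

Under this assumption, a pass/fail cut score set near the middle of the score range falls where individual scores are least precise, which is one way to read the concern the abstract raises.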
Wilcox, Rand R. – 1980
Concern about passing those examinees who should pass, and retaining those who need remedial work, is one problem related to criterion-referenced testing. This paper deals with one aspect of that problem. When determining how many items to include on a criterion-referenced test, practitioners must resolve various non-statistical issues before a…
Descriptors: Bayesian Statistics, Criterion Referenced Tests, Latent Trait Theory, Mathematical Models
Newmark, Charles S.; And Others – Journal of Clinical Psychology, 1976 (peer reviewed)
The present study investigated the comparative interpretive efficacy of the Faschingbauer Abbreviated MMPI (FAM) and standard MMPI with a sample of psychiatric inpatients. A secondary goal was to estimate the functioning effective test length of the FAM as compared to the standard MMPI. (Author/RK)
Descriptors: Measurement Instruments, Patients, Psychological Studies, Psychological Testing
Owen, Steven V.; Froman, Robin D. – Educational and Psychological Measurement, 1987 (peer reviewed)
To test further for efficacy of three-option achievement items, parallel three- and five-option item tests were distributed randomly to college students. Results showed no differences in mean item difficulty, mean discrimination or total test score, but a substantial reduction in time spent on three-option items. (Author/BS)
Descriptors: Achievement Tests, Higher Education, Multiple Choice Tests, Test Format
Gallucci, Nicholas T. – Educational and Psychological Measurement, 1986 (peer reviewed)
This study evaluated the degree to which 102 undergraduate participants objected to questions on the Minnesota Multiphasic Personality Inventory (MMPI) which referred to sex, religion, bladder and bowel functions, family relationships, and unusual thinking in comparison to degree of objection to length of the MMPI and repetition of questions.…
Descriptors: College Students, Higher Education, Personality Measures, Psychological Evaluation
Patsula, Liane N.; Gessaroli, Marc E. – 1995
Among the most popular techniques used to estimate item response theory (IRT) parameters are those used in the LOGIST and BILOG computer programs. Because of its accuracy with smaller sample sizes or differing test lengths, BILOG has become the standard to which new estimation programs are compared. However, BILOG is still complex and…
Descriptors: Comparative Analysis, Effect Size, Estimation (Mathematics), Item Response Theory
Schulz, E. Matthew; Wang, Lin – 2001
In this study, items were drawn from a full-length test of 30 items in order to construct shorter tests for the purpose of making accurate pass/fail classifications with regard to a specific criterion point on the latent ability metric. A three-parameter Item Response Theory (IRT) framework was used. The criterion point on the latent ability…
Descriptors: Ability, Classification, Item Response Theory, Pass Fail Grading
Millman, Jason – Review of Educational Research, 1973 (peer reviewed)
Procedures for establishing standards and determining the number of items needed in criterion referenced measures were reviewed. Discussion of setting a passing score was organized around: performance of others, item content, educational consequences, psychological and financial costs, and error due to guessing and item sampling. (Author)
Descriptors: Criterion Referenced Tests, Educational Research, Literature Reviews, Measurement Techniques
Donders, Jacques – Psychological Assessment, 1997 (peer reviewed)
Eight subtests were selected from the Wechsler Intelligence Scale for Children--Third Edition (WISC-III) to make a short form for clinical use. Results with the 2,200 children from the WISC-III standardization sample indicated the adequate reliability and validity of the short form for clinical use. (SLD)
Descriptors: Children, Clinical Diagnosis, Intelligence Tests, Test Format

