Publication Date
| In 2026 | 0 |
| Since 2025 | 2 |
| Since 2022 (last 5 years) | 12 |
| Since 2017 (last 10 years) | 28 |
| Since 2007 (last 20 years) | 48 |
Descriptor
| Test Construction | 150 |
| Test Length | 150 |
| Test Items | 66 |
| Test Validity | 47 |
| Test Reliability | 42 |
| Computer Assisted Testing | 32 |
| Test Format | 27 |
| Adaptive Testing | 26 |
| Item Banks | 21 |
| Psychometrics | 20 |
| Testing Problems | 20 |
Author
| Hambleton, Ronald K. | 12 |
| Wainer, Howard | 5 |
| Reckase, Mark D. | 4 |
| Berk, Ronald A. | 3 |
| Wilcox, Rand R. | 3 |
| Sijtsma, Klaas | 2 |
| Thissen, David | 2 |
| Abrams, Matthew | 1 |
| Ang, Cheng | 1 |
| Anil, Duygu | 1 |
| Arbet, Scott E. | 1 |
Audience
| Researchers | 5 |
| Practitioners | 2 |
| Administrators | 1 |
Location
| Australia | 2 |
| Canada | 2 |
| China | 2 |
| Ireland | 2 |
| Singapore | 2 |
| United Kingdom | 2 |
| Asia | 1 |
| Germany | 1 |
| Israel | 1 |
| Italy | 1 |
| Japan | 1 |
Laws, Policies, & Programs
| Race to the Top | 1 |
van der Linden, Wim J. – Evaluation in Education: International Progress, 1982
In mastery testing a linear relationship between an optimal passing score and test length is presented with a new optimization criterion. The usual indifference zone approach, a binomial error model, decision errors, and corrections for guessing are discussed. Related results in sequential testing and the latent class approach are included. (CM)
Descriptors: Cutting Scores, Educational Testing, Mastery Tests, Mathematical Models
Peer reviewed: Kafry, Ditsa; And Others – Applied Psychological Measurement, 1979
A series of behavioral expectation scale applications were analyzed in an attempt to point out an appropriate number of dimensions to be included in such studies. Results reflected the problems of dimension interdependence when the number of dimensions exceeds nine. (Author/JKS)
Descriptors: Behavior Rating Scales, Expectation, Factor Analysis, Higher Education
Peer reviewed: Sher, Kenneth J.; And Others – Psychological Assessment, 1995
Interrelated analyses were conducted with more than 4,000 college students to examine the reliability and validity of the Tridimensional Personality Questionnaire (TPQ) and to develop and validate a short version of the scale. Results provide moderate support for the reliability and validity of both the TPQ and the short form. (SLD)
Descriptors: College Students, Factor Analysis, Higher Education, Personality Assessment
Wang, Xiang Bo – College Board, 2007
This research examines the effect of increased testing time by comparing the four performance indices of randomly equivalent examinee subpopulations on sections of similar content and difficulty administered at different times on three SAT administrations. A variety of analyses were used in this study and found no evidence that the current SAT…
Descriptors: College Entrance Examinations, Thinking Skills, High School Students, Test Length
Kunce, Charles S.; Arbet, Scott E. – 1994
The National Conference of Bar Examiners commissioned American College Testing, Inc., to help them in the development and evaluation of a performance test for use in bar admissions decisions. Because it was recognized that candidate perceptions would provide valuable information, a candidate-perception questionnaire was developed to be…
Descriptors: Attitudes, Demography, Languages, Lawyers
Haladyna, Tom; Roid, Gale – 1981
Two approaches to criterion-referenced test construction are compared. Classical test theory is based on the practice of random sampling from a well-defined domain of test items; latent trait theory suggests that the difficulty of the items should be matched to the achievement level of the student. In addition to these two methods of test…
Descriptors: Criterion Referenced Tests, Error of Measurement, Latent Trait Theory, Test Construction
Myers, Charles T. – 1978
The viewpoint is expressed that adding to test reliability by either selecting a more homogeneous set of items, restricting the range of item difficulty as closely as possible to the most efficient level, or increasing the number of items will not add to test validity and that there is considerable danger that efforts to increase reliability may…
Descriptors: Achievement Tests, Item Analysis, Multiple Choice Tests, Test Construction
Peer reviewed: Hambleton, Ronald K.; De Gruijter, Dato N. M. – Journal of Educational Measurement, 1983
Addressing the shortcomings of classical item statistics for selecting criterion-referenced test items, this paper describes an optimal item selection procedure utilizing item response theory (IRT) and offers examples in which random selection and optimal item selection methods are compared. Theoretical advantages of optimal selection based upon…
Descriptors: Criterion Referenced Tests, Cutting Scores, Item Banks, Latent Trait Theory
Davey, Tim; Pommerich, Mary; Thompson, Tony D. – 1999
In computerized adaptive testing (CAT), new or experimental items are frequently administered alongside operational tests to gather the pretest data needed to replenish and replace item pools. The two basic strategies used to combine pretest and operational items are embedding and appending. Variable-length CATs are preferred because of the…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Measurement Techniques
Peer reviewed: Wainer, Howard; And Others – Journal of Educational Measurement, 1992
Computer simulations were run to measure the relationship between testlet validity and factors of item pool size and testlet length for both adaptive and linearly constructed testlets. Making a testlet adaptive yields only modest increases in aggregate validity because of the peakedness of the typical proficiency distribution. (Author/SLD)
Descriptors: Adaptive Testing, Comparative Testing, Computer Assisted Testing, Computer Simulation
Frick, Theodore W. – 1991
Expert systems can be used to aid decisionmaking. A computerized adaptive test is one kind of expert system, although not commonly recognized as such. A new approach, termed EXSPRT, was devised that combines expert systems reasoning and sequential probability ratio test stopping rules. Two versions of EXSPRT were developed, one with random…
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Expert Systems
Mills, Craig N.; Simon, Robert – 1981
When criterion-referenced tests are used to assign examinees to states reflecting their performance level on a test, the better known methods for determining test length, which consider relationships among domain scores and errors of measurement, have their limitations. The purpose of this paper is to present a computer system named TESTLEN, which…
Descriptors: Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores, Error of Measurement
Hambleton, Ronald K.; Cook, Linda L. – 1978
The purpose of the present research was to study, systematically, the "goodness-of-fit" of the one-, two-, and three-parameter logistic models. We studied, using computer-simulated test data, the effects of four variables: variation in item discrimination parameters, the average value of the pseudo-chance level parameters, test length,…
Descriptors: Career Development, Difficulty Level, Goodness of Fit, Item Analysis
Peer reviewed: Hill, Kennedy T.; Wigfield, Allan – Elementary School Journal, 1984
Discusses the problem of and solution to anxiety in school testing situations. Focuses on Hill and his colleagues' long term program of research. Describes school intervention studies where new evaluation procedures and teaching programs have been developed to help students perform better in evaluative situations. (CB)
Descriptors: Elementary School Students, Elementary Secondary Education, Grades (Scholastic), Intervention
Schedl, Mary; And Others – 1995
The Test of English as a Foreign Language (TOEFL) program is exploring a change in Section 3 of the TOEFL test that would replace the vocabulary subpart with additional reading comprehension questions. This study investigated the proposed revision in terms of the length and timing that would be necessary to address concerns of test speededness of…
Descriptors: Adult Students, English (Second Language), Language Tests, Psychometrics


