Showing 61 to 75 of 133 results
Peer reviewed
Livingston, Samuel A.; Lewis, Charles – Journal of Educational Measurement, 1995
A method is presented for estimating the accuracy and consistency of classifications based on test scores. The reliability of the score is used to estimate effective test length in terms of discrete items. The true-score distribution is estimated by fitting a four-parameter beta model. (SLD)
Descriptors: Classification, Estimation (Mathematics), Scores, Statistical Distributions
Peer reviewed
Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
Peer reviewed
PDF on ERIC
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Peer reviewed
Donders, Jacques – Psychological Assessment, 1997
Eight subtests were selected from the Wechsler Intelligence Scale for Children--Third Edition (WISC-III) to make a short form for clinical use. Results with the 2,200 children from the WISC-III standardization sample indicated the adequate reliability and validity of the short form for clinical use. (SLD)
Descriptors: Children, Clinical Diagnosis, Intelligence Tests, Test Format
Peer reviewed
Axelrod, Bradley N.; And Others – Psychological Assessment, 1996
The calculations of D. Schretlen, R. H. B. Benedict, and J. H. Bobholz (1994) for the reliabilities of a short form of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) consistently overestimated the values. More accurate values are provided for the WAIS-R and a seven-subtest short form. (SLD)
Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests
Peer reviewed
Direct link
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2006
Many academic tests (e.g. short-answer and multiple-choice) sample required knowledge with questions scoring 0 or 1 (dichotomous scoring). Few textbooks give useful guidance on the length of test needed to do this reliably. Posey's binomial error model of 1932 provides the best starting point, but allows neither for heterogeneity of question…
Descriptors: Item Sampling, Tests, Test Length, Test Reliability
Peer reviewed
Rowley, Glenn – Journal of Educational Measurement, 1978
The reliabilities of various observational measures were determined, and the influence of both the number and the length of the observation periods on reliability was examined, both separately and jointly. A single simplifying assumption leads to a variant of the Spearman-Brown formula, which may have wider application. (Author/CTM)
Descriptors: Career Development, Classroom Observation Techniques, Observation, Reliability
Peer reviewed
Nelson, W. M., III; And Others – Journal of Personality Assessment, 1978
This study used 126 young adult black and white male inmates to test the comparability of the Pauker and the Satz-Mogel short forms with the standard Wechsler Adult Intelligence Scale (WAIS). The Pauker form was superior with this population. Findings should not be generalized to other ages, races, or to women. (Author/CP)
Descriptors: Intelligence, Intelligence Differences, Intelligence Tests, Males
Peer reviewed
Budescu, David – Journal of Educational Measurement, 1985
An important determinant of equating process efficiency is the correlation between the anchor test and components of each form. Use of some monotonic function of this correlation as a measure of equating efficiency is suggested. A model relating anchor test length and test reliability to this measure of efficiency is presented. (Author/DWH)
Descriptors: Correlation, Equated Scores, Mathematical Models, Standardized Tests
Peer reviewed
Cliff, Norman; And Others – Applied Psychological Measurement, 1979
Monte Carlo research with TAILOR, a program using implied orders as a basis for tailored testing, is reported. TAILOR typically required about half the available items to estimate, for each simulated examinee, the responses on the remainder. (Author/CTM)
Descriptors: Adaptive Testing, Computer Programs, Item Sampling, Nonparametric Statistics
Peer reviewed
Meijer, Rob R.; And Others – Applied Psychological Measurement, 1994
The power of the nonparametric person-fit statistic, U3, is investigated through simulations as a function of item characteristics, test characteristics, person characteristics, and the group to which examinees belong. Results suggest conditions under which relatively short tests can be used for person-fit analysis. (SLD)
Descriptors: Difficulty Level, Group Membership, Item Response Theory, Nonparametric Statistics
McCormick, Douglas J. – 1978
Tailored testing increases the efficiency of tests by individually selecting for each person a set of items from an item pool so that the difficulty of the items selected will be such as to maximize the information provided by the score. The tailored testing procedure designed by Cliff orders persons and items on a common ordinal scale and…
Descriptors: Adaptive Testing, Branching, Computer Assisted Testing, Computer Programs
Peer reviewed
Munson, J. Michael; McQuarrie, Edward F. – Educational and Psychological Measurement, 1987
A shortened version of Zaichkowsky's 20-item Personal Involvement Inventory was created by removing four items that might be difficult for noncollege-educated populations to understand. The 16-item modified version had acceptable internal consistency, test-retest reliability, and factorial and predictive validity. (Author/GDC)
Descriptors: Factor Structure, Higher Education, Interest Inventories, Personality Measures
Peer reviewed
Willson, Victor L.; Reynolds, Cecil R. – Educational and Psychological Measurement, 1985
Techniques for constructing short forms of tests are discussed, and an example is given using the Wechsler Adult Intelligence Scale-Revised. Reliability and validity estimation equations are presented. (GDC)
Descriptors: Adults, Individual Testing, Intelligence Tests, Norm Referenced Tests
Peer reviewed
Kipps, Debi; Hanson, Dave – School Psychology Review, 1983
The Peabody Picture Vocabulary Test-Revised (Dunn and Dunn) is described as a convenient, quick test, possessing improvements over the original. It measures a subject's receptive (hearing) vocabulary for Standard American English. However, the validity information for the test is less than adequate, since no validity studies are presented for it.…
Descriptors: Auditory Tests, Individual Testing, Scores, Test Length