Showing 391 to 405 of 636 results
Peer reviewed
Yen, Wendy M.; Candell, Gregory L. – Applied Measurement in Education, 1991
Empirical reliabilities of scores based on item-pattern scoring, using 3-parameter item response theory, and on number-correct scoring were compared within each of 5 score metrics for 5 content areas, using samples of at least 900 elementary school students. Item-pattern scoring produced average increases in reliability. (SLD)
Descriptors: Elementary Education, Elementary School Students, Grade Equivalent Scores, Item Response Theory
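The 3-parameter model behind the item-pattern scoring above has a standard closed form. As a rough illustration (the function name and parameter values here are illustrative, not taken from the study):

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response: guessing floor c,
    discrimination a, difficulty b, examinee ability theta."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Item-pattern scoring weights each response through these probabilities
# (via the likelihood), whereas number-correct scoring discards which
# particular items were answered correctly.
p = p_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)   # ability equal to difficulty -> 0.6
```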
Peer reviewed
Sindhu, R. S.; Sharma, Reeta – Science Education International, 1999
Finds that the time required to attempt all the test items of each question paper in a four-paper sample was inversely proportional to the percentage of students who attempted all the test items of that paper. Extrapolates results to give guidelines for determining the feasibility of newly-developed exam papers. (WRM)
Descriptors: Science Tests, Secondary Education, Test Construction, Test Length
Peer reviewed
Ward, L. Charles; Ryan, Joseph J. – Psychological Assessment, 1996
Validity and reliability were calculated from data in the standardization sample of the Wechsler Adult Intelligence Scale--Revised for 565 proposed short forms. Time saved in comparison with use of the long form was estimated. The most efficient combinations were generally those composed of subtests that were quick to administer. (SLD)
Descriptors: Cost Effectiveness, Intelligence Tests, Selection, Test Format
Peer reviewed
Axelrod, Bradley N.; And Others – Psychological Assessment, 1996
The calculations of D. Schretlen, R. H. B. Benedict, and J. H. Bobholz for the reliabilities of a short form of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) (1994) consistently overestimated the values. More accurate values are provided for the WAIS--R and a seven-subtest short form. (SLD)
Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests
Peer reviewed
Whitmore, Marjorie L.; Schumacker, Randall E. – Educational and Psychological Measurement, 1999
Compared differential item functioning detection rates for logistic regression and analysis of variance for dichotomously scored items using simulated data and varying test length, sample size, discrimination rate, and underlying ability. Explains why the logistic regression method is recommended for most applications. (SLD)
Descriptors: Ability, Analysis of Variance, Comparative Analysis, Item Bias
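The logistic-regression DIF procedure recommended above regresses item responses on total score and group membership, then tests the group term with a likelihood-ratio statistic. A minimal stdlib-only sketch of that idea, with made-up toy data and a plain gradient-ascent fitter (not the authors' simulation design):

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, iters=2000, lr=0.3):
    """Plain gradient-ascent fit of a logistic model;
    returns the weights and the attained log-likelihood."""
    w = [0.0] * len(X[0])
    for _ in range(iters):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = _sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j, xj in enumerate(xi):
                grad[j] += (yi - p) * xj
        w = [wj + lr * gj / len(y) for wj, gj in zip(w, grad)]
    ll = 0.0
    for xi, yi in zip(X, y):
        p = _sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return w, ll

# Toy data rows: features (intercept, standardized total score, group), response.
rows = [
    ([1, -1.0, 0], 0), ([1, -0.5, 0], 0), ([1, 0.0, 0], 0),
    ([1, 0.5, 0], 1), ([1, 1.0, 0], 1),          # reference group
    ([1, -1.0, 1], 0), ([1, -0.5, 1], 1), ([1, 0.0, 1], 1),
    ([1, 0.5, 1], 1), ([1, 1.0, 1], 1),          # focal group: item is easier
]
y = [r for _, r in rows]
_, ll_full = fit_logistic([x for x, _ in rows], y)          # ability + group
_, ll_reduced = fit_logistic([x[:2] for x, _ in rows], y)   # ability only
dif_chi2 = 2.0 * (ll_full - ll_reduced)   # ~ chi-square(1) under the no-DIF null
```

A large value of `dif_chi2` flags the item as functioning differently across groups after conditioning on ability.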
Peer reviewed
Hendrawan, Irene; Glas, Cees A. W.; Meijer, Rob R. – Applied Psychological Measurement, 2005
The effect of person misfit to an item response theory model on a mastery/nonmastery decision was investigated, along with whether classification precision can be improved by identifying misfitting respondents with person-fit statistics. A simulation study was conducted to investigate the probability of a correct…
Descriptors: Probability, Statistics, Test Length, Simulation
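The abstract above mentions person-fit statistics without naming one; a common choice of this kind is the standardized log-likelihood statistic l_z. A sketch, with hypothetical probabilities and response patterns:

```python
import math

def lz(responses, probs):
    """Standardized log-likelihood person-fit statistic (l_z).
    Large negative values flag response patterns that misfit the model."""
    l0 = sum(u * math.log(p) + (1 - u) * math.log(1 - p)
             for u, p in zip(responses, probs))
    mean = sum(p * math.log(p) + (1 - p) * math.log(1 - p) for p in probs)
    var = sum(p * (1 - p) * math.log(p / (1 - p)) ** 2 for p in probs)
    return (l0 - mean) / math.sqrt(var)

probs = [0.9, 0.7, 0.5, 0.3, 0.1]   # model-implied success probabilities
consistent = [1, 1, 1, 0, 0]        # pattern in line with the model
aberrant = [0, 0, 0, 1, 1]          # reversed, misfitting pattern
```

Screening out examinees with strongly negative l_z before classifying them is one way to pursue the improvement in classification precision the study investigates.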
De Champlain, Andre – 1996
The usefulness of a goodness-of-fit index proposed by R. P. McDonald (1989) was investigated with regard to assessing the dimensionality of item response matrices. The m_k index, which is based on an estimate of the noncentrality parameter of the noncentral chi-square distribution, possesses several advantages over traditional tests of…
Descriptors: Chi Square, Cutting Scores, Goodness of Fit, Item Response Theory
Yamamoto, Kentaro – 1995
The traditional indicator of test speededness, missing responses, clearly signals a lack of time to respond, but it is inadequate for evaluating speededness in a multiple-choice test scored as number correct, and it underestimates test speededness. Conventional item response theory (IRT) parameter…
Descriptors: Ability, Estimation (Mathematics), Item Response Theory, Multiple Choice Tests
Samejima, Fumiko; Changas, Paul S. – 1981
Methods and approaches for estimating the operating characteristics of discrete item responses, without assuming any mathematical form, have been developed and expanded. It is now possible, even when the test information function of a given test is not constant over the ability interval of interest, to use it as the Old Test.…
Descriptors: Adaptive Testing, Latent Trait Theory, Mathematical Models, Methods
Peer reviewed
Rowley, Glenn – Journal of Educational Measurement, 1978
The reliabilities of various observational measures were determined, and the influence of both the number and the length of the observation periods on reliability was examined, both separately and jointly. A single simplifying assumption leads to a variant of the Spearman-Brown formula, which may have wider application. (Author/CTM)
Descriptors: Career Development, Classroom Observation Techniques, Observation, Reliability
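Rowley's variant itself is not reproduced in the abstract, but the classical Spearman-Brown projection it builds on is straightforward to compute (the numbers below are illustrative only):

```python
def spearman_brown(rel, k):
    """Projected reliability when a measure (here, an observation schedule)
    is lengthened by a factor of k; k < 1 models shortening."""
    return k * rel / (1.0 + (k - 1.0) * rel)

# Doubling a schedule whose single-unit reliability is 0.50
# projects to about 0.67; quadrupling projects to 0.80.
projected = spearman_brown(0.50, 2)
```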
Peer reviewed
Hambleton, Ronald K. – Educational and Psychological Measurement, 1987
This paper presents an algorithm for determining the number of items to measure each objective in a criterion-referenced test when testing time is fixed and when the objectives vary in their levels of importance, reliability, and validity. Results of four special applications of the algorithm are presented. (BS)
Descriptors: Algorithms, Behavioral Objectives, Criterion Referenced Tests, Test Construction
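Hambleton's algorithm is not given in the abstract; as a generic illustration of the underlying idea, here is a largest-remainder allocation of a fixed item budget across objectives in proportion to importance weights (function name and weights are hypothetical):

```python
def allocate_items(total_items, weights):
    """Split a fixed item budget across objectives in proportion to weights;
    largest-remainder rounding keeps the total exact."""
    s = sum(weights)
    raw = [total_items * w / s for w in weights]
    counts = [int(r) for r in raw]
    leftover = total_items - sum(counts)
    # hand leftover items to the objectives with the largest fractional parts
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts

allocation = allocate_items(20, [3, 2, 1])   # -> [10, 7, 3]
```

In the study itself the weights would also fold in each objective's reliability and validity, not importance alone.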
De Champlain, Andre F. – 1999
The purpose of this study was to examine empirical Type I error rates and rejection rates for three dimensionality assessment procedures with data sets simulated to reflect short tests and small samples. The TESTFACT G^2 difference test suffered from an inflated Type I error rate with unidimensional data sets, while the approximate chi…
Descriptors: Admission (School), College Entrance Examinations, Item Response Theory, Law Schools
Peer reviewed
Spineti, John P.; Hambleton, Ronald K. – Educational and Psychological Measurement, 1977
The effectiveness of various tailored testing strategies for use in objective-based instructional programs was investigated. Three factors of a tailored testing strategy were studied with various hypothetical ability distributions across two learning hierarchies: test length, mastery cutting score, and starting point. (Author/JKS)
Descriptors: Adaptive Testing, Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores
Peer reviewed
Kim, Seock-Ho; And Others – Psychometrika, 1994
Hierarchical Bayes procedures for the two-parameter logistic item response model were compared for estimating item and ability parameters, using two joint and two marginal Bayesian procedures. Marginal procedures yielded smaller root mean square differences for item and ability estimates, but results for larger sample sizes and test lengths were similar.…
Descriptors: Ability, Bayesian Statistics, Computer Simulation, Estimation (Mathematics)
Peer reviewed
Smith, Renee L.; And Others – Psychological Assessment, 1995
The clinical utility of using fewer than 12 trials of the Selective Reminding Test, a task to assess verbal memory, was studied with 100 cardiac patients and 100 brain injury patients. Results suggest that as few as 6 trials might be adequate, providing information consistent with that from 12 trials. (SLD)
Descriptors: Clinical Diagnosis, Diagnostic Tests, Head Injuries, Memory