Showing 361 to 375 of 636 results
Peer reviewed
Serlin, Ronald C.; Kaiser, Henry F. – Educational and Psychological Measurement, 1978
When multiple-choice tests are scored in the usual manner, giving each correct answer one point, information concerning response patterns is lost. A method for utilizing this information is suggested. An example is presented and compared with two conventional methods of scoring. (Author/JKS)
Descriptors: Correlation, Factor Analysis, Item Analysis, Multiple Choice Tests
Peer reviewed
Taylor, James B. – Educational and Psychological Measurement, 1977
The reliability and item homogeneity of personality scales are in part dependent on the content domain being sampled, and this characteristic reliability cannot be explained by item ambiguity or scale length. It is suggested that clarity of self concept is also a determinant. (Author/JKS)
Descriptors: Item Analysis, Personality Assessment, Personality Measures, Personality Theories
Peer reviewed
Strommen, Erik F.; Smith, Jeffrey K. – Educational and Psychological Measurement, 1987
The internal consistency of the Goodenough-Harris Draw-A-Person Test was examined using 150 children, aged 5-8. The 72-item full scales showed good internal consistency at all ages, with no sex differences. Administration of a 42-item short form resulted in sex effects and differential internal consistency. (Author/GDC)
Descriptors: Freehand Drawing, Primary Education, Sex Differences, Test Bias
Peer reviewed
Safrit, Margaret J.; And Others – Research Quarterly for Exercise and Sport, 1985
Constraints on criterion-referenced tests to make mastery/nonmastery classifications of motor skills can lead to excessively long tests. A sequential probability ratio test classified many subjects' golf shots quickly but required many trials for four subjects. The test's classification accuracy makes it a potentially useful device for physical…
Descriptors: Criterion Referenced Tests, Golf, Higher Education, Mastery Tests
Peer reviewed
Sudman, Seymour; Bradburn, Norman – New Directions for Program Evaluation, 1984
Situations in which mailed questionnaires are most appropriate are identified. Population variables, characteristics of questionnaires, and social desirability variables are examined in depth. (Author)
Descriptors: Attitude Measures, Evaluation Methods, Program Evaluation, Research Methodology
Neustel, Sandra – 2001
As a continuing part of its validity studies, the Association of American Medical Colleges commissioned a study of the speediness of the Medical College Admission Test (MCAT). If speed is a hidden part of the test, it is a threat to its construct validity. As a general rule, the criterion used to indicate lack of speediness is that 80% of the…
Descriptors: College Applicants, College Entrance Examinations, Higher Education, Medical Education
Ito, Kyoko; Sykes, Robert C. – 2000
This study investigated the practice of weighting a type of test item, such as constructed response, more than other types of items, such as selected response, to compute student scores for a mixed-item type of test. The study used data from statewide writing field tests in grades 3, 5, and 8 and considered two contexts, that in which a single…
Descriptors: Constructed Response, Elementary Education, Essay Tests, Test Construction
Peer reviewed
Berk, Ronald A. – Journal of Experimental Education, 1980
A sampling methodology is proposed for determining lengths of tests designed to assess the comprehension of written discourse. It is based on Bormuth's transformational analysis, within a domain-referenced framework. Guidelines are provided for computing sample size and selecting sentences to which the transformational rules can be applied.…
Descriptors: Reading Comprehension, Reading Tests, Sampling, Test Construction
Peer reviewed
De Ayala, R. J. – Applied Psychological Measurement, 1994
Previous work on the effects of dimensionality on parameter estimation for dichotomous models is extended to the graded response model. Datasets are generated that differ in the number of latent factors as well as their interdimensional association, number of test items, and sample size. (SLD)
Descriptors: Estimation (Mathematics), Item Response Theory, Maximum Likelihood Statistics, Sample Size
Peer reviewed
Livingston, Samuel A.; Lewis, Charles – Journal of Educational Measurement, 1995
A method is presented for estimating the accuracy and consistency of classifications based on test scores. The reliability of the score is used to estimate effective test length in terms of discrete items. The true-score distribution is estimated by fitting a four-parameter beta model. (SLD)
Descriptors: Classification, Estimation (Mathematics), Scores, Statistical Distributions
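Note: the "effective test length" step mentioned in the abstract above can be illustrated with a minimal sketch. The formula used here is the commonly cited form of the Livingston-Lewis estimate; it is not quoted in the abstract itself, and the function and parameter names are hypothetical.

```python
def effective_test_length(mean, variance, reliability, min_score, max_score):
    """Estimate the number of discrete, equally weighted items that would
    yield the observed score mean, variance, and reliability.

    Assumed formula (not stated in the abstract):
        n = ((mean - min)(max - mean) - r * var) / (var * (1 - r))
    """
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must lie in [0, 1)")
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    numerator = (mean - min_score) * (max_score - mean) - reliability * variance
    return numerator / (variance * (1.0 - reliability))


def kr21(n_items, mean, variance):
    """KR-21 reliability for a test of n_items scored 0..n_items."""
    return (n_items / (n_items - 1)) * (1 - mean * (n_items - mean) / (n_items * variance))


# Consistency check: feeding a KR-21 reliability back in recovers the item count.
r = kr21(40, mean=30.0, variance=36.0)
n_eff = effective_test_length(30.0, 36.0, r, min_score=0.0, max_score=40.0)
```

With a score range of 0 to 40 and a KR-21 reliability, the estimate recovers 40 items, which is a quick way to sanity-check the arithmetic before applying it to other score scales.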
Peer reviewed
Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
Peer reviewed
Paolo, Anthony M.; Ryan, Joseph J. – Psychological Assessment, 1993
The Satz-Mogel abbreviation of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) was compared with a 7-subtest short form for 130 healthy and 40 neurologically impaired older adults. The two short forms performed similarly, for both normal and impaired adults, in comparison with the full WAIS-R. (SLD)
Descriptors: Comparative Testing, Intelligence Tests, Neurological Impairments, Older Adults
Peer reviewed
Fitzpatrick, Anne R.; Yen, Wendy M. – Applied Measurement in Education, 2001
Examined the effects of test length and sample size on the alternate forms reliability and equating of simulated mathematics tests composed of constructed response items scaled using the two-parameter partial credit model. Results suggest that, to obtain acceptable reliabilities and accurate equated scores, tests should have at least 8 6-point…
Descriptors: Constructed Response, Equated Scores, Mathematics Tests, Reliability
Peer reviewed
DeMars, Christine E. – Educational and Psychological Measurement, 2005
Type I error rates for PARSCALE's fit statistic were examined. Data were generated to fit the partial credit or graded response model, with test lengths of 10 or 20 items. The ability distribution was simulated to be either normal or uniform. Type I error rates were inflated for the shorter test length and, for the graded-response model, also for…
Descriptors: Test Length, Item Response Theory, Psychometrics, Error of Measurement
Peer reviewed
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models