Showing all 12 results
Koch, Bill R.; Reckase, Mark D. – 1978
A live tailored testing study was conducted to compare the results of using either the one-parameter logistic model or the three-parameter logistic model to measure the performance of college students on multiple choice vocabulary items. The results of the study showed the three-parameter tailored testing procedure to be superior to the…
Descriptors: Adaptive Testing, Comparative Analysis, Goodness of Fit, Higher Education
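The entry above contrasts the one-parameter (Rasch) and three-parameter logistic models. A minimal sketch of the two item response functions follows; the parameter values used are purely illustrative and are not taken from the study:

```python
import math

def p_1pl(theta, b):
    """One-parameter (Rasch) logistic model: probability of a correct
    response given ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def p_3pl(theta, a, b, c):
    """Three-parameter logistic model: adds a discrimination parameter a
    and a lower asymptote c (pseudo-guessing) to the Rasch form."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# For a low-ability examinee on a hard multiple-choice item, the 3PL
# floor c keeps the predicted probability near the chance level, which
# is one reason the two models can rank examinees differently.
low_ability, hard_item = -2.0, 1.0
print(p_1pl(low_ability, hard_item))
print(p_3pl(low_ability, a=1.2, b=hard_item, c=0.25))
```

The lower asymptote is what a tailored-testing procedure based on the 3PL can exploit on multiple-choice items, where blind guessing puts a floor under the response probability.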
Peer reviewed
Zimmerman, Donald W.; And Others – Journal of Experimental Education, 1984
Three types of test were compared: a completion test, a matching test, and a multiple-choice test. The completion test was more reliable than the matching test, and the matching test was more reliable than the multiple-choice test. (Author/BW)
Descriptors: Comparative Analysis, Error of Measurement, Higher Education, Mathematical Models
Peer reviewed
Smith, Richard M. – Journal of Educational Measurement, 1987
Partial knowledge was assessed in a multiple choice vocabulary test. Test reliability and concurrent validity were compared using Rasch-based dichotomous and polychotomous scoring models. Results supported the polychotomous scoring model, and moderately supported J. O'Connor's theory of vocabulary acquisition. (Author/GDC)
Descriptors: Adults, Higher Education, Knowledge Level, Latent Trait Theory
Scheetz, James P.; vonFraunhofer, J. Anthony – 1980
Subkoviak suggested a technique for estimating both group reliability and the reliability associated with assigning a given individual to a mastery or non-mastery category based on a single test administration. Two assumptions underlie this model. First, it is assumed that had successive test administrations occurred, scores for each individual…
Descriptors: Criterion Referenced Tests, Cutting Scores, Error of Measurement, Higher Education
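The single-administration idea in the entry above can be illustrated with a simple binomial error model: estimate, for each examinee, the chance of reaching the mastery cutoff again on a parallel form, then aggregate into a classification-consistency estimate. This is a sketch of the general idea only, not Subkoviak's exact estimator, and all scores below are invented:

```python
import math

def mastery_prob(score, n_items, cutoff):
    """Probability that an examinee with observed proportion-correct
    score/n_items would reach the mastery cutoff on a parallel form,
    under a simple binomial error model (a sketch of the idea, not
    Subkoviak's exact procedure)."""
    p = score / n_items
    return sum(math.comb(n_items, k) * p**k * (1 - p)**(n_items - k)
               for k in range(cutoff, n_items + 1))

def consistency(scores, n_items, cutoff):
    """Estimated proportion of consistent mastery/non-mastery
    classifications across two hypothetical administrations."""
    per_person = [mastery_prob(s, n_items, cutoff) for s in scores]
    return sum(q * q + (1 - q) * (1 - q) for q in per_person) / len(scores)

# Invented scores on a 10-item test with a mastery cutoff of 8.
print(consistency([9, 8, 7, 5, 3], n_items=10, cutoff=8))
```

Examinees far from the cutoff contribute near-certain (hence consistent) classifications; scores near the cutoff drive the consistency estimate down.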
Levine, Michael V.; Drasgow, Fritz – 1980
Appropriateness measurement is a general approach to the problem caused by multiple choice tests failing to measure accurately the ability of atypical examinees. The conceptual framework of appropriateness measurement is presented, and several statistical indices of the appropriateness of a multiple choice test for an examinee are noted. A series…
Descriptors: Aptitude Tests, Cheating, Error of Measurement, Error Patterns
Douglass, James B. – 1980
The three-, two-, and one-parameter (Rasch) logistic item characteristic curve models are compared for use in a large multi-section college course. Only the three-parameter model produced clearly unacceptable parameter estimates for 100-item tests with examinee samples ranging from 594 to 1082. The Rasch and two-parameter models were compared for…
Descriptors: Academic Ability, Achievement Tests, Course Content, Difficulty Level
Peer reviewed
Palm, Thomas – Social Science Computer Review, 1988
Develops procedures for the automatic generation and incorporation of graphics (both statistical and analytical) into mathematically oriented multiple choice examinations. Uses an integrated spreadsheet to create and associate tabular data, diagrams, questions, and answers. Illustrates the process of constructing multiple choice questions and…
Descriptors: Computer Assisted Testing, Computer Graphics, Data, Diagrams
Westers, Paul; Kelderman, Henk – 1990
The response probability on a multiple-choice item may be viewed as the result of two distinct latent processes--a cognitive process to solve the problem, and a random process that leads to the choice of a particular alternative (the process of giving the actual response). An incomplete latent class model is formulated that describes…
Descriptors: Cognitive Processes, Estimation (Mathematics), Foreign Countries, Guessing (Tests)
Tinsley, Howard E. A.; Dawis, Rene V. – 1972
Selection of items for analogy tests according to the Rasch item probability of "goodness of fit" to the model is compared with three commonly used item selection criteria: item discrimination, item difficulty, and item-ability correlation. Word, picture, symbol and number analogies in multiple choice format were administered to several…
Descriptors: College Students, Correlation, Evaluation Criteria, Goodness of Fit
Levine, Michael V.; Drasgow, Fritz – 1984
Some examinees' test-taking behavior may be so idiosyncratic that their scores are not comparable to the scores of more typical examinees. Appropriateness indices, which provide quantitative measures of response-pattern atypicality, can be viewed as statistics for testing a null hypothesis of normal test-taking behavior against an alternative…
Descriptors: Cheating, College Entrance Examinations, Computer Simulation, Estimation (Mathematics)
Koch, William R.; Reckase, Mark D. – 1979
Tailored testing procedures for achievement testing were applied in a situation that failed to meet some of the specifications generally considered to be necessary for tailored testing. Discrepancies from the appropriate conditions included the use of small samples for calibrating items, and the use of an item pool that was not designed to be…
Descriptors: Achievement Tests, Adaptive Testing, Educational Testing, Higher Education
Peer reviewed
Cohen, Allan S.; And Others – Journal of Educational Measurement, 1991
Detecting differential item functioning (DIF) on test items constructed to favor 1 group over another was investigated on parameter estimates from 2 item response theory-based computer programs--BILOG and LOGIST--using data for 1,000 White and 1,000 Black college students. Use of prior distributions and marginal-maximum a posteriori estimation is…
Descriptors: Black Students, College Students, Computer Assisted Testing, Equations (Mathematics)
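The core of IRT-based DIF detection in the entry above is comparing item parameter estimates across groups. A generic way to summarize such a comparison is the unsigned area between the two groups' estimated item characteristic curves; the sketch below uses a two-parameter logistic curve and invented parameter values, and is not the BILOG or LOGIST procedure itself:

```python
import math

def icc(theta, a, b):
    """Two-parameter logistic item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def unsigned_area_dif(params_ref, params_focal, lo=-4.0, hi=4.0, n=800):
    """Approximate unsigned area between reference- and focal-group
    ICCs over an ability range; larger areas suggest more DIF."""
    step = (hi - lo) / n
    return sum(abs(icc(lo + i * step, *params_ref)
                   - icc(lo + i * step, *params_focal)) * step
               for i in range(n))

# Identical estimates in both groups: essentially zero area (no DIF).
print(unsigned_area_dif((1.0, 0.0), (1.0, 0.0)))
# A half-logit difficulty shift for the focal group opens a gap.
print(unsigned_area_dif((1.0, 0.0), (1.0, 0.5)))
```

Because the statistic is computed from parameter estimates, its behavior depends on the estimation method, which is why the study's comparison of prior-based (BILOG) and joint maximum likelihood (LOGIST) estimates matters for DIF detection.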