Showing all 12 results
Koch, Bill R.; Reckase, Mark D. – 1978
A live tailored testing study was conducted to compare the results of using either the one-parameter logistic model or the three-parameter logistic model to measure the performance of college students on multiple choice vocabulary items. The results of the study showed the three-parameter tailored testing procedure to be superior to the…
Descriptors: Adaptive Testing, Comparative Analysis, Goodness of Fit, Higher Education
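The entry above contrasts the one-parameter (Rasch) and three-parameter logistic models. A minimal sketch of the two item response functions follows; the parameter values used are purely illustrative and are not taken from the study:

```python
import math

def p_1pl(theta, b):
    """One-parameter (Rasch) logistic model: probability of a correct
    response given ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def p_3pl(theta, a, b, c):
    """Three-parameter logistic model: adds a discrimination parameter a
    and a lower asymptote c (pseudo-guessing) to the Rasch form."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# For a low-ability examinee on a hard multiple-choice item, the 3PL
# floor c keeps the predicted probability near the chance level, which
# is one reason the two models can rank examinees differently.
low_ability, hard_item = -2.0, 1.0
print(p_1pl(low_ability, hard_item))
print(p_3pl(low_ability, a=1.2, b=hard_item, c=0.25))
```

The lower asymptote is what a tailored-testing procedure based on the 3PL can exploit on multiple-choice items, where blind guessing puts a floor under the response probability.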
Peer reviewed
Zimmerman, Donald W.; And Others – Journal of Experimental Education, 1984
Three types of test were compared: a completion test, a matching test, and a multiple-choice test. The completion test was more reliable than the matching test, and the matching test was more reliable than the multiple-choice test. (Author/BW)
Descriptors: Comparative Analysis, Error of Measurement, Higher Education, Mathematical Models
Peer reviewed
Smith, Richard M. – Journal of Educational Measurement, 1987
Partial knowledge was assessed in a multiple choice vocabulary test. Test reliability and concurrent validity were compared using Rasch-based dichotomous and polychotomous scoring models. Results supported the polychotomous scoring model, and moderately supported J. O'Connor's theory of vocabulary acquisition. (Author/GDC)
Descriptors: Adults, Higher Education, Knowledge Level, Latent Trait Theory
Scheetz, James P.; vonFraunhofer, J. Anthony – 1980
Subkoviak suggested a technique for estimating both group reliability and the reliability associated with assigning a given individual to a mastery or non-mastery category based on a single test administration. Two assumptions underlie this model. First, it is assumed that had successive test administrations occurred, scores for each individual…
Descriptors: Criterion Referenced Tests, Cutting Scores, Error of Measurement, Higher Education
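The single-administration idea in the entry above can be illustrated with a simple binomial error model: estimate, for each examinee, the chance of reaching the mastery cutoff again on a parallel form, then aggregate into a classification-consistency estimate. This is a sketch of the general idea only, not Subkoviak's exact estimator, and all scores below are invented:

```python
import math

def mastery_prob(score, n_items, cutoff):
    """Probability that an examinee with observed proportion-correct
    score/n_items would reach the mastery cutoff on a parallel form,
    under a simple binomial error model (a sketch of the idea, not
    Subkoviak's exact procedure)."""
    p = score / n_items
    return sum(math.comb(n_items, k) * p**k * (1 - p)**(n_items - k)
               for k in range(cutoff, n_items + 1))

def consistency(scores, n_items, cutoff):
    """Estimated proportion of consistent mastery/non-mastery
    classifications across two hypothetical administrations."""
    per_person = [mastery_prob(s, n_items, cutoff) for s in scores]
    return sum(q * q + (1 - q) * (1 - q) for q in per_person) / len(scores)

# Invented scores on a 10-item test with a mastery cutoff of 8.
print(consistency([9, 8, 7, 5, 3], n_items=10, cutoff=8))
```

Examinees far from the cutoff contribute near-certain (hence consistent) classifications; scores near the cutoff drive the consistency estimate down.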
Levine, Michael V.; Drasgow, Fritz – 1980
Appropriateness measurement is a general approach to the problem caused by multiple choice tests failing to measure accurately the ability of atypical examinees. The conceptual framework of appropriateness measurement is presented, and several statistical indices of the appropriateness of a multiple choice test for an examinee are noted. A series…
Descriptors: Aptitude Tests, Cheating, Error of Measurement, Error Patterns
Douglass, James B. – 1980
The three-, two-, and one-parameter (Rasch) logistic item characteristic curve models are compared for use in a large multi-section college course. Only the three-parameter model produced clearly unacceptable parameter estimates for 100-item tests with examinee samples ranging from 594 to 1082. The Rasch and two-parameter models were compared for…
Descriptors: Academic Ability, Achievement Tests, Course Content, Difficulty Level
Peer reviewed
Palm, Thomas – Social Science Computer Review, 1988
Develops procedures for the automatic generation and incorporation of graphics (both statistical and analytical) into mathematically oriented multiple choice examinations. Uses an integrated spreadsheet to create and associate tabular data, diagrams, questions, and answers. Illustrates the process of constructing multiple choice questions and…
Descriptors: Computer Assisted Testing, Computer Graphics, Data, Diagrams
Westers, Paul; Kelderman, Henk – 1990
The response probability on a multiple-choice item may be viewed as the result of two distinct latent processes--a cognitive process to solve the problem, and a random process that leads to the choice of a particular alternative (the process of giving the actual response). An incomplete latent class model is formulated that describes…
Descriptors: Cognitive Processes, Estimation (Mathematics), Foreign Countries, Guessing (Tests)
Tinsley, Howard E. A.; Dawis, Rene V. – 1972
Selection of items for analogy tests according to the Rasch item probability of "goodness of fit" to the model is compared with three commonly used item selection criteria: item discrimination, item difficulty, and item-ability correlation. Word, picture, symbol and number analogies in multiple choice format were administered to several…
Descriptors: College Students, Correlation, Evaluation Criteria, Goodness of Fit
Levine, Michael V.; Drasgow, Fritz – 1984
Some examinees' test-taking behavior may be so idiosyncratic that their scores are not comparable to the scores of more typical examinees. Appropriateness indices, which provide quantitative measures of response-pattern atypicality, can be viewed as statistics for testing a null hypothesis of normal test-taking behavior against an alternative…
Descriptors: Cheating, College Entrance Examinations, Computer Simulation, Estimation (Mathematics)
Koch, William R.; Reckase, Mark D. – 1979
Tailored testing procedures for achievement testing were applied in a situation that failed to meet some of the specifications generally considered to be necessary for tailored testing. Discrepancies from the appropriate conditions included the use of small samples for calibrating items, and the use of an item pool that was not designed to be…
Descriptors: Achievement Tests, Adaptive Testing, Educational Testing, Higher Education
Peer reviewed
Cohen, Allan S.; And Others – Journal of Educational Measurement, 1991
Detecting differential item functioning (DIF) on test items constructed to favor 1 group over another was investigated on parameter estimates from 2 item response theory-based computer programs--BILOG and LOGIST--using data for 1,000 White and 1,000 Black college students. Use of prior distributions and marginal-maximum a posteriori estimation is…
Descriptors: Black Students, College Students, Computer Assisted Testing, Equations (Mathematics)
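The core of IRT-based DIF detection in the entry above is comparing item parameter estimates across groups. A generic way to summarize such a comparison is the unsigned area between the two groups' estimated item characteristic curves; the sketch below uses a two-parameter logistic curve and invented parameter values, and is not the BILOG or LOGIST procedure itself:

```python
import math

def icc(theta, a, b):
    """Two-parameter logistic item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def unsigned_area_dif(params_ref, params_focal, lo=-4.0, hi=4.0, n=800):
    """Approximate unsigned area between reference- and focal-group
    ICCs over an ability range; larger areas suggest more DIF."""
    step = (hi - lo) / n
    return sum(abs(icc(lo + i * step, *params_ref)
                   - icc(lo + i * step, *params_focal)) * step
               for i in range(n))

# Identical estimates in both groups: essentially zero area (no DIF).
print(unsigned_area_dif((1.0, 0.0), (1.0, 0.0)))
# A half-logit difficulty shift for the focal group opens a gap.
print(unsigned_area_dif((1.0, 0.0), (1.0, 0.5)))
```

Because the statistic is computed from parameter estimates, its behavior depends on the estimation method, which is why the study's comparison of prior-based (BILOG) and joint maximum likelihood (LOGIST) estimates matters for DIF detection.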