Showing all 14 results
Peer reviewed
Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021
There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…
Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers
Peer reviewed
Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André – Applied Measurement in Education, 2016
Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…
Descriptors: Psychometrics, Multiple Choice Tests, Test Items, Item Analysis
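The abstract does not describe the generation machinery itself, but the core of automatic item generation is an item model: a stem template whose variable slots are filled systematically to yield many psychometrically similar items. Below is a minimal sketch in Python; the dosage-calculation template and the distractor rule are hypothetical illustrations, not the cognitive models used in the study.

```python
# Minimal sketch of template-based automatic item generation.
# The stem template, variable ranges, and distractor rule are all
# hypothetical; they do not come from the article.
import itertools
import random

STEM = "A patient weighing {weight} kg needs {dose} mg/kg of drug X. What total dose is required?"

def generate_item(weight, dose):
    correct = weight * dose
    # Distractors built from plausible arithmetic slips (assumed rule).
    distractors = {weight + dose, correct * 10, correct / 2}
    distractors.discard(correct)
    options = [correct] + sorted(distractors)[:3]
    random.shuffle(options)
    return {
        "stem": STEM.format(weight=weight, dose=dose),
        "options": options,
        "key": options.index(correct),
    }

# Systematically instantiate the item model over its variable ranges.
items = [generate_item(w, d) for w, d in itertools.product((60, 75, 90), (2, 4))]
print(len(items), "items generated; first:", items[0]["stem"])
```

Each combination of variable values yields a distinct item with a known key, which is what makes the approach scalable relative to hand-writing items.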
Peer reviewed
Keller, Lisa A.; Keller, Robert R. – Applied Measurement in Education, 2015
Equating test forms is an essential activity in standardized testing, one whose importance has grown with the accountability systems mandated under Adequate Yearly Progress. It is through equating that scores from different test forms become comparable, which allows for the tracking of changes in the performance of students from…
Descriptors: Item Response Theory, Rating Scales, Standardized Tests, Scoring Rubrics
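The abstract does not specify the equating design used. As a rough illustration of what equating accomplishes, the sketch below applies linear (mean-sigma) equating, one standard method, to map scores from one form onto the scale of another; all scores are fabricated.

```python
# Sketch of linear equating between two test forms (mean-sigma method).
# Illustrative only: the article's actual equating design is not given
# in the abstract, and the score lists below are made up.
from statistics import mean, stdev

def linear_equate(x_scores, y_scores):
    """Return a function mapping form-X scores onto the form-Y scale."""
    a = stdev(y_scores) / stdev(x_scores)    # slope: ratio of SDs
    b = mean(y_scores) - a * mean(x_scores)  # intercept
    return lambda x: a * x + b

form_x = [12, 15, 18, 20, 22, 25, 30]
form_y = [14, 16, 20, 23, 24, 28, 33]
to_y_scale = linear_equate(form_x, form_y)
print(round(to_y_scale(21), 2))  # a form-X score of 21 expressed on the form-Y scale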
Peer reviewed
Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen – Applied Measurement in Education, 2008
In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, a genetic algorithm (GA), to construct parallel…
Descriptors: Test Format, Measurement Techniques, Equations (Mathematics), Item Response Theory
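As a hedged illustration of the approach named in the abstract, the sketch below uses a simple genetic algorithm to select a fixed-length form from an item pool so that its 2PL test information function approximates a target at a few ability points. The pool parameters, target TIF values, and GA settings are all invented for the example, not taken from the article.

```python
# Sketch: a genetic algorithm assembling a test form whose test
# information function (TIF) approximates a target. All numbers here
# (pool, target, GA settings) are illustrative assumptions.
import math
import random

random.seed(1)
THETAS = (-1.0, 0.0, 1.0)           # ability points where the TIF is matched
POOL = [(random.uniform(0.5, 2.0),  # a: discrimination (hypothetical pool)
         random.uniform(-2.0, 2.0)) # b: difficulty
        for _ in range(60)]
FORM_LEN, POP, GENS = 20, 40, 150
TARGET = {t: 6.0 for t in THETAS}   # hypothetical target TIF values

def info(a, b, theta):
    """2PL item information (logistic metric): a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def tif(form, theta):
    return sum(info(*POOL[i], theta) for i in form)

def fitness(form):
    """Negative squared deviation of the form's TIF from the target."""
    return -sum((tif(form, t) - TARGET[t]) ** 2 for t in THETAS)

def crossover(p1, p2):
    """Child form drawn from the union of two parents' items."""
    return set(random.sample(sorted(p1 | p2), FORM_LEN))

def mutate(form):
    """Swap one item out, then top the form back up to FORM_LEN."""
    form = set(form)
    form.remove(random.choice(sorted(form)))
    while len(form) < FORM_LEN:
        form.add(random.randrange(len(POOL)))
    return form

pop = [set(random.sample(range(len(POOL)), FORM_LEN)) for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]                 # truncation selection
    pop = parents + [mutate(crossover(*random.sample(parents, 2)))
                     for _ in range(POP - len(parents))]

best = max(pop, key=fitness)
print("form TIF at thetas:", [round(tif(best, t), 2) for t in THETAS])
```

Real parallel-forms assembly layers content and item-overlap constraints on top of the TIF match, which is what makes the combinatorial problem expensive enough to motivate heuristics like a GA.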
Peer reviewed
Ascalon, M. Evelina; Meyers, Lawrence S.; Davis, Bruce W.; Smits, Niels – Applied Measurement in Education, 2007
This article examined two item-writing guidelines: the format of the item stem and homogeneity of the answer set. Answering the call of Haladyna, Downing, and Rodriguez (2002) for empirical tests of item writing guidelines and extending the work of Smith and Smith (1988) on differential use of item characteristics, a mock multiple-choice driver's…
Descriptors: Guidelines, Difficulty Level, Standard Setting, Driver Education
Peer reviewed
Dorans, Neil J.; Lawrence, Ida M. – Applied Measurement in Education, 1990
A procedure for checking the score equivalence of nearly identical editions of a test is described and illustrated with Scholastic Aptitude Test data. The procedure uses the standard error of equating and a graphical representation of score-conversion deviations from the identity function, expressed in standard error units. (SLD)
Descriptors: Equated Scores, Grade Equivalent Scores, Scores, Statistical Analysis
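A minimal sketch of the flagging logic the abstract describes: express each score conversion's deviation from the identity function in standard-error-of-equating units, and flag deviations beyond a cutoff. The 2-SE cutoff and all numbers below are assumptions for illustration, not values from the article.

```python
# Sketch of the score-equivalence check: deviations of an equating
# conversion from the identity function, in standard-error units.
# The conversions and standard errors below are fabricated.
raw_scores = [20, 30, 40, 50, 60]
converted  = [20.4, 29.6, 40.1, 51.2, 60.3]  # equated score on the new edition
see        = [0.5, 0.4, 0.4, 0.5, 0.6]       # standard error of equating

for x, y, se in zip(raw_scores, converted, see):
    dev = (y - x) / se            # deviation from identity in SE units
    flag = "FLAG" if abs(dev) > 2 else "ok"
    print(f"score {x}: deviation {dev:+.2f} SE  {flag}")
```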
Peer reviewed
Haladyna, Thomas M.; Downing, Steven M. – Applied Measurement in Education, 1989
Results of 96 theoretical/empirical studies were reviewed to see if they support a taxonomy of 43 rules for writing multiple-choice test items. The taxonomy is the result of an analysis of 46 textbooks dealing with multiple-choice item writing. For nearly half of the rules, no research was found. (SLD)
Descriptors: Classification, Literature Reviews, Multiple Choice Tests, Test Construction
Peer reviewed
Haladyna, Thomas M.; Downing, Steven M. – Applied Measurement in Education, 1989
A taxonomy of 43 rules for writing multiple-choice test items is presented, based on a consensus of 46 textbooks. These guidelines are presented as complete and authoritative, with solid consensus apparent for 33 of the rules. Four rules lack consensus, and 5 rules were cited fewer than 10 times. (SLD)
Descriptors: Classification, Interrater Reliability, Multiple Choice Tests, Objective Tests
Peer reviewed
Downing, Steven M.; And Others – Applied Measurement in Education, 1995
The criterion-related validity evidence and other psychometric characteristics of multiple-choice and multiple true-false (MTF) items in medical specialty certification examinations were compared using results from 21,346 candidates. Advantages of MTF items and implications for test construction are discussed. (SLD)
Descriptors: Cognitive Ability, Licensing Examinations (Professions), Medical Education, Objective Tests
Peer reviewed
Sireci, Stephen G.; Berberoglu, Giray – Applied Measurement in Education, 2000
Studied an item response theory method for investigating the equivalence of translated-adapted items using bilingual test takers. Results from an English-Turkish course evaluation form completed by 688 Turkish students indicate that the methodology is effective in flagging items that function differentially across languages and informing…
Descriptors: Bilingualism, College Students, Evaluation Methods, Higher Education
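The abstract does not give the flagging rule, so the sketch below illustrates one common IRT approach: place the two language versions' item difficulties on a common metric with mean-sigma linking, then flag items whose linked differences are large. The b-parameters and the 0.5 cutoff are fabricated for the example.

```python
# Sketch of flagging translated items via IRT difficulty comparison.
# Hypothetical: fitted b-parameters for each language version are
# assumed given; mean-sigma linking puts them on one scale, and items
# with large residual differences are flagged as possibly non-equivalent.
from statistics import mean, stdev

b_english = [-1.2, -0.5, 0.0, 0.6, 1.3, 0.2]
b_turkish = [-1.0, -0.3, 0.9, 0.8, 1.5, 0.4]  # item 3 drifts after translation

# Mean-sigma linking: rescale Turkish difficulties to the English metric.
A = stdev(b_english) / stdev(b_turkish)
B = mean(b_english) - A * mean(b_turkish)
linked = [A * b + B for b in b_turkish]

for i, (be, bt) in enumerate(zip(b_english, linked), start=1):
    diff = bt - be
    note = "  <- flag" if abs(diff) > 0.5 else ""
    print(f"item {i}: linked difference {diff:+.2f}{note}")
```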
Peer reviewed
Martinez, Michael E.; Bennett, Randy Elliot – Applied Measurement in Education, 1992
New developments in the use of automatically scorable constructed-response item types for large-scale assessment are reviewed for five domains: (1) mathematical reasoning; (2) algebra problem solving; (3) computer science; (4) architecture; and (5) natural language. Ways in which these technologies are likely to shape testing are considered. (SLD)
Descriptors: Algebra, Architecture, Automation, Computer Science
Peer reviewed
Haladyna, Thomas M. – Applied Measurement in Education, 1992
Several multiple-choice item formats are examined in the current climate of test reform. The reform movement is discussed as it affects use of the following formats: (1) complex multiple-choice; (2) alternate choice; (3) true-false; (4) multiple true-false; and (5) the context dependent item set. (SLD)
Descriptors: Cognitive Psychology, Comparative Testing, Context Effect, Educational Change
Peer reviewed
Harris, Abigail M.; Carlton, Sydell T. – Applied Measurement in Education, 1993
Differential item functioning on 6 forms of the Scholastic Aptitude Test was examined for 181,228 male and 198,668 female students, focusing on the points tested, the test format, and the subject matter in which items are embedded. Implications of the identifiable differences are discussed. (SLD)
Descriptors: College Entrance Examinations, Comparative Analysis, Females, High School Students
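The abstract does not name the DIF statistic used. A standard choice for matched group comparisons of this kind is the Mantel-Haenszel common odds ratio, sketched below with fabricated counts; the ETS D-DIF transformation and its rough |1.5| threshold are conventional, but nothing here reproduces the study's analysis.

```python
# Sketch of a Mantel-Haenszel DIF check for one item: examinees are
# matched on total score, and the common odds ratio compares the odds
# of a correct answer across groups within each matched stratum.
# All counts are fabricated for illustration.
import math

# Per total-score stratum: (ref_correct, ref_wrong, focal_correct, focal_wrong)
strata = [
    (30, 20, 22, 28),
    (45, 15, 35, 25),
    (60, 10, 50, 20),
]

num = den = 0.0
for rc, rw, fc, fw in strata:
    n = rc + rw + fc + fw
    num += rc * fw / n
    den += rw * fc / n

alpha_mh = num / den                   # common odds ratio; 1.0 = no DIF
delta_mh = -2.35 * math.log(alpha_mh)  # ETS D-DIF scale
print(f"MH odds ratio {alpha_mh:.2f}, MH D-DIF {delta_mh:+.2f}")
```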
Peer reviewed
Hall, Bruce W.; And Others – Applied Measurement in Education, 1988
Responses of 310 teachers in Florida to a survey about use of teacher-made tests, nationally standardized tests, and state minimum competency tests were studied. Results show that all three test types were used to some extent in eight decision categories, but none of the tests were clearly dominant. (SLD)
Descriptors: Classroom Techniques, Decision Making, Elementary Secondary Education, Minimum Competency Testing