Showing 31 to 45 of 54 results
Peer reviewed
Dorans, Neil J.; Lawrence, Ida M. – Applied Measurement in Education, 1990
A procedure for checking the score equivalence of nearly identical editions of a test is described and illustrated with Scholastic Aptitude Test data. The procedure uses the standard error of equating and a graphical representation of score conversion deviations from the identity function, expressed in standard error units. (SLD)
Descriptors: Equated Scores, Grade Equivalent Scores, Scores, Statistical Analysis
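The plot Dorans and Lawrence describe is easy to sketch. The snippet below is an illustration only: the score scale, the conversion, the SEE values, and the +/-2 SEE reference band are placeholders, not SAT data or the authors' actual procedure.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder inputs (not SAT data): the raw-score scale, the conversion
# from the new edition to the old edition at each score, and the standard
# error of equating (SEE) at each score.
scores = np.arange(20, 81)
conversion = scores + 0.3 * np.sin(scores / 7.0)   # illustrative conversion
see = np.full_like(scores, 0.5, dtype=float)

# The quantity of interest: deviation of the conversion from the identity
# function, expressed in SEE units.
deviation = (conversion - scores) / see

plt.plot(scores, deviation, marker=".")
plt.axhline(0.0, color="gray")
plt.axhline(2.0, color="red", linestyle="--")      # illustrative +/-2 SEE band
plt.axhline(-2.0, color="red", linestyle="--")
plt.xlabel("Raw score")
plt.ylabel("(converted score - raw score) / SEE")
plt.show()
```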
Peer reviewed
Haladyna, Thomas M.; Downing, Steven M. – Applied Measurement in Education, 1989
Results of 96 theoretical/empirical studies were reviewed to see if they support a taxonomy of 43 rules for writing multiple-choice test items. The taxonomy is the result of an analysis of 46 textbooks dealing with multiple-choice item writing. For nearly half of the rules, no research was found. (SLD)
Descriptors: Classification, Literature Reviews, Multiple Choice Tests, Test Construction
Peer reviewed
Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
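One familiar reliability estimate for a test built from several item-format strata is stratified coefficient alpha. The sketch below shows that estimator on an examinees-by-items layout; it is a generic illustration and not necessarily the specific estimate Qualls presents.

```python
import numpy as np

def coefficient_alpha(items):
    """Coefficient alpha for an examinees-by-items score matrix."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def stratified_alpha(parts):
    """Stratified alpha for a test whose items fall into format strata.
    Each element of `parts` is an examinees-by-items matrix for one format:
    1 - sum_i var(X_i) * (1 - alpha_i) / var(X_total).
    """
    total_var = np.hstack(parts).sum(axis=1).var(ddof=1)
    penalty = sum(p.sum(axis=1).var(ddof=1) * (1 - coefficient_alpha(p))
                  for p in parts)
    return 1 - penalty / total_var
```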
Peer reviewed
Ryan, Katherine E.; Chiu, Shuwan – Applied Measurement in Education, 2001
Examined whether patterns of gender differential item functioning (DIF) in parcels of items are influenced by changes in item position. Findings for more than 2,000 college freshmen taking a test of mathematics suggest that the amounts of gender DIF and DIF present in item parcels tend not to be influenced by changes in item position. (SLD)
Descriptors: College Freshmen, Context Effect, Higher Education, Item Bias
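The abstract does not name the DIF statistic used, so the sketch below illustrates one common choice, the Mantel-Haenszel procedure for a single dichotomous item, reported on the ETS delta scale. Ryan and Chiu analyzed parcels of items, and this is not their code; the function name and inputs are invented for illustration.

```python
import numpy as np

def mantel_haenszel_dif(correct, group, total_score):
    """Mantel-Haenszel DIF for one dichotomous item, stratifying on total
    test score. correct: 0/1 item responses; group: 1 = reference, 0 = focal.
    Returns the MH D-DIF statistic (negative values favor the reference group).
    """
    num, den = 0.0, 0.0
    for s in np.unique(total_score):
        m = total_score == s
        a = np.sum((group[m] == 1) & (correct[m] == 1))  # reference, correct
        b = np.sum((group[m] == 1) & (correct[m] == 0))  # reference, incorrect
        c = np.sum((group[m] == 0) & (correct[m] == 1))  # focal, correct
        d = np.sum((group[m] == 0) & (correct[m] == 0))  # focal, incorrect
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n
        den += b * c / n
    alpha_mh = num / den          # common odds ratio (no zero-cell guard here)
    return -2.35 * np.log(alpha_mh)
```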
Peer reviewed
Allalouf, Avi – Applied Measurement in Education, 2003
Studied whether differential item functioning (DIF) in translated verbal items could be reduced or eliminated by revising these items. Results for six sections of an Israeli college admission test translated from Hebrew to Russian show that revisions can reduce DIF considerably. Discusses costs of the revision process. (SLD)
Descriptors: College Entrance Examinations, Costs, Foreign Countries, Hebrew
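A classic, low-tech screen for translation DIF is the delta plot: convert each item's proportion correct in the source- and target-language groups to ETS deltas and flag items with unusually large difficulty shifts. The sketch below shows only that idea; Allalouf's analyses are more elaborate, and the function name and threshold are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def delta_plot_outliers(p_source, p_target, threshold=1.5):
    """Rough delta-plot screen for translation DIF.
    p_source, p_target: per-item proportion correct in the source-language
    and target-language groups. Flags items whose centered difficulty shift
    (in ETS delta units) exceeds the threshold.
    """
    d_source = 13.0 - 4.0 * norm.ppf(np.clip(p_source, 0.01, 0.99))
    d_target = 13.0 - 4.0 * norm.ppf(np.clip(p_target, 0.01, 0.99))
    shift = (d_target - d_target.mean()) - (d_source - d_source.mean())
    return np.flatnonzero(np.abs(shift) > threshold), shift
```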
Peer reviewed
Haladyna, Thomas M.; Downing, Steven M. – Applied Measurement in Education, 1989
A taxonomy of 43 rules for writing multiple-choice test items is presented, based on a consensus of 46 textbooks. These guidelines are not presented as complete and authoritative, although solid consensus is apparent for 33 of the rules. Four rules lack consensus, and 5 rules were cited fewer than 10 times. (SLD)
Descriptors: Classification, Interrater Reliability, Multiple Choice Tests, Objective Tests
Peer reviewed
Downing, Steven M.; And Others – Applied Measurement in Education, 1995
The criterion-related validity evidence and other psychometric characteristics of multiple-choice and multiple true-false (MTF) items in medical specialty certification examinations were compared using results from 21,346 candidates. Advantages of MTF items and implications for test construction are discussed. (SLD)
Descriptors: Cognitive Ability, Licensing Examinations (Professions), Medical Education, Objective Tests
Peer reviewed
Frisbie, David A.; Becker, Douglas F. – Applied Measurement in Education, 1990
Seventeen educational measurement textbooks were reviewed to analyze current perceptions regarding true-false achievement testing. A synthesis of the rules for item writing is presented, and the purported advantages and disadvantages of the true-false format derived from those texts are reviewed. (TJH)
Descriptors: Achievement Tests, Higher Education, Methods Courses, Objective Tests
Peer reviewed
Stecher, Brian M.; Klein, Stephen P.; Solano-Flores, Guillermo; McCaffrey, Dan; Robyn, Abby; Shavelson, Richard J.; Haertel, Edward – Applied Measurement in Education, 2000
Studied content domain, format, and level of inquiry as factors contributing to the large variation in student performance across open-ended measures. Results for more than 1,200 eighth graders do not support the hypothesis that tasks similar in content, format, and level of inquiry would correlate higher with each other than with measures…
Descriptors: Correlation, Inquiry, Junior High School Students, Junior High Schools
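The hypothesis above reduces to comparing within-family and between-family correlations among the open-ended tasks. The sketch below shows that comparison on placeholder data; the task "families" and scores are invented for illustration and are not the study's measures.

```python
import numpy as np

# Placeholder scores on six open-ended tasks (columns), with a label giving
# each task's content/format/inquiry "family".
rng = np.random.default_rng(0)
scores = rng.normal(size=(1200, 6))
family = np.array(["A", "A", "A", "B", "B", "B"])

r = np.corrcoef(scores, rowvar=False)
pairs = [(i, j) for i in range(6) for j in range(i + 1, 6)]
same = [r[i, j] for i, j in pairs if family[i] == family[j]]
diff = [r[i, j] for i, j in pairs if family[i] != family[j]]

# The hypothesis predicts mean(same) > mean(diff).
print("within-family mean r:", np.mean(same), "between-family mean r:", np.mean(diff))
```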
Peer reviewed
Sireci, Stephen G.; Berberoglu, Giray – Applied Measurement in Education, 2000
Studied a method for investigating the equivalence of translated-adapted items using bilingual test takers through item response theory. Results from an English-Turkish course evaluation form completed by 688 Turkish students indicate that the methodology is effective in flagging items that function differentially across languages and informing…
Descriptors: Bilingualism, College Students, Evaluation Methods, Higher Education
Peer reviewed
Kobrin, Jennifer L.; Young, John W. – Applied Measurement in Education, 2003
Studied the cognitive equivalence of computerized and paper-and-pencil reading comprehension tests using verbal protocol analysis. Results for 48 college students indicate that the only significant difference between the computerized and paper-and-pencil tests was in the frequency of identifying important information in the passage. (SLD)
Descriptors: Cognitive Processes, College Students, Computer Assisted Testing, Difficulty Level
Peer reviewed
Dunham, Trudy C.; Davison, Mark L. – Applied Measurement in Education, 1990
The effects of packing or skewing the response options of a scale on the common measurement problems of leniency and range restriction in instructor ratings were assessed. Results from a sample of 130 undergraduate education students indicate that packing reduced leniency but had no effect on range restriction. (TJH)
Descriptors: Education Majors, Higher Education, Professors, Rating Scales
Peer reviewed
Rocklin, Thomas – Applied Measurement in Education, 1992
College students rated dissimilarity of pairs of common test item formats. A multidimensional scaling model with individual differences fit to data from 111 students suggested that they used 2 dimensions to distinguish among the formats, 1 separating supply from selection items and 1 based on the number of options. (SLD)
Descriptors: Academic Ability, Academic Achievement, College Students, Higher Education
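Rocklin fit an individual-differences (INDSCAL-type) scaling model. scikit-learn does not implement that model, so the sketch below simplifies to ordinary two-dimensional MDS on the group-average dissimilarity matrix, with invented ratings standing in for the students' data.

```python
import numpy as np
from sklearn.manifold import MDS

# Placeholder dissimilarity ratings: one square formats-by-formats matrix
# per student. Symmetrize and zero the diagonals so they are valid
# dissimilarity matrices.
n_students, n_formats = 111, 6
rng = np.random.default_rng(0)
ratings = rng.uniform(0, 1, size=(n_students, n_formats, n_formats))
ratings = (ratings + ratings.transpose(0, 2, 1)) / 2
for m in ratings:
    np.fill_diagonal(m, 0.0)

# Average over students (a simplification of the weighted individual-
# differences model) and fit two-dimensional metric MDS.
group = ratings.mean(axis=0)
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(group)
print(coords)   # one (x, y) point per item format
```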
Peer reviewed
DeMars, Christine E. – Applied Measurement in Education, 1998
Scores from mathematics (tested at 102 schools) and science (tested at 99 schools) sections of pilot forms of the Michigan High School Proficiency Test were examined for interaction between gender and response format (multiple choice or constructed response). Overall, neither males nor females seemed to be disadvantaged by item format. (SLD)
Descriptors: Constructed Response, High School Students, High Schools, Mathematics Tests
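A gender-by-format interaction of the kind DeMars examined means the male-female gap differs between the multiple-choice and constructed-response sections. The sketch below shows the basic difference-of-gaps check on placeholder data; it is not the Michigan analysis or data, and the variable names are illustrative.

```python
import numpy as np

# Placeholder data: per-student proportion correct on the multiple-choice
# (MC) and constructed-response (CR) portions, plus a gender indicator.
rng = np.random.default_rng(1)
n = 400
gender = rng.integers(0, 2, n)          # 0 = male, 1 = female
mc = rng.normal(0.65, 0.15, n)
cr = rng.normal(0.55, 0.18, n)

# With no interaction, the gender gap is about the same in both formats.
gap_mc = mc[gender == 1].mean() - mc[gender == 0].mean()
gap_cr = cr[gender == 1].mean() - cr[gender == 0].mean()
print("MC gap:", round(gap_mc, 3),
      "CR gap:", round(gap_cr, 3),
      "interaction:", round(gap_cr - gap_mc, 3))
```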
Peer reviewed
Ponsoda, Vicente; Olea, Julio; Rodriguez, Maria Soledad; Revuelta, Javier – Applied Measurement in Education, 1999
Compared easy and difficult versions of self-adapted tests (SAT) and computerized adapted tests. No significant differences were found among the tests for estimated ability or posttest state anxiety in studies with 187 Spanish high school students, although other significant differences were found. Discusses implications for interpreting test…
Descriptors: Ability, Adaptive Testing, Comparative Analysis, Computer Assisted Testing