Publication Date
| In 2026 | 0 |
| Since 2025 | 15 |
| Since 2022 (last 5 years) | 56 |
| Since 2017 (last 10 years) | 147 |
| Since 2007 (last 20 years) | 261 |
Descriptor
| Difficulty Level | 426 |
| Multiple Choice Tests | 426 |
| Test Items | 298 |
| Test Construction | 134 |
| Foreign Countries | 131 |
| Item Analysis | 103 |
| Test Format | 96 |
| Test Reliability | 85 |
| Item Response Theory | 79 |
| Test Validity | 74 |
| Higher Education | 70 |
Author
| Tindal, Gerald | 6 |
| Alonzo, Julie | 5 |
| DeBoer, George E. | 5 |
| Herrmann-Abell, Cari F. | 5 |
| Plake, Barbara S. | 5 |
| Cizek, Gregory J. | 4 |
| Huntley, Renee M. | 4 |
| Katz, Irvin R. | 4 |
| Tollefson, Nona | 4 |
| Anderson, Paul S. | 3 |
| Andrich, David | 3 |
Education Level
| Higher Education | 106 |
| Postsecondary Education | 86 |
| Secondary Education | 64 |
| Elementary Education | 45 |
| Middle Schools | 30 |
| High Schools | 24 |
| Junior High Schools | 19 |
| Intermediate Grades | 17 |
| Grade 6 | 13 |
| Grade 7 | 13 |
| Grade 5 | 12 |
Audience
| Researchers | 10 |
| Teachers | 2 |
| Administrators | 1 |
| Practitioners | 1 |
Location
| Turkey | 14 |
| Indonesia | 10 |
| Australia | 8 |
| Canada | 8 |
| Germany | 8 |
| Nigeria | 7 |
| Taiwan | 6 |
| Jordan | 5 |
| Netherlands | 5 |
| California | 4 |
| Malaysia | 4 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 2 |
| Elementary and Secondary… | 1 |
Powers, Sonya; Turhan, Ahmet; Binici, Salih – Pearson, 2012
The population sensitivity of vertical scaling results was evaluated for a state reading assessment spanning grades 3-10 and a state mathematics test spanning grades 3-8. Subpopulations considered included males and females. The 3-parameter logistic model was used to calibrate math and reading items and a common item design was used to construct…
Descriptors: Scaling, Equated Scores, Standardized Tests, Reading Tests
Shuhidan, Shuhaida; Hamilton, Margaret; D'Souza, Daryl – Computer Science Education, 2010
Learning to program is known to be difficult for novices. High attrition and high failure rates in foundation-level programming courses undertaken at tertiary level in Computer Science programs are commonly reported. A common approach to evaluating novice programming ability is through a combination of formative and summative assessments, with…
Descriptors: Teacher Attitudes, Secondary School Teachers, College Faculty, Multiple Choice Tests
Park, Bitnara Jasmine; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2011
This technical report describes the process of development and piloting of reading comprehension measures that are appropriate for seventh-grade students as part of an online progress screening and monitoring assessment system, http://easycbm.com. Each measure consists of an original fictional story of approximately 1,600 to 1,900 words with 20…
Descriptors: Reading Comprehension, Reading Tests, Grade 7, Test Construction
DiBartolomeo, Matthew – ProQuest LLC, 2010
Multiple factors have influenced testing agencies to more carefully consider the manner and frequency in which pretest item data are collected and analyzed. One potentially promising development is judges' estimates of item difficulty. Accurate estimates of item difficulty may be used to reduce pretest sample sizes, supplement insufficient…
Descriptors: Test Items, Group Discussion, Athletics, Pretests Posttests
Wang, Pei-Yu – Journal of Educational Technology, 2013
This study examined the impact of e-book text-tracking design on 4th graders' (10-year-old children's) learning of Chinese characters. The e-books used in this study were created with Adobe Flash CS 5.5 and Action Script 3.0. This study was guided by two main questions: (1) Is there any difference in learning achievement (Chinese character…
Descriptors: Comparative Analysis, Books, Foreign Countries, Grade 4
Kobrin, Jennifer L.; Kim, Rachel; Sackett, Paul – College Board, 2011
There is much debate on the merits and pitfalls of standardized tests for college admission, with questions regarding the format (multiple-choice versus constructed response), cognitive complexity, and content of these assessments (achievement versus aptitude) at the forefront of the discussion. This study addressed these questions by…
Descriptors: College Entrance Examinations, Mathematics Tests, Test Items, Predictive Validity
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – ETS Research Report Series, 2008
This study examined variations of a nonequivalent groups equating design used with constructed-response (CR) tests to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, the study investigated the use of anchor CR item rescoring in the context of classical…
Descriptors: Equated Scores, Comparative Analysis, Test Format, Responses
Lingard, Jennifer; Minasian-Batmanian, Laura; Vella, Gilbert; Cathers, Ian; Gonzalez, Carlos – Assessment & Evaluation in Higher Education, 2009
Effective criterion referenced assessment requires grade descriptors to clarify to students what skills are required to gain higher grades. But do students and staff actually have the same perception of the grading system, and if so, do they perform better than those whose perceptions are less accurately aligned with those of staff? Since…
Descriptors: Feedback (Response), Prior Learning, Physics, Difficulty Level
Vuk, Jasna; Morse, David T. – Research in the Schools, 2009
In this study we observed college students' behavior on two self-tailored, multiple-choice exams. Self-tailoring was defined as an option to omit up to five items from being scored on an exam. Participants, 80 undergraduate college students enrolled in two sections of an educational psychology course, statistically significantly improved their…
Descriptors: College Students, Educational Psychology, Academic Achievement, Correlation
Sawchuk, Stephen – Education Digest: Essential Readings Condensed for Quick Review, 2010
Most experts in the testing community have presumed that the $350 million promised by the U.S. Department of Education to support common assessments would promote those that made greater use of open-ended items capable of measuring higher-order critical-thinking skills. But as measurement experts consider the multitude of possibilities for an…
Descriptors: Educational Quality, Test Items, Comparative Analysis, Multiple Choice Tests
Kaliski, Pamela; Huff, Kristen; Barry, Carol – College Board, 2011
For educational achievement tests that employ multiple-choice (MC) items and aim to reliably classify students into performance categories, it is critical to design MC items that are capable of discriminating student performance according to the stated achievement levels. This is accomplished, in part, by clearly understanding how item design…
Descriptors: Alignment (Education), Academic Achievement, Expertise, Evaluative Thinking
Smith, Richard B. – Journal of Educational Measurement, 1970
Descriptors: Classification, Difficulty Level, Multiple Choice Tests
Kopp, Veronika; Stark, Robin; Heitzmann, Nicole; Fischer, Martin R. – Evaluation & Research in Education, 2009
To foster medical students' diagnostic knowledge, a case-based worked example approach was implemented in the context of a computer-based learning environment. Thirty medical students were randomly assigned to the condition "with erroneous examples", and 31 students learned with correct examples. Diagnostic knowledge was operationalised…
Descriptors: Medical Students, Computer Assisted Instruction, Multiple Choice Tests, Independent Study
Sinharay, Sandip; Holland, Paul W. – Educational Testing Service, 2008
The nonequivalent groups with anchor test (NEAT) design involves missing data that are missing by design. Three popular equating methods that can be used with a NEAT design are the poststratification equating method, the chain equipercentile equating method, and the item-response-theory observed-score-equating method. These three methods each…
Descriptors: Equated Scores, Test Items, Item Response Theory, Data
Tan, Kim Chwee Daniel; Taber, Keith S.; Liu, Xiufeng; Coll, Richard K.; Lorenzo, Mercedes; Li, Jia; Goh, Ngoh Khang; Chia, Lian Sai – International Journal of Science Education, 2008
Previous studies have indicated that A-level students in the UK and Singapore have difficulty learning the topic of ionisation energy. A two-tier multiple-choice instrument developed in Singapore in an earlier study, the Ionisation Energy Diagnostic Instrument, was administered to A-level students in the UK, advanced placement high school students…
Descriptors: College Freshmen, Difficulty Level, Advanced Placement, Foreign Countries

