Publication Date
| In 2026 | 0 |
| Since 2025 | 66 |
| Since 2022 (last 5 years) | 301 |
| Since 2017 (last 10 years) | 682 |
| Since 2007 (last 20 years) | 1203 |
Descriptor
| Test Construction | 2711 |
| Test Items | 2711 |
| Test Validity | 706 |
| Test Reliability | 565 |
| Foreign Countries | 548 |
| Item Analysis | 486 |
| Difficulty Level | 428 |
| Multiple Choice Tests | 413 |
| Item Response Theory | 394 |
| Computer Assisted Testing | 385 |
| Test Format | 363 |
Audience
| Practitioners | 155 |
| Teachers | 116 |
| Researchers | 99 |
| Administrators | 32 |
| Students | 17 |
| Policymakers | 6 |
| Parents | 4 |
| Counselors | 3 |
| Support Staff | 3 |
Location
| Turkey | 65 |
| Australia | 53 |
| Canada | 30 |
| Indonesia | 29 |
| Florida | 26 |
| Germany | 24 |
| United Kingdom | 23 |
| United Kingdom (England) | 20 |
| China | 19 |
| Japan | 16 |
| Oregon | 16 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
| Does not meet standards | 1 |
Yanyan Fu – Educational Measurement: Issues and Practice, 2024
The template-based automated item-generation (TAIG) approach that involves template creation, item generation, item selection, field-testing, and evaluation has more steps than the traditional item development method. Consequentially, there is more margin for error in this process, and any template errors can be cascaded to the generated items.…
Descriptors: Error Correction, Automation, Test Items, Test Construction
Changiz Mohiyeddini – Anatomical Sciences Education, 2025
This article presents a step-by-step guide to using R and SPSS to bootstrap exam questions. Bootstrapping, a versatile nonparametric analytical technique, can help to improve the psychometric qualities of exam questions in the process of quality assurance. Bootstrapping is particularly useful in disciplines such as medical education, where student…
Descriptors: Test Items, Sampling, Statistical Inference, Nonparametric Statistics
Séverin Lions; María Paz Blanco; Pablo Dartnell; Carlos Monsalve; Gabriel Ortega; Julie Lemarié – Applied Measurement in Education, 2024
Multiple-choice items are universally used in formal education. Since they should assess learning, not test-wiseness or guesswork, they must be constructed following the highest possible standards. Hundreds of item-writing guides have provided guidelines to help test developers adopt appropriate strategies to define the distribution and sequence…
Descriptors: Test Construction, Multiple Choice Tests, Guidelines, Test Items
Robert J. Marzano; Bridget Cahill; Jeni Gotto; Brian J. Kosena; Michael Lynch; Lucy Pearson – Solution Tree, 2025
In "Test-Specific Thinking," the authors provide recommended practices, methods, and means for educators to implement structural schemas into teaching, helping students better prepare for tests and formulate stronger responses to certain question frames. Armed with a better understanding of how tests are designed, teachers will increase…
Descriptors: English Instruction, Language Arts, Mathematics Tests, Test Construction
Becker, Benjamin; Weirich, Sebastian; Goldhammer, Frank; Debeer, Dries – Journal of Educational Measurement, 2023
When designing or modifying a test, an important challenge is controlling its speededness. To achieve this, van der Linden (2011a, 2011b) proposed using a lognormal response time model, more specifically the two-parameter lognormal model, and automated test assembly (ATA) via mixed integer linear programming. However, this approach has a severe…
Descriptors: Test Construction, Automation, Models, Test Items
Miguel A. García-Pérez – Educational and Psychological Measurement, 2024
A recurring question regarding Likert items is whether the discrete steps that this response format allows represent constant increments along the underlying continuum. This question appears unsolvable because Likert responses carry no direct information to this effect. Yet, any item administered in Likert format can identically be administered…
Descriptors: Likert Scales, Test Construction, Test Items, Item Analysis
Po-Chun Huang; Ying-Hong Chan; Ching-Yu Yang; Hung-Yuan Chen; Yao-Chung Fan – IEEE Transactions on Learning Technologies, 2024
Question generation (QG) task plays a crucial role in adaptive learning. While significant QG performance advancements are reported, the existing QG studies are still far from practical usage. One point that needs strengthening is to consider the generation of question group, which remains untouched. For forming a question group, intrafactors…
Descriptors: Automation, Test Items, Computer Assisted Testing, Test Construction
Mahmood Ul Hassan; Frank Miller – Journal of Educational Measurement, 2024
Multidimensional achievement tests are recently gaining more importance in educational and psychological measurements. For example, multidimensional diagnostic tests can help students to determine which particular domain of knowledge they need to improve for better performance. To estimate the characteristics of candidate items (calibration) for…
Descriptors: Multidimensional Scaling, Achievement Tests, Test Items, Test Construction
Anne Traynor; Sara C. Christopherson – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Chan Zhang; Shuaiying Cao; Minglei Wang; Jiangyan Wang; Lirui He – Field Methods, 2025
Previous research on grid questions has mostly focused on their comparability with the item-by-item method and the use of shading to help respondents navigate through a grid. This study extends prior work by examining whether lexical similarity among grid items affects how respondents answer the questions in an experiment where we manipulated…
Descriptors: Foreign Countries, Surveys, Test Construction, Design
Christoph Ableitinger; Christian Dorner – International Journal of Mathematical Education in Science and Technology, 2025
The number of complaints university lecturers make about a lack of knowledge, especially first-year students' procedural knowledge, has increased recently. Due to missing adequate empirical evidence, a survey of procedural knowledge among students of Austrian high schools in their final year was conducted. For this purpose, test items for…
Descriptors: Knowledge Level, Cognitive Processes, High School Seniors, Foreign Countries
Christopher J. Anthony; Stephen N. Elliott – School Mental Health, 2025
Stress is a complex construct that is related to resilience and general health starting in childhood. Despite its importance for student health and well-being, there are few measures of stress designed for school-based applications. In this study, we developed and initially validated a Stress Indicators Scale using five samples of teachers,…
Descriptors: Test Construction, Stress Variables, Test Validity, Test Items
Ikkyu Choi; Jiyun Zu – Language Testing, 2025
Today's language models can produce syntactically accurate and semantically coherent texts. This capability presents new opportunities for generating content for language assessments, which have traditionally required intensive expert resources. However, these models are also known to generate biased texts, leading to representational harms.…
Descriptors: Artificial Intelligence, Language Tests, Test Bias, Test Construction
Anna Planas-Lladó; Xavier Úcar – American Journal of Evaluation, 2024
Empowerment is a concept that has become increasingly used over recent years. However, little research has been undertaken into how empowerment can be evaluated, particularly in the case of young people. The aim of this article is to present an inventory of dimensions and indicators of youth empowerment. The article describes the various phases in…
Descriptors: Youth, Empowerment, Test Construction, Test Validity
Ildiko Porter-Szucs; Cynthia J. Macknish; Suzanne Toohey – John Wiley & Sons, Inc, 2025
"A Practical Guide to Language Assessment" helps educators at every level redefine their approach to language assessment. Grounded in extensive research and aligned with the latest advances in language education, this comprehensive guide introduces foundational concepts and explores key principles in test development and item writing.…
Descriptors: Student Evaluation, Language Tests, Test Construction, Test Items

