Publication Date
| In 2026 | 0 |
| Since 2025 | 52 |
| Since 2022 (last 5 years) | 194 |
| Since 2017 (last 10 years) | 494 |
| Since 2007 (last 20 years) | 742 |
Descriptor
| Test Items | 1186 |
| Test Reliability | 1186 |
| Test Validity | 684 |
| Test Construction | 565 |
| Foreign Countries | 348 |
| Difficulty Level | 279 |
| Item Analysis | 252 |
| Psychometrics | 233 |
| Item Response Theory | 219 |
| Factor Analysis | 183 |
| Multiple Choice Tests | 172 |
Author
| Schoen, Robert C. | 12 |
| LaVenia, Mark | 5 |
| Liu, Ou Lydia | 5 |
| Anderson, Daniel | 4 |
| Bauduin, Charity | 4 |
| DiLuzio, Geneva J. | 4 |
| Farina, Kristy | 4 |
| Haladyna, Thomas M. | 4 |
| Huck, Schuyler W. | 4 |
| Petscher, Yaacov | 4 |
| Stansfield, Charles W. | 4 |
Audience
| Practitioners | 39 |
| Researchers | 30 |
| Teachers | 24 |
| Administrators | 13 |
| Support Staff | 3 |
| Counselors | 2 |
| Students | 2 |
| Community | 1 |
| Parents | 1 |
| Policymakers | 1 |
Location
| Turkey | 68 |
| Indonesia | 37 |
| Germany | 20 |
| Canada | 17 |
| Florida | 17 |
| China | 16 |
| Australia | 15 |
| California | 12 |
| Iran | 11 |
| India | 10 |
| New York | 9 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024
This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…
Descriptors: Korean, Test Validity, Test Reliability, Imitation
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
E. B. Merki; S. I. Hofer; A. Vaterlaus; A. Lichtenberger – Physical Review Physics Education Research, 2025
When describing motion in physics, the selection of a frame of reference is crucial. The graph of a moving object can look quite different based on the frame of reference. In recent years, various tests have been developed to assess the interpretation of kinematic graphs, but none of these tests have specifically addressed differences in reference…
Descriptors: Graphs, Motion, Physics, Secondary School Students
Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024
Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…
Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction
Melissa Whatley; Dominique Foster; Stephen Paul – Journal of Studies in International Education, 2024
The purpose of this study was to develop a measurement instrument that scholars and practitioners in international education can use as a means of exploring whether and how individuals who come into contact with international education programs develop a greater sense of cultural humility. Specifically, the study described here outlines the four…
Descriptors: Foreign Students, Cultural Awareness, Consciousness Raising, Test Construction
Emily A. Holt; Jessica Duke; Ryan Dunk; Krystal Hinerman – Environmental Education Research, 2024
Student understanding of climate change is an active and growing area of research, but little research has documented undergraduate students' knowledge about the biotic impacts of climate change. Here, we address this literature gap by presenting the Inventory of Biotic Climate Literacy (IBCL), a concept inventory developed to assess undergraduate…
Descriptors: Climate, Undergraduate Students, Knowledge Level, Test Construction
Kent Anderson Seidel – School Leadership Review, 2025
This paper examines one of three central diagnostic tools of the Concerns Based Adoption Model, the Stages of Concern Questionnaire (SoCQ). The SoCQ was developed with a focus on K12 education. It has been used widely since developed in 1973, in early childhood, higher education, medical, business, community, and military settings. The SoCQ…
Descriptors: Questionnaires, Educational Change, Educational Innovation, Intervention
Zyluk, Natalia; Karpe, Karolina; Urbanski, Mariusz – SAGE Open, 2022
The aim of this paper is to describe the process of modification of the research tool designed for measuring the development of personal epistemology--"Standardized Epistemological Understanding Assessment" (SEUA). SEUA was constructed as an improved version of the instrument initially proposed by Kuhn et al. SEUA was proved to be a more…
Descriptors: Epistemology, Research Tools, Beliefs, Test Items
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022
The reliability of a test score is usually underestimated and the deflation may be profound, 0.40 - 0.60 units of reliability or 46 - 71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…
Descriptors: Test Reliability, Scores, Test Items, Correlation
Marta Montenegro-Rueda; José María Fernández-Batanero – European Journal of Special Needs Education, 2024
The instruments for the evaluation of teachers' digital competence are abundant; however, there is still a lack of instruments oriented to the context of Special Education. In this sense, this study presents the validation process of an instrument that aims to determine the level of knowledge and digital competence of Special Education teachers…
Descriptors: Teacher Competencies, Technological Literacy, Special Education Teachers, Test Construction
Mahdi Ghorbankhani; Keyvan Salehi – SAGE Open, 2025
Academic procrastination, the tendency to delay academic tasks without reasonable justification, has significant implications for students' academic performance and overall well-being. To measure this construct, numerous scales have been developed, among which the Academic Procrastination Scale (APS) has shown promise in assessing academic…
Descriptors: Psychometrics, Measures (Individuals), Time Management, Foreign Countries
Martin Steinbach; Carolin Eitemüller; Marc Rodemer; Maik Walpuski – International Journal of Science Education, 2025
The intricate relationship between representational competence and content knowledge in organic chemistry has been widely debated, and the ways in which representations contribute to task difficulty, particularly in assessment, remain unclear. This paper presents a multiple-choice test instrument for assessing individuals' knowledge of fundamental…
Descriptors: Organic Chemistry, Difficulty Level, Multiple Choice Tests, Fundamental Concepts
Al Lawati, Zahra Ali – Language Testing in Asia, 2023
This study discusses the characteristics of test specifications (specs) and item writer guidelines (IWGs), their role in item development of English as a Second Language (ESL) reading tests, and the use of the CEFR for specs development. This mixed-method study analyzed specs, IWGs, tests, and the Pearson Test of English General test statistics.…
Descriptors: Language Tests, Test Items, Test Construction, English (Second Language)
Aybek, Eren Can; Toraman, Cetin – International Journal of Assessment Tools in Education, 2022
The current study investigates the optimum number of response categories for the Likert type of scales under the item response theory (IRT). The data were collected from university students attending mainly the faculty of medicine and the faculty of education. A form of the "Social Gender Equity Scale" developed by Gozutok et al. (2017)…
Descriptors: Likert Scales, Item Response Theory, College Students, Test Reliability
Kamau Oginga Siwatu; Kara Page; Narges Hadi – College Teaching, 2024
The purpose of this article is to document the development of a new measure of teaching self-efficacy -- "The College Teaching Self-Efficacy (CTSE) Scale." We designed the CTSE scale to examine individuals' beliefs in their abilities to perform specific teaching tasks in a college classroom successfully. We developed an instrument that…
Descriptors: Self Efficacy, Beliefs, Psychometrics, Measures (Individuals)

