Sinharay, Sandip – Educational Measurement: Issues and Practice, 2018
The choice of anchor tests is crucial in applications of the nonequivalent groups with anchor test design of equating. Sinharay and Holland (2006, 2007) suggested "miditests," which are anchor tests that are content-representative and have the same mean item difficulty as the total test but have a smaller spread of item difficulties.…
Descriptors: Test Content, Difficulty Level, Test Items, Test Construction
Kam, Chester Chun Seng – Sociological Methods & Research, 2018
The item wording (or keying) effect is respondents' differential response style to positively and negatively worded items. Despite decades of research, the nature of the effect is still unclear. This article proposes a potential reason; namely, that the item wording effect is scale-specific, and thus findings are applicable only to a particular…
Descriptors: Response Style (Tests), Test Items, Language Usage, College Students
Howard, Matt C. – Practical Assessment, Research & Evaluation, 2018
Scale pretests analyze the suitability of individual scale items for further analysis, whether through judging their face validity, wording concerns, and/or other aspects. The current article reviews scale pretests, separated by qualitative and quantitative methods, in order to identify the differences, similarities, and even existence of the…
Descriptors: Pretesting, Measures (Individuals), Test Items, Statistical Analysis
International Journal of Testing, 2018
The second edition of the International Test Commission Guidelines for Translating and Adapting Tests was prepared between 2005 and 2015 to improve upon the first edition, and to respond to advances in testing technology and practices. The 18 guidelines are organized into six categories to facilitate their use: pre-condition (3), test development…
Descriptors: Translation, Test Construction, Testing, Scoring
Sari, Halil Ibrahim; Karaman, Mehmet Akif – International Journal of Assessment Tools in Education, 2018
The current study applies both classical test theory (CTT) and item response theory (IRT) to psychology data. It discusses item-level analyses of the General Mattering Scale produced under the two theories, as well as the strengths and weaknesses of both measurement approaches. The survey consisted of a total of five Likert-type…
Descriptors: Measures (Individuals), Test Theory, Item Response Theory, Likert Scales
Becker, Anthony; Nekrasova-Beker, Tatiana – Educational Assessment, 2018
While previous research has identified numerous factors that contribute to item difficulty, studies involving large-scale reading tests have provided mixed results. This study examined five selected-response item types used to measure reading comprehension in the Pearson Test of English Academic: a) multiple-choice (choose one answer), b)…
Descriptors: Reading Comprehension, Test Items, Reading Tests, Test Format
Rubright, Jonathan D. – Educational Measurement: Issues and Practice, 2018
Performance assessments, scenario-based tasks, and other groups of items carry a risk of violating the local item independence assumption made by unidimensional item response theory (IRT) models. Previous studies have identified negative impacts of ignoring such violations, most notably inflated reliability estimates. Still, the influence of this…
Descriptors: Performance Based Assessment, Item Response Theory, Models, Test Reliability
Measuring Student Understanding of Genetics: Psychometric, Cognitive, and Demographic Considerations
Tornabene, Robyn – ProQuest LLC, 2018
Genetics is universally recognized as a core aspect of biological and scientific literacy. Beyond genetics' own role as a major unifying topic within the biological sciences, understanding genetics is essential for understanding other integral ideas such as evolution and development. Genetics understanding also underlies public decision making…
Descriptors: Item Response Theory, Biology, Undergraduate Students, Majors (Students)
Kim, Kyung Yong; Lee, Won-Chan – Applied Measurement in Education, 2017
This article provides a detailed description of three factors (specification of the ability distribution, numerical integration, and frame of reference for the item parameter estimates) that might affect the item parameter estimation of the three-parameter logistic model, and compares five item calibration methods, which are combinations of the…
Descriptors: Test Items, Item Response Theory, Comparative Analysis, Methods
Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H. – Journal of Educational Measurement, 2017
Competence data from low-stakes educational large-scale assessment studies allow for evaluating relationships between competencies and other variables. The impact of item-level nonresponse has not been investigated with regard to statistics that determine the size of these relationships (e.g., correlations, regression coefficients). Classical…
Descriptors: Test Items, Cognitive Measurement, Testing Problems, Regression (Statistics)
Bayaydah, Areen Mohammad; Altwissi, Ahmad Issa – International Online Journal of Primary Education, 2020
This study aimed to identify and analyze the patterns of final exam questions prepared by English teachers for the 9th and 10th grades and to analyze all the revision questions presented in the English language textbooks in Jordan, based on Bloom's taxonomy to determine the nature and types of these questions. The sample of the study consisted of…
Descriptors: Taxonomy, Textbook Content, Content Analysis, Language Tests
Chen, Michelle Y.; Flasko, Jennifer J. – Canadian Journal of Applied Linguistics / Revue canadienne de linguistique appliquée, 2020
Seeking evidence to support content validity is essential to test validation. This is especially the case in contexts where test scores are interpreted in relation to external proficiency standards and where new test content is constantly being produced to meet test administration and security demands. In this paper, we describe a modified…
Descriptors: Foreign Countries, Reading Tests, Language Tests, English (Second Language)
Basaraba, Deni L.; Yovanoff, Paul; Shivraj, Pooja; Ketterlin-Geller, Leanne R. – Practical Assessment, Research & Evaluation, 2020
Stopping rules for fixed-form tests with graduated item difficulty are intended to stop administration of a test at the point where students are sufficiently unlikely to provide a correct response following a pattern of incorrect responses. Although widely employed in fixed-form tests in education, little research has been done to empirically…
Descriptors: Formative Evaluation, Test Format, Test Items, Difficulty Level
Lundgren, Erik; Eklöf, Hanna – Educational Research and Evaluation, 2020
The present study used process data from a computer-based problem-solving task as indications of behavioural level of test-taking effort, and explored how behavioural item-level effort related to overall test performance and self-reported effort. Variables were extracted from raw process data and clustered. Four distinct clusters were obtained and…
Descriptors: Computer Assisted Testing, Problem Solving, Response Style (Tests), Test Items
Jia, Bing; He, Dan; Zhu, Zhemin – Problems of Education in the 21st Century, 2020
The quality of multiple-choice questions (MCQs) and students' solving behavior on MCQs are educational concerns. MCQs cover wide educational content and can be scored immediately and accurately. However, many studies have found flawed items in this exam type, possibly resulting in misleading insights into students'…
Descriptors: Foreign Countries, Multiple Choice Tests, Test Items, Item Response Theory

