Publication Date
In 2025: 2
Since 2024: 3
Since 2021 (last 5 years): 12
Since 2016 (last 10 years): 23
Since 2006 (last 20 years): 36
Descriptor
Difficulty Level: 84
Item Analysis: 84
Test Reliability: 84
Test Items: 65
Test Validity: 42
Test Construction: 39
Multiple Choice Tests: 27
Foreign Countries: 26
Higher Education: 16
Statistical Analysis: 16
Achievement Tests: 14
Education Level
Higher Education: 15
Postsecondary Education: 12
Secondary Education: 10
Elementary Education: 7
Middle Schools: 4
Intermediate Grades: 3
Early Childhood Education: 2
Grade 1: 2
Grade 6: 2
Kindergarten: 2
Primary Education: 2
Audience
Researchers: 5
Practitioners: 3
Teachers: 1
Location
Turkey: 5
Nigeria: 4
India: 3
United States: 3
Indonesia: 2
Bosnia and Herzegovina: 1
Canada: 1
Finland: 1
Florida: 1
Georgia: 1
Germany: 1
Laws, Policies, & Programs
Elementary and Secondary… | 1 |
Assessments and Surveys
Cattell Culture Fair…: 1
Flesch Kincaid Grade Level…: 1
Flesch Reading Ease Formula: 1
Graduate Management Admission…: 1
Graduate Record Examinations: 1
Matching Familiar Figures Test: 1
Stanford Binet Intelligence…: 1
Thompson, Kathryn N. – ProQuest LLC, 2023
It is imperative to collect validity evidence prior to interpreting and using test scores. During the process of collecting validity evidence, test developers should consider whether test scores are contaminated by sources of extraneous information. This is referred to as construct irrelevant variance, or the "degree to which test scores are…
Descriptors: Test Wiseness, Test Items, Item Response Theory, Scores
Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025
The current paper uses the Many-Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…
Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction
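For reference, the abstract does not state the model's functional form; in Linacre's standard formulation (an assumption here, not quoted from the study), the Many-Facet Rasch Model with facets for test takers, situations (items), raters, and rating-scale categories is

\log\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k

where B_n is the ability of test taker n, D_i the difficulty of situation i, C_j the severity of rater j, and F_k the threshold between rating categories k-1 and k.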
Ferrari-Bridgers, Franca – International Journal of Listening, 2023
While many tools exist to assess student content knowledge, there are few that assess whether students display the critical listening skills necessary to interpret the quality of a speaker's message at the college level. The following research provides preliminary evidence for the internal consistency and factor structure of a tool, the…
Descriptors: Factor Structure, Test Validity, Community College Students, Test Reliability
Büsra Kilinç; Mehmet Diyaddin Yasar – Science Insights Education Frontiers, 2024
This study aimed to develop an achievement test based on the learning outcomes of the sound and its properties unit in the sixth-grade science course. In the test development phase, a literature review was first conducted. Then, 30 multiple-choice questions aligned with the learning outcomes in the 2018…
Descriptors: Science Tests, Test Construction, Grade 6, Science Instruction
Dina Kamber Hamzic; Mirsad Trumic; Ismar Hadžalic – International Electronic Journal of Mathematics Education, 2025
Trigonometry is an important part of secondary school mathematics, but it is usually challenging for students to understand and learn. Since trigonometry is learned and used at the university level in many fields, such as physics and geodesy, it is important to have insight into students' trigonometry knowledge before the beginning of the university…
Descriptors: Trigonometry, Mathematics Instruction, Prior Learning, Outcomes of Education
Tim Jacobbe; Bob delMas; Brad Hartlaub; Jeff Haberstroh; Catherine Case; Steven Foti; Douglas Whitaker – Numeracy, 2023
The development of assessments as part of the funded LOCUS project is described. The assessments measure students' conceptual understanding of statistics as outlined in the GAISE PreK-12 Framework. Results are reported from a large-scale administration to 3,430 students in grades 6 through 12 in the United States. Items were designed to assess…
Descriptors: Statistics Education, Common Core State Standards, Student Evaluation, Elementary School Students
Slepkov, A. D.; Van Bussel, M. L.; Fitze, K. M.; Burr, W. S. – SAGE Open, 2021
There is a broad literature on multiple-choice test development, both in terms of item-writing guidelines and psychometric functionality as a measurement tool. However, most of the published literature concerns multiple-choice testing in the context of expert-designed high-stakes standardized assessments, with little attention being paid to the…
Descriptors: Foreign Countries, Undergraduate Students, Student Evaluation, Multiple Choice Tests
Hartono, Wahyu; Hadi, Samsul; Rosnawati, Raden; Retnawati, Heri – Pegem Journal of Education and Instruction, 2023
Researchers design diagnostic assessments to measure students' knowledge structures and processing skills to provide information about their cognitive attributes. The purpose of this study is to determine the instrument's validity and score reliability, as well as to investigate the use of classical test theory to identify item characteristics. The…
Descriptors: Diagnostic Tests, Test Validity, Item Response Theory, Content Validity
Aleyna Altan; Zehra Taspinar Sener – Online Submission, 2023
This research aimed to develop a valid and reliable test for detecting sixth-grade students' misconceptions and errors regarding fractions. A misconception diagnostic test has been developed that includes the concept of fractions, different representations of fractions, ordering and comparing fractions, equivalence of…
Descriptors: Diagnostic Tests, Mathematics Tests, Fractions, Misconceptions
Cifci, Musa; Kaplan, Kadir – Turkish Online Journal of Educational Technology - TOJET, 2020
An achievement test was prepared to measure students' caricature reading skills. In the first draft of the achievement test, 32 test items were prepared, each with four answer choices. Item analysis was conducted on the data obtained from the pilot administration, and the internal consistency coefficient (KR-20) was calculated as 0.67 for the…
Descriptors: Reading Tests, Achievement Tests, Reading Skills, Literary Devices
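The KR-20 value of 0.67 reported above is a standard internal-consistency coefficient for dichotomously scored items. A minimal sketch of how it is computed from a 0/1 response matrix, assuming NumPy; the function name is illustrative, not taken from the study:

import numpy as np

def kr20(responses):
    # responses: 2-D array, rows = examinees, columns = items, scored 0/1
    X = np.asarray(responses, dtype=float)
    k = X.shape[1]                            # number of items
    p = X.mean(axis=0)                        # proportion correct per item
    q = 1.0 - p
    total_var = X.sum(axis=1).var(ddof=1)     # variance of examinees' total scores
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_var)

Whether sample or population variance is used in the denominator varies across sources; the abstract does not say which convention the authors followed.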
Al-Jarf, Reima – Online Submission, 2023
This study explores the similarities and differences between English and Arabic numeral-based formulaic expressions, and difficulties that student-translators have with them. A corpus of English and Arabic numeral-based formulaic expressions containing zero, two, three, twenty, sixty, hundred, thousand…etc., and another corpus of specialized…
Descriptors: Translation, Arabic, Contrastive Linguistics, Phrase Structure
Rafi, Ibnu; Retnawati, Heri; Apino, Ezi; Hadiana, Deni; Lydiati, Ida; Rosyada, Munaya Nikma – Pedagogical Research, 2023
This study describes the characteristics of the test and its items used in the national-standardized school examination by applying classical test theory, focusing on item difficulty, item discrimination, test reliability, and distractor analysis. We analyzed response data from 191 12th graders at one of the public senior high schools in…
Descriptors: Foreign Countries, National Competency Tests, Standardized Tests, Mathematics Tests
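As a rough illustration of the classical test theory statistics this entry names, item difficulty and (corrected point-biserial) item discrimination can be computed from a 0/1 response matrix as below; the code is a sketch under that assumption, not the authors' analysis, and the names are hypothetical:

import numpy as np

def ctt_item_statistics(responses):
    # responses: 2-D array, rows = examinees, columns = items, scored 0/1
    X = np.asarray(responses, dtype=float)
    total = X.sum(axis=1)
    difficulty = X.mean(axis=0)   # proportion of examinees answering each item correctly
    # correlate each item with the total score excluding that item
    discrimination = np.array([
        np.corrcoef(X[:, i], total - X[:, i])[0, 1] for i in range(X.shape[1])
    ])
    return difficulty, discrimination

Distractor analysis, also mentioned in the abstract, additionally tabulates how often each incorrect option is chosen by high- and low-scoring examinees, which requires the raw option selections rather than 0/1 scores.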
Omarov, Nazarbek Bakytbekovich; Mohammed, Aisha; Alghurabi, Ammar Muhi Khleel; Alallo, Hajir Mahmood Ibrahim; Ali, Yusra Mohammed; Hassan, Aalaa Yaseen; Demeuova, Lyazat; Viktorovna, Shvedova Irina; Nazym, Bekenova; Al Khateeb, Nashaat Sultan Afif – International Journal of Language Testing, 2023
The Multiple-choice (MC) item format is commonly used in educational assessments due to its economy and effectiveness across a variety of content domains. However, numerous studies have examined the quality of MC items in high-stakes and higher-education assessments and found many flawed items, especially in terms of distractors. These faulty…
Descriptors: Test Items, Multiple Choice Tests, Item Response Theory, English (Second Language)
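The snippet does not identify the specific item response theory model used; a commonly applied form for multiple-choice data, whose pseudo-guessing parameter partly reflects how well the distractors function, is the three-parameter logistic model

P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}

where \theta is examinee ability and a_i, b_i, c_i are the discrimination, difficulty, and pseudo-guessing parameters of item i.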
Papenberg, Martin; Musch, Jochen – Applied Measurement in Education, 2017
In multiple-choice tests, the quality of distractors may be more important than their number. We therefore examined the joint influence of distractor quality and quantity on test functioning by providing a sample of 5,793 participants with five parallel test sets consisting of items that differed in the number and quality of distractors.…
Descriptors: Multiple Choice Tests, Test Items, Test Validity, Test Reliability
Smith, Tamarah; Smith, Samantha – International Journal of Teaching and Learning in Higher Education, 2018
The Research Methods Skills Assessment (RMSA) was created to measure psychology majors' statistics knowledge and skills. The American Psychological Association's Guidelines for the Undergraduate Major in Psychology (APA, 2007, 2013) served as a framework for development. Results from a Rasch analysis with data from n = 330 undergraduates showed…
Descriptors: Psychology, Statistics, Undergraduate Students, Item Response Theory
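For reference, the Rasch analysis named above rests on the dichotomous Rasch model (the estimation software and method are not stated in the snippet):

P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{e^{\theta_n - b_i}}{1 + e^{\theta_n - b_i}}

where \theta_n is the ability of person n and b_i the difficulty of item i, both expressed on a common logit scale.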