Publication Date
| In 2026 | 0 |
| Since 2025 | 7 |
| Since 2022 (last 5 years) | 36 |
Descriptor
| Test Validity | 36 |
| Testing | 36 |
| Test Reliability | 20 |
| Test Construction | 14 |
| Evaluation Methods | 9 |
| Foreign Countries | 9 |
| Student Evaluation | 9 |
| Test Format | 9 |
| Language Tests | 8 |
| Test Items | 8 |
| Computer Assisted Testing | 7 |
Author
| Jeff Allen | 2 |
| Ty Cruce | 2 |
| Al-Tamimi, Mohammad | 1 |
| Amery D. Wu | 1 |
| Amit Sevak | 1 |
| Amy K. Clark | 1 |
| Andres De Los Reyes | 1 |
| Anne Wicks | 1 |
| Badran, Darwish | 1 |
| Beach, Pamela | 1 |
| Belknap, Katriana | 1 |
Audience
| Teachers | 3 |
| Policymakers | 1 |
| Students | 1 |
Location
| Indonesia | 2 |
| Nebraska | 2 |
| Philippines | 2 |
| California | 1 |
| Iran | 1 |
| Israel (Tel Aviv) | 1 |
| New York | 1 |
| Norway | 1 |
| South Africa | 1 |
Assessments and Surveys
| ACT Assessment | 2 |
| Measures of Academic Progress | 2 |
| Program for International… | 1 |
| Test of English as a Foreign… | 1 |
Sherwin E. Balbuena – Online Submission, 2024
This study introduces a new chi-square test statistic for testing the equality of response frequencies among distracters in multiple-choice tests. The formula uses the information from the number of correct answers and wrong answers, which becomes the basis of calculating the expected values of response frequencies per distracter. The method was…
Descriptors: Multiple Choice Tests, Statistics, Test Validity, Testing
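As a rough illustration of the idea in this abstract, the sketch below computes the standard Pearson chi-square goodness-of-fit statistic for whether the wrong answers to one item are spread evenly across its distracters. It is not the new statistic proposed in the study, whose formula is not given in the abstract. The function name `distracter_chi_square`, the equal-split expected value (wrong answers divided by the number of distracters), and the counts are all assumptions made for illustration.

```python
def distracter_chi_square(distracter_counts):
    """Pearson chi-square statistic for equal response frequencies
    among the distracters of one multiple-choice item.

    distracter_counts: observed number of examinees choosing each
    distracter (wrong option). Hypothetical helper, not from the study.
    """
    if len(distracter_counts) < 2:
        raise ValueError("need at least two distracters")
    total_wrong = sum(distracter_counts)
    if total_wrong == 0:
        raise ValueError("no wrong answers to analyse")

    # Assumption: under the null hypothesis every distracter is equally
    # attractive, so each expects an equal share of the wrong answers.
    expected = total_wrong / len(distracter_counts)

    statistic = sum((obs - expected) ** 2 / expected
                    for obs in distracter_counts)
    degrees_of_freedom = len(distracter_counts) - 1
    return statistic, degrees_of_freedom


# Invented data: 100 examinees, 40 correct, 60 wrong answers
# split over three distracters.
counts = [30, 20, 10]
stat, df = distracter_chi_square(counts)
print(f"chi-square = {stat:.2f}, df = {df}")
```

The statistic would then be compared with a chi-square critical value for the given degrees of freedom; a large value suggests that some distracters attract far more or far fewer responses than others.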
Gökhan Iskifoglu – Turkish Online Journal of Educational Technology - TOJET, 2024
This research paper investigated the importance of conducting measurement invariance analysis in developing measurement tools for assessing differences between and among study variables. Most of the studies, which tended to develop an inventory to assess the existence of an attitude, behavior, belief, IQ, or an intuition in a person's…
Descriptors: Testing, Testing Problems, Error of Measurement, Attitude Measures
James Soland – Journal of Research on Educational Effectiveness, 2024
When randomized control trials are not possible, quasi-experimental methods often represent the gold standard. One quasi-experimental method is difference-in-difference (DiD), which compares changes in outcomes before and after treatment across groups to estimate a causal effect. DiD researchers often use fairly exhaustive robustness checks to…
Descriptors: Item Response Theory, Testing, Test Validity, Intervention
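The difference-in-difference comparison described in this abstract can be sketched in its simplest two-group, two-period form: the change in the treated group's mean outcome minus the change in the comparison group's mean outcome. This is a minimal sketch of the textbook estimator only, not the analysis used in the article; the function name `did_estimate` and the scores are invented for illustration.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-difference estimate.

    Each argument is a list of outcome scores. Hypothetical helper
    illustrating the basic estimator, without robustness checks.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The comparison group's change stands in for what would have
    # happened to the treated group without treatment.
    return treated_change - control_change


# Invented test scores before and after an intervention.
effect = did_estimate(
    treated_pre=[50, 52, 54],
    treated_post=[60, 62, 64],
    control_pre=[48, 50, 52],
    control_post=[51, 53, 55],
)
print(effect)
```

The estimate is only interpretable as a causal effect if the two groups would have followed parallel trends without treatment, which is the assumption that robustness checks of this design usually probe.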
Kun Su – ProQuest LLC, 2022
This dissertation provides a start-to-finish description of development, administration, and validation for an online middle-school physics test using a DCM framework with response-time. The first paper illustrated the process of implementing DCM with a careful selection of the content domain and a simulation approach for a Q-matrix construction.…
Descriptors: Science Instruction, Physics, Middle Schools, Testing
Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
Bruno D. Zumbo – International Journal of Assessment Tools in Education, 2023
In line with the journal volume's theme, this essay considers lessons from the past and visions for the future of test validity. In the first part of the essay, a description of historical trends in test validity since the early 1900s leads to the natural question of whether the discipline has progressed in its definition and description of test…
Descriptors: Test Theory, Test Validity, True Scores, Definitions
Susan K. Johnsen – Gifted Child Today, 2024
The author provides a checklist for educators who are selecting technically adequate tests for identifying and referring students for gifted education services and programs. The checklist includes questions related to how the test was normed, reliability and validity studies as well as questions related to types of scores, administration, and…
Descriptors: Test Selection, Academically Gifted, Gifted Education, Test Validity
Ian Phil Canlas; Joyce Molino-Magtolis – Journal of Biological Education, 2024
The use of drawing as an assessment tool to reveal students' conceptions in biology, specifically of human organs and organ systems, is not new; however, the literature lacks attempts to explore and reflect on its usefulness and relevance, specifically in eliciting students' related preconceptions. Making use of a…
Descriptors: Foreign Countries, Preservice Teacher Education, Preservice Teachers, Biology
Yan Jin; Jason Fan – Language Assessment Quarterly, 2023
In language assessment, AI technology has been incorporated in task design, assessment delivery, automated scoring of performance-based tasks, score reporting, and provision of feedback. AI technology is also used for collecting and analyzing performance data in language assessment validation. Research has been conducted to investigate the…
Descriptors: Language Tests, Artificial Intelligence, Computer Assisted Testing, Test Format
Han, Chao – Language Testing, 2022
Over the past decade, testing and assessing spoken-language interpreting has garnered an increasing amount of attention from stakeholders in interpreter education, professional certification, and interpreting research. This is because in these fields assessment results provide a critical evidential basis for high-stakes decisions, such as the…
Descriptors: Translation, Language Tests, Testing, Evaluation Methods
Jeff Allen; Jay Thomas; Stacy Dreyer; Scott Johanningmeier; Dana Murano; Ty Cruce; Xin Li; Edgar Sanchez – ACT Education Corp., 2025
This report describes the process of developing and validating the enhanced ACT. The report describes the changes made to the test content and the processes by which these design decisions were implemented. The authors describe how they shared the overall scope of the enhancements, including the initial blueprints, with external expert panels,…
Descriptors: College Entrance Examinations, Testing, Change, Test Construction
Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction
Anne Wicks; Robin Berkley – George W. Bush Institute, 2025
Assessments are one of the most important--and often misunderstood--elements of education. In most cases, tests are administered by the state as well as by districts and schools. Assessments at each of these levels have distinct purposes, yield different information, and are part of a powerful, coordinated approach to improving student outcomes.…
Descriptors: Student Evaluation, Testing, Tests, Standardized Tests
Militsa G. Ivanova; Michalis P. Michaelides – Practical Assessment, Research & Evaluation, 2023
Research on methods for measuring examinee engagement with constructed-response items is limited. The present study used data from the PISA 2018 Reading domain to construct and compare indicators of test-taking effort on constructed-response items: response time, number of actions, the union (combining effortless responses detected by either…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
Jeff Allen; Ty Cruce – ACT Education Corp., 2025
This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…
Descriptors: College Entrance Examinations, Testing, Change, Scores

