Publication Date
| In 2026 | 8 |
| Since 2025 | 2276 |
| Since 2022 (last 5 years) | 12791 |
| Since 2017 (last 10 years) | 33916 |
| Since 2007 (last 20 years) | 68407 |
Descriptor
| Foreign Countries | 30560 |
| Test Validity | 21743 |
| Scores | 18256 |
| Academic Achievement | 16928 |
| Test Construction | 16756 |
| Test Reliability | 15028 |
| Achievement Tests | 14859 |
| Standardized Tests | 14720 |
| Comparative Analysis | 14431 |
| Elementary Secondary Education | 13042 |
| Language Tests | 12551 |
Audience
| Practitioners | 5034 |
| Teachers | 3393 |
| Researchers | 2630 |
| Policymakers | 1232 |
| Administrators | 978 |
| Students | 687 |
| Parents | 325 |
| Counselors | 216 |
| Community | 162 |
| Support Staff | 50 |
| Media Staff | 34 |
Location
| Turkey | 2822 |
| Australia | 2426 |
| Canada | 2270 |
| California | 1854 |
| United States | 1726 |
| Texas | 1615 |
| China | 1578 |
| United Kingdom | 1315 |
| Florida | 1312 |
| United Kingdom (England) | 1202 |
| Germany | 1122 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 121 |
| Meets WWC Standards with or without Reservations | 189 |
| Does not meet standards | 174 |
B. Goecke; S. Weiss; B. Barbot – Journal of Creative Behavior, 2025
The present paper questions the content validity of the eight creativity-related self-report scales available in PISA 2022's context questionnaire and provides a set of considerations for researchers interested in using these indexes. Specifically, we point out some threats to the content validity of these scales (e.g., "creative thinking…
Descriptors: Creativity, Creativity Tests, Questionnaires, Content Validity
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
S. Kanageswari Suppiah Shanmugam; Arsaythamby Veloo; Suheysen Revindran – Practical Assessment, Research & Evaluation, 2025
Conventional mathematics testing often fails to reflect the diverse cultural backgrounds and lived experiences of Indigenous pupils. While efforts to improve educational access for Indigenous communities have increased, less emphasis has been placed on adapting test development processes to align with Indigenous learners' linguistic backgrounds…
Descriptors: Mathematics Tests, Cultural Relevance, Indigenous Populations, Minority Group Students
Victoria Gray Palmer – New England College Journal of Applied Educational Research, 2025
At-risk diverse students who score lower on district/state/federal testing, as well as on other standardized tests such as the SAT (formerly known as the Scholastic Assessment Test), are gatekept out of higher-level courses in high school due to "teaching to the test" policies, gatekept out of entrance to…
Descriptors: Standardized Tests, At Risk Students, High Stakes Tests, Test Bias
Maria Treadaway; John Read – Language Testing, 2024
Standard-setting is an essential component of test development, supporting the meaningfulness and appropriate interpretation of test scores. However, in the high-stakes testing environment of aviation, standard-setting studies are underexplored. To address this gap, we document two stages in the standard-setting procedures for the Overseas Flight…
Descriptors: Standard Setting, Diagnostic Tests, High Stakes Tests, English for Special Purposes
Kent Anderson Seidel – School Leadership Review, 2025
This paper examines one of three central diagnostic tools of the Concerns Based Adoption Model, the Stages of Concern Questionnaire (SoCQ). The SoCQ was developed with a focus on K12 education. Since it was developed in 1973, it has been used widely in early childhood, higher education, medical, business, community, and military settings. The SoCQ…
Descriptors: Questionnaires, Educational Change, Educational Innovation, Intervention
Deniz Mertkan Gezgin; Tugba Türk Kurtça – Education and Information Technologies, 2025
The purpose of this research is to create a reliable and valid scale to assess AIlessphobia in Education (the fear of being without Artificial Intelligence in education) among university students. In three phases, a sample of 1378 undergraduate students from different faculties at a public university participated in the reliability and validity…
Descriptors: Test Construction, Fear, Artificial Intelligence, Psychometrics
Joanna Williamson – Research Matters, 2025
Teachers, examiners and assessment experts know from experience that some candidates annotate exam questions. "Annotation" includes anything the candidate writes or draws outside of the designated response space, such as underlining, jotting, circling, sketching and calculating. Annotations are of interest because they may evidence…
Descriptors: Mathematics, Tests, Documentation, Secondary Education
Chunyan Shi – SAGE Open, 2025
The National Matriculation English Test (NMET), also known as the Gaokao English Examination, is a high-stakes, large-scale selection test for tertiary education in China, with numerous provinces and regions adopting the standardized national test papers. The validity of the NMET has garnered extensive attention. To verify the NMET's validity, it…
Descriptors: Foreign Countries, English (Second Language), Language Tests, High Stakes Tests
Construction and Validation of a Multilingual Diagnostic Instrument for Neuromyths and Their Origins
Oktay Cem Adigüzel; Sibel Küçükkayhan; Patrice Potvin; Derya Atik-Kara – International Journal of Assessment Tools in Education, 2025
This study presents the development of a comprehensive neuromyth identification tool designed to be valid, reliable, and multilingual, including French, English, Turkish, Greek, Kazakh, Arabic, Malay, and Chinese. By incorporating languages from diverse geographic regions, the tool aims to increase the accessibility and relevance of neuromyth…
Descriptors: Diagnostic Tests, Test Construction, Multilingualism, Test Validity
Nebraska Department of Education, 2024
The Nebraska Student-Centered Assessment System (NSCAS) is a statewide assessment system that embodies Nebraska's holistic view of students and helps them prepare for success in postsecondary education, career, and civic life. It uses multiple measures throughout the year to provide educators and decision-makers at all levels with the insights…
Descriptors: Student Evaluation, Evaluation Methods, Elementary School Students, Middle School Students
Cornelia Eva Neuert – Sociological Methods & Research, 2024
The quality of data in surveys is affected by response burden and questionnaire length. With an increasing number of questions, respondents can become bored, tired, and annoyed and may take shortcuts to reduce the effort needed to complete the survey. In this article, direct evidence is presented on how the position of items within a web…
Descriptors: Online Surveys, Test Items, Test Format, Test Construction
Marjolein Muskens; Willem E. Frankenhuis; Lex Borghans – npj Science of Learning, 2024
In many countries, standardized math tests are important for achieving academic success. Here, we examine whether the content of items, the story that explains a mathematical question, biases performance of low-SES students. In a large-scale cohort study of the Trends in International Mathematics and Science Study (TIMSS)--including data from 58…
Descriptors: Mathematics Tests, Standardized Tests, Test Items, Low Income Students
Using Performance Assessments Instead of High-Stakes Tests: A Promising Strategy for a Better Future
Hani Morgan – Policy Futures in Education, 2025
According to the National Assessment of Educational Progress (NAEP), US reading and math scores have recently dropped. This decline is believed to be the result of the school closings that occurred during the COVID-19 pandemic. The drop likely contributed to the increase in pressure educators are feeling to teach in a way that leads to higher test…
Descriptors: Performance Based Assessment, High Stakes Tests, Standardized Tests, Testing Problems
Erin Johnson; Samantha Barstack; Yikai Xu; Hannah Wise; Bradley T. Erford; Catharina Chang; David Delmonico – Measurement and Evaluation in Counseling and Development, 2025
Problem Statement: Among individuals aged 12 years or older, 14.3% (40.0 million) reported the use of an illicit drug in the previous year. Given the prevalence of drug abuse, it is increasingly important to determine effective screening practices, treatment procedures, and best practices among various subpopulations to identify drug use-related…
Descriptors: Drug Abuse, Screening Tests, Psychometrics, Synthesis

