Publication Date
| In 2026 | 6 |
| Since 2025 | 2195 |
| Since 2022 (last 5 years) | 12710 |
| Since 2017 (last 10 years) | 33835 |
| Since 2007 (last 20 years) | 68326 |
Descriptor
| Foreign Countries | 30532 |
| Test Validity | 21728 |
| Scores | 18248 |
| Academic Achievement | 16912 |
| Test Construction | 16738 |
| Test Reliability | 15015 |
| Achievement Tests | 14839 |
| Standardized Tests | 14712 |
| Comparative Analysis | 14429 |
| Elementary Secondary Education | 13038 |
| Language Tests | 12549 |
Audience
| Practitioners | 5034 |
| Teachers | 3391 |
| Researchers | 2630 |
| Policymakers | 1229 |
| Administrators | 976 |
| Students | 687 |
| Parents | 325 |
| Counselors | 216 |
| Community | 162 |
| Support Staff | 50 |
| Media Staff | 34 |
Location
| Turkey | 2815 |
| Australia | 2426 |
| Canada | 2269 |
| California | 1853 |
| United States | 1725 |
| Texas | 1615 |
| China | 1578 |
| United Kingdom | 1315 |
| Florida | 1312 |
| United Kingdom (England) | 1202 |
| Germany | 1121 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 121 |
| Meets WWC Standards with or without Reservations | 189 |
| Does not meet standards | 174 |
Belzak, William C. M. – Educational Measurement: Issues and Practice, 2023
Test developers and psychometricians have historically examined measurement bias and differential item functioning (DIF) across a single categorical variable (e.g., gender), independently of other variables (e.g., race, age, etc.). This is problematic when more complex forms of measurement bias may adversely affect test responses and, ultimately,…
Descriptors: Test Bias, High Stakes Tests, Artificial Intelligence, Test Items
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2023
Traditional estimators of reliability such as coefficients alpha, theta, omega, and rho (maximal reliability) are prone to give radical underestimates of reliability for the tests common when testing educational achievement. These tests are often structured by widely deviating item difficulties. This is a typical pattern where the traditional…
Descriptors: Test Reliability, Achievement Tests, Computation, Test Items
Sasima Charubusp; Orawan Wangsombat; Napatacha Sriwichai; Chanida Phongnapharuk – PASAA: Journal of Language Teaching and Learning in Thailand, 2025
Washback refers to the impact of a test on instruction and learning, with high-stakes tests exerting both positive and negative effects. This study examined the washback of an English exit exam (EEE) on English language learning at a Thai university where English-medium instruction is used in most academic disciplines. The EEE is an in-house…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Language Tests
André A. Rupp; Laura Pinsonneault – National Center for the Improvement of Educational Assessment, 2025
State education agencies are sitting on rich repositories of quantitative and qualitative assessment data. This document is designed to provide a conceptual framework and implementation guidance that can help agency leadership leverage and interrogate student performance data in systematic ways for reporting, outreach, and planning purposes. The…
Descriptors: Evaluation Methods, Educational Assessment, Achievement Tests, College Entrance Examinations
Mohammad Nayef Ayasrah; Mohamad Ahmad Saleem Khasawneh; Mazen Omar Almulla; Amoura Hassan Aboutaleb – Journal of Computer Assisted Learning, 2025
Background: One area that has been dramatically changed by artificial intelligence (AI) is educational environments. Chatbots, Recommender Systems, Adaptive Learning Systems and Large Language Models have been emerging as practical tools for facilitating learning. However, using such tools appropriately is challenging. In this regard, the…
Descriptors: Test Construction, Test Validity, Test Reliability, Rating Scales
Kartianom Kartianom; Heri Retnawati; Kana Hidayati – Journal of Pedagogical Research, 2024
Conducting a fair test is important for educational research. Unfair assessments can lead to gender disparities in academic achievement, ultimately resulting in disparities in opportunities, wages, and career choice. Differential Item Functioning (DIF) analysis is presented to provide evidence of whether the test is truly fair, where it does not harm…
Descriptors: Foreign Countries, Test Bias, Item Response Theory, Test Theory
Viola Merhof; Caroline M. Böhm; Thorsten Meiser – Educational and Psychological Measurement, 2024
Item response tree (IRTree) models are a flexible framework to control self-reported trait measurements for response styles. To this end, IRTree models decompose the responses to rating items into sub-decisions, which are assumed to be made on the basis of either the trait being measured or a response style, whereby the effects of such person…
Descriptors: Item Response Theory, Test Interpretation, Test Reliability, Test Validity
Do-Hong Kim; Chuang Wang; Thi Nhu Ngoc Truong – Language Teaching Research, 2024
Researchers and practitioners in the field of second language acquisition have come to realize the importance of non-cognitive skills such as self-efficacy and self-regulation in students' learning of a second language. However, there has been limited systematic research on such measures in the second language context and the validity and…
Descriptors: Psychometrics, Test Content, Self Efficacy, English Language Learners
Marilena Z. Leana-Tascilar – Cogent Education, 2024
This study aimed to develop a comprehensive tool to assess underachievement in gifted students, incorporating input from parents, teachers, and students themselves. A total of 285 participants, including 95 gifted students, their parents, and teachers, were involved in the study. The results have revealed a four-factor structure for the Gifted…
Descriptors: Psychometrics, Academic Achievement, Underachievement, Academically Gifted
Sean N. Weeks; Tyler L. Renshaw; Allysia A. Rainey; Aubrey Hiatt – Journal of Emotional and Behavioral Disorders, 2024
Internalizing and externalizing problems are common targets for school mental health screening. Prior research supports the interpretation of scores from the Youth Internalizing Problems Screener (YIPS) and the Youth Externalizing Problems Screener (YEPS), which were developed separately yet intended as companion measures. We extended previous…
Descriptors: Adolescents, Screening Tests, Behavior Problems, Mental Health
Development of the American Sign Language Fingerspelling and Numbers Comprehension Test (ASL FaN-CT)
Corrine Occhino; Ryan Lidster; Leah C. Geer; Jason Listman; Peter C. Hauser – Language Testing, 2024
We describe the development and initial validation of the "ASL Fingerspelling and Number Comprehension Test" (ASL FaN-CT), a test of recognition proficiency for fingerspelled words in American Sign Language (ASL). Despite the relative frequency of fingerspelling in ASL discourse, learners commonly struggle to produce and perceive…
Descriptors: Language Tests, Test Construction, Finger Spelling, Test Validity
Ananda Aprilia; Wipsar Sunu Brams Dwandaru – Science Education International, 2024
This study focused on developing physics test instruments for senior high school students on the topics of temperature and heat. The study aimed to determine (i) the quality of the test instrument content, (ii) the feasibility of the test instrument, and (iii) students' graphic representation abilities on "Temperature and Heat" topics.…
Descriptors: Foreign Countries, High School Seniors, Secondary School Science, Physics
Chunyan Liu; Raja Subhiyah; Richard A. Feinberg – Applied Measurement in Education, 2024
Mixed-format tests that include both multiple-choice (MC) and constructed-response (CR) items have become widely used in many large-scale assessments. When an item response theory (IRT) model is used to score a mixed-format test, the unidimensionality assumption may be violated if the CR items measure a different construct from that measured by MC…
Descriptors: Test Format, Response Style (Tests), Multiple Choice Tests, Item Response Theory
Yuhei Kodani; Kazuki Sekine; Yasuhiro Tanaka; Shinsuke Nagami; Katsuya Nakamura; Shinya Fukunaga; Hikaru Nakamura – International Journal of Language & Communication Disorders, 2024
Background: The Scenario Test is recognised for its effectiveness in assessing the interactive aspects of functional communication in people with post-stroke aphasia (PWA). Aims: To develop a Japanese version of the Scenario Test (Scenario Test-JP) and assess its reliability and validity. Methods & Procedures: Among 66 participants, we…
Descriptors: Foreign Countries, Aphasia, Communication Disorders, Translation
Chunhua Liu; Panwang Yang – European Journal of Education, 2024
Student satisfaction in online live classes is considered an important criterion to evaluate the effectiveness of this instructional system. This study aims to develop a performance evaluation index to measure the satisfaction of students who have mastered Chinese language and literature through online live classes. Guided by survey techniques and…
Descriptors: Student Satisfaction, Online Courses, Performance Based Assessment, Chinese

