Publication Date
| In 2026 | 0 |
| Since 2025 | 7 |
| Since 2022 (last 5 years) | 35 |
| Since 2017 (last 10 years) | 102 |
| Since 2007 (last 20 years) | 176 |
Descriptor
| Test Bias | 415 |
| Test Reliability | 415 |
| Test Validity | 274 |
| Test Construction | 125 |
| Test Items | 86 |
| Testing Problems | 77 |
| Scores | 65 |
| Item Response Theory | 58 |
| Standardized Tests | 57 |
| Elementary Secondary Education | 55 |
| Testing | 55 |
Audience
| Researchers | 11 |
| Practitioners | 8 |
| Administrators | 7 |
| Teachers | 7 |
| Policymakers | 4 |
| Support Staff | 3 |
| Parents | 2 |
| Community | 1 |
| Counselors | 1 |
| Students | 1 |
Location
| New York | 7 |
| California | 6 |
| Florida | 6 |
| Illinois | 6 |
| Canada | 5 |
| Turkey | 5 |
| Australia | 4 |
| China | 4 |
| Indonesia | 4 |
| Singapore | 4 |
| Texas | 4 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024
This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…
Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction
Menold, Natalja – Field Methods, 2023
While numerical bipolar rating scales may evoke positivity bias, little is known about the corresponding bias in verbal bipolar rating scales. The choice of verbalization of the middle category may lead to response bias, particularly if it is not in line with the scale polarity. Unipolar and bipolar seven-category rating scales in which the…
Descriptors: Rating Scales, Test Bias, Verbal Tests, Responses
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Novina Sabila Zahra; Hillman Wirawan – Measurement: Interdisciplinary Research and Perspectives, 2025
Technology development has triggered digital transformation in various organizations, influencing work processes, communication, and innovation. Digital leadership plays a crucial role in directing and managing this transformation. This research aims to develop a new measurement tool for assessing digital leadership using the Rasch Model for…
Descriptors: Leadership, Measures (Individuals), Test Validity, Item Response Theory
Akif Avcu – International Journal of Psychology and Educational Studies, 2025
This review explores the significant contributions of Rasch modeling in enhancing classroom assessment practices, particularly in measuring student attitudes. Classroom assessment has evolved from standardized testing to integrative practices that emphasize both academic and affective dimensions of student development. Accurate attitude…
Descriptors: Item Response Theory, Student Attitudes, Student Evaluation, Attitude Measures
Tien-Ling Hu; Dubravka Svetina Valdivia – Research in Higher Education, 2024
Undergraduate research, recognized as one of the High-Impact Practices (HIPs), has demonstrated a positive association with diverse student learning outcomes. Understanding the pivotal quality factors essential for its efficacy is important for enhancing student success. This study evaluates the psychometric properties of survey items employed to…
Descriptors: Undergraduate Students, Student Research, Student Experience, Psychometrics
Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023
We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed for unidimensional scales, (a) all indices except omega h performed similarly well for most conditions; (b) alpha is still good; (c) GLB and coefficient H overestimated reliability with small…
Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length
Nicholas W. Affrunti; Eric Rossen – National Association of School Psychologists, 2023
In this data brief, we examine the scores and pass rates for the Praxis School Psychologist tests (both Praxis 5402 and the newer version, Praxis 5403) by racial-ethnic group and gender for the period September 1, 2022 to August 31, 2023. The Praxis School Psychologist tests are the most often used external assessment of competency by school…
Descriptors: School Psychology, School Psychologists, Counselor Certification, Test Bias
Catherine Mata; Katharine Meyer; Lindsay Page – Annenberg Institute for School Reform at Brown University, 2024
This article examines the risk of crossover contamination in individual-level randomization, a common concern in experimental research, in the context of a large-enrollment college course. While individual-level randomization is more efficient for assessing program effectiveness, it also increases the potential for control group students to cross…
Descriptors: Chemistry, Science Instruction, Undergraduate Students, Large Group Instruction
Troy L. Cox; Gregory L. Thompson; Steven S. Stokes – Foreign Language Annals, 2025
This study investigated the differences between the ACTFL Oral Proficiency Interview (OPI) and the ACTFL Oral Proficiency Interview - Computer (OPIc) among Spanish learners at a U.S. university. Participants (N = 154) were randomly assigned to take both tests in a counterbalanced order to mitigate test order effects. Data were analyzed using an…
Descriptors: Oral Language, Language Proficiency, Interviews, Computer Uses in Education
Muslihin, Heri Yusuf; Suryana, Dodi; Ahman; Suherman, Uman; Dahlan, Tina Hayati – International Journal of Instruction, 2022
Self-determination can lead students to think and act positively and to make realistic choices so that they can make decisions responsibly. This study aimed to develop and validate a questionnaire to measure student self-determination. The study was conducted in 2019 and involved 406 university students as participants…
Descriptors: Test Validity, Test Reliability, Item Response Theory, Questionnaires
Maïano, Christophe; Morin, Alexandre J. S.; Gagnon, Cynthia; Olivier, Elizabeth; Tracey, Danielle; Craven, Rhonda G.; Bouchard, Stéphane – Journal of Autism and Developmental Disorders, 2023
The objective of the study was to validate adapted versions of the Glasgow Anxiety Scale for people with Intellectual Disabilities (GAS-ID) simultaneously developed in English and French. A sample of 361 youth with mild to moderate intellectual disability (ID) (M = 15.78 years) from Australia (English-speaking) and Canada (French-speaking)…
Descriptors: Intellectual Disability, Anxiety, French, English
van Rensburg, Clarisse; Mostert, Karina – Journal of Student Affairs in Africa, 2023
Student well-being has gradually become a topic of interest in higher education, and the accurate, valid, and reliable measure of well-being constructs is crucial in the South African context. This study examined item bias and configural, metric and scalar invariance of the Satisfaction with Life Scale (SWLS) for South African first-year…
Descriptors: Life Satisfaction, Measures (Individuals), Foreign Countries, College Freshmen
Gu, Zhengguo; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2021
Clinical, medical, and health psychologists use difference scores obtained from pretest-posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorder, or addiction. Reliability of difference scores is important for interpreting observed…
Descriptors: Test Reliability, Scores, Pretests Posttests, Computation
Liou, Gloria; Bonner, Cavan V.; Tay, Louis – International Journal of Testing, 2022
With the advent of big data and advances in technology, psychological assessments have become increasingly sophisticated and complex. Nevertheless, traditional psychometric issues concerning the validity, reliability, and measurement bias of such assessments remain fundamental in determining whether score inferences of human attributes are…
Descriptors: Psychometrics, Computer Assisted Testing, Adaptive Testing, Data

