ERIC - Search Results

Showing 1 to 15 of 17 results

Technical Adequacy-Reliability

Peer reviewed

Susan K. Johnsen – Gifted Child Today, 2025

The author provides information about reliability and areas that educators should examine in determining if an assessment is consistent and trustworthy for use, and how it should be interpreted in making decisions about students. Reliability areas that are discussed in the column include internal consistency, test-retest or stability, inter-scorer…

Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement

Measurement Invariance of the Action Competence in Sustainable Development Questionnaire: Can We Compare between Groups?

Peer reviewed

M. Van Harskamp; S. De Maeyer; W. Sass; P. Van Petegem; J. Boeve-de Pauw – Environmental Education Research, 2025

There is a need for valid and reliable instruments to assess learning outcomes in education for sustainable development (ESD). Measurement invariance (MI) needs to be established before results of these instruments can be validly compared between groups. Despite its importance, establishing MI is an often overlooked validation step. To provide an…

Descriptors: Measurement, Sustainable Development, Error of Measurement, Questionnaires

Exploring the Influence of Response Styles on Continuous Scale Assessments: Insights from a Novel Modeling Approach

Peer reviewed

Hung-Yu Huang – Educational and Psychological Measurement, 2025

The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…

Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability

The Sensitivity of Value-Added Estimates to Test Scoring Decisions. EdWorkingPaper No. 25-1226

Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025

Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…

Descriptors: Value Added Models, Tests, Testing, Scoring

Confirming Increased Statistical Errors in Testing Correlations from Small Sample Sizes

Peer reviewed

Duane Knudson – Measurement in Physical Education and Exercise Science, 2025

Small sample sizes contribute to several problems in research and knowledge advancement. This conceptual replication study confirmed and extended the inflation of type II errors and confidence intervals in correlation analyses of small sample sizes common in kinesiology/exercise science. Current population data (N = 18, 230, & 464) on four…

Descriptors: Kinesiology, Exercise, Biomechanics, Movement Education

Evaluating Measurement Invariance of Students' Practices Regarding Online Information Questionnaire in PISA 2022: A Comparative Study Using MGCFA and Alignment Method

Peer reviewed

Esra Sözer Boz – Education and Information Technologies, 2025

International large-scale assessments provide cross-national data on students' cognitive and non-cognitive characteristics. A critical methodological issue that often arises in comparing data from cross-national studies is ensuring measurement invariance, indicating that the construct under investigation is the same across the compared groups.…

Descriptors: Achievement Tests, International Assessment, Foreign Countries, Secondary School Students

Enhancing Model Fit Evaluation in SEM: Practical Tips for Optimizing Chi-Square Tests

Peer reviewed

Bang Quan Zheng; Peter M. Bentler – Structural Equation Modeling: A Multidisciplinary Journal, 2025

This paper aims to advocate for a balanced approach to model fit evaluation in structural equation modeling (SEM). The ongoing debate surrounding chi-square test statistics and fit indices has been characterized by ambiguity and controversy. Despite the acknowledged limitations of relying solely on the chi-square test, its careful application can…

Descriptors: Monte Carlo Methods, Structural Equation Models, Goodness of Fit, Robustness (Statistics)

Are the Signs of Factor Loadings Arbitrary in Confirmatory Factor Analysis? Problems and Solutions

Peer reviewed

Dandan Tang; Steven M. Boker; Xin Tong – Structural Equation Modeling: A Multidisciplinary Journal, 2025

The replication crisis in social and behavioral sciences has raised concerns about the reliability and validity of empirical studies. While research in the literature has explored contributing factors to this crisis, the issues related to analytical tools have received less attention. This study focuses on a widely used analytical tool -…

Descriptors: Test Validity, Factor Analysis, Replication (Evaluation), Social Science Research

Quality-of-Life Measurement in Randomised Controlled Trials of Mental Health Interventions for Autistic Adults: A Systematic Review

Peer reviewed

Amanda Timmerman; Vasiliki Totsika; Valerie Lye; Laura Crane; Audrey Linden; Elizabeth Pellicano – Autism: The International Journal of Research and Practice, 2025

Autistic people are more likely to have co-occurring mental health conditions compared to the general population, and mental health interventions have been identified as a top research priority by autistic people and the wider autism community. Autistic adults have also communicated that quality of life is the outcome that matters most to them in…

Descriptors: Adults, Autism Spectrum Disorders, Quality of Life, Randomized Controlled Trials

The Vague Language Use Scale: Clinical Utility and Psychometrics from Adults with Traumatic Brain Injury

Peer reviewed

Kathryn J. Greenslade; Julia K. Bushell; Emily F. Dillon; Amy E. Ramage – International Journal of Language & Communication Disorders, 2025

Background: Pragmatic communication difficulties encompass many distinct behaviours, including the use of vague and/or insufficient language, a common characteristic following traumatic brain injury (TBI) that negatively impacts psychosocial outcomes. Existing assessments evaluate pragmatic communication broadly, often with only one or two items…

Descriptors: Neurological Impairments, Head Injuries, Language Impairments, Language Tests

Factor Structure and Measurement Invariance of the Chinese Reading Strategy Scale (CRSS): A Bifactor Analysis

Peer reviewed

Na Liu; Mingren Zhao; Rui Jin – Psychology in the Schools, 2025

Empirical research has extensively used the reading strategy scales to explore English reading strategies within the Chinese context. However, there is a lack of studies examining the reading strategies used by Chinese students in their native language (L1). Considering the differences between Chinese logographic characters and alphabetic English…

Descriptors: Reading Strategies, Rating Scales, Chinese, Native Language

TOEFL iBT® Technical Manual. TOEFL® Research Series. RR-106. ETS Research Report. RR-25-12

Peer reviewed

Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025

This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…

Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction

Evidence-Based Evaluation of Student and Marker Performances in Assessment and Examination

Peer reviewed

Ole J. Kemi – Advances in Physiology Education, 2025

Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…

Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards

Lagged Dependent Variable Predictors, Classical Measurement Error, and Path Dependency: The Conditions under Which Various Estimators Are Appropriate

Peer reviewed

Anders Holm; Anders Hjorth-Trolle; Robert Andersen – Sociological Methods & Research, 2025

Lagged dependent variables (LDVs) are often used as predictors in ordinary least squares (OLS) models in the social sciences. Although several estimators are commonly employed, little is known about their relative merits in the presence of classical measurement error and different longitudinal processes. We assess the performance of four commonly…

Descriptors: Elementary Education, Scores, Error of Measurement, Predictor Variables

How Not to Fool Ourselves about Heterogeneity of Treatment Effects. EdWorkingPaper No. 25-1116

Paul T. von Hippel; Brendan A. Schuetze – Annenberg Institute for School Reform at Brown University, 2025

Researchers across many fields have called for greater attention to heterogeneity of treatment effects--shifting focus from the average effect to variation in effects between different treatments, studies, or subgroups. True heterogeneity is important, but many reports of heterogeneity have proved to be false, non-replicable, or exaggerated. In…

Descriptors: Educational Research, Replication (Evaluation), Generalizability Theory, Inferences
