ERIC - Search Results

Publication Date

In 2025	2
Since 2024	7
Since 2021 (last 5 years)	35
Since 2016 (last 10 years)	79
Since 2006 (last 20 years)	174

Descriptor

Test Bias	154
Test Items	99
Statistical Bias	93
Item Response Theory	81
Statistical Analysis	77
Simulation	63
Correlation	57
Error of Measurement	55
Sample Size	50
Comparative Analysis	47
Monte Carlo Methods	46
Computation	44
Models	38
Regression (Statistics)	35
Effect Size	30
Item Analysis	30
Scores	30
Bias	29
Factor Analysis	29
Test Validity	28
Foreign Countries	25
Evaluation Methods	24
Response Style (Tests)	24
Difficulty Level	22
Measurement Techniques	22

Source

Educational and Psychological…	296

Publication Type

Journal Articles	269
Reports - Research	208
Reports - Evaluative	51
Reports - Descriptive	8
Speeches/Meeting Papers	5
Guides - Non-Classroom	1
Opinion Papers	1
Tests/Questionnaires	1

Education Level

Higher Education	13
Postsecondary Education	12
Secondary Education	11
Elementary Education	9
Middle Schools	9
Junior High Schools	6
Grade 3	4
Early Childhood Education	3
Grade 4	3
Grade 7	3
Intermediate Grades	3
Primary Education	3
Grade 6	2
High Schools	2
Adult Education	1
Grade 2	1
Grade 8	1
Grade 9	1
Kindergarten	1
Preschool Education	1

Location

Germany	5
Canada	4
Georgia	3
Australia	2
California	2
China	2
Spain	2
Taiwan	2
United States	2
Alaska	1
Brazil	1
Florida	1
Greece	1
Illinois (Chicago)	1
Ireland	1
Israel	1
Netherlands	1
New Zealand	1
North Carolina (Durham)	1
Poland	1
Romania	1
Saudi Arabia	1
Singapore	1
Sweden	1
Taiwan (Taipei)	1

Educational and Psychological Measurement

Showing 1 to 15 of 296 results

A Comparison of Response Time Threshold Scoring Procedures in Mitigating Bias from Rapid Guessing Behavior

Peer reviewed

Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2024

Rapid guessing (RG) is a form of non-effortful responding that is characterized by short response latencies. This construct-irrelevant behavior has been shown in previous research to bias inferences concerning measurement properties and scores. To mitigate these deleterious effects, a number of response time threshold scoring procedures have been…

Descriptors: Reaction Time, Scores, Item Response Theory, Guessing (Tests)

Reevaluating the SIBTEST Classification Heuristics for Dichotomous Differential Item Functioning

Peer reviewed

Weese, James D.; Turner, Ronna C.; Ames, Allison; Crawford, Brandon; Liang, Xinya – Educational and Psychological Measurement, 2022

A simulation study was conducted to investigate the heuristics of the SIBTEST procedure and how it compares with ETS classification guidelines used with the Mantel-Haenszel procedure. Prior heuristics have been used for nearly 25 years, but they are based on a simulation study that was restricted due to computer limitations and that modeled item…

Descriptors: Test Bias, Heuristics, Classification, Statistical Analysis

Testing for Differential Item Functioning under the "D"-Scoring Method

Peer reviewed

Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Educational and Psychological Measurement, 2022

This study offers an approach to testing for differential item functioning (DIF) in a recently developed measurement framework, referred to as "D"-scoring method (DSM). Under the proposed approach, called "P-Z" method of testing for DIF, the item response functions of two groups (reference and focal) are compared by…

Descriptors: Test Bias, Methods, Test Items, Scoring

A Comparison of Person-Fit Indices to Detect Social Desirability Bias

Peer reviewed

Nazari, Sanaz; Leite, Walter L.; Huggins-Manley, A. Corinne – Educational and Psychological Measurement, 2023

Social desirability bias (SDB) has been a major concern in educational and psychological assessments when measuring latent variables because it has the potential to introduce measurement error and bias in assessments. Person-fit indices can detect bias in the form of misfitted response vectors. The objective of this study was to compare the…

Descriptors: Social Desirability, Bias, Indexes, Goodness of Fit

Evaluating Imputation-Based Fit Statistics in Structural Equation Modeling with Ordinal Data: The MI2S Approach

Peer reviewed

Suppanut Sriutaisuk; Yu Liu; Seungwon Chung; Hanjoe Kim; Fei Gu – Educational and Psychological Measurement, 2025

The multiple imputation two-stage (MI2S) approach holds promise for evaluating the model fit of structural equation models for ordinal variables with multiply imputed data. However, previous studies only examined the performance of MI2S-based residual-based test statistics. This study extends previous research by examining the performance of two…

Descriptors: Structural Equation Models, Error of Measurement, Programming Languages, Goodness of Fit

The Trade-Off between Factor Score Determinacy and the Preservation of Inter-Factor Correlations

Peer reviewed

André Beauducel; Norbert Hilger; Tobias Kuhl – Educational and Psychological Measurement, 2024

Regression factor score predictors have the maximum factor score determinacy, that is, the maximum correlation with the corresponding factor, but they do not have the same inter-correlations as the factors. As it might be useful to compute factor score predictors that have the same inter-correlations as the factors, correlation-preserving factor…

Descriptors: Scores, Factor Analysis, Correlation, Predictor Variables

Enhancing the Detection of Social Desirability Bias Using Machine Learning: A Novel Application of Person-Fit Indices

Peer reviewed

Sanaz Nazari; Walter L. Leite; A. Corinne Huggins-Manley – Educational and Psychological Measurement, 2024

Social desirability bias (SDB) is a common threat to the validity of conclusions from responses to a scale or survey. There is a wide range of person-fit statistics in the literature that can be employed to detect SDB. In addition, machine learning classifiers, such as logistic regression and random forest, have the potential to distinguish…

Descriptors: Social Desirability, Bias, Artificial Intelligence, Identification

The Impact and Detection of Uniform Differential Item Functioning for Continuous Item Response Models

Peer reviewed

Finch, W. Holmes – Educational and Psychological Measurement, 2023

Psychometricians have devoted much research and attention to categorical item responses, leading to the development and widespread use of item response theory for the estimation of model parameters and identification of items that do not perform in the same way for examinees from different population subgroups (e.g., differential item functioning…

Descriptors: Test Bias, Item Response Theory, Computation, Methods

Correcting for Extreme Response Style: Model Choice Matters

Peer reviewed

Martijn Schoenmakers; Jesper Tijmstra; Jeroen Vermunt; Maria Bolsinova – Educational and Psychological Measurement, 2024

Extreme response style (ERS), the tendency of participants to select extreme item categories regardless of the item content, has frequently been found to decrease the validity of Likert-type questionnaire results. For this reason, various item response theory (IRT) models have been proposed to model ERS and correct for it. Comparisons of these…

Descriptors: Item Response Theory, Response Style (Tests), Models, Likert Scales

Effects of Compounded Nonnormality of Residuals in Hierarchical Linear Modeling

Peer reviewed

Man, Kaiwen; Schumacker, Randall; Morell, Monica; Wang, Yurou – Educational and Psychological Measurement, 2022

While hierarchical linear modeling is often used in social science research, the assumption of normally distributed residuals at the individual and cluster levels can be violated in empirical data. Previous studies have focused on the effects of nonnormality at either lower or higher level(s) separately. However, the violation of the normality…

Descriptors: Hierarchical Linear Modeling, Statistical Distributions, Statistical Bias, Computation

Exploring the Influence of Response Styles on Continuous Scale Assessments: Insights from a Novel Modeling Approach

Peer reviewed

Hung-Yu Huang – Educational and Psychological Measurement, 2025

The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…

Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability

Implementing a Standardized Effect Size in the POLYSIBTEST Procedure

Peer reviewed

Weese, James D.; Turner, Ronna C.; Liang, Xinya; Ames, Allison; Crawford, Brandon – Educational and Psychological Measurement, 2023

A study was conducted to implement the use of a standardized effect size and corresponding classification guidelines for polytomous data with the POLYSIBTEST procedure and compare those guidelines with prior recommendations. Two simulation studies were included. The first identifies new unstandardized test heuristics for classifying moderate and…

Descriptors: Effect Size, Classification, Guidelines, Statistical Analysis

Effects of the Quantity and Magnitude of Cross-Loading and Model Specification on MIRT Item Parameter Recovery

Peer reviewed

Mostafa Hosseinzadeh; Ki Lynn Matlock Cole – Educational and Psychological Measurement, 2024

In real-world situations, multidimensional data may appear on large-scale tests or psychological surveys. The purpose of this study was to investigate the effects of the quantity and magnitude of cross-loadings and model specification on item parameter recovery in multidimensional Item Response Theory (MIRT) models, especially when the model was…

Descriptors: Item Response Theory, Models, Maximum Likelihood Statistics, Algorithms

Detecting Preknowledge Cheating via Innovative Measures: A Mixture Hierarchical Model for Jointly Modeling Item Responses, Response Times, and Visual Fixation Counts

Peer reviewed

Man, Kaiwen; Harring, Jeffrey R. – Educational and Psychological Measurement, 2023

Preknowledge cheating jeopardizes the validity of inferences based on test results. Many methods have been developed to detect preknowledge cheating by jointly analyzing item responses and response times. Gaze fixations, an essential eye-tracker measure, can be utilized to help detect aberrant testing behavior with improved accuracy beyond using…

Descriptors: Cheating, Reaction Time, Test Items, Responses

The Sampling Ratio in Multilevel Structural Equation Models: Considerations to Inform Study Design

Peer reviewed
PDF on ERIC

Kush, Joseph M.; Konold, Timothy R.; Bradshaw, Catherine P. – Educational and Psychological Measurement, 2022

Multilevel structural equation modeling (MSEM) allows researchers to model latent factor structures at multiple levels simultaneously by decomposing within- and between-group variation. Yet the extent to which the sampling ratio (i.e., proportion of cases sampled from each group) influences the results of MSEM models remains unknown. This article…

Descriptors: Structural Equation Models, Factor Structure, Statistical Bias, Error of Measurement

Author

Finch, W. Holmes	8
Wang, Wen-Chung	6
Zumbo, Bruno D.	5
Beretvas, S. Natasha	4
French, Brian F.	4
Kromrey, Jeffrey D.	4
Oshima, T. C.	4
Strobl, Carolin	4
Walker, Cindy M.	4
Ahn, Soyeon	3
DeMars, Christine E.	3
Dimitrov, Dimiter M.	3
Huggins-Manley, Anne Corinne	3
Magis, David	3
Paek, Insu	3
Penfield, Randall D.	3
Plake, Barbara S.	3
Raju, Nambury S.	3
Rosseel, Yves	3
Shih, Ching-Lin	3
Wilson, Mark	3
Zeileis, Achim	3
A. Corinne Huggins-Manley	2
Alliger, George M.	2

Assessments and Surveys

SAT (College Admission Test)	6
Program for International…	4
Wechsler Intelligence Scale…	3
California Achievement Tests	2
Georgia Criterion Referenced…	2
Graduate Record Examinations	2
Marlowe Crowne Social…	2
National Assessment of…	2
SRA Achievement Series	2
Sixteen Personality Factor…	2
Stanford Achievement Tests	2
Wechsler Adult Intelligence…	2
ACT Assessment	1
Beck Depression Inventory	1
Boehm Test of Basic Concepts	1
Center for Epidemiologic…	1
Cognitive Abilities Test	1
College Board Achievement…	1
Comprehensive Tests of Basic…	1
Draw a Person Test	1
Early Childhood Longitudinal…	1
Estes Attitude Scale	1
Florida Comprehensive…	1
Iowa Tests of Basic Skills	1
Kaufman Assessment Battery…	1