ERIC - Search Results

Publication Date

In 2025	0
Since 2024	2
Since 2021 (last 5 years)	4
Since 2016 (last 10 years)	13
Since 2006 (last 20 years)	35

Source

Educational and Psychological…

Publication Type

Journal Articles	51
Reports - Research	37
Reports - Evaluative	13
Information Analyses	1
Reports - Descriptive	1

Education Level

Higher Education	6
Postsecondary Education	3
Grade 3	2
Kindergarten	2
Adult Education	1
Elementary Education	1
Elementary Secondary Education	1
Grade 1	1
Grade 2	1
Grade 4	1
Grade 6	1
High Schools	1
Intermediate Grades	1
Junior High Schools	1
Middle Schools	1
Secondary Education	1
More ▼

Audience

Location

Canada	2
China	1
Jamaica	1
Netherlands (Amsterdam)	1
United States	1

Laws, Policies, & Programs

Assessments and Surveys

Eysenck Personality Inventory	2
Marlowe Crowne Social…	2
SAT (College Admission Test)	2
Wechsler Intelligence Scale…	2
Beck Depression Inventory	1
Childrens Depression Inventory	1
Graduate Record Examinations	1
Law School Admission Test	1
Minnesota Multiphasic…	1
National Assessment of…	1
Program for International…	1
Raven Advanced Progressive…	1
Raven Progressive Matrices	1
Rotter Internal External…	1
Torrance Tests of Creative…	1
United States Medical…	1
Wechsler Adult Intelligence…	1
Wechsler Memory Scale	1
More ▼

What Works Clearinghouse Rating

Showing 1 to 15 of 54 results Save | Export

The Trade-Off between Factor Score Determinacy and the Preservation of Inter-Factor Correlations

Peer reviewed

Direct link

André Beauducel; Norbert Hilger; Tobias Kuhl – Educational and Psychological Measurement, 2024

Regression factor score predictors have the maximum factor score determinacy, that is, the maximum correlation with the corresponding factor, but they do not have the same inter-correlations as the factors. As it might be useful to compute factor score predictors that have the same inter-correlations as the factors, correlation-preserving factor…

Descriptors: Scores, Factor Analysis, Correlation, Predictor Variables

Linear Factor Analytic Thurstonian Forced-Choice Models: Current Status and Issues

Peer reviewed

Direct link

Markus T. Jansen; Ralf Schulze – Educational and Psychological Measurement, 2024

Thurstonian forced-choice modeling is considered to be a powerful new tool to estimate item and person parameters while simultaneously testing the model fit. This assessment approach is associated with the aim of reducing faking and other response tendencies that plague traditional self-report trait assessments. As a result of major recent…

Descriptors: Factor Analysis, Models, Item Analysis, Evaluation Methods

Item-Score Reliability in Empirical-Data Sets and Its Relationship with Other Item Indices

Peer reviewed

Direct link

Zijlmans, Eva A. O.; Tijmstra, Jesper; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2018

Reliability is usually estimated for a total score, but it can also be estimated for item scores. Item-score reliability can be useful to assess the repeatability of an individual item score in a group. Three methods to estimate item-score reliability are discussed, known as method MS, method [lambda][subscript 6], and method CA. The item-score…

Descriptors: Test Items, Test Reliability, Correlation, Comparative Analysis

Measuring Response Style Stability across Constructs with Item Response Trees

Peer reviewed

Direct link

Ames, Allison J. – Educational and Psychological Measurement, 2022

Individual response style behaviors, unrelated to the latent trait of interest, may influence responses to ordinal survey items. Response style can introduce bias in the total score with respect to the trait of interest, threatening valid interpretation of scores. Despite claims of response style stability across scales, there has been little…

Descriptors: Response Style (Tests), Individual Differences, Scores, Test Items

Improvement of Norm Score Quality via Regression-Based Continuous Norming

Peer reviewed

Direct link

Lenhard, Wolfgang; Lenhard, Alexandra – Educational and Psychological Measurement, 2021

The interpretation of psychometric test results is usually based on norm scores. We compared semiparametric continuous norming (SPCN) with conventional norming methods by simulating results for test scales with different item numbers and difficulties via an item response theory approach. Subsequently, we modeled the norm scores based on random…

Descriptors: Test Norms, Scores, Regression (Statistics), Test Items

A Log-Linear Modeling Approach for Differential Item Functioning Detection in Polytomously Scored Items

Peer reviewed

Direct link

Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020

A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…

Descriptors: Simulation, Sample Size, Item Analysis, Scores

Variability in the Results of Meta-Analysis as a Function of Comparing Effect Sizes Based on Scores from Noncomparable Measures: A Simulation Study

Peer reviewed

Direct link

Nugent, William R. – Educational and Psychological Measurement, 2017

Meta-analysis is a significant methodological advance that is increasingly important in research synthesis. Fundamental to meta-analysis is the presumption that effect sizes, such as the standardized mean difference (SMD), based on scores from different measures are comparable. It has been argued that population observed score SMDs based on scores…

Descriptors: Meta Analysis, Effect Size, Comparative Analysis, Scores

The Total Score with Maximal Reliability and Maximal Criterion Validity: An Illustration Using a Career Satisfaction Measure

Peer reviewed

Direct link

Fu, Yuanshu; Wen, Zhonglin; Wang, Yang – Educational and Psychological Measurement, 2018

The maximal reliability of a congeneric measure is achieved by weighting item scores to form the optimal linear combination as the total score; it is never lower than the composite reliability of the measure when measurement errors are uncorrelated. The statistical method that renders maximal reliability would also lead to maximal criterion…

Descriptors: Test Reliability, Test Validity, Comparative Analysis, Attitude Measures

Evaluation of Two Methods for Modeling Measurement Errors When Testing Interaction Effects with Observed Composite Scores

Peer reviewed

Direct link

Hsiao, Yu-Yu; Kwok, Oi-Man; Lai, Mark H. C. – Educational and Psychological Measurement, 2018

Path models with observed composites based on multiple items (e.g., mean or sum score of the items) are commonly used to test interaction effects. Under this practice, researchers generally assume that the observed composites are measured without errors. In this study, we reviewed and evaluated two alternative methods within the structural…

Descriptors: Error of Measurement, Testing, Scores, Models

Does Matching Quality Matter in Mode Comparison Studies?

Peer reviewed

Direct link

Zeng, Ji; Yin, Ping; Shedden, Kerby A. – Educational and Psychological Measurement, 2015

This article provides a brief overview and comparison of three matching approaches in forming comparable groups for a study comparing test administration modes (i.e., computer-based tests [CBT] and paper-and-pencil tests [PPT]): (a) a propensity score matching approach proposed in this article, (b) the propensity score matching approach used by…

Descriptors: Comparative Analysis, Computer Assisted Testing, Probability, Classification

Using the Stan Program for Bayesian Item Response Theory

Peer reviewed

Direct link

Luo, Yong; Jiao, Hong – Educational and Psychological Measurement, 2018

Stan is a new Bayesian statistical software program that implements the powerful and efficient Hamiltonian Monte Carlo (HMC) algorithm. To date there is not a source that systematically provides Stan code for various item response theory (IRT) models. This article provides Stan code for three representative IRT models, including the…

Descriptors: Bayesian Statistics, Item Response Theory, Probability, Computer Software

Effort in Low-Stakes Assessments: What Does It Take to Perform as Well as in a High-Stakes Setting?

Peer reviewed

Direct link

Attali, Yigal – Educational and Psychological Measurement, 2016

Performance of students in low-stakes testing situations has been a concern and focus of recent research. However, researchers who have examined the effect of stakes on performance have not been able to compare low-stakes performance to truly high-stakes performance of the same students. Results of such a comparison are reported in this article.…

Descriptors: College Entrance Examinations, Graduate Study, High Stakes Tests, Comparative Analysis

The Effect of Rating Unfamiliar Items on Angoff Passing Scores

Peer reviewed

Direct link

Clauser, Jerome C.; Hambleton, Ronald K.; Baldwin, Peter – Educational and Psychological Measurement, 2017

The Angoff standard setting method relies on content experts to review exam items and make judgments about the performance of the minimally proficient examinee. Unfortunately, at times content experts may have gaps in their understanding of specific exam content. These gaps are particularly likely to occur when the content domain is broad and/or…

Descriptors: Scores, Item Analysis, Classification, Decision Making

Further Considerations in Using Items with Diverse Content to Measure Acquiescence

Peer reviewed

Direct link

Kam, Chester Chun Seng – Educational and Psychological Measurement, 2016

To measure the response style of acquiescence, researchers recommend the use of at least 15 items with heterogeneous content. Such an approach is consistent with its theoretical definition and is a substantial improvement over traditional methods. Nevertheless, measurement of acquiescence can be enhanced by two additional considerations: first, to…

Descriptors: Test Items, Response Style (Tests), Test Content, Measurement

Rasch Mixture Models for DIF Detection: A Comparison of Old and New Score Specifications

Peer reviewed

Direct link

Frick, Hannah; Strobl, Carolin; Zeileis, Achim – Educational and Psychological Measurement, 2015

Rasch mixture models can be a useful tool when checking the assumption of measurement invariance for a single Rasch model. They provide advantages compared to manifest differential item functioning (DIF) tests when the DIF groups are only weakly correlated with the manifest covariates available. Unlike in single Rasch models, estimation of Rasch…

Descriptors: Item Response Theory, Test Bias, Comparative Analysis, Scores

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4

Jiao, Hong	2
Nugent, William R.	2
Rogers, W. Todd	2
Ahn, Soyeon	1
Alderman, Donald L.	1
Ames, Allison J.	1
André Beauducel	1
Attali, Yigal	1
Ayers, Elizabeth	1
Bagger, Jessica	1
Baldwin, Peter	1
Balogh, Jennifer	1
Bandalos, Deborah	1
Bandalos, Deborah L.	1
Beaujean, A. Alexander	1
Beretvas, S. Natasha	1
Bernstein, Jared	1
Bowden, Stephen C.	1
Bridgeman, Brent	1
Brooks, Thomas	1
Bruno D. Zumbo	1
Cai, Li	1
Carvajal, Jorge	1
Cawthon, Stephanie W.	1
More ▼

Comparative Analysis	54
Scores	54
Item Response Theory	18
Correlation	13
Test Items	12
Factor Analysis	8
Monte Carlo Methods	8
Error of Measurement	7
Higher Education	7
Regression (Statistics)	7
Reliability	7
Statistical Analysis	7
Test Reliability	7
Models	6
Simulation	6
Evaluation Methods	5
Foreign Countries	5
Item Analysis	5
Measures (Individuals)	5
Predictive Validity	5
Psychometrics	5
Sample Size	5
Test Bias	5
Undergraduate Students	5
College Entrance Examinations	4
More ▼