ERIC - Search Results

Publication Date

In 2025	1
Since 2024	4
Since 2021 (last 5 years)	8
Since 2016 (last 10 years)	16
Since 2006 (last 20 years)	40

Descriptor

Models	40
Evaluation Methods	30
Item Response Theory	20
Psychometrics	20
Measurement	18
Classification	16
Evaluation Problems	15
Measurement Techniques	15
Comparative Analysis	11
Diagnostic Tests	11
Test Items	10
Evaluation Research	8
Educational Assessment	7
Evidence	7
Goodness of Fit	6
Misconceptions	6
Monte Carlo Methods	6
Physics	6
Social Sciences	6
Criterion Referenced Tests	5
Definitions	5
Educational Testing	5
Sociometric Techniques	5
State of the Art Reviews	5
Student Evaluation	5

Source

Measurement:…

Publication Type

Journal Articles	40
Opinion Papers	19
Reports - Research	14
Reports - Evaluative	9
Reports - Descriptive	2
Book/Product Reviews	1
Information Analyses	1

Education Level

Elementary Secondary Education	2
Grade 8	1

Audience

Practitioners	2
Researchers	1

Location

California	1
Germany	1
South Korea	1

Showing 1 to 15 of 40 results

Uncertainty in Artificial Neural Network Models: Monte-Carlo Simulations beyond the GUM Boundaries

Peer reviewed

A. M. Sadek; Fahad Al-Muhlaki – Measurement: Interdisciplinary Research and Perspectives, 2024

In this study, the accuracy of the artificial neural network (ANN) was assessed considering the uncertainties associated with the randomness of the data and the lack of learning. The Monte-Carlo algorithm was applied to simulate the randomness of the input variables and evaluate the output distribution. It has been shown that under certain…

Descriptors: Monte Carlo Methods, Accuracy, Artificial Intelligence, Guidelines

Explainable Machine Learning for Credit Risk Management When Features Are Dependent

Peer reviewed

Thanh Thuy Do; Golnoosh Babaei; Paolo Pagnottoni – Measurement: Interdisciplinary Research and Perspectives, 2024

Complex Machine Learning (ML) models used to support decision-making in peer-to-peer (P2P) lending often lack clear, accurate, and interpretable explanations. While the game-theoretic concept of Shapley values and its computationally efficient variant Kernel SHAP may be employed for this aim, similarly to other existing methods, the latter makes…

Descriptors: Artificial Intelligence, Risk Management, Credit (Finance), Prediction

Identifying Response Styles Using Person Fit Analysis and Response-Styles Models

Peer reviewed

Wind, Stefanie A.; Ge, Yuan – Measurement: Interdisciplinary Research and Perspectives, 2023

In selected-response assessments such as attitude surveys with Likert-type rating scales, examinees often select from rating scale categories to reflect their locations on a construct. Researchers have observed that some examinees exhibit "response styles," which are systematic patterns of responses in which examinees are more likely to…

Descriptors: Goodness of Fit, Responses, Likert Scales, Models

A Validation Study of the Extended Relevance Scale Using the D3mirt Package for R

Peer reviewed

Erik Forsberg; Anders Sjöberg – Measurement: Interdisciplinary Research and Perspectives, 2025

This paper reports a validation study based on descriptive multidimensional item response theory (DMIRT), implemented in the R package "D3mirt" by using the ERS-C, an extended version of the Relevance subscale from the Moral Foundations Questionnaire including two new items for collectivism (17 items in total). Two latent models are…

Descriptors: Evaluation Methods, Programming Languages, Altruism, Collectivism

Validation and Implementation of Customer Classification System Using Machine Learning

Peer reviewed

Hyemin Yoon; HyunJin Kim; Sangjin Kim – Measurement: Interdisciplinary Research and Perspectives, 2024

We have maintained the customer grade system that is being implemented to customers with excellent performance through customer segmentation for years. Currently, financial institutions that operate the customer grade system provide similar services based on the score calculation criteria, but the score calculation criteria vary from the financial…

Descriptors: Classification, Artificial Intelligence, Prediction, Decision Making

Misclassification Error, Binary Regression Bias, and Reliability in Multidimensional Poverty Measurement: An Estimation Approach Based on Bayesian Modelling

Peer reviewed

Najera, Hector – Measurement: Interdisciplinary Research and Perspectives, 2023

Measurement error affects the quality of population orderings of an index and, hence, increases the misclassification of the poor and the non-poor groups and affects statistical inferences from binary regression models. Hence, the conclusions about the extent, profile, and distribution of poverty are likely to be misleading. However, the size and…

Descriptors: Poverty, Error of Measurement, Classification, Statistical Inference

Performance of Nonparametric Person-Fit Statistics with Unfolding versus Dominance Response Models

Peer reviewed

Reimers, Jennifer; Turner, Ronna C.; Tendeiro, Jorge N.; Lo, Wen-Juo; Keiffer, Elizabeth – Measurement: Interdisciplinary Research and Perspectives, 2023

Person-fit analyses are commonly used to detect aberrant responding in self-report data. Nonparametric person fit statistics do not require fitting a parametric test theory model and have performed well compared to other person-fit statistics. However, detection of aberrant responding has primarily focused on dominance response data, thus the…

Descriptors: Goodness of Fit, Nonparametric Statistics, Error of Measurement, Comparative Analysis

The Comparison of Estimation Methods for the Four-Parameter Logistic Item Response Theory Model

Peer reviewed

Kalkan, Ömür Kaya – Measurement: Interdisciplinary Research and Perspectives, 2022

The four-parameter logistic (4PL) Item Response Theory (IRT) model has recently been reconsidered in the literature due to the advances in the statistical modeling software and the recent developments in the estimation of the 4PL IRT model parameters. The current simulation study evaluated the performance of expectation-maximization (EM),…

Descriptors: Comparative Analysis, Sample Size, Test Length, Algorithms

Can the One-Parameter Logistic Model Be a Spurious Finding for a Heterogeneous Population?

Peer reviewed

Raykov, Tenko; Marcoulides, George A.; Harrison, Michael – Measurement: Interdisciplinary Research and Perspectives, 2019

Utilizing the perspective of finite mixture modeling, this note considers whether a finding of a plausible one-parameter logistic model could be spurious for a population with substantial unobserved heterogeneity. A theoretically and empirically important setting is discussed involving the mixture of two latent classes, with the less restrictive…

Descriptors: Models, Evaluation Methods, Social Science Research, Statistical Analysis

A Mixed Methods Model of Scale Development and Validation Analysis

Peer reviewed

Zhou, Yuchun – Measurement: Interdisciplinary Research and Perspectives, 2019

Using mixed methods to develop new scales is not a new idea since the 2000s. However, there exists inadequate literature that discusses scale development using mixed methods, with steps including how to design the study, how to implement the process, and how to conduct validation. This study proposes a hands-on model of using mixed methods to…

Descriptors: Mixed Methods Research, Test Construction, Construct Validity, Psychometrics

Simultaneously Modeling Differential Testlet Functioning and Differential Item Functioning: Addressing Variance Heterogeneity with a Multigroup One-Parameter Testlet Model

Peer reviewed

Luo, Yong; Liang, Xinya – Measurement: Interdisciplinary Research and Perspectives, 2019

Current methods that simultaneously model differential testlet functioning (DTLF) and differential item functioning (DIF) constrain the variances of latent ability and testlet effects to be equal between the focal and the reference groups. Such a constraint can be stringent and unrealistic with real data. In this study, we propose a multigroup…

Descriptors: Test Items, Item Response Theory, Test Bias, Models

flexMIRT: A Flexible Modeling Package for Multidimensional Item Response Models

Peer reviewed

Chung, Seungwon; Houts, Carrie – Measurement: Interdisciplinary Research and Perspectives, 2020

Advanced modeling of item response data through the item response theory (IRT) or item factor analysis frameworks is becoming increasingly popular. In the social and behavioral sciences, the underlying structure of tests/assessments is often multidimensional (i.e., more than 1 latent variable/construct is represented in the items). This review…

Descriptors: Item Response Theory, Evaluation Methods, Models, Factor Analysis

Bayesian Analysis of Multidimensional Item Response Theory Models: A Discussion and Illustration of Three Response Style Models

Peer reviewed

Leventhal, Brian C.; Stone, Clement A. – Measurement: Interdisciplinary Research and Perspectives, 2018

Interest in Bayesian analysis of item response theory (IRT) models has grown tremendously due to the appeal of the paradigm among psychometricians, advantages of these methods when analyzing complex models, and availability of general-purpose software. Possible models include models which reflect multidimensionality due to designed test structure,…

Descriptors: Bayesian Statistics, Item Response Theory, Models, Psychometrics

Using Response Times to Assess Learning Progress: A Joint Model for Responses and Response Times

Peer reviewed

Wang, Shiyu; Zhang, Susu; Douglas, Jeff; Culpepper, Steven – Measurement: Interdisciplinary Research and Perspectives, 2018

Analyzing students' growth remains an important topic in educational research. Most recently, Diagnostic Classification Models (DCMs) have been used to track skill acquisition in a longitudinal fashion, with the purpose to provide an estimate of students' learning trajectories in terms of the change of fine-grained skills over time. Response time…

Descriptors: Reaction Time, Markov Processes, Computer Assisted Instruction, Spatial Ability

Are Formative Indicators Superfluous? An Extension of Aguirre-Urreta, Rönkkö, and Marakas Analysis

Peer reviewed

Guyon, Hervé; Tensaout, Mouloud – Measurement: Interdisciplinary Research and Perspectives, 2016

In this article, the authors extend the results of Aguirre-Urreta, Rönkkö, and Marakas (2016) concerning the omission of a relevant causal indicator by testing the validity of the assumption that causal indicators are entirely superfluous to the measurement model and discuss the implications for measurement theory. Contrary to common wisdom…

Descriptors: Causal Models, Structural Equation Models, Formative Evaluation, Measurement

Author

Black, Paul	2
Humphry, Stephen M.	2
Kyngdon, Andrew	2
Wilson, Mark	2
Yao, Shih-Ying	2
A. M. Sadek	1
Anders Sjöberg	1
Andrich, David	1
Bechger, Timo	1
Brennan, Robert L.	1
Carstensen, Claus H.	1
Chung, Seungwon	1
Culpepper, Steven	1
Douglas, Jeff	1
Engelhard, George, Jr.	1
Erik Forsberg	1
Fahad Al-Muhlaki	1
Frey, Andreas	1
Ge, Yuan	1
Golnoosh Babaei	1
Guyon, Hervé	1
Haberman, Shelby J.	1
Hancock, Gregory R.	1
Harrison, Michael	1
Heene, Moritz	1