Publication Date
| Date range | Results |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 3 |
| Since 2017 (last 10 years) | 6 |
| Since 2007 (last 20 years) | 14 |
Descriptor
| Descriptor | Results |
| --- | --- |
| Correlation | 16 |
| Difficulty Level | 16 |
| Sample Size | 16 |
| Test Items | 11 |
| Item Response Theory | 10 |
| Test Length | 7 |
| Accuracy | 6 |
| Simulation | 6 |
| Computation | 5 |
| Comparative Analysis | 4 |
| Guessing (Tests) | 4 |
Source
| Source | Results |
| --- | --- |
| ProQuest LLC | 4 |
| Educational and Psychological Measurement | 2 |
| Journal of Experimental Education | 2 |
| Applied Psychological Measurement | 1 |
| International Journal of Assessment Tools in Education | 1 |
| Journal of Educational Measurement | 1 |
| Measurement: Interdisciplinary Research and Perspectives | 1 |
| Quality Assurance in Education: An International Perspective | 1 |
| Research Matters | 1 |
Author
| Author | Results |
| --- | --- |
| Svetina, Dubravka | 2 |
| Ahn, Soyeon | 1 |
| Atar, Hakan Yavuz | 1 |
| Bramley, Tom | 1 |
| Dai, Shenghai | 1 |
| De Ayala, R. J. | 1 |
| DeMars, Christine E. | 1 |
| Eckes, Thomas | 1 |
| Finch, Holmes | 1 |
| Fu, Qiong | 1 |
| Jin, Kuan-Yu | 1 |
Publication Type
| Publication Type | Results |
| --- | --- |
| Reports - Research | 11 |
| Journal Articles | 10 |
| Dissertations/Theses -… | 4 |
| Reports - Evaluative | 1 |
| Speeches/Meeting Papers | 1 |
Audience
| Audience | Results |
| --- | --- |
| Researchers | 1 |
Jin, Kuan-Yu; Eckes, Thomas – Measurement: Interdisciplinary Research and Perspectives, 2022
Recent research on rater effects in performance assessments has increasingly focused on rater centrality, the tendency to assign scores clustering around the rating scale's middle categories. In the present paper, we adopted Jin and Wang's (2018) extended facets modeling approach and constructed a centrality continuum, ranging from raters…
Descriptors: Performance Based Assessment, Evaluators, Scoring, Sample Size
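For context, a minimal sketch of the many-facet rating scale model that such rater analyses typically build on; Jin and Wang's (2018) centrality extension adds rater-specific structure that is not reproduced here.

```latex
% Many-facet rating scale model (baseline sketch): the log-odds that
% rater j gives examinee n a score of k rather than k-1 on item i
\log \frac{P_{nijk}}{P_{nij(k-1)}} = \theta_n - \beta_i - \rho_j - \tau_k
% \theta_n: examinee proficiency, \beta_i: item difficulty,
% \rho_j: rater severity, \tau_k: threshold for category k
```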
Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023
Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for adaptive tests of medium difficulty (probability of solution p = 1/2) to determine the accuracy of the parameters for each item and in…
Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level
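The paper's closed formula itself is truncated above, but the role of medium difficulty is easy to illustrate: for a Rasch item answered by N respondents of matched ability, item information peaks at p = 1/2, which pins down the best-case standard error as a function of N (a textbook result stated for context, not the authors' formula).

```latex
I(\beta) = \sum_{n=1}^{N} p_n (1 - p_n), \qquad
p_n = \frac{1}{1 + e^{-(\theta_n - \beta)}}
% with p_n = 1/2 for every respondent (medium difficulty):
I(\beta) = \frac{N}{4}, \qquad
\mathrm{SE}(\hat{\beta}) = \frac{1}{\sqrt{I(\beta)}} = \frac{2}{\sqrt{N}}
```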
Saatcioglu, Fatima Munevver; Atar, Hakan Yavuz – International Journal of Assessment Tools in Education, 2022
This study aims to examine the effects of mixture item response theory (IRT) models on item parameter estimation and classification accuracy under different conditions. The manipulated variables of the simulation study are set as mixture IRT models (Rasch, 2PL, 3PL); sample size (600, 1000); the number of items (10, 30); the number of latent…
Descriptors: Accuracy, Classification, Item Response Theory, Programming Languages
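A minimal sketch of the data generation such a simulation needs, shown for a single-class 3PL model (the parameter distributions are illustrative assumptions; the study itself generates data from mixture IRT models with multiple latent classes).

```python
import numpy as np

rng = np.random.default_rng(7)
n_persons, n_items = 600, 30          # one of the abstract's conditions

theta = rng.normal(0, 1, n_persons)   # person abilities
a = rng.lognormal(0, 0.3, n_items)    # discriminations
b = rng.normal(0, 1, n_items)         # difficulties
c = rng.uniform(0.1, 0.25, n_items)   # pseudo-guessing (3PL only)

# 3PL: P(correct) = c + (1 - c) / (1 + exp(-a * (theta - b)))
p = c + (1 - c) / (1 + np.exp(-a * (theta[:, None] - b)))
responses = (rng.uniform(size=p.shape) < p).astype(int)  # 600 x 30 matrix
```

Under a mixture design, one would additionally draw a latent class per examinee and apply class-specific item parameters.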
Bramley, Tom – Research Matters, 2020
The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…
Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy
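For reference, chained linear equating, the small-sample method named above, composes two linear links through an anchor test: form X to anchor V in the group that took X, then V to form Y in the group that took Y. A minimal sketch with illustrative names (not Bramley's code):

```python
import numpy as np

def linear_link(x, mean_from, sd_from, mean_to, sd_to):
    """Map scores linearly so means and SDs line up."""
    return mean_to + (sd_to / sd_from) * (x - mean_from)

def chained_linear(x, x1, v1, v2, y2):
    """Equate form X to form Y via anchor V.
    x1, v1: X and anchor scores in group 1; v2, y2: anchor and Y scores in group 2."""
    v = linear_link(x, x1.mean(), x1.std(), v1.mean(), v1.std())     # X -> V in group 1
    return linear_link(v, v2.mean(), v2.std(), y2.mean(), y2.std()) # V -> Y in group 2
```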
Park, Sung Eun; Ahn, Soyeon; Zopluoglu, Cengiz – Educational and Psychological Measurement, 2021
This study presents a new approach to synthesizing differential item functioning (DIF) effect size: First, using correlation matrices from each study, we perform a multigroup confirmatory factor analysis (MGCFA) that examines measurement invariance of a test item between two subgroups (i.e., focal and reference groups). Then we synthesize, across…
Descriptors: Item Analysis, Effect Size, Difficulty Level, Monte Carlo Methods
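The MGCFA step is too involved for a short sketch, but the synthesis step it feeds can be illustrated in its simplest fixed-effect form as an inverse-variance weighted average of per-study effect sizes (a generic meta-analytic sketch with toy numbers, not the authors' estimator).

```python
import numpy as np

def fixed_effect_pool(effects, variances):
    """Inverse-variance weighted mean and its standard error."""
    w = 1.0 / np.asarray(variances)
    est = np.sum(w * np.asarray(effects)) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return est, se

# toy per-study DIF effect sizes and sampling variances, for illustration only
est, se = fixed_effect_pool([0.12, 0.05, 0.20], [0.004, 0.002, 0.009])
```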
Wang, Xiaolin; Svetina, Dubravka; Dai, Shenghai – Journal of Experimental Education, 2019
Recently, interest in test subscore reporting for diagnostic purposes has been growing rapidly. The two simulation studies here examined factors (sample size, number of subscales, correlation between subscales, and three factors affecting subscore reliability: number of items per subscale, item parameter distribution, and data generating model)…
Descriptors: Value Added Models, Scores, Sample Size, Correlation
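A common criterion underlying such value-added analyses is Haberman's: a subscore is worth reporting only if it predicts its own true score better than the total score does. Stated from the general literature, since the abstract does not print it:

```latex
% Haberman (2008): subscore s has added value over total score x when
\mathrm{PRMSE}(s) > \mathrm{PRMSE}(x), \quad
\mathrm{PRMSE}(s) = \rho^2(s, \tau_s), \;
\mathrm{PRMSE}(x) = \rho^2(x, \tau_s)
% \tau_s is the true subscore; \rho^2 denotes a squared correlation
```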
Svetina, Dubravka; Levy, Roy – Journal of Experimental Education, 2016
This study investigated the effect of complex structure on dimensionality assessment in compensatory multidimensional item response models using DETECT- and NOHARM-based methods. The performance was evaluated via the accuracy of identifying the correct number of dimensions and the ability to accurately recover item groupings using a simple…
Descriptors: Item Response Theory, Accuracy, Correlation, Sample Size
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests
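Local dependence of the kind estimated here is often screened with Yen's Q3, the correlation of model residuals across item pairs. A minimal sketch assuming a fitted Rasch model (illustrative, not the authors' procedure):

```python
import numpy as np

def q3_matrix(responses, theta_hat, b_hat):
    """Yen's Q3: correlations of person-item residuals under a Rasch model."""
    p = 1.0 / (1.0 + np.exp(-(theta_hat[:, None] - b_hat)))  # expected scores
    residuals = responses - p
    return np.corrcoef(residuals, rowvar=False)  # large off-diagonal values flag LID
```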
Lee, Eunjung – ProQuest LLC, 2013
The purpose of this research was to compare the performance of various equating procedures for multidimensional tests. To examine the various equating procedures, simulated data sets were used that were generated based on a multidimensional item response theory (MIRT) framework. Various equating procedures were examined, including…
Descriptors: Equated Scores, Tests, Comparative Analysis, Item Response Theory
MacDonald, George T. – ProQuest LLC, 2014
A simulation study was conducted to explore the performance of the linear logistic test model (LLTM) when the relationships between items and cognitive components were misspecified. Factors manipulated included percent of misspecification (0%, 1%, 5%, 10%, and 15%), form of misspecification (under-specification, balanced misspecification, and…
Descriptors: Simulation, Item Response Theory, Models, Test Items
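For context, the LLTM replaces the free Rasch item difficulties with a weighted sum of cognitive-component contributions, and it is exactly this item-by-component mapping (the Q-matrix) that the study misspecifies. In standard notation, not quoted from the dissertation:

```latex
% Rasch: P(X_{ni}=1) = \frac{e^{\theta_n - \beta_i}}{1 + e^{\theta_n - \beta_i}}
% LLTM constrains the free \beta_i to a linear combination:
\beta_i = \sum_{k=1}^{K} q_{ik}\, \eta_k + c
% q_{ik}: weight of cognitive component k in item i (the Q-matrix),
% \eta_k: difficulty contribution of component k, c: normalization constant
```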
Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Descriptors: Sample Size, Test Length, Correlation, Test Format
Kim, Hyun Seok John – ProQuest LLC, 2011
Cognitive diagnostic assessment (CDA) is a new theoretical framework for psychological and educational testing that is designed to provide detailed information about examinees' strengths and weaknesses in specific knowledge structures and processing skills. During the last three decades, more than a dozen psychometric models have been developed…
Descriptors: Cognitive Measurement, Diagnostic Tests, Bayesian Statistics, Statistical Inference
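One widely used member of the model family the abstract refers to is the DINA model, shown here as general background (standard notation; the dissertation's focal models are not identified in the excerpt):

```latex
% DINA: examinee i answers item j correctly unless slipping occurs,
% and only guesses correctly without full attribute mastery
\eta_{ij} = \prod_{k=1}^{K} \alpha_{ik}^{\,q_{jk}}, \qquad
P(X_{ij} = 1 \mid \boldsymbol{\alpha}_i) = (1 - s_j)^{\eta_{ij}}\, g_j^{\,1 - \eta_{ij}}
% \alpha_{ik}: mastery of attribute k, q_{jk}: Q-matrix entry,
% s_j: slip probability, g_j: guess probability
```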
Fu, Qiong – ProQuest LLC, 2010
This research investigated how the accuracy of person ability and item difficulty parameter estimation varied across five IRT models with respect to the presence of guessing, targeting, and varied combinations of sample sizes and test lengths. The data were simulated with 50 replications under each of the 18 combined conditions. Five IRT models…
Descriptors: Item Response Theory, Guessing (Tests), Accuracy, Computation
Finch, Holmes – Applied Psychological Measurement, 2010
The accuracy of item parameter estimates in the multidimensional item response theory (MIRT) model context is one that has not been researched in great detail. This study examines the ability of two confirmatory factor analysis models specifically for dichotomous data to properly estimate item parameters using common formulae for converting factor…
Descriptors: Item Response Theory, Computation, Factor Analysis, Models
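The "common formulae" referred to are presumably the standard normal-ogive conversion from a factor loading and threshold to IRT parameters, stated here from the general literature as an assumption, since the abstract does not print them:

```latex
a_j = \frac{\lambda_j}{\sqrt{1 - \lambda_j^2}}, \qquad
b_j = \frac{\tau_j}{\lambda_j}
% \lambda_j: factor loading, \tau_j: threshold; multiply a_j by
% D = 1.702 to move from the normal-ogive to the logistic metric
```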
De Ayala, R. J. – 1993
Previous work on the effects of dimensionality on parameter estimation was extended from dichotomous models to the polytomous graded response (GR) model. A multidimensional GR model was developed to generate data in one-, two-, and three-dimensions, with two- and three-dimensional conditions varying in their interdimensional associations. Test…
Descriptors: Computer Simulation, Correlation, Difficulty Level, Estimation (Mathematics)
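For context, the unidimensional graded response model that the multidimensional generating model extends, in standard Samejima notation (general background, not quoted from the report): category boundaries are modeled cumulatively, and category probabilities fall out as differences.

```latex
P^{*}_{ik}(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_{ik})}}, \qquad
P_{ik}(\theta) = P^{*}_{ik}(\theta) - P^{*}_{i(k+1)}(\theta)
% P^{*}_{ik}: probability of responding in category k or higher on item i;
% b_{ik}: ordered category boundaries, a_i: item discrimination
```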
