Publication Date
| Publication Date | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 4 |
| Since 2007 (last 20 years) | 15 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Comparative Analysis | 21 |
| Difficulty Level | 21 |
| Sample Size | 21 |
| Test Items | 16 |
| Item Response Theory | 12 |
| Simulation | 8 |
| Test Length | 7 |
| Equated Scores | 6 |
| Computation | 5 |
| Monte Carlo Methods | 5 |
| Ability | 4 |
Author
| Author | Records |
| --- | --- |
| Abulela, Mohammed A. A. | 1 |
| Ahn, Soyeon | 1 |
| Allen, Nancy L. | 1 |
| Arikan, Çigdem Akin | 1 |
| Asiret, Semih | 1 |
| Atar, Burcu | 1 |
| Bacon, Tina P. | 1 |
| Barnes, Laura L. B. | 1 |
| Cai, Li | 1 |
| Carvajal-Espinoza, Jorge E. | 1 |
| Desmet, Piet | 1 |
Publication Type
| Publication Type | Records |
| --- | --- |
| Reports - Research | 13 |
| Journal Articles | 11 |
| Dissertations/Theses -… | 5 |
| Speeches/Meeting Papers | 4 |
| Reports - Evaluative | 3 |
Education Level
| Education Level | Records |
| --- | --- |
| Secondary Education | 1 |
Assessments and Surveys
| Assessment | Records |
| --- | --- |
| Program for International… | 1 |
Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023
Purpose: Based on the general formula, which depends on test length and difficulty, the number of respondents, and the number of ability levels, this study aims to provide a closed formula for adaptive tests of medium difficulty (probability of solution p = 1/2) to determine the accuracy of the parameters for each item and in…
Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level
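For context on the "medium difficulty" condition in the record above: under the Rasch model the probability of a correct response depends on the gap between ability and difficulty, and equals 1/2 exactly when the two match. This is a standard identity, shown here as background rather than quoted from the article.

```latex
% Rasch item response function: ability \theta, item difficulty b
P(X = 1 \mid \theta, b) = \frac{\exp(\theta - b)}{1 + \exp(\theta - b)}
% At \theta = b the exponent is zero, so P = 1/(1+1) = 1/2:
% "medium difficulty" means items matched to the examinee's ability.
```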
Lenhard, Wolfgang; Lenhard, Alexandra – Educational and Psychological Measurement, 2021
The interpretation of psychometric test results is usually based on norm scores. We compared semiparametric continuous norming (SPCN) with conventional norming methods by simulating results for test scales with different item numbers and difficulties via an item response theory approach. Subsequently, we modeled the norm scores based on random…
Descriptors: Test Norms, Scores, Regression (Statistics), Test Items
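A minimal sketch of the kind of IRT-based simulation this record describes: generating dichotomous item responses under a 2PL model and summing them to raw scores, which norming methods then model. All parameter values and names are illustrative assumptions, not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(42)

n_persons, n_items = 1000, 20
theta = rng.normal(0.0, 1.0, size=n_persons)   # person abilities
b = rng.uniform(-2.0, 2.0, size=n_items)       # item difficulties
a = rng.lognormal(0.0, 0.3, size=n_items)      # discriminations (2PL)

# 2PL response probabilities: P(X=1) = logistic(a * (theta - b))
logits = a * (theta[:, None] - b[None, :])
p = 1.0 / (1.0 + np.exp(-logits))

# Draw dichotomous responses and sum to raw scores (the basis for norming)
responses = rng.binomial(1, p)
raw_scores = responses.sum(axis=1)
print(raw_scores[:10])
```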
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
Arikan, Çigdem Akin – International Journal of Progressive Education, 2018
The main purpose of this study is to compare the performance of test forms equated with a midi anchor test versus a mini anchor test, based on item response theory. The research was conducted using simulated data generated under the Rasch model. To equate the two test forms, the anchor item nonequivalent groups (internal anchor test) was…
Descriptors: Equated Scores, Comparative Analysis, Item Response Theory, Tests
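A minimal sketch of common-item (anchor) equating under the Rasch model, the design family this record describes: the two forms are placed on a common scale by shifting one form's difficulties by the mean difference on the shared anchor items. Values and variable names are illustrative assumptions.

```python
import numpy as np

# Estimated Rasch difficulties for the anchor items, one set per form
# (illustrative values; in practice these come from separate calibrations)
b_anchor_x = np.array([-0.8, -0.2, 0.3, 0.9])   # anchor items on Form X scale
b_anchor_y = np.array([-0.5, 0.1, 0.6, 1.2])    # same items on Form Y scale

# Mean/mean linking constant: how far Form Y's scale sits above Form X's
shift = b_anchor_y.mean() - b_anchor_x.mean()

# Place any Form Y parameter on the Form X scale by subtracting the shift
b_y_on_x_scale = b_anchor_y - shift
print(f"linking constant: {shift:.3f}")
```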
Asiret, Semih; Sünbül, Seçil Ömür – Educational Sciences: Theory and Practice, 2016
This study aimed to compare equating methods for the random groups design with small samples across factors such as sample size, the difference in difficulty between forms, and the guessing parameter. It also investigated which method gives better results under which conditions. In this study, 5,000 dichotomous simulated data…
Descriptors: Equated Scores, Sample Size, Difficulty Level, Guessing (Tests)
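For reference, the classical linear equating function typically compared in small-sample equating studies like this one sets standardized deviation scores equal across forms; this is the standard textbook form (e.g., Kolen & Brennan), not reproduced from the article.

```latex
% Linear equating of a Form X score x onto the Form Y scale:
l_Y(x) = \frac{\sigma(Y)}{\sigma(X)}\,\bigl(x - \mu(X)\bigr) + \mu(Y)
% Mean equating is the special case \sigma(Y)/\sigma(X) = 1.
```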
Paek, Insu; Cai, Li – Educational and Psychological Measurement, 2014
The present study was motivated by the recognition that standard errors (SEs) of item response theory (IRT) model parameters are often of immediate interest to practitioners and that there is currently a lack of comparative research on different SE (or error variance-covariance matrix) estimation procedures. The present study investigated item…
Descriptors: Item Response Theory, Comparative Analysis, Error of Measurement, Computation
Wu, Yi-Fang – ProQuest LLC, 2015
Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT (Birnbaum, 1968; Lord, 1980) model, the current study examines the accuracy and…
Descriptors: Item Response Theory, Test Items, Accuracy, Computation
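The three-parameter logistic model named in this record has the standard form below (a textbook identity, stated as background rather than quoted from the dissertation):

```latex
% 3PL: discrimination a_i, difficulty b_i, pseudo-guessing c_i
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp\bigl(-a_i(\theta - b_i)\bigr)}
```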
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests
Lee, Eunjung – ProQuest LLC, 2013
The purpose of this research was to compare the equating performance of various equating procedures for the multidimensional tests. To examine the various equating procedures, simulated data sets were used that were generated based on a multidimensional item response theory (MIRT) framework. Various equating procedures were examined, including…
Descriptors: Equated Scores, Tests, Comparative Analysis, Item Response Theory
Jin, Ying; Myers, Nicholas D.; Ahn, Soyeon; Penfield, Randall D. – Educational and Psychological Measurement, 2013
The Rasch model, a member of a larger group of models within item response theory, is widely used in empirical studies. Detection of uniform differential item functioning (DIF) within the Rasch model typically employs null hypothesis testing with a concomitant consideration of effect size (e.g., signed area [SA]). Parametric equivalence between…
Descriptors: Test Bias, Effect Size, Item Response Theory, Comparative Analysis
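In the Rasch case referenced above, the signed-area (SA) effect size for uniform DIF reduces to a difference of difficulty parameters, because the two item characteristic curves differ only by a horizontal shift. This is a standard area-measure result (cf. Raju), given here as background rather than quoted from the article.

```latex
% Signed area between reference (R) and focal (F) group ICCs for item i:
\mathrm{SA}_i = \int_{-\infty}^{\infty}\bigl[P_{iR}(\theta) - P_{iF}(\theta)\bigr]\,d\theta
             = b_{iF} - b_{iR}
```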
Carvajal-Espinoza, Jorge E. – ProQuest LLC, 2011
The Non-Equivalent groups with Anchor Test (NEAT) design is a widely used equating design in large-scale testing that involves two groups that need not be of equal ability. One group, P, takes form X together with a set of anchor items A; the other group, Q, takes form Y together with the same anchor set A. One of the most commonly used equating methods in…
Descriptors: Sample Size, Equated Scores, Psychometrics, Measurement
Wauters, Kelly; Desmet, Piet; Van Den Noortgate, Wim – Computers & Education, 2012
The evolution from static to dynamic electronic learning environments has stimulated research on adaptive item sequencing. A prerequisite for adaptive item sequencing, in which the difficulty of the item is constantly matched to the ability level of the learner, is to have items with a known difficulty level. The difficulty level can be…
Descriptors: Expertise, Electronic Learning, Feedback (Response), Sample Size
Kim, Hyun Seok John – ProQuest LLC, 2011
Cognitive diagnostic assessment (CDA) is a new theoretical framework for psychological and educational testing that is designed to provide detailed information about examinees' strengths and weaknesses in specific knowledge structures and processing skills. During the last three decades, more than a dozen psychometric models have been developed…
Descriptors: Cognitive Measurement, Diagnostic Tests, Bayesian Statistics, Statistical Inference
Sunnassee, Devdass – ProQuest LLC, 2011
Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…
Descriptors: Test Length, Test Format, Sample Size, Simulation
Atar, Burcu; Kamata, Akihito – Hacettepe University Journal of Education, 2011
The Type I error rates and the power of the IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
Descriptors: Test Bias, Sample Size, Monte Carlo Methods, Item Response Theory
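A minimal sketch of how a Monte Carlo study like this one estimates Type I error: simulate many datasets under the null (no DIF), run the significance test on each, and record the rejection rate at the nominal alpha. The toy two-group t-test here merely stands in for the DIF procedures the study examines; everything in the sketch is an illustrative assumption.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
alpha, n_reps, n_per_group = 0.05, 2000, 250

rejections = 0
for _ in range(n_reps):
    # Null condition: both groups drawn from the same distribution (no DIF)
    ref = rng.normal(0.0, 1.0, n_per_group)
    foc = rng.normal(0.0, 1.0, n_per_group)
    _, p_value = stats.ttest_ind(ref, foc)
    rejections += p_value < alpha

# The empirical Type I error rate should be close to the nominal alpha
print(f"empirical Type I error: {rejections / n_reps:.3f}")
```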