Publication Date
| Publication Date | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 4 |
| Since 2007 (last 20 years) | 15 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Comparative Analysis | 21 |
| Difficulty Level | 21 |
| Sample Size | 21 |
| Test Items | 16 |
| Item Response Theory | 12 |
| Simulation | 8 |
| Test Length | 7 |
| Equated Scores | 6 |
| Computation | 5 |
| Monte Carlo Methods | 5 |
| Ability | 4 |
Author
| Author | Records |
| --- | --- |
| Abulela, Mohammed A. A. | 1 |
| Ahn, Soyeon | 1 |
| Allen, Nancy L. | 1 |
| Arikan, Çigdem Akin | 1 |
| Asiret, Semih | 1 |
| Atar, Burcu | 1 |
| Bacon, Tina P. | 1 |
| Barnes, Laura L. B. | 1 |
| Cai, Li | 1 |
| Carvajal-Espinoza, Jorge E. | 1 |
| Desmet, Piet | 1 |
Publication Type
| Publication Type | Records |
| --- | --- |
| Reports - Research | 13 |
| Journal Articles | 11 |
| Dissertations/Theses -… | 5 |
| Speeches/Meeting Papers | 4 |
| Reports - Evaluative | 3 |
Education Level
| Education Level | Records |
| --- | --- |
| Secondary Education | 1 |
Assessments and Surveys
| Assessment | Records |
| --- | --- |
| Program for International… | 1 |
Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023
Purpose: Based on the general formula, which depends on test length and difficulty, the number of respondents, and the number of ability levels, this study aims to provide a closed formula for adaptive tests of medium difficulty (probability of solution p = 1/2) to determine the accuracy of the parameters for each item and in…
Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level
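For context on the "medium difficulty" condition in the record above: under the Rasch model the probability of a correct response depends on the gap between ability and difficulty, and equals 1/2 exactly when the two match. This is a standard identity, shown here as background rather than quoted from the article.

```latex
% Rasch item response function: ability \theta, item difficulty b
P(X = 1 \mid \theta, b) = \frac{\exp(\theta - b)}{1 + \exp(\theta - b)}
% At \theta = b the exponent is zero, so P = 1/(1+1) = 1/2:
% "medium difficulty" means items matched to the examinee's ability.
```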
Lenhard, Wolfgang; Lenhard, Alexandra – Educational and Psychological Measurement, 2021
The interpretation of psychometric test results is usually based on norm scores. We compared semiparametric continuous norming (SPCN) with conventional norming methods by simulating results for test scales with different item numbers and difficulties via an item response theory approach. Subsequently, we modeled the norm scores based on random…
Descriptors: Test Norms, Scores, Regression (Statistics), Test Items
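A minimal sketch of the kind of IRT-based simulation this record describes: generating dichotomous item responses under a 2PL model and summing them to raw scores, which norming methods then model. All parameter values and names are illustrative assumptions, not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(42)

n_persons, n_items = 1000, 20
theta = rng.normal(0.0, 1.0, size=n_persons)   # person abilities
b = rng.uniform(-2.0, 2.0, size=n_items)       # item difficulties
a = rng.lognormal(0.0, 0.3, size=n_items)      # discriminations (2PL)

# 2PL response probabilities: P(X=1) = logistic(a * (theta - b))
logits = a * (theta[:, None] - b[None, :])
p = 1.0 / (1.0 + np.exp(-logits))

# Draw dichotomous responses and sum to raw scores (the basis for norming)
responses = rng.binomial(1, p)
raw_scores = responses.sum(axis=1)
print(raw_scores[:10])
```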
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
Arikan, Çigdem Akin – International Journal of Progressive Education, 2018
The main purpose of this study is to compare the performance of test forms equated with a midi anchor test versus a mini anchor test, based on item response theory. The research was conducted using simulated data generated under the Rasch model. To equate the two test forms, the anchor item nonequivalent groups (internal anchor test) was…
Descriptors: Equated Scores, Comparative Analysis, Item Response Theory, Tests
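A minimal sketch of common-item (anchor) equating under the Rasch model, the design family this record describes: the two forms are placed on a common scale by shifting one form's difficulties by the mean difference on the shared anchor items. Values and variable names are illustrative assumptions.

```python
import numpy as np

# Estimated Rasch difficulties for the anchor items, one set per form
# (illustrative values; in practice these come from separate calibrations)
b_anchor_x = np.array([-0.8, -0.2, 0.3, 0.9])   # anchor items on Form X scale
b_anchor_y = np.array([-0.5, 0.1, 0.6, 1.2])    # same items on Form Y scale

# Mean/mean linking constant: how far Form Y's scale sits above Form X's
shift = b_anchor_y.mean() - b_anchor_x.mean()

# Place any Form Y parameter on the Form X scale by subtracting the shift
b_y_on_x_scale = b_anchor_y - shift
print(f"linking constant: {shift:.3f}")
```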
Asiret, Semih; Sünbül, Seçil Ömür – Educational Sciences: Theory and Practice, 2016
This study aimed to compare equating methods for the random groups design with small samples across factors such as sample size, the difference in difficulty between forms, and the guessing parameter. It also investigated which method gives better results under which conditions. In this study, 5,000 dichotomous simulated data…
Descriptors: Equated Scores, Sample Size, Difficulty Level, Guessing (Tests)
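For reference, the classical linear equating function typically compared in small-sample equating studies like this one sets standardized deviation scores equal across forms; this is the standard textbook form (e.g., Kolen & Brennan), not reproduced from the article.

```latex
% Linear equating of a Form X score x onto the Form Y scale:
l_Y(x) = \frac{\sigma(Y)}{\sigma(X)}\,\bigl(x - \mu(X)\bigr) + \mu(Y)
% Mean equating is the special case \sigma(Y)/\sigma(X) = 1.
```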
Paek, Insu; Cai, Li – Educational and Psychological Measurement, 2014
The present study was motivated by the recognition that standard errors (SEs) of item response theory (IRT) model parameters are often of immediate interest to practitioners and that there is currently a lack of comparative research on different SE (or error variance-covariance matrix) estimation procedures. The present study investigated item…
Descriptors: Item Response Theory, Comparative Analysis, Error of Measurement, Computation
Wu, Yi-Fang – ProQuest LLC, 2015
Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT (Birnbaum, 1968; Lord, 1980) model, the current study examines the accuracy and…
Descriptors: Item Response Theory, Test Items, Accuracy, Computation
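The three-parameter logistic model named in this record has the standard form below (a textbook identity, stated as background rather than quoted from the dissertation):

```latex
% 3PL: discrimination a_i, difficulty b_i, pseudo-guessing c_i
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp\bigl(-a_i(\theta - b_i)\bigr)}
```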
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests
Lee, Eunjung – ProQuest LLC, 2013
The purpose of this research was to compare the equating performance of various equating procedures for the multidimensional tests. To examine the various equating procedures, simulated data sets were used that were generated based on a multidimensional item response theory (MIRT) framework. Various equating procedures were examined, including…
Descriptors: Equated Scores, Tests, Comparative Analysis, Item Response Theory
Jin, Ying; Myers, Nicholas D.; Ahn, Soyeon; Penfield, Randall D. – Educational and Psychological Measurement, 2013
The Rasch model, a member of a larger group of models within item response theory, is widely used in empirical studies. Detection of uniform differential item functioning (DIF) within the Rasch model typically employs null hypothesis testing with a concomitant consideration of effect size (e.g., signed area [SA]). Parametric equivalence between…
Descriptors: Test Bias, Effect Size, Item Response Theory, Comparative Analysis
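In the Rasch case referenced above, the signed-area (SA) effect size for uniform DIF reduces to a difference of difficulty parameters, because the two item characteristic curves differ only by a horizontal shift. This is a standard area-measure result (cf. Raju), given here as background rather than quoted from the article.

```latex
% Signed area between reference (R) and focal (F) group ICCs for item i:
\mathrm{SA}_i = \int_{-\infty}^{\infty}\bigl[P_{iR}(\theta) - P_{iF}(\theta)\bigr]\,d\theta
             = b_{iF} - b_{iR}
```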
Carvajal-Espinoza, Jorge E. – ProQuest LLC, 2011
The Non-Equivalent groups with Anchor Test (NEAT) design is a widely used equating design in large-scale testing that involves two groups that need not be of equal ability. One group, P, takes form X together with a set of anchor items A; the other group, Q, takes form Y together with the same anchor set A. One of the most commonly used equating methods in…
Descriptors: Sample Size, Equated Scores, Psychometrics, Measurement
Wauters, Kelly; Desmet, Piet; Van Den Noortgate, Wim – Computers & Education, 2012
The evolution from static to dynamic electronic learning environments has stimulated research on adaptive item sequencing. A prerequisite for adaptive item sequencing, in which the difficulty of the item is constantly matched to the ability level of the learner, is to have items with a known difficulty level. The difficulty level can be…
Descriptors: Expertise, Electronic Learning, Feedback (Response), Sample Size
Kim, Hyun Seok John – ProQuest LLC, 2011
Cognitive diagnostic assessment (CDA) is a new theoretical framework for psychological and educational testing that is designed to provide detailed information about examinees' strengths and weaknesses in specific knowledge structures and processing skills. During the last three decades, more than a dozen psychometric models have been developed…
Descriptors: Cognitive Measurement, Diagnostic Tests, Bayesian Statistics, Statistical Inference
Sunnassee, Devdass – ProQuest LLC, 2011
Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…
Descriptors: Test Length, Test Format, Sample Size, Simulation
Atar, Burcu; Kamata, Akihito – Hacettepe University Journal of Education, 2011
The Type I error rates and the power of the IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
Descriptors: Test Bias, Sample Size, Monte Carlo Methods, Item Response Theory
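A minimal sketch of how a Monte Carlo study like this one estimates Type I error: simulate many datasets under the null (no DIF), run the significance test on each, and record the rejection rate at the nominal alpha. The toy two-group t-test here merely stands in for the DIF procedures the study examines; everything in the sketch is an illustrative assumption.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
alpha, n_reps, n_per_group = 0.05, 2000, 250

rejections = 0
for _ in range(n_reps):
    # Null condition: both groups drawn from the same distribution (no DIF)
    ref = rng.normal(0.0, 1.0, n_per_group)
    foc = rng.normal(0.0, 1.0, n_per_group)
    _, p_value = stats.ttest_ind(ref, foc)
    rejections += p_value < alpha

# The empirical Type I error rate should be close to the nominal alpha
print(f"empirical Type I error: {rejections / n_reps:.3f}")
```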