Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 8 |
| Since 2007 (last 20 years) | 16 |
Descriptor
| Comparative Analysis | 27 |
| Evaluation Methods | 15 |
| Item Response Theory | 11 |
| Test Items | 11 |
| Simulation | 9 |
| Methods | 6 |
| Monte Carlo Methods | 6 |
| Models | 5 |
| Scores | 5 |
| Adaptive Testing | 4 |
| College Entrance Examinations | 4 |
| More ▼ | |
Source
| Journal of Educational… | 27 |
Author
| Lee, Won-Chan | 2 |
| Nandakumar, Ratna | 2 |
| Ali, Usama S. | 1 |
| Ames, Allison | 1 |
| Ansley, Timothy | 1 |
| Baxter, Gail P. | 1 |
| Brennan, Robert L. | 1 |
| Cahn, Miriam F. | 1 |
| Cai, Yan | 1 |
| Chang, Hua-Hua | 1 |
| Chen, Shu-Ying | 1 |
| More ▼ | |
Publication Type
| Journal Articles | 26 |
| Reports - Research | 18 |
| Reports - Evaluative | 7 |
| Reports - Descriptive | 1 |
Education Level
| Elementary Secondary Education | 1 |
| Secondary Education | 1 |
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
| SAT (College Admission Test) | 2 |
| National Assessment of… | 1 |
| Program for International… | 1 |
| Trends in International… | 1 |
What Works Clearinghouse Rating
Wang, Shaojie; Zhang, Minqiang; Lee, Won-Chan; Huang, Feifei; Li, Zonglong; Li, Yixing; Yu, Sufang – Journal of Educational Measurement, 2022
Traditional IRT characteristic curve linking methods ignore parameter estimation errors, which may undermine the accuracy of estimated linking constants. Two new linking methods are proposed that take into account parameter estimation errors. The item- (IWCC) and test-information-weighted characteristic curve (TWCC) methods employ weighting…
Descriptors: Item Response Theory, Error of Measurement, Accuracy, Monte Carlo Methods
Feuerstahler, Leah; Wilson, Mark – Journal of Educational Measurement, 2019
Scores estimated from multidimensional item response theory (IRT) models are not necessarily comparable across dimensions. In this article, the concept of aligned dimensions is formalized in the context of Rasch models, and two methods are described--delta dimensional alignment (DDA) and logistic regression alignment (LRA)--to transform estimated…
Descriptors: Item Response Theory, Models, Scores, Comparative Analysis
Kim, Hyung Jin; Brennan, Robert L.; Lee, Won-Chan – Journal of Educational Measurement, 2017
In equating, when common items are internal and scoring is conducted in terms of the number of correct items, some pairs of total scores ("X") and common-item scores ("V") can never be observed in a bivariate distribution of "X" and "V"; these pairs are called "structural zeros." This simulation…
Descriptors: Test Items, Equated Scores, Comparative Analysis, Methods
Heritage, Margaret; Kingston, Neal M. – Journal of Educational Measurement, 2019
Classroom assessment and large-scale assessment have, for the most part, existed in mutual isolation. Some experts have felt this is for the best and others have been concerned that the schism limits the potential contribution of both forms of assessment. Margaret Heritage has long been a champion of best practices in classroom assessment. Neal…
Descriptors: Measurement, Psychometrics, Context Effect, Classroom Environment
Liu, Shuchang; Cai, Yan; Tu, Dongbo – Journal of Educational Measurement, 2018
This study applied the mode of on-the-fly assembled multistage adaptive testing to cognitive diagnosis (CD-OMST). Several and several module assembly methods for CD-OMST were proposed and compared in terms of measurement precision, test security, and constrain management. The module assembly methods in the study included the maximum priority index…
Descriptors: Adaptive Testing, Monte Carlo Methods, Computer Security, Clinical Diagnosis
Debeer, Dries; Ali, Usama S.; van Rijn, Peter W. – Journal of Educational Measurement, 2017
Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…
Descriptors: Test Format, Test Construction, Statistical Analysis, Comparative Analysis
Ames, Allison; Smith, Elizabeth – Journal of Educational Measurement, 2018
Bayesian methods incorporate model parameter information prior to data collection. Eliciting information from content experts is an option, but has seen little implementation in Bayesian item response theory (IRT) modeling. This study aims to use ethical reasoning content experts to elicit prior information and incorporate this information into…
Descriptors: Item Response Theory, Bayesian Statistics, Ethics, Specialists
Lee, Soo; Suh, Youngsuk – Journal of Educational Measurement, 2018
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…
Descriptors: Item Response Theory, Sample Size, Models, Error of Measurement
Häggström, Jenny; Wiberg, Marie – Journal of Educational Measurement, 2014
The selection of bandwidth in kernel equating is important because it has a direct impact on the equated test scores. The aim of this article is to examine the use of double smoothing when selecting bandwidths in kernel equating and to compare double smoothing with the commonly used penalty method. This comparison was made using both an equivalent…
Descriptors: Equated Scores, Data Analysis, Comparative Analysis, Simulation
Powers, Sonya; Kolen, Michael J. – Journal of Educational Measurement, 2014
Accurate equating results are essential when comparing examinee scores across exam forms. Previous research indicates that equating results may not be accurate when group differences are large. This study compared the equating results of frequency estimation, chained equipercentile, item response theory (IRT) true-score, and IRT observed-score…
Descriptors: Accuracy, Equated Scores, Differences, Groups
Yao, Lihua – Journal of Educational Measurement, 2014
The intent of this research was to find an item selection procedure in the multidimensional computer adaptive testing (CAT) framework that yielded higher precision for both the domain and composite abilities, had a higher usage of the item pool, and controlled the exposure rate. Five multidimensional CAT item selection procedures (minimum angle;…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
Tendeiro, Jorge N.; Meijer, Rob R. – Journal of Educational Measurement, 2014
In recent guidelines for fair educational testing it is advised to check the validity of individual test scores through the use of person-fit statistics. For practitioners it is unclear on the basis of the existing literature which statistic to use. An overview of relatively simple existing nonparametric approaches to identify atypical response…
Descriptors: Educational Assessment, Test Validity, Scores, Statistical Analysis
Hou, Likun; de la Torre, Jimmy; Nandakumar, Ratna – Journal of Educational Measurement, 2014
Analyzing examinees' responses using cognitive diagnostic models (CDMs) has the advantage of providing diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this article, the Wald test is proposed to examine DIF in the context of CDMs. This study…
Descriptors: Test Bias, Models, Simulation, Error Patterns
Sachse, Karoline A.; Roppelt, Alexander; Haag, Nicole – Journal of Educational Measurement, 2016
Trend estimation in international comparative large-scale assessments relies on measurement invariance between countries. However, cross-national differential item functioning (DIF) has been repeatedly documented. We ran a simulation study using national item parameters, which required trends to be computed separately for each country, to compare…
Descriptors: Comparative Analysis, Measurement, Test Bias, Simulation
Jiao, Hong; Wang, Shudong; He, Wei – Journal of Educational Measurement, 2013
This study demonstrated the equivalence between the Rasch testlet model and the three-level one-parameter testlet model and explored the Markov Chain Monte Carlo (MCMC) method for model parameter estimation in WINBUGS. The estimation accuracy from the MCMC method was compared with those from the marginalized maximum likelihood estimation (MMLE)…
Descriptors: Computation, Item Response Theory, Models, Monte Carlo Methods
Previous Page | Next Page »
Pages: 1 | 2
Peer reviewed
Direct link
