Ting Sun; Stella Yun Kim – Educational and Psychological Measurement, 2024
Equating is a statistical procedure used to adjust for the difference in form difficulty such that scores on those forms can be used and interpreted comparably. In practice, however, equating methods are often implemented without considering the extent to which two forms differ in difficulty. The study aims to examine the effect of the magnitude…
Descriptors: Difficulty Level, Data Interpretation, Equated Scores, High School Students
Luo, Yong – Measurement: Interdisciplinary Research and Perspectives, 2021
To date, only frequentist model-selection methods have been studied with mixed-format data in the context of IRT model-selection, and it is unknown how popular Bayesian model-selection methods such as DIC, WAIC, and LOO perform. In this study, we present the results of a comprehensive simulation study that compared the performances of eight…
Descriptors: Item Response Theory, Test Format, Selection, Methods
Kim, Stella Y.; Lee, Won-Chan – Applied Measurement in Education, 2019
This study explores classification consistency and accuracy for mixed-format tests using real and simulated data. In particular, the current study compares six methods of estimating classification consistency and accuracy for seven mixed-format tests. The relative performance of the estimation methods is evaluated using simulated data. Study…
Descriptors: Classification, Reliability, Accuracy, Test Format
Debeer, Dries; Ali, Usama S.; van Rijn, Peter W. – Journal of Educational Measurement, 2017
Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…
Descriptors: Test Format, Test Construction, Statistical Analysis, Comparative Analysis
Diao, Qi; van der Linden, Wim J. – Applied Psychological Measurement, 2013
Automated test assembly uses the methodology of mixed integer programming to select an optimal set of items from an item bank. Automated test-form generation uses the same methodology to optimally order the items and format the test form. From an optimization point of view, production of fully formatted test forms directly from the item pool using…
Descriptors: Automation, Test Construction, Test Format, Item Banks
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2015
Person-fit assessment may help the researcher to obtain additional information regarding the answering behavior of persons. Although several researchers examined person fit, there is a lack of research on person-fit assessment for mixed-format tests. In this article, the lz statistic and the χ² statistic, both of which have been used for tests…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Bayesian Statistics
Wang, Wei – ProQuest LLC, 2013
Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…
Descriptors: Equated Scores, Test Format, Test Items, Test Length
Kopriva, Rebecca; Gabel, David; Cameron, Catherine – Society for Research on Educational Effectiveness, 2011
This presentation explains and illustrates how computer-based innovative test tasks are designed in a multi-semiotic environment to effectively and comprehensibly convey meaning to students, especially English language learners (ELLs), students with learning disabilities (LDs), selected other students with disabilities, and non-identified native…
Descriptors: Middle School Students, Speech Communication, Test Format, Reading
Pechenizkiy, Mykola; Trcka, Nikola; Vasilyeva, Ekaterina; van der Aalst, Wil; De Bra, Paul – International Working Group on Educational Data Mining, 2009
Traditional data mining techniques have been extensively applied to find interesting patterns, build descriptive and predictive models from large volumes of data accumulated through the use of different information systems. The results of data mining can be used for getting a better understanding of the underlying educational processes, for…
Descriptors: Data Analysis, Methods, Computer Software, Computer Assisted Testing
Meng, Huijuan – ProQuest LLC, 2007
The purpose of this dissertation was to investigate how different Item Response Theory (IRT)-based calibration methods affect student achievement growth pattern recovery. Ninety-six vertical scales (4 x 2 x 2 x 2 x 3) were constructed using different combinations of IRT calibration methods (separate, pair-wise concurrent, semi-concurrent, and…
Descriptors: Scaling, Test Format, Item Response Theory, Methods
Anakwe, Bridget – Journal of Education for Business, 2008
The author investigated the impact of assessment methods on student performance on accounting tests. Specifically, the author used analysis of variance to determine whether the use of computer-based tests instead of paper-based tests affects students' traditional test scores in accounting examinations. The author included 2 independent variables,…
Descriptors: Student Evaluation, Testing, Statistical Analysis, Methods
Ferne, Tracy; Rupp, Andre A. – Language Assessment Quarterly, 2007
This article reviews research on differential item functioning (DIF) in language testing conducted primarily between 1990 and 2005 with an eye toward providing methodological guidelines for developing, conducting, and disseminating research in this area. The article contains a synthesis of 27 studies with respect to five essential sets of…
Descriptors: Test Bias, Evaluation Research, Testing, Language Tests

