ERIC - Search Results

Showing all 12 results

Evaluating Equating Methods for Varying Levels of Form Difference

Peer reviewed

Sun, Ting; Kim, Stella Yun – Educational and Psychological Measurement, 2024

Equating is a statistical procedure used to adjust for the difference in form difficulty such that scores on those forms can be used and interpreted comparably. In practice, however, equating methods are often implemented without considering the extent to which two forms differ in difficulty. The study aims to examine the effect of the magnitude…

Descriptors: Difficulty Level, Data Interpretation, Equated Scores, High School Students

A Comparison of Common IRT Model-Selection Methods with Mixed-Format Tests

Peer reviewed

Luo, Yong – Measurement: Interdisciplinary Research and Perspectives, 2021

To date, only frequentist model-selection methods have been studied with mixed-format data in the context of IRT model-selection, and it is unknown how popular Bayesian model-selection methods such as DIC, WAIC, and LOO perform. In this study, we present the results of a comprehensive simulation study that compared the performances of eight…

Descriptors: Item Response Theory, Test Format, Selection, Methods

Classification Consistency and Accuracy for Mixed-Format Tests

Peer reviewed

Kim, Stella Y.; Lee, Won-Chan – Applied Measurement in Education, 2019

This study explores classification consistency and accuracy for mixed-format tests using real and simulated data. In particular, the current study compares six methods of estimating classification consistency and accuracy for seven mixed-format tests. The relative performance of the estimation methods is evaluated using simulated data. Study…

Descriptors: Classification, Reliability, Accuracy, Test Format

Evaluating Statistical Targets for Assembling Parallel Mixed-Format Test Forms

Peer reviewed

Debeer, Dries; Ali, Usama S.; van Rijn, Peter W. – Journal of Educational Measurement, 2017

Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…

Descriptors: Test Format, Test Construction, Statistical Analysis, Comparative Analysis

Integrating Test-Form Formatting into Automated Test Assembly

Peer reviewed

Diao, Qi; van der Linden, Wim J. – Applied Psychological Measurement, 2013

Automated test assembly uses the methodology of mixed integer programming to select an optimal set of items from an item bank. Automated test-form generation uses the same methodology to optimally order the items and format the test form. From an optimization point of view, production of fully formatted test forms directly from the item pool using…

Descriptors: Automation, Test Construction, Test Format, Item Banks

Assessment of Person Fit for Mixed-Format Tests

Peer reviewed

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2015

Person-fit assessment may help the researcher to obtain additional information regarding the answering behavior of persons. Although several researchers examined person fit, there is a lack of research on person-fit assessment for mixed-format tests. In this article, the lz statistic and the χ² statistic, both of which have been used for tests…

Descriptors: Test Format, Goodness of Fit, Item Response Theory, Bayesian Statistics

Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

Wang, Wei – ProQuest LLC, 2013

Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

Descriptors: Equated Scores, Test Format, Test Items, Test Length

Designing Dynamic and Interactive Assessments for English Learners That Directly Measure Targeted Science Constructs

Kopriva, Rebecca; Gabel, David; Cameron, Catherine – Society for Research on Educational Effectiveness, 2011

This presentation explains and illustrates how computer-based innovative test tasks are designed in a multi-semiotic environment to effectively and comprehensibly convey meaning to students, especially English language learners (ELLs), students with learning disabilities (LDs), selected other students with disabilities, and non-identified native…

Descriptors: Middle School Students, Speech Communication, Test Format, Reading

Process Mining Online Assessment Data

Pechenizkiy, Mykola; Trcka, Nikola; Vasilyeva, Ekaterina; van der Aalst, Wil; De Bra, Paul – International Working Group on Educational Data Mining, 2009

Traditional data mining techniques have been extensively applied to find interesting patterns, build descriptive and predictive models from large volumes of data accumulated through the use of different information systems. The results of data mining can be used for getting a better understanding of the underlying educational processes, for…

Descriptors: Data Analysis, Methods, Computer Software, Computer Assisted Testing

A Comparison Study of IRT Calibration Methods for Mixed-Format Tests in Vertical Scaling

Meng, Huijuan – ProQuest LLC, 2007

The purpose of this dissertation was to investigate how different Item Response Theory (IRT)-based calibration methods affect student achievement growth pattern recovery. Ninety-six vertical scales (4 x 2 x 2 x 2 x 3) were constructed using different combinations of IRT calibration methods (separate, pair-wise concurrent, semi-concurrent, and…

Descriptors: Scaling, Test Format, Item Response Theory, Methods

Comparison of Student Performance in Paper-Based versus Computer-Based Testing

Peer reviewed

Anakwe, Bridget – Journal of Education for Business, 2008

The author investigated the impact of assessment methods on student performance on accounting tests. Specifically, the author used analysis of variance to determine whether the use of computer-based tests instead of paper-based tests affects students' traditional test scores in accounting examinations. The author included 2 independent variables,…

Descriptors: Student Evaluation, Testing, Statistical Analysis, Methods

A Synthesis of 15 Years of Research on DIF in Language Testing: Methodological Advances, Challenges, and Recommendations

Peer reviewed

Ferne, Tracy; Rupp, Andre A. – Language Assessment Quarterly, 2007

This article reviews research on differential item functioning (DIF) in language testing conducted primarily between 1990 and 2005 with an eye toward providing methodological guidelines for developing, conducting, and disseminating research in this area. The article contains a synthesis of 27 studies with respect to five essential sets of…

Descriptors: Test Bias, Evaluation Research, Testing, Language Tests
