ERIC - Search Results

Showing 1 to 15 of 27 results

Evaluation of Factors Affecting the Performance of the "S-X²" Item-Fit Index

Peer reviewed

Kim, Hyung Jin; Lee, Won-Chan – Journal of Educational Measurement, 2022

Orlando and Thissen (2000) introduced the "S-X²" item-fit index for testing goodness-of-fit with dichotomous item response theory (IRT) models. This study considers and evaluates an alternative approach for computing "S-X²" values and other factors associated with collapsing tables of observed…

Descriptors: Goodness of Fit, Test Items, Item Response Theory, Computation

Measuring Language Ability of Students with Compensatory Multidimensional CAT: A Post-Hoc Simulation Study

Peer reviewed

Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022

The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…

Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency

A Shorter Short Version of Barron's Ego Strength Scale

Peer reviewed

Kelly, William E.; Daughtry, Don – College Student Journal, 2018

This study developed an abbreviated form of Barron's (1953) Ego Strength Scale for use in research among college student samples. A version of Barron's scale was administered to 100 undergraduate college students. Using item-total score correlations and internal consistency, the scale was reduced to 18 items (Es18). The Es18 possessed adequate…

Descriptors: Undergraduate Students, Self Concept Measures, Test Length, Scores

A Comparison of Score Aggregation Methods for Unidimensional Tests on Different Dimensions. Research Report. ETS RR-18-01

Peer reviewed
PDF on ERIC

Fu, Jianbin; Feng, Yuling – ETS Research Report Series, 2018

In this study, we propose aggregating test scores with unidimensional within-test structure and multidimensional across-test structure based on a 2-level, 1-factor model. In particular, we compare 6 score aggregation methods: average of standardized test raw scores (M1), regression factor score estimate of the 1-factor model based on the…

Descriptors: Comparative Analysis, Scores, Correlation, Standardized Tests

Designing CAT MOCCA: Guiding Principles and Simulation Research. MOCCA Technical Report MTR-2021-1

Peer reviewed
PDF on ERIC

Mark L. Davison; David J. Weiss; Ozge Ersan; Joseph N. DeWeese; Gina Biancarosa; Patrick C. Kennedy – Grantee Submission, 2021

MOCCA is an online assessment of inferential reading comprehension for students in 3rd through 6th grades. It can be used to identify good readers and, for struggling readers, identify those who overly rely on either a Paraphrasing process or an Elaborating process when their comprehension is incorrect. Here a propensity to over-rely on…

Descriptors: Reading Tests, Computer Assisted Testing, Reading Comprehension, Elementary School Students

Profile Analyses as Feedback by Evaluating the Balance in Exam Scores

Peer reviewed
PDF on ERIC

Vaheoja, Monika; Verhelst, N. D.; Eggen, T.J.H.M. – European Journal of Science and Mathematics Education, 2019

In this article, the authors applied profile analysis to Maths exam data to demonstrate how different exam forms, differing in difficulty and length, can be reported and easily interpreted. The results were presented for different groups of participants and for different institutions in different Maths domains by evaluating the balance. Some…

Descriptors: Feedback (Response), Foreign Countries, Statistical Analysis, Scores

Assessing the Performance of Classical Test Theory Item Discrimination Estimators in Monte Carlo Simulations

Peer reviewed

Bazaldua, Diego A. Luna; Lee, Young-Sun; Keller, Bryan; Fellers, Lauren – Asia Pacific Education Review, 2017

The performance of various classical test theory (CTT) item discrimination estimators has been compared in the literature using both empirical and simulated data, resulting in mixed results regarding the preference of some discrimination estimators over others. This study analyzes the performance of various item discrimination estimators in CTT:…

Descriptors: Test Items, Monte Carlo Methods, Item Response Theory, Correlation

Item Parameter Drift in a Time-Varying Predictor

Peer reviewed

Lee, HyeSun – Applied Measurement in Education, 2018

The current simulation study examined the effects of Item Parameter Drift (IPD) occurring in a short scale on parameter estimates in multilevel models where scores from a scale were employed as a time-varying predictor to account for outcome scores. Five factors, including three decisions about IPD, were considered for simulation conditions. It…

Descriptors: Test Items, Hierarchical Linear Modeling, Predictor Variables, Scores

Identifying Aberrant Responding: Use of Multiple Measures

Steinkamp, Susan Christa – ProQuest LLC, 2017

For test scores that rely on the accurate estimation of ability via an IRT model, their use and interpretation is dependent upon the assumption that the IRT model fits the data. Examinees who do not put forth full effort in answering test questions, have prior knowledge of test content, or do not approach a test with the intent of answering…

Descriptors: Test Items, Item Response Theory, Scores, Test Wiseness

Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

Peer reviewed

Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017

Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…

Descriptors: Test Bias, Test Reliability, Performance, Scores

Identifying Sets of Maximally Efficient Items from the Academic Competence Evaluation Scales-Teacher Form

Peer reviewed

Anthony, Christopher James; DiPerna, James Clyde – School Psychology Quarterly, 2017

The Academic Competence Evaluation Scales-Teacher Form (ACES-TF; DiPerna & Elliott, 2000) was developed to measure student academic skills and enablers (interpersonal skills, engagement, motivation, and study skills). Although ACES-TF scores have demonstrated psychometric adequacy, the length of the measure may be prohibitive for certain…

Descriptors: Test Items, Efficiency, Item Response Theory, Test Length

Broadening the Scope of Reading Comprehension Using Scenario-Based Assessments: Preliminary Findings and Challenges

Peer reviewed
PDF on ERIC

Sabatini, J.; O'Reilly, T.; Halderman, L.; Bruce, K. – Grantee Submission, 2014

Existing reading comprehension assessments have been criticized by researchers, educators, and policy makers, especially regarding their coverage, utility, and authenticity. The purpose of the current study was to evaluate a new assessment of reading comprehension that was designed to broaden the construct of reading. In light of these issues, we…

Descriptors: Reading Comprehension, Vignettes, Reading Tests, Elementary School Students

Comparing the Performance of Five Multidimensional CAT Selection Procedures with Different Stopping Rules

Peer reviewed

Yao, Lihua – Applied Psychological Measurement, 2013

Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection

Test Length and Decision Quality in Personnel Selection: When Is Short Too Short?

Peer reviewed

Kruyen, Peter M.; Emons, Wilco H. M.; Sijtsma, Klaas – International Journal of Testing, 2012

Personnel selection shows an enduring need for short stand-alone tests consisting of, say, 5 to 15 items. Despite their efficiency, short tests are more vulnerable to measurement error than longer test versions. Consequently, the question arises to what extent reducing test length deteriorates decision quality due to increased impact of…

Descriptors: Measurement, Personnel Selection, Decision Making, Error of Measurement

Bi-Factor Multidimensional Item Response Theory Modeling for Subscores Estimation, Reliability, and Classification

Md Desa, Zairul Nor Deana – ProQuest LLC, 2012

In recent years, there has been increasing interest in estimating and improving subscore reliability. In this study, the multidimensional item response theory (MIRT) and the bi-factor model were combined to estimate subscores, to obtain subscores reliability, and subscores classification. Both the compensatory and partially compensatory MIRT…

Descriptors: Item Response Theory, Computation, Reliability, Classification
