Showing all 14 results
Peer reviewed
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
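The effort-moderated (EM) scoring idea mentioned in the abstract above can be illustrated with a minimal sketch: responses whose response time falls below an item-level rapid-guessing threshold are treated as noneffortful and excluded from scoring. All function names, thresholds, and data here are invented for illustration; the published procedure operates within an IRT model rather than on number-correct scores.

```python
def em_score(responses, times, thresholds):
    """Number-correct score computed over effortful responses only.

    responses:  1/0 correctness per item
    times:      response time (seconds) per item
    thresholds: rapid-guessing time threshold per item (hypothetical values)
    """
    score = 0
    n_effortful = 0
    for correct, rt, thresh in zip(responses, times, thresholds):
        if rt < thresh:
            # Response time below threshold: flag as rapid guess, drop it
            continue
        n_effortful += 1
        score += correct
    return score, n_effortful

# Item 2 (1.5 s) falls below its 3 s threshold and is excluded
s, n = em_score([1, 0, 1, 1], [12.0, 1.5, 9.0, 10.0], [3.0, 3.0, 3.0, 3.0])
```

The study summarized above asks what happens to such unidimensional EM scoring when the rapid-guessing behavior itself is multidimensional.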
Peer reviewed
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
Peer reviewed
Zwick, Rebecca; Ye, Lei; Isham, Steven – Journal of Educational Measurement, 2018
In typical differential item functioning (DIF) assessments, an item's DIF status is not influenced by its status in previous test administrations. An item that has shown DIF at multiple administrations may be treated the same way as an item that has shown DIF in only the most recent administration. Therefore, much useful information about the…
Descriptors: Test Bias, Testing, Test Items, Bayesian Statistics
Tingir, Seyfullah – ProQuest LLC, 2019
Educators use various statistical techniques to explain relationships between latent and observable variables. One way to model these relationships is to use Bayesian networks as a scoring model. However, adjusting the conditional probability tables (CPT-parameters) to fit a set of observations is still a challenge when using Bayesian networks. A…
Descriptors: Bayesian Statistics, Statistical Analysis, Scoring, Probability
Peer reviewed
Foster, Colin – International Journal of Science and Mathematics Education, 2022
Confidence assessment (CA) involves students stating alongside each of their answers a confidence rating (e.g. 0 low to 10 high) to express how certain they are that their answer is correct. Each student's score is calculated as the sum of the confidence ratings on the items that they answered correctly, minus the sum of the confidence ratings on…
Descriptors: Mathematics Tests, Mathematics Education, Secondary School Students, Meta Analysis
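The confidence-assessment scoring rule described in the abstract above is explicit enough to sketch directly: a student's score is the sum of confidence ratings on correctly answered items minus the sum on incorrectly answered ones. The function name and example data below are invented for illustration.

```python
def ca_score(answers_correct, confidences):
    """Confidence-assessment score: +confidence for each correct answer,
    -confidence for each incorrect answer (ratings on a 0-10 scale)."""
    return sum(c if ok else -c for ok, c in zip(answers_correct, confidences))

# Correct with confidence 8, wrong with confidence 3, correct with confidence 5:
# score = 8 - 3 + 5 = 10
ca_score([True, False, True], [8, 3, 5])
```

Note how the rule penalizes overconfidence on wrong answers: a confident wrong answer costs more than a tentative one.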
Peer reviewed
He, Wei; Wolfe, Edward W. – Educational and Psychological Measurement, 2012
In administration of individually administered intelligence tests, items are commonly presented in a sequence of increasing difficulty, and test administration is terminated after a predetermined number of incorrect answers. This practice produces stochastically censored data, a form of nonignorable missing data. By manipulating four factors…
Descriptors: Individual Testing, Intelligence Tests, Test Items, Test Length
Peer reviewed
Rudner, Lawrence M. – Practical Assessment, Research & Evaluation, 2009
This paper describes and evaluates the use of measurement decision theory (MDT) to classify examinees based on their item response patterns. The model has a simple framework that starts with the conditional probabilities of examinees in each category or mastery state responding correctly to each item. The presented evaluation investigates: (1) the…
Descriptors: Classification, Scoring, Item Response Theory, Measurement
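The measurement decision theory (MDT) framework summarized above starts from conditional probabilities of a correct response given each mastery state; a simple decision rule then assigns the examinee to the state maximizing the prior times the product of conditional response probabilities. The sketch below is a hypothetical illustration with invented states and numbers, not the paper's exact evaluation setup.

```python
def mdt_classify(responses, p_correct, priors):
    """Assign the mastery state with the largest posterior mass.

    responses: 1/0 per item
    p_correct: dict mapping state -> list of P(correct | state) per item
    priors:    dict mapping state -> prior probability of that state
    """
    best_state, best_post = None, -1.0
    for state, prior in priors.items():
        post = prior
        for x, p in zip(responses, p_correct[state]):
            # Multiply in P(observed response | state) item by item
            post *= p if x == 1 else (1.0 - p)
        if post > best_post:
            best_state, best_post = state, post
    return best_state

# Invented two-state example: masters answer correctly far more often
p = {"master": [0.9, 0.8, 0.85], "nonmaster": [0.4, 0.3, 0.35]}
mdt_classify([1, 1, 0], p, {"master": 0.5, "nonmaster": 0.5})
```

With equal priors, the pattern correct-correct-incorrect is more probable under the master state (0.5 x 0.9 x 0.8 x 0.15) than the nonmaster state (0.5 x 0.4 x 0.3 x 0.65), so the examinee is classified as a master.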
van der Linden, Wim J.; Vos, Hans J. – 1994
This paper presents some Bayesian theories of simultaneous optimization of decision rules for test-based decisions. Simultaneous decision making arises when an institution has to make a series of selection, placement, or mastery decisions with respect to subjects from a population. An obvious example is the use of individualized instruction in…
Descriptors: Bayesian Statistics, Decision Making, Foreign Countries, Scores
Peer reviewed
Mariano, Louis T.; Junker, Brian W. – Journal of Educational and Behavioral Statistics, 2007
When constructed response test items are scored by more than one rater, the repeated ratings allow for the consideration of individual rater bias and variability in estimating student proficiency. Several hierarchical models based on item response theory have been introduced to model such effects. In this article, the authors demonstrate how these…
Descriptors: Test Items, Item Response Theory, Rating Scales, Scoring
PDF pending restoration
Green, Bert F. – 2002
Maximum likelihood and Bayesian estimates of proficiency, typically used in adaptive testing, use item weights that depend on test taker proficiency to estimate test taker proficiency. In this study, several methods were explored through computer simulation using fixed item weights, which depend mainly on the item's difficulty. The simpler scores…
Descriptors: Adaptive Testing, Bayesian Statistics, Computer Assisted Testing, Computer Simulation
Yen, Wendy M. – 1982
The three-parameter logistic model discussed was used by CTB/McGraw-Hill in the development of the Comprehensive Tests of Basic Skills, Form U (CTBS/U) and the Test of Cognitive Skills (TCS), published in the fall of 1981. The development, standardization, and scoring of the tests are described, particularly as these procedures were influenced by…
Descriptors: Achievement Tests, Bayesian Statistics, Cognitive Processes, Data Collection
Peer reviewed
Segall, Daniel O. – Psychometrika, 1996
Maximum likelihood and Bayesian procedures are presented for item selection and scoring of multidimensional adaptive tests. A demonstration with simulated response data illustrates that multidimensional adaptive testing can provide equal or higher reliabilities with fewer items than are required in one-dimensional adaptive testing. (SLD)
Descriptors: Adaptive Testing, Bayesian Statistics, Computer Assisted Testing, Equations (Mathematics)
Peer reviewed
PDF on ERIC
Sinharay, Sandip; Johnson, Matthew – ETS Research Report Series, 2005
"Item models" (LaDuca, Staples, Templeton, & Holzman, 1986) are classes from which it is possible to generate/produce items that are equivalent/isomorphic to other items from the same model (e.g., Bejar, 1996; Bejar, 2002). They have the potential to produce a large number of high-quality items at reduced cost. This paper introduces…
Descriptors: Item Analysis, Test Items, Scoring, Psychometrics
Kirisci, Levent; Hsu, Tse-Chi – 1992
A predictive adaptive testing (PAT) strategy was developed based on statistical predictive analysis, and its feasibility was studied by comparing PAT performance to that of the Flexilevel, Bayesian modal, and expected a posteriori (EAP) strategies in a simulated environment. The proposed adaptive test is based on the idea of using item difficulty…
Descriptors: Adaptive Testing, Bayesian Statistics, Comparative Analysis, Computer Assisted Testing