ERIC - Search Results

Showing all 13 results

Dissecting Knowledge, Guessing, and Blunder in Multiple Choice Assessments

Peer reviewed

Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023

Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…

Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models

The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues

Peer reviewed

Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022

How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…

Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making

Different Approaches to Covariate Inclusion in the Mixture Rasch Model

Peer reviewed

Li, Tongyun; Jiao, Hong; Macready, George B. – Educational and Psychological Measurement, 2016

The present study investigates different approaches to adding covariates and the impact in fitting mixture item response theory models. Mixture item response theory models serve as an important methodology for tackling several psychometric issues in test development, including the detection of latent differential item functioning. A Monte Carlo…

Descriptors: Item Response Theory, Psychometrics, Test Construction, Monte Carlo Methods

Paradoxical Results and Item Bundles

Peer reviewed

Hooker, Giles; Finkelman, Matthew – Psychometrika, 2010

Hooker, Finkelman, and Schwartzman ("Psychometrika," 2009, in press) defined a paradoxical result as the attainment of a higher test score by changing answers from correct to incorrect and demonstrated that such results are unavoidable for maximum likelihood estimates in multidimensional item response theory. The potential for these results to…

Descriptors: Models, Scores, Item Response Theory, Psychometrics

Bayesian Network Models for Local Dependence among Observable Outcome Variables

Peer reviewed

Almond, Russell G.; Mulder, Joris; Hemat, Lisa A.; Yan, Duanli – Journal of Educational and Behavioral Statistics, 2009

Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task, which may be dependent. This article explores four design patterns for modeling locally dependent observations: (a) no context--ignores dependence among observables; (b) compensatory context--introduces…

Descriptors: Bayesian Statistics, Models, Observation, Experiments

Modeling Change in Large-Scale Longitudinal Studies of Educational Growth: Four Decades of Contributions to the Assessment of Educational Growth. Research Report. ETS RR-12-04. ETS R&D Scientific and Policy Contributions Series. ETS SPC-12-01

Peer reviewed

Rock, Donald A. – ETS Research Report Series, 2012

This paper provides a history of ETS's role in developing assessment instruments and psychometric procedures for measuring change in large-scale national assessments funded by the Longitudinal Studies branch of the National Center for Education Statistics. It documents the innovations developed during more than 30 years of working with…

Descriptors: Models, Educational Change, Longitudinal Studies, Educational Development

14 Conversations about Three Things

Peer reviewed

Wainer, Howard – Journal of Educational and Behavioral Statistics, 2010

In this essay, the author tries to look forward into the 21st century to divine three things: (i) What skills will researchers in the future need to solve the most pressing problems? (ii) What are some of the most likely candidates to be those problems? and (iii) What are some current areas of research that seem mined out and should not distract…

Descriptors: Research Skills, Researchers, Internet, Access to Information

The Effect of Examinee Motivation on Test Construction within an IRT Framework

Peer reviewed

van Barneveld, Christina – Applied Psychological Measurement, 2007

The purpose of this study is to examine the effects of a false assumption regarding the motivation of examinees on test construction. Simulated data were generated using two models of item responses (the three-parameter logistic item response model alone and in combination with Wise's examinee persistence model) and were calibrated using a…

Descriptors: Test Construction, Item Response Theory, Models, Bayesian Statistics

A Bayesian Random Effects Model for Testlets.

Peer reviewed

Bradlow, Eric T.; Wainer, Howard; Wang, Xiaohui – Psychometrika, 1999

Proposes a parametric approach that involves a modification of standard Item Response Theory models that explicitly accounts for the nesting of items within the same testlets and that can be applied to multiple-choice sections comprising a mixture of independent items and testlets. (Author/SLD)

Descriptors: Bayesian Statistics, Item Response Theory, Models, Multiple Choice Tests

The Effects of Examinee Motivation on Multiple-Choice Item Parameter Estimates

Peer reviewed

van Barneveld, Christina – Alberta Journal of Educational Research, 2003

The purpose of this study was to examine the potential effect of false assumptions regarding the motivation of examinees on item calibration and test construction. A simulation study was conducted using data generated by means of several models of examinee item response behaviors (the three-parameter logistic model alone and in combination with…

Descriptors: Simulation, Motivation, Computation, Test Construction

The Role of Instructional Sensitivity in the Empirical Review of Criterion-Referenced Test Items.

Haladyna, Tom; Roid, Gale – 1980

An empirical review of test items is described as an essential step in criterion-referenced test development. The concept of test items' instructional sensitivity is introduced, and research is briefly reviewed which describes four theoretical contexts in which instructional sensitivity indexes have been observed: criterion-referenced; classical…

Descriptors: Achievement Tests, Bayesian Statistics, Course Objectives, Criterion Referenced Tests

Model Diagnostics for Bayesian Networks

Peer reviewed

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2006

Bayesian networks are frequently used in educational assessments, primarily for learning about students' knowledge and skills. Little work exists on assessing the fit of Bayesian networks. This article employs the posterior predictive model checking method, a popular Bayesian model checking tool, to assess the fit of simple Bayesian networks. A…

Descriptors: Models, Educational Assessment, Diagnostic Tests, Evaluation Methods

Proceedings of the First Conference on Computerized Adaptive Testing (Washington, D.C., June 12-13, 1975).

Clark, Cynthia L., Ed. – 1976

The principal objectives of this conference were to exchange information, discuss theoretical and empirical developments, and coordinate research efforts. The papers and their authors are: "The Graded Response Model of Latent Trait Theory and Tailored Testing" by Fumiko Samejima; "Incomplete Orders and Computerized Testing" by…

Descriptors: Ability, Adaptive Testing, Bayesian Statistics, Branching
