Showing all 13 results
Peer reviewed
Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023
Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…
Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models
Peer reviewed
PDF on ERIC
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Peer reviewed
Li, Tongyun; Jiao, Hong; Macready, George B. – Educational and Psychological Measurement, 2016
The present study investigates different approaches to adding covariates and the impact in fitting mixture item response theory models. Mixture item response theory models serve as an important methodology for tackling several psychometric issues in test development, including the detection of latent differential item functioning. A Monte Carlo…
Descriptors: Item Response Theory, Psychometrics, Test Construction, Monte Carlo Methods
Peer reviewed
Hooker, Giles; Finkelman, Matthew – Psychometrika, 2010
Hooker, Finkelman, and Schwartzman ("Psychometrika," 2009, in press) defined a paradoxical result as the attainment of a higher test score by changing answers from correct to incorrect and demonstrated that such results are unavoidable for maximum likelihood estimates in multidimensional item response theory. The potential for these results to…
Descriptors: Models, Scores, Item Response Theory, Psychometrics
Peer reviewed
Almond, Russell G.; Mulder, Joris; Hemat, Lisa A.; Yan, Duanli – Journal of Educational and Behavioral Statistics, 2009
Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task, which may be dependent. This article explores four design patterns for modeling locally dependent observations: (a) no context--ignores dependence among observables; (b) compensatory context--introduces…
Descriptors: Bayesian Statistics, Models, Observation, Experiments
Peer reviewed
PDF on ERIC
Rock, Donald A. – ETS Research Report Series, 2012
This paper provides a history of ETS's role in developing assessment instruments and psychometric procedures for measuring change in large-scale national assessments funded by the Longitudinal Studies branch of the National Center for Education Statistics. It documents the innovations developed during more than 30 years of working with…
Descriptors: Models, Educational Change, Longitudinal Studies, Educational Development
Peer reviewed
Wainer, Howard – Journal of Educational and Behavioral Statistics, 2010
In this essay, the author tries to look forward into the 21st century to divine three things: (i) What skills will researchers in the future need to solve the most pressing problems? (ii) What are some of the most likely candidates to be those problems? and (iii) What are some current areas of research that seem mined out and should not distract…
Descriptors: Research Skills, Researchers, Internet, Access to Information
Peer reviewed
van Barneveld, Christina – Applied Psychological Measurement, 2007
The purpose of this study is to examine the effects of a false assumption regarding the motivation of examinees on test construction. Simulated data were generated using two models of item responses (the three-parameter logistic item response model alone and in combination with Wise's examinee persistence model) and were calibrated using a…
Descriptors: Test Construction, Item Response Theory, Models, Bayesian Statistics
Peer reviewed
Bradlow, Eric T.; Wainer, Howard; Wang, Xiaohui – Psychometrika, 1999
Proposes a parametric approach that involves a modification of standard Item Response Theory models that explicitly accounts for the nesting of items within the same testlets and that can be applied to multiple-choice sections comprising a mixture of independent items and testlets. (Author/SLD)
Descriptors: Bayesian Statistics, Item Response Theory, Models, Multiple Choice Tests
Peer reviewed
van Barneveld, Christina – Alberta Journal of Educational Research, 2003
The purpose of this study was to examine the potential effect of false assumptions regarding the motivation of examinees on item calibration and test construction. A simulation study was conducted using data generated by means of several models of examinee item response behaviors (the three-parameter logistic model alone and in combination with…
Descriptors: Simulation, Motivation, Computation, Test Construction
Haladyna, Tom; Roid, Gale – 1980
An empirical review of test items is described as an essential step in criterion-referenced test development. The concept of test items' instructional sensitivity is introduced, and research is briefly reviewed which describes four theoretical contexts in which instructional sensitivity indexes have been observed: criterion-referenced; classical…
Descriptors: Achievement Tests, Bayesian Statistics, Course Objectives, Criterion Referenced Tests
Peer reviewed
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2006
Bayesian networks are frequently used in educational assessments primarily for learning about students' knowledge and skills. There is a lack of works on assessing fit of Bayesian networks. This article employs the posterior predictive model checking method, a popular Bayesian model checking tool, to assess fit of simple Bayesian networks. A…
Descriptors: Models, Educational Assessment, Diagnostic Tests, Evaluation Methods
Clark, Cynthia L., Ed. – 1976
The principal objectives of this conference were to exchange information, discuss theoretical and empirical developments, and to coordinate research efforts. The papers and their authors are: "The Graded Response Model of Latent Trait Theory and Tailored Testing" by Fumiko Samejima; "Incomplete Orders and Computerized Testing" by…
Descriptors: Ability, Adaptive Testing, Bayesian Statistics, Branching