Publication Date
| Date Range | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 6 |
| Since 2017 (last 10 years) | 10 |
| Since 2007 (last 20 years) | 20 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Bayesian Statistics | 65 |
| Test Construction | 65 |
| Test Items | 28 |
| Adaptive Testing | 21 |
| Computer Assisted Testing | 21 |
| Item Response Theory | 16 |
| Mathematical Models | 16 |
| Simulation | 14 |
| Comparative Analysis | 13 |
| Item Analysis | 13 |
| Latent Trait Theory | 13 |
Author
| Author | Records |
| --- | --- |
| Reckase, Mark D. | 3 |
| Weiss, David J. | 3 |
| Almond, Russell G. | 2 |
| Glas, Cees A. W. | 2 |
| Hambleton, Ronald K. | 2 |
| Huynh, Huynh | 2 |
| McKinley, Robert L. | 2 |
| Rock, Donald A. | 2 |
| Rudner, Lawrence M. | 2 |
| Sinharay, Sandip | 2 |
| Vos, Hans J. | 2 |
Education Level
| Education Level | Records |
| --- | --- |
| Higher Education | 3 |
| Postsecondary Education | 2 |
| Early Childhood Education | 1 |
| Elementary Education | 1 |
| Elementary Secondary Education | 1 |
| High Schools | 1 |
| Kindergarten | 1 |
| Middle Schools | 1 |
| Primary Education | 1 |
Audience
| Audience | Records |
| --- | --- |
| Researchers | 2 |
Location
| Location | Records |
| --- | --- |
| Canada | 1 |
| Germany (Berlin) | 1 |
| Sweden | 1 |
Zhang, Susu; Li, Anqi; Wang, Shiyu – Educational Measurement: Issues and Practice, 2023
In computer-based tests allowing revision and reviews, examinees' sequence of visits and answer changes to questions can be recorded. The variable-length revision log data introduce new complexities to the collected data but, at the same time, provide additional information on examinees' test-taking behavior, which can inform test development and…
Descriptors: Computer Assisted Testing, Test Construction, Test Wiseness, Test Items
Eray Selçuk; Ergül Demir – International Journal of Assessment Tools in Education, 2024
This research aims to compare the ability and item parameter estimates of Item Response Theory obtained with maximum likelihood and Bayesian approaches under different Monte Carlo simulation conditions. For this purpose, depending on changes in the prior distribution type, sample size, test length, and logistic model, the ability and item…
Descriptors: Item Response Theory, Item Analysis, Test Items, Simulation
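The comparison in the entry above can be made concrete with a small illustration. The sketch below is not the authors' simulation design; it uses simulated data and an assumed normal prior to estimate a single Rasch item difficulty by maximum likelihood and by a Bayesian (MAP) criterion, showing the mechanics that distinguish the two estimators.

```python
# Minimal sketch (not the study's code): MLE vs. Bayesian MAP estimation of a
# single Rasch item difficulty, assuming examinee abilities are known.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import expit  # logistic function

rng = np.random.default_rng(1)
theta = rng.normal(0.0, 1.0, size=200)       # simulated abilities
true_b = 0.8                                  # simulated item difficulty
x = rng.binomial(1, expit(theta - true_b))    # simulated Rasch responses

def neg_log_lik(b):
    p = expit(theta - b)
    return -np.sum(x * np.log(p) + (1 - x) * np.log1p(-p))

def neg_log_post(b, prior_mean=0.0, prior_sd=1.0):
    # Adds an assumed N(prior_mean, prior_sd^2) prior on difficulty.
    return neg_log_lik(b) + 0.5 * ((b - prior_mean) / prior_sd) ** 2

b_mle = minimize_scalar(neg_log_lik, bounds=(-4, 4), method="bounded").x
b_map = minimize_scalar(neg_log_post, bounds=(-4, 4), method="bounded").x
print(f"true b = {true_b:.2f}, MLE = {b_mle:.2f}, MAP = {b_map:.2f}")
```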
Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023
Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…
Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models
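As context for the knowledge-versus-guessing decomposition described above, here is a deliberately simplified illustration (an assumption of this note, not one of the probabilistic models the paper evaluates, which also account for blunders): an examinee either knows an item with probability k or guesses uniformly among m options, which inverts to the classic correction-for-guessing estimate.

```python
# Illustrative knowledge-plus-guessing model for multiple-choice items.
def prob_correct(k, m):
    """P(correct) = knows the answer, or guesses right among m options."""
    return k + (1.0 - k) / m

def knowledge_from_score(p_obs, m):
    """Invert the model: correction-for-guessing estimate of knowledge k."""
    return max(0.0, (p_obs - 1.0 / m) / (1.0 - 1.0 / m))

# Example: 70% observed correct on 4-option items implies roughly 60% "known".
print(knowledge_from_score(0.70, 4))  # 0.6
```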
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
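The adaptive process referred to in the entry above can be sketched in a few lines. The following is an assumed unidimensional simplification (the study itself concerns multidimensional CAT designs): 2PL items, maximum Fisher information item selection, expected a posteriori (EAP) ability updates on a grid, and a fixed-length termination rule.

```python
# Minimal unidimensional CAT sketch: 2PL items, max-information selection,
# grid-based EAP ability updates, fixed test length. All data are simulated.
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(7)
a = rng.uniform(0.8, 2.0, size=300)      # simulated discriminations
b = rng.normal(0.0, 1.0, size=300)       # simulated difficulties
true_theta = 0.5
grid = np.linspace(-4, 4, 81)
posterior = np.exp(-0.5 * grid**2)       # standard normal prior (unnormalized)

def info(theta, a_j, b_j):
    p = expit(a_j * (theta - b_j))
    return a_j**2 * p * (1 - p)          # 2PL Fisher information

theta_hat, used = 0.0, []
for _ in range(20):                       # fixed test length of 20 items
    avail = [j for j in range(len(a)) if j not in used]
    j = max(avail, key=lambda j: info(theta_hat, a[j], b[j]))
    x = rng.binomial(1, expit(a[j] * (true_theta - b[j])))    # simulated answer
    p_grid = expit(a[j] * (grid - b[j]))
    posterior *= p_grid if x == 1 else (1 - p_grid)           # Bayes update
    theta_hat = np.sum(grid * posterior) / np.sum(posterior)  # EAP estimate
    used.append(j)

print(f"true theta = {true_theta:.2f}, EAP after 20 items = {theta_hat:.2f}")
```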
Thomas, Sarah; Eichas, Kyle; Eninger, Lilianne; Ferrer-Wreder, Laura – Scandinavian Journal of Educational Research, 2021
This cross-sectional study established the psychometric properties and factor structure of the Preschool and Kindergarten Behavior Scales (PKBS) and an index of empathy in a sample of Swedish four- to six-year-olds (N = 115). Using Bayesian structural equation modeling, we found that a five-factor PKBS and one-factor empathy model provided good fit…
Descriptors: Psychometrics, Swedish, Foreign Countries, Test Construction
The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Bao, Lei; Koenig, Kathleen; Xiao, Yang; Fritchman, Joseph; Zhou, Shaona; Chen, Cheng – Physical Review Physics Education Research, 2022
Abilities in scientific thinking and reasoning have been emphasized as core areas of initiatives, such as the Next Generation Science Standards or the College Board Standards for College Success in Science, which focus on the skills the future will demand of today's students. Although there is rich literature on studies of how these abilities…
Descriptors: Physics, Science Instruction, Teaching Methods, Thinking Skills
Silva, R. M.; Guan, Y.; Swartz, T. B. – Journal on Efficiency and Responsibility in Education and Science, 2017
This paper attempts to bridge the gap between classical test theory and item response theory. It is demonstrated that the familiar and popular statistics used in classical test theory can be translated into a Bayesian framework where all of the advantages of the Bayesian paradigm can be realized. In particular, prior opinion can be introduced and…
Descriptors: Item Response Theory, Bayesian Statistics, Test Construction, Markov Processes
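For readers unfamiliar with the translation the entry above demonstrates, here is a minimal conjugate example (an illustration under assumed prior values, not the paper's derivations): the classical proportion-correct item statistic re-expressed as a Beta-Binomial posterior, so that prior opinion can be introduced.

```python
# Classical test theory item difficulty (proportion correct) vs. a Bayesian
# posterior mean under an assumed, weakly informative Beta(2, 2) prior.
correct, n = 34, 50                 # illustrative observed item responses
alpha0, beta0 = 2, 2                # assumed prior parameters

p_classical = correct / n
alpha_post = alpha0 + correct
beta_post = beta0 + (n - correct)
p_bayes = alpha_post / (alpha_post + beta_post)   # posterior mean

print(f"classical p = {p_classical:.3f}, Bayesian posterior mean = {p_bayes:.3f}")
```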
Kim, Yoon Jeon; Almond, Russell G.; Shute, Valerie J. – International Journal of Testing, 2016
Game-based assessment (GBA) is a specific use of educational games that employs game activities to elicit evidence for educationally valuable skills and knowledge. While this approach can provide individualized and diagnostic information about students, the design and development of assessment mechanics for a GBA is a nontrivial task. In this…
Descriptors: Design, Evidence Based Practice, Test Construction, Physics
Ting, Mu Yu – EURASIA Journal of Mathematics, Science & Technology Education, 2017
Using the capabilities of expert knowledge structures, the researcher prepared test questions on the university calculus topic of "finding the area by integration." The quiz is divided into two types of multiple choice items (one out of four and one out of many). After the calculus course was taught and tested, the results revealed that…
Descriptors: Calculus, Mathematics Instruction, College Mathematics, Multiple Choice Tests
Chen, Ping – Journal of Educational and Behavioral Statistics, 2017
Calibration of new items online has been an important topic in item replenishment for multidimensional computerized adaptive testing (MCAT). Several online calibration methods have been proposed for MCAT, such as multidimensional "one expectation-maximization (EM) cycle" (M-OEM) and multidimensional "multiple EM cycles"…
Descriptors: Test Items, Item Response Theory, Test Construction, Adaptive Testing
Li, Tongyun; Jiao, Hong; Macready, George B. – Educational and Psychological Measurement, 2016
The present study investigates different approaches to adding covariates and their impact on fitting mixture item response theory models. Mixture item response theory models serve as an important methodology for tackling several psychometric issues in test development, including the detection of latent differential item functioning. A Monte Carlo…
Descriptors: Item Response Theory, Psychometrics, Test Construction, Monte Carlo Methods
Ross, Andrew M. – College Mathematics Journal, 2012
Computing the probability of having a disease, given a positive test result, is a standard probability problem; the sensitivity and specificity of the test must be given, along with the prevalence of the disease. We ask how a test-maker might determine the tradeoff between sensitivity and specificity. Adding hypothetical costs for detecting or failing to…
Descriptors: Diseases, Probability, Bayesian Statistics, Test Construction
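The standard problem described in the entry above is easy to make concrete. The sketch below (with assumed illustrative numbers, not taken from the article) computes the positive predictive value from sensitivity, specificity, and prevalence via Bayes' theorem.

```python
# Bayes' theorem for diagnostic testing: P(disease | positive test).
def positive_predictive_value(sensitivity, specificity, prevalence):
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Assumed example: 95% sensitive, 90% specific test, 1% disease prevalence.
print(round(positive_predictive_value(0.95, 0.90, 0.01), 3))  # ~0.088
```

Even with a fairly accurate test, a low prevalence keeps the posterior probability small, which is why the sensitivity-specificity tradeoff the article examines matters for test-makers.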
Scherer, Ronny; Meßinger-Koppelt, Jenny; Tiemann, Rüdiger – International Journal of STEM Education, 2014
Background: Complex problem-solving competence is regarded as a key construct in science education. But due to the necessity of using interactive and intransparent assessment procedures, appropriate measures of the construct are rare. This paper consequently presents the development and validation of a computer-based problem-solving environment,…
Descriptors: Computer Assisted Testing, Problem Solving, Chemistry, Science Tests
He, Wei; Reckase, Mark D. – Educational and Psychological Measurement, 2014
For computerized adaptive tests (CATs) to work well, they must have an item pool with sufficient numbers of good quality items. Many researchers have pointed out that, in developing item pools for CATs, not only is the item pool size important but also the distribution of item parameters and practical considerations such as content distribution…
Descriptors: Item Banks, Test Length, Computer Assisted Testing, Adaptive Testing
