Zhang, Susu; Li, Anqi; Wang, Shiyu – Educational Measurement: Issues and Practice, 2023
In computer-based tests allowing revision and reviews, examinees' sequence of visits and answer changes to questions can be recorded. The variable-length revision log data introduce new complexities to the collected data but, at the same time, provide additional information on examinees' test-taking behavior, which can inform test development and…
Descriptors: Computer Assisted Testing, Test Construction, Test Wiseness, Test Items
Eray Selçuk; Ergül Demir – International Journal of Assessment Tools in Education, 2024
This research aims to compare the ability and item parameter estimations of Item Response Theory according to maximum likelihood and Bayesian approaches in different Monte Carlo simulation conditions. For this purpose, depending on the changes in the prior distribution type, sample size, test length, and logistic model, the ability and item…
Descriptors: Item Response Theory, Item Analysis, Test Items, Simulation
Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023
Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…
Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Thomas, Sarah; Eichas, Kyle; Eninger, Lilianne; Ferrer-Wreder, Laura – Scandinavian Journal of Educational Research, 2021
This cross-sectional study established the psychometric properties and factor structure of the Preschool and Kindergarten Behavior Scales (PKBS) and an index of empathy in a sample of Swedish four- to six-year-olds (N = 115). Using Bayesian structural equation modeling, we found that a five-factor PKBS and one-factor empathy model provided good fit…
Descriptors: Psychometrics, Swedish, Foreign Countries, Test Construction
The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Bao, Lei; Koenig, Kathleen; Xiao, Yang; Fritchman, Joseph; Zhou, Shaona; Chen, Cheng – Physical Review Physics Education Research, 2022
Abilities in scientific thinking and reasoning have been emphasized as core areas of initiatives, such as the Next Generation Science Standards or the College Board Standards for College Success in Science, which focus on the skills the future will demand of today's students. Although there is rich literature on studies of how these abilities…
Descriptors: Physics, Science Instruction, Teaching Methods, Thinking Skills
Silva, R. M.; Guan, Y.; Swartz, T. B. – Journal on Efficiency and Responsibility in Education and Science, 2017
This paper attempts to bridge the gap between classical test theory and item response theory. It is demonstrated that the familiar and popular statistics used in classical test theory can be translated into a Bayesian framework where all of the advantages of the Bayesian paradigm can be realized. In particular, prior opinion can be introduced and…
Descriptors: Item Response Theory, Bayesian Statistics, Test Construction, Markov Processes
Ting, Mu Yu – EURASIA Journal of Mathematics, Science & Technology Education, 2017
Using the capabilities of expert knowledge structures, the researcher prepared test questions on the university calculus topic of "finding the area by integration." The quiz is divided into two types of multiple choice items (one out of four and one out of many). After the calculus course was taught and tested, the results revealed that…
Descriptors: Calculus, Mathematics Instruction, College Mathematics, Multiple Choice Tests
Chen, Ping – Journal of Educational and Behavioral Statistics, 2017
Calibration of new items online has been an important topic in item replenishment for multidimensional computerized adaptive testing (MCAT). Several online calibration methods have been proposed for MCAT, such as multidimensional "one expectation-maximization (EM) cycle" (M-OEM) and multidimensional "multiple EM cycles"…
Descriptors: Test Items, Item Response Theory, Test Construction, Adaptive Testing

