| Publication Date | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 3 |
| Since 2007 (last 20 years) | 7 |
| Descriptor | Count |
| --- | --- |
| Bayesian Statistics | 11 |
| Comparative Analysis | 11 |
| Scoring | 11 |
| Test Items | 5 |
| Adaptive Testing | 4 |
| Difficulty Level | 3 |
| Item Response Theory | 3 |
| Models | 3 |
| Scores | 3 |
| Simulation | 3 |
| Computation | 2 |
| Source | Count |
| --- | --- |
| ETS Research Report Series | 2 |
| Applied Measurement in… | 1 |
| Educational and Psychological… | 1 |
| International Journal of… | 1 |
| Journal of Educational… | 1 |
| National Center for Research… | 1 |
| ProQuest LLC | 1 |
| Author | Count |
| --- | --- |
| Bruce L. Sherin | 1 |
| DeAyala, R. J. | 1 |
| Foster, Colin | 1 |
| Hsu, Tse-Chi | 1 |
| Iseli, Markus R. | 1 |
| Isham, Steven | 1 |
| James W. Stigler | 1 |
| Johnson, Matthew | 1 |
| Kim, Sooyeon | 1 |
| Kim, Stella Yun | 1 |
| Kirisci, Levent | 1 |
| Publication Type | Count |
| --- | --- |
| Reports - Research | 9 |
| Journal Articles | 6 |
| Speeches/Meeting Papers | 2 |
| Dissertations/Theses -… | 1 |
| Reports - Evaluative | 1 |
| Education Level | Count |
| --- | --- |
| Higher Education | 1 |
| Postsecondary Education | 1 |
| Secondary Education | 1 |
| Assessments and Surveys | Count |
| --- | --- |
| Graduate Record Examinations | 1 |
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
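The two baseline approaches this study compares can be illustrated briefly. Below is a minimal Python sketch contrasting number-correct scoring with an IRT theta score computed as an EAP (posterior mean) under a 2PL model; the item parameters, quadrature grid, and responses are made-up assumptions for illustration, not values from the study.

```python
import numpy as np

# Illustrative 2PL item parameters (a = discrimination, b = difficulty); not from the study.
a = np.array([1.0, 1.2, 0.8, 1.5, 0.9])
b = np.array([-1.0, -0.3, 0.0, 0.5, 1.2])
responses = np.array([1, 1, 0, 1, 0])        # one examinee's scored responses

# Number-correct (raw) score: simple sum of correct responses.
number_correct = responses.sum()

# IRT theta score via EAP: posterior mean of theta over a quadrature grid
# with a standard normal prior.
theta = np.linspace(-4, 4, 81)
p = 1.0 / (1.0 + np.exp(-a[:, None] * (theta[None, :] - b[:, None])))   # P(correct | theta)
likelihood = np.prod(np.where(responses[:, None] == 1, p, 1 - p), axis=0)
prior = np.exp(-0.5 * theta**2)
posterior = likelihood * prior
posterior /= posterior.sum()
eap_theta = (theta * posterior).sum()

print(number_correct, round(eap_theta, 3))
```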
Zwick, Rebecca; Ye, Lei; Isham, Steven – Journal of Educational Measurement, 2018
In typical differential item functioning (DIF) assessments, an item's DIF status is not influenced by its status in previous test administrations. An item that has shown DIF at multiple administrations may be treated the same way as an item that has shown DIF in only the most recent administration. Therefore, much useful information about the…
Descriptors: Test Bias, Testing, Test Items, Bayesian Statistics
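The abstract describes pooling DIF evidence across administrations in a Bayesian way. As a rough illustration of that general idea only (not the authors' actual procedure), the sketch below shrinks a noisy current-administration DIF estimate toward a prior built from earlier administrations using normal-normal, inverse-variance weighting; all numbers are hypothetical.

```python
# Hedged sketch: combine a prior DIF estimate (from earlier administrations) with the
# current-administration estimate by precision (inverse-variance) weighting.
# The normal-normal updating rule and the example values are illustrative assumptions.

def update_dif(prior_mean, prior_var, current_est, current_var):
    """Posterior mean and variance for a normal-normal model of an item's DIF effect."""
    w_prior, w_curr = 1.0 / prior_var, 1.0 / current_var
    post_var = 1.0 / (w_prior + w_curr)
    post_mean = post_var * (w_prior * prior_mean + w_curr * current_est)
    return post_mean, post_var

# Item with mild historical DIF and a noisy estimate from the current administration.
print(update_dif(prior_mean=-0.4, prior_var=0.05, current_est=-1.1, current_var=0.30))
```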
Foster, Colin – International Journal of Science and Mathematics Education, 2022
Confidence assessment (CA) involves students stating alongside each of their answers a confidence rating (e.g. 0 low to 10 high) to express how certain they are that their answer is correct. Each student's score is calculated as the sum of the confidence ratings on the items that they answered correctly, minus the sum of the confidence ratings on…
Descriptors: Mathematics Tests, Mathematics Education, Secondary School Students, Meta Analysis
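The CA scoring rule described in the abstract is easy to state in code. The sketch below assumes the truncated clause ends, as is usual for this rule, with the incorrectly answered items; the ratings are invented for illustration.

```python
# Confidence-assessment (CA) score: sum of confidence ratings on correctly answered
# items minus the sum of confidence ratings on incorrectly answered items (assumed
# completion of the truncated sentence). Example ratings on a 0-10 scale are made up.

def ca_score(correct, confidence):
    return sum(c if ok else -c for ok, c in zip(correct, confidence))

correct    = [True, True, False, True, False]
confidence = [9, 6, 8, 4, 2]
print(ca_score(correct, confidence))   # (9 + 6 + 4) - (8 + 2) = 9
```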
Kim, Sooyeon; Moses, Tim; Yoo, Hanwook Henry – ETS Research Report Series, 2015
The purpose of this inquiry was to investigate the effectiveness of item response theory (IRT) proficiency estimators in terms of estimation bias and error under multistage testing (MST). We chose a 2-stage MST design in which 1 adaptation to the examinees' ability levels takes place. It includes 4 modules (1 at Stage 1, 3 at Stage 2) and 3 paths…
Descriptors: Item Response Theory, Computation, Statistical Bias, Error of Measurement
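A two-stage MST design with one Stage 1 module and three Stage 2 modules implies a single routing decision per examinee. The sketch below shows that routing step in schematic form; the number-correct cut scores are hypothetical and not the ones used in the report.

```python
# Hedged sketch of routing in a 1-3 two-stage MST design (one Stage 1 routing module,
# three Stage 2 modules, three possible paths). Cut scores are illustrative only.

STAGE2_MODULES = ("easy", "medium", "hard")

def route(stage1_number_correct, cuts=(7, 14)):
    """Pick a Stage 2 module from the Stage 1 number-correct score."""
    if stage1_number_correct < cuts[0]:
        return STAGE2_MODULES[0]
    if stage1_number_correct < cuts[1]:
        return STAGE2_MODULES[1]
    return STAGE2_MODULES[2]

print(route(5), route(10), route(18))   # easy medium hard
```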
Nicole B. Kersting; Bruce L. Sherin; James W. Stigler – Educational and Psychological Measurement, 2014
In this study, we explored the potential for machine scoring of short written responses to the Classroom-Video-Analysis (CVA) assessment, which is designed to measure teachers' usable mathematics teaching knowledge. We created naïve Bayes classifiers for CVA scales assessing three different topic areas and compared computer-generated scores to…
Descriptors: Scoring, Automation, Video Technology, Teacher Evaluation
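For readers unfamiliar with naïve Bayes text scoring, a minimal sketch of the general technique follows. It uses a scikit-learn bag-of-words pipeline on a tiny invented training set; it is not the CVA scoring system itself, only an illustration of the classifier family the study applies.

```python
# Hedged sketch of scoring short written responses with a naive Bayes classifier.
# The toy responses, rubric scores, and scikit-learn pipeline are illustrative assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

responses = [
    "the teacher connects the procedure to place value",
    "students just follow steps without understanding",
    "the explanation links the algorithm to the underlying concept",
    "the answer restates the steps with no reasoning",
]
scores = [2, 0, 2, 0]   # human-assigned rubric scores (toy example)

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(responses, scores)
print(model.predict(["links steps to the underlying place value concept"]))
```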
Suh, Hongwook – ProQuest LLC, 2010
Response time has been regarded as an important source of information for investigating the relationship between human performance and response speed. It is important to examine the relationship between response time and item characteristics, especially from the perspective of the relationship between response time and the various factors that affect an examinee's…
Descriptors: Bayesian Statistics, Computation, Reaction Time, Item Response Theory
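One widely used way to link response time to item characteristics is a lognormal response-time model in the style of van der Linden, sketched below. This is offered only as a standard point of reference; it is not necessarily the model used in this dissertation, and the parameter values are invented.

```python
# Hedged sketch of a lognormal response-time model: log response time depends on
# examinee speed (tau) and item time intensity (beta), with item-specific precision
# (alpha). Values are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
alpha = np.array([2.0, 1.5, 1.8])   # item time-discrimination (precision)
beta = np.array([3.5, 4.0, 3.0])    # item time intensity (log-seconds)
tau = 0.2                           # examinee speed (higher = faster)

# Simulate response times: log T_ij ~ Normal(beta_j - tau_i, 1 / alpha_j)
log_t = rng.normal(loc=beta - tau, scale=1.0 / alpha)
print(np.exp(log_t))                # response times in seconds
```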
Iseli, Markus R.; Koenig, Alan D.; Lee, John J.; Wainess, Richard – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2010
Assessment of complex task performance is crucial to evaluating personnel in critical job functions such as Navy damage control operations aboard ships. Games and simulations can be instrumental in this process, as they can present a broad range of complex scenarios without involving harm to people or property. However, "automatic"…
Descriptors: Performance Tests, Performance Based Assessment, Decision Making Skills, Military Training
DeAyala, R. J.; Koch, William R. – 1986
A computerized flexilevel test was implemented and its ability estimates were compared with those of a Bayesian estimation based computerized adaptive test (CAT) as well as with known true ability estimates. Results showed that when the flexilevel test was terminated according to Lord's criterion, its ability estimates were highly and…
Descriptors: Ability, Adaptive Testing, Bayesian Statistics, Comparative Analysis
Sinharay, Sandip; Johnson, Matthew – ETS Research Report Series, 2005
"Item models" (LaDuca, Staples, Templeton, & Holzman, 1986) are classes from which it is possible to generate/produce items that are equivalent/isomorphic to other items from the same model (e.g., Bejar, 1996; Bejar, 2002). They have the potential to produce large number of high-quality items at reduced cost. This paper introduces…
Descriptors: Item Analysis, Test Items, Scoring, Psychometrics
Kirisci, Levent; Hsu, Tse-Chi – 1992
A predictive adaptive testing (PAT) strategy was developed based on statistical predictive analysis, and its feasibility was studied by comparing PAT performance to those of the Flexilevel, Bayesian modal, and expected a posteriori (EAP) strategies in a simulated environment. The proposed adaptive test is based on the idea of using item difficulty…
Descriptors: Adaptive Testing, Bayesian Statistics, Comparative Analysis, Computer Assisted Testing
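The abstract's reference to "using item difficulty" suggests a selection rule driven by difficulty matching. The sketch below shows a generic version of that idea (administer next the unused item whose difficulty is closest to the current ability estimate); it illustrates the general approach only and is not the PAT procedure itself.

```python
# Hedged sketch of a generic difficulty-matching item-selection rule for adaptive testing.
# The item pool and ability estimate are hypothetical.

def next_item(theta_hat, difficulties, administered):
    candidates = [j for j in range(len(difficulties)) if j not in administered]
    return min(candidates, key=lambda j: abs(difficulties[j] - theta_hat))

difficulties = [-1.5, -0.5, 0.0, 0.7, 1.4]
print(next_item(theta_hat=0.4, difficulties=difficulties, administered={2}))   # -> 3
```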
Weiss, David J. – 1976
Three and one-half years of research on computerized ability testing are summarized. The original objectives of the research were: (1) to develop and implement the stratified computer-based ability test; (2) to compare, on psychometric criteria, the various approaches to computer-based ability testing, including the stratified computerized test,…
Descriptors: Adaptive Testing, Bayesian Statistics, Branching, Comparative Analysis

