Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 18 |
Descriptor
Bayesian Statistics | 24 |
Item Response Theory | 10 |
Models | 10 |
Computation | 9 |
Statistical Analysis | 9 |
Scores | 7 |
Markov Processes | 6 |
Psychometrics | 6 |
Accuracy | 5 |
Comparative Analysis | 5 |
Monte Carlo Methods | 5 |
Source
ETS Research Report Series | 24 |
Author
Kim, Sooyeon | 3 |
Moses, Tim | 3 |
Rock, Donald A. | 3 |
Sinharay, Sandip | 3 |
Almond, Russell G. | 2 |
Oh, Hyeonjoo J. | 2 |
Yan, Duanli | 2 |
Almond, Russell | 1 |
Bauer, Malcolm | 1 |
Blew, Edwin O. | 1 |
Bradlow, Eric T. | 1 |
Publication Type
Journal Articles | 24 |
Reports - Research | 24 |
Numerical/Quantitative Data | 1 |
Tests/Questionnaires | 1 |
Education Level
Elementary Education | 3 |
Higher Education | 3 |
Postsecondary Education | 3 |
Secondary Education | 3 |
Early Childhood Education | 2 |
Grade 8 | 2 |
Junior High Schools | 2 |
Middle Schools | 2 |
Grade 1 | 1 |
High Schools | 1 |
Kindergarten | 1 |
Assessments and Surveys
Early Childhood Longitudinal… | 3 |
Graduate Record Examinations | 2 |
National Assessment of… | 2 |
National Merit Scholarship… | 1 |
Pre Professional Skills Tests | 1 |
Preliminary Scholastic… | 1 |
Trends in International… | 1 |
Kim, Sooyeon; Moses, Tim – ETS Research Report Series, 2016
The purpose of this study is to evaluate the extent to which item response theory (IRT) proficiency estimation methods are robust to the presence of aberrant responses under the GRE® General Test multistage adaptive testing (MST) design. To that end, a wide range of atypical response behaviors affecting as much as 10% of the test items…
Descriptors: Item Response Theory, Computation, Robustness (Statistics), Response Style (Tests)
von Davier, Matthias – ETS Research Report Series, 2014
Diagnostic models combine multiple binary latent variables in an attempt to produce a latent structure that provides more information about test takers' performance than do unidimensional latent variable models. Recent developments in diagnostic modeling emphasize the possibility that multiple skills may interact in a conjunctive way within the…
Descriptors: Models, Equations (Mathematics), Measurement Techniques, Item Response Theory
Kim, Sooyeon; Moses, Tim; Yoo, Hanwook Henry – ETS Research Report Series, 2015
The purpose of this inquiry was to investigate the effectiveness of item response theory (IRT) proficiency estimators in terms of estimation bias and error under multistage testing (MST). We chose a 2-stage MST design in which 1 adaptation to the examinees' ability levels takes place. It includes 4 modules (1 at Stage 1, 3 at Stage 2) and 3 paths…
Descriptors: Item Response Theory, Computation, Statistical Bias, Error of Measurement
Fu, Jianbin; Zapata, Diego; Mavronikolas, Elia – ETS Research Report Series, 2014
Simulation or game-based assessments produce outcome data and process data. In this article, some statistical models that can potentially be used to analyze data from simulation or game-based assessments are introduced. Specifically, cognitive diagnostic models that can be used to estimate latent skills from outcome data so as to scale these…
Descriptors: Simulation, Evaluation Methods, Games, Data Collection
van Rijn, Peter W.; Rijmen, Frank – ETS Research Report Series, 2012
Hooker and colleagues addressed a paradoxical situation that can arise in the application of multidimensional item response theory (MIRT) models to educational test data. We demonstrate that this MIRT paradox is an instance of the explaining-away phenomenon in Bayesian networks, and we attempt to enhance the understanding of MIRT models by placing…
Descriptors: Item Response Theory, Educational Testing, Bayesian Statistics, Statistical Analysis
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
Qian, Xiaoyu; Nandakumar, Ratna; Glutting, Joseph; Ford, Danielle; Fifield, Steve – ETS Research Report Series, 2017
In this study, we investigated gender and minority achievement gaps on 8th-grade science items employing a multilevel item response methodology. Both gaps were wider on physics and earth science items than on biology and chemistry items. Larger gender gaps were found on items with specific topics favoring male students than other items, for…
Descriptors: Item Analysis, Gender Differences, Achievement Gap, Grade 8
Moses, Tim; Oh, Hyeonjoo J. – ETS Research Report Series, 2009
Pseudo Bayes probability estimates are weighted averages of raw and modeled probabilities; these estimates have been studied primarily in nonpsychometric contexts. The purpose of this study was to evaluate pseudo Bayes probability estimates as applied to the estimation of psychometric test score distributions and chained equipercentile equating…
Descriptors: Bayesian Statistics, Computation, Equated Scores, Probability
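The weighted-average idea in the abstract above can be sketched in a few lines. This is only an illustration, not the procedure from the report: the function name `pseudo_bayes`, the fixed `weight` of 0.5, and the toy score distributions are all assumptions made for the example, and the report's own rule for choosing the weight is not reproduced here.

```python
def pseudo_bayes(raw, modeled, weight):
    """Blend raw and modeled probabilities cell by cell.

    `weight` is the share given to the raw (observed) probabilities;
    the remainder goes to the modeled (smoothed) probabilities.
    How the weight is chosen is an assumption left to the caller.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    if len(raw) != len(modeled):
        raise ValueError("raw and modeled must have the same length")
    return [weight * r + (1.0 - weight) * m for r, m in zip(raw, modeled)]


# Hypothetical three-point score distribution: the raw proportions contain
# an empty cell, while the modeled proportions are smoother.
raw = [0.0, 0.5, 0.5]
modeled = [0.25, 0.5, 0.25]

blended = pseudo_bayes(raw, modeled, weight=0.5)
print(blended)
```

Because both inputs sum to 1 and the weights sum to 1, the blended values also form a probability distribution, and the empty raw cell receives a nonzero estimate from the model.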
Rock, Donald A. – ETS Research Report Series, 2012
This paper provides a history of ETS's role in developing assessment instruments and psychometric procedures for measuring change in large-scale national assessments funded by the Longitudinal Studies branch of the National Center for Education Statistics. It documents the innovations developed during more than 30 years of working with…
Descriptors: Models, Educational Change, Longitudinal Studies, Educational Development
Oh, Hyeonjoo J.; Guo, Hongwen; Walker, Michael E. – ETS Research Report Series, 2009
Issues of equity and fairness across subgroups of the population (e.g., gender or ethnicity) must be seriously considered in any standardized testing program. For this reason, many testing programs require some means for assessing test characteristics, such as reliability, for subgroups of the population. However, often only small sample sizes are…
Descriptors: Standardized Tests, Test Reliability, Sample Size, Bayesian Statistics
Kim, Sooyeon; Livingston, Samuel A.; Lewis, Charles – ETS Research Report Series, 2008
This paper describes an empirical evaluation of a Bayesian procedure for equating scores on test forms taken by small numbers of examinees, using collateral information from the equating of other test forms. In this procedure, a separate Bayesian estimate is derived for the equated score at each raw-score level, making it unnecessary to specify a…
Descriptors: Equated Scores, Statistical Analysis, Sample Size, Bayesian Statistics
Hansen, Eric G.; Mislevy, Robert J.; Steinberg, Linda S. – ETS Research Report Series, 2008
Accommodations play a key role in enabling individuals with disabilities to participate in the National Assessment of Educational Progress (NAEP) and other large-scale assessments. However, it can be difficult to know how accommodations affect the validity of results, thus making it difficult to determine which accommodations should be allowed.…
Descriptors: National Competency Tests, Disabilities, Reading Instruction, Mathematics Instruction
Almond, Russell G. – ETS Research Report Series, 2007
Over the course of instruction, instructors generally collect a great deal of information about each student. Integrating that information intelligently requires models for how a student's proficiency changes over time. Armed with such models, instructors can "filter" the data--more accurately estimate the student's current proficiency…
Descriptors: Markov Processes, Decision Making, Student Evaluation, Learning Processes
Almond, Russell G.; Mulder, Joris; Hemat, Lisa A.; Yan, Duanli – ETS Research Report Series, 2006
Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task that may be dependent. This paper explores four design patterns for modeling locally dependent observations from the same task: (1) No context--Ignore dependence among observables; (2) Compensatory…
Descriptors: Bayesian Statistics, Networks, Models, Design
Shute, Valerie J.; Ventura, Matthew; Bauer, Malcolm; Zapata-Rivera, Diego – ETS Research Report Series, 2008
To reveal what is being learned during the gaming experience, this report proposes an approach for embedding assessments in immersive games, drawing on recent advances in assessment design. Key to this approach are formative assessment to guide instructional experiences and evidence-centered design to systematically analyze the assessment argument…
Descriptors: Educational Games, Formative Evaluation, Instructional Design, Evidence Based Practice