NotesFAQContact Us
Collection
Advanced
Search Tips
Publication Date
In 20250
Since 20240
Since 2021 (last 5 years)0
Since 2016 (last 10 years)3
Since 2006 (last 20 years)7
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
National Assessment of…11
Program for International…1
Trends in International…1
What Works Clearinghouse Rating
Showing all 11 results Save | Export
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Zehner, Fabian; Harrison, Scott; Eichmann, Beate; Deribo, Tobias; Bengs, Daniel; Andersen, Nico; Hahnel, Carolin – International Educational Data Mining Society, 2020
The "2nd Annual WPI-UMASS-UPENN EDM Data Mining Challenge" required contestants to predict efficient testtaking based on log data. In this paper, we describe our theory-driven and psychometric modeling approach. For feature engineering, we employed the Log-Normal Response Time Model for estimating latent person speed, and the Generalized…
Descriptors: Data Analysis, Competition, Classification, Prediction
Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019
The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) is constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…
Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation
Peer reviewed Peer reviewed
Direct linkDirect link
Braun, Henry; von Davier, Matthias – Large-scale Assessments in Education, 2017
Background: Economists are making increasing use of measures of student achievement obtained through large-scale survey assessments such as NAEP, TIMSS, and PISA. The construction of these measures, employing plausible value (PV) methodology, is quite different from that of the more familiar test scores associated with assessments such as the SAT…
Descriptors: Scores, Test Use, Measurement, Psychometrics
Peer reviewed Peer reviewed
Direct linkDirect link
Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G. – Applied Psychological Measurement, 2012
When tests consist of multiple-choice and constructed-response items, researchers are confronted with the question of which item response theory (IRT) model combination will appropriately represent the data collected from these mixed-format tests. This simulation study examined the performance of six model selection criteria, including the…
Descriptors: Item Response Theory, Models, Selection, Criteria
Peer reviewed Peer reviewed
Direct linkDirect link
von Davier, Matthias; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2010
This article presents an application of a stochastic approximation expectation maximization (EM) algorithm using a Metropolis-Hastings (MH) sampler to estimate the parameters of an item response latent regression model. Latent regression item response models are extensions of item response theory (IRT) to a latent variable model with covariates…
Descriptors: Item Response Theory, Statistical Analysis, Regression (Statistics), Models
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Xu, Xueli; von Davier, Matthias – ETS Research Report Series, 2008
Xu and von Davier (2006) demonstrated the feasibility of using the general diagnostic model (GDM) to analyze National Assessment of Educational Progress (NAEP) proficiency data. Their work showed that the GDM analysis not only led to conclusions for gender and race groups similar to those published in the NAEP Report Card, but also allowed…
Descriptors: National Competency Tests, Models, Data Analysis, Reading Tests
Burton, Nancy W.; And Others – 1976
Assessment exercises (items) in three different formats--multiple-choice with an "I don't know" (IDK) option, multiple-choice without the IDK, and open-ended--were placed at the beginning, middle and end of 45-minute assessment packages (instruments). A balanced incomplete blocks analysis of variance was computed to determine the biasing…
Descriptors: Age Differences, Difficulty Level, Educational Assessment, Guessing (Tests)
Rudner, Lawrence M.; And Others – 1995
Fit statistics provide a direct measure of assessment accuracy by analyzing the fit of measurement models to an individual's (or group's) response pattern. Students that lose interest during the assessment, for example, will miss exercises that are within their abilities. Such students will respond correctly to some more difficult items and…
Descriptors: Difficulty Level, Educational Assessment, Goodness of Fit, Measurement Techniques
Beaton, Albert E.; Johnson, Eugene G. – 1990
When the Educational Testing Service became the administrator of the National Assessment of Educational Progress (NAEP) in 1983, it introduced scales based on item response theory (IRT) as a way of presenting results of the assessment to the general public. Some properties of the scales and their uses are discussed. Initial attempts at presenting…
Descriptors: Academic Achievement, Data Interpretation, Educational Assessment, Elementary Secondary Education
Wainer, Howard; And Others – 1992
Four researchers at the Educational Testing Service describe what they consider some of the most vexing research problems they face. While these problems are not completely statistical, they all have major statistical components. Following the introduction (section 1), in section 2, "Problems with the Simultaneous Estimation of Many True…
Descriptors: Adaptive Testing, Bayesian Statistics, Educational Research, Estimation (Mathematics)
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Xu, Xueli; von Davier, Matthias – ETS Research Report Series, 2006
More than a dozen statistical models have been developed for the purpose of cognitive diagnosis. These models are supposed to extract a much finer level of information from item responses than traditional unidimensional item response models. In this paper, a general diagnostic model (GDM) was used to analyze a set of simulated sparse data and real…
Descriptors: Statistical Analysis, National Competency Tests, Diagnostic Tests, Item Response Theory