Publication Date

| Period | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 59 |
| Since 2022 (last 5 years) | 416 |
| Since 2017 (last 10 years) | 919 |
| Since 2007 (last 20 years) | 1970 |
Audience

| Audience | Count |
| --- | --- |
| Researchers | 93 |
| Practitioners | 23 |
| Teachers | 22 |
| Policymakers | 10 |
| Administrators | 5 |
| Students | 4 |
| Counselors | 2 |
| Parents | 2 |
| Community | 1 |
Location

| Location | Count |
| --- | --- |
| United States | 47 |
| Germany | 42 |
| Australia | 34 |
| Canada | 27 |
| Turkey | 27 |
| California | 22 |
| United Kingdom (England) | 20 |
| Netherlands | 18 |
| China | 17 |
| New York | 15 |
| United Kingdom | 15 |
What Works Clearinghouse Rating

| Rating | Count |
| --- | --- |
| Does not meet standards | 1 |
Moses, Tim – Journal of Educational Measurement, 2012
The focus of this paper is assessing the impact of measurement errors on the prediction error of an observed-score regression. Measures are presented and described for decomposing the linear regression's prediction error variance into parts attributable to the true score variance and the error variances of the dependent variable and the predictor…
Descriptors: Error of Measurement, Prediction, Regression (Statistics), True Scores
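The decomposition this abstract describes can be illustrated numerically. A minimal sketch under standard classical-test-theory assumptions (independent errors); all variances below are made-up illustrative numbers, not values from the paper:

```python
# Observed scores: X = T + E_X, Y = tau + E_Y, with independent errors.
# Hypothetical population parameters (illustration only):
var_T, var_tau = 100.0, 80.0   # true-score variances of predictor and outcome
cov_true = 60.0                # covariance of the two true scores
var_eX, var_eY = 25.0, 16.0    # measurement-error variances

var_X = var_T + var_eX
var_Y = var_tau + var_eY

# Prediction error variance of the observed-score regression of Y on X
pev = var_Y - cov_true**2 / var_X

# Additive decomposition into three parts:
part_true = var_tau - cov_true**2 / var_T        # imperfect true-score relation
part_eY = var_eY                                 # error in the dependent variable
part_eX = cov_true**2 * (1 / var_T - 1 / var_X)  # inflation due to predictor error

print(pev, part_true + part_eY + part_eX)  # the three parts sum to the PEV
```

The predictor-error term shows why error in X matters even though the regression is fit to observed scores: it attenuates the usable covariance, inflating the residual variance beyond what outcome error alone would produce.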
Margrett, Jennifer A.; Hsieh, Wen-Hua; Heinz, Melinda; Martin, Peter – International Journal of Aging and Human Development, 2012
Equivocal evidence exists regarding the degree of cognitive stability and prevalence of cognitive impairment in very late life. The objective of the current study was to examine mental status performance and change over time within a sample of Iowa centenarians. The baseline sample consisted of 152 community-dwelling and institutionalized…
Descriptors: Program Effectiveness, Error of Measurement, Older Adults, Cognitive Ability
Yang, Ji Seung; Hansen, Mark; Cai, Li – Educational and Psychological Measurement, 2012
Traditional estimators of item response theory scale scores ignore uncertainty carried over from the item calibration process, which can lead to incorrect estimates of the standard errors of measurement (SEMs). Here, the authors review a variety of approaches that have been applied to this problem and compare them on the basis of their statistical…
Descriptors: Item Response Theory, Scores, Statistical Analysis, Comparative Analysis
Magis, David; De Boeck, Paul – Educational and Psychological Measurement, 2012
The identification of differential item functioning (DIF) is often performed by means of statistical approaches that consider the raw scores as proxies for the ability trait level. One of the most popular approaches, the Mantel-Haenszel (MH) method, belongs to this category. However, replacing the ability level by the simple raw score is a source…
Descriptors: Test Bias, Data, Error of Measurement, Raw Scores
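The MH method mentioned above can be sketched directly: stratify examinees on a matching raw score and pool the item's odds ratio across strata. A minimal illustration on simulated data with no DIF built in (the data-generating numbers are arbitrary, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(42)

def mantel_haenszel_alpha(item, group, strata):
    """Common odds ratio for one item, stratifying on a matching score.

    item   : 0/1 responses to the studied item
    group  : 0 = reference, 1 = focal
    strata : matching variable (e.g. raw score on the remaining items)
    """
    num = den = 0.0
    for s in np.unique(strata):
        m = strata == s
        A = np.sum((group[m] == 0) & (item[m] == 1))  # reference, correct
        B = np.sum((group[m] == 0) & (item[m] == 0))  # reference, incorrect
        C = np.sum((group[m] == 1) & (item[m] == 1))  # focal, correct
        D = np.sum((group[m] == 1) & (item[m] == 0))  # focal, incorrect
        n = A + B + C + D
        if n > 0:
            num += A * D / n
            den += B * C / n
    return num / den

# Toy data: 9 items, equal ability distributions in both groups, no DIF
n, k = 2000, 9
ability = rng.normal(size=n)
group = rng.integers(0, 2, size=n)
p = 1 / (1 + np.exp(-(ability[:, None] - np.linspace(-1, 1, k))))
resp = (rng.random((n, k)) < p).astype(int)

studied = resp[:, 4]
rest_score = resp.sum(axis=1) - studied  # raw score on the other items
alpha = mantel_haenszel_alpha(studied, group, rest_score)
print(round(alpha, 2))  # expected to be near 1.0 when the item shows no DIF
```

The raw rest-score used as the stratifier here is exactly the proxy for the ability trait that the paper identifies as a source of error.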
Puhan, Gautam – Journal of Educational Measurement, 2012
Tucker and chained linear equatings were evaluated in two testing scenarios. In Scenario 1, referred to as rater comparability scoring and equating, the anchor-to-total correlation is often very high for the new form but moderate for the reference form. This may adversely affect the results of Tucker equating, especially if the new and reference…
Descriptors: Testing, Scoring, Equated Scores, Statistical Analysis
Tong, Xin; Zhang, Zhiyong – Multivariate Behavioral Research, 2012
Growth curve models with different types of distributions of random effects and of intraindividual measurement errors for robust analysis are compared. After demonstrating the influence of distribution specification on parameter estimation, 3 methods for diagnosing the distributions for both random effects and intraindividual measurement errors…
Descriptors: Models, Robustness (Statistics), Statistical Analysis, Error of Measurement
Kolen, Michael J.; Wang, Tianyou; Lee, Won-Chan – International Journal of Testing, 2012
Composite scores are often formed from test scores on educational achievement test batteries to provide a single index of achievement over two or more content areas or two or more item types on that test. Composite scores are subject to measurement error, and as with scores on individual tests, the amount of error variability typically depends on…
Descriptors: Mathematics Tests, Achievement Tests, College Entrance Examinations, Error of Measurement
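As a baseline for the measurement error discussed above: assuming independent errors across the component tests, a weighted composite's error variance is the weighted sum of the component error variances. The paper's point is that error variability typically varies with score level; this sketch shows only the unconditional case, with invented numbers:

```python
import math

# Hypothetical battery: composite weights and per-test SEMs (illustrative only)
weights = [0.5, 0.3, 0.2]
sems = [3.0, 4.0, 5.0]  # standard error of measurement of each component test

# With independent errors, the composite error variance is sum of w_i^2 * SEM_i^2
err_var = sum(w**2 * s**2 for w, s in zip(weights, sems))
sem_composite = math.sqrt(err_var)
print(round(sem_composite, 3))
```

Note that the composite SEM (about 2.17 here) is smaller than any component SEM because squaring the fractional weights shrinks each error contribution.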
Voelkle, Manuel C.; Oud, Johan H. L.; von Oertzen, Timo; Lindenberger, Ulman – Structural Equation Modeling: A Multidisciplinary Journal, 2012
This article has 3 objectives that build on each other. First, we demonstrate how to obtain maximum likelihood estimates for dynamic factor models (the direct autoregressive factor score model) with arbitrary "T" and "N" by means of structural equation modeling (SEM) and compare the approach to existing methods. Second, we go beyond standard time…
Descriptors: Structural Equation Models, Maximum Likelihood Statistics, Computation, Factor Analysis
Bramley, Tom; Dhawan, Vikas – Research Papers in Education, 2013
This paper discusses the issues involved in calculating indices of composite reliability for "modular" or "unitised" assessments of the kind used in GCSEs, AS and A level examinations in England. The increasingly widespread use of on-screen marking has meant that the item-level data required for calculating indices of…
Descriptors: Foreign Countries, Exit Examinations, Secondary Education, Test Reliability
Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M. – Society for Research on Educational Effectiveness, 2013
Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…
Descriptors: Probability, Scores, Statistical Analysis, Statistical Bias
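The logistic-regression step of PSA can be sketched without any modeling package. A minimal illustration (Newton-Raphson logistic fit in NumPy; the selection model and coefficients are invented for the example, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

def logistic_propensity(X, z, iters=25):
    """Estimate propensity scores P(z=1 | X) by Newton-Raphson logistic regression."""
    Xd = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-Xd @ beta))
        grad = Xd.T @ (z - p)                    # score vector
        hess = (Xd * (p * (1 - p))[:, None]).T @ Xd  # observed information
        beta += np.linalg.solve(hess, grad)
    return 1 / (1 + np.exp(-Xd @ beta))

# Toy quasi-experiment: selection into treatment depends on two observed covariates
n = 1000
X = rng.normal(size=(n, 2))
true_p = 1 / (1 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1])))
z = (rng.random(n) < true_p).astype(float)

ps = logistic_propensity(X, z)
w = np.where(z == 1, 1 / ps, 1 / (1 - ps))  # inverse-probability weights

# Covariate balance before vs. after weighting on the first covariate
raw_gap = abs(X[z == 1, 0].mean() - X[z == 0, 0].mean())
wtd_gap = abs(np.average(X[z == 1, 0], weights=w[z == 1])
              - np.average(X[z == 0, 0], weights=w[z == 0]))
print(round(raw_gap, 3), round(wtd_gap, 3))  # weighting shrinks the gap
```

This illustrates why the abstract calls logistic regression "easy to implement": the whole selection model is a few lines, after which the estimated scores can be used for weighting, matching, or stratification.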
Burt, Keith B.; Obradovic, Jelena – Developmental Review, 2013
The purpose of this paper is to review major statistical and psychometric issues impacting the study of psychophysiological reactivity and discuss their implications for applied developmental researchers. We first cover traditional approaches such as the observed difference score (DS) and the observed residual score (RS), including a review of…
Descriptors: Measurement Techniques, Psychometrics, Data Analysis, Researchers
Kline, Rex B. – Educational Research and Evaluation, 2013
Test fairness and test bias are not synonymous concepts. Test bias refers to statistical evidence that the psychometrics or interpretation of test scores depend on group membership, such as gender or race, when such differences are not expected. A test that is grossly biased may be judged to be unfair, but test fairness concerns the broader, more…
Descriptors: Factor Analysis, Social Justice, Psychometrics, Test Bias
Roberts, Ros; Johnson, Philip – Curriculum Journal, 2015
Recent school science curriculum developments in many countries emphasise that scientists derive evidence for their claims through different approaches; that such practices are bound up with disciplinary knowledge; and that the quality of data should be appreciated. This position paper presents an understanding of the validity of data as a set of…
Descriptors: Educational Quality, Data, Concept Mapping, Scientific Concepts
Dong, Nianbo – American Journal of Evaluation, 2015
Researchers have become increasingly interested in programs' main and interaction effects of two variables (A and B, e.g., two treatment variables or one treatment variable and one moderator) on outcomes. A challenge for estimating main and interaction effects is to eliminate selection bias across A-by-B groups. I introduce Rubin's causal model to…
Descriptors: Probability, Statistical Analysis, Research Design, Causal Models
Watkins, Ann E.; Bargagliotti, Anna; Franklin, Christine – Journal of Statistics Education, 2014
Although the use of simulation to teach the sampling distribution of the mean is meant to provide students with sound conceptual understanding, it may lead them astray. We discuss a misunderstanding that can be introduced or reinforced when students who intuitively understand that "bigger samples are better" conduct a simulation to…
Descriptors: Simulation, Sampling, Sample Size, Misconceptions
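The kind of simulation the abstract refers to is easy to reproduce. A minimal sketch (the population and sample sizes are arbitrary choices for the demo): draw many samples of each size and watch the spread of the sample means track sigma / sqrt(n) — the "bigger samples are better" intuition is correct about the sampling distribution's spread, which is where the classroom simulation can be misread:

```python
import numpy as np

rng = np.random.default_rng(7)

population = rng.exponential(scale=10.0, size=100_000)  # a skewed population
sigma = population.std()

sds = {}
for n in (4, 25, 100):
    # Many samples of size n; record each sample mean
    means = rng.choice(population, size=(10_000, n)).mean(axis=1)
    sds[n] = means.std()
    # empirical SD of the sample means vs. the theoretical sigma / sqrt(n)
    print(n, round(sds[n], 2), round(sigma / np.sqrt(n), 2))
```

Larger n narrows the sampling distribution of the mean; it does not change the population being sampled, which is the distinction such simulations need to make explicit.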

