Ponce-Renova, Hector F. – Journal of New Approaches in Educational Research, 2022
This paper teaches equivalence testing as applied to educational research, emphasizing recommendations intended to increase the quality of research. Equivalence testing is a technique for comparing the effect sizes or means of two different studies to ascertain whether they are statistically equivalent. For making accessible Equivalence…
Descriptors: Educational Research, Effect Size, Statistical Analysis, Intervals
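As a rough illustration of the idea in the abstract above, equivalence between two estimates is often checked with two one-sided tests (TOST). The sketch below is not taken from the paper: it assumes a normal approximation, and the function names, the equivalence `margin`, and the example numbers are all hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tost_z(diff, se, margin, alpha=0.05):
    """Two one-sided z-tests for equivalence (illustrative sketch).

    diff   : observed difference between two effect sizes or means
    se     : standard error of that difference
    margin : hypothetical equivalence bound; equivalence means |true diff| < margin
    Returns (p_value, equivalent), where p_value is the larger one-sided p-value.
    """
    # Test 1: H0 is true diff <= -margin; reject when diff is well above -margin.
    p_lower = 1.0 - normal_cdf((diff + margin) / se)
    # Test 2: H0 is true diff >= +margin; reject when diff is well below +margin.
    p_upper = normal_cdf((diff - margin) / se)
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Hypothetical numbers: a precise estimate near zero supports equivalence,
# while an imprecise one does not.
print(tost_z(diff=0.0, se=0.1, margin=0.5))
print(tost_z(diff=0.0, se=1.0, margin=0.5))
```

Equivalence is declared only when both one-sided tests reject, which is why the larger of the two p-values is compared with `alpha`.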
Castellano, Katherine E.; McCaffrey, Daniel F. – Journal of Educational Measurement, 2020
The residual gain score has been of historical interest, and its percentile rank has been of interest more recently given its close correspondence to the popular Student Growth Percentile. However, these estimators suffer from low accuracy and systematic bias (bias conditional on prior latent achievement). This article explores three…
Descriptors: Accuracy, Student Evaluation, Measurement Techniques, Evaluation Methods
Yang, Shitao; Black, Ken – Teaching Statistics: An International Journal for Teachers, 2019
Employing a Wald confidence interval to test hypotheses about population proportions could lead to an increase in Type I or Type II errors unless the hypothesized value, p0, is used in computing its standard error rather than the sample proportion. Whereas the Wald confidence interval to estimate a population proportion uses the sample…
Descriptors: Error Patterns, Evaluation Methods, Error of Measurement, Measurement Techniques
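The distinction drawn in the abstract above can be sketched numerically: the same sample gives a different z statistic depending on whether the standard error is computed from the hypothesized value p0 or from the sample proportion. This is a minimal sketch with hypothetical function names and made-up counts, not code from the article.

```python
import math


def proportion_z_statistics(successes, n, p0):
    """Return (z_null, z_sample) for testing H0: p = p0.

    z_null   uses the standard error computed from the hypothesized value p0.
    z_sample uses the standard error computed from the sample proportion,
             as the Wald interval does.
    """
    p_hat = successes / n
    se_null = math.sqrt(p0 * (1.0 - p0) / n)
    se_sample = math.sqrt(p_hat * (1.0 - p_hat) / n)
    if se_sample == 0.0:
        raise ValueError("sample proportion of 0 or 1 gives a zero standard error")
    z_null = (p_hat - p0) / se_null
    z_sample = (p_hat - p0) / se_sample
    return z_null, z_sample


# Hypothetical data: 15 successes in 100 trials, testing p0 = 0.10.
z_null, z_sample = proportion_z_statistics(15, 100, 0.10)
print(round(z_null, 3), round(z_sample, 3))
```

With these made-up counts the two statistics differ noticeably, so a one-sided test at the 0.05 level (critical value about 1.645) would reject under one standard error and not the other.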
Hyunsuk Han – ProQuest LLC, 2018
In Huggins-Manley & Han (2017), it was shown that WLSMV global model fit indices used in structural equation modeling practice are sensitive to person parameter estimate RMSE and item difficulty parameter estimate RMSE that result from local dependence in 2-PL IRT models, particularly when conditioning on number of test items and sample size.…
Descriptors: Models, Statistical Analysis, Item Response Theory, Evaluation Methods
Swank, Jacqueline M.; Mullen, Patrick R. – Measurement and Evaluation in Counseling and Development, 2017
The article serves as a guide for researchers in developing evidence of validity using bivariate correlations, specifically construct validity. The authors outline the steps for calculating and interpreting bivariate correlations. Additionally, they provide an illustrative example and discuss the implications.
Descriptors: Correlation, Construct Validity, Guidelines, Data Interpretation
Porter, Kristin E. – Journal of Research on Educational Effectiveness, 2018
Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple testing procedures (MTPs) are statistical…
Descriptors: Statistical Analysis, Program Effectiveness, Intervention, Hypothesis Testing
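To make the multiplicity problem in the abstract above concrete, the sketch below adjusts a set of p-values with two widely known procedures, Bonferroni and Holm. The truncated abstract does not say which procedures the paper studies, so these are chosen only as familiar examples, and the function names and p-values are hypothetical.

```python
def bonferroni(pvals):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down adjustment, returned in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvals[i])
        # Enforce monotonicity so adjusted values never decrease along the ordering.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical p-values for three outcomes of one intervention.
raw = [0.01, 0.04, 0.03]
print(bonferroni(raw))
print(holm(raw))
```

Holm's adjusted p-values are never larger than Bonferroni's, which is why it is usually described as the less conservative of the two.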
Porter, Kristin E. – Society for Research on Educational Effectiveness, 2016
In recent years, there has been increasing focus on the issue of multiple hypotheses testing in education evaluation studies. In these studies, researchers are typically interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time or across multiple treatment groups. When…
Descriptors: Hypothesis Testing, Intervention, Error Patterns, Evaluation Methods
Dimitrov, Dimiter M. – Measurement and Evaluation in Counseling and Development, 2017
This article offers an approach to examining differential item functioning (DIF) under its item response theory (IRT) treatment in the framework of confirmatory factor analysis (CFA). The approach is based on integrating IRT- and CFA-based testing of DIF and using bias-corrected bootstrap confidence intervals with a syntax code in Mplus.
Descriptors: Test Bias, Item Response Theory, Factor Analysis, Evaluation Methods
Guasch, Marc; Haro, Juan; Boada, Roger – Psicologica: International Journal of Methodology and Experimental Psychology, 2017
With the increasing refinement of language processing models and the new discoveries about which variables can modulate these processes, stimuli selection for experiments with a factorial design is becoming a tough task. Selecting sets of words that differ in one variable, while matching these same words into dozens of other confounding variables…
Descriptors: Factor Analysis, Language Processing, Design, Cluster Grouping
Porter, Kristin E. – Grantee Submission, 2017
Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple testing procedures (MTPs) are statistical…
Descriptors: Statistical Analysis, Program Effectiveness, Intervention, Hypothesis Testing
Jackson, Dan; Bowden, Jack; Baker, Rose – Research Synthesis Methods, 2015
Moment-based estimators of the between-study variance are very popular when performing random effects meta-analyses. This type of estimation has many advantages including computational and conceptual simplicity. Furthermore, by using these estimators in large samples, valid meta-analyses can be performed without the assumption that the treatment…
Descriptors: Meta Analysis, Hierarchical Linear Modeling, Computation, Evaluation Methods
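One familiar moment-based estimator of the between-study variance is the DerSimonian-Laird estimator. The truncated abstract above does not name a specific estimator, so the sketch below is offered only as a well-known example of the class; the function name and the input data are hypothetical.

```python
def dersimonian_laird_tau2(effects, variances):
    """DerSimonian-Laird moment estimate of the between-study variance tau^2.

    effects   : study-level effect estimates
    variances : within-study sampling variances (assumed known)
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect (inverse-variance weighted) mean.
    mean_fe = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q statistic.
    q = sum(w * (y - mean_fe) ** 2 for w, y in zip(weights, effects))
    # Scaling constant from the expectation of Q.
    c = sum_w - sum(w * w for w in weights) / sum_w
    # Truncate at zero because a variance cannot be negative.
    return max(0.0, (q - (k - 1)) / c)


# Hypothetical two-study example with unit within-study variances.
print(dersimonian_laird_tau2([0.0, 2.0], [1.0, 1.0]))
```

The estimate is obtained by equating Cochran's Q to its expected value and solving for the between-study variance, which is what makes it a method-of-moments estimator.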
Porter, Kristin E. – MDRC, 2016
In education research and in many other fields, researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple…
Descriptors: Statistical Analysis, Program Effectiveness, Intervention, Hypothesis Testing
Ganzfried, Sam; Yusuf, Farzana – Education Sciences, 2018
A problem faced by many instructors is that of designing exams that accurately assess the abilities of the students. Typically, these exams are prepared several days in advance, and generic question scores are used based on rough approximation of the question difficulty and length. For example, for a recent class taught by the author, there were…
Descriptors: Weighted Scores, Test Construction, Student Evaluation, Multiple Choice Tests
Westlund, Erik; Stuart, Elizabeth A. – American Journal of Evaluation, 2017
This article discusses the nonuse, misuse, and proper use of pilot studies in experimental evaluation research. The authors first show that there is little theoretical, practical, or empirical guidance available to researchers who seek to incorporate pilot studies into experimental evaluation research designs. The authors then discuss how pilot…
Descriptors: Use Studies, Pilot Projects, Evaluation Research, Experiments
Raykov, Tenko; Dimitrov, Dimiter M.; von Eye, Alexander; Marcoulides, George A. – Educational and Psychological Measurement, 2013
A latent variable modeling method for evaluation of interrater agreement is outlined. The procedure is useful for point and interval estimation of the degree of agreement among a given set of judges evaluating a group of targets. In addition, the approach allows one to test for identity in underlying thresholds across raters as well as to identify…
Descriptors: Interrater Reliability, Models, Statistical Analysis, Computation

