Durand, Guillaume; Goutte, Cyril; Léger, Serge – International Educational Data Mining Society, 2018
Knowledge tracing is a fundamental area of educational data modeling that aims at gaining a better understanding of the learning occurring in tutoring systems. Knowledge tracing models fit various parameters on observed student performance and are evaluated through several goodness of fit metrics. Fitted parameter values are of crucial interest in…
Descriptors: Error of Measurement, Models, Goodness of Fit, Predictive Validity
Luke W. Miratrix; Jasjeet S. Sekhon; Alexander G. Theodoridis; Luis F. Campos – Grantee Submission, 2018
The popularity of online surveys has increased the prominence of using weights that capture units' probabilities of inclusion for claims of representativeness. Yet, much uncertainty remains regarding how these weights should be employed in analysis of survey experiments: Should they be used or ignored? If they are used, which estimators are…
Descriptors: Online Surveys, Weighted Scores, Data Interpretation, Robustness (Statistics)
Zhichao Jiang; Peng Ding – Grantee Submission, 2018
Frequently, empirical studies are plagued with missing data. When the data are missing not at random, the parameter of interest is not identifiable in general. Without additional assumptions, we can derive bounds of the parameters of interest, which, unfortunately, are often too wide to be informative. Therefore, it is of great importance to…
Descriptors: Foreign Countries, Acquired Immunodeficiency Syndrome (AIDS), Public Health, Data
Sizova, Zhanna M.; Semenova, Tatyana V.; Naydenova, Natalia N.; Narbut, Victoria V.; Chelyshkova, Marina B.; Masalimova, Alfiya R. – EURASIA Journal of Mathematics, Science and Technology Education, 2019
The main purpose of this article is to present a Differential Item Functioning method of item analysis. It is designed to minimize the discriminatory effect of individual items in the accreditation of graduates of medical universities with different training programs. The one-parameter Item Response Theory model is used to align graduates' rights…
Descriptors: Accreditation (Institutions), Medical Schools, Universities, College Graduates
West, Brady T.; Li, Dan – Sociological Methods & Research, 2019
In face-to-face surveys, interviewer observations are a cost-effective source of paradata for nonresponse adjustment of survey estimates and responsive survey designs. Unfortunately, recent studies have suggested that the accuracy of these observations can vary substantially among interviewers, even after controlling for household-, area-, and…
Descriptors: Observation, Interviews, Error of Measurement, Accuracy
Martínez-Lemos, R. I.; Ayán-Pérez, Cárlos; Bouzas-Rico, Sara – International Journal of Developmental Disabilities, 2019
Objectives: The main objective was to identify the test-retest reliability of the Wii Balance Board (WBB) for assessing standing balance when administered to a population of people with intellectual disability (ID). A secondary objective was to provide information regarding the reliability of the WBB, taking into account the severity of cognitive…
Descriptors: Test Reliability, Human Posture, Psychomotor Skills, Mild Intellectual Disability
DiStefano, Christine; McDaniel, Heather L.; Zhang, Liyun; Shi, Dexin; Jiang, Zhehan – Educational and Psychological Measurement, 2019
A simulation study was conducted to investigate the model size effect when confirmatory factor analysis (CFA) models include many ordinal items. CFA models including between 15 and 120 ordinal items were analyzed with mean- and variance-adjusted weighted least squares to determine how varying sample size, number of ordered categories, and…
Descriptors: Factor Analysis, Effect Size, Data, Sample Size
Aucejo, Esteban; Romano, Teresa; Taylor, Eric S. – Centre for Economic Performance, 2019
Performance evaluation may change employee effort and decisions in unintended ways, for example, in multitask jobs where the evaluation measure captures only a subset of (differentially weights) the job tasks. We show evidence of this multitask distortion in schools, with teachers allocating effort across students (tasks). Teachers are evaluated…
Descriptors: Teacher Evaluation, Student Evaluation, Mathematics Tests, Scores
Victoria, Konidari – Education Inquiry, 2021
This paper argues that the consideration of educational disadvantage should go beyond the micro-scale contextual level of individual students, and explore eventual connections with hybrid forms of disadvantage in the social field. The paper draws on the capability approach and the concept of dwelling to introduce dwelling in time as functioning.…
Descriptors: Educationally Disadvantaged, Vocational Education, Adolescents, Secondary School Students
van der Lans, Rikkert M.; Maulana, Ridwan; Helms-Lorenz, Michelle; Fernández-García, Carmen-María; Chun, Seyeoung; de Jager, Thelma; Irnidayanti, Yulia; Inda-Caro, Mercedes; Lee, Okhwa; Coetzee, Thys; Fadhilah, Nurul; Jeon, Meae; Moorer, Peter – SAGE Open, 2021
This study examines measurement invariance of student perceptions of teaching quality collected in five countries: Indonesia (n students = 6,331), the Netherlands (n students = 6,738), South Africa (n students = 3,422), South Korea (n students = 6,997) and Spain (n students = 4,676). The administered questionnaire was the My Teacher Questionnaire…
Descriptors: Foreign Countries, Student Attitudes, Student Evaluation of Teacher Performance, Teacher Effectiveness
Qin, Xu; Hong, Guanglei – Journal of Educational and Behavioral Statistics, 2017
When a multisite randomized trial reveals between-site variation in program impact, methods are needed for further investigating heterogeneous mediation mechanisms across the sites. We conceptualize and identify a joint distribution of site-specific direct and indirect effects under the potential outcomes framework. A method-of-moments procedure…
Descriptors: Randomized Controlled Trials, Hierarchical Linear Modeling, Statistical Analysis, Probability
McNeish, Daniel – Educational and Psychological Measurement, 2017
In behavioral sciences broadly, estimating growth models with Bayesian methods is becoming increasingly common, especially to combat small samples common with longitudinal data. Although Mplus is becoming an increasingly common program for applied research employing Bayesian methods, the limited selection of prior distributions for the elements of…
Descriptors: Models, Bayesian Statistics, Statistical Analysis, Computer Software
Conger, Anthony J. – Educational and Psychological Measurement, 2017
Drawing parallels to classical test theory, this article clarifies the difference between rater accuracy and reliability and demonstrates how category marginal frequencies affect rater agreement and Cohen's kappa. Category assignment paradigms are developed: comparing raters to a standard (index) versus comparing two raters to one another…
Descriptors: Interrater Reliability, Evaluators, Accuracy, Statistical Analysis
Trafimow, David – Educational and Psychological Measurement, 2017
There has been much controversy over the null hypothesis significance testing procedure, with much of the criticism centered on the problem of inverse inference. Specifically, p gives the probability of the finding (or one more extreme) given the null hypothesis, whereas the null hypothesis significance testing procedure involves drawing a…
Descriptors: Statistical Inference, Hypothesis Testing, Probability, Intervals
Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2017
Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…
Descriptors: Interrater Reliability, Evaluation Methods, Statistical Bias, Accuracy

