Gardner, John; Cowan, Pamela – Assessment in Education Principles Policy and Practice, 2005
This paper sets out the findings from a large-scale analysis of the Northern Ireland Transfer Procedure Tests, used to select pupils for grammar schools. As it was not possible to get completed test scripts from government agencies, over 3000 practice scripts were completed in simulated conditions and were analysed to establish whether the tests…
Descriptors: Foreign Countries, Educational Testing, Error of Measurement, Test Use
Multiple Choice and True/False Tests: Reliability Measures and Some Implications of Negative Marking
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004
The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…
Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items
Vermunt, Jeroen K. – Multivariate Behavioral Research, 2005
A well-established approach to modeling clustered data introduces random effects in the model of interest. Mixed-effects logistic regression models can be used to predict discrete outcome variables when observations are correlated. An extension of the mixed-effects logistic regression model is presented in which the dependent variable is a latent…
Descriptors: Predictor Variables, Correlation, Maximum Likelihood Statistics, Error of Measurement
Hopwood, Christopher J.; Richard, David C. S. – Assessment, 2005
Research on the Wechsler Adult Intelligence Scale-Revised and Wechsler Adult Intelligence Scale-Third Edition (WAIS-III) suggests that practicing clinical psychologists and graduate students make item-level scoring errors that affect IQ, index, and subtest scores. Studies have been limited in that Full-Scale IQ (FSIQ) and examiner administration,…
Descriptors: Scoring, Psychologists, Intelligence Quotient, Graduate Students
Flanagan, Kristin Denton; McPhee, Cameron – National Center for Education Statistics, 2009
Using data from the final two rounds of the Early Childhood Longitudinal Study, Birth Cohort (ECLS-B), a longitudinal study begun in 2001, this First Look provides a snapshot of the demographic characteristics, reading and mathematics knowledge, fine motor skills, school characteristics, and before- and after-school care arrangements of the cohort…
Descriptors: Child Development, Kindergarten, Longitudinal Studies, Cohort Analysis
Stacey, Kaye; Steinle, Vicki – Mathematics Education Research Journal, 2006
The basic theory of Rasch measurement applies to situations where a person has a certain level of a trait being investigated, and this level of ability is what determines (to within a measurement error) how well the person does on each item in a test. This paper responds to frequent suggestions from colleagues that the use of Rasch measurement…
Descriptors: Measurement, Error of Measurement, Item Response Theory, Construct Validity
Ardoin, Scott P. – Psychology in the Schools, 2006
Extensive evidence exists demonstrating the utility of Curriculum-Based Measurement in reading (R-CBM) for progress-monitoring purposes; however, most studies have evaluated R-CBM from a traditional psychometric perspective, which allows for variability in individual students' data that is not a function of increased skills (i.e., measurement…
Descriptors: Psychometrics, Measurement, Maintenance, Intervention
Yechiam, Eldad; Goodnight, Jackson; Bates, John E.; Busemeyer, Jerome R.; Dodge, Kenneth A.; Pettit, Gregory S.; Newman, Joseph P. – Psychological Assessment, 2006
This article proposes and tests a formal cognitive model for the go/no-go discrimination task. In this task, the performer chooses whether to respond to stimuli and receives rewards for responding to certain stimuli and punishments for responding to others. Three cognitive models were evaluated on the basis of data from a longitudinal study…
Descriptors: Evaluation Research, Task Analysis, Adolescents, Longitudinal Studies
Zwick, Rebecca; And Others – 1993
Simulated data were used to investigate the performance of modified versions of the Mantel-Haenszel and standardization methods of differential item functioning (DIF) analysis in computer-adaptive tests (CATs). Each "examinee" received 25 items out of a 75-item pool. A three-parameter logistic item response model was assumed, and…
Descriptors: Adaptive Testing, Computer Assisted Testing, Correlation, Error of Measurement
Olson, Jeffery E. – 1992
Often, all of the variables in a model are latent, random, or subject to measurement error, or there is not an obvious dependent variable. When any of these conditions exist, an appropriate method for estimating the linear relationships among the variables is Least Principal Components Analysis. Least Principal Components are robust, consistent,…
Descriptors: Error of Measurement, Factor Analysis, Goodness of Fit, Mathematical Models
Longford, Nicholas T. – 1993
A model-based approach to rater reliability for essays read by multiple readers is presented. Variation of rater severity (between-rater variation) and rater inconsistency (within-rater variation) is considered in the presence of between-examinee variation. An additive variance component model is posited and the method of moments for its…
Descriptors: Educational Diagnosis, Error of Measurement, Essays, Estimation (Mathematics)
Nevitt, Johnathan; Hancock, Gregory R. – 1998
Though common structural equation modeling (SEM) methods are predicated upon the assumption of multivariate normality, applied researchers often find themselves with data clearly violating this assumption and without sufficient sample size to use distribution-free estimation methods. Fortunately, promising alternatives are being integrated into…
Descriptors: Chi Square, Computer Software, Error of Measurement, Estimation (Mathematics)
Pike, Gary R. – 1991
Because change is fundamental to education and the measurement of change assesses the quality and effectiveness of postsecondary education, this study examined three methods of measuring change: (1) gain scores; (2) residual scores; and (3) repeated measures. Data for the study were obtained from transcripts of 722 graduating seniors at the…
Descriptors: Academic Achievement, College Seniors, Error of Measurement, Higher Education
Green, Donald Ross; And Others – 1988
Potential benefits of using item response theory in test construction are evaluated, based on the experience and evidence accumulated during 9 years of using a three-parameter model in the construction of major achievement batteries. Specific benefits covered include obtaining sample-free item calibrations and item-free person measurement,…
Descriptors: Achievement Tests, Computer Assisted Testing, Difficulty Level, Elementary Secondary Education
MacPhee, David – 1983
As data on the reliability and validity of ratings of infant temperament have accumulated, researchers have begun to ask what caregiver ratings really measure. An argument has been made that ratings of social behavior are less a reflection of enduring individual differences than a measure of rater characteristics and error variance. This study…
Descriptors: Error of Measurement, Experimenter Characteristics, Infants, Knowledge Level

