Peer reviewed: Williams, Richard H.; Zimmerman, Donald W. – Educational and Psychological Measurement, 1977
The usual formulas for the reliability of differences between two test scores are based on the assumption that the error scores are uncorrelated. Formulas are presented for the general case where this assumption is unnecessary. (Author/JKS)
Descriptors: Correlation, Error of Measurement, Error Patterns, Scores
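The classical formula that Williams and Zimmerman generalize can be sketched as follows. This is the textbook difference-score reliability under the very assumption the paper relaxes (uncorrelated error scores); the function and variable names are illustrative, not taken from the article.

```python
def difference_score_reliability(r_xx, r_yy, r_xy, sd_x, sd_y):
    """Reliability of difference scores D = X - Y under the classical
    assumption that the error scores of X and Y are uncorrelated.

    r_xx, r_yy : reliabilities of the two tests
    r_xy       : observed-score correlation between the tests
    sd_x, sd_y : observed-score standard deviations
    """
    var_x, var_y = sd_x ** 2, sd_y ** 2
    cov_xy = r_xy * sd_x * sd_y
    # True-score variance of the difference over its observed-score variance.
    true_var = r_xx * var_x + r_yy * var_y - 2 * cov_xy
    total_var = var_x + var_y - 2 * cov_xy
    return true_var / total_var
```

Note how strongly the result depends on `r_xy`: highly correlated tests yield unreliable difference scores even when each test is individually reliable.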
Peer reviewed: Rowley, Glenn L.; Traub, Ross E. – Journal of Educational Measurement, 1977
The consequences of formula scoring versus number right scoring are examined in relation to the assumptions commonly made about the behavior of examinees in testing situations. The choice between the two is shown to be dependent upon having reduced error variance or unbiasedness as a goal. (Author/JKS)
Descriptors: Error of Measurement, Scoring Formulas, Statistical Bias, Test Wiseness
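The two scoring rules being compared are simple to state. A minimal sketch, using the standard correction-for-guessing formula (the function names are illustrative):

```python
def number_right_score(right):
    """Number-right scoring: count correct answers only."""
    return right

def formula_score(right, wrong, k):
    """Formula scoring: R - W/(k-1), where k is the number of options
    per item. Under purely random guessing the expected penalty equals
    the expected gain, so the expected formula score for guessed items
    is zero (unbiased), at the cost of added score variance."""
    return right - wrong / (k - 1)
```

For example, 30 right and 10 wrong on four-option items gives a formula score of 30 − 10/3 ≈ 26.67 versus a number-right score of 30.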
Peer reviewed: Harris, Deborah J.; Kolen, Michael J. – Educational and Psychological Measurement, 1988
Three methods of estimating point-biserial correlation coefficient standard errors were compared: (1) assuming normality; (2) not assuming normality; and (3) bootstrapping. Although errors estimated assuming normality were biased, such estimates were less variable and easier to compute, suggesting that this might be the method of choice in some…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Analysis, Statistical Analysis
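The bootstrap approach in this comparison resamples examinees and recomputes the point-biserial each time; the standard deviation of the resampled coefficients estimates the standard error. A small self-contained sketch (illustrative only, not the authors' code):

```python
import random
import statistics

def point_biserial(xs, ys):
    """Point-biserial correlation between a 0/1 item score (xs)
    and a continuous criterion (ys)."""
    n = len(xs)
    sd_y = statistics.pstdev(ys)
    p = sum(xs) / n
    mean1 = statistics.fmean([y for x, y in zip(xs, ys) if x == 1])
    mean0 = statistics.fmean([y for x, y in zip(xs, ys) if x == 0])
    return (mean1 - mean0) / sd_y * (p * (1 - p)) ** 0.5

def bootstrap_se(xs, ys, reps=1000, seed=0):
    """Bootstrap standard error: resample (x, y) pairs with replacement
    and take the SD of the recomputed coefficients."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if 0 < sum(bx) < n:  # skip degenerate resamples with one group empty
            stats.append(point_biserial(bx, by))
    return statistics.stdev(stats)
```

The bootstrap avoids the normality assumption at the cost of computation, which is exactly the trade-off the abstract weighs.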
Peer reviewed: Prien, Erich P.; Hughes, Garry L. – Personnel Psychology, 1987
Reports research that used 18,814 performance records to generate error statistics using the Mixed Standard Scale evaluation format. The reduction of system error rates following revision guided by statistical analysis is a clear indication that the quality control features of this format are useful in the management of performance evaluation…
Descriptors: Error of Measurement, Evaluation, Evaluation Methods, Personnel Evaluation
Peer reviewed: Huynh, Huynh – Psychometrika, 1986
Under the assumption of normality, a formula is derived for the reliability of the maximum score. It is shown that the maximum score is more reliable than each of the single observations but less reliable than their composite score. (Author/LMO)
Descriptors: Error of Measurement, Mathematical Models, Reliability, Scores
Peer reviewed: Chambers, William V. – Social Behavior and Personality, 1985
Personal construct psychologists have suggested various psychological functions explain differences in the stability of constructs. Among these functions are constellatory and loose construction. This paper argues that measurement error is a more parsimonious explanation of the differences in construct stability reported in these studies. (Author)
Descriptors: Error of Measurement, Test Construction, Test Format, Test Reliability
Peer reviewed: Knight, Robert G. – Journal of Consulting and Clinical Psychology, 1983
Discusses the significance of confidence intervals around IQ scores based on a misleading interpretation of the standard error of measurement terms provided in the Wechsler Adult Intelligence Scale-Revised (WAIS-R) manual. Presents standard error values and a table for determining the abnormality of verbal and performance IQ discrepancies.…
Descriptors: Error of Measurement, Foreign Countries, Intelligence Tests, Test Interpretation
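The interval construction at issue can be sketched with the textbook standard error of measurement. The reliability value below is illustrative, not taken from the WAIS-R manual, and the naive interval around the observed score is exactly the kind of interpretation the article scrutinizes:

```python
def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - r_xx)."""
    return sd * (1 - reliability) ** 0.5

def naive_confidence_interval(observed, sd, reliability, z=1.96):
    """Naive interval centered on the observed score. Centering on the
    observed score (rather than an estimated true score regressed toward
    the mean) is one of the misleading interpretations discussed."""
    e = sem(sd, reliability)
    return observed - z * e, observed + z * e
```

With the conventional IQ metric (SD = 15) and an illustrative reliability of .96, the SEM is 3 points and the naive 95% interval around a score of 100 runs from about 94 to 106.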
McCollum, Janet; Thompson, Bruce – Online Submission, 1980
Response error refers to the tendency to respond to items based on the perceived social desirability or undesirability of given responses. Response error can be particularly problematic when all or most of the items on a measure are extremely attractive or unattractive. The present paper proposes a method of (a) distinguishing among preferences…
Descriptors: Methods, Response Style (Tests), Social Desirability, Reliability
Interval Estimation for True Scores under Various Scale Transformations. ACT Research Report Series.
Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – 2002
This paper reviews various procedures for constructing an interval for an individual's true score given the assumption that errors of measurement are distributed as binomial. This paper also presents two general interval estimation procedures (i.e., normal approximation and endpoints conversion methods) for an individual's true scale score;…
Descriptors: Bayesian Statistics, Error of Measurement, Estimation (Mathematics), Scaling
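Of the two general procedures named, the normal approximation is easy to illustrate: under the binomial error model the proportion-correct true score is bracketed by a Wald-type interval. A minimal sketch (illustrative, not the authors' procedure in detail; the endpoints-conversion step to scale scores is omitted):

```python
def normal_approx_interval(raw_score, n_items, z=1.96):
    """Normal approximation under the binomial error model:
    p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n), clipped to [0, 1]."""
    p = raw_score / n_items
    half = z * (p * (1 - p) / n_items) ** 0.5
    return max(0.0, p - half), min(1.0, p + half)
```

For a raw score of 30 on 40 items this gives roughly (0.62, 0.88) for the proportion-correct true score; the endpoints-conversion method would then map both endpoints through the raw-to-scale transformation.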
Rudner, Lawrence M.; Schafer, William D. – 2001
This digest discusses sources of error in testing, several approaches to estimating reliability, and several ways to increase test reliability. Reliability has been defined in different ways by different authors, but the best way to look at reliability may be the extent to which measurements resulting from a test are characteristics of those being…
Descriptors: Educational Testing, Error of Measurement, Reliability, Scores
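Among the reliability-estimation approaches such a digest typically covers, internal consistency via coefficient alpha is the most common. A self-contained sketch (illustrative; the digest itself is not quoted here):

```python
import statistics

def cronbach_alpha(item_scores):
    """Coefficient alpha for item_scores: a list of per-person lists,
    one score per item. alpha = k/(k-1) * (1 - sum(item vars)/total var)."""
    k = len(item_scores[0])
    item_vars = [
        statistics.pvariance([person[j] for person in item_scores])
        for j in range(k)
    ]
    totals = [sum(person) for person in item_scores]
    total_var = statistics.pvariance(totals)
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Adding more items that measure the same trait, or removing items that correlate poorly with the rest, are the standard ways to raise this figure, in line with the digest's theme of increasing test reliability.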
Tsai, Tsung-Hsun – 1997
The primary objective of this study was to find the smallest sample size for which equating based on a random groups design could be expected to result in less overall equating error than had no equating been conducted. Mean, linear, and equipercentile equating methods were considered. Some of the analyses presented in this paper assumed that the…
Descriptors: Equated Scores, Error of Measurement, Estimation (Mathematics), Sample Size
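Two of the three equating methods compared are one-line transformations. A minimal sketch of mean and linear equating for a random groups design (equipercentile equating, which matches entire score distributions, is omitted; the parameter names are illustrative):

```python
def mean_equate(x, mean_x, mean_y):
    """Mean equating: shift Form X scores so the group means match."""
    return x + (mean_y - mean_x)

def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Linear equating: match both mean and standard deviation,
    mapping x to the Form Y score with the same z-score."""
    return mean_y + (sd_y / sd_x) * (x - mean_x)
```

The study's question is when the sampling error in estimating these means and SDs from small random groups outweighs the systematic error of not equating at all.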
Li, Yuan H.; Schafer, William D. – 2002
An empirical study of the Yen (W. Yen, 1997) analytic formula for the standard error of a percent-above-cut [SE(PAC)] was conducted. This formula was derived from variance component information gathered in the context of generalizability theory. SE(PAC)s were estimated by different methods of estimating variance components (e.g., W. Yens…
Descriptors: Cutting Scores, Error of Measurement, Generalizability Theory, Simulation
Dimitrov, Dimiter M. – 2002
Exact formulas for classical error variance are provided for Rasch measurement with logistic distributions. An approximation formula with the normal ability distribution is also provided. With the proposed formulas, the additive contribution of individual items to the population error variance can be determined without knowledge of the other test…
Descriptors: Ability, Error of Measurement, Item Response Theory, Test Items
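The additive per-item structure the abstract describes can be illustrated with the standard Rasch information function: each item contributes p(1 − p) to the test information at a given ability, independently of the other items, and the conditional standard error is the reciprocal square root of the total. A sketch under those standard IRT conventions (not the paper's exact formulas, which concern population error variance over an ability distribution):

```python
import math

def rasch_prob(theta, b):
    """Rasch model probability of a correct response at ability theta
    for an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def rasch_conditional_sem(theta, item_difficulties):
    """Conditional standard error at theta: 1 / sqrt(test information),
    where each item adds p * (1 - p) to the information."""
    info = sum(
        p * (1 - p)
        for p in (rasch_prob(theta, b) for b in item_difficulties)
    )
    return 1.0 / math.sqrt(info)
```

Because the contributions are additive, an item's effect on measurement error can be assessed without reference to the other items on the test, which is the practical point of the abstract.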
Lee, Guemin – 1999
Previous studies have indicated that the reliability of test scores composed of testlets is overestimated by conventional item-based reliability estimation methods (S. Sireci, D. Thissen, and H. Wainer, 1991; H. Wainer, 1995; H. Wainer and D. Thissen, 1996; G. Lee and D. Frisbie). In light of these studies, it seems reasonable to ask whether the…
Descriptors: Definitions, Error of Measurement, Estimation (Mathematics), Reliability
Lee, Guemin – 1998
The primary purpose of this study was to investigate the appropriateness and implication of incorporating a testlet definition into the estimation of the conditional standard error of measurement (SEM) for tests composed of testlets. The five conditional SEM estimation methods used in this study were classified into two categories: item-based and…
Descriptors: Definitions, Error of Measurement, Estimation (Mathematics), Reliability


