Wind, Stefanie A. – Educational and Psychological Measurement, 2023
Rating scale analysis techniques provide researchers with practical tools for examining the degree to which ordinal rating scales (e.g., Likert-type scales or performance assessment rating scales) function in psychometrically useful ways. When rating scales function as expected, researchers can interpret ratings in the intended direction (i.e.,…
Descriptors: Rating Scales, Testing Problems, Item Response Theory, Models
von Davier, Matthias; Bezirhan, Ummugul – Educational and Psychological Measurement, 2023
Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population…
Descriptors: Robustness (Statistics), Test Items, Item Analysis, Goodness of Fit
Sinharay, Sandip – Educational and Psychological Measurement, 2022
Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores and hence to incomplete data on mastery tests such as the AP and U.S. Medical Licensing examinations. Investigators are often interested in estimating the probabilities of passing of the examinees with incomplete data on mastery tests.…
Descriptors: Mastery Tests, Computer Assisted Testing, Probability, Test Wiseness
Lee, Chansoon; Qian, Hong – Educational and Psychological Measurement, 2022
Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential…
Descriptors: Computer Assisted Testing, Adaptive Testing, Licensing Examinations (Professions), Item Response Theory
Schweizer, Karl; Reiß, Siegbert; Troche, Stefan – Educational and Psychological Measurement, 2019
The article reports three simulation studies conducted to find out whether the effect of a time limit for testing impairs model fit in investigations of structural validity, whether the representation of the assumed source of the effect prevents impairment of model fit and whether it is possible to identify and discriminate this method effect from…
Descriptors: Timed Tests, Testing, Barriers, Testing Problems
Miller, Jeff – Educational and Psychological Measurement, 2017
Critics of null hypothesis significance testing suggest that (a) its basic logic is invalid and (b) it addresses a question that is of no interest. In contrast to (a), I argue that the underlying logic of hypothesis testing is actually extremely straightforward and compelling. To substantiate that, I present examples showing that hypothesis…
Descriptors: Hypothesis Testing, Testing Problems, Test Validity, Relevance (Education)
Sinharay, Sandip; Johnson, Matthew S. – Educational and Psychological Measurement, 2017
In a pioneering research article, Wollack and colleagues suggested the "erasure detection index" (EDI) to detect test tampering. The EDI can be used with or without a continuity correction and is assumed to follow the standard normal distribution under the null hypothesis of no test tampering. When used without a continuity correction,…
Descriptors: Deception, Identification, Testing Problems, Error of Measurement
Robitzsch, Alexander; Rupp, Andre A. – Educational and Psychological Measurement, 2009
This article describes the results of a simulation study to investigate the impact of missing data on the detection of differential item functioning (DIF). Specifically, it investigates how four methods for dealing with missing data (listwise deletion, zero imputation, two-way imputation, response function imputation) interact with two methods of…
Descriptors: Test Bias, Simulation, Interaction, Effect Size
Burnett, J. Dale – Educational and Psychological Measurement, 1974 (peer reviewed)
The general use of the Spearman-Brown formula for calculating the reliability of parallel tests with different lengths is reviewed. The importance of the assumption that the component tests be parallel is noted and the property that parallel tests must be non-negatively correlated is derived. (Author)
Descriptors: Statistical Analysis, Test Reliability, Testing Problems
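For orientation, the standard Spearman-Brown prophecy formula mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the textbook formula, not code from the article; the function name `spearman_brown` and its parameter names are assumptions made for this example.

```python
def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability of a test lengthened by a factor k.

    r: reliability (correlation between parallel components) of the original test
    k: ratio of new test length to original test length (hypothetical parameter name)
    """
    if k <= 0:
        raise ValueError("length factor k must be positive")
    denominator = 1 + (k - 1) * r
    if denominator == 0:
        raise ValueError("formula is undefined when 1 + (k - 1) * r equals 0")
    return k * r / denominator


# Doubling a test whose parallel halves correlate 0.5
doubled = spearman_brown(0.5, 2)

# A negative correlation between the halves produces a negative "reliability",
# which illustrates why the parallel-tests assumption matters.
negative_case = spearman_brown(-0.2, 2)
```

The negative-correlation case is included only to show how the formula behaves when its assumptions are violated; the derivation itself is in the cited article.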
Barcikowski, Robert S. – Educational and Psychological Measurement, 1974 (peer reviewed)
Descriptors: Error of Measurement, Item Sampling, Testing Problems
Wagner, Edwin E.; Hoover, Thomas O. – Educational and Psychological Measurement, 1974 (peer reviewed)
Descriptors: Affective Measures, Serial Ordering, Test Bias, Testing Problems
Green, Samuel B. – Educational and Psychological Measurement, 1981 (peer reviewed)
The proportion of agreement, G, and kappa indexes are shown to differ in how they correct for chance agreements between two observers. On the basis of the findings, it is suggested that no single agreement index is appropriate for all sets of data. (Author/BW)
Descriptors: Comparative Analysis, Measurement Techniques, Test Reliability, Testing Problems
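To make the contrast in the abstract above concrete, the following sketch computes the three indexes for a 2x2 agreement table between two observers. It assumes the common two-category definitions: proportion of agreement Po, the G index as 2Po - 1, and Cohen's kappa as (Po - Pe) / (1 - Pe). Whether these match the exact formulations used in the article is an assumption; the function name `agreement_indexes` and the cell labels `a`, `b`, `c`, `d` are hypothetical.

```python
def agreement_indexes(a: int, b: int, c: int, d: int) -> dict:
    """Agreement indexes for two observers and two categories.

    a: both observers chose category 1
    b: observer 1 chose category 1, observer 2 chose category 2
    c: observer 1 chose category 2, observer 2 chose category 1
    d: both observers chose category 2
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("table must contain at least one observation")

    # Raw proportion of agreement (no chance correction)
    p_o = (a + d) / n

    # G index: assumes chance agreement is fixed at 1/2 for two categories
    g = 2 * p_o - 1

    # Kappa: chance agreement estimated from each observer's marginal totals
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when expected agreement equals 1")
    kappa = (p_o - p_e) / (1 - p_e)

    return {"proportion": p_o, "G": g, "kappa": kappa}


result = agreement_indexes(40, 5, 5, 50)
```

The two chance-corrected indexes differ only in how chance agreement is estimated, so they diverge most when the observers' marginal totals are far from an even split.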
Cziko, Gary A. – Educational and Psychological Measurement, 1984 (peer reviewed)
Some problems associated with the criteria of reproducibility and scalability as they are used in Guttman scalogram analysis to evaluate cumulative, nonparametric scales of dichotomous items are discussed. A computer program is presented which analyzes response patterns elicited by dichotomous scales designed to be cumulative. (Author/DWH)
Descriptors: Scaling, Statistical Analysis, Test Construction, Test Items
Malloy, Thomas E.; Copeland, Ellis P. – Educational and Psychological Measurement, 1980 (peer reviewed)
The Fundamental Interpersonal Relations Orientation Behavior (FIRO-B) scale is a measure of inclusion, control, and affection. Examination of the component algorithms that yield its global compatibility score suggests an inconsistent use of absolute values and real numbers. A modification of Schutz's original mathematical schema is presented.…
Descriptors: Behavior Rating Scales, Interpersonal Relationship, Mathematical Formulas, Testing Problems
Miley, Alan D. – Educational and Psychological Measurement, 1980 (peer reviewed)
The tendency to extreme scores (TES) can affect sensitive indices, such as Cattell's coefficient of pattern similarity, so that a flat profile will, in general, be found more similar to a standard than will an extreme profile. TES is especially critical when profile matching is used in clinical diagnosis. (Author/BW)
Descriptors: Clinical Diagnosis, Profiles, Statistical Analysis, Test Interpretation

