NotesFAQContact Us
Collection
Advanced
Search Tips
Source
Educational and Psychological…97
Education Level
High Schools1
Audience
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 97 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Wind, Stefanie A. – Educational and Psychological Measurement, 2023
Rating scale analysis techniques provide researchers with practical tools for examining the degree to which ordinal rating scales (e.g., Likert-type scales or performance assessment rating scales) function in psychometrically useful ways. When rating scales function as expected, researchers can interpret ratings in the intended direction (i.e.,…
Descriptors: Rating Scales, Testing Problems, Item Response Theory, Models
Peer reviewed Peer reviewed
Direct linkDirect link
von Davier, Matthias; Bezirhan, Ummugul – Educational and Psychological Measurement, 2023
Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population…
Descriptors: Robustness (Statistics), Test Items, Item Analysis, Goodness of Fit
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip – Educational and Psychological Measurement, 2022
Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores and hence to incomplete data on mastery tests such as the AP and U.S. Medical Licensing examinations. Investigators are often interested in estimating the probabilities of passing of the examinees with incomplete data on mastery tests.…
Descriptors: Mastery Tests, Computer Assisted Testing, Probability, Test Wiseness
Peer reviewed Peer reviewed
Direct linkDirect link
Lee, Chansoon; Qian, Hong – Educational and Psychological Measurement, 2022
Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential…
Descriptors: Computer Assisted Testing, Adaptive Testing, Licensing Examinations (Professions), Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Schweizer, Karl; Reiß, Siegbert; Troche, Stefan – Educational and Psychological Measurement, 2019
The article reports three simulation studies conducted to find out whether the effect of a time limit for testing impairs model fit in investigations of structural validity, whether the representation of the assumed source of the effect prevents impairment of model fit and whether it is possible to identify and discriminate this method effect from…
Descriptors: Timed Tests, Testing, Barriers, Testing Problems
Peer reviewed Peer reviewed
Direct linkDirect link
Miller, Jeff – Educational and Psychological Measurement, 2017
Critics of null hypothesis significance testing suggest that (a) its basic logic is invalid and (b) it addresses a question that is of no interest. In contrast to (a), I argue that the underlying logic of hypothesis testing is actually extremely straightforward and compelling. To substantiate that, I present examples showing that hypothesis…
Descriptors: Hypothesis Testing, Testing Problems, Test Validity, Relevance (Education)
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip; Johnson, Matthew S. – Educational and Psychological Measurement, 2017
In a pioneering research article, Wollack and colleagues suggested the "erasure detection index" (EDI) to detect test tampering. The EDI can be used with or without a continuity correction and is assumed to follow the standard normal distribution under the null hypothesis of no test tampering. When used without a continuity correction,…
Descriptors: Deception, Identification, Testing Problems, Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Robitzsch, Alexander; Rupp, Andre A. – Educational and Psychological Measurement, 2009
This article describes the results of a simulation study to investigate the impact of missing data on the detection of differential item functioning (DIF). Specifically, it investigates how four methods for dealing with missing data (listwise deletion, zero imputation, two-way imputation, response function imputation) interact with two methods of…
Descriptors: Test Bias, Simulation, Interaction, Effect Size
Peer reviewed Peer reviewed
Burnett, J. Dale – Educational and Psychological Measurement, 1974
The general use of the Spearman-Brown formula for calculating the reliability of parallel tests with different lengths is reviewed. The importance of the assumption that the component tests be parallel is noted and the property that parallel tests must be non-negatively correlated is derived. (Author)
Descriptors: Statistical Analysis, Test Reliability, Testing Problems
Peer reviewed Peer reviewed
Barcikowski, Robert S. – Educational and Psychological Measurement, 1974
Descriptors: Error of Measurement, Item Sampling, Testing Problems
Peer reviewed Peer reviewed
Wagner, Edwin E.; Hoover, Thomas O. – Educational and Psychological Measurement, 1974
Descriptors: Affective Measures, Serial Ordering, Test Bias, Testing Problems
Peer reviewed Peer reviewed
Green, Samuel B. – Educational and Psychological Measurement, 1981
The proportion of agreement, G, and kappa indexes are shown to differ in how they correct for chance agreements between two observers. On the basis of the findings, it is suggested that no single agreement index is appropriate for all sets of data. (Author/BW)
Descriptors: Comparative Analysis, Measurement Techniques, Test Reliability, Testing Problems
Peer reviewed Peer reviewed
Cziko, Gary A. – Educational and Psychological Measurement, 1984
Some problems associated with the criteria of reproducibility and scalability as they are used in Guttman scalogram analysis to evaluate cumulative, nonparametric scales of dichotomous items are discussed. A computer program is presented which analyzes response patterns elicited by dichotomous scales designed to be cumulative. (Author/DWH)
Descriptors: Scaling, Statistical Analysis, Test Construction, Test Items
Peer reviewed Peer reviewed
Malloy, Thomas E.; Copeland, Ellis P. – Educational and Psychological Measurement, 1980
The Fundamental Interpersonal Relations Orientation Behavior (FIRO-B) scale is a measure of inclusion, control and affection. Examination of the component algorithms which yield its global compatibility score suggest an inconsistent use of absolute values and real numbers. A modification of Schutz's original mathematical schema is presented.…
Descriptors: Behavior Rating Scales, Interpersonal Relationship, Mathematical Formulas, Testing Problems
Peer reviewed Peer reviewed
Miley, Alan D. – Educational and Psychological Measurement, 1980
The tendency to extreme scores (TES) can affect sensitive indices, such as Cattell's coefficient of pattern similarity, so that a flat profile will, in general, be found more similar to a standard than will an extreme profile. TES is especially critical when profile matching is used in clinical diagnosis. (Author/BW)
Descriptors: Clinical Diagnosis, Profiles, Statistical Analysis, Test Interpretation
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7