ERIC - Search Results

Showing 1 to 15 of 97 results

Detecting Rating Scale Malfunctioning with the Partial Credit Model and Generalized Partial Credit Model

Peer reviewed

Wind, Stefanie A. – Educational and Psychological Measurement, 2023

Rating scale analysis techniques provide researchers with practical tools for examining the degree to which ordinal rating scales (e.g., Likert-type scales or performance assessment rating scales) function in psychometrically useful ways. When rating scales function as expected, researchers can interpret ratings in the intended direction (i.e.,…

Descriptors: Rating Scales, Testing Problems, Item Response Theory, Models

A Robust Method for Detecting Item Misfit in Large-Scale Assessments

Peer reviewed

von Davier, Matthias; Bezirhan, Ummugul – Educational and Psychological Measurement, 2023

Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population…

Descriptors: Robustness (Statistics), Test Items, Item Analysis, Goodness of Fit

Estimating Probabilities of Passing for Examinees with Incomplete Data in Mastery Tests

Peer reviewed

Sinharay, Sandip – Educational and Psychological Measurement, 2022

Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores and hence to incomplete data on mastery tests such as the AP and U.S. Medical Licensing examinations. Investigators are often interested in estimating the probabilities of passing of the examinees with incomplete data on mastery tests.…

Descriptors: Mastery Tests, Computer Assisted Testing, Probability, Test Wiseness

Hybrid Threshold-Based Sequential Procedures for Detecting Compromised Items in a Computerized Adaptive Testing Licensure Exam

Peer reviewed

Lee, Chansoon; Qian, Hong – Educational and Psychological Measurement, 2022

Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential…

Descriptors: Computer Assisted Testing, Adaptive Testing, Licensing Examinations (Professions), Item Response Theory

Does the Effect of a Time Limit for Testing Impair Structural Investigations by Means of Confirmatory Factor Models?

Peer reviewed

Schweizer, Karl; Reiß, Siegbert; Troche, Stefan – Educational and Psychological Measurement, 2019

The article reports three simulation studies conducted to find out whether the effect of a time limit for testing impairs model fit in investigations of structural validity, whether the representation of the assumed source of the effect prevents impairment of model fit and whether it is possible to identify and discriminate this method effect from…

Descriptors: Timed Tests, Testing, Barriers, Testing Problems

Hypothesis Testing in the Real World

Peer reviewed

Miller, Jeff – Educational and Psychological Measurement, 2017

Critics of null hypothesis significance testing suggest that (a) its basic logic is invalid and (b) it addresses a question that is of no interest. In contrast to (a), I argue that the underlying logic of hypothesis testing is actually extremely straightforward and compelling. To substantiate that, I present examples showing that hypothesis…

Descriptors: Hypothesis Testing, Testing Problems, Test Validity, Relevance (Education)

Three New Methods for Analysis of Answer Changes

Peer reviewed

Sinharay, Sandip; Johnson, Matthew S. – Educational and Psychological Measurement, 2017

In a pioneering research article, Wollack and colleagues suggested the "erasure detection index" (EDI) to detect test tampering. The EDI can be used with or without a continuity correction and is assumed to follow the standard normal distribution under the null hypothesis of no test tampering. When used without a continuity correction,…

Descriptors: Deception, Identification, Testing Problems, Error of Measurement

Impact of Missing Data on the Detection of Differential Item Functioning: The Case of Mantel-Haenszel and Logistic Regression Analysis

Peer reviewed

Robitzsch, Alexander; Rupp, Andre A. – Educational and Psychological Measurement, 2009

This article describes the results of a simulation study to investigate the impact of missing data on the detection of differential item functioning (DIF). Specifically, it investigates how four methods for dealing with missing data (listwise deletion, zero imputation, two-way imputation, response function imputation) interact with two methods of…

Descriptors: Test Bias, Simulation, Interaction, Effect Size

Parallel Measurements and the Spearman-Brown Formula

Peer reviewed

Burnett, J. Dale – Educational and Psychological Measurement, 1974

The general use of the Spearman-Brown formula for calculating the reliability of parallel tests with different lengths is reviewed. The importance of the assumption that the component tests be parallel is noted and the property that parallel tests must be non-negatively correlated is derived. (Author)

Descriptors: Statistical Analysis, Test Reliability, Testing Problems
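
As an illustration of the formula this entry refers to, here is a minimal Python sketch of the standard Spearman-Brown projection, r_k = k*r / (1 + (k - 1)*r), where r is the reliability of the original test and k is the factor by which it is lengthened with parallel components. The function name and the input checks are assumptions made for this example; they are not taken from the article.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability of a test lengthened by `length_factor`.

    Assumes every added component is parallel to the original test
    (hypothetical helper written for illustration only).
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    denominator = 1.0 + (length_factor - 1.0) * reliability
    if denominator <= 0:
        # Happens only for sufficiently negative correlations,
        # where the projection is not meaningful.
        raise ValueError("formula is undefined for this input")
    return length_factor * reliability / denominator
```

For example, `spearman_brown(0.5, 2)` projects the reliability of a test doubled in length from a half-test reliability of 0.5. The projection is trustworthy only to the extent that the parallel-tests assumption noted in the abstract holds.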

The Effects of Item Discrimination on the Standard Errors of Estimate Associated with Item-Examinee Sampling Procedures

Peer reviewed

Barcikowski, Robert S. – Educational and Psychological Measurement, 1974

Descriptors: Error of Measurement, Item Sampling, Testing Problems

The Effect of Serial Position on Ranking Error

Peer reviewed

Wagner, Edwin E.; Hoover, Thomas O. – Educational and Psychological Measurement, 1974

Descriptors: Affective Measures, Serial Ordering, Test Bias, Testing Problems

A Comparison of Three Indexes of Agreement between Observers: Proportion of Agreement, G-Index, and Kappa.

Peer reviewed

Green, Samuel B. – Educational and Psychological Measurement, 1981

The proportion of agreement, G, and kappa indexes are shown to differ in how they correct for chance agreements between two observers. On the basis of the findings, it is suggested that no single agreement index is appropriate for all sets of data. (Author/BW)

Descriptors: Comparative Analysis, Measurement Techniques, Test Reliability, Testing Problems
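
To make the difference in chance correction concrete, the following Python sketch computes the three indexes for two observers using their commonly cited definitions: the proportion of agreement p_o, the G-index (p_o - 1/k) / (1 - 1/k) for k categories, and Cohen's kappa (p_o - p_e) / (1 - p_e), where p_e comes from the observers' marginal proportions. The function name, argument names, and sample data are assumptions for this example, and the article itself may define the indexes differently.

```python
from collections import Counter


def agreement_indexes(ratings_a, ratings_b, n_categories=None):
    """Return proportion of agreement, G-index, and kappa for two observers.

    Hypothetical helper for illustration; `n_categories` defaults to the
    number of distinct categories actually observed.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both observers must rate the same non-empty set")
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    k = n_categories if n_categories is not None else len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    # Observed agreement: share of cases rated identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement for kappa: product of the marginal proportions.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum((count_a[c] / n) * (count_b[c] / n) for c in categories)

    # G-index assumes chance agreement of 1/k regardless of the marginals.
    g_index = (p_o - 1.0 / k) / (1.0 - 1.0 / k)
    kappa = (p_o - p_e) / (1.0 - p_e) if p_e < 1.0 else float("nan")
    return {"proportion": p_o, "g_index": g_index, "kappa": kappa}


# Ten cases with skewed marginals (made-up data).
observer_a = [1, 1, 1, 1, 1, 1, 1, 0, 0, 1]
observer_b = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]
result = agreement_indexes(observer_a, observer_b)
```

With these made-up ratings the observers agree on 8 of 10 cases, yet the G-index and kappa diverge because kappa's chance term depends on the skewed marginals while the G-index fixes it at 1/k.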

An Improvement over Guttman Scalogram Analysis: A Computer Program for Evaluating Cumulative, Nonparametric Scales of Dichotomous Items.

Peer reviewed

Cziko, Gary A. – Educational and Psychological Measurement, 1984

Some problems associated with the criteria of reproducibility and scalability as they are used in Guttman scalogram analysis to evaluate cumulative, nonparametric scales of dichotomous items are discussed. A computer program is presented which analyzes response patterns elicited by dichotomous scales designed to be cumulative. (Author/DWH)

Descriptors: Scaling, Statistical Analysis, Test Construction, Test Items

Firo-B Interpersonal Compatibility: A Suggested Modification.

Peer reviewed

Malloy, Thomas E.; Copeland, Ellis P. – Educational and Psychological Measurement, 1980

The Fundamental Interpersonal Relations Orientation Behavior (FIRO-B) scale is a measure of inclusion, control and affection. Examination of the component algorithms which yield its global compatibility score suggests an inconsistent use of absolute values and real numbers. A modification of Schutz's original mathematical schema is presented.…

Descriptors: Behavior Rating Scales, Interpersonal Relationship, Mathematical Formulas, Testing Problems

Individual-to-Group Profile Comparisons by [Coefficient of Pattern Similarity]: Elevation, Scatter, and Extreme Scores.

Peer reviewed

Miley, Alan D. – Educational and Psychological Measurement, 1980

The tendency to extreme scores (TES) can affect sensitive indices, such as Cattell's coefficient of pattern similarity, so that a flat profile will, in general, be found more similar to a standard than will an extreme profile. TES is especially critical when profile matching is used in clinical diagnosis. (Author/BW)

Descriptors: Clinical Diagnosis, Profiles, Statistical Analysis, Test Interpretation
