Showing 1 to 15 of 18 results
Peer reviewed
Shaw, Dale G.; And Others – Journal of Educational Measurement, 1987
Information loss occurs when continuous data are grouped into discrete intervals. Squared correlation coefficients between continuous data and the corresponding grouped data were calculated for four population distributions, and the effects of population distribution, number of intervals, and interval width on information loss and recovery were…
Descriptors: Intervals, Rating Scales, Sampling, Scaling
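The abstract does not show the computation, but a minimal sketch of the quantity under study follows (the function name and equal-width binning are illustrative assumptions, not the authors' procedure): a continuous sample is replaced by interval midpoints and the squared correlation with the original values measures how much information the grouping retains.

```python
import numpy as np

def grouped_r2(x, n_intervals):
    """Squared correlation between continuous data and a grouped version.

    Values are binned into equal-width intervals and replaced by the
    interval midpoints; r^2 then indicates how much information the
    grouping retains (1.0 = no loss).
    """
    edges = np.linspace(x.min(), x.max(), n_intervals + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_intervals - 1)
    midpoints = (edges[:-1] + edges[1:]) / 2
    r = np.corrcoef(x, midpoints[idx])[0, 1]
    return r ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)  # one of several possible population shapes
for k in (3, 5, 7, 15):
    print(k, round(grouped_r2(x, k), 4))  # r^2 approaches 1 as intervals increase
```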
Peer reviewed
Wainer, Howard – Journal of Educational Measurement, 1986
An example demonstrates and explains that summary statistics commonly used to measure test quality can be seriously misleading and that summary statistics for the whole test are not sufficient for judging the quality of the test. (Author/LMO)
Descriptors: Correlation, Item Analysis, Statistical Bias, Statistical Studies
Peer reviewed
Burket, George R. – Journal of Educational Measurement, 1987
This response to the Baglin paper (1986) points out the fallacy in inferring that inappropriate scaling procedures cause apparent discrepancies between medians and means and between means calculated using different units. (LMO)
Descriptors: Norm Referenced Tests, Scaling, Scoring, Statistical Distributions
Peer reviewed
McMorris, Robert F. – Journal of Educational Measurement, 1972
Approximations were compared with exact statistics obtained on 85 different classroom tests constructed and administered by professors in a variety of fields; the means and standard deviations of the resulting differences supported the use of approximations in practical situations. (Author)
Descriptors: Error of Measurement, Measurement Instruments, Reliability, Statistical Analysis
Peer reviewed
Frisbie, David A.; Druva, Cynthia A. – Journal of Educational Measurement, 1986
This study was designed to examine the level of dependence within multiple true-false test-item clusters by computing sets of item correlations with data from a test composed of both multiple true-false and multiple-choice items. (Author/LMO)
Descriptors: Cluster Analysis, Correlation, Higher Education, Multiple Choice Tests
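As a rough sketch of the kind of comparison described (the within-cluster versus between-cluster contrast is inferred from the abstract; names and data layout are hypothetical):

```python
import numpy as np

def cluster_dependence(responses, cluster_of):
    """Compare mean inter-item correlations within and between item clusters.

    responses  : examinees x items matrix of scored (0/1) responses
    cluster_of : cluster label for each item (e.g., the multiple
                 true-false stem it belongs to); higher within- than
                 between-cluster correlation indicates dependence
                 inside clusters
    """
    R = np.corrcoef(responses, rowvar=False)  # item-by-item correlations
    n = R.shape[0]
    within, between = [], []
    for i in range(n):
        for j in range(i + 1, n):
            (within if cluster_of[i] == cluster_of[j] else between).append(R[i, j])
    return np.mean(within), np.mean(between)
```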
Peer reviewed
Yen, Wendy M. – Journal of Educational Measurement, 1986
Two methods of constructing equal-interval scales for educational achievement are discussed: Thurstone's absolute scaling method and Item Response Theory. Alternative criteria for choosing a scale are contrasted. It is argued that clearer criteria are needed for judging the appropriateness and usefulness of alternative scaling procedures…
Descriptors: Achievement Tests, Latent Trait Theory, Mathematical Models, Scaling
Peer reviewed
Clauser, Brian; And Others – Journal of Educational Measurement, 1994
The effect of reducing the number of score groups in the matching criterion of the Mantel-Haenszel procedure when screening for differential item functioning was investigated with a simulated data set. Results suggest that more than modest reductions cannot be recommended when ability distributions of reference and focal groups differ. (SLD)
Descriptors: Ability, Experimental Groups, Item Bias, Reference Groups
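For context, a minimal sketch of the Mantel-Haenszel common odds ratio with the matching criterion collapsed into a chosen number of score groups, the factor this study varies; the quantile binning and all names here are illustrative assumptions, not the authors' code.

```python
import numpy as np

def mh_odds_ratio(correct, is_ref, score, n_groups):
    """Mantel-Haenszel common odds ratio for one studied item.

    correct  : 0/1 responses to the studied item
    is_ref   : True for reference-group examinees, False for focal group
    score    : matching criterion (e.g., total test score)
    n_groups : number of score groups after collapsing the criterion
    """
    # Collapse the matching score into n_groups quantile-based strata.
    edges = np.quantile(score, np.linspace(0, 1, n_groups + 1))
    stratum = np.searchsorted(edges[1:-1], score, side="right")

    num = den = 0.0
    for g in range(n_groups):
        m = stratum == g
        A = np.sum(m & is_ref & (correct == 1))    # reference, right
        B = np.sum(m & is_ref & (correct == 0))    # reference, wrong
        C = np.sum(m & ~is_ref & (correct == 1))   # focal, right
        D = np.sum(m & ~is_ref & (correct == 0))   # focal, wrong
        N = A + B + C + D
        if N > 0:
            num += A * D / N
            den += B * C / N
    return num / den  # 1.0 = no DIF; ETS delta scale is -2.35 * ln(ratio)
```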
Peer reviewed
Garg, Rashmi; And Others – Journal of Educational Measurement, 1986
For the purpose of obtaining data to use in test development, multiple matrix sampling plans were compared to examinee sampling plans. Data were simulated for examinees, sampled from a population with a normal distribution of ability, responding to items selected from an item universe. (Author/LMO)
Descriptors: Difficulty Level, Monte Carlo Methods, Sampling, Statistical Studies
Peer reviewed
Angoff, William H.; Cowell, William R. – Journal of Educational Measurement, 1986
Linear conversions were developed relating scores on recent forms of the Graduate Record Examinations. Conversions based on specially selected subpopulations were compared with total-group conversions and evaluated. Conclusions indicated that the data clearly support the assumption of population independence for homogeneous tests, but not quite…
Descriptors: College Entrance Examinations, Equated Scores, Groups, Higher Education
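For readers unfamiliar with the technique, a minimal sketch of a linear conversion between two test forms (the standard mean-and-standard-deviation matching; names are illustrative). Whether the same line holds when fitted on different subpopulations is the population-independence question the study examines.

```python
import numpy as np

def linear_conversion(x_scores, y_scores):
    """Return a linear function mapping form-X scores onto the form-Y scale.

    Linear equating matches means and standard deviations:
        y(x) = mu_y + (sigma_y / sigma_x) * (x - mu_x)
    Fit it separately on subpopulations and compare slopes and
    intercepts to probe population independence.
    """
    mu_x, mu_y = np.mean(x_scores), np.mean(y_scores)
    slope = np.std(y_scores, ddof=1) / np.std(x_scores, ddof=1)
    return lambda x: mu_y + slope * (np.asarray(x) - mu_x)
```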
Peer reviewed
Sawyer, Richard – Journal of Educational Measurement, 1986
This study was designed to determine whether adjustments for the differential prediction observed among sex, racial/ethnic, or age subgroups in one freshman class at a college could be used to improve prediction accuracy for these subgroups in future freshman classes. (Author/LMO)
Descriptors: College Freshmen, Error of Measurement, Grade Prediction, Higher Education
Peer reviewed
Tate, Richard L.; King, F. J. – Journal of Educational Measurement, 1994
The precision of the group-based item-response theory (IRT) model applied to school ability estimation is described, assuming use of Bayesian estimation with precision represented by the standard deviation of the posterior distribution. Similarities with and differences between the school-based model and the individual-level IRT model are explored. (SLD)
Descriptors: Ability, Bayesian Statistics, Estimation (Mathematics), Item Response Theory
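The group-level IRT model in the paper is more elaborate than this, but a normal-normal stand-in (all quantities here are simplified assumptions) shows why the posterior standard deviation serves as the precision measure for a school's ability estimate:

```python
import numpy as np

def posterior_sd(prior_sd, obs_sd, n_students):
    """Posterior SD of a school mean under a normal-normal model.

    Simplified stand-in for the group-level IRT case: a normal prior
    on school ability plus n conditionally independent student
    measurements with error SD obs_sd. Precisions add, so
        1 / sd_post^2 = 1 / prior_sd^2 + n / obs_sd^2
    and precision improves as school size grows.
    """
    return 1.0 / np.sqrt(1.0 / prior_sd**2 + n_students / obs_sd**2)

print(posterior_sd(prior_sd=1.0, obs_sd=0.8, n_students=25))  # shrinks with n
```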
Peer reviewed
Phillips, S. E. – Journal of Educational Measurement, 1986
Rasch model equatings of multilevel achievement test data before and after the deletion of misfitting persons were compared. Rasch equatings were also compared with an equating obtained using the equipercentile method. No basis could be found in the results for choosing between the two Rasch equatings. (Author/LMO)
Descriptors: Achievement Tests, Equated Scores, Goodness of Fit, Guessing (Tests)
Peer reviewed
Nandakumar, Ratna – Journal of Educational Measurement, 1994
Using simulated and real data, this study compares the performance of three methodologies for assessing unidimensionality: (1) DIMTEST; (2) the approach of Holland and Rosenbaum; and (3) nonlinear factor analysis. All three methods correctly confirm unidimensionality, but they differ in their ability to detect the lack of unidimensionality…
Descriptors: Ability, Comparative Analysis, Evaluation Methods, Factor Analysis
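None of the three compared methods is reproduced here; as a crude informal screen for the same question (a stand-in, not DIMTEST, Holland-Rosenbaum, or nonlinear factor analysis), one can inspect the eigenvalues of the inter-item correlation matrix:

```python
import numpy as np

def eigenvalue_ratio(responses):
    """Ratio of first to second eigenvalue of the item correlation matrix.

    A dominant first eigenvalue (large ratio) is a common informal
    indication of a single dominant dimension; it is far weaker than
    the formal procedures compared in the study.
    """
    R = np.corrcoef(responses, rowvar=False)  # items as columns
    eig = np.sort(np.linalg.eigvalsh(R))[::-1]
    return eig[0] / eig[1]
```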
Peer reviewed
Lord, Frederic M. – Journal of Educational Measurement, 1984
Four methods are outlined for estimating or approximating from a single test administration the standard error of measurement of number-right test score at specified ability levels or cutting scores. The methods are illustrated and compared on one set of real test data. (Author)
Descriptors: Academic Ability, Cutting Scores, Error of Measurement, Scoring Formulas
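A well-known simple approximation in this literature is the binomial error model, under which the conditional SEM of a number-right score has a closed form; the sketch below shows that case for illustration and is not a reconstruction of the article's four methods.

```python
import numpy as np

def binomial_sem(x, n_items):
    """Conditional SEM of a number-right score under a binomial error model.

        SEM(x) = sqrt(x * (n - x) / (n - 1))

    Largest for mid-range scores and zero at the floor and ceiling;
    shown for illustration, not as the article's preferred method.
    """
    x = np.asarray(x, dtype=float)
    return np.sqrt(x * (n_items - x) / (n_items - 1))

print(binomial_sem([0, 10, 20, 30, 40], n_items=40))
```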
Peer reviewed
Huynh, Huynh; Ferrara, Steven – Journal of Educational Measurement, 1994
Equal percentile (EP) and partial credit (PC) equatings for raw scores from performance-based assessments with free-response items are compared through the use of data from the Maryland School Performance Assessment Program. Results suggest that EP and PC methods do not give equivalent results when distributions are markedly skewed. (SLD)
Descriptors: Comparative Analysis, Equated Scores, Mathematics Tests, Performance Based Assessment
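To make the comparison concrete, a bare-bones equal percentile conversion (names and details are illustrative; operational equatings add smoothing conventions omitted here): each form-X score maps to the form-Y score with the same percentile rank. With markedly skewed distributions this mapping is strongly nonlinear, which is where EP and PC results can part ways.

```python
import numpy as np

def equipercentile(x_scores, y_scores):
    """Map form-X raw scores to the form-Y scale by matching percentile ranks."""
    x_sorted = np.sort(x_scores)
    y_sorted = np.sort(y_scores)

    def convert(x):
        # Percentile rank of x in the X distribution...
        p = np.searchsorted(x_sorted, x, side="right") / len(x_sorted)
        # ...and the Y score at that same percentile.
        return np.quantile(y_sorted, np.clip(p, 0.0, 1.0))

    return convert
```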