Showing all 14 results
Peer reviewed
Castellano, Katherine E.; McCaffrey, Daniel F.; Lockwood, J. R. – Journal of Educational Measurement, 2023
The simple average of student growth scores is often used in accountability systems, but it can be problematic for decision making. When computed from a small or moderate number of students, it can be sensitive to the particular sample drawn, resulting in inaccurate representations of student growth, low year-to-year stability, and inequities for…
Descriptors: Academic Achievement, Accountability, Decision Making, Computation
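A minimal simulation sketch (not from the article; all numbers are hypothetical) of the instability the authors describe: the standard deviation of a school's mean growth score shrinks only with the square root of the number of students, so small schools get noisy averages.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical values: a school's true mean growth is 50 on an SGP-like
# scale, and individual student growth scores scatter widely around it.
true_mean, student_sd, n_reps = 50.0, 28.0, 10_000

for n_students in (10, 25, 100, 400):
    # Draw many hypothetical cohorts and compute each cohort's mean growth;
    # the spread of those means is the sampling instability of the average.
    means = rng.normal(true_mean, student_sd, size=(n_reps, n_students)).mean(axis=1)
    print(f"n={n_students:4d}: SD of school mean growth = {means.std():.2f}")
```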
Peer reviewed
Sinharay, Sandip – Journal of Educational Measurement, 2016
De la Torre and Deng suggested a resampling-based approach for person-fit assessment (PFA). The approach involves the use of the l_z statistic, a corrected expected a posteriori estimate of the examinee ability, and the Monte Carlo (MC) resampling method. The Type I error rate of the approach was closer to the nominal level…
Descriptors: Sampling, Research Methodology, Error Patterns, Monte Carlo Methods
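A rough sketch of the general Monte Carlo resampling idea behind this line of person-fit work, under a Rasch model with a plain log-likelihood statistic and a grid-search ability estimate; the approach the article discusses uses a corrected EAP estimate and the l_z statistic, so everything here is a simplified stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_correct(theta, b):
    """Rasch probability of a correct response to items with difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def log_lik(x, theta, b):
    """Response log-likelihood, used here as a crude person-fit statistic."""
    p = p_correct(theta, b)
    return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

def mle(x, b, grid=np.linspace(-4, 4, 161)):
    """Grid-search ability estimate (stand-in for the corrected EAP)."""
    return grid[np.argmax([log_lik(x, t, b) for t in grid])]

b = rng.normal(0, 1, size=40)                 # hypothetical item difficulties
x_obs = rng.binomial(1, p_correct(0.5, b))    # one examinee's responses

# Monte Carlo resampling: regenerate response vectors at the estimated
# ability, re-estimate ability each time, and build the reference
# distribution of the fit statistic, yielding a resampling-based p-value.
theta_hat = mle(x_obs, b)
obs = log_lik(x_obs, theta_hat, b)
ref = [log_lik(x_sim, mle(x_sim, b), b)
       for x_sim in (rng.binomial(1, p_correct(theta_hat, b)) for _ in range(500))]
p_value = np.mean(np.array(ref) <= obs)
print(f"theta_hat = {theta_hat:.2f}, resampling p-value = {p_value:.3f}")
```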
Peer reviewed
Zu, Jiyun; Puhan, Gautam – Journal of Educational Measurement, 2014
Preequating is in demand because it reduces score reporting time. In this article, we evaluated an observed-score preequating method, the empirical item characteristic curve (EICC) method, which makes preequating possible without item response theory (IRT). EICC preequating results were compared with a criterion equating and with IRT true-score…
Descriptors: Item Response Theory, Equated Scores, Item Analysis, Item Sampling
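A small illustration, on invented data, of what an empirical item characteristic curve is: the proportion correct on an item at each anchor-test score point, estimated directly from pretest data with no IRT model. The anchor design and smoothing details of the actual EICC method are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pretest data: 5,000 examinees answer one new item and also
# have a score on a 40-point anchor test.
n = 5000
anchor = rng.integers(0, 41, size=n)
true_p = 1 / (1 + np.exp(-(anchor - 20) / 5))   # data-generating model only
resp = rng.binomial(1, true_p)

# Empirical item characteristic curve: proportion correct at each anchor
# score point, estimated directly from the data.
eicc = np.array([resp[anchor == s].mean() if np.any(anchor == s) else np.nan
                 for s in range(41)])
print(np.round(eicc[::5], 2))

# Summing the EICCs of the items on a planned form gives that form's
# expected raw score at each anchor score -- the basis for preequating
# conversions before the form is ever administered.
```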
Peer reviewed
Kane, Michael T. – Journal of Educational Measurement, 2013
To validate an interpretation or use of test scores is to evaluate the plausibility of the claims based on the scores. An argument-based approach to validation suggests that the claims based on the test scores be outlined as an argument that specifies the inferences and supporting assumptions needed to get from test responses to score-based…
Descriptors: Test Interpretation, Validity, Scores, Test Use
Peer reviewed
Shavelson, Richard J.; Ruiz-Primo, Maria Araceli; Wiley, Edward W. – Journal of Educational Measurement, 1999
Reports a reanalysis of data collected in a person × task × occasion × (rater or method) G-study design (M. Ruiz-Primo and others, 1993), and brings this reanalysis to bear on the interpretation of task-sampling variability and the convergence of different performance-assessment methods. (SLD)
Descriptors: Performance Based Assessment, Sampling, Sciences
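A simplified sketch of how G-study variance components are estimated, using a one-facet person × task design rather than the full person × task × occasion design of the study; the components are recovered from ANOVA mean squares on simulated data with assumed true values.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated person-by-task score matrix under assumed variance components
# sigma2_p = 0.5, sigma2_t = 0.2, sigma2_pt,e = 1.0 (a large task-related
# error component, as these studies report).
n_p, n_t = 200, 8
X = (rng.normal(0, np.sqrt(0.5), (n_p, 1)) +     # person effects
     rng.normal(0, np.sqrt(0.2), (1, n_t)) +     # task effects
     rng.normal(0, 1.0, (n_p, n_t)))             # interaction + residual

# ANOVA mean squares for the crossed p x t design.
grand = X.mean()
mp, mt = X.mean(axis=1, keepdims=True), X.mean(axis=0, keepdims=True)
ms_p = n_t * ((mp - grand) ** 2).sum() / (n_p - 1)
ms_t = n_p * ((mt - grand) ** 2).sum() / (n_t - 1)
ms_res = ((X - mp - mt + grand) ** 2).sum() / ((n_p - 1) * (n_t - 1))

# Expected-mean-square solutions for the variance components.
print(f"sigma2_p    = {(ms_p - ms_res) / n_t:.2f}")
print(f"sigma2_t    = {(ms_t - ms_res) / n_p:.2f}")
print(f"sigma2_pt,e = {ms_res:.2f}")
```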
Peer reviewed
Scrams, David J.; McLeod, Lori D. – Journal of Educational Measurement, 2000
Presents a graphical approach to differential item functioning (DIF) analysis based on a sampling-theory treatment of expected response functions. The approach was applied to a set of pretest items, and results were compared with traditional Mantel-Haenszel DIF statistics. Discusses implications of the method as a complement to the approach of P. Pashley (1992). (SLD)
Descriptors: Item Bias, Pretests Posttests, Sampling
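A hypothetical sketch of the expected-response-function idea: average an item's characteristic curve over the sampling distribution of its parameter estimates in each group, then compare the resulting curves (with uncertainty bands) graphically. The 2PL form and all parameter values are assumptions, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(3)

def icc(theta, a, b):
    """2PL item characteristic curve (functional form assumed here)."""
    return 1 / (1 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 61)

# Invented estimates and standard errors for one item in two groups:
# (discrimination, difficulty, SE of each).
groups = {"reference": (1.1, 0.0, 0.08, 0.06),
          "focal":     (1.1, 0.3, 0.10, 0.08)}

for name, (a_hat, b_hat, se_a, se_b) in groups.items():
    # Expected response function: the ICC averaged over the sampling
    # distribution of the parameter estimates, with pointwise bands.
    a_draws = rng.normal(a_hat, se_a, 1000)
    b_draws = rng.normal(b_hat, se_b, 1000)
    curves = icc(theta[:, None], a_draws, b_draws)
    erf = curves.mean(axis=1)
    lo, hi = np.percentile(curves, [5, 95], axis=1)
    print(name, np.round(erf[::15], 2))   # in practice, plot erf with its band
```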
Peer reviewed
Allen, Nancy L.; Donoghue, John R. – Journal of Educational Measurement, 1996
Examined the effect of complex sampling of items on the measurement of differential item functioning (DIF) using the Mantel-Haenszel procedure through a Monte Carlo study. Suggests the superiority of the pooled booklet method when items are selected for examinees according to a balanced incomplete block design. Discusses implications for other DIF…
Descriptors: Item Bias, Monte Carlo Methods, Research Design, Sampling
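A compact sketch of the Mantel-Haenszel common odds ratio computed from 2×2 tables pooled across strata; following the pooled booklet idea, the tables here are stratified by booklet as well as by matching score. The data and record layout are hypothetical.

```python
import numpy as np
from collections import defaultdict

def mh_alpha(records):
    """Mantel-Haenszel common odds ratio from stratified 2x2 tables."""
    cells = defaultdict(lambda: np.zeros((2, 2)))
    for booklet, score, group, correct in records:
        # Pooled booklet idea: stratify by booklet as well as matching score.
        cells[(booklet, score)][group, correct] += 1
    num = den = 0.0
    for t in cells.values():
        n = t.sum()
        num += t[0, 1] * t[1, 0] / n   # reference correct * focal incorrect
        den += t[0, 0] * t[1, 1] / n   # reference incorrect * focal correct
    return num / den                   # MH D-DIF = -2.35 * ln(alpha_MH)

# Hypothetical records: (booklet, matching score, group 0=ref/1=focal, correct).
rng = np.random.default_rng(4)
recs = [(int(rng.integers(3)), int(rng.integers(10)), int(g),
         int(rng.binomial(1, 0.6 - 0.1 * g)))
        for g in rng.integers(0, 2, 4000)]
print(f"alpha_MH = {mh_alpha(recs):.2f}")
```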
Peer reviewed
Huitzing, Hiddo A.; Veldkamp, Bernard P.; Verschoor, Angela J. – Journal of Educational Measurement, 2005
Several techniques exist for automatically assembling a test that meets a number of specifications. In an item bank, the items are stored with their characteristics. A test is constructed by selecting a set of items that fulfills the specifications set by the test assembler. Test assembly problems are often formulated in terms of a model consisting…
Descriptors: Testing Programs, Programming, Mathematics, Item Sampling
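A toy version of the kind of model the article discusses: items carry attributes in a bank, and a test is a 0-1 selection maximizing an objective subject to content and difficulty constraints. A real assembler would state this as a 0-1 linear program and hand it to a mixed-integer solver; brute force suffices for the eight-item bank assumed here.

```python
import itertools

# Hypothetical miniature bank: (item_id, content_area, difficulty, information).
bank = [
    ("i1", "algebra",  -0.5, 0.42), ("i2", "algebra",   0.2, 0.55),
    ("i3", "algebra",   0.9, 0.38), ("i4", "geometry", -0.8, 0.47),
    ("i5", "geometry",  0.1, 0.60), ("i6", "geometry",  1.1, 0.35),
    ("i7", "numbers",  -0.2, 0.50), ("i8", "numbers",   0.6, 0.44),
]

# Specifications: 4 items, all three content areas covered, mean difficulty
# in [-0.2, 0.4]; objective: maximize total information.
best = None
for combo in itertools.combinations(bank, 4):
    mean_b = sum(item[2] for item in combo) / 4
    if len({item[1] for item in combo}) == 3 and -0.2 <= mean_b <= 0.4:
        info = sum(item[3] for item in combo)
        if best is None or info > best[0]:
            best = (info, [item[0] for item in combo])
print(best)   # highest-information feasible test, or None if infeasible
```

When `best` comes back `None`, the model is infeasible, which is exactly the situation the article's techniques are designed to diagnose and repair.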
Peer reviewed
Gohmann, Stephen F. – Journal of Educational Measurement, 1988
A method for correcting selection bias when comparing Scholastic Aptitude Test (SAT) scores among states is presented; it is a modification of J. J. Heckman's selection bias correction (1976, 1979). Empirical results suggest that sample selection bias is present in SAT score regressions. (SLD)
Descriptors: Regression (Statistics), Sampling, Scoring, Selection
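A sketch of the classic two-step Heckman correction on simulated data (the article presents a modification of it, which is not reproduced here): a probit selection equation yields an inverse Mills ratio that is added as a regressor to the outcome equation. Assumes statsmodels and scipy are available; all variables are invented.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(5)

# Simulated data: z drives who takes the SAT (selection), x drives the
# score (outcome); correlated errors are what induce selection bias.
n = 5000
x, z = rng.normal(size=n), rng.normal(size=n)
e_sel, e_out = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], n).T
took_sat = (0.5 * z + e_sel > 0)
score = 500 + 40 * x + 60 * e_out      # observed only for test takers

# Step 1: probit selection equation, then the inverse Mills ratio.
Zs = sm.add_constant(z)
probit = sm.Probit(took_sat.astype(float), Zs).fit(disp=0)
xb = Zs @ probit.params
mills = norm.pdf(xb) / norm.cdf(xb)

# Step 2: outcome regression on takers, augmented with the Mills ratio;
# a nonzero Mills coefficient signals selection bias in the naive OLS.
Xo = sm.add_constant(np.column_stack([x[took_sat], mills[took_sat]]))
ols = sm.OLS(score[took_sat], Xo).fit()
print(ols.params)   # [intercept, x effect, Mills-ratio coefficient]
```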
Peer reviewed
Linn, Robert L.; Dunbar, Stephen B. – Journal of Educational Measurement, 1992
Several issues related to the design and reporting of results from the National Assessment of Educational Progress (NAEP) are discussed in the context of current expectations for the NAEP and its origins. These issues include: (1) content coverage and format; (2) estimation procedures; and (3) reporting problems. (SLD)
Descriptors: Content Analysis, Educational Assessment, Elementary Secondary Education, Estimation (Mathematics)
Peer reviewed
Willms, J. Douglas; Raudenbush, Stephen W. – Journal of Educational Measurement, 1989
A general longitudinal model is presented for estimating school effects and their stability. The model, capable of separating true changes from sampling and measurement error, controls statistically for effects of factors exogenous to the school system. The model is illustrated with data from large cohorts of students in Scotland. (SLD)
Descriptors: Elementary Secondary Education, Equations (Mathematics), Error of Measurement, Estimation (Mathematics)
Peer reviewed
Shavelson, Richard J.; And Others – Journal of Educational Measurement, 1993
Evidence is presented on the generalizability and convergent validity of performance assessments using data from six studies of student achievement that sampled a wide range of measurement facets and methods. Results at individual and school levels indicate that task-sampling variability is the major source of measurement error. (SLD)
Descriptors: Academic Achievement, Educational Assessment, Error of Measurement, Generalizability Theory
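A short numeric illustration, with assumed variance components in which the person-by-task term dominates, of why task sampling limits generalizability: the D-study coefficient for a mean over n_t tasks rises only slowly unless many tasks are sampled.

```python
# Assumed G-study variance components in which the person-by-task
# interaction dominates, consistent with the task-sampling finding.
var_p, var_pt_e = 0.30, 1.20

# D-study: generalizability coefficient for a score averaged over n_t tasks.
for n_t in (1, 5, 10, 20, 40):
    rho2 = var_p / (var_p + var_pt_e / n_t)
    print(f"n_tasks = {n_t:3d}   E(rho^2) = {rho2:.2f}")
```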
Peer reviewed
Mislevy, Robert J.; And Others – Journal of Educational Measurement, 1992
Concepts behind plausible values in estimating population characteristics from sparse matrix samples of item responses are discussed. The use of marginal analyses is described in the context of the National Assessment of Educational Progress, and the approach is illustrated with Scholastic Aptitude Test data for 9,075 high school seniors. (SLD)
Descriptors: College Entrance Examinations, Educational Assessment, Equations (Mathematics), Estimation (Mathematics)
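A minimal sketch of how plausible values are used in a marginal analysis: compute the target statistic on each set of draws, then combine with Rubin's multiple-imputation rules. The plausible values here are simulated stand-ins, not draws from a fitted latent-regression model.

```python
import numpy as np

rng = np.random.default_rng(6)

# Stand-in plausible values: M = 5 posterior draws of proficiency per
# examinee, in place of draws from a fitted latent-regression model.
n, M = 2000, 5
pv = rng.normal(0.2, 1.0, size=(n, M))

est = pv.mean(axis=0)                   # one population-mean estimate per PV set
u = pv.var(axis=0, ddof=1) / n          # sampling variance of each estimate
qbar = est.mean()                       # combined point estimate
b = est.var(ddof=1)                     # between-imputation variance
total_var = u.mean() + (1 + 1 / M) * b  # Rubin's combining rule
print(f"mean = {qbar:.3f}, SE = {np.sqrt(total_var):.3f}")
```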
Peer reviewed
Johnson, Eugene G. – Journal of Educational Measurement, 1992
Features of the design of the National Assessment of Educational Progress (NAEP) are discussed, with emphasis on the design of the 1992 assessment. Student sample designs for the NAEP and the Trial State Assessment are described, and the focused-balanced incomplete block spiraling method of item sampling is discussed. (SLD)
Descriptors: Academic Achievement, Educational Assessment, Educational Change, Elementary Secondary Education
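A concrete instance of balanced incomplete block spiraling, using the classic design of 7 item blocks arranged into 7 booklets of 3 blocks each, so that every pair of blocks shares exactly one booklet; the rotation at the end mimics spiraling booklets across students. The design shown is illustrative, not the actual NAEP layout.

```python
from itertools import combinations

# A balanced incomplete block design: 7 item blocks placed into 7 booklets
# of 3 blocks each; every block appears in 3 booklets and every pair of
# blocks appears together in exactly one booklet.
booklets = [(1, 2, 3), (1, 4, 5), (1, 6, 7),
            (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6)]

pair_counts = {p: 0 for p in combinations(range(1, 8), 2)}
for booklet in booklets:
    for pair in combinations(booklet, 2):
        pair_counts[pair] += 1
assert all(count == 1 for count in pair_counts.values())

# Spiraling: booklets are handed out to students in rotation, so each
# school receives roughly equal numbers of every booklet.
for i in range(10):
    print(f"student {i}: booklet {booklets[i % len(booklets)]}")
```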