ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	2
Since 2006 (last 20 years)	15

Descriptor

Computation	15
Accuracy	9
Statistical Analysis	9
Equated Scores	7
Error of Measurement	6
Comparative Analysis	5
Simulation	5
Bayesian Statistics	4
Sample Size	4
Adaptive Testing	3
Item Response Theory	3
Regression (Statistics)	3
Data Analysis	2
Difficulty Level	2
Models	2
Probability	2
Response Style (Tests)	2
Scores	2
Scoring	2
Statistical Bias	2
Statistical Distributions	2
Statistical Significance	2
Test Bias	2
Ability	1
College Entrance Examinations	1
More ▼

Source

ETS Research Report Series	7
Journal of Educational…	3
Journal of Educational and…	3
Educational Testing Service	1
Educational and Psychological…	1

Author

Moses, Tim	15
Kim, Sooyeon	4
Holland, Paul	2
Miao, Jing	2
Dorans, Neil	1
Dorans, Neil J.	1
Holland, Paul W.	1
Oh, Hyeonjoo J.	1
Yoo, Hanwook	1
Yoo, Hanwook Henry	1
Zhang, Wenmin	1
More ▼

Publication Type

Journal Articles	14
Reports - Research	9
Reports - Evaluative	5
Numerical/Quantitative Data	1
Reports - Descriptive	1

Education Level

Higher Education	1
Postsecondary Education	1

Audience

Location

Laws, Policies, & Programs

Assessments and Surveys

Graduate Record Examinations

What Works Clearinghouse Rating

Showing all 15 results Save | Export

The Impact of Aberrant Responses and Detection in Forced-Choice Noncognitive Assessment. Research Report. ETS RR-18-32

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Moses, Tim – ETS Research Report Series, 2018

The purpose of this study is to assess the impact of aberrant responses on the estimation accuracy in forced-choice format assessments. To that end, a wide range of aberrant response behaviors (e.g., fake, random, or mechanical responses) affecting upward of 20%--30% of the responses was manipulated under the multi-unidimensional pairwise…

Descriptors: Measurement Techniques, Response Style (Tests), Accuracy, Computation

Investigating Robustness of Item Response Theory Proficiency Estimators to Atypical Response Behaviors under Two-Stage Multistage Testing. ETS GRE® Board Research Report. ETS GRE®-16-03. ETS Research Report No. RR-16-22

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Moses, Tim – ETS Research Report Series, 2016

The purpose of this study is to evaluate the extent to which item response theory (IRT) proficiency estimation methods are robust to the presence of aberrant responses under the "GRE"® General Test multistage adaptive testing (MST) design. To that end, a wide range of atypical response behaviors affecting as much as 10% of the test items…

Descriptors: Item Response Theory, Computation, Robustness (Statistics), Response Style (Tests)

A Comparison of IRT Proficiency Estimation Methods under Adaptive Multistage Testing

Peer reviewed

Direct link

Kim, Sooyeon; Moses, Tim; Yoo, Hanwook – Journal of Educational Measurement, 2015

This inquiry is an investigation of item response theory (IRT) proficiency estimators' accuracy under multistage testing (MST). We chose a two-stage MST design that includes four modules (one at Stage 1, three at Stage 2) and three difficulty paths (low, middle, high). We assembled various two-stage MST panels (i.e., forms) by manipulating two…

Descriptors: Comparative Analysis, Item Response Theory, Computation, Accuracy

Alternative Smoothing and Scaling Strategies for Weighted Composite Scores

Peer reviewed

Direct link

Moses, Tim – Educational and Psychological Measurement, 2014

In this study, smoothing and scaling approaches are compared for estimating subscore-to-composite scaling results involving composites computed as rounded and weighted combinations of subscores. The considered smoothing and scaling approaches included those based on raw data, on smoothing the bivariate distribution of the subscores, on smoothing…

Descriptors: Weighted Scores, Scaling, Data Analysis, Comparative Analysis

Effectiveness of Item Response Theory (IRT) Proficiency Estimation Methods under Adaptive Multistage Testing. Research Report. ETS RR-15-11

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Moses, Tim; Yoo, Hanwook Henry – ETS Research Report Series, 2015

The purpose of this inquiry was to investigate the effectiveness of item response theory (IRT) proficiency estimators in terms of estimation bias and error under multistage testing (MST). We chose a 2-stage MST design in which 1 adaptation to the examinees' ability levels takes place. It includes 4 modules (1 at Stage 1, 3 at Stage 2) and 3 paths…

Descriptors: Item Response Theory, Computation, Statistical Bias, Error of Measurement

Relationships of Measurement Error and Prediction Error in Observed-Score Regression

Peer reviewed

Direct link

Moses, Tim – Journal of Educational Measurement, 2012

The focus of this paper is assessing the impact of measurement errors on the prediction error of an observed-score regression. Measures are presented and described for decomposing the linear regression's prediction error variance into parts attributable to the true score variance and the error variances of the dependent variable and the predictor…

Descriptors: Error of Measurement, Prediction, Regression (Statistics), True Scores

Standard Errors of Equating Differences: Prior Developments, Extensions, and Simulations

Peer reviewed

Direct link

Moses, Tim; Zhang, Wenmin – Journal of Educational and Behavioral Statistics, 2011

The purpose of this article was to extend the use of standard errors for equated score differences (SEEDs) to traditional equating functions. The SEEDs are described in terms of their original proposal for kernel equating functions and extended so that SEEDs for traditional linear and traditional equipercentile equating functions can be computed.…

Descriptors: Equated Scores, Error Patterns, Evaluation Research, Statistical Analysis

A Comparison of Strategies for Estimating Conditional DIF

Peer reviewed

Direct link

Moses, Tim; Miao, Jing; Dorans, Neil J. – Journal of Educational and Behavioral Statistics, 2010

In this study, the accuracies of four strategies were compared for estimating conditional differential item functioning (DIF), including raw data, logistic regression, log-linear models, and kernel smoothing. Real data simulations were used to evaluate the estimation strategies across six items, DIF and No DIF situations, and four sample size…

Descriptors: Test Bias, Statistical Analysis, Computation, Comparative Analysis

Pseudo Bayes Estimates for Test Score Distributions and Chained Equipercentile Equating. Research Report. ETS RR-09-47

Peer reviewed
PDF on ERIC

Download full text

Moses, Tim; Oh, Hyeonjoo J. – ETS Research Report Series, 2009

Pseudo Bayes probability estimates are weighted averages of raw and modeled probabilities; these estimates have been studied primarily in nonpsychometric contexts. The purpose of this study was to evaluate pseudo Bayes probability estimates as applied to the estimation of psychometric test score distributions and chained equipercentile equating…

Descriptors: Bayesian Statistics, Computation, Equated Scores, Probability

Selection Strategies for Univariate Loglinear Smoothing Models and Their Effect on Equating Function Accuracy

Peer reviewed

Direct link

Moses, Tim; Holland, Paul W. – Journal of Educational Measurement, 2009

In this study, we compared 12 statistical strategies proposed for selecting loglinear models for smoothing univariate test score distributions and for enhancing the stability of equipercentile equating functions. The major focus was on evaluating the effects of the selection strategies on equating function accuracy. Selection strategies' influence…

Descriptors: Equated Scores, Selection, Statistical Analysis, Models

A Comparison of Methods for Estimating Conditional Item Score Differences in Differential Item Functioning (DIF) Assessments. Research Report. ETS RR-10-15

Download full text

Moses, Tim; Miao, Jing; Dorans, Neil – Educational Testing Service, 2010

This study compared the accuracies of four differential item functioning (DIF) estimation methods, where each method makes use of only one of the following: raw data, logistic regression, loglinear models, or kernel smoothing. The major focus was on the estimation strategies' potential for estimating score-level, conditional DIF. A secondary focus…

Descriptors: Test Bias, Statistical Analysis, Computation, Scores

Notes on a General Framework for Observed Score Equating. Research Report. ETS RR-08-59

Peer reviewed
PDF on ERIC

Download full text

Moses, Tim; Holland, Paul – ETS Research Report Series, 2008

The purpose of this paper is to extend von Davier, Holland, and Thayer's (2004b) framework of kernel equating so that it can incorporate raw data and traditional equipercentile equating methods. One result of this more general framework is that previous equating methodology research can be viewed more comprehensively. Another result is that the…

Descriptors: Equated Scores, Error of Measurement, Statistical Analysis, Computation

Using the Kernel Method of Test Equating for Estimating the Standard Errors of Population Invariance Measures

Peer reviewed

Direct link

Moses, Tim – Journal of Educational and Behavioral Statistics, 2008

Equating functions are supposed to be population invariant, meaning that the choice of subpopulation used to compute the equating function should not matter. The extent to which equating functions are population invariant is typically assessed in terms of practical difference criteria that do not account for equating functions' sampling…

Descriptors: Equated Scores, Error of Measurement, Sampling, Evaluation Methods

Kernel and Traditional Equipercentile Equating with Degrees of Presmoothing. Research Report. ETS RR-07-15

Peer reviewed
PDF on ERIC

Download full text

Moses, Tim; Holland, Paul – ETS Research Report Series, 2007

The purpose of this study was to empirically evaluate the impact of loglinear presmoothing accuracy on equating bias and variability across chained and post-stratification equating methods, kernel and percentile-rank continuization methods, and sample sizes. The results of evaluating presmoothing on equating accuracy generally agreed with those of…

Descriptors: Equated Scores, Statistical Analysis, Accuracy, Sample Size

Using the Kernel Method of Test Equating for Estimating the Standard Errors of Population Invariance Measures. Research Report. ETS RR-06-20

Peer reviewed
PDF on ERIC

Download full text

Moses, Tim – ETS Research Report Series, 2006

Population invariance is an important requirement of test equating. An equating function is said to be population invariant when the choice of (sub)population used to compute the equating function does not matter. In recent studies, the extent to which equating functions are population invariant is typically addressed in terms of practical…

Descriptors: Equated Scores, Computation, Error of Measurement, Statistical Analysis