Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017
The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…
Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick; Schmitt, Neal – ETS Research Report Series, 2016
In this report, systematic applications of statistical and psychometric methods are used to develop and evaluate scoring rules in terms of test reliability. Data collected from a situational judgment test are used to facilitate the comparison. For a well-developed item with appropriate keys (i.e., the correct answers), agreement among various…
Descriptors: Scoring, Test Reliability, Statistical Analysis, Psychometrics
Livingston, Samuel A.; Chen, Haiwen H. – ETS Research Report Series, 2015
Quantitative information about test score reliability can be presented in terms of the distribution of equated scores on an alternate form of the test for test takers with a given score on the form taken. In this paper, we describe a procedure for estimating that distribution, for any specified score on the test form taken, by estimating the joint…
Descriptors: Scores, Statistical Distributions, Research Reports, Equated Scores
Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…
Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment
Wei, Youhua; Low, Albert – ETS Research Report Series, 2017
In most large-scale programs of tests that aid in making high-stakes decisions, such as the "TOEIC"® family of products and services, it is not unusual for a significant portion of test takers to retake the test multiple times. The study reported here used multilevel growth modeling to explore the score change patterns of nearly 20,000…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Scores
Ling, Guangming – ETS Research Report Series, 2013
One concern when repurposing a test to a new population is whether the test is measuring the same construct in a valid and reliable way that is comparable to the intended population. Following the guidelines of the International Test Commission and the ETS Standards for Quality and Fairness, this study was designed to collect evidence in support…
Descriptors: College Outcomes Assessment, Business Administration Education, Test Validity, Test Reliability
Ling, Guangming – ETS Research Report Series, 2012
To assess the value of individual students' subscores on the Major Field Test in Business (MFT Business), I examined the test's internal structure with factor analysis and structural equation model methods, and analyzed the subscore reliabilities using the augmented scores method. Analyses of the internal structure suggested that the MFT Business…
Descriptors: Factor Analysis, Construct Validity, Structural Equation Models, Correlation
Lee, Yong-Won; Gentile, Claudia; Kantor, Robert – ETS Research Report Series, 2008
The main purpose of the study was to investigate the distinctness and reliability of analytic (or multitrait) rating dimensions and their relationships to holistic scores and "e-rater"® essay feature variables in the context of the TOEFL® computer-based test (CBT) writing assessment. Data analyzed in the study were analytic and holistic…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Scoring
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – ETS Research Report Series, 2006
This study addresses the sample error and linking bias that occur with small and unrepresentative samples in a non-equivalent groups anchor test (NEAT) design. We propose a linking method called the "synthetic function," which is a weighted average of the identity function (the trivial equating function for forms that are known to be…
Descriptors: Equated Scores, Sample Size, Test Items, Statistical Bias