Showing 1 to 15 of 24 results
Papageorgiou, Spiros; Davis, Larry; Norris, John M.; Garcia Gomez, Pablo; Manna, Venessa F.; Monfils, Lora – Educational Testing Service, 2021
The "TOEFL® Essentials"™ test is a new English language proficiency test in the "TOEFL"® family of assessments. It measures foundational language skills and communication abilities in academic and general (daily life) contexts. The test covers the four language skills of reading, listening, writing, and speaking and is intended…
Descriptors: Language Tests, English (Second Language), Second Language Learning, Language Proficiency
Bailey, Alison L.; Wolf, Mikyung Kim; Ballard, Laura – Educational Testing Service, 2022
This research note focuses on the alignment aspect of English language proficiency (ELP) assessments, one of the required types of validity evidence for the federal peer review process of states' assessment systems. A basic tenet of current U.S. education policy is the alignment between what a test assesses and what content has been determined as…
Descriptors: English (Second Language), Second Language Learning, Language Proficiency, Alignment (Education)
Papageorgiou, Spiros; Xu, Xiaoqiu; Timpe-Laughlin, Veronika; Dugdale, Deborah M. – Educational Testing Service, 2020
The purpose of this study is to examine the appropriateness of using the "TOEFL Primary®" tests to evaluate the language abilities of students learning English as a foreign language (EFL) through an online-delivered curriculum, the VIPKid Major Course (MC). Data include student test scores on the TOEFL Primary Listening and Reading tests…
Descriptors: Alignment (Education), Language Tests, English (Second Language), Second Language Learning
Schmidgall, Jonathan – Educational Testing Service, 2021
The redesigned "TOEIC Bridge"® tests are designed to measure the reading, listening, speaking, and writing proficiency of beginning to low-intermediate English learners in the context of everyday adult life. This report describes the comprehensive and multifaceted process used to enhance the meaningfulness of TOEIC Bridge test score…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Language Proficiency
Haberman, Shelby J.; Yan, Duanli – Educational Testing Service, 2011
Continuous exponential families are applied to linking test forms via an internal anchor. This application combines work on continuous exponential families for single-group designs and work on continuous exponential families for equivalent-group designs. Results are compared to those for kernel and equipercentile equating in the case of chained…
Descriptors: Equated Scores, Statistical Analysis, Language Tests, Mathematics Tests
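The kernel and equipercentile methods this report compares share one underlying idea: a score on one form is mapped to the score on the other form that has the same percentile rank. A minimal illustrative sketch of plain equipercentile equating (not the continuous exponential family method the report develops), using invented score samples from two equivalent groups:

```python
from bisect import bisect_right

def percentile_rank(x, scores):
    """Fraction of observed scores at or below x."""
    ordered = sorted(scores)
    return bisect_right(ordered, x) / len(ordered)

def equipercentile(x, scores_x, scores_y):
    """Map a form-X score to the form-Y score at the same empirical quantile."""
    p = percentile_rank(x, scores_x)
    ys = sorted(scores_y)
    return ys[max(int(p * len(ys)) - 1, 0)]

# Invented samples, one per form, as in an equivalent-groups design
form_x = [1, 2, 3, 4]
form_y = [10, 20, 30, 40]
equated = equipercentile(2, form_x, form_y)  # a form-X 2 maps to a form-Y 20
```

Continuous exponential families, like kernel equating, replace the discrete empirical quantiles above with smooth estimated score distributions before matching percentiles.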
Kim, Sooyeon; Walker, Michael E. – Educational Testing Service, 2011
This study examines the use of subpopulation invariance indices to evaluate the appropriateness of using a multiple-choice (MC) item anchor in mixed-format tests, which include both MC and constructed-response (CR) items. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using an MC-only anchor set for 4…
Descriptors: Test Format, Multiple Choice Tests, Test Items, Gender Differences
Sinharay, Sandip; Haberman, Shelby J.; Jia, Helena – Educational Testing Service, 2011
Standard 3.9 of the "Standards for Educational and Psychological Testing" (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999) demands evidence of model fit when an item response theory (IRT) model is used to make inferences from a data set. We applied two recently…
Descriptors: Item Response Theory, Goodness of Fit, Statistical Analysis, Language Tests
Haberman, Shelby J. – Educational Testing Service, 2011
Alternative approaches are discussed for use of e-rater® to score the TOEFL iBT® Writing test. These approaches involve alternate criteria. In the first approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the second approach, the predicted variable is the expected rater score of 2 essay responses by the…
Descriptors: Writing Tests, Scoring, Essays, Language Tests
Attali, Yigal – Educational Testing Service, 2011
The e-rater® automated essay scoring system is used operationally in the scoring of TOEFL iBT® independent essays. Previous research has found support for a 3-factor structure of the e-rater features. This 3-factor structure has an attractive hierarchical linguistic interpretation with a word choice factor, a grammatical convention within a…
Descriptors: Essay Tests, Language Tests, Test Scoring Machines, Automation
Haberman, Shelby J.; Sinharay, Sandip – Educational Testing Service, 2011
Subscores are reported for several operational assessments. Haberman (2008) suggested a method based on classical test theory to determine if the true subscore is predicted better by the corresponding subscore or the total score. Researchers are often interested in learning how different subgroups perform on subtests. Stricker (1993) and…
Descriptors: True Scores, Test Theory, Prediction, Group Membership
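The Haberman (2008) criterion referenced in this abstract can be stated compactly: report a subscore only if the observed subscore predicts the true subscore better, in proportional reduction of mean squared error (PRMSE), than the total score does. Under classical test theory, the PRMSE of the observed subscore is its reliability, and the PRMSE of the total score is its squared correlation with the true subscore. A hedged sketch, with all summary statistics invented for illustration:

```python
def prmse_from_subscore(rel_s):
    """PRMSE of the observed subscore as a predictor of the true subscore:
    under classical test theory this is simply the subscore reliability."""
    return rel_s

def prmse_from_total(var_s, rel_s, var_x, cov_sx):
    """PRMSE of the total score x as a predictor of the true subscore."""
    var_tau_s = rel_s * var_s            # true-subscore variance
    var_e_s = (1.0 - rel_s) * var_s      # subscore error variance
    cov_x_tau = cov_sx - var_e_s         # x contains the subscore's error term
    return cov_x_tau ** 2 / (var_x * var_tau_s)

# Invented statistics: subscore variance 25, reliability .80,
# total-score variance 100, subscore/total covariance 40
adds_value = prmse_from_subscore(0.8) > prmse_from_total(25, 0.8, 100, 40)
```

With these numbers the subscore PRMSE (.80) exceeds the total-score PRMSE (about .61), so reporting the subscore would be justified under the criterion.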
Attali, Yigal – Educational Testing Service, 2011
This paper proposes an alternative content measure for essay scoring, based on the "difference" in the relative frequency of a word in high-scored versus low-scored essays. The "differential word use" (DWU) measure is the average of these differences across all words in the essay. A positive value indicates the essay is using…
Descriptors: Scoring, Essay Tests, Word Frequency, Content Analysis
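The DWU computation this abstract describes is simple enough to sketch directly: estimate each word's relative frequency in high-scored and in low-scored training essays, then average the per-word differences across the words of the essay being scored. A minimal sketch with invented toy essays (the corpora and the scored essay are placeholders):

```python
from collections import Counter

def relative_frequencies(essays):
    """Relative frequency of each word across a list of essays."""
    counts = Counter(word for essay in essays for word in essay.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def dwu(essay, high_freq, low_freq):
    """Average, over the essay's words, of (high-corpus - low-corpus)
    relative frequency; positive values suggest high-scored vocabulary."""
    diffs = [high_freq.get(w, 0.0) - low_freq.get(w, 0.0)
             for w in essay.split()]
    return sum(diffs) / len(diffs) if diffs else 0.0

# Invented training corpora split by score band
high = ["the argument is cogent and well supported",
        "a cogent well reasoned argument"]
low = ["it is good", "good good it is"]
score = dwu("a cogent argument",
            relative_frequencies(high), relative_frequencies(low))
```

The sign of the result is what matters: a positive value indicates the essay leans on vocabulary characteristic of high-scored essays, as the abstract notes.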
DeCarlo, Lawrence T. – Educational Testing Service, 2010
A basic consideration in large-scale assessments that use constructed response (CR) items, such as essays, is how to allocate the essays to the raters that score them. Designs that are used in practice are incomplete, in that each essay is scored by only a subset of the raters, and also unbalanced, in that the number of essays scored by each rater…
Descriptors: Test Items, Responses, Essay Tests, Scoring
Li, Yanmei; Li, Shuhong; Wang, Lin – Educational Testing Service, 2010
Many standardized educational tests include groups of items based on a common stimulus, known as "testlets". Standard unidimensional item response theory (IRT) models are commonly used to model examinees' responses to testlet items. However, it is known that local dependence among testlet items can lead to biased item parameter estimates…
Descriptors: English, Language Tests, Reading Tests, Item Response Theory
Powers, Donald E.; Kim, Hae-Jin; Yu, Feng; Weng, Vincent Z.; VanWinkle, Waverely – Educational Testing Service, 2009
To facilitate the interpretation of test scores from the new TOEIC® (Test of English for International Communication™) speaking and writing tests as measures of English-language proficiency, we administered a self-assessment inventory to TOEIC examinees in Japan and Korea, to gather their perceptions of their ability to perform a variety of…
Descriptors: English for Special Purposes, Language Tests, Writing Tests, Speech Tests
Liu, Ou Lydia; Schedl, Mary; Malloy, Jeanne; Kong, Nan – Educational Testing Service, 2009
The TOEFL iBT™ has increased the length of the reading passages in the reading section compared to the passages on the TOEFL® computer-based test (CBT) to better approximate academic reading in North American universities, resulting in a reduced number of passages in the reading test. A concern arising from this change is whether the decrease…
Descriptors: English (Second Language), Language Tests, Internet, Computer Assisted Testing