Guo, Hongwen; Johnson, Matthew S.; McCaffrey, Daniel F.; Gu, Lixiong – ETS Research Report Series, 2024
The multistage testing (MST) design has been gaining attention and popularity in educational assessments. For testing programs that have small test-taker samples, it is challenging to calibrate new items to replenish the item pool. In the current research, we used the item pools from an operational MST program to illustrate how research studies…
Descriptors: Test Items, Test Construction, Sample Size, Scaling
Guo, Hongwen; Lu, Ru; Johnson, Matthew S.; McCaffrey, Dan F. – ETS Research Report Series, 2022
It is desirable for an educational assessment to be constructed of items that can differentiate different performance levels of test takers, and thus it is important to estimate accurately the item discrimination parameters in either classical test theory or item response theory. It is particularly challenging to do so when the sample sizes are…
Descriptors: Test Items, Item Response Theory, Item Analysis, Educational Assessment
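On the classical-test-theory side, the item discrimination estimation described in this abstract is commonly operationalized as a corrected item-total correlation. The sketch below is a minimal illustration on a hypothetical 0/1 response matrix, not the authors' actual procedure:

```python
import numpy as np

def corrected_item_total(responses):
    """Classical item discrimination: correlate each item with the
    total score computed over the remaining items (corrected item-total)."""
    x = np.asarray(responses, dtype=float)
    total = x.sum(axis=1)
    discrim = []
    for j in range(x.shape[1]):
        rest = total - x[:, j]          # total score excluding item j
        discrim.append(np.corrcoef(x[:, j], rest)[0, 1])
    return np.array(discrim)

# Toy 0/1 response matrix: 6 examinees x 3 items (hypothetical data).
resp = [[1, 1, 1],
        [1, 1, 0],
        [1, 0, 1],
        [0, 1, 0],
        [0, 0, 1],
        [0, 0, 0]]
print(corrected_item_total(resp))
```

With small samples such as those discussed above, these correlations are noisy, which is part of what makes discrimination estimation challenging.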
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
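The observed-score family named in this abstract can be illustrated with the Mantel-Haenszel procedure: pool 2×2 (group × correct) tables across total-score strata into a common odds ratio, then convert it to the ETS delta scale via MH D-DIF = -2.35 ln(α). The function name and data below are hypothetical, for illustration only:

```python
import numpy as np
from math import log

def mantel_haenszel_dif(correct, group, matching_score):
    """Mantel-Haenszel common odds ratio across score strata, plus
    MH D-DIF on the ETS delta scale (-2.35 * ln(alpha))."""
    correct = np.asarray(correct)
    group = np.asarray(group)
    score = np.asarray(matching_score)
    num = den = 0.0
    for s in np.unique(score):
        m = score == s
        a = np.sum((group[m] == "ref") & (correct[m] == 1))    # reference right
        b = np.sum((group[m] == "ref") & (correct[m] == 0))    # reference wrong
        c = np.sum((group[m] == "focal") & (correct[m] == 1))  # focal right
        d = np.sum((group[m] == "focal") & (correct[m] == 0))  # focal wrong
        n = a + b + c + d
        if n > 0:
            num += a * d / n
            den += b * c / n
    alpha = num / den
    return alpha, -2.35 * log(alpha)

# Hypothetical data: two score strata; in each, the reference group
# answers the studied item correctly more often than the focal group.
correct = [1, 1, 1, 0, 1, 1, 0, 0] * 2
group = (["ref"] * 4 + ["focal"] * 4) * 2
score = [1] * 8 + [2] * 8
alpha, d_dif = mantel_haenszel_dif(correct, group, score)
print(alpha, d_dif)
```

A negative MH D-DIF, as here, indicates the item is relatively harder for the focal group after matching on total score.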
Gu, Lixiong; Ling, Guangming; Qu, Yanxuan – ETS Research Report Series, 2019
Research has found that the "a"-stratified item selection strategy (STR) for computerized adaptive tests (CATs) may lead to insufficient use of high a items at later stages of the tests and thus to reduced measurement precision. A refined approach, unequal item selection across strata (USTR), effectively improves test precision over the…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Use, Test Items
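The core of the a-stratified strategy (STR) mentioned above can be sketched simply: sort the pool by the discrimination parameter a and split it into strata, so early CAT stages draw from low-a strata and later stages from high-a strata. The toy pool and function name below are hypothetical; the USTR refinement would additionally allot unequal numbers of selections per stratum:

```python
import numpy as np

def a_stratify(a_params, n_strata):
    """Partition item indices into strata of ascending discrimination (a).
    Early test stages select from strata[0] (low a); later stages use
    the high-a strata, conserving the most informative items for the end."""
    order = np.argsort(a_params)
    return np.array_split(order, n_strata)

# Hypothetical pool of six items with known a-parameters.
a_params = [0.5, 2.0, 1.0, 1.5, 0.8, 1.2]
strata = a_stratify(a_params, 3)
print([s.tolist() for s in strata])
```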
Lopez, Alexis A.; Guzman-Orth, Danielle; Zapata-Rivera, Diego; Forsyth, Carolyn M.; Luce, Christine – ETS Research Report Series, 2021
Substantial progress has been made toward applying technology enhanced conversation-based assessments (CBAs) to measure the English-language proficiency of English learners (ELs). CBAs are conversation-based systems that use conversations among computer-animated agents and a test taker. We expanded the design and capability of prior…
Descriptors: Accuracy, English Language Learners, Language Proficiency, Language Tests
Kim, Sooyeon; Robin, Frederic – ETS Research Report Series, 2017
In this study, we examined the potential impact of item misfit on the reported scores of an admission test from the subpopulation invariance perspective. The target population of the test consisted of 3 major subgroups with different geographic regions. We used the logistic regression function to estimate item parameters of the operational items…
Descriptors: Scores, Test Items, Test Bias, International Assessment
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick; Schmitt, Neal – ETS Research Report Series, 2016
In this report, systematic applications of statistical and psychometric methods are used to develop and evaluate scoring rules in terms of test reliability. Data collected from a situational judgment test are used to facilitate the comparison. For a well-developed item with appropriate keys (i.e., the correct answers), agreement among various…
Descriptors: Scoring, Test Reliability, Statistical Analysis, Psychometrics
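Evaluating scoring rules "in terms of test reliability," as above, typically means computing a reliability coefficient under each candidate rule and comparing. A minimal sketch using Cronbach's alpha on hypothetical item scores (not the report's actual analysis):

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha: (k/(k-1)) * (1 - sum of item variances / variance of total)."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]
    item_var = x.var(axis=0, ddof=1).sum()
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical scored responses (rows = examinees, columns = items).
parallel = [[1, 1], [0, 0], [1, 1], [0, 0]]    # items agree perfectly -> alpha = 1
unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]   # items uncorrelated    -> alpha = 0
print(cronbach_alpha(parallel), cronbach_alpha(unrelated))
```

A scoring key that makes item scores more internally consistent raises alpha, which is one way "agreement among various" rules can be compared.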
Fife, James H.; James, Kofi; Peters, Stephanie – ETS Research Report Series, 2020
The concept of variability is central to statistics. In this research report, we review mathematics education research on variability and, based on that review and on feedback from an expert panel, propose a learning progression (LP) for variability. The structure of the proposed LP consists of 5 levels of sophistication in understanding…
Descriptors: Mathematics Education, Statistics Education, Feedback (Response), Research Reports
Attali, Yigal – ETS Research Report Series, 2014
Previous research on calculator use in standardized assessments of quantitative ability focused on the effect of calculator availability on item difficulty and on whether test developers can predict these effects. With the introduction of an on-screen calculator on the Quantitative Reasoning measure of the "GRE"® revised General Test, it…
Descriptors: College Entrance Examinations, Graduate Study, Calculators, Test Items
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
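The statistical flagging rules under review classify items into ETS categories A, B, and C using both the size of MH D-DIF and significance tests. The sketch below implements only the conventional magnitude thresholds (|D| < 1 → A, |D| ≥ 1.5 → C) and deliberately omits the significance-test conditions, so it is a simplification rather than the operational rule:

```python
def dif_category(mh_d_dif):
    """Simplified ETS DIF flagging by |MH D-DIF| magnitude alone.
    Operational rules also test whether the statistic differs
    significantly from 0 (for A) or from 1 (for C)."""
    size = abs(mh_d_dif)
    if size < 1.0:
        return "A"  # negligible DIF
    if size >= 1.5:
        return "C"  # large DIF: item is reviewed or removed
    return "B"      # intermediate DIF

print(dif_category(0.4), dif_category(1.2), dif_category(-2.0))
```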
Qian, Xiaoyu; Nandakumar, Ratna; Glutting, Joseph; Ford, Danielle; Fifield, Steve – ETS Research Report Series, 2017
In this study, we investigated gender and minority achievement gaps on 8th-grade science items employing a multilevel item response methodology. Both gaps were wider on physics and earth science items than on biology and chemistry items. Larger gender gaps were found on items with specific topics favoring male students than other items, for…
Descriptors: Item Analysis, Gender Differences, Achievement Gap, Grade 8
Fu, Jianbin; Wise, Maxwell – ETS Research Report Series, 2012
In the Cognitively Based Assessment of, for, and as Learning ("CBAL"™) research initiative, innovative K-12 prototype tests based on cognitive competency models are developed. This report presents the statistical results of the 2 CBAL Grade 8 writing tests and 2 Grade 7 reading tests administered to students in 20 states in spring 2011.…
Descriptors: Cognitive Ability, Grade 8, Writing Tests, Grade 7
Deane, Paul; Lawless, René R.; Li, Chen; Sabatini, John; Bejar, Isaac I.; O'Reilly, Tenaha – ETS Research Report Series, 2014
We expect that word knowledge accumulates gradually. This article draws on earlier approaches to assessing depth, but focuses on one dimension: richness of semantic knowledge. We present results from a study in which three distinct item types were developed at three levels of depth: knowledge of common usage patterns, knowledge of broad topical…
Descriptors: Vocabulary, Test Items, Language Tests, Semantics
Young, John W.; King, Teresa C.; Hauck, Maurice Cogan; Ginsburgh, Mitchell; Kotloff, Lauren; Cabrera, Julio; Cavalie, Carlos – ETS Research Report Series, 2014
This article describes two research studies conducted on the linguistic modification of test items from K-12 content assessments. In the first study, 120 linguistically modified test items in mathematics and science taken by fourth and sixth graders were found to have a wide range of outcomes for English language learners (ELLs) and non-ELLs, with…
Descriptors: English Language Learners, Test Items, Mathematics Tests, Science Tests