Showing 1 to 15 of 53 results
Peer reviewed
Tony Albano; Brian F. French; Thao Thu Vo – Applied Measurement in Education, 2024
Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…
Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests
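A common way to operationalize DIF is logistic regression on the item response with a matching variable such as the total score; an intersectional analysis adds the interaction between grouping variables instead of testing each one alone. The sketch below illustrates that contrast on simulated data; it is not the authors' procedure, and all column names ("correct", "total", "gender", "ethnicity") are invented for illustration.

```python
# Hypothetical logistic-regression DIF sketch with an intersectional
# (interaction) term. Simulated data; not the authors' analysis.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "total": rng.normal(0, 1, n),        # matching variable (e.g., total score)
    "gender": rng.integers(0, 2, n),     # grouping variable 1 (0/1, invented)
    "ethnicity": rng.integers(0, 2, n),  # grouping variable 2 (0/1, invented)
})
# Simulate an item that is harder only for one intersection of the groups.
logit = 1.2 * df["total"] - 0.8 * (df["gender"] * df["ethnicity"])
df["correct"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Traditional DIF: main effects of each grouping variable.
trad = smf.logit("correct ~ total + gender + ethnicity", data=df).fit(disp=0)
# Intersectional DIF: add the interaction between the grouping variables.
inter = smf.logit("correct ~ total + gender * ethnicity", data=df).fit(disp=0)

# A likelihood-ratio test on the interaction term flags intersectional DIF
# that the main-effects-only model misses.
lr = 2 * (inter.llf - trad.llf)
print(f"LR chi-square (1 df) for the interaction: {lr:.2f}")
```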
Xue, Kang; Huggins-Manley, Anne Corinne; Leite, Walter – Educational and Psychological Measurement, 2022
In data collected from virtual learning environments (VLEs), item response theory (IRT) models can be used to guide the ongoing measurement of student ability. However, such applications of IRT rely on unbiased item parameter estimates associated with test items in the VLE. Without formal piloting of the items, one can expect a large amount of…
Descriptors: Virtual Classrooms, Artificial Intelligence, Item Response Theory, Item Analysis
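The dependence of ability measurement on item parameter quality noted in this abstract can be seen in a minimal two-parameter logistic (2PL) sketch: holding the response pattern fixed, a constant bias in the item-difficulty estimates shifts the maximum-likelihood ability estimate by roughly the same amount. All parameter values below are invented; this is not the authors' model or data.

```python
# Minimal 2PL IRT sketch: MLE of ability given fixed item parameters,
# then the same estimate under biased difficulties. Illustration only.
import numpy as np
from scipy.optimize import minimize_scalar

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def mle_theta(responses, a, b):
    """Maximum-likelihood ability estimate given item parameters."""
    def neg_ll(theta):
        p = p_2pl(theta, a, b)
        return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    return minimize_scalar(neg_ll, bounds=(-4, 4), method="bounded").x

rng = np.random.default_rng(1)
a = rng.uniform(0.8, 2.0, 20)   # item discriminations (invented)
b = rng.normal(0, 1, 20)        # true item difficulties (invented)
true_theta = 0.5
resp = rng.binomial(1, p_2pl(true_theta, a, b))

print("theta, true item parameters:      ", round(mle_theta(resp, a, b), 3))
print("theta, difficulties biased by +0.5:", round(mle_theta(resp, a, b + 0.5), 3))
```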
Peer reviewed
Wyse, Adam E.; Albano, Anthony D. – Applied Measurement in Education, 2015
This article used several data sets from a large-scale state testing program to examine the feasibility of combining general and modified assessment items in computerized adaptive testing (CAT) for different groups of students. Results suggested that several of the assumptions made when employing this type of mixed-item CAT may not be met for…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Testing Programs
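For readers unfamiliar with CAT mechanics: a standard engine selects, at each step, the pool item with the greatest Fisher information at the provisional ability estimate. The toy sketch below applies that rule to a pool mixing general and "modified" items; the flag and all parameters are hypothetical and do not reproduce the study's algorithm.

```python
# Toy maximum-information item selection for a mixed general/modified
# CAT pool. All values and the "modified" flag are invented.
import numpy as np

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1 - p)

rng = np.random.default_rng(3)
pool_a = rng.uniform(0.7, 2.0, 50)     # discriminations
pool_b = rng.normal(0, 1, 50)          # difficulties
is_modified = rng.random(50) < 0.3     # hypothetical modified-item flag

theta_hat = 0.0                        # current provisional ability estimate
info = info_2pl(theta_hat, pool_a, pool_b)
next_item = int(np.argmax(info))
print(f"next item: {next_item} "
      f"(modified: {bool(is_modified[next_item])}, info: {info[next_item]:.3f})")
```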
Peer reviewed
Albano, Anthony D. – Journal of Educational Measurement, 2013
In many testing programs it is assumed that the context or position in which an item is administered does not have a differential effect on examinee responses to the item. Violations of this assumption may bias item response theory estimates of item and person parameters. This study examines the potentially biasing effects of item position. A…
Descriptors: Test Items, Item Response Theory, Test Format, Questioning Techniques
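The position effect at issue can be made concrete with a small simulation: if an item grows effectively harder the later it appears in the form (a linear drift assumed purely for illustration, under a Rasch model), its proportion-correct falls with position, and calibrations that pool responses across positions inherit that bias.

```python
# Simulation sketch of an item-position effect under a Rasch model.
# The linear difficulty drift is an assumption made for illustration.
import numpy as np

rng = np.random.default_rng(4)
n, b, drift = 100_000, 0.0, 0.02   # examinees, base difficulty, per-position drift
theta = rng.normal(0, 1, n)

for position in (1, 25, 50):
    b_eff = b + drift * (position - 1)        # difficulty grows with position
    p = 1 / (1 + np.exp(-(theta - b_eff)))
    print(f"position {position:2d}: proportion correct = {rng.binomial(1, p).mean():.3f}")
```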
Peer reviewed
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
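The design-effect argument rests on a standard result from cluster sampling (Kish): DEFF = 1 + (m - 1) * ICC, where m is the average cluster (e.g., classroom) size and ICC is the intraclass correlation. The worked numbers below are illustrative, not taken from the article.

```python
# Worked cluster-sampling design-effect example. Values are illustrative.
m, icc, n = 25, 0.20, 5000              # cluster size, intraclass corr., students
deff = 1 + (m - 1) * icc                # Kish design effect
n_effective = n / deff
print(f"design effect: {deff:.1f}")                  # 5.8
print(f"effective sample size: {n_effective:.0f}")   # ~862, not 5000
# Ignoring the design effect understates standard errors by sqrt(DEFF):
print(f"SEs understated by a factor of {deff ** 0.5:.2f}")
```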
Peer reviewed
Unsworth, Len – Pedagogies: An International Journal, 2014
Interpreting the image-language interface in multimodal texts is now well recognized as a crucial aspect of reading comprehension in a number of official school syllabi such as the recently published Australian Curriculum: English (ACE). This article outlines the relevant expected student learning outcomes in this curriculum and draws attention to…
Descriptors: Foreign Countries, National Curriculum, Reading Comprehension, Reading Tests
Peer reviewed
Li, Ying; Jiao, Hong; Lissitz, Robert W. – Journal of Applied Testing Technology, 2012
This study investigated the application of multidimensional item response theory (IRT) models to validate test structure and dimensionality. Multiple content areas or domains within a single subject often exist in large-scale achievement tests. Such areas or domains may cause multidimensionality or local item dependence, which both violate the…
Descriptors: Achievement Tests, Science Tests, Item Response Theory, Measures (Individuals)
Peer reviewed
Moses, Tim; Liu, Jinghua; Tan, Adele; Deng, Weiling; Dorans, Neil J. – ETS Research Report Series, 2013
In this study, differential item functioning (DIF) methods utilizing 14 different matching variables were applied to assess DIF in the constructed-response (CR) items from 6 forms of 3 mixed-format tests. Results suggested that the methods might produce distinct patterns of DIF results for different tests and testing programs, in that the DIF…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Item Analysis
Peer reviewed
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
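One agreement index such evaluations commonly report is quadratic-weighted kappa between human and machine scores. The snippet below computes it on invented scores purely as an illustration; the report's specific agreement criteria are not reproduced here.

```python
# Quadratic-weighted kappa for human-machine essay score agreement.
# Scores are invented for illustration.
import numpy as np
from sklearn.metrics import cohen_kappa_score

human   = np.array([4, 3, 5, 2, 4, 3, 5, 1, 3, 4])
machine = np.array([4, 3, 4, 2, 5, 3, 5, 2, 3, 4])
qwk = cohen_kappa_score(human, machine, weights="quadratic")
print(f"quadratic-weighted kappa: {qwk:.3f}")
```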
Peer reviewed
Speece, Deborah L.; Schatschneider, Christopher; Silverman, Rebecca; Case, Lisa Pericola; Cooper, David H.; Jacobs, Dawn M. – Elementary School Journal, 2011
Models of Response to Intervention (RTI) include parameters of assessment and instruction. This study focuses on assessment with the purpose of developing a screening battery that validly and efficiently identifies first-grade children at risk for reading problems. In an RTI model, these children would be candidates for early intervention. We…
Descriptors: Reading Difficulties, Early Intervention, Grade 1, Response to Intervention
Dorans, Neil J.; Liang, Longjuan; Puhan, Gautam – Educational Testing Service, 2010
Scores are the most visible and widely used products of a testing program. The choice of score scale has implications for test specifications, equating, and test reliability and validity, as well as for test interpretation. At the same time, the score scale should be viewed as infrastructure likely to require repair at some point. In this report…
Descriptors: Testing Programs, Standard Setting (Scoring), Test Interpretation, Certification
Peer reviewed
Harris, Deborah J.; Kolen, Michael J. – Educational and Psychological Measurement, 1988
Three methods of estimating point-biserial correlation coefficient standard errors were compared: (1) assuming normality; (2) not assuming normality; and (3) bootstrapping. Although errors estimated assuming normality were biased, such estimates were less variable and easier to compute, suggesting that this might be the method of choice in some…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Analysis, Statistical Analysis
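Two of the three approaches compared in this abstract can be sketched directly: a normal-theory approximation to the standard error versus a nonparametric bootstrap over examinees (the distribution-free analytic estimator is omitted here). The simulation below is illustrative only and does not reproduce the authors' design.

```python
# Point-biserial standard error: normal-theory approximation vs. bootstrap.
# Simulated data; illustration only.
import numpy as np
from scipy.stats import pointbiserialr

rng = np.random.default_rng(2)
n = 200
item = rng.binomial(1, 0.6, n)               # dichotomous item score
total = 0.5 * item + rng.normal(0, 1, n)     # continuous criterion score

r, _ = pointbiserialr(item, total)

# (1) One common normal-theory approximation: SE ~ (1 - r^2) / sqrt(n - 1)
se_normal = (1 - r**2) / np.sqrt(n - 1)

# (3) Bootstrap: resample examinees with replacement, recompute r each time.
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    boot.append(pointbiserialr(item[idx], total[idx])[0])

print(f"r = {r:.3f}, normal-theory SE = {se_normal:.4f}, "
      f"bootstrap SE = {np.std(boot):.4f}")
```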
Tal, Joseph – 1987
An experimental test battery (the Johnson O'Connor Research Foundation battery) designed to measure numerical facility was administered to 1,451 subjects at 12 testing centers across the United States over a 5-month period. Five work samples were included: (1) arithmetic; (2) counting backwards; (3) number reasoning; (4) rule learning; and (5)…
Descriptors: Aptitude Tests, Arithmetic, Computation, Factor Analysis
Shannon, Gregory A. – 1983
Rescoring of Center for Occupational and Professional Assessment objective-referenced tests is decided largely by content experts selected by client organizations. A few of the test items, statistically flagged for review, are not rescored. Some of this incongruence could be due to the use of the biserial correlation (r-biserial) as an…
Descriptors: Adults, Criterion Referenced Tests, Item Analysis, Occupational Tests
Martin, Michael T. – 1983
The Arizona Tax Research Association analyzed the Arizona Teacher Proficiency Examination (ATPE). An item analysis of the 150 questions administered to 2,430 applicants from July 1982 through March 1983 was undertaken. A minimum point biserial correlation coefficient of .33 indicated the validity of an item. Nearly two-thirds (64 percent) of the…
Descriptors: Correlation, Elementary Secondary Education, Item Analysis, Screening Tests