Publication Date

| Range | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 2 |
| Since 2007 (last 20 years) | 17 |

Descriptor

| Descriptor | Count |
| --- | --- |
| Item Analysis | 26 |
| Testing Programs | 26 |
| Test Items | 14 |
| Item Response Theory | 10 |
| Elementary Secondary Education | 6 |
| Test Construction | 6 |
| Evaluation Methods | 5 |
| Foreign Countries | 5 |
| Models | 4 |
| Standardized Tests | 4 |
| Test Validity | 4 |

Author

| Author | Count |
| --- | --- |
| Albano, Anthony D. | 2 |
| Ackermann, Richard | 1 |
| Breyer, F. Jay | 1 |
| Carlson, Janet F. | 1 |
| Case, Lisa Pericola | 1 |
| Childs, Ruth A. | 1 |
| Cooper, David H. | 1 |
| Davis, Robbie G. | 1 |
| Deng, Weiling | 1 |
| Diezmann, Carmel M. | 1 |
| French, Brian F. | 1 |

Publication Type

| Type | Count |
| --- | --- |
| Journal Articles | 26 |
| Reports - Research | 11 |
| Reports - Evaluative | 8 |
| Reports - Descriptive | 5 |
| Tests/Questionnaires | 1 |

Education Level

| Level | Count |
| --- | --- |
| Elementary Secondary Education | 7 |
| Secondary Education | 3 |
| Higher Education | 2 |
| Postsecondary Education | 2 |
| Adult Education | 1 |
| Early Childhood Education | 1 |
| Elementary Education | 1 |
| Grade 1 | 1 |
| Grade 11 | 1 |
| Grade 3 | 1 |
| Grade 4 | 1 |

Assessments and Surveys

| Assessment | Count |
| --- | --- |
| Graduate Record Examinations | 1 |
| National Assessment of… | 1 |
| Praxis Series | 1 |
| Program for International… | 1 |
| SAT (College Admission Test) | 1 |
| Stanford Achievement Tests | 1 |
Traditional vs Intersectional DIF Analysis: Considerations and a Comparison Using State Testing Data
Albano, Tony; French, Brian F.; Vo, Thao Thu – Applied Measurement in Education, 2024
Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…
Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests
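The traditional analysis that intersectional DIF extends is often the Mantel-Haenszel procedure: examinees are stratified on total test score, and the odds of a correct response are compared between a reference and a focal group within each stratum. A minimal sketch of that classic procedure (the function name and data layout are illustrative, not taken from the article):

```python
import numpy as np

def mantel_haenszel_dif(correct, group, total_score):
    """Mantel-Haenszel common odds ratio for one item.
    correct: 0/1 item responses; group: 0 = reference, 1 = focal;
    total_score: the stratifying (matching) variable."""
    num, den = 0.0, 0.0
    for s in np.unique(total_score):
        m = total_score == s
        a = np.sum((group[m] == 0) & (correct[m] == 1))  # reference, correct
        b = np.sum((group[m] == 0) & (correct[m] == 0))  # reference, incorrect
        c = np.sum((group[m] == 1) & (correct[m] == 1))  # focal, correct
        d = np.sum((group[m] == 1) & (correct[m] == 0))  # focal, incorrect
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n
        den += b * c / n
    return num / den  # a value near 1.0 indicates no DIF
```

An intersectional analysis differs mainly in how the focal group is formed: from the crossing of grouping variables (e.g., gender by language status) rather than one variable at a time.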
Xue, Kang; Huggins-Manley, Anne Corinne; Leite, Walter – Educational and Psychological Measurement, 2022
In data collected from virtual learning environments (VLEs), item response theory (IRT) models can be used to guide the ongoing measurement of student ability. However, such applications of IRT rely on unbiased item parameter estimates associated with test items in the VLE. Without formal piloting of the items, one can expect a large amount of…
Descriptors: Virtual Classrooms, Artificial Intelligence, Item Response Theory, Item Analysis
Wyse, Adam E.; Albano, Anthony D. – Applied Measurement in Education, 2015
This article used several data sets from a large-scale state testing program to examine the feasibility of combining general and modified assessment items in computerized adaptive testing (CAT) for different groups of students. Results suggested that several of the assumptions made when employing this type of mixed-item CAT may not be met for…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Testing Programs
Albano, Anthony D. – Journal of Educational Measurement, 2013
In many testing programs it is assumed that the context or position in which an item is administered does not have a differential effect on examinee responses to the item. Violations of this assumption may bias item response theory estimates of item and person parameters. This study examines the potentially biasing effects of item position. A…
Descriptors: Test Items, Item Response Theory, Test Format, Questioning Techniques
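The biasing mechanism the abstract describes can be illustrated with the two-parameter logistic (2PL) IRT model: if administering an item late in a form makes it effectively harder (fatigue, time pressure), the position effect acts like a shift in the item's difficulty parameter. A sketch under that assumption (the shift `delta` is a hypothetical quantity, not an estimate from the study):

```python
import numpy as np

def p_correct(theta, a, b):
    """2PL IRT: probability of a correct response given ability theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# A position effect modeled as an additive difficulty shift delta:
theta, a, b, delta = 0.0, 1.2, 0.0, 0.3
early = p_correct(theta, a, b)           # item in its calibrated position
late = p_correct(theta, a, b + delta)    # same item administered later
```

Calibrating the item from responses collected in the late position, while assuming the early-position parameters, is exactly the kind of violation that biases item and person estimates.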
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
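Kish's design effect formula gives a sense of the magnitudes involved: under cluster sampling (e.g., whole classrooms), the variance of an estimate is inflated relative to a simple random sample of the same size. A small illustration (the numbers are hypothetical):

```python
def design_effect(cluster_size, icc):
    """Kish's design effect for cluster sampling: variance inflation
    relative to a simple random sample of the same total size."""
    return 1.0 + (cluster_size - 1) * icc

def effective_n(n, deff):
    """Effective sample size after accounting for the design effect."""
    return n / deff

# 25 students per classroom, intraclass correlation 0.2:
deff = design_effect(25, 0.2)       # 5.8; standard errors inflate by sqrt(deff)
n_eff = effective_n(1000, deff)     # 1,000 sampled students act like ~172
```

Ignoring the design effect, as the article argues, means treating the full 1,000 as independent observations and so understating the sampling error in item and test statistics.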
Multimodal Reading Comprehension: Curriculum Expectations and Large-Scale Literacy Testing Practices
Unsworth, Len – Pedagogies: An International Journal, 2014
Interpreting the image-language interface in multimodal texts is now well recognized as a crucial aspect of reading comprehension in a number of official school syllabi such as the recently published Australian Curriculum: English (ACE). This article outlines the relevant expected student learning outcomes in this curriculum and draws attention to…
Descriptors: Foreign Countries, National Curriculum, Reading Comprehension, Reading Tests
Guo, Hongwen; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2011
Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…
Descriptors: Testing Programs, Measurement, Item Analysis, Error of Measurement
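A common form of the kernel regression estimator for item response curves is the Nadaraya-Watson smoother with the observed total score as the regressor, which is the practice whose measurement-error bias the article examines. A minimal sketch (function name and defaults are illustrative):

```python
import numpy as np

def kernel_irc(scores, responses, grid, bandwidth=1.0):
    """Nadaraya-Watson kernel estimate of an item response curve:
    P(correct | score) evaluated at each point of `grid`, using a
    Gaussian kernel over the observed scores."""
    grid = np.asarray(grid, dtype=float)[:, None]       # shape (G, 1)
    scores = np.asarray(scores, dtype=float)[None, :]   # shape (1, N)
    w = np.exp(-0.5 * ((grid - scores) / bandwidth) ** 2)
    return (w @ np.asarray(responses, dtype=float)) / w.sum(axis=1)
```

Because the observed score contains measurement error, regressing on it attenuates the estimated curve; the article's concern is the size of that bias in operational item analysis.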
Li, Ying; Jiao, Hong; Lissitz, Robert W. – Journal of Applied Testing Technology, 2012
This study investigated the application of multidimensional item response theory (IRT) models to validate test structure and dimensionality. Multiple content areas or domains within a single subject often exist in large-scale achievement tests. Such areas or domains may cause multidimensionality or local item dependence, which both violate the…
Descriptors: Achievement Tests, Science Tests, Item Response Theory, Measures (Individuals)
Carlson, Janet F.; Geisinger, Kurt F. – International Journal of Testing, 2012
The test review process used by the Buros Center for Testing is described as a series of 11 steps: (1) identifying tests to be reviewed, (2) obtaining tests and preparing test descriptions, (3) determining whether tests meet review criteria, (4) identifying appropriate reviewers, (5) selecting reviewers, (6) sending instructions and materials to…
Descriptors: Testing, Test Reviews, Evaluation Methods, Evaluation Criteria
Marushina, Albina – Journal of Mathematics Education at Teachers College, 2012
This paper aims to tell how the Russian national examination in mathematics (the Uniform State Examination or USE) has been conducted most recently. The author must say at once that the history of the system of secondary school graduation examinations or even the history of the USE will be covered only to the small degree that is necessary for…
Descriptors: Foreign Countries, Mathematics Tests, National Competency Tests, Secondary School Mathematics
Moses, Tim; Liu, Jinghua; Tan, Adele; Deng, Weiling; Dorans, Neil J. – ETS Research Report Series, 2013
In this study, differential item functioning (DIF) methods utilizing 14 different matching variables were applied to assess DIF in the constructed-response (CR) items from 6 forms of 3 mixed-format tests. Results suggested that the methods might produce distinct patterns of DIF results for different tests and testing programs, in that the DIF…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Item Analysis
Somerset, Anthony – Compare: A Journal of Comparative and International Education, 2011
Educational practitioners rely predominantly on measures of outcome, rather than of inputs or process, in making judgements as to quality. Outcome measures are available from two main sources: (1) the relatively new international assessment systems; and (2) the traditional national examinations systems. The two types of system differ in their…
Descriptors: Testing Programs, Educational Quality, National Competency Tests, Educational Improvement
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
Jacobsen, Jared; Ackermann, Richard; Eguez, Jane; Ganguli, Debalina; Rickard, Patricia; Taylor, Linda – Journal of Applied Testing Technology, 2011
A computer adaptive test (CAT) is a delivery methodology that serves the larger goals of the assessment system in which it is embedded. A thorough analysis of the assessment system for which a CAT is being designed is critical to ensure that the delivery platform is appropriate and addresses all relevant complexities. As such, a CAT engine must be…
Descriptors: Delivery Systems, Testing Programs, Computer Assisted Testing, Foreign Countries
Speece, Deborah L.; Schatschneider, Christopher; Silverman, Rebecca; Case, Lisa Pericola; Cooper, David H.; Jacobs, Dawn M. – Elementary School Journal, 2011
Models of Response to Intervention (RTI) include parameters of assessment and instruction. This study focuses on assessment with the purpose of developing a screening battery that validly and efficiently identifies first-grade children at risk for reading problems. In an RTI model, these children would be candidates for early intervention. We…
Descriptors: Reading Difficulties, Early Intervention, Grade 1, Response to Intervention
