Publication Date
| In 2026 | 0 |
| Since 2025 | 2 |
| Since 2022 (last 5 years) | 17 |
Descriptor
| Test Items | 17 |
| Testing Problems | 17 |
| Item Response Theory | 9 |
| Foreign Countries | 8 |
| Language Tests | 5 |
| English (Second Language) | 4 |
| Item Analysis | 4 |
| Scores | 4 |
| Second Language Learning | 4 |
| Test Construction | 4 |
| Equated Scores | 3 |
| More ▼ | |
Source
Author
| Allison Ames | 1 |
| Andrés Christiansen | 1 |
| Bezirhan, Ummugul | 1 |
| Brandon Crawford | 1 |
| Camenares, Devin | 1 |
| Chang, Kuo-Feng | 1 |
| Chen, Yunxiao | 1 |
| Daniel Ginting | 1 |
| Ho, Pok Jing | 1 |
| James D. Weese | 1 |
| Janssen, Gerriet | 1 |
| More ▼ | |
Publication Type
| Journal Articles | 16 |
| Reports - Research | 12 |
| Reports - Evaluative | 2 |
| Dissertations/Theses -… | 1 |
| Information Analyses | 1 |
| Reports - Descriptive | 1 |
| Tests/Questionnaires | 1 |
Education Level
| Higher Education | 4 |
| Postsecondary Education | 4 |
| Secondary Education | 2 |
| Elementary Education | 1 |
| Elementary Secondary Education | 1 |
| Grade 4 | 1 |
| Intermediate Grades | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
| High School Longitudinal… | 1 |
| Pearson Test of English… | 1 |
| Program for International… | 1 |
| Progress in International… | 1 |
| Youth Risk Behavior Survey | 1 |
What Works Clearinghouse Rating
Chang, Kuo-Feng – ProQuest LLC, 2022
This dissertation was designed to foster a deeper understanding of population invariance in the context of composite-score equating and provide practitioners with guidelines for addressing score equity concerns at the composite score level. The purpose of this dissertation was threefold. The first was to compare different composite equating…
Descriptors: Test Items, Equated Scores, Methods, Design
James D. Weese; Ronna C. Turner; Allison Ames; Xinya Liang; Brandon Crawford – Journal of Experimental Education, 2024
In this study a standardized effect size was created for use with the SIBTEST procedure. Using this standardized effect size, a single set of heuristics was developed that are appropriate for data fitting different item response models (e.g., 2-parameter logistic, 3-parameter logistic). The standardized effect size rescales the raw beta-uni value…
Descriptors: Test Bias, Test Items, Item Response Theory, Effect Size
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2022
Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores, and hence to incomplete data, on credentialing tests such as the United States Medical Licensing examination. Feinberg compared four approaches for reporting pass-fail decisions to the examinees with incomplete data on credentialing…
Descriptors: Testing Problems, High Stakes Tests, Credentials, Test Items
Kim, Sooyeon; Walker, Michael E. – Educational Measurement: Issues and Practice, 2022
Test equating requires collecting data to link the scores from different forms of a test. Problems arise when equating samples are not equivalent and the test forms to be linked share no common items by which to measure or adjust for the group nonequivalence. Using data from five operational test forms, we created five pairs of research forms for…
Descriptors: Ability, Tests, Equated Scores, Testing Problems
von Davier, Matthias; Bezirhan, Ummugul – Educational and Psychological Measurement, 2023
Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population…
Descriptors: Robustness (Statistics), Test Items, Item Analysis, Goodness of Fit
Andrés Christiansen; Rianne Janssen – Educational Assessment, Evaluation and Accountability, 2024
In international large-scale assessments, students may not be compelled to answer every test item: a student can decide to skip a seemingly difficult item or may drop out before the end of the test is reached. The way these missing responses are treated will affect the estimation of the item difficulty and student ability, and ultimately affect…
Descriptors: Test Items, Item Response Theory, Grade 4, International Assessment
Mario I. Suárez – Educational Studies: Journal of the American Educational Studies Association, 2024
The increase in youth's self-identification as trans in the United States and Canada has created new urgency in schools to meet the needs of these students, yet education survey researchers have yet to find ways to assess their educational outcomes based on sex and gender. In this critical systematic review, I provide an overview of surveys from…
Descriptors: Measures (Individuals), Sexual Identity, Identification (Psychology), LGBTQ People
Chen, Yunxiao; Lee, Yi-Hsuan; Li, Xiaoou – Journal of Educational and Behavioral Statistics, 2022
In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric…
Descriptors: Standardized Tests, Test Items, Test Validity, Scores
Camenares, Devin – International Journal for the Scholarship of Teaching and Learning, 2022
Balancing assessment of learning outcomes with the expectations of students is a perennial challenge in education. Difficult exams, in which many students perform poorly, exacerbate this problem and can inspire a wide variety of interventions, such as a grading curve. However, addressing poor performance can sometimes distort or inflate grades and…
Descriptors: College Students, Student Evaluation, Tests, Test Items
Janssen, Gerriet – Language Testing, 2022
This article provides a single, common-case study of a test retrofit project at one Colombian university. It reports on how the test retrofit project was carried out and describes the different areas of language assessment literacy the project afforded local teacher stakeholders. This project was successful in that it modified the test constructs…
Descriptors: Language Tests, Placement Tests, Language Teachers, College Faculty
Lee, Chansoon; Qian, Hong – Educational and Psychological Measurement, 2022
Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential…
Descriptors: Computer Assisted Testing, Adaptive Testing, Licensing Examinations (Professions), Item Response Theory
Uysal, Ibrahim; Sahin-Kürsad, Merve; Kiliç, Abdullah Faruk – Participatory Educational Research, 2022
The aim of the study was to examine the common items in the mixed format (e.g., multiple-choices and essay items) contain parameter drifts in the test equating processes performed with the common item nonequivalent groups design. In this study, which was carried out using Monte Carlo simulation with a fully crossed design, the factors of test…
Descriptors: Test Items, Test Format, Item Response Theory, Equated Scores
Yi Zou; Ying Zheng; Jingwen Wang – International Journal of Language Testing, 2025
The Pearson Test of English Academic (PTE-A), a widely used high-stakes language proficiency test for university admissions and migration purposes, underwent a notable change from a three-hour to a two-hour version in November 2021. The implementation of the new version has prompted inquiries into the washback effects on various stakeholders.…
Descriptors: Testing Problems, Test Preparation, High Stakes Tests, English (Second Language)
Robitzsch, Alexander; Lüdtke, Oliver – Large-scale Assessments in Education, 2023
One major aim of international large-scale assessments (ILSA) like PISA is to monitor changes in student performance over time. To accomplish this task, a set of common items (i.e., link items) is repeatedly administered in each assessment. Linking methods based on item response theory (IRT) models are used to align the results from the different…
Descriptors: Educational Trends, Trend Analysis, International Assessment, Achievement Tests
Patrisius Istiarto Djiwandono; Daniel Ginting – Language Education & Assessment, 2025
The teaching of English as a foreign language in Indonesia has a long history, and it is always important to ask whether the assessment of the students' language skills has been valid and reliable. A screening of many articles in several prominent databases reveal that a number of evaluation studies have been done by Indonesian scholars in the…
Descriptors: Foreign Countries, Language Tests, English (Second Language), Second Language Learning
Previous Page | Next Page »
Pages: 1 | 2
Direct link
Peer reviewed
