Showing all 12 results
Peer reviewed
Dragos Iliescu; Dave Bartram; Pia Zeinoun; Matthias Ziegler; Paula Elosua; Stephen Sireci; Kurt F. Geisinger; Aletta Odendaal; Maria Elena Oliveri; Jon Twing; Wayne Camara – International Journal of Testing, 2024
The "Test Adaptation Reporting Standards" (TARES), or "TARES statement," was developed to alleviate the problems arising from inadequate reporting of test adaptation procedures. The TARES contains a short preamble and a checklist that comprises an evidence-based minimum set of information for reporting in test adaptations. The…
Descriptors: Test Use, Outcome Measures, Check Lists, Evidence Based Practice
Peer reviewed
Kylie Gorney; Mark D. Reckase – Journal of Educational Measurement, 2025
In computerized adaptive testing, item exposure control methods are often used to provide a more balanced usage of the item pool. Many of the most popular methods, including the restricted method (Revuelta and Ponsoda), use a single maximum exposure rate to limit the proportion of times that each item is administered. However, Barrada et al.…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Item Banks
Peer reviewed
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Peer reviewed
Tatiana Chaiban; Zeinab Nahle; Ghaith Assi; Michelle Cherfane – Discover Education, 2024
Background: Since it was first launched, ChatGPT, a Large Language Model (LLM), has been widely used across different disciplines, particularly the medical field. Objective: The main aim of this review is to thoroughly assess the performance of the distinct version of ChatGPT in subspecialty written medical proficiency exams and the factors that…
Descriptors: Medical Education, Accuracy, Artificial Intelligence, Computer Software
Peer reviewed
Joye, Nelly; Broc, Lucie; Marshall, Chloë Ruth; Dockrell, Julie Elizabeth – Journal of Speech, Language, and Hearing Research, 2022
Purpose: This study offers the first description of misspellings across elementary school using the Phonological, Orthographic and Morphological Assessment of Spelling (POMAS), a linguistic framework based on Triple Word Form theory, adapted for French (POMAS-FR). It aims to test the "universality" of POMAS and its suitability to track…
Descriptors: Spelling, Elementary School Students, French, Error Patterns
Peer reviewed
Jessica B. Koslouski; Sandra M. Chafouleas; Amy Briesch; Jacqueline M. Caemmerer; Brittany Melo – School Mental Health, 2024
We are developing the Equitable Screening to Support Youth (ESSY) Whole Child Screener to address concerns prevalent in existing school-based screenings that impede goals to advance educational equity using universal screeners. Traditional assessment development does not include end users in the early development phases, instead relying on a…
Descriptors: Screening Tests, Psychometrics, Validity, Child Development
Peer reviewed
Arikan, Serkan; Aybek, Eren Can – Educational Measurement: Issues and Practice, 2022
Many scholars compared various item discrimination indices in real or simulated data. Item discrimination indices, such as item-total correlation, item-rest correlation, and IRT item discrimination parameter, provide information about individual differences among all participants. However, there are tests that aim to select a very limited number…
Descriptors: Monte Carlo Methods, Item Analysis, Correlation, Individual Differences
Peer reviewed
Jessica B. Koslouski; Sandra M. Chafouleas; Amy Briesch; Jacqueline M. Caemmerer; Brittany Melo – Grantee Submission, 2024
We are developing the Equitable Screening to Support Youth (ESSY) Whole Child Screener to address concerns prevalent in existing school-based screenings that impede goals to advance educational equity using universal screeners. Traditional assessment development does not include end users in the early development phases, instead relying on a…
Descriptors: Screening Tests, Usability, Decision Making, Validity
Peer reviewed
Al-Owidha, Amjed A. – Language Testing in Asia, 2018
Background: This study investigated the psychometric properties of the recently developed Qiyas for L1 Arabic language test using a Rasch measurement framework. Methods: Responses from 271 examinees were analyzed in this study. The test is hypothesized to involve one dominant factor that assesses four skills: reading comprehension, rhetorical…
Descriptors: Semitic Languages, Language Tests, Psychometrics, Reading Comprehension
Peer reviewed
Fuller, Edward J.; Hollingworth, Liz – Educational Administration Quarterly, 2014
Purpose: The purpose of this article is to examine the assumptions underlying efforts to evaluate principal effectiveness in terms of student test scores, to review extant research on efforts to estimate principal effectiveness, and to discuss the appropriateness of including estimates of principal effectiveness in evaluations of principals.…
Descriptors: Principals, Administrator Effectiveness, Administrator Evaluation, Computation
Peer reviewed
January, Stacy-Ann A.; Ardoin, Scott P.; Christ, Theodore J.; Eckert, Tanya L.; White, Mary Jane – School Psychology Review, 2016
Universal screening in elementary schools often includes administering curriculum-based measurement in reading (CBM-R); but in first grade, nonsense word fluency (NWF) and, to a lesser extent, word identification fluency (WIF) are used because of concerns that CBM-R is too difficult for emerging readers. This study used Kane's argument-based…
Descriptors: Curriculum Based Assessment, Reading Tests, Test Interpretation, Test Use
McLaughlin, Kenneth F. – Office of Education, US Department of Health, Education, and Welfare, 1964
Under Title V, Guidance, Counseling, and Testing of the "National Defense Education Act of 1958," the Congress of the United States has recognized the value of tests as a tool which may be used to help make an early determination of the aptitudes and abilities of the students in U.S. schools. This bulletin attempts to explain the use and…
Descriptors: Educational History, School Guidance, Educational Testing, Aptitude Tests