Publication Date
| In 2026 | 34 |
| Since 2025 | 2433 |
| Since 2022 (last 5 years) | 12948 |
| Since 2017 (last 10 years) | 34073 |
| Since 2007 (last 20 years) | 68564 |
Descriptor
| Foreign Countries | 30631 |
| Test Validity | 21786 |
| Scores | 18282 |
| Academic Achievement | 16944 |
| Test Construction | 16779 |
| Test Reliability | 15055 |
| Achievement Tests | 14883 |
| Standardized Tests | 14734 |
| Comparative Analysis | 14432 |
| Elementary Secondary Education | 13052 |
| Language Tests | 12558 |
Audience
| Practitioners | 5034 |
| Teachers | 3394 |
| Researchers | 2630 |
| Policymakers | 1232 |
| Administrators | 979 |
| Students | 687 |
| Parents | 325 |
| Counselors | 216 |
| Community | 162 |
| Support Staff | 50 |
| Media Staff | 34 |
Location
| Turkey | 2831 |
| Australia | 2433 |
| Canada | 2273 |
| California | 1857 |
| United States | 1729 |
| Texas | 1618 |
| China | 1583 |
| United Kingdom | 1316 |
| Florida | 1312 |
| United Kingdom (England) | 1205 |
| Germany | 1125 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 121 |
| Meets WWC Standards with or without Reservations | 189 |
| Does not meet standards | 174 |
Clarke-Midura, Jody; Silvis, Deborah; Shumway, Jessica F.; Lee, Victor R.; Kozlowski, Joseph S. – Computer Science Education, 2021
Background and Context: There is a need for early childhood assessments of computational thinking (CT). However, there is not consensus on a guiding framework, definition, or set of proxies in which to measure CT. We are addressing this problem by using Evidence Centered Design (ECD) to develop an assessment of kindergarten-aged children's CT.…
Descriptors: Kindergarten, Young Children, Computation, Thinking Skills
Hung, Su-Pin; Wu, Ching-Lin – Creativity Research Journal, 2021
The Remote Associates Test, generally used in creativity research, has Chinese versions for the three levels of "radical-word-vocabulary." However, research has not been conducted on the influence of the item components on the difficulties among these Chinese Remote Associates Tests (CRATs). The present study selected six item components…
Descriptors: Creativity Tests, Chinese, Test Items, Difficulty Level
Cai, Liuhan; Albano, Anthony D.; Roussos, Louis A. – Measurement: Interdisciplinary Research and Perspectives, 2021
Multistage testing (MST), an adaptive test delivery mode that involves algorithmic selection of predefined item modules rather than individual items, offers a practical alternative to linear and fully computerized adaptive testing. However, interactions across stages between item modules and examinee groups can lead to challenges in item…
Descriptors: Adaptive Testing, Test Items, Item Response Theory, Test Construction
Courey, Karyssa A.; Lee, Michael D. – AERA Open, 2021
Student evaluations of teaching are widely used to assess instructors and courses. Using a model-based approach and Bayesian methods, we examine how the direction of the scale, labels on scales, and the number of options affect the ratings. We conduct a within-participants experiment in which respondents evaluate instructors and lectures using…
Descriptors: Student Evaluation of Teacher Performance, Rating Scales, Response Style (Tests), College Students
Deygers, Bart – Language Assessment Quarterly, 2021
To date, language testing research has devoted little attention to adult L2 learners with low levels of alphabetic print literacy (LESLLA), even though this population makes up for a substantial proportion of the candidature of language tests used for migration purposes. This special issue focuses on LESLLA learners, shows how literacy impacts…
Descriptors: Alphabets, Printed Materials, Written Language, Language Tests
Wessels, Marleen D.; Paap, Muirne C. S.; Van der Putten, Annette A. J. – Journal of Intellectual & Developmental Disability, 2021
Background: Research about the psychometric properties of the Behavioural Appraisal Scales (BAS) in people with profound intellectual and multiple disabilities (PIMD) is limited. This study evaluates invariance in factor structure, item bias and convergent validity of the BAS. Methods: Data on the BAS from two studies (n = 25; n = 52) were…
Descriptors: Test Validity, Ability Identification, Severe Intellectual Disability, Multiple Disabilities
Kush, Joseph C.; Canivez, Gary L. – International Journal of School & Educational Psychology, 2021
This study utilized confirmatory factor analyses to examine the latent factor structure of the Wechsler Intelligence Scale for Children--Fourth Edition, Italian adaptation (WISC-IV Italian) standardization sample. One through five, oblique first-order factor models and higher-order as well as bifactor models were examined and compared using CFA.…
Descriptors: Children, Intelligence Tests, Foreign Countries, Construct Validity
Karen Blackburn Hoeve – ProQuest LLC, 2021
High stakes test-based accountability systems primarily rely on aggregates and derivatives of scores from tests that were originally developed to measure individual student mastery of content specifications. Current validity models do not explicitly address this use of aggregate scores to measure the performance of teachers, administrators, and…
Descriptors: Accountability, Test Validity, High Stakes Tests, Hierarchical Linear Modeling
Nichole Guadiano Nava – ProQuest LLC, 2021
In an era focused on testing to ensure accountability, educators question the fairness and wisdom of the standardized tests as a means to evaluate the effectiveness of instruction. The problem addressed by this study was the impact of stress, anxiety, and burnout on elementary teachers as a result of expectations in administering state…
Descriptors: Test Anxiety, Testing, Standardized Tests, Elementary School Teachers
Wang, Lin – ETS Research Report Series, 2019
Rearranging response options in different versions of a test of multiple-choice items can be an effective strategy against cheating on the test. This study investigated if rearranging response options would affect item performance and test score comparability. A study test was assembled as the base version from which 3 variant versions were…
Descriptors: Multiple Choice Tests, Test Items, Test Format, Scores
Ip, Edward H.; Strachan, Tyler; Fu, Yanyan; Lay, Alexandra; Willse, John T.; Chen, Shyh-Huei; Rutkowski, Leslie; Ackerman, Terry – Journal of Educational Measurement, 2019
Test items must often be broad in scope to be ecologically valid. It is therefore almost inevitable that secondary dimensions are introduced into a test during test development. A cognitive test may require one or more abilities besides the primary ability to correctly respond to an item, in which case a unidimensional test score overestimates the…
Descriptors: Test Items, Test Bias, Test Construction, Scores
Karlsson, Linn – Education Inquiry, 2022
This paper analyses the associations between computer use in schools and at home and test scores by using TIMSS data covering over 900,000 children in fourth grade. When controlling for school fixed effects, pupils who use computers at school, especially those who use them frequently are found to achieve less than students who never use computers.…
Descriptors: Scores, Elementary School Students, Achievement Tests, Elementary Secondary Education
Dimova, Slobodanka – Language Teaching Research Quarterly, 2022
Drawing on Glenn Fulcher's extensive work in performance-based language assessment of speaking, this paper explores the assessment of L2 speaking ability in local language testing contexts. For that purpose, I review Fulcher's influential work that highlights the relationship between the speaking construct, the task, the performance, and the…
Descriptors: Language Tests, Speech Communication, Performance Based Assessment, Second Language Learning
Huang, Heng-Tsung Danny; Hung, Shao-Ting Alan; Chao, Hsiu-Yi; Chen, Jyun-Hong; Lin, Tsui-Peng; Shih, Ching-Lin – Language Assessment Quarterly, 2022
Prompted by Taiwanese university students' increasing demand for English proficiency assessment, the absence of a test designed specifically for this demographic subgroup, and the lack of a localized and freely-accessible proficiency measure, this project set out to develop and validate a computerized adaptive English proficiency testing (E-CAT)…
Descriptors: Computer Assisted Testing, English (Second Language), Second Language Learning, Second Language Instruction
Leave Them Kids Alone! The Effects of Abolishing Grade Repetition: Evidence from a Nationwide Reform
Cabrera-Hernandez, Francisco – Education Economics, 2022
This paper evaluates the impact on dropout rates of a policy change in Mexico that eliminates grade retention for all first to third-grade students, causing a sharp reduction in repetition rates. I use a 12-year panel of schools to exploit such variation and estimate Difference-in-Difference models showing an average decrease in dropout rates of…
Descriptors: Grade Repetition, Educational Change, Dropout Rate, Educational Policy

