Abigail R. Vild; Maggie E. Wilson; Christopher A. Was – Journal of Research in Education, 2025
Theories of self-regulated learning suggest a positive link between knowledge monitoring accuracy (the ability to predict test performance) and performance on tests. Put differently, students who accurately monitor their knowledge of course content more efficiently regulate study of course materials. However, a plethora of literature indicates…
Descriptors: Student Satisfaction, Undergraduate Students, Scores, Prediction
Joshua B. Gilbert; Zachary Himmelsbach; James Soland; Mridul Joshi; Benjamin W. Domingue – Journal of Policy Analysis and Management, 2025
Analyses of heterogeneous treatment effects (HTE) are common in applied causal inference research. However, when outcomes are latent variables assessed via psychometric instruments such as educational tests, standard methods ignore the potential HTE that may exist among the individual items of the outcome measure. Failing to account for…
Descriptors: Item Response Theory, Test Items, Error of Measurement, Scores
Jessica Daikeler; Joss Roßmann; David Bretschi; Tobias Gummer; Henning Silber – Field Methods, 2025
Mostly in web surveys, attention checks have been proposed to identify inattentive respondents in self-administered surveys as previous research has argued that low-quality answers may introduce severe biases in data analyses. The increasing popularity of mixing survey modes for conducting probability-based surveys amplifies the need for…
Descriptors: Online Surveys, Mail Surveys, Attention, Response Style (Tests)
Aysun Acun; Burcu Bayrak Kahraman; Semanur Bilgiç – European Journal of Education, 2025
Humane care refers to an approach that focuses on people, considers individual differences, and aims to provide ideal care, regardless of the needs of the individual. This approach can also be called the "humanistic approach". While nurses are among the main practitioners of this understanding, nursing students should also receive…
Descriptors: Nursing Students, Test Validity, Likert Scales, Health Services
Tim Moses; YoungKoung Kim – Journal of Educational Measurement, 2025
This study considers the estimation of marginal reliability and conditional accuracy measures using a generalized recursion procedure with several IRT-based ability and score estimators. The estimators include MLE, TCC, and EAP abilities, and corresponding test scores obtained with different weightings of the item scores. We consider reliability…
Descriptors: Item Response Theory, Scoring, Reliability, Accuracy
Casandra Koevoets-Beach; Donya Kurdi; Morgan Balabanoff – Practical Assessment, Research & Evaluation, 2025
Confidence tiers have been paired with multiple choice items across different fields since the early twentieth century and have seen widespread adoption in discipline-based education research fields seeking to evaluate aspects of self-regulated learning. The design of two-tiered confidence judgments impacts interpretability and perception of their…
Descriptors: Confidence Testing, Interviews, Metacognition, Undergraduate Students
Kim, Peter – Language Teaching Research Quarterly, 2021
Foreign language aptitude is defined as one's potential to learn a second language. A language learner with higher aptitude is predicted to learn more, faster, and reach a higher level of proficiency. If this is the case, one way to validate the construct of aptitude and its measure is to conduct a validation study in which measures of aptitude is…
Descriptors: Morphology (Languages), Syntax, Second Language Learning, Second Language Instruction
Kondo, Kanako; Mizuta, Masanobu; Kawai, Yoshitaka; Sogami, Tohru; Fujimura, Shintaro; Kojima, Tsuyoshi; Abe, Chika; Tanaka, Ryo; Shiromoto, Osamu; Uozumi, Ryuji; Kishimoto, Yo; Tateya, Ichiro; Omori, Koichi; Haji, Tomoyuki – Journal of Speech, Language, and Hearing Research, 2021
Purpose: Auditory-perceptual evaluation is essential for the assessment of voice quality. The Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) provides a standardized protocol and assessment form for clinicians to analyze the voice quality and has been adapted into several different languages. The aims of this study were to develop the…
Descriptors: Japanese, Test Validity, Test Reliability, Voice Disorders
Lenz, A. Stephen; Rocha, Lauran; Aras, Yahyahan – International Journal for the Advancement of Counselling, 2021
A systematic search was conducted to identify measures of school climate developed and reported between 1993 and 2017. We coded data related to participant and setting characteristics, qualities of measures, amounts of validity evidence, and degrees of reliability estimates. Results indicated 9 school climate measures featuring disparate…
Descriptors: Educational Environment, Evaluation, Literature Reviews, Test Construction
Gao, Xuliang; Ma, Wenchao; Wang, Daxun; Cai, Yan; Tu, Dongbo – Journal of Educational and Behavioral Statistics, 2021
This article proposes a class of cognitive diagnosis models (CDMs) for polytomously scored items with different link functions. Many existing polytomous CDMs can be considered as special cases of the proposed class of polytomous CDMs. Simulation studies were carried out to investigate the feasibility of the proposed CDMs and the performance of…
Descriptors: Cognitive Measurement, Models, Test Items, Scoring
Kaya, Fatih – Education Quarterly Reviews, 2021
The aim of this study was to develop a valid and reliable measurement tool in order to determine the democracy levels of teacher candidates. During the scale development process in the research, the validity and reliability studies were conducted through three independent study groups. The first study group consisted of 627 students studying at…
Descriptors: Democracy, Measures (Individuals), Preservice Teachers, Test Validity
Oh, Seungbin; Shillingford-Butler, Ann – Measurement and Evaluation in Counseling and Development, 2021
The authors present the development and examination of the "Client Assessment of Multicultural Competent Behavior" (CAMCB) scores. The CAMCB was designed to measure therapists' multicultural competent behaviors within the context of the therapeutic process, from the clients' perspective. In this article, three phases of the study are presented…
Descriptors: Counselor Evaluation, Test Construction, Cultural Awareness, Test Validity
Tunc, Emine Burcu; Parlak, Simel; Uluman, Muge; Eryigit, Derya – International Journal of Assessment Tools in Education, 2021
The aim of this research is to develop the Hostility in Pandemic Scale (HPS) for the population of Turkey in order to determine the hostility levels of individuals, a factor affecting the mental well-being of society during the pandemic. The study group consists of 855 individuals between the ages of 18 and 65 from different genders, and have experienced the…
Descriptors: Psychological Patterns, Pandemics, COVID-19, Test Construction
Schulte, Niklas; Holling, Heinz; Bürkner, Paul-Christian – Educational and Psychological Measurement, 2021
Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high.…
Descriptors: Questionnaires, Measurement Techniques, Test Format, Scoring
Markelz, Andrew M.; Riden, Benjamin S.; Zoder-Martell, Kimberly A.; Miller, Joseph E.; Bolinger, Sarah J. – Journal of Positive Behavior Interventions, 2021
Supported by decades of research on praise and its effect on student behaviors, we developed the Behavior-Specific Praise--Observation Tool (BSP-OT) to measure characteristics of effective praise. We evaluated interrater reliability of the BSP-OT to measure praise specificity, contingency, and variety using intraclass correlation (ICC) and Cohen's…
Descriptors: Test Reliability, Classroom Observation Techniques, Positive Reinforcement, Interrater Reliability

