Power, Jason Richard; Tanner, David – European Journal of Engineering Education, 2023
Self and peer assessments have been identified as effective strategies to develop a deeper understanding of complex concepts, enhance meta-cognitive capacity, and support learner self-efficacy. This study examines data related to peer and self-assessment exercises completed within a university engineering programme (n=61). Data related to…
Descriptors: Peer Evaluation, Self Evaluation (Individuals), Feedback (Response), Engineering Education
Mohammad Hmoud; Hadeel Swaity; Eman Anjass; Eva María Aguaded-Ramírez – Electronic Journal of e-Learning, 2024
This research aimed to develop and validate a rubric to assess Artificial Intelligence (AI) chatbots' effectiveness in accomplishing tasks, particularly within educational contexts. Given the rapidly growing integration of AI in various sectors, including education, a systematic and robust tool for evaluating AI chatbot performance is essential.…
Descriptors: Artificial Intelligence, Man Machine Systems, Natural Language Processing, Test Construction
Simon Massey – International Journal of Social Research Methodology, 2024
The UK-based article develops a quantitative method for measuring 8-9-year-old children's Gender Ability Beliefs through drawings, assessing the reliability and validity of the measure and its association with respondents' self-reported gender. The measure, originally used in the US by Beilock et al. (2010), required respondents to draw two…
Descriptors: Children, Sex, Childrens Attitudes, Gender Differences
Yuting Han; Zhehan Jiang; Lingling Xu; Fen Cai – AERA Online Paper Repository, 2024
To address the computational constraints of parameter estimation in the polytomous Cognitive Diagnosis Model (pCDM) in large-scale high data volume situations, this study proposes two two-stage polytomous attribute estimation methods: P_max and P_linear. The effects of the two-stage methods were studied via a Monte Carlo simulation study, and the…
Descriptors: Medical Education, Licensing Examinations (Professions), Measurement Techniques, Statistical Data
Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients
Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022
The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…
Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory
Delphine Franco; Ruben Vanderlinde; Martin Valcke – European Journal of Education, 2025
Complex competences, such as managing students' aggressive behaviour, are challenging to develop during teacher training. Recently, video-based simulations have been considered promising, yet suitable assessment instruments are limitedly available. This paper reports on the design and evaluation of a video-based assessment tool tailored to measure…
Descriptors: Preservice Teachers, Preservice Teacher Education, Student Behavior, Aggression
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Amanda Timmerman; Vasiliki Totsika; Valerie Lye; Laura Crane; Audrey Linden; Elizabeth Pellicano – Autism: The International Journal of Research and Practice, 2025
Autistic people are more likely to have co-occurring mental health conditions compared to the general population, and mental health interventions have been identified as a top research priority by autistic people and the wider autism community. Autistic adults have also communicated that quality of life is the outcome that matters most to them in…
Descriptors: Adults, Autism Spectrum Disorders, Quality of Life, Randomized Controlled Trials
Constructing a Roadmap to Measure the Quality of Business Assessments Aimed at Curriculum Management
Silva, Thanuci; Santos, Regiane dos; Mallet, Débora – Journal of Education for Business, 2023
Assuring the quality of education is a concern of learning institutions. To do so, it is necessary to have assertive learning management, with consistent data on students' outcomes. This research provides associate deans and researchers, a roadmap with which to gather evidence to improve the quality of open-ended assessments. Based on statistical…
Descriptors: Student Evaluation, Evaluation Methods, Business Education, Higher Education
Pinar Mihci Türker; Ömer Kirmaci; Emrah Kayabasi; Erinç Karatas; Ebru Kiliç Çakmak; Serçin Karatas – Journal of Educational Technology and Online Learning, 2024
The COVID-19 epidemic has precipitated a rapid and widespread adoption of online education, leading to its normalization in contemporary society. Online education is evident across several educational levels. However, assessing the efficacy and effectiveness of these training programs can only be achieved by implementing a suitable evaluation…
Descriptors: Online Courses, Distance Education, Evaluation Methods, Test Construction
Delia Leuenberger; Elisabeth Moser Opitz; Noemi Gloor – Journal of Numerical Cognition, 2024
Computation competence (CC) in simple addition and subtraction using non-counting (NC) strategies is an important learning objective in Grade 1 mathematics but many children, especially low achievers in mathematics, struggle to acquire these skills. To provide these students with the support they need, it is important to have valid and reliable…
Descriptors: Computation, Mathematics Skills, Addition, Subtraction
Ole J. Kemi – Advances in Physiology Education, 2025
Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…
Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards
Pereira, Diana; Cadime, Irene; Brown, Gavin; Flores, Maria Assunção – European Journal of Higher Education, 2022
Drawing upon a wider piece of research, this paper focuses on the validation of a 'use of assessment' scale in five Portuguese public universities with 5549 students. The study aims to investigate the psychometric properties of the scale, to describe how students look at assessment uses, to analyse their utility perceptions of assessment, and to…
Descriptors: Undergraduate Students, Student Attitudes, Evaluation Methods, Foreign Countries
Regional Educational Laboratory Mid-Atlantic, 2024
These are the appendixes for the report, "Stabilizing School Performance Indicators in New Jersey to Reduce the Effect of Random Error." This study applied a stabilization model called Bayesian hierarchical modeling to group-level data (with groups assigned according to demographic designations) within schools in New Jersey with the aim…
Descriptors: Institutional Evaluation, Elementary Secondary Education, Bayesian Statistics, Test Reliability
Flor de Lis González-Mujico – Education and Information Technologies, 2024
Over the past decade, self-assessment tools have garnered significant attention in the interest of measuring the skillset required by educators and students to function productively and ethically in digitally mediated environments, particularly in relation to education policy implementation. Since stated beliefs do not always align with actual…
Descriptors: Technological Literacy, Evaluation Methods, Test Validity, Test Construction

