Publication Date
| In 2026 | 0 |
| Since 2025 | 9 |
| Since 2022 (last 5 years) | 112 |
| Since 2017 (last 10 years) | 216 |
| Since 2007 (last 20 years) | 377 |
Descriptor
| Comparative Analysis | 598 |
| Item Analysis | 598 |
| Test Items | 230 |
| Foreign Countries | 182 |
| Scores | 103 |
| Item Response Theory | 98 |
| Statistical Analysis | 97 |
| Correlation | 93 |
| Test Construction | 86 |
| Factor Analysis | 83 |
| Difficulty Level | 80 |
Author
| Hambleton, Ronald K. | 5 |
| Weiss, David J. | 4 |
| Bashaw, W. L. | 3 |
| Benson, Jeri | 3 |
| Blanton, Maria | 3 |
| Facon, Bruno | 3 |
| Gongjun Xu | 3 |
| Haladyna, Tom | 3 |
| Knuth, Eric | 3 |
| Lord, Frederic M. | 3 |
| Reckase, Mark D. | 3 |
Audience
| Researchers | 15 |
| Practitioners | 4 |
| Teachers | 4 |
| Students | 2 |
| Policymakers | 1 |
Location
| Australia | 13 |
| China | 13 |
| Germany | 13 |
| Turkey | 13 |
| Canada | 8 |
| United Kingdom | 8 |
| United Kingdom (England) | 8 |
| United States | 8 |
| Indonesia | 7 |
| Iran | 7 |
| Japan | 7 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 3 |
| Individuals with Disabilities… | 1 |
David Pierre – ProQuest LLC, 2022
The purpose of this quantitative, non-experimental study was to assess the effects of adult learning principles (ALP) in sermons on parishioners' spiritual growth. Sermons have been central to the spread of Christianity. Given the significant role sermons play in parishioners' spiritual formation and growth, examining sermons' effectiveness is…
Descriptors: Adult Learning, Educational Principles, Christianity, Churches
Soland, James; Kuhfeld, Megan; Register, Brennan – Educational Assessment, 2023
Much of what we know about how children develop is based on survey data. In order to estimate growth across time and, thereby, better understand that development, short survey scales are typically administered at repeated timepoints. Before estimating growth, those repeated measures must be put onto the same scale. Yet, little research examines…
Descriptors: Comparative Analysis, Social Emotional Learning, Scaling, Effect Size
David Bell; Vikki O'Neill; Vivienne Crawford – Practitioner Research in Higher Education, 2023
We compared the influence of open-book extended duration versus closed book time-limited format on reliability and validity of written assessments of pharmacology learning outcomes within our medical and dental courses. Our dental cohort undertake a mid-year test (30 × free-response short answer to a question, SAQ) and end-of-year paper (4 × SAQ,…
Descriptors: Undergraduate Students, Pharmacology, Pharmaceutical Education, Test Format
Jurij Selan; Mira Metljak – Center for Educational Policy Studies Journal, 2023
Since research integrity is not external to research but an integral part of it, it should be integrated into research training. However, several hindrances regarding contemporary research integrity education exist. To address them, we have developed a competency profile for teaching and learning research integrity based on four assumptions: 1) to…
Descriptors: Profiles, Integrity, Content Validity, Questionnaires
Babcock, Ben; Siegel, Zachary D. – Practical Assessment, Research & Evaluation, 2022
Research about repeated testing has revealed that retaking the same exam form generally does not advantage or disadvantage failing candidates in selected response-style credentialing exams. Feinberg, Raymond, and Haist (2015) found a contributing factor to this phenomenon: people answering items incorrectly on both attempts give the same incorrect…
Descriptors: Multiple Choice Tests, Item Analysis, Test Items, Response Style (Tests)
Phusee-orn, Songsak; Pongteerawut, Sasipat – Journal of Educational Issues, 2022
The research aimed at studying and comparing the futuristic thinking of Grade 9 students studying in schools of different sizes. The samples of the research were Grade 9 students of semester 2 in academic year 2020 in Sisaket Province, Thailand. The multi-stage random sampling technique was employed for the selection of 860 students from 12…
Descriptors: School Size, Futures (of Society), Correlation, Likert Scales
Gorney, Kylie; Wollack, James A. – Practical Assessment, Research & Evaluation, 2022
Unlike the traditional multiple-choice (MC) format, the discrete-option multiple-choice (DOMC) format does not necessarily reveal all answer options to an examinee. The purpose of this study was to determine whether the reduced exposure of item content affects test security. We conducted an experiment in which participants were allowed to view…
Descriptors: Test Items, Test Format, Multiple Choice Tests, Item Analysis
Dimitrov, Dimiter M.; Atanasov, Dimitar V.; Luo, Yong – Measurement: Interdisciplinary Research and Perspectives, 2020
This study examines and compares four person-fit statistics (PFSs) in the framework of the "D"-scoring method (DSM): (a) van der Flier's "U3" statistic; (b) "Ud" statistic, as a modification of "U3" under the DSM; (c) "Zd" statistic, as a modification of the "Z3 (l[subscript z])"…
Descriptors: Goodness of Fit, Item Analysis, Item Response Theory, Scoring
John B. Buncher; Jayson M. Nissen; Ben Van Dusen; Robert M. Talbot – Physical Review Physics Education Research, 2025
Research-based assessments (RBAs) allow researchers and practitioners to compare student performance across different contexts and institutions. In recent years, research attention has focused on the student populations these RBAs were initially developed with because much of that research was done with "samples of convenience" that were…
Descriptors: Science Tests, Physics, Comparative Analysis, Gender Differences
A Comparison of Procedures for Estimating Person Reliability Parameters in the Graded Response Model
LaHuis, David M.; Bryant-Lees, Kinsey B.; Hakoyama, Shotaro; Barnes, Tyler; Wiemann, Andrea – Journal of Educational Measurement, 2018
Person reliability parameters (PRPs) model temporary changes in individuals' attribute level perceptions when responding to self-report items (higher levels of PRPs represent less fluctuation). PRPs could be useful in measuring careless responding and traitedness. However, it is unclear how well current procedures for estimating PRPs can recover…
Descriptors: Comparative Analysis, Reliability, Error of Measurement, Measurement Techniques
Zhang, Zhonghua; Zhao, Mingren – Journal of Educational Measurement, 2019
The present study evaluated the multiple imputation method, a procedure that is similar to the one suggested by Li and Lissitz (2004), and compared the performance of this method with that of the bootstrap method and the delta method in obtaining the standard errors for the estimates of the parameter scale transformation coefficients in item…
Descriptors: Item Response Theory, Error Patterns, Item Analysis, Simulation
Lambert, Richard G. – Center for Educational Measurement and Evaluation, 2023
This study sought to investigate whether there were performance differences between the children who engaged with the Ignite by Hatch™ educational gaming system using the English- or Spanish-language versions of the games. Differential item functioning methods (DIF) were employed to investigate these differences. Specifically, DIF analyses can…
Descriptors: Comparative Analysis, Educational Games, Spanish, English
Deribo, Tobias; Goldhammer, Frank; Kroehne, Ulf – Educational and Psychological Measurement, 2023
As researchers in the social sciences, we are often interested in studying not directly observable constructs through assessments and questionnaires. But even in a well-designed and well-implemented study, rapid-guessing behavior may occur. Under rapid-guessing behavior, a task is skimmed shortly but not read and engaged with in-depth. Hence, a…
Descriptors: Reaction Time, Guessing (Tests), Behavior Patterns, Bias
Yildirim, Osman Gazi; Ozdener, Nesrin – International Journal of Computer Science Education in Schools, 2022
The main goal of the current study is to develop a reliable instrument to measure programming anxiety in university students. A pool of 33 items based on extensive literature review and experts' opinions were created by researchers. The draft scale comprised three factors applied to 392 university students from two different universities in Turkey…
Descriptors: Anxiety, Undergraduate Students, Student Attitudes, Factor Analysis
Akhtar, Hanif – International Association for Development of the Information Society, 2022
When examinees perceive a test as low stakes, it is logical to assume that some of them will not put out their maximum effort. This condition makes the validity of the test results more complicated. Although many studies have investigated motivational fluctuation across tests during a testing session, only a small number of studies have…
Descriptors: Intelligence Tests, Student Motivation, Test Validity, Student Attitudes

