Publication Date

| Date Range | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 190 |
| Since 2022 (last 5 years) | 1057 |
| Since 2017 (last 10 years) | 2567 |
| Since 2007 (last 20 years) | 4928 |
Audience

| Audience | Records |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location

| Location | Records |
| --- | --- |
| Turkey | 225 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 65 |
What Works Clearinghouse Rating

| Rating | Records |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does Not Meet Standards | 1 |

Brabec, Jordan Andrew; Pan, Steven C.; Bjork, Elizabeth Ligon; Bjork, Robert A. – Educational Psychology Review, 2021
Although widely used, the true-false test is often regarded as a superficial or even harmful test, one that lacks the pedagogical efficacy of more substantive tests (e.g., cued-recall or short-answer tests). Such charges, however, lack conclusive evidence and may, in some cases, be false. Across four experiments, we investigated how true-false…
Descriptors: Objective Tests, Accuracy, Cues, Recall (Psychology)
Deng, Jacky M.; Streja, Nicholas; Flynn, Alison B. – Journal of Chemical Education, 2021
Response process validity evidence can provide researchers with insight into how and why participants interpret items on instruments such as tests and questionnaires. In chemistry education research literature and the social sciences more broadly, response process validity evidence has been used and reported in a variety of ways. This paper's…
Descriptors: Chemistry, Science Education, Educational Research, Validity
Lanrong Li – ProQuest LLC, 2021
When developing a test, it is essential to ensure that the test is free of items with differential item functioning (DIF). DIF occurs when examinees of equal ability, but from different examinee subgroups, have different chances of getting the item correct. According to the multidimensional perspective, DIF occurs because the test measures more…
Descriptors: Test Bias, Test Items, Meta Analysis, Effect Size
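The definition of DIF above lends itself to a small illustration. The sketch below is not the dissertation's meta-analytic method; it is a generic Mantel-Haenszel DIF screen built directly on that definition (match examinees on total score, then compare the two subgroups' odds of answering the item correctly within each score stratum), with fabricated data throughout.

```python
# Generic Mantel-Haenszel DIF screen (an illustration, not the study's method):
# within each total-score stratum, compare reference- and focal-group odds of
# answering the studied item correctly. A common odds ratio near 1 suggests no DIF.
import numpy as np

def mh_odds_ratio(item, group, total):
    """item: 0/1 scores on the studied item; group: 0 = reference, 1 = focal;
    total: matching variable (total test score)."""
    num = den = 0.0
    for s in np.unique(total):
        idx = total == s
        ref, foc = item[idx & (group == 0)], item[idx & (group == 1)]
        if len(ref) == 0 or len(foc) == 0:
            continue
        n = idx.sum()
        a, b = ref.sum(), len(ref) - ref.sum()   # reference right / wrong
        c, d = foc.sum(), len(foc) - foc.sum()   # focal right / wrong
        num += a * d / n
        den += b * c / n
    return num / den

rng = np.random.default_rng(0)                    # fabricated example data
responses = rng.integers(0, 2, size=(500, 20))    # 500 examinees, 20 dichotomous items
group = rng.integers(0, 2, size=500)
total = responses.sum(axis=1)
print(mh_odds_ratio(responses[:, 0], group, total))  # ~1.0: no simulated DIF
```
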
Lanrong Li; Betsy Jane Becker – Journal of Educational Measurement, 2021
Differential bundle functioning (DBF) has been proposed to quantify the accumulated amount of differential item functioning (DIF) in an item cluster/bundle (Douglas, Roussos, and Stout). The simultaneous item bias test (SIBTEST, Shealy and Stout) has been used to test for DBF (e.g., Walker, Zhang, and Surber). Research on DBF may have the…
Descriptors: Test Bias, Test Items, Meta Analysis, Effect Size
Lang, Joseph B. – Journal of Educational and Behavioral Statistics, 2023
This article is concerned with the statistical detection of copying on multiple-choice exams. As an alternative to existing permutation- and model-based copy-detection approaches, a simple randomization p-value (RP) test is proposed. The RP test, which is based on an intuitive match-score statistic, makes no assumptions about the distribution of…
Descriptors: Identification, Cheating, Multiple Choice Tests, Item Response Theory
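The RP test's mechanics are truncated in the abstract, so the following is only a hedged sketch of the general idea of a randomization p-value on a match-score statistic; the agreement count, the random re-pairing scheme, and all data are illustrative assumptions, not Lang's actual procedure.

```python
# Sketch of a randomization test on a match-score statistic: how surprising is
# the observed answer agreement between a suspected copier and source, compared
# with the copier's agreement with randomly drawn other examinees?
import numpy as np

def match_score(a, b):
    """Number of items on which two answer vectors agree."""
    return int(np.sum(np.asarray(a) == np.asarray(b)))

def randomization_p(copier, source, others, n_draws=10_000, seed=0):
    """Randomization p-value: P(match >= observed) under random re-pairing."""
    rng = np.random.default_rng(seed)
    observed = match_score(copier, source)
    rows = rng.integers(0, len(others), size=n_draws)
    null = np.array([match_score(copier, others[i]) for i in rows])
    return (1 + np.sum(null >= observed)) / (1 + n_draws)

rng = np.random.default_rng(1)                 # fabricated 40-item exam, choices 0-3
others = rng.integers(0, 4, size=(200, 40))
source = rng.integers(0, 4, size=40)
copier = source.copy()
copier[:5] = rng.integers(0, 4, size=5)        # copier deviates on only 5 items
print(randomization_p(copier, source, others)) # tiny p-value flags the pair
```
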
Demirkaya, Onur; Bezirhan, Ummugul; Zhang, Jinming – Journal of Educational and Behavioral Statistics, 2023
Examinees with item preknowledge tend to obtain inflated test scores that undermine test score validity. With the availability of process data collected in computer-based assessments, the research on detecting item preknowledge has progressed on using both item scores and response times. Item revisit patterns of examinees can also be utilized as…
Descriptors: Test Items, Prior Learning, Knowledge Level, Reaction Time
Constantinou, Filio – Research Papers in Education, 2023
Examination questions need to be sufficiently novel if they are to be effective as measurement instruments. Novelty, however, presupposes creativity, suggesting that question writing is, or should be, a creative process. To explore the boundaries of creativity in question writing, this study made use of two data sources: two corpora of examination…
Descriptors: Test Items, Creativity, Writing (Composition), Test Construction
Raykov, Tenko – Measurement: Interdisciplinary Research and Perspectives, 2023
This software review discusses the capabilities of Stata to conduct item response theory modeling. The commands needed for fitting the popular one-, two-, and three-parameter logistic models are initially discussed. The procedure for testing the discrimination parameter equality in the one-parameter model is then outlined. The commands for fitting…
Descriptors: Item Response Theory, Models, Comparative Analysis, Item Analysis
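For readers who have not met the models the review names, the one-, two-, and three-parameter logistic item response functions can be written out in a few lines. This is a plain-Python sketch of the models themselves, not of Stata's syntax:

```python
# Item response functions for the 1PL, 2PL, and 3PL logistic IRT models:
# probability that an examinee of ability theta answers an item correctly.
import math

def p_1pl(theta, b):
    """One-parameter logistic: difficulty b; discrimination fixed (here at 1)."""
    return 1 / (1 + math.exp(-(theta - b)))

def p_2pl(theta, a, b):
    """Two-parameter logistic: adds an item-specific discrimination a."""
    return 1 / (1 + math.exp(-a * (theta - b)))

def p_3pl(theta, a, b, c):
    """Three-parameter logistic: adds a lower asymptote c (pseudo-guessing)."""
    return c + (1 - c) * p_2pl(theta, a, b)

# An average-ability examinee on a hard, discriminating item with some guessing:
print(round(p_3pl(theta=0.0, a=1.5, b=0.8, c=0.2), 3))
```
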
Man, Kaiwen; Harring, Jeffrey R. – Educational and Psychological Measurement, 2023
Preknowledge cheating jeopardizes the validity of inferences based on test results. Many methods have been developed to detect preknowledge cheating by jointly analyzing item responses and response times. Gaze fixations, an essential eye-tracker measure, can be utilized to help detect aberrant testing behavior with improved accuracy beyond using…
Descriptors: Cheating, Reaction Time, Test Items, Responses
Guo, Wenjing; Choi, Youn-Jeng – Educational and Psychological Measurement, 2023
Determining the number of dimensions is extremely important in applying item response theory (IRT) models to data. Traditional and revised parallel analyses have been proposed within the factor analysis framework, and both have shown some promise in assessing dimensionality. However, their performance in the IRT framework has not been…
Descriptors: Item Response Theory, Evaluation Methods, Factor Analysis, Guidelines
Hsiao, Kuo-Lun; Ku, Ya-Yuan; Lee, Ya-Ting – Education and Information Technologies, 2023
New media literacy is an expected competency for university students. However, few literacy scales can evaluate students' fake news reporting and checking abilities. In the past, the new media literacy framework only included Critical Consuming, Critical Prosumption, Functional Prosumption, and Functional Consuming. Therefore, this study proposes…
Descriptors: Test Construction, Media Literacy, Test Validity, Test Items
Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023
A Monte Carlo simulation study was conducted to examine the performance of the α, λ2, λ4, ω_T, GLB_MRFA, and GLB_Algebraic coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…
Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation
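Of the coefficients compared, coefficient α is the only one with a simple closed form; for orientation, here is a minimal computation from a persons-by-items score matrix (the simulated data is purely illustrative, and the remaining coefficients require factor-analytic machinery):

```python
# Coefficient alpha from a persons-by-items score matrix:
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
import numpy as np

def cronbach_alpha(scores):
    k = scores.shape[1]
    item_var_sum = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

rng = np.random.default_rng(0)
ability = rng.normal(size=(300, 1))              # shared true score
scores = ability + rng.normal(size=(300, 8))     # 8 noisy items
print(round(cronbach_alpha(scores), 3))
```
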
Braun, Thorsten; Stierle, Rolf; Fischer, Matthias; Gross, Joachim – Chemical Engineering Education, 2023
Contributing to a competency model for engineering thermodynamics, we investigate the empirical competency structure of our exams in an attempt to answer the question: Do we test the competencies we want to convey to our students? We demonstrate that thermodynamic modeling and mathematical solution emerge as significant dimensions of thermodynamic…
Descriptors: Thermodynamics, Consciousness Raising, Engineering Education, Test Format
Stenger, Rachel; Olson, Kristen; Smyth, Jolene D. – Field Methods, 2023
Questionnaire designers use readability measures to ensure that questions can be understood by the target population. The most common measure is the Flesch-Kincaid Grade level, but other formulas exist. This article compares six different readability measures across 150 questions in a self-administered questionnaire, finding notable variation in…
Descriptors: Readability, Readability Formulas, Computer Assisted Testing, Evaluation Methods
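The Flesch-Kincaid Grade Level named above has a standard published formula; the sketch below implements it with a crude vowel-group syllable heuristic (production readability tools use pronunciation dictionaries or better heuristics):

```python
# Flesch-Kincaid Grade Level:
# FKGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
import re

def count_syllables(word):
    """Rough heuristic: count vowel groups, dropping most silent final e's."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def fk_grade(text):
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59

question = "How often do you visit a doctor? Please answer in visits per year."
print(round(fk_grade(question), 1))
```
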
Vitello, Sylvia; Crisp, Victoria; Ireland, Jo – Research Matters, 2023
Assessment materials must be checked for errors before they are presented to candidates. Any errors have the potential to reduce validity. For example, in the most extreme cases, an error may turn an otherwise well-designed exam question into one that is impossible to answer. At Cambridge University Press & Assessment, assessment materials are…
Descriptors: Check Lists, Test Validity, Error Correction, Test Construction

