Publication Date
| In 2026 | 0 |
| Since 2025 | 59 |
| Since 2022 (last 5 years) | 385 |
| Since 2017 (last 10 years) | 828 |
| Since 2007 (last 20 years) | 1342 |
Audience
| Practitioners | 195 |
| Teachers | 161 |
| Researchers | 93 |
| Administrators | 50 |
| Students | 34 |
| Policymakers | 15 |
| Parents | 12 |
| Counselors | 2 |
| Community | 1 |
| Media Staff | 1 |
| Support Staff | 1 |
Location
| Canada | 62 |
| Turkey | 59 |
| Germany | 40 |
| Australia | 36 |
| United Kingdom | 36 |
| Japan | 35 |
| China | 33 |
| United States | 32 |
| California | 25 |
| Iran | 25 |
| United Kingdom (England) | 25 |
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2016
This article examines the possible dependency of composite reliability on presentation format of the elements of a multi-item measuring instrument. Using empirical data and a recent method for interval estimation of group differences in reliability, we demonstrate that the reliability of an instrument need not be the same when polarity of the…
Descriptors: Test Reliability, Test Format, Test Items, Differences
Keller, Lisa A.; Keller, Robert; Cook, Robert J.; Colvin, Kimberly F. – Applied Measurement in Education, 2016
The equating of tests is an essential process in high-stakes, large-scale testing conducted over multiple forms or administrations. By adjusting for differences in difficulty and placing scores from different administrations of a test on a common scale, equating allows scores from these different forms and administrations to be directly compared…
Descriptors: Item Response Theory, Equated Scores, Test Format, Testing Programs
Wyse, Adam E.; Babcock, Ben – Journal of Educational Measurement, 2016
A common suggestion made in the psychometric literature for fixed-length classification tests is that one should design tests so that they have maximum information at the cut score. Designing tests in this way is believed to maximize the classification accuracy and consistency of the assessment. This article uses simulated examples to illustrate…
Descriptors: Cutting Scores, Psychometrics, Test Construction, Classification
Wang, Ling – Journal of Educational Multimedia and Hypermedia, 2021
Running records is an important reading assessment for diagnosing early readers' needs in diverse instructional settings across grade levels. This study develops an innovative app to help teachers administer running records assessment and investigates teachers' perceptions of its functionality and usability in practical classrooms. The app offers…
Descriptors: Miscue Analysis, Reading Comprehension, Reading Tests, Computer Software
Lewis, Kendra M.; Ewers, Timothy; Miller, JoLynn C.; Bird, Marianne; Borba, John; Hill, Russell D.; Rea-Keywood, Jeannette; Shelstad, Nancy; Trzesniewski, Kali – Journal of Extension, 2018
Research on retention in the 4-H youth development program has consistently shown that one of the primary indicators for youths' dropping out of 4-H is being a first-year member. Extension 4-H professionals from California, Idaho, Wyoming, and New Jersey formed a team to study this issue. Our team surveyed first-year members and their…
Descriptors: Youth Programs, Academic Persistence, School Holding Power, Dropout Research
Martin-Raugh, Michelle P.; Anguiano-Carrasco, Cristina; Jackson, Teresa; Brenneman, Meghan W.; Carney, Lauren; Barnwell, Patrick; Kochert, Jonathan – International Journal of Testing, 2018
Single-response situational judgment tests (SRSJTs) differ from multiple-response SJTs (MRSJTs) in that they present test takers with edited critical incidents and simply ask test takers to read over the action described and evaluate it according to its effectiveness. Research comparing the reliability and validity of SRSJTs and MRSJTs is thus far…
Descriptors: Test Format, Test Reliability, Test Validity, Predictive Validity
Liu, Yuming; Robin, Frédéric; Yoo, Hanwook; Manna, Venessa – ETS Research Report Series, 2018
The "GRE"® Psychology test is an achievement test that measures core knowledge in 12 content domains that represent the courses commonly offered at the undergraduate level. Currently, a total score and 2 subscores, experimental and social, are reported to test takers as well as graduate institutions. However, the American Psychological…
Descriptors: College Entrance Examinations, Graduate Study, Psychological Testing, Scores
Sinharay, Sandip – Grantee Submission, 2018
Tatsuoka (1984) suggested several extended caution indices and their standardized versions that have been used as person-fit statistics by researchers such as Drasgow, Levine, and McLaughlin (1987), Glas and Meijer (2003), and Molenaar and Hoijtink (1990). However, these indices are only defined for tests with dichotomous items. This paper extends…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Error Patterns
Neiro, Jakke; Johansson, Niko – LUMAT: International Journal on Math, Science and Technology Education, 2020
The history and evolution of science assessment remains poorly known, especially in the context of the exam question contents. Here we analyze the Finnish matriculation examination in biology from the 1920s to 1960s to understand how the exam has evolved in both its knowledge content and educational form. Each question was classified according to…
Descriptors: Foreign Countries, Biology, Test Content, Test Format
Shar, Kelli; Russ, Rosemary S.; Laverty, James T. – Physical Review Physics Education Research, 2020
Assessments are usually thought of as ways for instructors to get information from students. In this work, we flip this perspective and explore how assessments communicate information to students. Specifically, we consider how assessments may provide information about what faculty and/or researchers think it means to know and do physics, i.e.,…
Descriptors: Epistemology, Science Instruction, Physics, Science Tests
Impact of Background Noise Fluctuation and Reverberation on Response Time in a Speech Reception Task
Prodi, Nicola; Visentin, Chiara – Journal of Speech, Language, and Hearing Research, 2019
Purpose: This study examines the effects of reverberation and noise fluctuation on the response time (RT) to the auditory stimuli in a speech reception task. Method: The speech reception task was presented to 76 young adults with normal hearing in 3 simulated listening conditions (1 anechoic, 2 reverberant). Speechlike stationary and fluctuating…
Descriptors: Acoustics, Reaction Time, Auditory Stimuli, Speech Communication
Steedle, Jeffrey T.; Morrison, Kristin M. – Educational Assessment, 2019
Assessment items are commonly field tested prior to operational use to observe statistical item properties such as difficulty. Item parameter estimates from field testing may be used to assign scores via pre-equating or computer adaptive designs. This study examined differences between item difficulty estimates based on field test and operational…
Descriptors: Field Tests, Test Items, Statistics, Difficulty Level
Smail, Layes; Sana, Tibi; Yamina, Bouakkaz; Rebai, Mohamed – Reading & Writing Quarterly, 2022
This study examined whether the phonological awareness (PA) deficit in Arabic-speaking dyslexic children could be impacted by the presence vs. absence of verbal working memory (WM) as a function of the sensory modality of administration (auditory vs. visual) of the phonological tests. Three phonological awareness (PA) tasks, i.e., phoneme…
Descriptors: Phonological Awareness, Dyslexia, Short Term Memory, Verbal Ability
Masrai, Ahmed – SAGE Open, 2022
Vocabulary size measures serve important functions, not only with respect to placing learners at appropriate levels on language courses but also with a view to examining the progress of learners. One of the widely reported formats suitable for these purposes is the Yes/No vocabulary test. The primary aim of this study was to introduce and provide…
Descriptors: Vocabulary Development, Language Tests, English (Second Language), Second Language Learning

