Publication Date

| Period | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
Audience

| Audience | Records |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location

| Location | Records |
| --- | --- |
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating

| Rating | Records |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Kane, Michael – Measurement: Interdisciplinary Research and Perspectives, 2015
Michael Kane writes in this article that he is in more or less complete agreement with Professor Koretz's characterization of the problem outlined in the paper published in this issue of "Measurement." Kane agrees that current testing practices are not adequate for test-based accountability (TBA) systems, but he writes that he is far…
Descriptors: Educational Testing, Accountability, Standardized Tests, Equated Scores
DeMars, Christine E.; Jurich, Daniel P. – Educational and Psychological Measurement, 2015
In educational testing, differential item functioning (DIF) statistics must be accurately estimated to ensure the appropriate items are flagged for inspection or removal. This study showed how using the Rasch model to estimate DIF may introduce considerable bias in the results when there are large group differences in ability (impact) and the data…
Descriptors: Test Bias, Guessing (Tests), Ability, Differences
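The DeMars and Jurich study evaluates Rasch-based DIF estimation; as a point of reference (not the authors' method), a minimal sketch of the classic Mantel-Haenszel DIF statistic, which flags items by pooling 2x2 correct/incorrect tables across ability strata. The stratum counts below are made up for illustration.

```python
import math

# Hypothetical 2x2 tables per ability stratum k:
# (A_k, B_k) = reference group (correct, incorrect)
# (C_k, D_k) = focal group (correct, incorrect)
strata = [
    (40, 10, 30, 20),   # low-ability stratum
    (60, 15, 45, 30),   # middle stratum
    (80, 5, 70, 15),    # high-ability stratum
]

def mantel_haenszel_dif(strata):
    """Common odds ratio and ETS delta-scale DIF index for one item."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    alpha_mh = num / den                    # MH common odds ratio
    mh_d_dif = -2.35 * math.log(alpha_mh)   # ETS delta metric
    return alpha_mh, mh_d_dif

alpha, delta = mantel_haenszel_dif(strata)
print(f"alpha_MH = {alpha:.3f}, MH D-DIF = {delta:+.3f}")
```

In the ETS convention, items with |MH D-DIF| above roughly 1.5 are flagged for review; unlike Rasch-based DIF estimates, this statistic does not require fitting an IRT model.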
Musekamp, Frank; Pearce, Jacob – Studies in Higher Education, 2015
Low-stakes assessment is supposed to improve educational practice by providing feedback to different actors in educational systems. However, the process of assessment from design to the point of a final impact on student learning outcomes is complex and diverse. It is hard to identify reasons for substandard achievement on assessments, let alone…
Descriptors: Foreign Countries, Educational Assessment, Educational Improvement, Engineering
Anderson, Daniel; Irvin, Shawn; Alonzo, Julie; Tindal, Gerald A. – Educational Measurement: Issues and Practice, 2015
The alignment of test items to content standards is critical to the validity of decisions made from standards-based tests. Generally, alignment is determined based on judgments made by a panel of content experts with either ratings averaged or via a consensus reached through discussion. When the pool of items to be reviewed is large, or the…
Descriptors: Test Items, Alignment (Education), Standards, Online Systems
Torres Irribarra, David; Diakow, Ronli; Freund, Rebecca; Wilson, Mark – Grantee Submission, 2015
This paper presents the Latent Class Level-PCM as a method for identifying and interpreting latent classes of respondents according to empirically estimated performance levels. The model, which combines elements from latent class models and reparameterized partial credit models for polytomous data, can simultaneously (a) identify empirical…
Descriptors: Item Response Theory, Test Items, Statistical Analysis, Models
Yarbrough, Nükhet D. – Creativity Research Journal, 2016
As part of a project to translate and administer the Torrance Tests of Creative Thinking (TTCT) to Turkish elementary and secondary students, 35 professionals were trained in a full-day workshop to learn to score the verbal TTCT. All trainees scored the same 4 sets of TTCT verbal criterion tests for fluency, flexibility, and originality by filling…
Descriptors: Creative Thinking, Item Analysis, Scores, Test Items
Gusev, Marjan; Ristov, Sasko; Armenski, Goce – International Journal of Distance Education Technologies, 2016
Recent technology trends have moved student assessment from traditional formats ("pen-and-paper" and "face-to-face") to modern e-Assessment systems. These modern approaches allow teachers to conduct and evaluate an exam with a huge number of students in a short period of time. Even more important, both the teacher and the…
Descriptors: Educational Technology, Technology Uses in Education, Computer Assisted Testing, Evaluation Methods
Mulligan, Neil W.; Smith, S. Adam; Spataro, Pietro – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2016
Stimuli co-occurring with targets in a detection task are better remembered than stimuli co-occurring with distractors--the attentional boost effect (ABE). The ABE is of interest because it is an exception to the usual finding that divided attention during encoding impairs memory. The effect has been demonstrated in tests of item memory but it is…
Descriptors: Memory, Attention, Recognition (Psychology), Priming
Foley, Brett P. – Practical Assessment, Research & Evaluation, 2016
There is always a chance that examinees will answer multiple choice (MC) items correctly by guessing. Design choices in some modern exams have created situations where guessing at random through the full exam--rather than only for a subset of items where the examinee does not know the answer--can be an effective strategy to pass the exam. This…
Descriptors: Guessing (Tests), Multiple Choice Tests, Case Studies, Test Construction
Park, Mihwa; Johnson, Joseph A. – International Journal of Environmental and Science Education, 2016
While significant research has been conducted on students' conceptions of energy, alternative conceptions of energy have not been actively explored in the area of environmental science. The purpose of this study is to examine students' alternative conceptions in the environmental science discipline through the analysis of responses of first year…
Descriptors: Environmental Education, Multiple Choice Tests, Test Items, Energy
Huda, Nizlel; Subanji; Nusantar, Toto; Susiswo; Sutawidjaja, Akbar; Rahardjo, Swasono – Educational Research and Reviews, 2016
This study aimed to determine students' metacognitive failure in the Mathematics Education Program of FKIP at Jambi University, investigated within an assimilation-and-accommodation mathematical framework. Of the 35 students, five did not answer the question, three completed the questions correctly, and 27 tried to solve…
Descriptors: Metacognition, Mathematics Education, Problem Solving, Qualitative Research
Ryan, Ève; Brunfaut, Tineke – Language Assessment Quarterly, 2016
It is not unusual for tests in less-commonly taught languages (LCTLs) to be developed by an experienced item writer with no proficiency in the language being tested, in collaboration with a language informant who is a speaker of the target language, but lacks language assessment expertise. How this approach to item writing works in practice, and…
Descriptors: Language Tests, Uncommonly Taught Languages, Test Construction, Test Items
Suh, Youngsuk – Journal of Educational Measurement, 2016
This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…
Descriptors: Effect Size, Goodness of Fit, Statistical Analysis, Statistical Significance
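Suh's multidimensional formulation is in the article itself; as a unidimensional sketch of the underlying idea, the signed weighted P-difference accumulates the (weighted) gap between reference- and focal-group response probabilities, while the unsigned version accumulates its absolute value so that crossing DIF cannot cancel out. The 2PL item parameters and quadrature weights below are illustrative assumptions.

```python
import math

def p_2pl(theta: float, a: float, b: float) -> float:
    """2PL item response probability."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def weighted_p_difference(ref_params, focal_params, thetas, weights):
    """Signed and unsigned weighted P-difference effect sizes
    computed on a quadrature grid (unidimensional sketch)."""
    signed = unsigned = 0.0
    for theta, w in zip(thetas, weights):
        diff = p_2pl(theta, *ref_params) - p_2pl(theta, *focal_params)
        signed += w * diff
        unsigned += w * abs(diff)
    return signed, unsigned

# Hypothetical item: same discrimination, harder for the focal group.
thetas = [-2, -1, 0, 1, 2]
weights = [0.1, 0.2, 0.4, 0.2, 0.1]  # weights sum to 1
s, u = weighted_p_difference((1.0, 0.0), (1.0, 0.5), thetas, weights)
print(f"signed = {s:+.3f}, unsigned = {u:.3f}")
```

For uniform DIF (one group consistently disadvantaged) the two measures coincide; when the item characteristic curves cross, the unsigned measure exceeds the absolute signed measure, which is why both are reported.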
Volov, Vyacheslav T.; Gilev, Alexander A. – International Journal of Environmental and Science Education, 2016
In today's item response theory (IRT), the response to a test item is treated as a probabilistic event depending on the student's ability and the item's difficulty. It is noted that in the scientific literature there is very little agreement about how to determine the factors affecting item difficulty. It is suggested that the difficulty of the…
Descriptors: Item Response Theory, Test Items, Difficulty Level, Science Tests
Moothedath, Shana; Chaporkar, Prasanna; Belur, Madhu N. – Perspectives in Education, 2016
In recent years, the computerised adaptive test (CAT) has gained popularity over conventional exams in evaluating student capabilities with desired accuracy. However, the key limitation of CAT is that it requires a large pool of pre-calibrated questions. In the absence of such a pre-calibrated question bank, offline exams with uncalibrated…
Descriptors: Guessing (Tests), Computer Assisted Testing, Adaptive Testing, Maximum Likelihood Statistics
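The pre-calibrated item bank that Moothedath et al. identify as CAT's key requirement feeds a selection rule at each step. A common rule (a generic sketch, not necessarily the authors' procedure) picks the unadministered item with maximum Fisher information at the current ability estimate; under the Rasch model this is simply I(theta) = P(1 - P), which peaks when item difficulty matches ability. The item bank below is hypothetical.

```python
import math

def rasch_p(theta: float, b: float) -> float:
    """Rasch model probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta: float, b: float) -> float:
    """Fisher information of a Rasch item: I(theta) = P(1 - P)."""
    p = rasch_p(theta, b)
    return p * (1.0 - p)

def next_item(theta_hat: float, bank: dict) -> str:
    """Select the bank item with maximum information at the
    current ability estimate (maximum-information CAT rule)."""
    return max(bank, key=lambda item: item_information(theta_hat, bank[item]))

# Hypothetical pre-calibrated bank: item id -> difficulty b
bank = {"q1": -1.5, "q2": -0.3, "q3": 0.4, "q4": 1.8}
print(next_item(0.5, bank))
```

Because Rasch information is symmetric and decreasing in |theta - b|, the rule always chooses the item whose difficulty is closest to the ability estimate, which is exactly why an uncalibrated bank (the limitation the article addresses) breaks the adaptive loop.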

