Atehortua, Laura – ProQuest LLC, 2022
Intelligence tests are used in a variety of settings such as schools, clinics, and courts to assess the intellectual capacity of individuals of all ages. Intelligence tests are used to make high-stakes decisions such as special education placement, employment, eligibility for social security services, and determination of the death penalty.…
Descriptors: Adults, Intelligence Tests, Children, Error of Measurement
Arefsadr, Sajjad; Babaii, Esmat – TESL-EJ, 2023
According to the IELTS official website, IELTS candidates usually score lower in the IELTS Writing test than in the other language skills. This is disappointing for the many IELTS candidates who fail to get the overall band score they need. Surprisingly enough, few studies have addressed this issue. The present study, then, is aimed at shedding…
Descriptors: Second Language Learning, Language Tests, English (Second Language), Foreign Countries
LaFlair, Geoffrey T.; Langenfeld, Thomas; Baig, Basim; Horie, André Kenji; Attali, Yigal; von Davier, Alina A. – Journal of Computer Assisted Learning, 2022
Background: Digital-first assessments leverage the affordances of technology in all elements of the assessment process--from design and development to score reporting and evaluation--to create test taker-centric assessments. Objectives: The goal of this paper is to describe the engineering, machine learning, and psychometric processes and…
Descriptors: Computer Assisted Testing, Affordances, Scoring, Engineering
Cesur, Kursat – Educational Policy Analysis and Strategic Research, 2019
Examinees' performances are assessed using a wide variety of different techniques. Multiple-choice (MC) tests are among the most frequently used ones. Nearly all standardized achievement tests make use of MC test items, and there is a variety of ways to score these tests. The study compares number right and liberal scoring (SAC) methods. Mixed…
Descriptors: Multiple Choice Tests, Scoring, Evaluation Methods, Guessing (Tests)
Sarac, Merve; Loken, Eric – International Journal of Testing, 2023
This study is an exploratory analysis of examinee behavior in a large-scale language proficiency test. Despite a number-right scoring system with no penalty for guessing, we found that 16% of examinees omitted at least one answer and that women were more likely than men to omit answers. Item-response theory analyses treating the omitted responses…
Descriptors: English (Second Language), Language Proficiency, Language Tests, Second Language Learning
Xiao, Yang; Han, Jing; Koenig, Kathleen; Xiong, Jianwen; Bao, Lei – Physical Review Physics Education Research, 2018
Assessment instruments composed of two-tier multiple choice (TTMC) items are widely used in science education as an effective method to evaluate students' sophisticated understanding. In practice, however, there are often concerns regarding the common scoring methods of TTMC items, which include pair scoring and individual scoring schemes. The…
Descriptors: Hierarchical Linear Modeling, Item Response Theory, Multiple Choice Tests, Case Studies
Hoang, Ngoc Thi Huyen – Language Education & Assessment, 2019
As validity pertains to test use rather than the test itself, using a test for unintended purposes requires a new validation program using additional evidence from relevant sources. This small-scale study contributes to the validation of the use of originally academic language tests--the International English Language Testing System and the Test…
Descriptors: Language Tests, Immigrants, Immigration, Testing Problems
Clifford, Ray – Foreign Language Annals, 2016
This article summarizes some of the technical issues that add to the complexity of language testing. It focuses in particular on the criterion-referenced nature of the ACTFL Proficiency Guidelines-Speaking; and it proposes a criterion-referenced interpretation of the ACTFL guidelines for reading and listening. It then demonstrates how using…
Descriptors: Criterion Referenced Tests, Language Tests, Language Proficiency, Guidelines
Kahn, Josh; Nese, Joseph T.; Alonzo, Julie – Behavioral Research and Teaching, 2016
There is strong theoretical support for oral reading fluency (ORF) as an essential building block of reading proficiency. The current and standard ORF assessment procedure requires that students read aloud a grade-level passage (~250 words) in a one-to-one administration, with the number of words read correctly in 60 seconds constituting their…
Descriptors: Teacher Surveys, Oral Reading, Reading Tests, Computer Assisted Testing
Chen, Haiwen H.; von Davier, Matthias; Yamamoto, Kentaro; Kong, Nan – ETS Research Report Series, 2015
One major issue with large-scale assessments is that the respondents might give no responses to many items, resulting in less accurate estimations of both assessed abilities and item parameters. This report studies how the types of items affect the item-level nonresponse rates and how different methods of treating item-level nonresponses have an…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
Henning, Grant – English Teaching Forum, 2012
To some extent, good testing procedure, like good language use, can be achieved through avoidance of errors. Almost any language-instruction program requires the preparation and administration of tests, and it is only to the extent that certain common testing mistakes have been avoided that such tests can be said to be worthwhile selection,…
Descriptors: Testing, English (Second Language), Testing Problems, Student Evaluation
Sawchuk, Stephen – Education Week, 2010
Most experts in the testing community have presumed that the $350 million promised by the U.S. Department of Education to support common assessments would promote those that made greater use of open-ended items capable of measuring higher-order critical-thinking skills. But as measurement experts consider the multitude of possibilities for an…
Descriptors: Test Items, Federal Legislation, Scoring, Accountability
Reed, Deborah K.; Cummings, Kelli D.; Schaper, Andrew; Biancarosa, Gina – Review of Educational Research, 2014
Recent studies indicate that examiners make a number of intentional and unintentional errors when administering reading assessments to students. Because these errors introduce construct-irrelevant variance in scores, the fidelity of test administrations could influence the results of evaluation studies. To determine how assessment fidelity is…
Descriptors: Fidelity, Reading Tests, Student Evaluation, Reading Research
Mrazik, Martin; Janzen, Troy M.; Dombrowski, Stefan C.; Barford, Sean W.; Krawchuk, Lindsey L. – Canadian Journal of School Psychology, 2012
A total of 19 graduate students enrolled in a graduate course conducted 6 consecutive administrations of the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV, Canadian version). Test protocols were examined to obtain data describing the frequency of examiner errors, including administration and scoring errors. Results identified 511…
Descriptors: Intelligence Tests, Intelligence, Statistical Analysis, Scoring
Ventouras, Errikos; Triantis, Dimos; Tsiakas, Panagiotis; Stergiopoulos, Charalampos – Computers & Education, 2011
The aim of the present research was to compare the use of multiple-choice questions (MCQs) as an examination method against the oral examination (OE) method. MCQs are widely used and their importance seems likely to grow, due to their inherent suitability for electronic assessment. However, MCQs are influenced by the tendency of examinees to guess…
Descriptors: Grades (Scholastic), Scoring, Multiple Choice Tests, Test Format

