Publication Date

| Date Range | Results |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 62 |
| Since 2022 (last 5 years) | 388 |
| Since 2017 (last 10 years) | 831 |
| Since 2007 (last 20 years) | 1345 |
Audience

| Audience | Results |
| --- | --- |
| Practitioners | 195 |
| Teachers | 161 |
| Researchers | 93 |
| Administrators | 50 |
| Students | 34 |
| Policymakers | 15 |
| Parents | 12 |
| Counselors | 2 |
| Community | 1 |
| Media Staff | 1 |
| Support Staff | 1 |
Location

| Location | Results |
| --- | --- |
| Canada | 63 |
| Turkey | 59 |
| Germany | 41 |
| United Kingdom | 37 |
| Australia | 36 |
| Japan | 35 |
| China | 33 |
| United States | 32 |
| California | 25 |
| Iran | 25 |
| United Kingdom (England) | 25 |
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Journal of Educational Measurement, 2008
This study addressed the sampling error and linking bias that occur with small samples in a nonequivalent groups anchor test design. We proposed a linking method called the synthetic function, which is a weighted average of the identity function and a traditional equating function (in this case, the chained linear equating function). Specifically,…
Descriptors: Equated Scores, Sample Size, Test Reliability, Comparative Analysis
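The synthetic function described in the abstract above is, at its core, a weighted average of two link functions. A minimal sketch of that idea follows; the weight value, the slope and intercept, and the use of a generic linear equating function standing in for the full chained linear procedure are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of a "synthetic" linking function: a weighted average of
# the identity function and a traditional equating function. The weight w and
# the linear parameters below are hypothetical.

def linear_equating(x, slope, intercept):
    """A generic linear equating function: y = slope * x + intercept."""
    return slope * x + intercept

def synthetic_link(x, w, slope, intercept):
    """Weighted average of the identity function and a linear equating function.

    w = 1 reduces to the identity (no adjustment); w = 0 reduces to the
    traditional equating function.
    """
    return w * x + (1 - w) * linear_equating(x, slope, intercept)

# Hypothetical example: a small-sample link shrunk halfway toward the identity.
print(synthetic_link(25.0, w=0.5, slope=1.05, intercept=-1.2))
```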
Vassar, Matt – Social Indicators Research, 2008
The purpose of the present study was to meta-analytically investigate the score reliability for the Satisfaction With Life Scale. Four hundred and sixteen articles using the measure were located through electronic database searches and then separated to identify studies which had calculated reliability estimates from their own data. Sixty-two…
Descriptors: Test Format, Life Satisfaction, Reliability, Measures (Individuals)
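The study above aggregates reported score reliabilities across primary studies (a reliability generalization approach). The sketch below shows only the simplest possible aggregation, a sample-size-weighted mean of reported coefficients; the toy values and the weighting scheme are assumptions and do not reflect the analysis actually conducted.

```python
# Minimal sketch of aggregating score reliability (e.g., Cronbach's alpha)
# across studies, as in a reliability generalization meta-analysis.
# The data and the N-weighting below are hypothetical.

studies = [
    # (reported alpha, sample size) -- hypothetical values
    (0.87, 250),
    (0.82, 120),
    (0.90, 600),
]

total_n = sum(n for _, n in studies)
weighted_mean_alpha = sum(alpha * n for alpha, n in studies) / total_n
print(f"N-weighted mean reliability: {weighted_mean_alpha:.3f}")
```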
Thurlow, Martha; Rogers, Christopher; Christensen, Laurene – National Center on Educational Outcomes, University of Minnesota, 2010
The success of all students, including students with disabilities, on statewide assessments in mathematics and reading/English language arts has been examined closely. This is due, in part, to the role of these content areas in school accountability for the Elementary and Secondary Education Act (ESEA) known as "No Child Left Behind" (NCLB).…
Descriptors: Science Tests, Disabilities, Student Participation, Testing Accommodations
Hendrickson, Amy; Patterson, Brian; Ewing, Maureen – College Board, 2010
The psychometric considerations and challenges associated with including constructed response items on tests are discussed along with how these issues affect the form assembly specifications for mixed-format exams. Reliability and validity, security and fairness, pretesting, content and skills coverage, test length and timing, weights, statistical…
Descriptors: Multiple Choice Tests, Test Format, Test Construction, Test Validity
Teemant, Annela – Journal of the Scholarship of Teaching and Learning, 2010
ESL students struggle to accurately represent what they know on tests. Understanding what constitutes equitable testing practices in university settings for ESL students poses a significant challenge to educators. This study reports on the content analysis of semi-structured interview data obtained from 13 university-level ESL students on their…
Descriptors: Testing, Interviews, Test Anxiety, English (Second Language)
Tait, Carolyn – Higher Education Quarterly, 2010
The recruitment of Asian students into western universities has highlighted the debate about commercialisation of education, academic standards and the role of culture and language in approaches to learning. This article investigates Chinese students' perceptions of how two typical examination formats (multiple choice and essay) affect their…
Descriptors: Feedback (Response), Student Attitudes, Institutional Evaluation, Academic Standards
DeCarlo, Lawrence T. – ETS Research Report Series, 2008
Rater behavior in essay grading can be viewed as a signal-detection task, in that raters attempt to discriminate between latent classes of essays, with the latent classes being defined by a scoring rubric. The present report examines basic aspects of an approach to constructed-response (CR) scoring via a latent-class signal-detection model. The…
Descriptors: Scoring, Responses, Test Format, Bias
Belov, Dmitry I.; Armstrong, Ronald D. – Applied Psychological Measurement, 2008
This article presents an application of Monte Carlo methods for developing and assembling multistage adaptive tests (MSTs). A major advantage of the Monte Carlo assembly over other approaches (e.g., integer programming or enumerative heuristics) is that it provides a uniform sampling from all MSTs (or MST paths) available from a given item pool.…
Descriptors: Monte Carlo Methods, Adaptive Testing, Sampling, Item Response Theory
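The abstract above contrasts Monte Carlo assembly with optimization-based assembly. The toy sketch below conveys the flavor of the idea for a single-stage form: draw random item sets from a pool and keep only those meeting the constraints. The pool, the constraints, and the single-stage simplification are hypothetical; the cited method samples uniformly from full multistage test configurations, which this sketch does not attempt.

```python
import random

# Toy sketch of Monte Carlo test assembly: repeatedly draw random item sets
# from a pool and keep only those satisfying the assembly constraints.
# Pool contents and constraints are hypothetical.

# Hypothetical pool: (item_id, content_area, difficulty)
pool = [(i, random.choice("ABC"), random.uniform(-2, 2)) for i in range(200)]

def satisfies_constraints(form):
    areas = [c for _, c, _ in form]
    mean_diff = sum(d for _, _, d in form) / len(form)
    # Require at least 3 items per content area and near-zero mean difficulty.
    return all(areas.count(a) >= 3 for a in "ABC") and abs(mean_diff) < 0.3

forms = []
while len(forms) < 5:  # assemble five acceptable forms
    candidate = random.sample(pool, 12)
    if satisfies_constraints(candidate):
        forms.append(candidate)

print(f"Assembled {len(forms)} forms of {len(forms[0])} items each")
```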
Webb, Mi-young Lee; Cohen, Allan S.; Schwanenflugel, Paula J. – Educational and Psychological Measurement, 2008
This study investigated the use of latent class analysis for the detection of differences in item functioning on the Peabody Picture Vocabulary Test-Third Edition (PPVT-III). A two-class solution for a latent class model appeared to be defined in part by ability because Class 1 was lower in ability than Class 2 on both the PPVT-III and the…
Descriptors: Item Response Theory, Test Items, Test Format, Cognitive Ability
Ventouras, Errikos; Triantis, Dimos; Tsiakas, Panagiotis; Stergiopoulos, Charalampos – Computers & Education, 2010
The aim of the present research was to compare the use of multiple-choice questions (MCQs) as an examination method with examination based on constructed-response questions (CRQs). Although MCQs have the advantages of objectivity in the grading process and speed in the production of results, they also introduce an error in the final…
Descriptors: Computer Assisted Instruction, Scoring, Grading, Comparative Analysis
Lohman, David F.; Lakin, Joni M. – British Journal of Educational Psychology, 2009
Background: Strand, Deary, and Smith (2006) reported an analysis of sex differences on the Cognitive Abilities Test (CAT) for over 320,000 UK students 11-12 years old. Although mean differences were small, males were overrepresented at the upper and lower extremes of the score distributions on the quantitative and non-verbal batteries and at the…
Descriptors: Gender Differences, Cognitive Tests, Foreign Countries, Comparative Analysis
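The pattern noted above (small mean differences but overrepresentation of one group at both extremes) is what slightly greater variance produces under roughly normal score distributions. The arithmetic sketch below illustrates this with hypothetical means and standard deviations; the values are not those reported in the study.

```python
from statistics import NormalDist

# Illustrative arithmetic: a small mean difference combined with a slightly
# larger standard deviation yields overrepresentation at BOTH tails.
# Parameter values are hypothetical.

males = NormalDist(mu=100.0, sigma=15.6)    # hypothetical: slightly more variable
females = NormalDist(mu=100.5, sigma=14.4)  # hypothetical: slightly higher mean

for cut in (70, 130):  # lower and upper extreme cut scores
    if cut < 100:
        ratio = males.cdf(cut) / females.cdf(cut)
        print(f"below {cut}: male/female ratio is about {ratio:.2f}")
    else:
        ratio = (1 - males.cdf(cut)) / (1 - females.cdf(cut))
        print(f"above {cut}: male/female ratio is about {ratio:.2f}")
```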
Lafontaine, Dominique; Monseur, Christian – European Educational Research Journal, 2009
In this article we discuss how indicators that may appear straightforward, such as gender differences, need to be interpreted with extreme care. In particular, we consider how the assessment framework and the methodology of international surveys may affect the results and the indicators. Through analysis of…
Descriptors: Foreign Countries, Reading Comprehension, Test Format, Comparative Analysis
In'nami, Yo; Koizumi, Rie – Language Testing, 2009
A meta-analysis was conducted on the effects of multiple-choice and open-ended formats on L1 reading, L2 reading, and L2 listening test performance. Fifty-six data sources located in an extensive search of the literature were the basis for the estimates of the mean effect sizes of test format effects. The results using the mixed effects model of…
Descriptors: Test Format, Listening Comprehension Tests, Multiple Choice Tests, Program Effectiveness
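The mean effect sizes referred to above are weighted averages of study-level effects. As a minimal sketch, the snippet below computes an inverse-variance weighted mean; the effect sizes and variances are hypothetical, and this fixed-effect weighting is a simplification of the mixed-effects model named in the abstract.

```python
# Minimal sketch of an inverse-variance weighted mean effect size, the core
# calculation behind meta-analytic averages of format effects.
# Study-level values below are hypothetical.

effects = [
    # (effect size d, sampling variance) -- hypothetical values
    (0.35, 0.020),
    (0.10, 0.015),
    (0.48, 0.040),
]

weights = [1.0 / v for _, v in effects]
mean_d = sum(w * d for (d, _), w in zip(effects, weights)) / sum(weights)
print(f"Weighted mean effect size: {mean_d:.3f}")
```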
Hart, Ray; Casserly, Michael; Uzzell, Renata; Palacios, Moses; Corcoran, Amanda; Spurgeon, Liz – Council of the Great City Schools, 2015
There has been little data collected on how much testing actually goes on in America's schools and how the results are used. So in the Spring of 2014, the Council staff developed and launched a survey of assessment practices. This report presents the findings from that survey and subsequent Council analysis and review of the data. It also offers…
Descriptors: Urban Schools, Student Evaluation, Testing Programs, Testing
Lissitz, Robert W.; Hou, Xiaodong; Slater, Sharon Cadman – Journal of Applied Testing Technology, 2012
This article investigates several questions regarding the impact of different item formats on measurement characteristics. Constructed response (CR) items and multiple choice (MC) items obviously differ in their formats and in the resources needed to score them. As such, they have been the subject of considerable discussion regarding the impact of…
Descriptors: Computer Assisted Testing, Scoring, Evaluation Problems, Psychometrics
