Publication Date
| Date range | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 8 |
| Since 2022 (last 5 years) | 61 |
| Since 2017 (last 10 years) | 125 |
| Since 2007 (last 20 years) | 219 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Test Construction | 813 |
| Test Format | 813 |
| Test Items | 363 |
| Test Validity | 180 |
| Higher Education | 177 |
| Computer Assisted Testing | 146 |
| Multiple Choice Tests | 139 |
| Test Reliability | 135 |
| Foreign Countries | 130 |
| Elementary Secondary Education | 108 |
| Language Tests | 87 |
Audience
| Audience | Records |
| --- | --- |
| Practitioners | 78 |
| Teachers | 57 |
| Researchers | 38 |
| Administrators | 16 |
| Students | 6 |
| Policymakers | 5 |
| Media Staff | 1 |
| Parents | 1 |
Location
| Location | Records |
| --- | --- |
| Turkey | 12 |
| Canada | 10 |
| Japan | 10 |
| United States | 9 |
| United Kingdom | 8 |
| Germany | 7 |
| Australia | 6 |
| Israel | 6 |
| California | 5 |
| China | 5 |
| Florida | 5 |
Laws, Policies, & Programs
| Law, Policy, or Program | Records |
| --- | --- |
| No Child Left Behind Act 2001 | 3 |
| Improving America's Schools… | 1 |
| Individuals with Disabilities… | 1 |
Pyle, Katie; Jones, Emily; Williams, Chris; Morrison, Jo – Educational Research, 2009
Background: All national curriculum tests in England are pre-tested as part of the development process. Differences in pupil performance between pre-test and live test are consistently found. This difference has been termed the pre-test effect. Understanding the pre-test effect is essential in the test development and selection processes and in…
Descriptors: Foreign Countries, Pretesting, Context Effect, National Curriculum
National Assessment Governing Board, 2009
As the ongoing national indicator of what American students know and can do, the National Assessment of Educational Progress (NAEP) in Reading regularly collects achievement information on representative samples of students in grades 4, 8, and 12. The information that NAEP provides about student achievement helps the public, educators, and…
Descriptors: National Competency Tests, Reading Tests, Test Items, Test Format
Molina, Maria Teresa Lopez-Mezquita – Indian Journal of Applied Linguistics, 2009 (peer reviewed)
Lexical competence is considered to be an essential step in the development and consolidation of a student's linguistic ability, and thus the reliable assessment of such competence turns out to be a fundamental aspect in this process. The design and construction of vocabulary tests has become an area of special interest, as it may provide teachers…
Descriptors: Student Evaluation, Second Language Learning, Computer Assisted Testing, Foreign Countries
Allalouf, Avi; Abramzon, Andrea – Language Assessment Quarterly, 2008
Differential item functioning (DIF) analysis can be used to great advantage in second language (L2) assessments. This study examined the differences in performance on L2 test items between groups from different first language backgrounds and suggested ways of improving L2 assessments. The study examined DIF on L2 (Hebrew) test items for two…
Descriptors: Test Items, Test Format, Second Language Learning, Test Construction
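The study above screens second-language test items for differential item functioning (DIF) across first-language groups. As a general illustration of how such a screen is often operationalised, here is a minimal sketch of the Mantel-Haenszel DIF statistic for one dichotomous item, stratifying examinees by total score; the function, variable names, and toy data are illustrative assumptions and are not drawn from the cited paper.

```python
import math
from collections import defaultdict

def mantel_haenszel_dif(scores, item, group):
    """Mantel-Haenszel DIF screen for one 0/1-scored item.
    scores: total test scores; item: 0/1 responses to the studied item;
    group: 'ref' or 'focal' for each examinee. Returns (alpha_MH, delta_MH)."""
    # Build a 2x2 table (group x right/wrong) within each total-score stratum.
    strata = defaultdict(lambda: {"A": 0, "B": 0, "C": 0, "D": 0})
    for s, x, g in zip(scores, item, group):
        cell = strata[s]
        if g == "ref":
            cell["A" if x == 1 else "B"] += 1
        else:
            cell["C" if x == 1 else "D"] += 1
    num = den = 0.0
    for cell in strata.values():
        n = sum(cell.values())
        if n == 0:
            continue
        num += cell["A"] * cell["D"] / n   # reference right, focal wrong
        den += cell["B"] * cell["C"] / n   # reference wrong, focal right
    alpha = num / den if den > 0 else float("nan")
    delta = -2.35 * math.log(alpha)        # ETS delta metric
    return alpha, delta

# Toy usage with a small simulated group of reference and focal examinees.
scores = [10, 10, 10, 10, 12, 12, 12, 12]
item   = [1, 0, 1, 0, 1, 1, 0, 1]
group  = ["ref", "ref", "focal", "focal", "ref", "ref", "focal", "focal"]
print(mantel_haenszel_dif(scores, item, group))
```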
Tanguma, Jesus – 2000
This paper addresses four steps in test construction specification: (1) the purpose of the test; (2) the content of the test; (3) the format of the test; and (4) the pool of items. If followed, such steps not only will assist the test constructor but will also enhance the students' learning. Within the "Content of the Test" section, two…
Descriptors: Test Construction, Test Content, Test Format, Test Items
DeMars, Christine E. – Journal of Educational Measurement, 2003 (peer reviewed)
Data were generated to simulate multidimensionality resulting from including two or four subtopics on a test. DIMTEST analysis results suggest that including multiple topics, when they are commonly taught together, can lead to conceptual multidimensionality and mathematical multidimensionality. (SLD)
Descriptors: Curriculum, Simulation, Test Construction, Test Format
Liu, Jinghua; Zhu, Xiaowen – ETS Research Report Series, 2008
The purpose of this paper is to explore methods to approximate population invariance without conducting multiple linkings for subpopulations. Under the single group or equivalent groups design, no linking needs to be performed for the parallel-linear system linking functions. The unequated raw score information can be used as an approximation. For…
Descriptors: Raw Scores, Test Format, Comparative Analysis, Test Construction
Crisp, Victoria – Research Papers in Education, 2008
This research set out to compare the quality, length and nature of (1) exam responses in combined question and answer booklets, with (2) responses in separate answer booklets in order to inform choices about response format. Combined booklets are thought to support candidates by giving more information on what is expected of them. Anecdotal…
Descriptors: Geography Instruction, High School Students, Test Format, Test Construction
Johnson, William L.; Dixon, Paul N. – Educational and Psychological Measurement, 1984 (peer reviewed)
This study analyzed the results of applying two different methods of Likert-scale construction (single-column and discrepancy-column formats). The findings indicated that the discrepancy format provides stronger discrimination for purposes of measuring need. (Author/BW)
Descriptors: Needs Assessment, Responses, Test Construction, Test Format
Lucas, Peter A.; McConkie, George W. – American Educational Research Journal, 1980 (peer reviewed)
An approach is described for the characterization of test questions in terms of the information in a passage relevant to answering them and the nature of the relationship of this information to the questions. The approach offers several advantages over previous algorithms for the production of test items. (Author/GDC)
Descriptors: Content Analysis, Cues, Test Construction, Test Format
van der Linden, Wim J.; Adema, Jos J. – Journal of Educational Measurement, 1998 (peer reviewed)
Proposes an algorithm for the assembly of multiple test forms in which the multiple-form problem is reduced to a series of computationally less intensive two-form problems. Illustrates how the method can be implemented using 0-1 linear programming and gives two examples. (SLD)
Descriptors: Algorithms, Linear Programming, Test Construction, Test Format
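The abstract above describes assembling multiple parallel test forms with 0-1 linear programming. A minimal sketch of a two-form 0-1 assembly model follows; it assumes the open-source PuLP solver and an invented ten-item bank with item-information values, none of which comes from the cited article.

```python
import pulp  # assumed MILP wrapper; not mentioned in the cited article

# Toy item bank: information of each item at one target ability level.
info = [0.42, 0.35, 0.51, 0.28, 0.44, 0.39, 0.47, 0.31, 0.36, 0.49]
n_items, form_len = len(info), 3

prob = pulp.LpProblem("two_form_assembly", pulp.LpMaximize)
# x[i][f] = 1 if item i is assigned to form f (0-1 decision variables).
x = [[pulp.LpVariable(f"x_{i}_{f}", cat="Binary") for f in range(2)]
     for i in range(n_items)]

info_f0 = pulp.lpSum(info[i] * x[i][0] for i in range(n_items))
info_f1 = pulp.lpSum(info[i] * x[i][1] for i in range(n_items))

prob += info_f0 + info_f1                      # maximise total information
for i in range(n_items):
    prob += x[i][0] + x[i][1] <= 1             # each item in at most one form
for f in range(2):
    prob += pulp.lpSum(x[i][f] for i in range(n_items)) == form_len  # fixed length

# Parallelism: keep the two forms' information within a small tolerance.
prob += info_f0 - info_f1 <= 0.05
prob += info_f1 - info_f0 <= 0.05

prob.solve()
for f in range(2):
    chosen = [i for i in range(n_items) if x[i][f].varValue > 0.5]
    print(f"Form {f}: items {chosen}")
```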
Bahar, Mehmet; Aydin, Fatih; Karakirik, Erol – Online Submission, 2009
In this article, the structural communication grid (SCG), an alternative measurement and evaluation technique, is first summarised, and the design, development, and implementation of a computer-based SCG system are introduced. The system is then tested on a sample of 154 participants consisting of candidate students, science teachers and…
Descriptors: Educational Technology, Technology Integration, Evaluation Methods, Measurement Techniques
Anderson, Zola; And Others – 1983
The study examined the effect of test modifications on the performance of 10 handicapped preschoolers on the Stanford-Binet Intelligence Scale (Form L-M). Adaptations of both stimulus and response modes were designed and constructed for subtests at the preschool levels on the Stanford-Binet. Attempts were made to maintain the functional…
Descriptors: Disabilities, Intelligence Tests, Preschool Education, Test Construction
Boser, Judith A. – Evaluation News, 1985
The maximum incorporation of computer coding into an instrument is recommended to reduce errors in coding information from questionnaires. Specific suggestions for guiding the precoding process for response options, numeric identifiers, and assignment of card columns are proposed for mainframe computer data entry. (BS)
Descriptors: Computers, Data Collection, Data Processing, Questionnaires
Streiner, David L.; Miller, Harold R. – Journal of Clinical Psychology, 1986 (peer reviewed)
Numerous short forms of the Minnesota Multiphasic Personality Inventory have been proposed in the last 15 years. In each case, the initial enthusiasm has been replaced by questions about the clinical utility of the abbreviated version. Argues that the statistical properties of the test and reduced reliability due to shortening the scales…
Descriptors: Test Construction, Test Format, Test Length, Test Reliability
