Showing 1,366 to 1,380 of 3,126 results
Baker, Eva; Polin, Linda – 1978
The validity studies planned for the Test Design activities deal primarily with the appropriateness of items generated for a domain. Previous exploratory work on overall test content appropriateness ratings has not been satisfactory. Studies based solely on correlational data suffer from confounding with…
Descriptors: Questionnaires, Rating Scales, Test Construction, Test Format
Swartz, Richard; Whitney, Douglas R. – Lifelong Learning, 1987
The authors discuss the new essay requirement on the General Educational Development Test. Topics covered include scoring, expected difficulty, and how test preparatory classes can help students do well on the essay. (CH)
Descriptors: Adult Basic Education, High School Equivalency Programs, Test Format, Writing Skills
Peer reviewed
Dodd, David K.; Leal, Linda – Teaching of Psychology, 1988
Discusses answer justification, a technique that allows students to convert multiple-choice items perceived to be "tricky" into short-answer essay questions. Convincing justifications earn students credit for missed items. The procedure is reported to be easy to administer and very popular among students. (Author/GEA)
Descriptors: Guessing (Tests), Higher Education, Multiple Choice Tests, Psychology
Peer reviewed
Chambers, William V. – Social Behavior and Personality, 1985
Personal construct psychologists have suggested that various psychological functions explain differences in the stability of constructs. Among these functions are constellatory and loose construction. This paper argues that measurement error is a more parsimonious explanation of the differences in construct stability reported in these studies. (Author)
Descriptors: Error of Measurement, Test Construction, Test Format, Test Reliability
Peer reviewed
Grosse, Martin E.; Wright, Benjamin D. – Educational and Psychological Measurement, 1985
A model of examinee behavior was used to generate hypotheses about the operation of true-false scores. Confirmation of hypotheses supported the contention that true-false scores contain an error component that makes these tests less reliable than multiple-choice tests. Examinee response style may invalidate a total true-false score. (Author/DWH)
Descriptors: Objective Tests, Response Style (Tests), Test Format, Test Reliability
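The error component that Grosse and Wright attribute to true-false scores can be given a back-of-the-envelope illustration (this is not the authors' model): when unknown items are answered by blind guessing, a binomial model puts more guessing variance into a number-right score at chance level 1/2 (true-false) than at 1/4 (four-option multiple choice).

```python
def guessing_variance(n_items, p_guess):
    """Variance contributed to a number-right score when every
    unknown item is answered by blind guessing (binomial model)."""
    return n_items * p_guess * (1.0 - p_guess)

# A 20-item test answered entirely by guessing (toy numbers).
tf = guessing_variance(20, 1 / 2)  # true-false: chance level 1/2
mc = guessing_variance(20, 1 / 4)  # four-option MC: chance level 1/4
```

Since per-item guessing variance p(1-p) peaks at p = 1/2, true-false guessing injects the maximum possible noise per item, which is one route to the lower reliability the abstract describes.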
Peer reviewed
Dixon, Paul N.; And Others – Educational and Psychological Measurement, 1984
The influence of scale format on results was examined. Two Likert-type formats, one with all choice points defined and one with only end-points defined, were administered. Each subject completed half the items in each format. Results indicated little difference between forms, and subjects indicated no format preference. (Author/DWH)
Descriptors: Higher Education, Rating Scales, Response Style (Tests), Test Format
Spray, Judith; Lin, Chuan-Ju; Chen, Troy T. – 2002
Automated test assembly is a technology for producing multiple, equivalent test forms from an item pool. An important consideration for test security in automated test assembly is the inclusion of the same items on these multiple forms. Although it is possible to use item selection as a formal constraint in assembling forms, the number of…
Descriptors: Computer Assisted Testing, Item Banks, Test Construction, Test Format
van der Linden, Wim J. – 2001
This report contains a review of procedures for computerized assembly of linear, sequential, and adaptive tests. The common approach to these test assembly problems is to view them as instances of constrained combinatorial optimization. For each testing format, several potentially useful objective functions and types of constraints are discussed.…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Construction, Test Format
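The constrained combinatorial optimization view of test assembly that van der Linden's report reviews can be illustrated with a toy example. The sketch below (item pool, information values, and content areas all invented) brute-forces the 0-1 selection problem: maximize summed item information at a target ability, subject to a content-balance constraint. Operational systems use integer programming solvers rather than enumeration.

```python
from itertools import combinations

# Toy item pool: each item has Fisher information at a target ability
# level and a content area. All values are invented for illustration.
pool = [
    {"id": 1, "info": 0.52, "area": "algebra"},
    {"id": 2, "info": 0.35, "area": "algebra"},
    {"id": 3, "info": 0.61, "area": "geometry"},
    {"id": 4, "info": 0.28, "area": "geometry"},
    {"id": 5, "info": 0.44, "area": "algebra"},
    {"id": 6, "info": 0.50, "area": "geometry"},
]

def assemble(pool, length=4, per_area=2):
    """Pick `length` items maximizing summed information, subject to
    a content constraint of exactly `per_area` items per area."""
    best, best_info = None, -1.0
    for form in combinations(pool, length):
        counts = {}
        for item in form:
            counts[item["area"]] = counts.get(item["area"], 0) + 1
        if any(c != per_area for c in counts.values()):
            continue  # violates the content-balance constraint
        info = sum(item["info"] for item in form)
        if info > best_info:
            best, best_info = form, info
    return [item["id"] for item in best], best_info

ids, info = assemble(pool)
```

The objective (maximum information) and the constraint (content balance) are two of the ingredients the report discusses; sequential and adaptive formats add the complication that items are chosen one at a time rather than as a whole form.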
van der Linden, Wim J.; Adema, Jos J. – 1997
An algorithm for the assembly of multiple test forms is proposed in which the multiple-form problem is reduced to a series of computationally less intensive two-form problems. At each step one form is assembled to its true specifications; the other form is a dummy assembled only to maintain a balance between the quality of the current form and the…
Descriptors: Algorithms, Foreign Countries, Higher Education, Linear Programming
Henson, Robin K. – 2000
The purpose of this paper is to highlight some psychometric cautions that should be observed when seeking to develop short form versions of tests. Several points are made: (1) score reliability is impacted directly by the characteristics of the sample and testing conditions; (2) sampling error has a direct influence on reliability and factor…
Descriptors: Factor Structure, Psychometrics, Reliability, Sampling
Li, Yuan H.; Lissitz, Robert W.; Yang, Yu Nu – 1999
Recent years have seen growing use of tests with mixed item formats, e.g., containing both dichotomously scored and polytomously scored items. A method of matching two test characteristic curves (CCM) for placing these mixed-format items on the same metric is described and evaluated in this paper under a common-item…
Descriptors: Equated Scores, Estimation (Mathematics), Item Response Theory, Test Format
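As an illustration of characteristic-curve linking in general (not the paper's specific mixed-format CCM), the sketch below recovers known linking constants for dichotomous 2PL items by minimizing the squared distance between two test characteristic curves over an ability grid, in the spirit of the Stocking-Lord criterion. All item parameters and the true linking constants are invented.

```python
import math

def p2pl(theta, a, b):
    """2PL item response probability."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def tcc(theta, items):
    """Test characteristic curve: expected score at theta."""
    return sum(p2pl(theta, a, b) for a, b in items)

# Common items calibrated on the target (Form Y) scale.
items_y = [(1.0, -0.5), (1.4, 0.0), (0.8, 0.7), (1.2, 1.2)]

# The same items calibrated on Form X's scale, related to Y by
# theta_Y = A*theta_X + B with true A=1.2, B=0.5, so on X's scale
# a_X = a_Y * A and b_X = (b_Y - B) / A.
A_true, B_true = 1.2, 0.5
items_x = [(a * A_true, (b - B_true) / A_true) for a, b in items_y]

# Grid-search the linking constants that make the transformed X-scale
# TCC match the Y-scale TCC (a Stocking-Lord-style loss function).
grid = [t / 10.0 for t in range(-30, 31)]
best = None
for A in [0.8 + 0.05 * i for i in range(17)]:       # 0.80 .. 1.60
    for B in [-0.5 + 0.05 * j for j in range(31)]:  # -0.50 .. 1.00
        trans = [(a / A, A * b + B) for a, b in items_x]
        loss = sum((tcc(t, items_y) - tcc(t, trans)) ** 2 for t in grid)
        if best is None or loss < best[0]:
            best = (loss, A, B)

loss, A_hat, B_hat = best
```

With polytomous items in the mix, the expected item score replaces the 2PL probability in the TCC, which is the extension the abstract's method addresses.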
Papanastasiou, Elena C. – 2002
Due to the increased popularity of computerized adaptive testing (CAT), many admissions tests, as well as certification and licensure examinations, have been transformed from their paper-and-pencil versions to computerized adaptive versions. A major difference between paper-and-pencil tests and CAT, from an examinee's point of view, is that in many…
Descriptors: Adaptive Testing, Cheating, Computer Assisted Testing, Review (Reexamination)
Peer reviewed
Berndt, David J.; And Others – Journal of Consulting and Clinical Psychology, 1983
Obtained reading grade levels for depression scales by use of two empirically based readability formulae. Results showed Kovacs children's measure had the easiest reading level, the General Behavior Inventory was appropriate for college-level reading, and most other measures clustered at a fifth- to ninth-grade reading level. (WAS)
Descriptors: Affective Measures, Depression (Psychology), Readability, Readability Formulas
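Readability formulas of the kind used in this study are simple functions of word, sentence, and syllable counts. The sketch below implements the standard Flesch-Kincaid grade-level formula with a rough vowel-group syllable counter; it is an illustration of the technique, not the specific formulae the authors applied to the depression scales.

```python
import re

def count_syllables(word):
    """Rough syllable count: runs of vowels, minus a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def fk_grade(text):
    """Flesch-Kincaid grade level:
    0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)
```

Short sentences of mostly one-syllable words score at or below early elementary grades, while long, polysyllabic items push a scale toward the college-level readings reported in the abstract.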
Peer reviewed
Ouellette, Sue E.; Sendelbaugh, Joseph W. – American Annals of the Deaf, 1982
Fifteen deaf students (18 to 24 years old) who received the standard written form of a reading comprehension test performed significantly better than 15 deaf Ss who received an American Sign Language version. There were no differences between Ss receiving the standard form and Ss receiving a Manually Coded English videotaped form. (CL)
Descriptors: College Students, Deafness, Performance Factors, Reading Comprehension
Peer reviewed
Wilcox, Rand R. – Educational and Psychological Measurement, 1982
Results in the engineering literature on "k out of n system reliability" can be used to characterize tests based on estimates of the probability of correctly determining whether the examinee knows the correct response. In particular, the minimum number of distractors required for multiple-choice tests can be empirically determined.…
Descriptors: Achievement Tests, Mathematical Models, Multiple Choice Tests, Test Format
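A rough sense of how a minimum distractor count can be derived: treat a non-knowing examinee as a blind guesser with success probability 1/(d+1) per item, and find the smallest d for which the guesser's chance of reaching a k-of-n correct criterion stays below a tolerance. This is an illustrative binomial sketch, not Wilcox's k-out-of-n system reliability formulation.

```python
from math import comb

def misclassification_prob(d, n, k):
    """Probability a pure guesser gets at least k of n items right
    when each item has d distractors (chance level 1/(d+1))."""
    p = 1.0 / (d + 1)
    return sum(comb(n, j) * p**j * (1 - p)**(n - j)
               for j in range(k, n + 1))

def min_distractors(n, k, alpha):
    """Smallest distractor count keeping the guesser's chance of
    passing the k-of-n criterion at or below alpha."""
    d = 1
    while misclassification_prob(d, n, k) > alpha:
        d += 1
    return d
```

For a single item and a 0.25 tolerance, three distractors suffice (chance level 1/4); asking several parallel items lets fewer distractors achieve the same protection, which is the kind of trade-off the empirical determination in the abstract addresses.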