Peer reviewed: Cziko, Gary A. – TESOL Quarterly, 1982
Describes an attempt to construct an ESL dictation test that would: (1) be appropriate for a wide range of ability, (2) be easy and fast to score, (3) consist of a set of items that would form both a unidimensional and cumulative scale, and (4) yield scores that would be directly interpretable with respect to specified levels of English proficiency.…
Descriptors: Criterion Referenced Tests, English (Second Language), Higher Education, Scores
Peer reviewed: Plake, Barbara S.; And Others – Journal of Educational Measurement, 1982
Effects of item arrangement, test anxiety, and sex on a mathematics test taken by motivated, upper-division undergraduates and beginning graduate students were investigated. Results showed that males outperformed females when items were arranged from easy to hard. (Author/GK)
Descriptors: Academic Achievement, College Mathematics, Higher Education, Sex Differences
Peer reviewed: Stiggins, Richard J. – Research in the Teaching of English, 1982
Compares direct and indirect writing assessment strategies and contrasts them in terms of the relationship each has to specific classroom decision-making situations, the components of writing assessed, practical testing matters, characteristics of test exercises, test scoring procedures, and procedures for determining test quality. (HOD)
Descriptors: Comparative Analysis, Decision Making, Educational Assessment, Test Format
Peer reviewed: Ward, William C.; And Others – Journal of Educational Measurement, 1980
Free response and machine-scorable versions of a test called Formulating Hypotheses were compared with respect to construct validity. Results indicate that the different forms involve different cognitive processes and measure different qualities. (Author/JKS)
Descriptors: Cognitive Processes, Cognitive Tests, Higher Education, Personality Traits
Peer reviewed: Berk, Ronald A. – Journal of Educational Measurement, 1980
A dozen different approaches that yield 13 reliability indices for criterion-referenced tests were identified and grouped into three categories: threshold loss function, squared-error loss function, and domain score estimation. Indices were evaluated within each category. (Author/RL)
Descriptors: Classification, Criterion Referenced Tests, Cutting Scores, Evaluation Methods
Peer reviewed: Wainer, Howard; Lukhele, Robert – Educational and Psychological Measurement, 1997
The reliability of scores from four forms of the Test of English as a Foreign Language (TOEFL) was estimated using a hybrid item response theory model. It was found that there was very little difference between overall reliability when the testlet items were assumed to be independent and when their dependence was modeled. (Author/SLD)
Descriptors: English (Second Language), Item Response Theory, Scores, Second Language Learning
Peer reviewed: Colwell, Richard – Music Educators Journal, 1990
Encourages music teachers to work with students interested in advanced placement (AP) music courses. Discusses the logistics and advantages of placing students in these courses. Describes the Advanced Placement Listening and Literature and the Advanced Placement Theory courses and examinations. Outlines the examination scoring method and looks at…
Descriptors: Acceleration (Education), Advanced Placement Programs, Advanced Students, Educational Attainment
Peer reviewed: McCall, Virgil W.; And Others – Contemporary Educational Psychology, 1989
Scores from the Form L-M and the Fourth Edition of the Stanford-Binet Intelligence Scale were compared for 19 male and 13 female gifted children before they entered grade 3. Significant differences were found between the L-M intelligence scores and the composite and area scores of the Fourth Edition. (SLD)
Descriptors: Academically Gifted, Comparative Analysis, Elementary School Students, Intelligence Quotient
Peer reviewed: Haladyna, Thomas M.; Downing, Steven M. – Applied Measurement in Education, 1989
A taxonomy of 43 rules for writing multiple-choice test items is presented, based on a consensus of 46 textbooks. These guidelines are presented as complete and authoritative, with solid consensus apparent for 33 of the rules. Four rules lack consensus, and 5 rules were cited fewer than 10 times. (SLD)
Descriptors: Classification, Interrater Reliability, Multiple Choice Tests, Objective Tests
Peer reviewed: Melancon, Janet G.; Thompson, Bruce – Psychology in the Schools, 1989
Investigated measurement characteristics of both forms of the Finding Embedded Figures Test (FEFT). College students (N=302) completed both forms of FEFT or one form of FEFT and the Group Embedded Figures Test. Results suggest that FEFT forms provide reasonably reliable and valid data. (Author/NB)
Descriptors: College Students, Field Dependence Independence, Higher Education, Multiple Choice Tests
Peer reviewed: Comrey, Andrew L. – Journal of Consulting and Clinical Psychology, 1988
Addresses common pitfalls in homogeneous scale construction in clinical and social psychology. Offers suggestions about item writing, answer scale formats, data analysis procedures, and overall scale development strategy. Emphasizes effective use of factor-analytic methods to select items for scales and to determine its proper location in…
Descriptors: Clinical Psychology, Data Analysis, Factor Analysis, Personality Measures
Peer reviewed: Lyons, Judith A.; Scotti, Joseph R. – Psychological Assessment, 1994
The utility of using the Keane Minnesota Multiphasic Personality Inventory (MMPI) Posttraumatic Stress Disorder scale as an instrument separate from the full MMPI was evaluated. Results with 175 African American and white male veterans support use of the scale as an alternative to the full test. (SLD)
Descriptors: Blacks, Comparative Analysis, Evaluation Methods, Males
Peer reviewed: Smith, Renee L.; And Others – Psychological Assessment, 1995
The clinical utility of using fewer than 12 trials of the Selective Reminding Test, a task to assess verbal memory, was studied with 100 cardiac patients and 100 brain injury patients. Results suggest that as few as 6 trials might be adequate, providing information consistent with that from 12 trials. (SLD)
Descriptors: Clinical Diagnosis, Diagnostic Tests, Head Injuries, Memory
Peer reviewed: Vallies, June Baird; And Others – Reading Improvement, 1992
Compares the performance of four mainstreamed learning-disabled students on oral and written tests in social studies. Finds superior test performance during oral testing replicated across all four students. Suggests procedures for implementing oral testing by classroom teachers. (RS)
Descriptors: Comparative Analysis, Educational Research, Grade 2, Learning Disabilities
Peer reviewed: Knowles, Susan L.; Welch, Cynthia A. – Educational and Psychological Measurement, 1992
A meta-analysis of the difficulty and discrimination of the "none-of-the-above" (NOTA) test option was conducted with 12 articles (20 effect sizes) for difficulty and 7 studies (11 effect sizes) for discrimination. Findings indicate that using the NOTA option does not result in items of lesser quality. (SLD)
Descriptors: Difficulty Level, Effect Size, Meta Analysis, Multiple Choice Tests


