Publication Date
| In 2026 | 0 |
| Since 2025 | 197 |
| Since 2022 (last 5 years) | 1067 |
| Since 2017 (last 10 years) | 2577 |
| Since 2007 (last 20 years) | 4938 |
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Turkey | 225 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 65 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Liu, Jinghua; Schuppan, Fred; Walker, Michael E. – College Board, 2005
This study explored whether adding items with more advanced math content to the SAT Reasoning Test™ (SAT®) would affect test-taker performance. Two sets of SAT math equating sections were modified to form four subforms each. Different numbers of items with advanced content, taken from the SAT II: Mathematics Level IC Test (Math IC),…
Descriptors: College Entrance Examinations, Mathematics Tests, Test Items, Difficulty Level
Peer reviewed: Shavelson, Richard J.; And Others – Journal of Educational Psychology, 1974
Descriptors: Aptitude, Incidental Learning, Individual Differences, Instructional Materials
Peer reviewed: Harnisch, Delwyn L. – Journal of Educational Measurement, 1983
The Student-Problem (S-P) methodology is described using an example of 24 students on a test of 44 items. Information based on the students' test score and the modified caution index is put to diagnostic use. A modification of the S-P methodology is applied to domain-referenced testing. (Author/CM)
Descriptors: Academic Achievement, Educational Practices, Item Analysis, Responses
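A minimal sketch of the modified caution index mentioned above, assuming the common Harnisch-Linn style formulation: an examinee's proportion-correct-weighted response pattern is compared against the Guttman pattern with the same number-correct score. The function name and the toy response matrix are invented for illustration and are not taken from the article.

```python
import numpy as np

def modified_caution_index(responses):
    """Modified caution index (Harnisch-Linn style) for each examinee.

    responses: 2D array of 0/1 item scores, shape (n_examinees, n_items).
    Returns values in [0, 1]; 0 means a perfect Guttman pattern (all
    correct answers on the easiest items), larger values flag
    increasingly aberrant response patterns.
    """
    responses = np.asarray(responses, dtype=float)
    p = responses.mean(axis=0)                 # item difficulties (proportion correct)
    order = np.argsort(-p)                     # items sorted easiest -> hardest
    p_sorted = p[order]
    x_sorted = responses[:, order]

    mci = np.zeros(responses.shape[0])
    for i, x in enumerate(x_sorted):
        n_correct = int(x.sum())
        if n_correct in (0, x.size):           # all wrong or all right: leave at 0
            continue
        easiest = p_sorted[:n_correct].sum()   # weighted score of the ideal Guttman pattern
        hardest = p_sorted[-n_correct:].sum()  # weighted score of the worst-case pattern
        observed = (x * p_sorted).sum()        # weighted score of the observed pattern
        mci[i] = (easiest - observed) / (easiest - hardest)
    return mci

# Toy example: the last examinee misses easy items but answers hard ones,
# so the index flags that pattern as aberrant.
data = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
])
print(modified_caution_index(data).round(2))
```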
Haladyna, Thomas M.; Roid, Gale H. – Educational Technology, 1983
Summarizes item review in the development of criterion-referenced tests, including logical item review, which examines the match between instructional intent and the items; empirical item review, which examines response patterns; traditional item review; and instructional sensitivity of test items. Twenty-eight references are listed. (MBR)
Descriptors: Criterion Referenced Tests, Educational Research, Literature Reviews, Teaching Methods
Peer reviewed: O'Leary, Susan G.; Steen, Patricia L. – Journal of Consulting and Clinical Psychology, 1982
Using three samples of children (a heterogeneous group, a hyperactive group, and a replication sample of hyperactive children), the authors evaluated the Stony Brook Scale (SBS). Results indicated the SBS independently assessed hyperactivity and aggression in samples of hyperactive children. (Author/RC)
Descriptors: Aggression, Behavior Rating Scales, Children, Classification
Peer reviewed: Barker, Douglas; Ebel, Robert L. – Contemporary Educational Psychology, 1982
Two forms of an undergraduate examination were constructed. Tests varied with respect to item truth value (true, false) and method of phrasing (positive, negative). Negatively stated items were more difficult but not more discriminating than positively stated items. False items were not more difficult but were more discriminating than true items.…
Descriptors: Difficulty Level, Higher Education, Item Analysis, Response Style (Tests)
Peer reviewed: Beuchert, A. Kent; Mendoza, Jorge L. – Journal of Educational Measurement, 1979
Ten item discrimination indices, across a variety of item analysis situations, were compared, based on the validities of tests constructed by using each of the indices to select 40 items from a 100-item pool. Item score data were generated by a computer program and included a simulation of guessing. (Author/CTM)
Descriptors: Item Analysis, Simulation, Statistical Analysis, Test Construction
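The study's general workflow (simulate item responses, score each item on a discrimination index, keep the top items) can be sketched as follows. The sketch uses a corrected point-biserial (item-rest correlation) as a single stand-in index and an invented logistic response simulation with a guessing floor; the actual study compared ten different indices under its own simulation design.

```python
import numpy as np

rng = np.random.default_rng(0)
n_examinees, n_items, n_keep = 500, 100, 40

# Simulate responses: ability minus item difficulty plus a crude guessing floor.
ability = rng.normal(size=(n_examinees, 1))
difficulty = rng.normal(size=n_items)
p_correct = 1 / (1 + np.exp(-(ability - difficulty)))
p_correct = 0.2 + 0.8 * p_correct                 # guessing floor of .20
responses = (rng.random((n_examinees, n_items)) < p_correct).astype(int)

# Corrected point-biserial: correlate each item with the rest-score.
total = responses.sum(axis=1)
discrimination = np.empty(n_items)
for j in range(n_items):
    rest = total - responses[:, j]
    discrimination[j] = np.corrcoef(responses[:, j], rest)[0, 1]

selected = np.argsort(-discrimination)[:n_keep]   # keep the 40 most discriminating items
print("mean discrimination of selected items:", discrimination[selected].mean().round(3))
```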
Peer reviewed: Pikulski, John J.; Shanahan, Timothy – Reading Teacher, 1980
Reviews information about three major dimensions along which phonics measures vary: (1) group versus individual administration, (2) production response (say or write) versus recognition response, and (3) the stimuli or materials used to evaluate phonics. (HOD)
Descriptors: Comparative Analysis, Elementary Education, Evaluation Methods, Phonics
Peer reviewed: Hanna, Gerald S.; Oaster, Thomas R. – Educational and Psychological Measurement, 1980
Certain kinds of multiple-choice reading comprehension questions may be answered correctly at the higher-than-chance level when they are administered without the accompanying passage. These high risk questions do not necessarily lead to passage dependence invalidity. They threaten but do not prove invalidity. (Author/CP)
Descriptors: High Schools, Multiple Choice Tests, Reading Comprehension, Reading Tests
Peer reviewed: Sandoval, Jonathan – Journal of Consulting and Clinical Psychology, 1979
Examined cultural bias of the Wechsler Intelligence Scale for Children-Revised (WISC-R) for Anglo-American, Black, and Mexican American children. Minority children responded in the same way as Anglo-American children. No clear pattern to items on the test that were more difficult for minority children appeared. The WISC-R appears to be nonbiased.…
Descriptors: Children, Culture Fair Tests, Intelligence Tests, Item Analysis
Peer reviewed: Sandoval, Jonathan; Miille, Mary Patricia Whelan – Journal of Consulting and Clinical Psychology, 1980
Findings indicated that the judges were not able to determine accurately which items were more difficult for minority students and that there was no significant difference in accuracy between judges of the different ethnic backgrounds. (Author)
Descriptors: Accountability, Blacks, Evaluators, Intelligence Tests
Peer reviewed: Revelle, William – Multivariate Behavioral Research, 1979
Hierarchical cluster analysis is shown to be an effective method for forming scales from sets of items. Comparisons with factor analytic techniques suggest that hierarchical analysis is superior in some respects for scale construction. (Author/JKS)
Descriptors: Cluster Analysis, Factor Analysis, Item Analysis, Rating Scales
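A rough illustration of forming scales by hierarchical item clustering, in the spirit of the approach described above: items are clustered on a correlation-based distance and each cluster is read as a candidate scale. The simulated two-trait data, the 1 - |r| distance, average linkage, and the two-cluster cut are all illustrative assumptions, not Revelle's own procedure.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)

# Fake item data: two latent traits, five items loading on each.
traits = rng.normal(size=(300, 2))
loadings = np.zeros((2, 10))
loadings[0, :5] = 0.8
loadings[1, 5:] = 0.8
items = traits @ loadings + rng.normal(scale=0.6, size=(300, 10))

# Distance between items: 1 - |correlation|, then average-linkage clustering.
corr = np.corrcoef(items, rowvar=False)
dist = 1 - np.abs(corr)
condensed = dist[np.triu_indices(10, k=1)]             # condensed distance vector
tree = linkage(condensed, method="average")

clusters = fcluster(tree, t=2, criterion="maxclust")   # cut the tree into two scales
print("item -> cluster:", dict(enumerate(clusters, start=1)))
```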
Peer reviewed: Mason, Geoffrey P. – Canadian Journal of Education, 1979
The author replies to Marx's critique of his essay, "Test Purpose and Item Type," which appears on pages 8-13 of this issue of "Canadian Journal of Education." For Marx's comments, see pages 14-19. (SJL)
Descriptors: Cognitive Processes, Criterion Referenced Tests, Formative Evaluation, Measurement Techniques
Peer reviewed: Schnipke, Deborah L.; Scrams, David J. – Journal of Educational Measurement, 1997
A method to measure speededness on tests is presented that reflects the tendency of examinees to guess rapidly on items as time expires. The method models response times with a two-state mixture model, as demonstrated with data from a computer-administered reasoning test taken by 7,218 examinees. (SLD)
Descriptors: Adults, Computer Assisted Testing, Guessing (Tests), Item Response Theory
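The two-state idea (rapid guessing versus solution behavior) can be illustrated by fitting a two-component mixture to log response times and labeling each response by its more probable component. The sketch below fits a Gaussian mixture to simulated log-seconds as a simplified stand-in for the article's model; the guessing rate and time scales are invented.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

# Simulated response times: 15% rapid guesses (~2 s), 85% solution behavior (~45 s).
n = 5000
is_guess = rng.random(n) < 0.15
times = np.where(is_guess,
                 rng.lognormal(mean=np.log(2), sigma=0.4, size=n),
                 rng.lognormal(mean=np.log(45), sigma=0.5, size=n))

# Fit a two-component mixture to log response times.
log_t = np.log(times).reshape(-1, 1)
gm = GaussianMixture(n_components=2, random_state=0).fit(log_t)

# Treat the component with the smaller mean as "rapid guessing".
guess_component = np.argmin(gm.means_.ravel())
labels = gm.predict(log_t)
print("estimated rapid-guessing rate:", np.mean(labels == guess_component).round(3))
```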
Peer reviewed: Cohen, Allan S.; And Others – Applied Psychological Measurement, 1996
Type I error rates for the likelihood ratio test for detecting differential item functioning (DIF) were investigated using Monte Carlo simulations. Type I error rates for the two-parameter model were within theoretically expected values at each alpha level, but those for the three-parameter model were not. (SLD)
Descriptors: Identification, Item Bias, Item Response Theory, Maximum Likelihood Statistics
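A Monte Carlo check of Type I error for a likelihood-ratio DIF test can be sketched as follows. To stay self-contained, the sketch swaps in a logistic-regression LR test (a score-only model versus a model adding group and group-by-score terms) for the IRT-based test studied in the article; responses are generated with no DIF, so the empirical rejection rate should sit near the nominal alpha.

```python
import numpy as np
from scipy.stats import chi2
import statsmodels.api as sm

rng = np.random.default_rng(3)
n_reps, n_per_group, alpha = 200, 500, 0.05
rejections = 0

for _ in range(n_reps):
    # No-DIF data: the studied item depends only on a matching score, not on group.
    group = np.repeat([0, 1], n_per_group)             # reference / focal
    score = rng.normal(size=group.size)                # matching variable (e.g. rest score)
    p = 1 / (1 + np.exp(-(1.2 * score)))               # same curve in both groups
    y = (rng.random(group.size) < p).astype(int)

    # Reduced model: score only. Full model: adds group and group-by-score.
    x_reduced = sm.add_constant(score)
    x_full = sm.add_constant(np.column_stack([score, group, group * score]))
    llf_reduced = sm.Logit(y, x_reduced).fit(disp=0).llf
    llf_full = sm.Logit(y, x_full).fit(disp=0).llf

    lr_stat = 2 * (llf_full - llf_reduced)             # ~ chi-square with 2 df under H0
    rejections += lr_stat > chi2.ppf(1 - alpha, df=2)

print("empirical Type I error rate:", rejections / n_reps)   # expect roughly 0.05
```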


