Publication Date
| In 2026 | 0 |
| Since 2025 | 55 |
| Since 2022 (last 5 years) | 197 |
| Since 2017 (last 10 years) | 497 |
| Since 2007 (last 20 years) | 745 |
Descriptor
| Test Items | 1189 |
| Test Reliability | 1189 |
| Test Validity | 687 |
| Test Construction | 567 |
| Foreign Countries | 349 |
| Difficulty Level | 280 |
| Item Analysis | 253 |
| Psychometrics | 236 |
| Item Response Theory | 219 |
| Factor Analysis | 184 |
| Multiple Choice Tests | 173 |
Author
| Schoen, Robert C. | 12 |
| LaVenia, Mark | 5 |
| Liu, Ou Lydia | 5 |
| Anderson, Daniel | 4 |
| Bauduin, Charity | 4 |
| DiLuzio, Geneva J. | 4 |
| Farina, Kristy | 4 |
| Haladyna, Thomas M. | 4 |
| Huck, Schuyler W. | 4 |
| Petscher, Yaacov | 4 |
| Stansfield, Charles W. | 4 |
Audience
| Practitioners | 39 |
| Researchers | 30 |
| Teachers | 24 |
| Administrators | 13 |
| Support Staff | 3 |
| Counselors | 2 |
| Students | 2 |
| Community | 1 |
| Parents | 1 |
| Policymakers | 1 |
Location
| Turkey | 69 |
| Indonesia | 37 |
| Germany | 20 |
| Canada | 17 |
| Florida | 17 |
| China | 16 |
| Australia | 15 |
| California | 12 |
| Iran | 11 |
| India | 10 |
| New York | 9 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
Peer reviewed: May, Kim; Nicewander, W. Alan – Journal of Educational Measurement, 1994
Reliabilities and information functions for percentile ranks and number-right scores were compared using item response theory, modeling standardized achievement tests. Results demonstrate that situations exist in which the percentage of items known by examinees can be accurately estimated, but the percentage of persons falling below a given score…
Descriptors: Achievement Tests, Difficulty Level, Equations (Mathematics), Estimation (Mathematics)
Peer reviewed: Stumpf, Steven H. – Evaluation and the Health Professions, 1994
A five-year curriculum evaluation project is described that treated students' course ratings, examination reliability coefficients, and item-discrimination data as a battery of data points for determining annual revision efforts. Histograms were constructed to make valid demonstrations of successful efforts immediately comprehensible to faculty.…
Descriptors: College Faculty, Comprehension, Curriculum Evaluation, Longitudinal Studies
Peer reviewed: Truman, William L. – AMATYC Review, 1992
Describes the development of a placement test specifically designed for students entering Pembroke State University. Includes question construction and suitable test standards in discussing the features of a good placement test. Concludes that the test provides a reliable measure of students' potential success. (MDH)
Descriptors: Content Validity, Higher Education, Mathematics Education, Mathematics Tests
Peer reviewed: Descy, Don E. – International Journal of Instructional Media, 1991
Describes the development of an affective measure of student attitudes toward their advisors, called the Descy Attitude toward Advisors scale (DATAs). Content validity is discussed, along with construct validity and alpha reliability data gathered on 120 graduate students in the fields of education and nursing. A copy of DATAs is appended.…
Descriptors: Academic Advising, Attitude Measures, Faculty Advisers, Higher Education
Kong, Xiaojing J.; Wise, Steven L.; Bhola, Dennison S. – Educational and Psychological Measurement, 2007
This study compared four methods for setting item response time thresholds to differentiate rapid-guessing behavior from solution behavior. Thresholds were either (a) common for all test items, (b) based on item surface features such as the amount of reading required, (c) based on visually inspecting response time frequency distributions, or (d)…
Descriptors: Test Items, Reaction Time, Timed Tests, Item Response Theory
Haladyna, Thomas M. – 1984
The purpose of this study is to examine an option-weighting method as it affects pass-fail decisions in formative and summative evaluation of student achievement for instructional units, certification, advancement, licensure, admissions, placement, and selection. A database was constructed using high school achievement test data where a…
Descriptors: Achievement Tests, Cutting Scores, High Schools, Multiple Choice Tests
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – ETS Research Report Series, 2006
This study addresses the sample error and linking bias that occur with small and unrepresentative samples in a non-equivalent groups anchor test (NEAT) design. We propose a linking method called the "synthetic function," which is a weighted average of the identity function (the trivial equating function for forms that are known to be…
Descriptors: Equated Scores, Sample Size, Test Items, Statistical Bias
Cobern, William W. – 1986
This computer program, written in BASIC, performs three different calculations of test reliability: (1) the Kuder-Richardson method; (2) the "common split-half" method; and (3) the Rulon-Guttman split-half method. The program reads sequential access data files for microcomputers that have been set up by statistical packages such as…
Descriptors: Computer Software, Difficulty Level, Educational Research, Equations (Mathematics)
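Note: the split-half calculations named in the abstract above can be illustrated with a minimal Python sketch. This is not the BASIC program described in the record. It assumes dichotomously scored items (1 = correct, 0 = incorrect), an odd-even item split, and population variances; the function name `split_half_reliability` and the sample data are hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance


def split_half_reliability(responses):
    """Return split-half reliability estimates for a 0/1 score matrix.

    responses: list of rows, one row per examinee, one column per item.
    Assumption: halves are formed by an odd-even item split.
    """
    half_a = [sum(row[0::2]) for row in responses]  # items 1, 3, 5, ...
    half_b = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    totals = [a + b for a, b in zip(half_a, half_b)]
    diffs = [a - b for a, b in zip(half_a, half_b)]

    var_a = pvariance(half_a)
    var_b = pvariance(half_b)
    var_total = pvariance(totals)
    var_diff = pvariance(diffs)
    if var_a == 0 or var_b == 0 or var_total == 0:
        raise ValueError("scores must vary across examinees")

    # Pearson correlation between the two half-test scores.
    cov = fmean(a * b for a, b in zip(half_a, half_b)) - fmean(half_a) * fmean(half_b)
    r_halves = cov / sqrt(var_a * var_b)

    return {
        # Half-test correlation stepped up with the Spearman-Brown formula.
        "spearman_brown": 2 * r_halves / (1 + r_halves),
        # Rulon: one minus the ratio of difference-score to total-score variance.
        "rulon": 1 - var_diff / var_total,
        # Guttman (Flanagan) form; algebraically identical to Rulon.
        "guttman": 2 * (1 - (var_a + var_b) / var_total),
    }


# Hypothetical data: five examinees, four items.
split_half_example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
split_half_result = split_half_reliability(split_half_example)
```

For this sample data all three estimates come out at about 0.88. The Spearman-Brown value matches the other two only because the two halves happen to have equal variances here.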
Jackson, Douglas N. – 1983
Concern for enhancing construct validity of vocational interest measures provides a focus for scale construction quite distinct from that derived from a criterion-referenced strategy. Construct-oriented measurement implies: (1) substantive definitions of dimensions; (2) concern for internal consistency reliability, as well as generalizability; (3)…
Descriptors: Career Counseling, Criterion Referenced Tests, Factor Analysis, Interest Inventories
Sinnott, Loraine T. – 1982
A standard method for exploring item bias is the intergroup comparison of item difficulties. This paper describes a refinement and generalization of this technique. In contrast to prior approaches, the proposed method deletes outlying items from the formulation of a criterion for identifying items as deviant. It also extends the mathematical…
Descriptors: College Entrance Examinations, Difficulty Level, Higher Education, Item Analysis
Fuchs, Lynn; And Others – 1981
Three related studies were conducted to examine the effects of variations in procedures used for curriculum-based assessment of reading proficiency: the first addressed the question of the influence of sample duration on the concurrent validity of the measure; the second addressed the question of the influence of sample duration on the level,…
Descriptors: Elementary Education, Item Banks, Learning Disabilities, Reading Ability
Linn, Robert – 1978
A series of studies on conceptual and design problems in competency-based measurements are explained. The concept of validity within the context of criterion-referenced measurement is reviewed. The authors believe validation should be viewed as a process rather than an end product. It is the process of marshalling evidence to support…
Descriptors: Criterion Referenced Tests, Item Analysis, Item Sampling, Test Bias
Weiten, Wayne – 1979
Two different formats for multiple-choice test items were compared in an experimental test given in a college class in introductory psychology. In one format, a question or incomplete statement was followed by four answers or completions, only one of which was correct. In the other format, the double multiple-choice version, the same questions…
Descriptors: Difficulty Level, Higher Education, Item Analysis, Multiple Choice Tests
LeBlanc, John; And Others – 1977
MeasureMetric is a school television/film series for fifth and sixth grade students which presents measurement and the metric system. The main goal of this test development effort was to produce a valid and reliable test usable by classroom teachers for measuring children's achievement of the objectives of the twelve 15-minute programs dealing…
Descriptors: Achievement Tests, Educational Objectives, Educational Television, Elementary Education
Lenke, Joanne M.; And Others – 1977
To investigate the effect of violating the assumption of equal item difficulty on the Kuder-Richardson (KR) Formula 21 reliability coefficient, 670 eighth- and ninth-grade students were administered 26 short, homogeneous "tests" of mathematics concepts and skills. Both KR Formula 20 and KR Formula 21 were used to estimate reliability on each…
Descriptors: Comparative Analysis, Diagnostic Tests, Difficulty Level, Item Analysis
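Note: the contrast between the two formulas in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the study. It assumes dichotomously scored items and population variances, and the function names and sample data are hypothetical. KR-20 uses each item's proportion correct, while KR-21 uses only the mean and variance of total scores, which amounts to assuming that all items are equally difficult.

```python
from statistics import fmean, pvariance


def kr20(responses):
    """KR-20 for a 0/1 score matrix (rows = examinees, columns = items)."""
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    if var_total == 0:
        raise ValueError("total scores must vary across examinees")
    # Sum of p * q over items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(n_items):
        p = fmean(row[j] for row in responses)
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


def kr21(responses):
    """KR-21: needs only the mean and variance of the total scores."""
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = fmean(totals)
    var_total = pvariance(totals)
    if var_total == 0:
        raise ValueError("total scores must vary across examinees")
    return (n_items / (n_items - 1)) * (
        1 - mean_total * (n_items - mean_total) / (n_items * var_total)
    )


# Hypothetical data: five examinees, four items of unequal difficulty.
kr_example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
kr20_value = kr20(kr_example)
kr21_value = kr21(kr_example)
```

With these sample data KR-20 is about 0.80 and KR-21 about 0.67. KR-21 can never exceed KR-20, and the two agree only when every item has the same difficulty.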

