Publication Date
| Date Range | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 1 |
| Since 2007 (last 20 years) | 1 |
Descriptor
| Descriptor | Count |
| --- | --- |
| Mathematical Models | 46 |
| Test Items | 46 |
| Test Reliability | 35 |
| Item Analysis | 23 |
| Difficulty Level | 18 |
| Test Construction | 18 |
| Latent Trait Theory | 15 |
| Error of Measurement | 14 |
| Statistical Analysis | 14 |
| Comparative Analysis | 11 |
| Criterion Referenced Tests | 10 |
Source
| Source | Count |
| --- | --- |
| Educational and Psychological… | 4 |
| Applied Psychological… | 2 |
| Journal of Educational… | 2 |
| Journal of Educational and… | 2 |
| Applied Measurement in… | 1 |
| Multivariate Behavioral… | 1 |
| Psychometrika | 1 |
Author
| Author | Count |
| --- | --- |
| Reckase, Mark D. | 4 |
| Benson, Jeri | 2 |
| Douglass, James B. | 2 |
| Feldt, Leonard S. | 2 |
| Gustafsson, Jan-Eric | 2 |
| Patience, Wayne M. | 2 |
| Wilcox, Rand R. | 2 |
| Ackerman, Terry A. | 1 |
| Algina, James | 1 |
| Cohen, Allan S. | 1 |
| Armstrong, Ronald D. | 1 |
Audience
| Audience | Count |
| --- | --- |
| Researchers | 4 |
Location
| Location | Count |
| --- | --- |
| Australia | 1 |
| Florida | 1 |
| Georgia | 1 |
| South Carolina | 1 |
| Taiwan | 1 |
Assessments and Surveys
| Assessment | Count |
| --- | --- |
| Comprehensive Tests of Basic… | 1 |
| Minnesota Multiphasic… | 1 |
| SAT (College Admission Test) | 1 |
| School and College Ability… | 1 |
| Stanford Binet Intelligence… | 1 |
Wheeler, Jordan M.; Cohen, Allan S.; Wang, Shiyu – Journal of Educational and Behavioral Statistics, 2024
Topic models are mathematical and statistical models used to analyze textual data. Their objective is to gain information about the latent semantic space of a set of related textual data: the relationships between documents and words, and how the words are used. Topic models are becoming…
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability
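The abstract above doesn't name an implementation; as a minimal sketch of the technique it describes, latent Dirichlet allocation (LDA) fitted with scikit-learn on a toy corpus could look like the following. The corpus, topic count, and all parameter choices here are illustrative assumptions, not details from the cited study.

```python
# Minimal topic-model sketch (assumes scikit-learn; the corpus and
# hyperparameters are illustrative, not from the cited paper).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the test items measure reading ability",
    "item difficulty and discrimination were estimated",
    "raters scored the constructed response items",
]

# Document-term matrix: rows are documents, columns are word counts.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

# Fit a 2-topic LDA model; each topic is a distribution over words and
# each document a mixture of topics (the "latent semantic space").
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)  # shape: (n_docs, n_topics)

vocab = vectorizer.get_feature_names_out()
for topic_idx, word_weights in enumerate(lda.components_):
    top = word_weights.argsort()[-3:][::-1]
    print(f"topic {topic_idx}: {[vocab[i] for i in top]}")
```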
van der Linden, Wim J. – 1982
A latent trait method is presented to investigate the possibility that Angoff or Nedelsky judges specify inconsistent probabilities in standard setting techniques for objectives-based instructional programs. It is suggested that judges frequently specify a low probability of success for an easy item but a high probability for a hard item. The…
Descriptors: Criterion Referenced Tests, Cutting Scores, Error of Measurement, Interrater Reliability
Armstrong, Ronald D.; Jones, Douglas H.; Wang, Zhaobo – Journal of Educational and Behavioral Statistics, 1998
Generating a test from an item bank using a criterion based on classical test theory parameters poses considerable problems. A mathematical model is formulated that maximizes the reliability coefficient alpha, subject to logical constraints on the choice of items. Theorems ensuring appropriate application of the Lagrangian relaxation techniques are…
Descriptors: Item Banks, Mathematical Models, Reliability, Test Construction
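The abstract doesn't reproduce the mathematical program, and the sketch below substitutes a naive greedy heuristic for the paper's Lagrangian relaxation approach; it only illustrates the objective (coefficient alpha as a function of the chosen item set) on fabricated data.

```python
# Greedy sketch of alpha-maximizing item selection (an illustrative
# simplification; the cited paper solves a formal mathematical program
# via Lagrangian relaxation, not this heuristic).
import numpy as np

def cronbach_alpha(scores):
    """scores: (n_examinees, n_items) 0/1 matrix."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
bank = (rng.random((200, 30)) < 0.6).astype(int)  # fake item-bank responses

selected, remaining = [0, 1], list(range(2, 30))
while len(selected) < 10:  # target test length
    # Add whichever remaining item yields the highest alpha.
    best = max(remaining, key=lambda j: cronbach_alpha(bank[:, selected + [j]]))
    selected.append(best)
    remaining.remove(best)

print(sorted(selected), round(cronbach_alpha(bank[:, selected]), 3))
```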
Hsu, Louis M. – Multivariate Behavioral Research, 1992
D.V. Budescu and J.L. Rogers (1981) proposed a method of adjusting correlations of scales to eliminate spurious components resulting from the overlapping of scales. Three reliability correction formulas based on more tenable assumptions are derived in this article. (SLD)
Descriptors: Correlation, Equations (Mathematics), Mathematical Models, Personality Measures
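Hsu's three formulas aren't given in the abstract. As background on reliability-based corrections of this general kind, the classical correction for attenuation adjusts an observed correlation $r_{XY}$ using the reliabilities $r_{XX}$ and $r_{YY}$ of the two scales:

$$r_{T_X T_Y} = \frac{r_{XY}}{\sqrt{r_{XX}\, r_{YY}}}.$$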
Reuterberg, Sven-Eric; Gustafsson, Jan-Eric – Educational and Psychological Measurement, 1992
The use of confirmatory factor analysis by the LISREL program is demonstrated as an assumption-testing method when computing reliability coefficients under different model assumptions. Results indicate that reliability estimates are robust against departure from the assumption of parallelism of test items. (SLD)
Descriptors: Equations (Mathematics), Estimation (Mathematics), Mathematical Models, Robustness (Statistics)
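The abstract doesn't state which coefficient was computed; one standard reliability estimate under the one-factor (congeneric) measurement models that LISREL fits, often called McDonald's omega, is

$$\omega = \frac{\left(\sum_i \lambda_i\right)^2}{\left(\sum_i \lambda_i\right)^2 + \sum_i \theta_{ii}},$$

where $\lambda_i$ are the factor loadings and $\theta_{ii}$ the residual variances; parallelism is the special case in which all loadings and all residual variances are equal.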
Wang, Wen-chung – 1997
Traditional approaches to the investigation of the objectivity of ratings for constructed-response items are based on classical test theory, which is item-dependent and sample-dependent. Item response theory overcomes this drawback by decomposing item difficulties into genuine difficulties and rater severity. In so doing, objectivity of ability…
Descriptors: College Entrance Examinations, Constructed Response, Foreign Countries, Interrater Reliability
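The abstract doesn't give the exact model, but a decomposition of the kind it describes, consistent with a many-facet Rasch formulation, writes the probability that person $n$ answers item $i$ correctly under rater $j$ as

$$P(X_{nij} = 1) = \frac{\exp(\theta_n - b_i - c_j)}{1 + \exp(\theta_n - b_i - c_j)},$$

where $\theta_n$ is ability, $b_i$ the item's genuine difficulty, and $c_j$ the rater's severity.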
Kane, Michael; Moloney, James – Applied Psychological Measurement, 1978
The answer-until-correct (AUC) procedure requires that examinees respond to a multiple-choice item until they answer it correctly. Using a modified version of Horst's model for examinee behavior, this paper compares the effect of guessing on item reliability for the AUC procedure and the zero-one scoring procedure. (Author/CTM)
Descriptors: Guessing (Tests), Item Analysis, Mathematical Models, Multiple Choice Tests
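The modified Horst model is not reproduced in the abstract; as generic context for the comparison, zero-one scoring with blind guessing among $m$ response options is commonly modeled as

$$P(\text{correct}) = p + \frac{1 - p}{m},$$

where $p$ is the probability the examinee knows the answer, while AUC scoring instead grades on the number of attempts needed to reach the keyed option.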
Feldt, Leonard S. – Educational and Psychological Measurement, 1984
The binomial error model includes form-to-form difficulty differences as error variance and leads to Kuder-Richardson formula 21 as an estimate of reliability. If the form-to-form component is removed from the estimate of error variance, the binomial model leads to KR-20 as the reliability estimate. (Author/BW)
Descriptors: Achievement Tests, Difficulty Level, Error of Measurement, Mathematical Formulas
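For reference, for a $k$-item test with item proportions correct $p_i$, total-score mean $\bar{X}$, and total-score variance $s_X^2$, the two estimates are

$$\text{KR-20} = \frac{k}{k-1}\left(1 - \frac{\sum_i p_i(1 - p_i)}{s_X^2}\right), \qquad \text{KR-21} = \frac{k}{k-1}\left(1 - \frac{\bar{X}(k - \bar{X})}{k\, s_X^2}\right),$$

KR-21 being the special case of KR-20 that treats all items as equally difficult.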
Huynh, Huynh; Saunders, Joseph C. – Journal of Educational Measurement, 1980
Single administration (beta-binomial) estimates for the raw agreement index p and the corrected-for-chance kappa index in mastery testing are compared with those based on two test administrations in terms of estimation bias and sampling variability. Bias is about 2.5 percent for p and 10 percent for kappa. (Author/RL)
Descriptors: Comparative Analysis, Error of Measurement, Mastery Tests, Mathematical Models
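For reference, with $p$ the raw proportion of consistent mastery/nonmastery classifications and $p_c$ the proportion of agreement expected by chance, the corrected-for-chance index is

$$\kappa = \frac{p - p_c}{1 - p_c}.$$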
Samejima, Fumiko – 1990
Test validity is a concept that has often been ignored in the context of latent trait models and in modern test theory, particularly as it relates to computerized adaptive testing. Some considerations about the validity of a test and of a single item are proposed. This paper focuses on measures that are population-free and that will provide local…
Descriptors: Adaptive Testing, Computer Assisted Testing, Equations (Mathematics), Item Response Theory
Divgi, D. R. – 1978
One aim of criterion-referenced testing is to classify an examinee without reference to a norm group; therefore, any statements about the dependability of such classification ought to be group-independent also. A population-independent index is proposed in terms of the probability of incorrect classification near the cutoff true score. The…
Descriptors: Criterion Referenced Tests, Cutting Scores, Difficulty Level, Error of Measurement
Cobern, William W. – 1986
This computer program, written in BASIC, performs three different calculations of test reliability: (1) the Kuder-Richardson method; (2) the "common split-half" method; and (3) the Rulon-Guttman split-half method. The program reads sequential access data files for microcomputers that have been set up by statistical packages such as…
Descriptors: Computer Software, Difficulty Level, Educational Research, Equations (Mathematics)
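The BASIC source itself isn't reproduced here; a compact Python sketch of the same three estimates, using the standard textbook formulas rather than Cobern's code, might be:

```python
# Three classical reliability estimates (textbook formulas; this is a
# sketch, not a port of Cobern's BASIC program).
import numpy as np

def kr20(scores):
    """Kuder-Richardson formula 20 for a 0/1 score matrix."""
    k = scores.shape[1]
    p = scores.mean(axis=0)
    return k / (k - 1) * (1 - (p * (1 - p)).sum() / scores.sum(axis=1).var(ddof=1))

def split_half(scores):
    """'Common' split-half: odd-even correlation, Spearman-Brown stepped up."""
    odd, even = scores[:, 0::2].sum(axis=1), scores[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)

def rulon_guttman(scores):
    """Rulon-Guttman split-half: 1 - var(half difference) / var(total)."""
    odd, even = scores[:, 0::2].sum(axis=1), scores[:, 1::2].sum(axis=1)
    return 1 - (odd - even).var(ddof=1) / (odd + even).var(ddof=1)

data = (np.random.default_rng(1).random((50, 20)) < 0.7).astype(int)
print(kr20(data), split_half(data), rulon_guttman(data))
```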
Kolen, Michael J.; Whitney, Douglas R. – 1978
The application of latent trait theory to classroom tests necessitates the use of small sample sizes for parameter estimation. Computer generated data were used to assess the accuracy of estimation of the slope and location parameters in the two parameter logistic model with fixed abilities and varying small sample sizes. The maximum likelihood…
Descriptors: Difficulty Level, Item Analysis, Latent Trait Theory, Mathematical Models
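For reference, the two-parameter logistic model under study gives the probability of a correct response to item $i$ as

$$P_i(\theta) = \frac{1}{1 + \exp[-a_i(\theta - b_i)]},$$

with $a_i$ the slope (discrimination) and $b_i$ the location (difficulty) parameter, both of which must be estimated from the sample.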
van der Linden, Wim J. – Journal of Educational Measurement, 1982
An ignored aspect of standard setting, namely the possibility that Angoff or Nedelsky judges specify inconsistent probabilities (e.g., low probabilities for easy items but high probabilities for hard items), is explored. A latent trait method is proposed to estimate such misspecifications, and an index of consistency is defined. (Author/PN)
Descriptors: Cutting Scores, Latent Trait Theory, Mastery Tests, Mathematical Models
Feldt, Leonard S. – Applied Measurement in Education, 1993
The recommendation that the reliability of multiple-choice tests will be enhanced if the distribution of item difficulties is concentrated at approximately 0.50 is reinforced and extended in this article by viewing the 0/1 item scoring as a dichotomization of an underlying normally distributed ability score. (SLD)
Descriptors: Ability, Difficulty Level, Guessing (Tests), Mathematical Models
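The variance argument behind the 0.50 recommendation is simple: a 0/1-scored item with proportion correct $p_i$ has variance

$$\sigma_i^2 = p_i(1 - p_i),$$

which is maximized at $p_i = 0.5$, so items near that difficulty contribute the most score variance; Feldt's treatment refines this by viewing the 0/1 score as a dichotomization of a normally distributed ability.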

