Harter, Stephen P. – Journal of the American Society for Information Science, 1975 (peer reviewed)
Confirms previously published research in concluding that specialty words tend to possess frequency distributions which cannot be described by a single Poisson distribution. (Author/PF)
Descriptors: Automatic Indexing, Indexing, Keywords, Mathematical Models
Harter, Stephen P. – Journal of the American Society for Information Science, 1975 (peer reviewed)
A probabilistic model of keyword indexing is outlined, and some of the consequences of the model are examined. An algorithm defining a measure of indexability is developed--a measure intended to reflect the relative significance of words in documents. (Author)
Descriptors: Algorithms, Automatic Indexing, Indexing, Mathematical Models
White, Lee J.; And Others – 1975
The major advantage of sequential classification, a technique for automatically classifying documents into previously selected categories, is that the entire document need not be processed before it is classified. This method assumes the availability of a priori categories, a selection of keywords representative of these categories, and the a…
Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification
Kar, B. Gautam; White, Lee J. – 1975
The feasibility of using a distance measure, called the Bayesian distance, for automatic sequential document classification was studied. Results indicate that, by observing the variation of this distance measure as keywords are extracted sequentially from a document, the occurrence of noisy keywords may be detected. This property of the distance…
Descriptors: Algorithms, Automatic Indexing, Bayesian Statistics, Classification


