Showing all 7 results
Peer reviewed
Yang, Yiming; Wilbur, John – Journal of the American Society for Information Science, 1996
Studies aggressive automated word removal in text categorization in large databases based on corpus statistics to reduce the noise in free texts and to enhance the computational efficiency of categorization. Topics include stop word identification, categorization methods for comparison, tests on four document collections, and evaluation…
Descriptors: Comparative Analysis, Databases, Evaluation Methods, Information Retrieval
Peer reviewed
Bookstein, Abraham; Kulyukin, Vladimir; Raita, Timo; Nicholson, John – Journal of the American Society for Information Science and Technology, 2003
Discusses automated information retrieval, focusing on statistical patterns expected of a pair of terms that are semantically related to each other, guided by text generation conceptualization. Examines how the tendency of a content-bearing term to clump, as quantified by previously developed measures of term clumping, is influenced by the…
Descriptors: Information Retrieval, Mathematical Formulas, Measurement Techniques, Semantics
Peer reviewed
Kopcsa, Alexander; Schiebel, Edgar – Journal of the American Society for Information Science, 1998
Introduces a new iteration model for the calculation of co-word maps. Co-word analysis is an objective quantitative method for analyzing and integrating survey information about research trends and structures that avoids problems using statistical methods to produce mappings of reduced information. (PEN)
Descriptors: Bibliometrics, Citations (References), Information Retrieval, Mathematical Formulas
Peer reviewed
Wilbur, W. John – Journal of the American Society for Information Science, 1992
Describes a procedure for information retrieval testing that is based on the comparison of statistically independent methods of retrieval applied to the same database. The probability ranking principle is discussed, the statistical meaning of relevance is examined, and the methodology is illustrated on a large database of MEDLINE records. (19…
Descriptors: Comparative Analysis, Databases, Information Retrieval, Mathematical Formulas
Peer reviewed
Ellis, David; And Others – Journal of Documentation, 1994
Describes a study in which several different sets of hypertext links are inserted by different people in full-text documents. The degree of similarity between the sets is measured using coefficients and topological indices. As in comparable studies of inter-indexer consistency, the sets of links used by different people showed little similarity.…
Descriptors: Full Text Databases, Hypermedia, Information Retrieval, Mathematical Formulas
Peer reviewed
Losee, Robert M., Jr. – Information Processing and Management, 1994
Studies the performance of probabilistic information retrieval systems using differing statistical dependence assumptions when estimating the probabilities inherent in the retrieval model. Experimental results using the Bahadur-Lazarsfeld expansion on the Cystic Fibrosis database are discussed that suggest that incorporating term dependence…
Descriptors: Cystic Fibrosis, Databases, Information Retrieval, Information Systems
Peer reviewed
Wilbur, W. John – Journal of the American Society for Information Science, 1993
Presents a method of modeling the relevance relationship in information retrieval to answer the question of the theoretical limits of certain statistical methods. Hypergeometric probability distribution is used to construct an abstract model of a database of MEDLINE records, and results of tests of vector retrieval methods are reported. (28…
Descriptors: Automatic Indexing, Bayesian Statistics, Bibliographic Databases, Expert Systems