ERIC - Search Results

Descriptor

Information Retrieval	7
Mathematical Formulas	7
Statistical Analysis	7
Databases	3
Comparative Analysis	2
Models	2
Relevance (Information…	2
Subject Index Terms	2
Tables (Data)	2
Automatic Indexing	1
Bayesian Statistics	1
Bibliographic Databases	1
Bibliometrics	1
Citations (References)	1
Cystic Fibrosis	1
Evaluation Methods	1
Expert Systems	1
Full Text Databases	1
Hypermedia	1
Information Systems	1
Literature Reviews	1
Matrices	1
Measurement Techniques	1
Performance Factors	1
Probability	1

Source

Journal of the American…	4
Information Processing and…	1
Journal of Documentation	1
Journal of the American…	1

Author

Wilbur, W. John	2
Bookstein, Abraham	1
Ellis, David	1
Kopcsa, Alexander	1
Kulyukin, Vladimir	1
Losee, Robert M., Jr.	1
Nicholson, John	1
Raita, Timo	1
Schiebel, Edgar	1
Wilbur, John	1
Yang, Yiming	1

Publication Type

Journal Articles	7
Reports - Research	4
Reports - Descriptive	3
Information Analyses	2
Opinion Papers	2

Showing all 7 results

Using Corpus Statistics to Remove Redundant Words in Text Categorization.

Peer reviewed

Yang, Yiming; Wilbur, John – Journal of the American Society for Information Science, 1996

Studies aggressive automated word removal in text categorization in large databases based on corpus statistics to reduce the noise in free texts and to enhance the computational efficiency of categorization. Topics include stop word identification, categorization methods for comparison, tests on four document collections, and evaluation…

Descriptors: Comparative Analysis, Databases, Evaluation Methods, Information Retrieval

Adapting Measures of Clumping Strength To Assess Term-Term Similarity.

Peer reviewed

Bookstein, Abraham; Kulyukin, Vladimir; Raita, Timo; Nicholson, John – Journal of the American Society for Information Science and Technology, 2003

Discusses automated information retrieval, focusing on statistical patterns expected of a pair of terms that are semantically related to each other, guided by text generation conceptualization. Examines how the tendency of a content bearing term to clump, as quantified by previously developed measures of term clumping, is influenced by the…

Descriptors: Information Retrieval, Mathematical Formulas, Measurement Techniques, Semantics

Science and Technology Mapping: A New Iteration Model for Representing Multidimensional Relationships.

Peer reviewed

Kopcsa, Alexander; Schiebel, Edgar – Journal of the American Society for Information Science, 1998

Introduces a new iteration model for the calculation of co-word maps. Co-word analysis is an objective quantitative method for analyzing and integrating survey information about research trends and structures that avoids problems using statistical methods to produce mappings of reduced information. (PEN)

Descriptors: Bibliometrics, Citations (References), Information Retrieval, Mathematical Formulas

Retrieval Testing by the Comparison of Statistically Independent Retrieval Methods.

Peer reviewed

Wilbur, W. John – Journal of the American Society for Information Science, 1992

Describes a procedure for information retrieval testing that is based on the comparison of statistically independent methods of retrieval applied to the same database. The probability ranking principle is discussed, the statistical meaning of relevance is examined, and the methodology is illustrated on a large database of MEDLINE records. (19…

Descriptors: Comparative Analysis, Databases, Information Retrieval, Mathematical Formulas

On the Creation of Hypertext Links in Full-Text Documents: Measurement of Inter-Linker Consistency.

Peer reviewed

Ellis, David; And Others – Journal of Documentation, 1994

Describes a study in which several different sets of hypertext links are inserted by different people in full-text documents. The degree of similarity between the sets is measured using coefficients and topological indices. As in comparable studies of inter-indexer consistency, the sets of links used by different people showed little similarity.…

Descriptors: Full Text Databases, Hypermedia, Information Retrieval, Mathematical Formulas

Term Dependence: Truncating the Bahadur Lazarsfeld Expansion.

Peer reviewed

Losee, Robert M., Jr. – Information Processing and Management, 1994

Studies the performance of probabilistic information retrieval systems using differing statistical dependence assumptions when estimating the probabilities inherent in the retrieval model. Experimental results using the Bahadur Lazarsfeld expansion on the Cystic Fibrosis database are discussed that suggest that incorporating term dependence…

Descriptors: Cystic Fibrosis, Databases, Information Retrieval, Information Systems

Retrieval Testing with Hypergeometric Document Models.

Peer reviewed

Wilbur, W. John – Journal of the American Society for Information Science, 1993

Presents a method of modeling the relevance relationship in information retrieval to answer the question of the theoretical limits of certain statistical methods. Hypergeometric probability distribution is used to construct an abstract model of a database of MEDLINE records, and results of tests of vector retrieval methods are reported. (28…

Descriptors: Automatic Indexing, Bayesian Statistics, Bibliographic Databases, Expert Systems