ERIC Number: ED027934
Record Type: RIE
Publication Date: 1968-Oct
Pages: 25
Abstractor: N/A
ISBN: N/A
ISSN: N/A
EISSN: N/A
Available Date: N/A
Index Simulation Feasibility and Automatic Document Classification.
Fried, J.B.; And Others
The objectives of this research were to investigate (1) the feasibility of an index simulator, and (2) the use of a sequential analysis model for the automatic classification of documents. Index simulation was studied in general for various types of indexes, and in depth for the simulation of an inverted coordinate index to a document collection having a specific distribution of maximum postings per term. The two principal lines of activity included (1) the collection of real-world data, and (2) actual index-model programming, for which two preliminary simulated programs were written and made operative. It is concluded that an inverted index can be simulated effectively, but more work must be done to ascertain the feasibility of a general index simulation model. The automatic classification model classifies documents by reading succeeding parts of a document only to the point at which classification of the document into one or more categories is achieved. Procedures for estimating and selecting needed probabilities are delineated and various document reading procedures are given. The report recommends that (1) further work on a large scale be conducted on an index simulation model, and (2) the automatic classification model become operational so that it might be compared to other existing and proposed systems. (JW)
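The sequential reading idea in the abstract (read successive parts of a document and stop once a classification is reached) can be illustrated with a minimal sketch. This is not the report's model: the function name `classify_sequentially`, the per-category term probabilities, the posterior threshold, and the smoothing floor for unseen terms are all assumptions chosen for illustration, and the sketch assigns at most one category, whereas the report allows one or more.

```python
import math


def classify_sequentially(parts, term_probs, priors, threshold=0.95):
    """Read document parts in order and stop as soon as one category's
    posterior probability reaches `threshold`.

    parts      -- list of parts, each a list of terms (hypothetical input shape)
    term_probs -- {category: {term: P(term | category)}} (assumed to be given)
    priors     -- {category: P(category)}
    Returns (category or None, number of parts read).
    """
    # Accumulate log-probabilities to avoid numerical underflow.
    log_post = {c: math.log(p) for c, p in priors.items()}
    parts_read = 0
    for part in parts:
        parts_read += 1
        for term in part:
            for c in log_post:
                # 1e-3 is an assumed floor for terms unseen in a category.
                log_post[c] += math.log(term_probs[c].get(term, 1e-3))
        # Normalise the log scores into posterior probabilities.
        peak = max(log_post.values())
        total = sum(math.exp(v - peak) for v in log_post.values())
        posterior = {c: math.exp(v - peak) / total for c, v in log_post.items()}
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            return best, parts_read  # stop reading early
    return None, parts_read  # document exhausted without a decision


# Illustrative, made-up probabilities for two categories.
priors = {"chemistry": 0.5, "mathematics": 0.5}
term_probs = {
    "chemistry": {"acid": 0.2, "proof": 0.01},
    "mathematics": {"acid": 0.01, "proof": 0.2},
}
print(classify_sequentially([["acid"], ["acid"], ["proof"]], term_probs, priors))
```

In this sketch a stricter threshold makes the reader consume more parts before deciding, which is the trade-off a sequential procedure exposes: reading cost against classification confidence.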
Descriptors: Automation, Classification, Computer Programs, Computers, Content Analysis, Coordinate Indexes, Feasibility Studies, Indexes, Indexing, Information Retrieval, Information Storage, Mathematical Models, Simulation, Statistical Analysis
Clearinghouse for Federal Scientific and Technical Information, Springfield, Va. 22151 (PB 182 597, MF $.65, HC $3.00)
Publication Type: N/A
Education Level: N/A
Audience: N/A
Language: N/A
Sponsor: National Science Foundation, Washington, DC.
Authoring Institution: Ohio State Univ., Columbus. Computer and Information Science Research Center.
Grant or Contract Numbers: N/A
Author Affiliations: N/A