ERIC Number: ED406996
Record Type: Non-Journal
Publication Date: 1997-Apr-16
Pages: 35
Abstractor: N/A
ISBN: N/A
ISSN: N/A
EISSN: N/A
Available Date: N/A
Do We Still Need Controlled Vocabulary? Of Course, We Do! But How Do We Get It: The Roles for Text Analysis Softwares.
Greenfield, Rich
The author argues that traditional library cataloging (MARC) and the online public access catalog (OPAC) are in collision with the world of the Internet because items in electronic formats undergo MARC cataloging only on a very selective basis. The library profession also initially isolated itself from World Wide Web development by predicting no real need for universal access, by ignoring large areas of human creativity, and by de-emphasizing "ephemeral" resources. This paper recommends a constructive merger of the best of both worlds: the full-text analysis provided by web search engines and the controlled vocabularies found in library OPACs. The Congressional Research Service (CRS) is being used as a testbed to examine relevant techniques. Three of the major text analysis technologies are natural language processing, case-based reasoning, and adaptive learning. As part of "the new OPAC," the Experimental Search System (ESS) is one of the Library of Congress's first efforts to make selected cataloging and digital library resources available over the World Wide Web through a single point-and-click interface. Perhaps even more promising is the idea of using large MARC databases to generate word clusters associated with controlled vocabulary terms and classifications. Six commercial text analysis software products are reviewed in a comparative table in the Appendix. These tools, many of them associated with major search engine vendors, may support automatic classification and document analysis, thereby increasing cataloger productivity. (AEF)
Publication Type: Reports - Evaluative; Speeches/Meeting Papers
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: Library of Congress, Washington, DC. Congressional Research Service.
Grant or Contract Numbers: N/A
Author Affiliations: N/A