Using Document Dimensions for Enhanced Information Retrieval

Research output: Contribution to conference (Paper)



Publication details

Date: Published - 2004
Original language: Undefined/Unknown


Conventional document search techniques are constrained by matching individual keywords or phrases against source documents. As a result, they miss documents that contain semantically similar terms and therefore achieve relatively low recall. At the same time, processing capabilities and tools for syntactic and semantic analysis of language have advanced to the point where index-time linguistic analysis of source documents is feasible. In this paper, we introduce document dimensions, a means of classifying or grouping terms discovered in documents. Using an enhanced version of Jakarta Lucene[1], we demonstrate that supplementing keyword analysis with some syntactic and semantic information can enhance the quality of information retrieval results.
