Reviewing and evaluating Automatic Term Recognition techniques

Ioannis Korkontzelos, Ioannis P. Klapaftis, Suresh Manandhar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Automatic Term Recognition (ATR) is defined as the task of identifying domain-specific terms from technical corpora. Termhood-based approaches measure the degree to which a candidate term refers to a domain-specific concept. Unithood-based approaches measure the attachment strength of a candidate term's constituents. These methods have been evaluated using different, often incompatible evaluation schemes and datasets. This paper provides an overview and a thorough evaluation of state-of-the-art ATR methods under a common evaluation framework, i.e. corpora and evaluation method. Our contributions are two-fold: (1) We compare a number of different ATR methods, showing that termhood-based methods in general achieve superior performance. (2) We show that the number of independent occurrences of a candidate term is the most effective source for estimating term nestedness, improving ATR performance.

Original language: English
Title of host publication: ADVANCES IN NATURAL LANGUAGE PROCESSING, PROCEEDINGS
Editors: B Nordstrom, A Ranta
Place of Publication: BERLIN
Publisher: Springer
Pages: 248-259
Number of pages: 12
Volume: 5221 LNAI
ISBN (Print): 978-3-540-85286-5
Publication status: Published - 2008
Event: 6th International Conference on Natural Language Processing - Gothenburg
Duration: 25 Aug 2008 to 27 Aug 2008

Conference

Conference: 6th International Conference on Natural Language Processing
City: Gothenburg
Period: 25/08/08 to 27/08/08

Keywords

  • automatic term recognition
  • ATR
  • term extraction
