The Archaeotools project: faceted classification and natural language processing in an archaeological context

S. Jeffrey, J. Richards, F. Ciravegna, S. Waller, S. Chapman, Z. Zhang

Research output: Contribution to journal › Article › peer-review

Abstract

This paper describes 'Archaeotools', a major e-Science project in archaeology. The aim of the project is to use faceted classification and natural language processing to create an advanced infrastructure for archaeological research. The project aims to integrate over 1 × 10^6 structured database records referring to archaeological sites and monuments in the UK, with information extracted from semi-structured grey literature reports, and unstructured antiquarian journal accounts, in a single faceted browser interface. The project has illuminated the variable level of vocabulary control and standardization that currently exists within national and local monument inventories. Nonetheless, it has demonstrated that the relatively well-defined ontologies and thesauri that exist in archaeology mean that a high level of success can be achieved using information extraction techniques. This has great potential for unlocking and making accessible the information held in grey literature and antiquarian accounts, and has lessons for allied disciplines.

Original language: English
Pages (from-to): 2507-2519
Number of pages: 13
Journal: Philosophical Transactions of the Royal Society of London. Series A, Mathematical and Physical Sciences
Volume: 367
Issue number: 1897
Publication status: Published - 28 Jun 2009

Bibliographical note

One contribution of 16 to a Theme Issue ‘Crossing boundaries: computational science, e-Science and global e-Infrastructure I. Selected papers from the UK e-Science All Hands Meeting 2008’.

Keywords

  • archaeology
  • grey literature
  • faceted classification
  • information extraction
  • natural language processing
