NLP and Archaeology: A View from a Digital Archive

Research output: Chapter in Book/Report/Conference proceeding: Chapter

Abstract

The Archaeology Data Service (ADS) has been experimenting with Natural Language Processing (NLP) methodologies for over 12 years. As an accredited digital repository, the focus has been to explore how NLP techniques can be used to augment any basic digital object’s metadata and to begin to facilitate increased human and machine access. Thus, the words used within the ADS archive catalogue and archaeological reports have added value; they provide detail, context and understanding, but conversely, they can also be ambiguous. The NLP techniques studied go beyond allowing a user to search a PDF file, to building a classification for the user, and then continuing to improve the rules behind the method(s). While these experiments solidified our view that NLP has an important role to play in our core services, our ability to implement them in a robust way has remained elusive. This chapter presents our journey from an archaeological perspective, being useful to both researchers who wish to engage with NLP methodologies in Social Sciences and Humanities, while also giving the point of view of a trusted digital repository. Also, it reports ADS efforts to implement NLP within our collections, discussing why it remains elusive and future challenges.
Original language: English
Title of host publication: Discourse and Argumentation in Archaeology
Subtitle of host publication: Conceptual and Computational Approaches. Quantitative Archaeology and Archaeological Modelling
Editors: Cesar Gonzalez-Perez, Patricia Martin-Rodilla, Martín Pereira-Fariña
Publisher: Springer Nature Switzerland
Chapter: 10
Pages: 215-228
Number of pages: 14
ISBN (Electronic): 978-3-031-37156-1
ISBN (Print): 978-3-031-37155-4
Publication status: Published - 4 Nov 2023

Publication series

Name: Quantitative Archaeology and Archaeological Modelling
Publisher: Springer Nature Switzerland
ISSN (Print): 2366-5998
ISSN (Electronic): 2366-6005

Keywords

  • Archaeology Data Service
  • Natural Language Processing
  • Named Entity Recognition
  • Digital Archives
  • Metadata
