TY - CHAP
T1 - NLP and Archaeology
T2 - A View from a Digital Archive
AU - Wright, Holly Ellen
AU - Evans, Tim
AU - Green, Katie Louise
PY - 2023/11/4
Y1 - 2023/11/4
N2 - The Archaeology Data Service (ADS) has been experimenting with Natural Language Processing (NLP) methodologies for over 12 years. As an accredited digital repository, the focus has been to explore how NLP techniques can be used to augment any basic digital object’s metadata and to begin to facilitate increased human and machine access. Thus, the words used within the ADS archive catalogue and archaeological reports have added value; they provide detail, context and understanding, but conversely, they can also be ambiguous. The NLP techniques studied go beyond allowing a user to search a PDF file, to building a classification for the user, and then continuing to improve the rules behind the method(s). While these experiments solidified our view that NLP has an important role to play in our core services, our ability to implement them in a robust way has remained elusive. This chapter presents our journey from an archaeological perspective, being useful to both researchers who wish to engage with NLP methodologies in Social Sciences and Humanities, while also giving the point of view of a trusted digital repository. Also, it reports ADS efforts to implement NLP within our collections, discussing why it remains elusive and future challenges.
KW - Archaeology Data Service
KW - Natural Language Processing
KW - Named Entity Recognition
KW - Digital Archives
KW - Metadata
U2 - 10.1007/978-3-031-37156-1_10
DO - 10.1007/978-3-031-37156-1_10
M3 - Chapter
SN - 978-3-031-37155-4
T3 - Quantitative Archaeology and Archaeological Modelling
SP - 215
EP - 228
BT - Discourse and Argumentation in Archaeology
A2 - Gonzalez-Perez, Cesar
A2 - Martin-Rodilla, Patricia
A2 - Pereira-Fariña, Martín
PB - Springer Nature Switzerland
ER -