Co-attention enabled content-based image retrieval

Research output: Contribution to journal › Article › peer-review

Abstract

Content-based image retrieval (CBIR) aims to return the images most similar to a given query. Within a CBIR pipeline, feature extraction plays an essential role in retrieval performance. Current CBIR studies either extract feature information uniformly from the input image and use it directly, or employ a trainable spatial weighting module whose output is then used for similarity comparison between pairs of query and candidate matching images. These spatial weighting modules are normally not query-sensitive and rely only on knowledge learned during the training stage. They may focus on incorrect regions, especially when the target image is not salient or is surrounded by distractors. This paper proposes an efficient query-sensitive co-attention mechanism for large-scale CBIR tasks, where "co-attention" refers to spatial attention conditioned on the query content. To reduce the extra computation cost that query sensitivity adds to the co-attention mechanism, the proposed method employs clustering of the selected local features. Experimental results indicate that the co-attention maps can provide the best retrieval results on benchmark datasets under challenging situations, such as completely different image acquisition conditions between the query and its matching image.
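The idea summarised in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`kmeans`, `co_attention_descriptor`), the plain k-means routine, the max-over-centroids cosine scoring and the weighted-sum pooling are all assumptions chosen for illustration. The sketch clusters a query's local features into a few centroids, scores every spatial location of a candidate feature map against those centroids, and uses the resulting map to pool a query-conditioned descriptor. Comparing against `k` centroids rather than every query feature is one plausible way clustering could reduce the per-pair cost.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit L2 norm."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def kmeans(features, k, iters=10, seed=0):
    """Plain k-means over (n, d) features; returns (k, d) centroids.

    Illustrative stand-in for whatever clustering the paper uses.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(features), size=k, replace=False)
    centroids = features[idx].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every feature to every centroid.
        dist = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def co_attention_descriptor(query_feats, cand_map, k=4):
    """Query-conditioned spatial attention over a candidate feature map.

    query_feats: (n, d) selected local features of the query (assumed given).
    cand_map:    (h, w, d) local feature map of the candidate image.
    Returns the (h, w) attention map and a unit-norm pooled descriptor.
    """
    centroids = l2_normalize(kmeans(query_feats, k))
    h, w, d = cand_map.shape
    flat = cand_map.reshape(h * w, d)
    # Cosine similarity of each location to each query centroid: (h*w, k).
    sim = l2_normalize(flat) @ centroids.T
    # A location is attended if it resembles any centroid; negatives are cut.
    attn = np.clip(sim.max(axis=1), 0.0, None)
    attn = attn / (attn.sum() + 1e-12)
    descriptor = l2_normalize((attn[:, None] * flat).sum(axis=0))
    return attn.reshape(h, w), descriptor
```

In this sketch the attention map depends on the query, so the same candidate image yields different descriptors for different queries, which is the property the abstract contrasts with query-insensitive spatial weighting.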
Original language: English
Pages (from-to): 245-263
Number of pages: 18
Journal: Neural Networks
Volume: 164
Issue number: 7
Early online date: 8 May 2023
Publication status: Published - 1 Jul 2023

Bibliographical note

© 2023 The Author(s).
