Learning Sequential Descriptors for Sequence-based Visual Place Recognition
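Need a robust sequential descriptor? Use (Net)VLAD: patch descriptors can be stacked temporally (across the frames of a sequence) as well as spatially (the default, within a single frame), so one aggregation step yields a single descriptor for the whole sequence.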
My summary on HFPapers: https://huggingface.co/papers/2207.03868#64fdc21d8f76d450695a6941
arXiv: https://arxiv.org/abs/2207.03868
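
To make the temporal stacking idea concrete, here is a minimal NumPy sketch of VLAD aggregation over a whole sequence. It is an illustration under assumptions, not the paper's implementation: the function name `seq_vlad`, the `alpha` sharpness parameter, and the distance-based soft assignment with fixed centroids are all hypothetical choices made for this example (NetVLAD learns its centroids and assignment weights end to end). The only point it tries to show is that flattening the frame axis into the patch axis lets an ordinary VLAD pooling produce one fixed-size descriptor per sequence.

```python
import numpy as np


def seq_vlad(local_desc, centroids, alpha=10.0):
    """Aggregate patch descriptors from all frames into one VLAD vector.

    local_desc: (T, N, D) array, T frames with N patch descriptors each.
    centroids:  (K, D) array of cluster centres (assumed given here;
                in NetVLAD they are learned).
    alpha:      softmax sharpness (hypothetical default).
    Returns a flat, L2-normalised vector of length K * D.
    """
    T, N, D = local_desc.shape
    # Temporal stacking: treat the T * N descriptors as a single bag.
    x = local_desc.reshape(T * N, D)

    # Residuals to every centroid, shape (T*N, K, D).
    resid = x[:, None, :] - centroids[None, :, :]

    # Soft assignment from squared distances, shape (T*N, K).
    logits = -alpha * (resid ** 2).sum(axis=-1)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    assign = np.exp(logits)
    assign /= assign.sum(axis=1, keepdims=True)

    # Assignment-weighted sum of residuals per cluster, shape (K, D).
    vlad = (assign[:, :, None] * resid).sum(axis=0)

    # Intra-normalisation per cluster, then global L2 normalisation.
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12
    vlad = vlad.ravel()
    return vlad / (np.linalg.norm(vlad) + 1e-12)


# Usage with random stand-in data: 5 frames, 16 patches, 8-dim features.
rng = np.random.default_rng(0)
sequence = rng.normal(size=(5, 16, 8))
centres = rng.normal(size=(4, 8))
descriptor = seq_vlad(sequence, centres)
print(descriptor.shape)  # (32,)
```

Because the pooling is a sum over an unordered bag, the descriptor size depends only on the number of clusters and the feature dimension, not on the sequence length, and in this sketch it does not change if the frames are reordered.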