AM_Stark

Unified Visual Relationship Detection with Vision and Language Models

A VLM for scene understanding via visual relationship detection (VRD). It pairs a DETR-like object detector (with bounding box prediction) with a Perceiver Resampler as the relationship decoder.

My summary on HFPapers: huggingface.co/papers/2303.089
arXiv: arxiv.org/abs/2303.08998
