#vlp


Went back to BLIP (arxiv.org/abs/2201.12086) last night. When I first skimmed it, I concentrated on the caption-bootstrapping part of the paper, but the "Multimodal mixture of Encoder-Decoder" architecture is pretty cool.

It's a structured architecture made up of multiple encoders and decoders, where some parts take advantage of others (e.g. the similarities computed for the contrastive loss are reused to mine hard negative examples for the image-text matching loss).
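
To make the hard example mining idea concrete, here is a minimal sketch in plain Python. It is not the paper's code: the function names, the toy similarity matrix, and the choice to sample negatives from a softmax over the similarities are assumptions for illustration. The idea is that for each image, a non-matching text that the contrastive head already scores as similar is more likely to be picked as a negative for the matching loss.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hard_negatives(sim, rng):
    """For each image i, sample a non-matching text index j != i.

    sim[i][j] is the (hypothetical) contrastive similarity between
    image i and text j; matched pairs sit on the diagonal.
    Texts with higher similarity are sampled more often, so the
    matching head sees harder negatives.
    """
    negatives = []
    for i, row in enumerate(sim):
        weights = softmax(row)
        weights[i] = 0.0  # never pick the true pair as a negative
        picked = rng.choices(range(len(row)), weights=weights, k=1)[0]
        negatives.append(picked)
    return negatives


# Toy batch of 3 image-text pairs (values are made up).
sim = [
    [0.9, 0.8, 0.1],
    [0.2, 0.9, 0.7],
    [0.6, 0.1, 0.9],
]
hard_negs = sample_hard_negatives(sim, random.Random(0))
print(hard_negs)
```

Each sampled (image, negative text) pair would then be fed to the matching head with a "no match" label, next to the true pairs labeled "match".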