Which B/16 reigns supreme? I've recently fine-tuned quite a few new ViT models and wanted to compare them. With new multi-weight support on the way, I realized timm will soon have ~20 different B/16 models (or close to it). B/16 is the most common ViT model and the easiest to compare across a wide range of pretrain datasets and methods. In the lead is BEiT v2, but hot on its heels are fine-tuned LAION-2B and OpenAI CLIP image towers. Check out a notebook at https://colab.research.google.com/drive/12u1csH7_Uun78lGti35zvi5-S6FX4ZKu?usp=sharing #CV #machinelearning #vit #AI
This is also my first time experimenting with ImageNet-X (https://facebookresearch.github.io/imagenetx/site/home). There's a lot to unpack here, but I hope to use it to explore more timm models soon... #imagenet
And finally, many of these models are already in timm, but the CLIP image tower weights are currently being added. Check out the progress in https://github.com/rwightman/pytorch-image-models/pull/1520 and watch the #huggingface hub at https://huggingface.co/timm