@janeadams @jfpuget a bit of context for the riddle:
The student uses a pretrained BERT as the text vectorizer for their PyTorch model's input; they are not trying to continue training BERT. When they initialize their NN, they set the BERT model to evaluation mode (assume they also did everything @jfpuget suggested). Then they call their own model's .train() and print the BERT token ids and embeddings inside .forward(): same ids across runs, but different embeddings.
How can this happen?
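For anyone who wants to poke at it, here's a minimal sketch of how I picture the student's setup (the class name, the classifier head, and the frozen-parameter loop are my assumptions for illustration, not their actual code):

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class StudentModel(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.bert.eval()                      # BERT used only as a frozen vectorizer
        for p in self.bert.parameters():      # assuming grads are also disabled,
            p.requires_grad = False           # per @jfpuget's suggestions
        self.head = nn.Linear(self.bert.config.hidden_size, 2)

    def forward(self, input_ids, attention_mask):
        print(input_ids)                      # same token ids in every run
        with torch.no_grad():
            emb = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask).last_hidden_state
        print(emb)                            # ...yet different embeddings per run
        return self.head(emb[:, 0])

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = StudentModel()
model.train()                                 # the student's own .train() call
batch = tokenizer(["a test sentence"], return_tensors="pt")
model(batch["input_ids"], batch["attention_mask"])
```

Run it a couple of times and you should be able to reproduce the symptom the student saw.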
@janeadams @jfpuget I’ll probably give it one more day before sharing the (perhaps somewhat disappointing?) current solution tomorrow. So cast your answers today!