I am not sure that everyone gets that RLHF is a form of supervised learning.
RLHF does not work without human labels (preferences) or human-curated training samples.
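
To make that concrete, here is a minimal sketch, assuming the reward model is fit with the common pairwise (Bradley-Terry style) preference loss. The linear reward model, the two-feature vectors, and the toy pairs are all hypothetical and only for illustration. The point is that the training signal is a human label saying which response in each pair was preferred, which is the shape of an ordinary supervised objective.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(r_chosen, r_rejected):
    # Negative log-likelihood that the human-preferred response gets the higher reward.
    return -math.log(sigmoid(r_chosen - r_rejected))

def reward(w, x):
    # Hypothetical linear reward model: dot product of weights and features.
    return sum(wi * xi for wi, xi in zip(w, x))

def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    # Plain gradient descent on the pairwise preference loss.
    w = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            diff = [c - r for c, r in zip(chosen, rejected)]
            margin = reward(w, diff)
            # d/dw of -log(sigmoid(margin)) is -(1 - sigmoid(margin)) * diff.
            step = 1.0 - sigmoid(margin)
            w = [wi + lr * step * di for wi, di in zip(w, diff)]
    return w

# Toy labeled data (assumed): in each pair, the first vector is the response a human preferred.
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.8, 0.1], [0.2, 0.9]),
    ([0.9, 0.3], [0.1, 0.7]),
]
weights = train_reward_model(pairs, dim=2)
```

Without the human-provided ordering inside each pair there is nothing for this loss to fit. The later policy-optimization step then optimizes against the reward model learned from those labels.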