@matthen2 comparing different norm layers would be very interesting. Though unclear how to factor out the effect of lr.
Do you plan on publishing the code at some point? Want to know if I need to start working on repro or can just wait :-)