And the year has barely started!
RT @MishaLaskin@twitter.com
In-context RL at scale. After online pre-training, the agent solves new tasks entirely in-context like an LLM and works in a complex domain. One of the most interesting RL results of the year. https://twitter.com/FeryalMP/status/1616035293064462338
: https://twitter.com/MishaLaskin/status/1616066421582176258