Provably Convergent Policy Optimization via Metric-aware Trust Region Methods
Jun Song, Niao He, Lijun Ding, Chaoyue Zhao
Action editor: Amir-massoud Farahmand.
https://openreview.net/forum?id=jkTqJJOGMS