Mattia Rigotti

This #DeepRL paper from the University of Alberta seems quite cool:

"Deep reinforcement learning without experience replay, target networks, or batch updates"

As the title says, they succeeded in training deep RL networks in a streaming setting, getting rid of replay buffers.
The main tricks that make this work seem to be signal normalization and bounding the step size 🤯

💻 Code: http://github.com/mohmdelsayed/streaming-drl
📄 Paper: https://openreview.net/pdf?id=yqQJGTDGXN

#AI #RL #DeepLearning
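
Since the post is light on detail, here is a minimal sketch of what those two tricks could look like for a linear TD learner. The names `RunningNorm`, `bounded_td_step`, and the `kappa` safety factor are my own, and the overshoot bound is my reading of the paper's step-size-bounding idea, so treat this as an illustration rather than the repo's actual implementation:

```python
import numpy as np

class RunningNorm:
    """Welford-style online mean/variance estimate, used to normalize
    streaming observations (the "signal normalization" trick)."""
    def __init__(self, shape):
        self.mean = np.zeros(shape)
        self.var = np.ones(shape)
        self.count = 0

    def normalize(self, x):
        self.count += 1
        d = x - self.mean
        self.mean = self.mean + d / self.count
        self.var = self.var + (d * (x - self.mean) - self.var) / self.count
        return (x - self.mean) / (np.sqrt(self.var) + 1e-8)

def bounded_td_step(w, z, delta, alpha=1.0, kappa=2.0):
    """One TD update with a bound on the effective step size
    (my reading of the paper's idea; kappa is a safety factor).

    w: weights, z: eligibility trace, delta: scalar TD error."""
    delta_bar = max(abs(delta), 1.0)
    # Upper bound on how much this update could move the prediction.
    m = alpha * kappa * delta_bar * np.abs(z).sum()
    # Shrink the step size whenever that bound exceeds 1.
    effective_alpha = alpha / max(m, 1.0)
    return w + effective_alpha * delta * z

# Toy usage on random streaming data (linear value function for brevity).
rng = np.random.default_rng(0)
norm = RunningNorm(4)
w = np.zeros(4)
for _ in range(10):
    obs = norm.normalize(rng.normal(size=4))
    z = obs                   # eligibility trace with lambda = 0
    delta = 1.0 - w @ obs     # TD error toward a dummy target of 1
    w = bounded_td_step(w, z, delta, alpha=0.5)
print(w)
```

As I understand it, the appeal of the bound is that no single online update can overshoot the TD target, which is part of what lets the method do without replay buffers and target networks.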