Taylor W. Killian

🎉Last week, amid the rush, we were informed that our paper (w/ Sonali Parbhoo and Marzyeh Ghassemi): "Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning" was accepted to
! 🎉

In safety-critical environments, such as healthcare, it is important to be mindful of worst-case outcomes when assessing which actions to avoid. With an estimated distribution over returns, we can use the conditional value at risk (CVaR) to characterize this (2/7)
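To make the CVaR idea concrete, here is a minimal sketch that assumes the return distribution is available as a list of sampled returns, with higher values being better. The function name `cvar_lower_tail`, the level `alpha`, and the sample values are illustrative assumptions, not taken from the paper. Under this convention, CVaR at level `alpha` is the average of the worst `alpha` fraction of outcomes.

```python
import math


def cvar_lower_tail(returns, alpha):
    """Average of the worst `alpha` fraction of sampled returns.

    Hypothetical helper for illustration: `returns` is a list of sampled
    returns (higher is better) and `alpha` is a risk level in (0, 1].
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not returns:
        raise ValueError("returns must be non-empty")
    ordered = sorted(returns)  # worst outcomes first
    # Size of the tail; the small offset guards against float round-up
    # in products such as 0.7 * 10.
    k = max(1, math.ceil(alpha * len(ordered) - 1e-12))
    return sum(ordered[:k]) / k


# Made-up samples: most outcomes are fine, a few are very bad.
sampled_returns = [-1.0, -0.8, -0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

mean_return = sum(sampled_returns) / len(sampled_returns)
cvar_20 = cvar_lower_tail(sampled_returns, alpha=0.2)

print(f"mean return: {mean_return:.2f}")
print(f"CVaR at 0.2: {cvar_20:.2f}")
```

In this toy example the mean looks mild because most samples are zero, while the CVaR averages only the two worst samples and so exposes the bad tail. Setting `alpha=1.0` recovers the ordinary mean.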

We use distributional RL to give this rich representation of possible outcomes for each action and use the CVaR to assess the risk of any action leading to a dead-end. Extending our prior work on dead-end discovery (tinyurl.com/Neurips21DeD) we introduce DistDeD! (3/7)

There are 2 immediate benefits of using CVaR and distributional RL in dead-end discovery:
1) We enable *even earlier indication* of when things may go wrong.
2) The implementation of DistDeD is tunable to account for specific aspects of the intended use case. (4/7)
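Both benefits can be illustrated with a toy sketch. This is not the DistDeD implementation: it assumes each step of a trajectory comes with a handful of quantile estimates of the return, and that a step is flagged when its CVaR drops below a threshold. The names `first_flagged_step`, `alpha`, and `threshold`, and all the numbers, are hypothetical.

```python
import math


def cvar_lower_tail(quantiles, alpha):
    """Average of the worst `alpha` fraction of quantile estimates."""
    ordered = sorted(quantiles)  # worst outcomes first
    # The small offset guards against float round-up in alpha * n.
    k = max(1, math.ceil(alpha * len(ordered) - 1e-12))
    return sum(ordered[:k]) / k


def first_flagged_step(trajectory, alpha, threshold):
    """Index of the first step whose CVaR falls below `threshold`.

    Returns None if no step is flagged. `alpha` and `threshold` are the
    two tuning knobs assumed for this sketch.
    """
    for step, quantiles in enumerate(trajectory):
        if cvar_lower_tail(quantiles, alpha) < threshold:
            return step
    return None


# Made-up quantile estimates of the return at four consecutive steps.
# Values near -1 stand for very bad outcomes, values near 0 for good ones.
trajectory = [
    [-0.2, -0.1, 0.0, 0.0, 0.0],
    [-0.9, -0.2, -0.1, 0.0, 0.0],
    [-1.0, -0.9, -0.5, -0.2, 0.0],
    [-1.0, -1.0, -0.9, -0.8, -0.6],
]

# alpha = 1.0 averages every quantile, i.e. a risk-neutral expectation.
risk_neutral = first_flagged_step(trajectory, alpha=1.0, threshold=-0.5)
# alpha = 0.2 looks only at the worst fifth of the estimates.
risk_averse = first_flagged_step(trajectory, alpha=0.2, threshold=-0.5)

print("risk-neutral flag at step:", risk_neutral)
print("risk-averse flag at step:", risk_averse)
```

With these numbers the risk-averse setting raises a flag one step before the risk-neutral one, because a single bad quantile at the second step is enough to pull the tail average below the threshold. Moving `alpha` or `threshold` changes how conservative the flagging is, which is the kind of tuning a specific use case might call for.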

The improvements we've made with DistDeD are exciting! This is one promising direction to make RL useful in the real world.

We are working with several clinical collaborators to determine how we can best use DistDeD. Lots more to come in the near future. Stay tuned! (5/7)

There are lots of people to thank. Foremost, I wanted to publicly acknowledge those whose enthusiasm and encouragement helped push me to continue building upon these ideas. Thank you Mehdi Fatemi, Marc Bellemare, Will Dabney, Vinith Suriyakumar, and Haoran Zhang. (6/7)