Abstract
We study the problem of safe learning and exploration in sequential control
problems. The goal is to safely collect data samples from an operating
environment to learn an optimal controller. A central challenge in this setting
is how to quantify uncertainty in order to choose provably safe actions that
allow us to collect useful data and reduce uncertainty, thereby achieving both
improved safety and optimality. To address this challenge, we present a deep
robust regression model that is trained to directly predict the uncertainty
bounds for safe exploration. We then show how to integrate our robust
regression approach with model-based control methods by learning a dynamic
model with robustness bounds. We derive generalization bounds under domain
shifts for learning and connect them with safety and stability bounds in
control. We demonstrate empirically that our robust regression approach can
outperform conventional Gaussian process (GP) based safe exploration in
settings where it is difficult to specify a good GP prior.