Reward Potentials for Planning with Learned Neural Network Transition Models
Authors
Say, Buser
Sanner, Scott
Thiebaux, Sylvie
Publisher
Springer
Abstract
Optimal planning with respect to learned neural network (NN) models in continuous action and state spaces using mixed-integer linear programming (MILP) is a challenging task for branch-and-bound solvers due to the poor linear relaxation of the underlying MILP model. For a given set of features, potential heuristics provide an efficient framework for computing bounds on cost (reward) functions. In this paper, we model the problem of finding optimal potential bounds for learned NN models as a bilevel program, and solve it using a novel finite-time constraint generation algorithm. We then strengthen the linear relaxation of the underlying MILP model by introducing constraints to bound the reward function based on the precomputed reward potentials. Experimentally, we show that our algorithm efficiently computes reward potentials for learned NN models, and that the overhead of computing reward potentials is justified by the overall strengthening of the underlying MILP model for the task of planning over long horizons.
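To make the "poor linear relaxation" concrete, the following minimal sketch looks at a single ReLU unit under the standard big-M MILP encoding. The encoding is a common textbook construction and is assumed here for illustration; it is not necessarily the exact formulation used in the paper. The function names and the bounds `lo` and `hi` are hypothetical. When the binary indicator is relaxed to the interval [0, 1], the largest output the relaxation allows at pre-activation `a` is `hi * (a - lo) / (hi - lo)`, so looser bounds give a larger gap to the true ReLU value.

```python
def relu(a: float) -> float:
    """Exact ReLU activation."""
    return max(0.0, a)


def relaxed_upper(a: float, lo: float, hi: float) -> float:
    """Largest output permitted by the LP relaxation of the big-M ReLU
    encoding, given pre-activation a and assumed bounds lo < 0 < hi."""
    if not (lo < 0.0 < hi and lo <= a <= hi):
        raise ValueError("expected lo < 0 < hi and lo <= a <= hi")
    return hi * (a - lo) / (hi - lo)


def relaxation_gap(a: float, lo: float, hi: float) -> float:
    """How far the relaxation can overestimate the true ReLU output."""
    return relaxed_upper(a, lo, hi) - relu(a)


# Same pre-activation, tight versus loose bounds (illustrative values only).
print(relaxation_gap(0.0, -1.0, 1.0))    # 0.5
print(relaxation_gap(0.0, -10.0, 10.0))  # 5.0
```

In this toy setting, tightening the bounds by a factor of ten shrinks the overestimate by the same factor. This is only an analogy for why adding valid bounding constraints can strengthen a relaxation; the reward potentials described in the abstract bound the reward function rather than individual activations.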
Restricted until
2099-12-31