Nguyen, Phuong; Sunehag, Peter; Hutter, Marcus
Following a recent surge in using history-based methods for resolving perceptual aliasing in reinforcement learning, we introduce an algorithm based on the feature reinforcement learning framework called ΦMDP. To create a practical algorithm we devise a stochastic search procedure for a class of context trees based on parallel tempering and a specialized proposal distribution. We provide the first empirical evaluation for ΦMDP. Our proposed algorithm achieves superior performance to the...