Classical Planning Algorithms on the Atari Video Games
Date
2015
Authors
Lipovetzky, Nir
Ramirez, Miguel
Geffner, Hector
Journal Title
Journal ISSN
Volume Title
Publisher
AAAI Press
Abstract
The Atari 2600 games supported in the Arcade Learning Environment [Bellemare et al., 2013] all feature a known initial (RAM) state and actions that have deterministic effects. Classical planners, however, cannot be used off-the-shelf as there is no compact PDDL-model of the games, and action effects and goals are not known a priori. Indeed, there are no explicit goals, and the planner must select actions on-line while interacting with a simulator that returns successor states and rewards. None of this precludes the use of blind lookahead algorithms for action selection like breadth-first search or Dijkstra's yet such methods are not effective over large state spaces. We thus turn to a different class of planning methods introduced recently that have been shown to be effective for solving large planning problems but which do not require prior knowledge of state transitions, costs (rewards) or goals. The empirical results over 54 Atari games show that the simplest such algorithm performs at the level of UCT, the state-of-the-art planning method in this domain, and suggest the potential of width-based methods for planning with simulators when factored, compact action models are not available
Description
Keywords
Citation
Collections
Source
Classical Planning Algorithms on the Atari Video Games
Type
Conference paper
Book Title
Entity type
Access Statement
License Rights
DOI
Restricted until
2037-12-31