Context-Specific Representation Abstraction for Deep Option Learning

AAAI

Cite Paper

Authors

Matthew Riemer
Gerald Tesauro
Miao Liu
Dong-Ki Kim
Marwa Abdulhai
Jonathan How

Published on

04/24/2022

Categories

AAAI

Hierarchical reinforcement learning has focused on discovering temporally extended actions, such as options, that can provide benefits in problems requiring extensive exploration. One promising approach that learns these options end-to-end is the option-critic (OC) framework. We examine and show in this paper that OC does not decompose a problem into simpler sub-problems, but instead increases the size of the search over policy space with each option considering the entire state space during learning. This issue can result in practical limitations of this method, including sample inefficient learning. To address this problem, we introduce Context-Specific Representation Abstraction for Deep Option Learning (CRADOL), a new framework that considers both temporal abstraction and context-specific representation abstraction to effectively reduce the size of the search over policy space. Specifically, our method learns a factored belief state representation that enables each option to learn a policy over only a subsection of the state space. We test our method against hierarchical, nonhierarchical, and modular recurrent neural network baselines, demonstrating significant sample efficiency improvements in challenging partially observable environments.

Please cite our work using the BibTeX below.

@misc{https://doi.org/10.48550/arxiv.2109.09876,
  doi = {10.48550/ARXIV.2109.09876},
  
  url = {https://arxiv.org/abs/2109.09876},
  
  author = {Abdulhai, Marwa and Kim, Dong-Ki and Riemer, Matthew and Liu, Miao and Tesauro, Gerald and How, Jonathan P.},
  
  keywords = {Machine Learning (cs.LG), Artificial Intelligence (cs.AI), FOS: Computer and information sciences, FOS: Computer and information sciences},
  
  title = {Context-Specific Representation Abstraction for Deep Option Learning},
  
  publisher = {arXiv},
  
  year = {2021},
  
  copyright = {arXiv.org perpetual, non-exclusive license}
}