Analyzing the Hidden Activations of Deep Policy Networks: Why Representation Matters

Trevor A. McInroe, Michael Spurrier, Jennifer Sieber, Stephen Conneely

We analyze the hidden activations of neural network policies of deep reinforcement learning (RL) agents and show, empirically, that it's possible to know a priori if a state representation will lend itself to fast learning. RL agents in high-dimensional states have two main learning burdens: (1) to learn an action-selection policy and (2) to learn to discern between useful and non-useful information in a given state. By learning a latent representation of these high-dimensional states with an auxiliary model, the latter burden is effectively removed, thereby leading to accelerated training progress. We examine this phenomenon across tasks in the PyBullet Kuka environment, where an agent must learn to control a robotic gripper to pick up an object. Our analysis reveals how neural network policies learn to organize their internal representation of the state space throughout training. The results from this analysis provide three main insights into how deep RL agents learn. First, a well-organized internal representation within the policy network is a prerequisite to learning good action-selection. Second, a poor initial representation can cause an unrecoverable collapse within a policy network. Third, a good initial representation allows an agent's policy network to organize its internal representation even before any training begins.

Knowledge Graph



Sign up or login to leave a comment