Caching of popular content during off-peak hours is a strategy to reduce network loads during peak hours. Recent work has shown significant benefits of designing such caching strategies not only to deliver part of the content locally, but also to provide coded multicasting opportunities even among users with different demands. Exploiting both of these gains was shown to be approximately optimal for caching systems with a single layer of caches. Motivated by practical scenarios, we consider in this work a hierarchical content delivery network with two layers of caches. We propose a new caching scheme that combines two basic approaches. The first approach provides coded multicasting opportunities within each layer; the second approach provides coded multicasting opportunities across multiple layers. By striking the right balance between these two approaches, we show that the proposed scheme achieves the optimal communication rates to within a constant multiplicative and additive gap. We further show that there is no tension between the rates in each of the two layers up to the aforementioned gap. Thus, both layers can simultaneously operate at approximately the minimum rate.