Network function virtualization (NFV) and content caching are two promising technologies that hold great potential for network operators and designers. This paper optimizes the deployment of NFV and content caching in 5G networks, focusing on the associated power consumption savings. In addition, it introduces an approach that combines content caching with NFV in one integrated architecture for energy-aware 5G networks. A mixed integer linear programming (MILP) model was developed to minimize the total power consumption by jointly optimizing the cache size, the virtual machine (VM) workload, and the locations of both cache nodes and VMs. The results were evaluated with and without inter-traffic between core network virtual machines (CNVMs). The results show that the optical line terminal (OLT) access network nodes are the optimum locations for content caching and for hosting VMs during busy times of the day, whilst the IP over WDM core network nodes are the optimum locations for caching and VM placement during off-peak times. Furthermore, the results reveal that a virtualization-only approach outperforms a caching-only approach for video streaming services: compared with caching only, virtualization only achieves a maximum power saving of 7% (average 5%) when no CNVM inter-traffic is considered, and 6% (average 4%) when CNVM inter-traffic amounts to 10% of the total backhaul traffic. The integrated approach, in turn, achieves a maximum power saving of 15% (average 9%) compared with the virtualization-only approach, both with and without CNVM inter-traffic. Compared with the caching-only approach, it achieves a maximum power saving of 21% (average 13%) without CNVM inter-traffic and 20% (average 12%) with CNVM inter-traffic. A heuristic was also developed to validate the MILP models and to enable real-time operation of the proposed approaches.
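As a rough consistency check on the reported figures, the savings can be chained: if virtualization only saves a fraction s1 over caching only, and the integrated approach saves s2 over virtualization only, the implied saving of the integrated approach over caching only is 1 - (1 - s1)(1 - s2). The sketch below applies this to the percentages quoted above. It assumes that the savings compose multiplicatively and that the maxima (or averages) refer to comparable operating points, which the text does not state, so the comparison is indicative only. The function name `compose` and the dictionary layout are illustrative choices, not part of the paper.

```python
def compose(s_ab: float, s_bc: float) -> float:
    """Saving of C over A, given B saves s_ab over A and C saves s_bc over B.

    Assumption (not from the paper): savings compose multiplicatively.
    """
    return 1.0 - (1.0 - s_ab) * (1.0 - s_bc)


# Each entry: (virtualization vs caching, integrated vs virtualization,
#              integrated vs caching as reported). Values taken from the text.
reported = {
    "max, no CNVM inter-traffic": (0.07, 0.15, 0.21),
    "avg, no CNVM inter-traffic": (0.05, 0.09, 0.13),
    "max, 10% CNVM inter-traffic": (0.06, 0.15, 0.20),
    "avg, 10% CNVM inter-traffic": (0.04, 0.09, 0.12),
}

for label, (s_virt, s_integ, s_reported) in reported.items():
    implied = compose(s_virt, s_integ)
    print(f"{label}: implied {implied:.1%}, reported {s_reported:.0%}")
```

Under these assumptions the implied values land within about one percentage point of the reported ones, which is what rounding to whole percentages would allow.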