Variational inference (VI) and Markov chain Monte Carlo (MCMC) are the two main families of approximate methods for learning deep generative models by maximizing the marginal likelihood. In this paper, we propose using annealed importance sampling (AIS) to learn deep generative models. Our approach bridges VI and MCMC: it generalizes VI methods such as variational auto-encoders (VAE) and importance weighted auto-encoders (IWAE), as well as the MCMC method proposed by Hoffman (2017). It also provides insight into why running multiple short MCMC chains can help in learning deep generative models. Through experiments, we show that our approach yields better density models than IWAE and can effectively trade computation for model accuracy without increasing memory cost.
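To make the central idea concrete, the following is a minimal sketch of annealed importance sampling used to estimate a log marginal likelihood. It is not the learning algorithm of this paper: it assumes a toy one-dimensional model with prior N(0, 1) and likelihood N(x; z, sigma^2), a geometric annealing path between the prior and the unnormalized posterior, and a random-walk Metropolis transition. All function names, parameters, and default values are illustrative assumptions.

```python
import numpy as np


def log_prior(z):
    # Assumed toy prior: z ~ N(0, 1).
    return -0.5 * z**2 - 0.5 * np.log(2 * np.pi)


def log_lik(x, z, sigma):
    # Assumed toy likelihood: x | z ~ N(z, sigma^2).
    return -0.5 * ((x - z) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)


def exact_log_marginal(x, sigma):
    # For this toy model the marginal is available in closed form: x ~ N(0, 1 + sigma^2).
    var = 1.0 + sigma**2
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * x**2 / var


def ais_log_marginal(x, sigma=0.5, n_chains=500, n_temps=200, step=0.5, seed=0):
    """Estimate log p(x) with AIS along the path f_b(z) = p(z) * p(x|z)^b."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.0, 1.0, n_temps + 1)
    z = rng.standard_normal(n_chains)  # exact samples from the prior (b = 0)
    log_w = np.zeros(n_chains)         # running log importance weights

    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Weight update: log f_b(z) - log f_{b_prev}(z) at the current state.
        log_w += (b - b_prev) * log_lik(x, z, sigma)

        # One random-walk Metropolis step that leaves f_b invariant.
        prop = z + step * rng.standard_normal(n_chains)
        log_acc = (log_prior(prop) + b * log_lik(x, prop, sigma)) - (
            log_prior(z) + b * log_lik(x, z, sigma)
        )
        accept = np.log(rng.uniform(size=n_chains)) < log_acc
        z = np.where(accept, prop, z)

    # Log of the average weight, computed stably (log-mean-exp).
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))


if __name__ == "__main__":
    x_obs, sigma = 1.0, 0.5
    print("AIS estimate:", ais_log_marginal(x_obs, sigma))
    print("Exact value: ", exact_log_marginal(x_obs, sigma))
```

In this sketch, increasing `n_temps` spends more computation on a finer annealing schedule while the number of stored states stays at `n_chains`, which illustrates, under the assumptions above, how computation can be traded for accuracy without a larger memory footprint.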