Unmanned Aerial Vehicles (UAVs) have recently attracted considerable research interest. In the Internet of Things in particular, UAVs with Internet connectivity are in high demand. However, the energy constraint, i.e., the battery limit, is a bottleneck that restricts UAV applications. We address this energy problem by proposing a path planning method for a cellular-connected UAV that enables it to plan its path over an area much larger than its battery range by recharging at certain positions equipped with power stations (PSs). In addition to the energy constraint, there are also no-fly zones, for example due to Air-to-Air (A2A) and Air-to-Ground (A2G) interference or due to a lack of the necessary connectivity, which impose extra constraints on the trajectory optimization of the UAV. No-fly zones define infeasible areas that must be avoided. We use reinforcement learning (RL) hierarchically to extend typical short-range path planners to account for battery recharging, thereby solving the problem of UAVs on long missions. The problem is simulated for a UAV flying over a large area, and the Q-learning algorithm enables the UAV to find the optimal path and recharge policy.
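The setting described above can be sketched as a small grid-world MDP whose state is the UAV's position joined with its remaining battery, with no-fly cells blocked and PS cells restoring full charge. This is a minimal illustrative sketch, not the paper's actual environment: the grid size, reward values, battery capacity, and station/no-fly placements below are assumptions chosen so that the direct route is infeasible on one charge and the learned policy must detour through a PS.

```python
import random

GRID = 5                              # 5x5 grid world (illustrative size)
MAX_BATTERY = 6                       # steps the UAV can fly on a full charge
NO_FLY = {(2, 1), (2, 2), (2, 3)}     # infeasible cells that must be avoided
STATIONS = {(0, 4)}                   # power-station (PS) cells: entering one recharges
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """One transition of the (cell, battery) MDP: returns (next_state, reward, done)."""
    (x, y), b = state
    nx, ny = x + action[0], y + action[1]
    if not (0 <= nx < GRID and 0 <= ny < GRID) or (nx, ny) in NO_FLY:
        nx, ny = x, y                 # blocked moves leave the UAV in place
    b -= 1                            # every step drains one unit of battery
    if (nx, ny) in STATIONS:
        b = MAX_BATTERY               # full recharge at a PS
    if (nx, ny) == GOAL:
        return ((nx, ny), b), 100.0, True
    if b <= 0:
        return ((nx, ny), b), -100.0, True   # battery depleted before the goal
    return ((nx, ny), b), -1.0, False        # small per-step cost

def train(episodes=8000, alpha=0.5, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning over the joint (position, battery) state."""
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        state = (START, MAX_BATTERY)
        for _ in range(100):          # cap episode length
            qs = [Q.get((state, a), 0.0) for a in ACTIONS]
            a = rng.choice(ACTIONS) if rng.random() < eps else ACTIONS[qs.index(max(qs))]
            nxt, r, done = step(state, a)
            best_next = max(Q.get((nxt, a2), 0.0) for a2 in ACTIONS)
            target = r + (0.0 if done else gamma * best_next)
            Q[(state, a)] = Q.get((state, a), 0.0) + alpha * (target - Q.get((state, a), 0.0))
            state = nxt
            if done:
                break
    return Q

def greedy_path(Q, limit=50):
    """Roll out the learned greedy policy from the start state."""
    state, path = (START, MAX_BATTERY), [START]
    for _ in range(limit):
        a = max(ACTIONS, key=lambda act: Q.get((state, act), 0.0))
        state, _, done = step(state, a)
        path.append(state[0])
        if done:
            break
    return path
```

Because the start-to-goal distance (8 steps) exceeds `MAX_BATTERY`, any successful policy must pass through the PS at `(0, 4)` to recharge, which mirrors the abstract's claim that the learned policy jointly optimizes the path and the recharge decisions.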