In this paper, we consider a mobile edge computing system in which a cloud server and an edge server collaboratively provide computing services. Mobile edge computing can both reduce service delay and ease the load on the core network. We model the problem of maximizing the average system revenue, subject to average delay constraints for services of different priorities, as a constrained semi-Markov decision process (SMDP). To solve the constrained SMDP, we propose an actor-critic algorithm with eligibility traces. We use neural networks to train the parameters of the policy and of the state-value function, continuously improving system performance.
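As a rough illustration of the kind of update an actor-critic method with eligibility traces performs, the following minimal sketch may help. It is not the algorithm of this paper: it assumes tabular parameters instead of neural networks, a discounted criterion with rate `beta` applied over the sojourn time `tau` instead of the average-revenue criterion, and a fixed Lagrange multiplier `multiplier` that folds the delay constraint into the reward. The class name `TraceActorCritic` and all parameter names and default values are hypothetical.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


class TraceActorCritic:
    """Tabular actor-critic with accumulating eligibility traces (illustrative only)."""

    def __init__(self, n_states, n_actions, alpha_v=0.1, alpha_pi=0.05,
                 trace_decay=0.8, beta=0.1, multiplier=0.5):
        self.alpha_v = alpha_v          # critic step size (assumed value)
        self.alpha_pi = alpha_pi        # actor step size (assumed value)
        self.trace_decay = trace_decay  # trace decay factor (assumed value)
        self.beta = beta                # continuous-time discount rate (assumption)
        self.multiplier = multiplier    # fixed Lagrange multiplier on delay (assumption)
        self.v = [0.0] * n_states                                  # state values
        self.theta = [[0.0] * n_actions for _ in range(n_states)]  # action preferences
        self.z_v = [0.0] * n_states                                # critic trace
        self.z_pi = [[0.0] * n_actions for _ in range(n_states)]   # actor trace

    def policy(self, s):
        return softmax(self.theta[s])

    def act(self, s, rng=random):
        probs = self.policy(s)
        return rng.choices(range(len(probs)), weights=probs)[0]

    def update(self, s, a, revenue, delay, tau, s_next):
        """Apply one transition that lasted tau time units; return the TD error."""
        discount = math.exp(-self.beta * tau)       # discount over the sojourn time
        reward = revenue - self.multiplier * delay  # Lagrangian-style reward
        delta = reward + discount * self.v[s_next] - self.v[s]

        # Decay both traces, then add the gradient for the visited state-action pair.
        decay = discount * self.trace_decay
        self.z_v = [decay * z for z in self.z_v]
        self.z_v[s] += 1.0
        probs = self.policy(s)
        for row in self.z_pi:
            for j in range(len(row)):
                row[j] *= decay
        for j, p in enumerate(probs):
            # Gradient of log softmax: indicator of the chosen action minus its probability.
            self.z_pi[s][j] += (1.0 if j == a else 0.0) - p

        # Move critic and actor parameters along their traces, scaled by the TD error.
        for i in range(len(self.v)):
            self.v[i] += self.alpha_v * delta * self.z_v[i]
        for i, row in enumerate(self.theta):
            for j in range(len(row)):
                row[j] += self.alpha_pi * delta * self.z_pi[i][j]
        return delta
```

In a constrained setting the multiplier would typically be adjusted as well, for example increased while the measured average delay exceeds its bound; the sketch keeps it fixed for brevity.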