The concept of the self-organizing network (SON) was introduced to enable the automatic deployment and management of cellular networks. SON capabilities can enhance network performance, improve service quality, and reduce operational and capital expenditure (OPEX/CAPEX). As an important component of SON, self-healing is defined as a network paradigm in which faults in the target network are mitigated or recovered from by automatically triggering a series of actions such as detection, diagnosis, and compensation. Data-driven machine learning has been recognized as a powerful tool for bringing intelligence into networks and realizing self-healing. However, major challenges remain in applying machine learning techniques to self-healing in practice. In this article, we first classify these challenges into five categories: 1) data imbalance, 2) data insufficiency, 3) cost insensitivity, 4) non-real-time response, and 5) multi-source data fusion. We then provide potential technical solutions to address these challenges. Finally, a case study of cost-sensitive fault detection with imbalanced data is provided to illustrate the feasibility and effectiveness of the suggested solutions.
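To make the idea of cost-sensitive fault detection concrete, the following minimal Python sketch applies the standard minimum-expected-cost decision rule to predicted fault probabilities. It is an illustration under stated assumptions, not the method used in the case study: the function names, the example probabilities, and the cost values (a missed fault assumed to be nine times as costly as a false alarm) are all hypothetical.

```python
from typing import List


def cost_sensitive_threshold(cost_false_alarm: float, cost_missed_fault: float) -> float:
    """Probability threshold that minimizes expected cost.

    Flagging a fault is cheaper in expectation when
    p * cost_missed_fault > (1 - p) * cost_false_alarm,
    which rearranges to p > cost_false_alarm / (cost_false_alarm + cost_missed_fault).
    """
    return cost_false_alarm / (cost_false_alarm + cost_missed_fault)


def detect_faults(fault_probs: List[float],
                  cost_false_alarm: float,
                  cost_missed_fault: float) -> List[bool]:
    """Flag each sample as faulty if its probability exceeds the cost-derived threshold."""
    threshold = cost_sensitive_threshold(cost_false_alarm, cost_missed_fault)
    return [p > threshold for p in fault_probs]


def total_cost(decisions: List[bool], labels: List[bool],
               cost_false_alarm: float, cost_missed_fault: float) -> float:
    """Sum the misclassification costs; correct decisions are assumed to cost nothing."""
    cost = 0.0
    for flagged, is_fault in zip(decisions, labels):
        if flagged and not is_fault:
            cost += cost_false_alarm    # healthy cell flagged as faulty
        elif not flagged and is_fault:
            cost += cost_missed_fault   # faulty cell left undetected
    return cost


if __name__ == "__main__":
    # Hypothetical classifier outputs and ground truth (assumed values, not measured data).
    probs = [0.05, 0.2, 0.4, 0.8]
    labels = [False, True, False, True]
    c_fa, c_md = 1.0, 9.0  # assumed costs: false alarm vs. missed fault

    cost_aware = detect_faults(probs, c_fa, c_md)
    naive = [p > 0.5 for p in probs]  # cost-insensitive baseline threshold

    print("cost-aware cost:", total_cost(cost_aware, labels, c_fa, c_md))
    print("naive cost:", total_cost(naive, labels, c_fa, c_md))
```

With these assumed costs the threshold drops from 0.5 to 0.1. The detector therefore accepts more false alarms in exchange for fewer missed faults, the trade-off that matters when fault samples are rare and expensive to miss.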