In this work, we consider the problem of finding the globally optimal joint successive interference cancellation (SIC) ordering and power allocation (JSPA) for the general sum-rate maximization problem in downlink multi-cell NOMA systems. We propose a globally optimal solution based on exploring the power consumption of the base stations (BSs) combined with distributed power allocation. The complexity of the proposed centralized algorithm is still exponential in the number of BSs, but it scales well with the number of users. For any suboptimal decoding order, we address the problem of joint rate and power allocation (JRPA) to achieve the maximum sum-rate of users. Furthermore, we design semi-centralized and distributed JSPA frameworks with polynomial time complexity. Numerical results show that the optimal decoding order yields significant performance gains in terms of outage probability and total spectral efficiency of users compared to the channel-to-noise ratio (CNR)-based decoding order known from single-cell NOMA. Moreover, the performance gap between our proposed centralized and semi-centralized frameworks is shown to be small. Therefore, the low-complexity semi-centralized framework, with its near-optimal performance, is a good choice for larger numbers of BSs and users.
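For readers unfamiliar with the CNR-based baseline mentioned above, the following is a minimal sketch of that decoding order for a single cell only. It is not the proposed JSPA algorithm. It assumes the standard single-cell downlink NOMA model with normalized noise power and no inter-cell interference, where each user cancels the signals of users with lower CNR and treats the signals of users with higher CNR as noise. The function names `cnr_decoding_order` and `sic_rates`, and the numerical values in the usage note, are hypothetical and chosen for illustration.

```python
import math


def cnr_decoding_order(cnr):
    """Return user indices sorted from lowest to highest CNR.

    Assumption: single cell, so the CNR alone determines the SIC order.
    """
    return sorted(range(len(cnr)), key=lambda i: cnr[i])


def sic_rates(cnr, power):
    """Achievable rate (bit/s/Hz) of each user under CNR-based SIC.

    cnr[i]   : channel gain of user i divided by its noise power
    power[i] : transmit power allocated to the signal of user i
    """
    order = cnr_decoding_order(cnr)
    rates = [0.0] * len(cnr)
    for pos, i in enumerate(order):
        # User i has cancelled the signals of all weaker users (earlier in
        # the order); signals of stronger users remain as interference.
        residual = sum(power[j] for j in order[pos + 1:])
        sinr = power[i] * cnr[i] / (residual * cnr[i] + 1.0)
        rates[i] = math.log2(1.0 + sinr)
    return rates
```

As an illustration, with two users of CNR 10 and 1 and allocated powers 1 and 3, `sic_rates([10.0, 1.0], [1.0, 3.0])` gives the stronger user an interference-free rate of log2(11) and the weaker user log2(2.5). In the multi-cell setting considered in this work, inter-cell interference also enters the denominator, which is why the CNR-based order is no longer guaranteed to be optimal.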