Power allocation in spectrum sharing systems is challenging because the secondary system can impose excessive interference on the primary system. An interference threshold constraint is therefore typically used to regulate the secondary system's activity. This, however, requires the primary receivers to measure the interference and report it to the secondary users. These requirements add design complexity, e.g., due to transceiver hardware impairments, and impose substantial signaling overhead. Our main goal is to relax these requirements so that spectrum sharing systems become practically feasible. To cope with the lack of a model, we develop a coexisting deep reinforcement learning approach for continuous power allocation in both systems. Importantly, with our solution, the two systems allocate power based solely on the geographical locations of their users. Moreover, the inter-system signaling requirement is reduced to exchanging only the number of primary users whose QoS requirements are violated. We observe that, compared to a centralized agent that allocates power based on full (accurate) channel information, our solution is more robust and strictly guarantees the QoS requirements of the primary users. This implies that both systems can operate simultaneously with almost zero inter-system signaling overhead.