Hierarchical modulation (HM) can provide different levels of protection for data streams and achieve a rate region that cannot be realized by traditional orthogonal schemes such as time division (TD). Nevertheless, criteria and algorithms for general HM design are not available in the existing literature. In this paper, we jointly optimize the constellation positions and binary labels of HM for the additive white Gaussian noise (AWGN) channel. Based on the capacity of bit-interleaved coded modulation (BICM) with successive interference cancellation (SIC), we maximize the rate of one data stream subject to power constraints and the constraint that the rates of the other data streams are larger than given thresholds. A multi-start interior-point algorithm is used to solve the constellation optimization problems, and methods to reduce the optimization complexity are also proposed. Numerical results verify the performance gains of the optimized HM over optimized quadrature amplitude modulation (QAM)-based HM and over orthogonal transmission methods.
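
The design problem described above can be sketched as a constrained program. The notation here is assumed for illustration only and is not taken from the paper: $\mathcal{X} = \{x_1, \dots, x_M\}$ denotes the constellation points, $\mu$ the binary labeling, $R_k(\mathcal{X}, \mu)$ the BICM-SIC rate of stream $k$ among $K$ streams, $r_k$ the given rate thresholds, and $P$ an average power budget. Equiprobable constellation points are assumed, only an average power constraint is shown, and the threshold constraints are written as non-strict inequalities, as is common for such programs.

\begin{align}
\max_{\mathcal{X},\, \mu} \quad & R_1(\mathcal{X}, \mu) \\
\text{s.t.} \quad & R_k(\mathcal{X}, \mu) \ge r_k, \qquad k = 2, \dots, K, \\
& \frac{1}{M} \sum_{i=1}^{M} |x_i|^2 \le P.
\end{align}

Under this reading, sweeping the thresholds $r_k$ would trace out the boundary of the achievable rate region, which is what allows a comparison against the TD rate region.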