Polar codes are provably capacity-achieving and, under some improved decoding algorithms, have been shown to match or even exceed the finite-length performance of turbo and LDPC codes over additive white Gaussian noise (AWGN) channels. Polar coding is based on the channel polarization phenomenon induced by a transform over the underlying binary-input channel. Channel polarization has been found to be universal across many signal processing problems and has already been applied to coded modulation schemes. In this paper, channel polarization is further extended to multiple-antenna transmission following a multilevel coding principle. The multiple-input multiple-output (MIMO) channel under quadrature amplitude modulation (QAM) is transformed into a series of synthesized binary-input channels by a three-stage channel transform. Based on this generalized channel polarization, the proposed space-time polar coded modulation (STPCM) scheme allows joint optimization of the binary polar coding, modulation and MIMO transmission. In addition, a practical method of polar code construction over fading channels is provided, in which each fading channel is approximated by an AWGN channel with the same capacity as the original. Simulations over the MIMO channel with uncorrelated Rayleigh fast fading show that the proposed STPCM scheme outperforms the bit-interleaved turbo coded scheme, which is adopted in many existing communication systems, in all simulated cases.
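As a minimal illustration of the channel polarization that underlies binary polar coding, the sketch below tracks erasure probabilities through the recursive polar transform. It assumes a binary erasure channel (BEC), chosen only because its recursion is exact and simple; it is not the AWGN, fading, or MIMO construction proposed in this paper. The function names `polarize_bec` and `select_information_set` are hypothetical.

```python
def polarize_bec(erasure_prob, n_levels):
    """Erasure probabilities of the 2**n_levels synthesized channels.

    Assumes a BEC: one polarization step maps erasure probability e
    to a degraded channel (2e - e^2) and an upgraded channel (e^2).
    """
    probs = [erasure_prob]
    for _ in range(n_levels):
        next_probs = []
        for e in probs:
            next_probs.append(2 * e - e * e)  # degraded ("minus") channel
            next_probs.append(e * e)          # upgraded ("plus") channel
        probs = next_probs
    return probs


def select_information_set(erasure_prob, n_levels, k):
    """Indices of the k most reliable synthesized channels, in ascending order."""
    probs = polarize_bec(erasure_prob, n_levels)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i])
    return sorted(ranked[:k])


if __name__ == "__main__":
    probs = polarize_bec(0.5, 10)
    good = sum(p < 1e-3 for p in probs)
    bad = sum(p > 1 - 1e-3 for p in probs)
    print(f"N={len(probs)}: {good} nearly noiseless, {bad} nearly useless")
```

The average erasure probability is preserved at every step, so total capacity is conserved while individual channels drift toward the two extremes. Information bits are placed on the most reliable indices and the remaining positions are frozen.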