In this work, we address resource allocation in the downlink of spatial multiplexing MIMO-OFDMA systems. In particular, we consider the joint optimization of the transmit and receive processing matrices, the channel assignment, and the power allocation, with the objective of minimizing the total power consumption while satisfying different quality-of-service requirements. We adopt a layered architecture in which users are first partitioned into groups on the basis of their channel quality; channel assignment and transceiver design are then addressed sequentially, starting from the group of users with the most adverse channel conditions. The multi-user interference among users belonging to different groups is removed at the base station by a Tomlinson-Harashima precoder operating at the user level. Numerical results highlight the effectiveness of the proposed solution and compare it with existing alternatives.
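As a rough illustration of the grouping step in the layered architecture, the following minimal sketch orders users from worst to best channel quality and cuts the ordered list into groups, so that the first group is the one processed first. The scalar quality score per user, the fixed `group_size`, and the function name `partition_users` are assumptions made for this example only; the paper's actual grouping criterion may differ.

```python
from typing import Dict, List


def partition_users(quality: Dict[str, float], group_size: int) -> List[List[str]]:
    """Sort users from worst to best channel quality and cut into groups.

    `quality` maps a user id to a hypothetical scalar score
    (higher means a better channel). The first returned group
    holds the users with the most adverse channel conditions.
    """
    if group_size < 1:
        raise ValueError("group_size must be positive")
    # Ascending sort puts the worst channels first.
    ordered = sorted(quality, key=lambda user: quality[user])
    return [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]


# Hypothetical per-user quality scores, for illustration only.
quality = {"u1": 0.9, "u2": 0.2, "u3": 0.5, "u4": 0.1}
groups = partition_users(quality, group_size=2)

# Groups would then be handled in list order: worst-channel group first.
for index, group in enumerate(groups):
    print(f"group {index}: {group}")
```

Channel assignment and transceiver design would then be carried out group by group in the returned order; those steps are not sketched here.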