Recent strategies achieved ensembling for free by concurrently fitting diverse subnetworks inside a single base network. The main idea during training is that each subnetwork learns to classify only one of the multiple inputs provided simultaneously. However, the question of how these multiple inputs should be mixed has not yet been studied. In this paper, we introduce MixMo, a new generalized framework for learning multi-input multi-output deep subnetworks. Our key motivation is to replace the suboptimal summing operation hidden in previous approaches with a more appropriate mixing mechanism. For that purpose, we draw inspiration from successful mixed sample data augmentations. We show that binary mixing in features, particularly with patches from CutMix, enhances results by making subnetworks stronger and more diverse. We improve the state of the art on the CIFAR-100 and Tiny-ImageNet classification datasets. In addition to being easy to implement and adding no cost at inference, our models outperform much costlier data-augmented deep ensembles. We open a new line of research complementary to previous works, as we operate in features and better leverage the expressiveness of large networks.
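
To make the contrast between summing and binary mixing concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes two already-encoded feature maps of shape (channels, height, width). The function names `sum_mix` and `patch_mix`, the mixing ratio `lam`, and the way the patch size is derived from `lam` are illustrative assumptions. Any rescaling of the mixed features and the weighting of the per-subnetwork losses are deliberately left out.

```python
import numpy as np


def sum_mix(f0, f1):
    """Baseline mixing: element-wise sum of the two encoded inputs."""
    return f0 + f1


def patch_mix(f0, f1, rng, lam=0.5):
    """Binary mixing: paste a rectangular patch of f1 into f0 (CutMix-style).

    `lam` is an assumed hyperparameter: the approximate fraction of the
    feature map that is kept from f0.
    """
    _, h, w = f0.shape
    # Side lengths chosen so the patch covers roughly (1 - lam) of the map.
    ratio = np.sqrt(1.0 - lam)
    ph = max(1, int(round(h * ratio)))
    pw = max(1, int(round(w * ratio)))
    # Random top-left corner such that the patch stays inside the map.
    top = rng.integers(0, h - ph + 1)
    left = rng.integers(0, w - pw + 1)
    # Binary mask shared across channels: 1 keeps f0, 0 takes f1.
    mask = np.ones((1, h, w), dtype=f0.dtype)
    mask[:, top:top + ph, left:left + pw] = 0
    return mask * f0 + (1 - mask) * f1, mask


rng = np.random.default_rng(0)
f0 = np.zeros((4, 8, 8))  # stand-in for the encoded first input
f1 = np.ones((4, 8, 8))   # stand-in for the encoded second input

summed = sum_mix(f0, f1)
mixed, mask = patch_mix(f0, f1, rng)
```

With summing, every spatial location carries information from both inputs. With the binary mask, each location comes from exactly one input, which is the property the abstract refers to as binary mixing in features.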