Causal modeling has been recognized as a potential solution to many challenging problems in machine learning (ML). Here, we describe how a recently proposed counterfactual approach developed to deconfound linear structural causal models can still be used to deconfound the feature representations learned by deep neural network (DNN) models. The key insight is that by training an accurate DNN using softmax activation at the classification layer, and then adopting the representation learned by the last layer prior to the output layer as our features, we have that, by construction, the learned features will fit well a (multi-class) logistic regression model, and will be linearly associated with the labels. As a consequence, deconfounding approaches based on simple linear models can be used to deconfound the feature representations learned by DNNs. We validate the proposed methodology using colored versions of the MNIST dataset. Our results illustrate how the approach can effectively combat confounding and improve model stability in the context of dataset shifts generated by selection biases.