Center Smoothing: Certified Robustness for Networks with Structured Outputs

Aounon Kumar, Tom Goldstein

The study of provable adversarial robustness has mostly been limited to classification tasks and models with one-dimensional real-valued outputs. We extend the scope of certifiable robustness to problems with more general and structured outputs like sets, images, language, etc. We model the output space as a metric space under a distance/similarity function, such as intersection-over-union, perceptual similarity, total variation distance, etc. Such models are used in many machine learning problems like image segmentation, object detection, generative models, image/audio-to-text systems, etc. Based on a robustness technique called randomized smoothing, our $\textit{center smoothing}$ procedure can produce models with the guarantee that the change in the output, as measured by the distance metric, remains small for any norm-bounded adversarial perturbation of the input. We apply our method to create certifiably robust models with disparate output spaces - from sets to images - and show that it yields meaningful certificates without significantly degrading the performance of the base model. Code for our experiments is available at:
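The following sketch is not the authors' released code. It is a minimal illustration of the idea summarized above, under stated assumptions: outputs are sets compared with the distance 1 - IoU, the input is Gaussian-perturbed several times, and the returned output is the sample with the smallest median distance to the others. The names `iou_distance`, `smooth_output`, and `toy_model` are hypothetical, and the sketch omits the statistical checks and the certified bound that a real certification procedure would need.

```python
import random
import statistics


def iou_distance(a, b):
    """Distance 1 - IoU between two sets (defined as 0 when both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def smooth_output(base_model, x, sigma, n_samples, distance, rng):
    """Return the sampled output that is most central under `distance`.

    Assumption: "most central" is approximated by the sample with the
    smallest median distance to all other samples.
    """
    # Evaluate the base model on Gaussian-perturbed copies of the input.
    outputs = [
        base_model([xi + rng.gauss(0.0, sigma) for xi in x])
        for _ in range(n_samples)
    ]

    def median_distance(i):
        # Median distance from sample i to every other sample.
        return statistics.median(
            distance(outputs[i], outputs[j])
            for j in range(n_samples)
            if j != i
        )

    best = min(range(n_samples), key=median_distance)
    return outputs[best]


def toy_model(x):
    """Hypothetical base model: the set of coordinate indices above 0.5."""
    return frozenset(i for i, xi in enumerate(x) if xi > 0.5)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.9, 0.1, 0.8, 0.2]
    print(smooth_output(toy_model, x, 0.05, 25, iou_distance, rng))
```

Choosing a central sample rather than averaging keeps the result inside the output space, which matters for outputs such as sets where a mean is not defined.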
