Using Undervolting as an On-Device Defense Against Adversarial Machine Learning Attacks

Saikat Majumdar, Mohammad Hossein Samavatian, Kristin Barber, Radu Teodorescu

Deep neural network (DNN) classifiers are powerful tools that drive a broad spectrum of important applications, from image recognition to autonomous vehicles. Unfortunately, DNNs are known to be vulnerable to adversarial attacks, which affect virtually all state-of-the-art models. These attacks make small, imperceptible modifications to inputs that are sufficient to induce a DNN to produce the wrong classification. In this paper, we propose a novel, lightweight mechanism for correcting or detecting adversarial inputs to image classifiers that relies on undervolting (running a chip at a voltage slightly below its safe margin). Controlled undervolting of the chip running the inference process introduces a limited number of compute errors. We show that these errors disrupt the adversarial input in a way that can be used either to correct the classification or to detect the input as adversarial. We evaluate the proposed solution in an FPGA design and through software simulation, testing 10 attacks on two popular DNNs, and show an average detection rate of 80% to 95%.
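As an illustration of the detection idea, the following is a minimal software sketch and not the paper's implementation. It assumes a hypothetical fault model in which each multiply in a toy two-class linear classifier receives a small bounded additive error with some probability, and a hypothetical detection rule that flags an input when error-injected runs often disagree with the error-free prediction. The names noisy_logits, disagreement_rate, looks_adversarial, and all parameter values are invented for this example. The underlying intuition, also an assumption here, is that minimally perturbed adversarial inputs tend to sit close to a decision boundary, so a few compute errors change their label more readily than that of a clean input.

```python
import random


def noisy_logits(weights, x, error_rate, magnitude, rng):
    """Matrix-vector product where each multiply is corrupted with
    probability error_rate (a stand-in for undervolting-induced errors)."""
    logits = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            product = w * xi
            if rng.random() < error_rate:
                # Hypothetical fault model: a bounded additive error.
                product += rng.uniform(-magnitude, magnitude)
            acc += product
        logits.append(acc)
    return logits


def predict(logits):
    """Return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


def disagreement_rate(weights, x, error_rate=0.5, magnitude=0.1,
                      trials=200, seed=0):
    """Fraction of error-injected runs whose label differs from the
    error-free (nominal voltage) label."""
    rng = random.Random(seed)
    nominal = predict(noisy_logits(weights, x, 0.0, magnitude, rng))
    flips = 0
    for _ in range(trials):
        faulty = predict(noisy_logits(weights, x, error_rate, magnitude, rng))
        if faulty != nominal:
            flips += 1
    return flips / trials


def looks_adversarial(weights, x, threshold=0.1, **kwargs):
    """Assumed detection rule: flag inputs whose label is unstable."""
    return disagreement_rate(weights, x, **kwargs) > threshold


# Toy classifier and inputs, chosen only for illustration.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]
clean_input = [0.0, 1.0]        # large margin between the two logits
boundary_input = [0.49, 0.51]   # small margin, as a stand-in for an adversarial input

if __name__ == "__main__":
    print("clean:", disagreement_rate(WEIGHTS, clean_input))
    print("near boundary:", disagreement_rate(WEIGHTS, boundary_input))
```

In this sketch the clean input can never change label, because the injected errors are bounded well below its logit margin, while the near-boundary input can. A real deployment would rely on the hardware's actual error behavior at a calibrated voltage rather than a software noise model.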
