#### Estimating Full Lipschitz Constants of Deep Neural Networks

##### Calypso Herrera, Florian Krach, Josef Teichmann

We estimate the Lipschitz constants of the gradient of a deep neural network and the network itself with respect to the full set of parameters. We first develop estimates for a deep feed-forward densely connected network and then, in a more general framework, for all neural networks that can be represented as solutions of controlled ordinary differential equations, where time appears as continuous depth. These estimates can be used to set the step size of stochastic gradient descent methods, which is illustrated for one example method.
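To make the link between a gradient Lipschitz constant and the step size concrete, here is a minimal sketch on a toy problem. It does not use the estimates developed in the paper. It assumes a least-squares objective, for which the Lipschitz constant of the gradient is known in closed form, and it uses plain (non-stochastic) gradient descent with the standard step size 1/L for simplicity. The function names and the synthetic data are illustrative assumptions.

```python
import numpy as np


def gradient_lipschitz_constant(X):
    """Lipschitz constant of the gradient of f(w) = ||X w - y||^2 / (2 n).

    The Hessian is X^T X / n, so the constant is its largest eigenvalue,
    i.e. the squared spectral norm of X divided by n.
    """
    n = X.shape[0]
    return np.linalg.norm(X, ord=2) ** 2 / n


def gradient_descent(X, y, num_steps=500):
    """Full-batch gradient descent with step size 1 / L (toy illustration)."""
    n, d = X.shape
    L = gradient_lipschitz_constant(X)
    step = 1.0 / L  # step size derived from the Lipschitz constant
    w = np.zeros(d)
    losses = []
    for _ in range(num_steps):
        residual = X @ w - y
        losses.append(0.5 * np.mean(residual ** 2))
        w = w - step * (X.T @ residual) / n
    return w, losses


# Synthetic, noise-free data (hypothetical example, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true

w_hat, losses = gradient_descent(X, y)
```

With this step size the loss cannot increase from one iteration to the next, which is the property that makes a Lipschitz estimate of the gradient useful for choosing a learning rate. For a deep network the constant is not available in closed form, which is where upper estimates such as those described above would be substituted for `L`.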
