Points of non-linearity of functions generated by random neural networks

David Holmes

We consider functions from the real numbers to the real numbers, output by a neural network with 1 hidden activation layer, arbitrary width, and ReLU activation function. We assume that the parameters of the neural network are chosen uniformly at random with respect to various probability distributions, and compute the expected distribution of the points of non-linearity. We use these results to explain why the network may be biased towards outputting functions with simpler geometry, and why certain functions with low information-theoretic complexity are nonetheless hard for a neural network to approximate.

Knowledge Graph

arrow_drop_up

Comments

Sign up or login to leave a comment