Marco Armenta, Thierry Judge, Nathan Painchaud, Youssef Skandarani, Carl Lemaire, Gabriel Gibeau Sanchez, Philippe Spino, Pierre-Marc Jodoin

In this paper, we explore neural teleportation, a process that arises as a mathematical consequence of applying quiver representation theory to neural networks. Neural teleportation "teleports" a network to a new position in weight space while leaving its function unchanged. This concept generalizes the positive scale invariance of ReLU networks to any network with any activation function and any architecture. We shed light on surprising and counter-intuitive consequences that neural teleportation has on the loss landscape. In particular, we show that teleportation can be used to explore loss level curves, that it changes the loss landscape, sharpens global minima, and boosts back-propagated gradients. From these observations, we demonstrate that teleportation accelerates training when used during initialization, regardless of the model, its activation function, the loss function, and the training data. Our results can be reproduced with the code available here:
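Independently of the authors' released code, the positive scale invariance mentioned above can be illustrated with a minimal NumPy sketch. It assumes a toy one-hidden-layer network. The helper names `forward` and `teleport_hidden_layer` and the per-neuron scaling vector `tau` are hypothetical, introduced only for illustration. Incoming weights and biases of each hidden neuron are multiplied by a positive factor and outgoing weights are divided by the same factor; for a non-ReLU activation `f`, the sketch further assumes the activation is replaced by `tau * f(z / tau)` so that the network function is preserved.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def forward(x, W1, b1, W2, b2, act):
    """One-hidden-layer network: y = W2 @ act(W1 @ x + b1) + b2."""
    return W2 @ act(W1 @ x + b1) + b2


def teleport_hidden_layer(W1, b1, W2, tau):
    """Hypothetical helper: rescale each hidden neuron by its factor in tau.

    Incoming weights and biases are multiplied by tau,
    outgoing weights are divided by tau.
    """
    return tau[:, None] * W1, tau * b1, W2 / tau[None, :]


rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

# One positive scaling factor per hidden neuron (illustrative values).
tau = rng.uniform(0.5, 2.0, size=4)
W1_t, b1_t, W2_t = teleport_hidden_layer(W1, b1, W2, tau)

# ReLU case: relu(tau * z) == tau * relu(z) for tau > 0,
# so the output is unchanged although the weights moved.
y_relu = forward(x, W1, b1, W2, b2, relu)
y_relu_t = forward(x, W1_t, b1_t, W2_t, b2, relu)

# Non-ReLU case (assumption of this sketch): the activation is rescaled
# to z -> tau * tanh(z / tau) so that the same identity holds.
def tanh_t(z):
    return tau * np.tanh(z / tau)

y_tanh = forward(x, W1, b1, W2, b2, np.tanh)
y_tanh_t = forward(x, W1_t, b1_t, W2_t, b2, tanh_t)

print(np.allclose(y_relu, y_relu_t), np.allclose(y_tanh, y_tanh_t))
```

In both cases the weights differ from the originals while the outputs agree up to floating-point error, which is the sense in which the network is moved in weight space without changing its function. This is a toy illustration under the stated assumptions, not the construction used in the paper.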
