In recent years, deep learning has become the mainstream data-driven approach to solving real-world problems in many important areas. Among successful network architectures, shortcut connections, which feed the outputs of earlier layers as additional inputs to later layers, are well established and have produced excellent results. Despite this extraordinary power, important questions remain about the underlying mechanisms and functionalities of shortcuts. For example, why are shortcuts powerful? How can the shortcut topology be tuned to optimize the efficiency and capacity of a network model? Along this direction, we first demonstrate a topology of shortcut connections that enables a one-neuron-wide deep network to approximate any univariate function. Then, we present a novel width-bounded universal approximator, in contrast to existing depth-bounded universal approximators. Next, we demonstrate a family of theoretically equivalent networks, corroborated by statistical significance experiments, and characterize them via their graph spectra, thereby associating the representational ability of a neural network with its graph spectral properties. Furthermore, we shed light on the effect of concatenation shortcuts on the margin-based multi-class generalization bound of deep networks. Encouraged by the positive results of this bound analysis, we instantiate a slim, sparse, and shortcut network (S3-Net); experimental results demonstrate that S3-Net achieves better learning performance than densely connected networks and other state-of-the-art models on well-known benchmarks.