Fast Feature Sampling from Implicit Infinite-width Models

Sho Sonoda

Infinitely wide models have succeeded in facilitating the theoretical understanding of modern, large-scale, nonlinear models such as neural networks. When the number $p$ of features exceeds the size $n$ of the training dataset, the model tends to be linear and the optimization problem tends to be convex, because the design matrix $S$ tends to have full row rank. A variety of recent over-parametrization schemes thus amount to estimating a pseudo-inverse operator $S^{\dagger}$. In this study, we establish a new fast sampling method that approximates the pseudo-inverse operator without explicitly computing it. Technically, we develop the kernel mean embedding (KME), maximum mean discrepancy (MMD), and generalized kernel quadrature (GKQ) for parameter distributions, which achieve a fast approximation rate $O(e^{-p})$, faster than the traditional Barron rate. A convergence analysis based on the local Rademacher complexity shows that our method achieves a fast learning rate of $\widetilde{O}(1/n)$.
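
To make the over-parametrized setting concrete, here is a minimal NumPy sketch of the $p > n$ regime described in the abstract: with more random features than training points, the design matrix $S$ has full row rank almost surely, and the minimum-norm solution $S^{\dagger} y$ interpolates the data. The tanh random-feature map, dimensions, and data below are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 500                       # n training points, p >> n random features

X = rng.normal(size=(n, 5))          # raw inputs
W = rng.normal(size=(5, p))          # sampled feature parameters (illustrative prior)
S = np.tanh(X @ W)                   # design matrix; rank n (full row rank) almost surely
y = rng.normal(size=n)               # targets

a = np.linalg.pinv(S) @ y            # minimum-norm coefficients a = S^+ y
print(np.allclose(S @ a, y))         # True: the linear-in-features model interpolates the data
```

In this regime the least-squares problem is convex in the coefficients, and the interpolating solution is exactly what an explicit (and expensive) pseudo-inverse computes; the abstract's proposal is to approximate this operator by sampling features rather than forming $S^{\dagger}$.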

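The KME/MMD/GKQ component of the abstract concerns choosing a small set of parameters whose kernel mean embedding stays close to that of the full parameter distribution. The sketch below uses plain greedy kernel herding with a Gaussian kernel as a generic stand-in for that idea; the kernel, candidate pool, and selection rule are assumptions for illustration, not the paper's GKQ method, and herding's convergence rate is not the $O(e^{-p})$ rate claimed above.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gram matrix of a Gaussian RBF kernel between the rows of A and B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth ** 2))

def herd(K, m):
    """Greedy kernel herding on a precomputed Gram matrix K: pick m candidate
    indices whose empirical measure tracks the full candidate pool in MMD."""
    mean_embedding = K.mean(axis=1)            # kernel mean embedding of the pool
    selected, running = [], np.zeros(len(K))
    for t in range(m):
        scores = mean_embedding - running / (t + 1)
        i = int(np.argmax(scores))
        selected.append(i)
        running += K[:, i]
    return np.array(selected)

def mmd2(K, idx):
    """Squared MMD between the full candidate pool and the subsample idx."""
    return K.mean() - 2.0 * K[:, idx].mean() + K[np.ix_(idx, idx)].mean()

rng = np.random.default_rng(0)
pool = rng.normal(size=(1000, 5))              # candidate parameters from an illustrative prior
K = gaussian_kernel(pool, pool)

herded = herd(K, m=50)
uniform = np.arange(50)                        # naive i.i.d.-style subsample for comparison
print(mmd2(K, herded), mmd2(K, uniform))       # herded subsample typically has much smaller MMD
```

The MMD comparison at the end illustrates why quadrature-style selection of parameters can represent the same parameter distribution with far fewer features than naive sampling, which is the role the abstract assigns to GKQ.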