Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach

Alexandros Haliassos, Kriton Konstantinidis, Danilo P. Mandic

Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks, characterized by a lack of inherent ordering of features (variables). The brute force approach of learning a parameter for each interaction of every order comes at an exponential computational and memory cost (Curse of Dimensionality). To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor, the order of which is equal to the number of features; for efficiency, it can be further factorized into a compact Tensor Train (TT) format. However, both TT and other Tensor Networks (TNs), such as Tensor Ring and Hierarchical Tucker, are sensitive to the ordering of their indices (and hence to the features). To establish the desired invariance to feature ordering, we propose to represent the weight tensor through the Canonical Polyadic (CP) Decomposition (CPD), and introduce the associated inference and learning algorithms, including suitable regularization and initialization schemes. It is demonstrated that the proposed CP-based predictor significantly outperforms other TN-based predictors on sparse data while exhibiting comparable performance on dense non-sequential tasks. Furthermore, for enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors. In conjunction with feature vector normalization, this is shown to yield dramatic improvements in performance for dense non-sequential tasks, matching models such as fully-connected neural networks.

Knowledge Graph

arrow_drop_up

Comments

Sign up or login to leave a comment