Single-round Self-supervised Distributed Learning using Vision Transformer

Sangjoon Park, Ik-Jae Lee, Jun Won Kim, Jong Chul Ye

Despite the recent success of deep learning in medicine, the problem of data scarcity is exacerbated by concerns about privacy and data ownership. Distributed learning approaches, including federated learning, have been investigated to address these issues. However, they are hindered by cumbersome communication overhead and weaknesses in privacy protection. To tackle these challenges, we propose a self-supervised masked sampling distillation method for the vision transformer. This method can be implemented without continuous communication and can enhance privacy by utilizing a vision transformer-specific encryption technique. Extensive experiments on two different tasks demonstrated the effectiveness of our method, which outperformed the existing distributed learning strategy as well as the fine-tuning-only baseline. Furthermore, since the self-supervised model created using our proposed method can achieve a general semantic understanding of the image, we demonstrate its potential as a task-agnostic self-supervised foundation model for various downstream tasks, thereby expanding its applicability in the medical domain.
