Multi-Modal Foundation Model for Simultaneous Comprehension of Molecular Structure and Properties

Jinho Chang, Jong Chul Ye

Recently, deep learning approaches have been extensively studied for various problems in chemistry, such as property prediction, virtual screening, de novo molecule design, etc. Despite the impressive successes, separately designed networks for specific tasks are usually required for end-to-end training, so it is often difficult to acquire a unified principle to synergistically combine existing models and training datasets for novel tasks. To address this, here we present a novel multimodal chemical foundation model that can be used for various downstream tasks that require a simultaneous understanding of structure and property. Specifically, inspired by recent advances in pre-trained multi-modal foundation models such as Vision-Language Pretrained models (VLP), we proposed a novel structure-property multi-modal (SPMM) foundation model using the dual-stream transformer with X-shape attention, so that it can align the molecule structure and the chemical properties in a common embedding space. Thanks to the outstanding structure-property unimodal representation, experimental results confirm that SPMM can simultaneously perform molecule generation, property prediction, classification, reaction prediction, etc., which was previously not possible with a single architecture.

Knowledge Graph

arrow_drop_up

Comments

Sign up or login to leave a comment