Multi-regional interaction among neuronal populations underlies the brain's processing of rich sensory information in our daily lives. Recent neuroscience and neuroimaging studies have increasingly used naturalistic stimuli and experimental design to identify such realistic sensory computation in the brain. However, existing methods for cross-areal interaction analysis with dimensionality reduction, such as reduced-rank regression and canonical correlation analysis, have limited applicability and interpretability in naturalistic settings because they usually do not appropriately 'demix' neural interactions into those associated with different types of task parameters or stimulus features (e.g., visual or audio). In this paper, we develop a new method for cross-areal interaction analysis that uses the rich task or stimulus parameters to reveal how and what types of information are shared by different neural populations. The proposed neural demixed shared component analysis combines existing dimensionality reduction methods with a practical neural network implementation of functional analysis of variance with latent variables, thereby efficiently demixing nonlinear effects of continuous and multimodal stimuli. We also propose a simplifying alternative under the assumptions of linear effects and unimodal stimuli. To demonstrate our methods, we analyzed two human neuroimaging datasets of participants watching naturalistic videos of movies and dance movements. The results demonstrate that our methods provide new insights into multi-regional interaction in the brain during naturalistic sensory inputs, which cannot be captured by conventional techniques.