In this paper, we introduce two remote extended reality (XR) research methods that can overcome the limitations of lab-based controlled experiments, especially during the COVID-19 pandemic: (1) a predictive model-based task analysis and (2) a large-scale video-based remote evaluation. We used a box-stacking task with three interaction modalities: two multimodal gaze-based interactions and a unimodal hand-based interaction, which serves as our baseline. For the first evaluation, we performed a GOMS-based (Goals, Operators, Methods, and Selection rules) task analysis to understand human behavior in XR and to predict task execution times. For the second evaluation, we administered an online survey built around a series of first-person point-of-view videos in which a user performs the task with each of the three interaction modalities. A total of 118 participants were asked to compare the interaction modalities based on their own judgment. Two standard questionnaires were used to measure the perceived workload and usability of the modalities.
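To illustrate the idea behind a GOMS-style time prediction, the following minimal sketch sums the durations of the operators that make up a serial method and compares modalities by their predicted totals. It is not the model used in this work: the operator names, the two example methods, and every duration are hypothetical placeholders chosen only to make the arithmetic easy to follow, and they imply nothing about which modality is actually faster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """One primitive step of a method; the duration is a placeholder, not a measurement."""
    name: str
    seconds: float


def predict_time(method):
    """Predict the execution time of a serial method as the sum of its operator durations."""
    return sum(op.seconds for op in method)


def rank_modalities(methods, repetitions):
    """Return (modality, predicted seconds) pairs sorted from fastest to slowest."""
    predictions = {
        name: repetitions * predict_time(ops) for name, ops in methods.items()
    }
    return sorted(predictions.items(), key=lambda item: item[1])


# Hypothetical operators with arbitrary durations (assumptions for illustration only).
REACH = Operator("reach_to_box", 1.0)
LOOK = Operator("look_at_box", 0.5)
PINCH = Operator("pinch_to_grab", 0.25)
MOVE = Operator("move_to_target", 1.0)
RELEASE = Operator("release", 0.25)

# Hypothetical methods for moving a single box with two different modalities.
METHODS = {
    "hand_only": [REACH, PINCH, MOVE, RELEASE],
    "gaze_pinch": [LOOK, PINCH, MOVE, RELEASE],
}

if __name__ == "__main__":
    # Predicted time for stacking three boxes with each modality.
    for name, seconds in rank_modalities(METHODS, repetitions=3):
        print(f"{name}: {seconds:.2f} s")
```

In an actual analysis, the operator durations would come from the literature or from empirical measurement, and methods with parallel or overlapping operators would need a richer model than this plain sum.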