Evaluating Neural Machine Comprehension Model Robustness to Noisy Inputs and Adversarial Attacks

Winston Wu, Dustin Arendt, Svitlana Volkova

We evaluate the robustness of machine comprehension models to noise and adversarial attacks by applying novel perturbations at the character, word, and sentence levels. We vary the amount of perturbation to examine model confidence and misclassification rate, and we contrast model performance under adversarial training with different embedding types on two benchmark datasets. We show that ensembling improves model performance. Finally, we analyze factors that affect model behavior under adversarial training and develop a model to predict model errors during adversarial attacks.
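
The abstract does not say which perturbations are used, so the following is only an illustrative sketch of what a character-level perturbation with a tunable amount might look like. It assumes a simple adjacent-character swap inside a word, which is not necessarily one of the paper's perturbations; the function name `perturb_characters` and the `rate` and `seed` parameters are hypothetical.

```python
import random


def perturb_characters(text: str, rate: float, seed: int = 0) -> str:
    """Swap two adjacent interior characters in roughly `rate` of the words.

    Hypothetical illustration only: the swap operation and the `rate`
    parameter are assumptions, not the perturbations defined in the paper.
    """
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    perturbed = []
    for word in text.split():
        # Only touch words long enough to keep first and last characters fixed.
        if len(word) > 3 and rng.random() < rate:
            i = rng.randrange(1, len(word) - 2)  # interior position to swap at
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        perturbed.append(word)
    return " ".join(perturbed)


if __name__ == "__main__":
    question = "Which benchmark datasets were used for evaluation"
    for rate in (0.0, 0.5, 1.0):
        print(rate, perturb_characters(question, rate))
```

Sweeping `rate` from 0 to 1 mirrors the idea of varying the amount of perturbation; each perturbed input would then be passed to a comprehension model to record its confidence and whether its answer changes.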
