Oral cancer has more than 83% survival rate if detected in its early stages, however, only 29% of cases are currently detected early. Deep learning techniques can detect patterns of oral cancer cells and can aid in its early detection. In this work, we present the first results of neural networks for oral cancer detection using microscopic images. We compare numerous state-of-the-art models via transfer learning approach and collect and release an augmented dataset of high-quality microscopic images of oral cancer. We present a comprehensive study of different models and report their performance on this type of data. Overall, we obtain a 10-15% absolute improvement with transfer learning methods compared to a simple Convolutional Neural Network baseline. Ablation studies show the added benefit of data augmentation techniques with finetuning for this task.