This is the official implementation of the Video Vision Transformer (ViViT), which applies a transformer architecture to video classification.
Keywords: computer vision, transformer, video understanding, video classification, machine learning, JAX
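
To make the description above concrete, here is a minimal JAX sketch of the general idea of applying a transformer to video: cut the clip into spatio-temporal patches, embed them as tokens, let the tokens attend to each other, and classify the pooled result. It is an illustration only and is not taken from this repository. The function names (`video_to_tokens`, `self_attention`, `classify`, `init_params`), the patch size, the single attention layer, the mean pooling, and all dimensions are assumptions chosen for brevity.

```python
import jax
import jax.numpy as jnp


def video_to_tokens(video, patch=(2, 4, 4)):
    """Split a (T, H, W, C) video into flattened, non-overlapping spatio-temporal patches."""
    t, h, w, c = video.shape
    pt, ph, pw = patch
    # Split each of T, H, W into (number of patches, patch size).
    x = video.reshape(t // pt, pt, h // ph, ph, w // pw, pw, c)
    # Move the patch-index axes to the front and the within-patch axes to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, pt * ph * pw * c)


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v


def classify(video, params):
    """Embed patches, apply one residual attention layer, mean-pool, and project to class logits."""
    tokens = video_to_tokens(video) @ params["embed"]
    tokens = tokens + self_attention(tokens, params["wq"], params["wk"], params["wv"])
    return tokens.mean(axis=0) @ params["head"]


def init_params(key, patch_dim, dim=32, num_classes=5):
    """Hypothetical random initialisation, for illustration only."""
    shapes = {
        "embed": (patch_dim, dim),
        "wq": (dim, dim),
        "wk": (dim, dim),
        "wv": (dim, dim),
        "head": (dim, num_classes),
    }
    keys = jax.random.split(key, len(shapes))
    return {name: 0.02 * jax.random.normal(k, shape)
            for k, (name, shape) in zip(keys, shapes.items())}


# Toy clip: 8 frames of 16x16 RGB.
video = jnp.ones((8, 16, 16, 3))
params = init_params(jax.random.PRNGKey(0), patch_dim=2 * 4 * 4 * 3)
logits = classify(video, params)
```

With the toy clip and the assumed patch size of 2x4x4, the video becomes 64 tokens of 96 values each, and `classify` returns one logit per class. A real model would add positional information, multiple attention heads and layers, and normalization, all of which this sketch omits.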