ViViT

The official implementation of ViViT, the Video Vision Transformer, which applies a transformer architecture to video classification.

Keywords: computer vision, transformer, video understanding, video classification, machine learning, JAX

Implements the following papers: