Video block compressive sensing has been studied for use in resource-constrained scenarios, such as wireless sensor networks, but the approach still suffers from low reconstruction quality and long reconstruction time. Inspired by classical distributed video coding, we design a lightweight encoder and move computationally intensive operations, such as video frame interpolation, to the decoder. Departing from the recent trend of training end-to-end neural networks, we propose two algorithms that leverage convolutional neural network components to reconstruct video with greatly reduced reconstruction time. At the encoder, we leverage temporal correlation between frames and deploy adaptive techniques based on compressive measurements from previous frames. At the decoder, we exploit temporal correlation by using video frame interpolation and temporal differential pulse code modulation. Simulations show that our two proposed algorithms, VAL-VFI and VAL-IDA-VFI, reconstruct higher-quality video, achieving state-of-the-art performance.