Fine-Tuned ViT on Food-101
A Vision Transformer (ViT) feature-extractor computer vision model that classifies images into the 101 classes of the Food-101 dataset.
Examples
Training Details
This model was fine-tuned on the Food-101 dataset using a pretrained Vision Transformer (ViT) in PyTorch with.
Final Result
- Top-1 Accuracy: 89%
- Total Training Time: 3:26:16
- Test loss: 1.16616
- Train loss: 1.83015
- Batch size: 128
- Num epochs: 40
- Hardware: NVIDIA DGX Spark
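
The figures above can be turned into per-epoch numbers with a little arithmetic. The sketch below assumes the training time is formatted as `h:mm:ss`, that the standard Food-101 training split of 75,750 images was used, and that the last partial batch of each epoch was kept; none of these are stated in this card.

```python
import math


def hms_to_seconds(hms):
    """Convert an 'h:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


TOTAL_TRAINING_TIME = "3:26:16"  # from the results above (assumed h:mm:ss)
NUM_EPOCHS = 40                  # from the results above
BATCH_SIZE = 128                 # from the results above
TRAIN_IMAGES = 75_750            # assumption: standard Food-101 train split

total_seconds = hms_to_seconds(TOTAL_TRAINING_TIME)
seconds_per_epoch = total_seconds / NUM_EPOCHS
# Assumption: the final partial batch is kept, hence the ceiling.
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)

print(f"total: {total_seconds} s")              # total: 12376 s
print(f"per epoch: {seconds_per_epoch:.1f} s")  # per epoch: 309.4 s
print(f"steps per epoch: {steps_per_epoch}")    # steps per epoch: 592
```

Under those assumptions, one epoch took a little over five minutes.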