Fast.AI Breaks ImageNet Record with NVIDIA V100 Tensor Core GPUs

Nefi Alarcon

Researchers from fast.ai announced a new speed record for training ImageNet to 93 percent accuracy in only 18 minutes. Fast.ai alumni Andrew Shaw…

NVIDIA

AWS, PyTorch

Overview

Researchers from fast.ai have set a new speed record for training on ImageNet, reaching 93 percent accuracy in just 18 minutes using NVIDIA V100 Tensor Core GPUs. The run used AWS cloud resources with the fastai, cuDNN, and PyTorch libraries, along with the NVIDIA Collective Communications Library (NCCL) for distributed computation.

What You'll Learn

1. How to train ImageNet efficiently using NVIDIA V100 Tensor Core GPUs

2. Why using distributed computation can significantly speed up model training

3. When to apply fastai and PyTorch for deep learning projects

Prerequisites & Requirements

Understanding of deep learning concepts and frameworks
Familiarity with AWS and its services (optional)

Key Questions Answered

How fast did fast.ai train ImageNet to achieve 93 percent accuracy?

fast.ai trained an ImageNet model to 93 percent accuracy in just 18 minutes, a new speed record. The run used 128 NVIDIA Tesla V100 Tensor Core GPUs on the AWS cloud.

What technologies were used in the fast.ai ImageNet training?

The training utilized NVIDIA Tesla V100 Tensor Core GPUs, fastai, cuDNN, and PyTorch libraries, along with the NVIDIA Collective Communications Library for distributed computation. These tools facilitated efficient training and model monitoring.

What was the cost of the compute resources used for the ImageNet training?

The total compute cost for the training run was $40, spread across 16 AWS instances. The figure shows that cloud resources can make advanced AI research affordable.

How much faster was the new record compared to the previous one?

The new record set by fast.ai is 40% faster than the previous record for training ImageNet, showcasing significant advancements in training efficiency.
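
The article gives only the relative figure, so the previous record's time has to be inferred, and "40% faster" can be read in two ways. The sketch below works out both readings from the 18-minute figure. The function names are illustrative, and neither result is a number stated in the article.

```python
NEW_RECORD_MIN = 18.0  # from the article
IMPROVEMENT = 0.40     # "40% faster", from the article


def previous_time_if_time_reduced(new_min: float, frac: float) -> float:
    """Reading A: training time dropped by `frac`, so new = old * (1 - frac)."""
    return new_min / (1.0 - frac)


def previous_time_if_throughput_raised(new_min: float, frac: float) -> float:
    """Reading B: training ran (1 + frac) times as fast, so old = new * (1 + frac)."""
    return new_min * (1.0 + frac)


if __name__ == "__main__":
    a = previous_time_if_time_reduced(NEW_RECORD_MIN, IMPROVEMENT)
    b = previous_time_if_throughput_raised(NEW_RECORD_MIN, IMPROVEMENT)
    print(f"Reading A (time reduced by 40%): previous record ~{a:.1f} min")
    print(f"Reading B (1.4x throughput):     previous record ~{b:.1f} min")
```

Under reading A the previous record would have been about 30 minutes; under reading B, about 25 minutes. Which reading the article intends is not stated.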

Key Statistics & Figures

Training time to achieve 93% accuracy

18 minutes

This is the time taken to train ImageNet, setting a new speed record.

Percentage improvement over previous record

40%

The new record is 40% faster than the previous training time for ImageNet.

Total compute cost

$40

This cost was incurred while using 16 AWS instances for the training.
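
As a back-of-envelope check, the reported figures (16 instances, 128 GPUs, 18 minutes, $40) can be combined into implied hourly rates. The sketch below assumes the $40 covers exactly the 18-minute run on all 16 instances, which the article does not confirm; the results are derived estimates, not AWS prices.

```python
INSTANCES = 16         # from the article
GPUS = 128             # from the article
TRAIN_MINUTES = 18.0   # from the article
TOTAL_COST_USD = 40.0  # from the article


def implied_rates(instances: int, gpus: int, minutes: float, total_cost: float) -> dict:
    """Derive per-hour rates, ASSUMING billed time equals the training time."""
    hours = minutes / 60.0
    instance_hours = instances * hours
    gpu_hours = gpus * hours
    return {
        "gpus_per_instance": gpus / instances,
        "instance_hours": instance_hours,
        "gpu_hours": gpu_hours,
        "usd_per_instance_hour": total_cost / instance_hours,
        "usd_per_gpu_hour": total_cost / gpu_hours,
    }


if __name__ == "__main__":
    for name, value in implied_rates(INSTANCES, GPUS, TRAIN_MINUTES, TOTAL_COST_USD).items():
        print(f"{name}: {value:.2f}")
```

Under that assumption the run amounts to 8 GPUs per instance, 38.4 GPU-hours in total, and roughly $1.04 per GPU-hour.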

Technologies & Tools

Hardware

NVIDIA Tesla V100 Tensor Core GPUs

Used for accelerating the training of ImageNet.

Software Library

fastai

Framework used for implementing the training process.

Software Library

cuDNN

Accelerated the deep learning computations.

Software Library

PyTorch

Framework used for building and training the neural network models.

Software Library

NVIDIA Collective Communications Library (NCCL)

Facilitated distributed computation through efficient communication between GPUs.
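
To make the role of a collective communications library concrete: in data-parallel training, each GPU computes gradients on its own slice of the batch, and an all-reduce averages them so that every replica applies the same update. The sketch below simulates that step in plain Python for a one-parameter linear model. It uses no GPU, no PyTorch, and no NCCL; all names are illustrative, and it is not fast.ai's code.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def local_gradient(w: float, shard: List[Sample]) -> float:
    """Mean gradient of the squared error (w*x - y)^2 w.r.t. w on one worker's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values: List[float]) -> List[float]:
    """Stand-in for an all-reduce: every worker ends up holding the mean of all values."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def data_parallel_step(w: float, batch: List[Sample], num_workers: int, lr: float) -> float:
    """One SGD step with the batch split evenly across simulated workers."""
    assert len(batch) % num_workers == 0, "batch must divide evenly across workers"
    size = len(batch) // num_workers
    shards = [batch[i * size:(i + 1) * size] for i in range(num_workers)]
    grads = [local_gradient(w, shard) for shard in shards]  # computed independently
    synced = all_reduce_mean(grads)                         # communication step
    return w - lr * synced[0]                               # identical on every worker


if __name__ == "__main__":
    batch = [(float(x), 3.0 * x) for x in range(1, 9)]  # toy data with y = 3x
    print(data_parallel_step(0.0, batch, num_workers=1, lr=0.01))
    print(data_parallel_step(0.0, batch, num_workers=4, lr=0.01))
```

With equal-sized shards, the averaged gradient matches the full-batch gradient, so the four-worker step gives the same weight as the single-worker step while each worker processes only a quarter of the data.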

Key Actionable Insights

1. Leverage NVIDIA V100 Tensor Core GPUs for deep learning projects to achieve faster training times.
Advanced GPUs can drastically reduce the time required for model training, making it feasible to iterate quickly and improve model performance.

2. Utilize distributed computation techniques to enhance the scalability of your machine learning models.
With distributed training tools such as NCCL and PyTorch, teams can spread each training batch across many GPUs and cut wall-clock training time on large datasets.

3. Consider using cloud services like AWS for cost-effective access to powerful computing resources.
Cloud platforms can provide the infrastructure for intensive computations without significant upfront investment in hardware.

Common Pitfalls

1. Underestimating the importance of distributed computation in deep learning.

Many practitioners train models on a single instance, which lengthens training time and forgoes the speedup available from scaling out across machines.