Today, Azure announces the general availability of the ND A100 v4 cloud GPU instances, powered by NVIDIA A100 Tensor Core GPUs, delivering leadership-class supercomputing scalability in a public cloud. For demanding customers pursuing the next frontier of AI and high-performance computing (HPC), scalability is the key to unlocking improved total cost of solution and time to solution.

Simply put, ND A100 v4 with NVIDIA A100 GPUs is designed to let our most demanding customers scale up and scale out without slowing down.

Benchmarking 164 ND A100 v4 virtual machines on a pre-release public supercomputing cluster yielded an HPL (High-Performance Linpack) result of 16.59 petaflops. Delivered on public cloud infrastructure, this HPL result would rank in the top 20 of the November 2020 Top500 list of the world's fastest supercomputers, or the top 10 in Europe, based on the region where the job was run.

As measured by HPL-AI, a High-Performance Linpack variant geared toward artificial intelligence (AI) and machine learning (ML) workloads, the same 164-VM pool achieved a result of 142.8 petaflops, placing it among the top five fastest known AI supercomputers in the world on the official HPL-AI benchmark list. Using only a fraction of a single Azure public cluster, these results rank alongside the best-performing dedicated on-premises supercomputing resources in the world.

And today, as ND A100 v4 reaches general availability, we are announcing the immediate availability of the world's fastest public cloud supercomputers, on demand and near you, in four Azure regions: East US, West US 2, West Europe, and South Central US.

Starting with a single virtual machine (VM) with eight A100 Tensor Core GPUs based on the NVIDIA Ampere architecture, the ND A100 v4 VM series can scale up to thousands of GPUs in a single cluster, with an unprecedented 1.6 Tb/s of interconnect bandwidth per VM, delivered over eight NVIDIA HDR InfiniBand links at 200 Gb/s each: one per GPU. In addition, each 8-GPU VM features a full complement of third-generation NVIDIA NVLink, enabling GPU-to-GPU connectivity within the VM in excess of 600 gigabytes per second.

Customers using industry-standard HPC and AI tools and libraries can leverage the GPUs and unique interconnect capabilities of ND A100 v4 without any specialized software or frameworks, via the same NVIDIA NCCL2 libraries that most scalable GPU-accelerated AI and HPC workloads support out of the box, regardless of the underlying network topology or placement. Provisioning VMs within the same VM scale set automatically configures the interconnect fabric.
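As a hedged illustration of the "no specialized software" point, the sketch below uses PyTorch's `torch.distributed` module, which dispatches collectives to NCCL when CUDA GPUs are present; it falls back to the CPU `gloo` backend so the example also runs on a machine without an A100. Topology handling is entirely inside the library, as the paragraph above describes; the addresses and tensor values are illustrative only.

```python
# Minimal sketch: collective communication via torch.distributed.
# NCCL is chosen automatically when GPUs are available; otherwise
# the CPU "gloo" backend is used so this runs anywhere.
import os
import torch
import torch.distributed as dist

# Illustrative single-process rendezvous; a real multi-VM job would
# get these values from the cluster launcher.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

backend = "nccl" if torch.cuda.is_available() else "gloo"
dist.init_process_group(backend, rank=0, world_size=1)

# Each rank contributes a gradient-like tensor; all_reduce sums it
# across every GPU in the job, independent of network placement.
grad = torch.ones(8)
dist.all_reduce(grad, op=dist.ReduceOp.SUM)

dist.destroy_process_group()
```

In a real ND A100 v4 job, `world_size` would be the total GPU count across VMs and the reduction would traverse NVLink within a VM and InfiniBand between VMs, with no change to this code.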

Anyone can bring demanding on-premises AI and HPC workloads to the cloud via ND A100 v4 with minimal effort, but for customers who prefer an Azure-native approach, Azure Machine Learning provides a tuned virtual machine (pre-installed with the required drivers and libraries) and containerized environments optimized for the ND A100 v4 family. Sample recipes and Jupyter notebooks help users get started quickly with a variety of frameworks, including PyTorch and TensorFlow, and with training state-of-the-art models such as BERT. With Azure Machine Learning, customers have access to the same tools and capabilities in Azure as our own AI engineering teams.

Each NVIDIA A100 GPU delivers 1.7 to 3.2 times the performance of the previous V100 GPUs, and up to 20 times the performance when layering on new architectural features such as mixed precision, sparsity, and Multi-Instance GPU (MIG) for specific workloads. At the heart of each VM is a new second-generation AMD EPYC platform with PCI Express Gen 4.0 for CPU-to-GPU transfers, twice as fast as previous generations.
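The mixed-precision feature mentioned above is typically used from frameworks via automatic mixed precision. The following is a minimal sketch using PyTorch's `torch.autocast` and `GradScaler`; it runs in float16 on a CUDA GPU and falls back to bfloat16 on CPU purely so the example is self-contained. The model and data are illustrative.

```python
# Hedged sketch: mixed-precision training step with PyTorch AMP.
# On an A100 the low-precision matmuls map onto Tensor Cores; on a
# CPU-only machine this falls back to bfloat16 for illustration.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(16, 4).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
# Loss scaling is only needed (and enabled) for float16 on GPU.
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(32, 16, device=device)

with torch.autocast(device_type=device,
                    dtype=torch.float16 if device == "cuda" else torch.bfloat16):
    loss = model(x).pow(2).mean()  # forward pass in reduced precision

scaler.scale(loss).backward()  # scaling is a no-op when disabled
scaler.step(opt)
scaler.update()
```

Sparsity and MIG, by contrast, are exposed through lower-level tooling (e.g. driver and runtime configuration) rather than a training-loop change like this one.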

We can’t wait to see what you will create, analyze, and discover with the new Azure ND A100 v4 platform.

Size: Standard_ND96asr_v4
Physical CPU cores: 96
Host memory: 900 GB
GPUs: 8 x 40 GB NVIDIA A100
Local temporary NVMe disk: 6,500 GB
NVIDIA InfiniBand network: 8 x 200 Gb/s
Azure network: 40 Gb/s
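For readers who want to try the size above, provisioning with the Azure CLI might look like the following sketch. The resource group, VM name, and image reference are illustrative assumptions, not part of this announcement; check the current Azure documentation for supported images and regions.

```shell
# Hypothetical sketch: provisioning one ND A100 v4 VM with the Azure CLI.
# Resource group, VM name, and image below are illustrative values.
az vm create \
  --resource-group my-hpc-rg \
  --name nd96-node-0 \
  --size Standard_ND96asr_v4 \
  --location eastus \
  --image microsoft-dsvm:ubuntu-hpc:1804:latest
```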

Learn more
