Oracle provides cloud service with Nvidia A100 GPUs
Oracle is making Nvidia’s A100 GPUs available through its Oracle Cloud Infrastructure. Users can scale high-performance computing workloads to more than 500 GPUs by connecting instances over a Mellanox RDMA network.
The GPU4.8 instances of the Oracle Cloud Infrastructure consist of eight A100 Tensor Core GPUs, combined with an AMD Rome processor with 64 cores and a clock speed of 2.9GHz, 2048GB of RAM and 24TB of NVMe storage. The instances can be part of Oracle’s Cluster Network. Customers can form clusters of more than 500 GPUs via Nvidia’s Mellanox networking, which uses RDMA over Converged Ethernet, or RoCE.
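To put those figures in perspective, the short Python sketch below estimates how many eight-GPU instances a cluster of a given size would need and what the combined RAM and NVMe capacity would be. The per-instance numbers are the ones quoted above; the function names and the 500-GPU target are illustrative assumptions, not part of any Oracle tool or API.

```python
import math

# Per-instance figures as quoted in the article for the GPU4.8 shape.
GPUS_PER_INSTANCE = 8
RAM_GB_PER_INSTANCE = 2048
NVME_TB_PER_INSTANCE = 24


def instances_needed(target_gpus: int) -> int:
    """Smallest number of whole instances that provides at least target_gpus GPUs."""
    if target_gpus <= 0:
        raise ValueError("target_gpus must be positive")
    return math.ceil(target_gpus / GPUS_PER_INSTANCE)


def cluster_totals(instances: int) -> dict:
    """Aggregate GPU count, RAM and NVMe capacity for a cluster of identical instances."""
    return {
        "gpus": instances * GPUS_PER_INSTANCE,
        "ram_gb": instances * RAM_GB_PER_INSTANCE,
        "nvme_tb": instances * NVME_TB_PER_INSTANCE,
    }


if __name__ == "__main__":
    # Hypothetical target: a cluster of at least 500 GPUs.
    n = instances_needed(500)
    print(n, cluster_totals(n))
```

Under these assumptions, reaching 500 GPUs takes 63 instances, which yields 504 GPUs in total.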
RDMA (remote direct memory access) makes it possible to access a machine’s memory without interrupting the CPU, and thanks to GPUDirect this also works with GPU memory. RoCE carries that RDMA traffic over Ethernet. The result is direct peer-to-peer connections between the GPU memory of the A100 cards, with low latency and less load on the CPU.
In addition, customers will be able to deploy one, two or four GPUs per virtual machine in the coming months. Oracle’s offer is aimed at companies in the automotive and aerospace industries, among others, which can use the GPU computing power for simulations and complex models.
Oracle is not the first to integrate the Nvidia A100 into its cloud architecture. Google announced its A2 VM family of instances last summer, with up to 16 A100 GPUs. Nvidia announced the A100 in May as the first GPU based on its Ampere architecture. The A100 is intended for high-performance computing, artificial intelligence and other data center applications.