Nvidia announces Tesla V100 accelerator with Volta GPU
Nvidia has announced the Tesla V100, an accelerator for GPGPU applications such as deep learning. It is built around a GPU based on the new Volta architecture, which includes dedicated Tensor Cores for training neural networks.
The Tesla V100’s GV100 GPU measures 815 mm², contains 21.1 billion transistors and is manufactured on TSMC’s 12 nm FinFET process. The chip is considerably larger than the Pascal-based GP100, which had a die area of 610 mm². The new Volta GPU has 5120 CUDA cores for FP32 work and is paired with 16 GB of Samsung HBM2 memory, which has a bandwidth of 900 GB/s.
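The 900 GB/s figure follows from the width of the memory bus. A minimal sketch of the arithmetic; note the per-pin data rate is an assumption on our part, as the article does not state it:

```python
# Sketch: deriving the Tesla V100's ~900 GB/s HBM2 bandwidth figure.
# Assumption (not stated in the article): the HBM2 stacks run at
# roughly 1.75 Gbit/s per pin.
bus_width_bits = 4096      # 4096-bit HBM2 memory interface
pin_speed_gbps = 1.75      # assumed per-pin data rate, Gbit/s

bandwidth_gb_s = bus_width_bits * pin_speed_gbps / 8  # bits -> bytes
print(f"{bandwidth_gb_s:.0f} GB/s")  # -> 896 GB/s, marketed as ~900 GB/s
```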
Nvidia has reworked the architecture of the streaming multiprocessors and, in its own words, optimized it for deep learning. The GPU maker has done this by adding new Tensor Cores to the design, units specialized in training neural networks. In total, the GV100 GPU has 640 of these Tensor Cores: eight per SM. Nvidia claims huge performance gains in applications that can take advantage of them; in regular FP32 and FP64 calculations, the GV100 is about 1.5 times faster than the GP100.
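The headline throughput claims can be reproduced from the core counts and the 1455 MHz boost clock. A rough sketch, assuming one fused multiply-add (2 FLOPs) per CUDA core per clock and one 4×4×4 matrix multiply-accumulate (64 FMAs, 128 FLOPs) per Tensor Core per clock:

```python
# Sketch: reproducing Nvidia's peak-throughput claims from spec-sheet numbers.
boost_clock_hz = 1.455e9   # 1455 MHz GPU boost clock

fp32_cores = 5120
fp64_cores = 2560
tensor_cores = 640

# One FMA (2 FLOPs) per CUDA core per clock.
fp32_tflops = fp32_cores * 2 * boost_clock_hz / 1e12   # ~14.9, marketed as 15 TFLOPS
fp64_tflops = fp64_cores * 2 * boost_clock_hz / 1e12   # ~7.45, marketed as 7.5 TFLOPS

# One 4x4x4 matrix multiply-accumulate per Tensor Core per clock:
# 64 FMAs = 128 FLOPs.
tensor_tflops = tensor_cores * 128 * boost_clock_hz / 1e12  # ~119, marketed as 120 TFLOPS

print(round(fp32_tflops, 1), round(fp64_tflops, 2), round(tensor_tflops))
```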
That speed gain largely follows from the sheer size of the GV100 GPU. The Volta chip has more cores and 2 MB more L2 cache. It also has a 20 MB register file spread across its SMs, which the chip can access at 80 TB/s; on the GP100, the register file totals 14 MB. According to Nvidia, power consumption is unchanged, with a TDP of 300 watts.
Nvidia CEO Jen-Hsun Huang announced the Tesla V100 at the company’s GPU Technology Conference, and extensive information about the Volta GPU has appeared on Nvidia’s dev blog. It is the first time Nvidia has provided details about Volta, the successor to the current Pascal architecture. Initially, Volta will be used in the Tesla V100 accelerator, which will hit the market in the third quarter. Nvidia followed the same path with Pascal, which first appeared in the Tesla P100.
Nvidia will also release server systems containing multiple of the cards, as it previously did with the Pascal-based Tesla P100. The DGX-1V will be available in the fourth quarter for $149,000 and will be equipped with eight Tesla V100 cards. Nvidia is also bringing a smaller version that can serve as a ‘personal supercomputer’: the $69,000 DGX Station. That machine has four Tesla V100 cards, is water-cooled and has a 1500 W power supply.
Nvidia has not yet announced anything about GeForce video cards for gamers with a Volta GPU. Consumer cards with the new GPU will probably arrive next year, but in modified configurations that are less focused on tasks such as deep learning.
| Tesla accelerators | Tesla V100 | Tesla P100 |
| --- | --- | --- |
| GPU | GV100 (Volta) | GP100 (Pascal) |
| SMs | 80 | 56 |
| TPCs | 40 | 28 |
| FP32 CUDA cores / SM | 64 | 64 |
| FP32 CUDA cores / GPU | 5120 | 3584 |
| FP64 CUDA cores / SM | 32 | 32 |
| FP64 CUDA cores / GPU | 2560 | 1792 |
| Tensor Cores / SM | 8 | – |
| Tensor Cores / GPU | 640 | – |
| Base clock | unknown | 1328 MHz |
| GPU boost clock | 1455 MHz | 1480 MHz |
| Single precision | 15 TFLOPS | 10.6 TFLOPS |
| Double precision | 7.5 TFLOPS | 5.3 TFLOPS |
| Tensor Core performance | 120 TFLOPS | – |
| Texture units | 320 | 224 |
| Memory interface | 4096-bit HBM2 | 4096-bit HBM2 |
| Memory size | 16 GB | 16 GB |
| L2 cache | 6144 KB | 4096 KB |
| Register file size / SM | 256 KB | 256 KB |
| Register file size / GPU | 20480 KB | 14336 KB |
| TDP | 300 watts | 300 watts |
| Transistors | 21.1 billion | 15.3 billion |
| GPU die size | 815 mm² | 610 mm² |
| Manufacturing process | 12 nm FFN (TSMC) | 16 nm FinFET+ (TSMC) |