Google shows 100-petaflops water-cooled TPU v3 pods for machine learning
Google has shown the third generation of its Tensor Processing Unit for machine learning. Systems with the new chips for Google Cloud are more than eight times as powerful as those with the current TPU v2 generation. The chips are water-cooled to keep temperatures in check.
A pod with TPU v3 chips offers 100 petaflops of computing power, while a variant with TPU v2 chips offers 11.5 petaflops. It is not yet known how many chips the upcoming TPU v3 pods will contain or how much memory they will have. In any case, the increase in the pods' computing power is not due solely to the new chip architecture, but also to the systems being larger. A current TPU v2 pod consists of 64 devices, each with four ASICs that in turn have two cores. Those pods have 4TB of high-bandwidth memory.
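The published figures can be cross-checked with some quick arithmetic. This is just a sketch based on the numbers in this article: 64 devices per TPU v2 pod, and the 180 teraflops that Google quotes for a single publicly available Cloud TPU device (one board with four ASICs).

```python
# Cross-check of the published TPU v2 pod figures (a sketch based on
# the numbers quoted in this article, not official Google math).

DEVICES_PER_V2_POD = 64   # devices (boards) per TPU v2 pod
TFLOPS_PER_DEVICE = 180   # teraflops per Cloud TPU device (four ASICs)

# 64 devices x 180 teraflops = 11.52 petaflops, matching the
# quoted ~11.5 petaflops for a TPU v2 pod.
pod_v2_pflops = DEVICES_PER_V2_POD * TFLOPS_PER_DEVICE / 1000
print(pod_v2_pflops)

# The claimed 100-petaflops TPU v3 pod versus a v2 pod works out to
# roughly 8.7x, consistent with "more than eight times as powerful".
speedup = 100 / pod_v2_pflops
print(round(speedup, 1))
```

Note that part of that factor comes from larger pods rather than faster chips alone, as the article points out.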
Sundar Pichai showed off the TPU v3 chips during the Google I/O keynote, without giving further details about the chips or the pods. He did say that the TPUs are now so powerful that the upcoming boards with four TPU v3 chips will need water cooling.
According to Zak Stone, Google's product manager for TensorFlow and Cloud TPU, the arrival of the next generation will affect all of Google's services. Machine-learning models are becoming more accurate, he says, but demand ever more computing power, which is why Google started designing its own specialized hardware. Stone calls the pods "supercomputers for machine learning."
The company has used the first version of the TPU, in service since 2016, for search results, image recognition in Photos, and speech recognition. The pods with TPU v2 chips that Google now uses will also be made available to customers, who can use these supercomputers via Google Cloud to train their own machine-learning models. Currently, Google publicly offers Cloud TPUs of 180 teraflops.