Cerebras shows chip with 1.2 trillion transistors and 400,000 cores
The American company Cerebras has unveiled its Wafer Scale Engine: a chip with a surface area of 46,225 mm² made up of 1.2 trillion transistors. The oversized chip is intended for artificial intelligence calculations.
The Wafer Scale Engine, so named because it takes up practically an entire wafer, measures 21.5×21.5 centimeters. Cerebras Systems compares the chip, with its 46,225 mm² area and 1.2 trillion transistors, against Nvidia’s Volta V100, which packs 21.1 billion transistors into 815 mm². TSMC manufactures the Cerebras chip on a 16nm process.
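The comparison can be checked with a little arithmetic. The sketch below uses only the figures quoted in the article; the variable names are illustrative and the densities are simple averages over the whole die.

```python
# Figures quoted in the article; names are illustrative only.
WSE_SIDE_MM = 215.0          # 21.5 cm per side
WSE_TRANSISTORS = 1.2e12     # 1.2 trillion
V100_AREA_MM2 = 815.0
V100_TRANSISTORS = 21.1e9    # 21.1 billion

wse_area_mm2 = WSE_SIDE_MM ** 2
area_ratio = wse_area_mm2 / V100_AREA_MM2
transistor_ratio = WSE_TRANSISTORS / V100_TRANSISTORS

# Average transistor density in transistors per mm².
wse_density = WSE_TRANSISTORS / wse_area_mm2
v100_density = V100_TRANSISTORS / V100_AREA_MM2

print(f"WSE area:          {wse_area_mm2:,.0f} mm²")
print(f"Area ratio:        {area_ratio:.1f}x")
print(f"Transistor ratio:  {transistor_ratio:.1f}x")
print(f"WSE density:       {wse_density / 1e6:.1f} million/mm²")
print(f"V100 density:      {v100_density / 1e6:.1f} million/mm²")
```

Both ratios come out at roughly 57, so the average transistor density of the two chips is nearly the same. The Wafer Scale Engine's advantage in these figures comes from its size, not from denser transistors.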
The chip’s four hundred thousand cores are arranged in a mesh network with 18GB of distributed SRAM. The chip has no cache, which according to the manufacturer is one of its advantages over GPUs for AI calculations. “Deep learning calculations require computing power with frequent access to data. This requires that computing cores and memory are close to each other. That is not the case with GPUs where most of the memory is slow and far away.”
Thanks to this architecture, the memory bandwidth of the Wafer Scale Engine is 9 petabytes per second, and the combined bandwidth of the on-chip interconnect is 100 petabits per second. According to the makers, the chip’s so-called Sparse Linear Algebra Cores are specially designed for AI calculations, and the programmable cores leave out superfluous parts to minimize overhead.
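To put the bandwidth figures on a per-core scale, the sketch below divides the quoted 9 petabytes per second evenly over 400,000 cores and converts the 100 petabits per second into bytes. The even split is a simplifying assumption made here for illustration, not something Cerebras stated.

```python
# Figures quoted in the article; an even split across cores is an assumption.
MEMORY_BW_BYTES_PER_S = 9e15      # 9 petabytes per second
COMBINED_BW_BITS_PER_S = 100e15   # 100 petabits per second
CORES = 400_000

# Memory bandwidth available to each core, in gigabytes per second.
per_core_gb_per_s = MEMORY_BW_BYTES_PER_S / CORES / 1e9

# The combined bandwidth expressed in petabytes per second (8 bits per byte).
combined_pb_per_s = COMBINED_BW_BITS_PER_S / 8 / 1e15

print(f"Memory bandwidth per core: {per_core_gb_per_s:.1f} GB/s")
print(f"Combined bandwidth:        {combined_pb_per_s:.1f} PB/s")
```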
The chip holds 84 ‘tiles’ in a 7 by 12 grid. Each tile contains 4800 cores, each with 48 kilobytes of SRAM. Cerebras’ compiler converts TensorFlow and PyTorch deep learning models into machine code for the engine to process, and microcode libraries further distribute computational tasks across the network, EE Times writes.
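The tile figures can be cross-checked against the headline numbers of 400,000 cores and 18GB of SRAM. The sketch below assumes a kilobyte of 1024 bytes, which the article does not specify; the names are illustrative.

```python
# Figures quoted in the article; names are illustrative only.
TILE_ROWS, TILE_COLS = 7, 12
CORES_PER_TILE = 4800
SRAM_PER_CORE_KIB = 48   # assumption: 1 kilobyte = 1024 bytes

tiles = TILE_ROWS * TILE_COLS
total_cores = tiles * CORES_PER_TILE

# Total on-chip SRAM in GiB (1 GiB = 1024 * 1024 KiB).
total_sram_gib = total_cores * SRAM_PER_CORE_KIB / 1024 ** 2

print(f"Tiles:       {tiles}")
print(f"Total cores: {total_cores:,}")
print(f"Total SRAM:  {total_sram_gib:.2f} GiB")
```

The grid yields 403,200 cores, which rounds to the advertised 400,000, and about 18.5 GiB of SRAM, in line with the quoted 18GB.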
According to Cerebras, producing a chip of this size inevitably leads to defects, but the company did not go into detail about yields during its presentation at the Hot Chips conference, which AnandTech attended. The company did say that the work of non-functional cores can be taken over by surrounding cores. Because so many cores are packed onto a single surface, traditional air cooling is insufficient and water cooling is required, according to the maker. Many details about the chip, such as clock speeds and price, are still unknown.