AMD Releases New Zen Architecture Information and Performance Data
AMD revealed details about its new Zen architecture at an event in San Francisco. It also showed benchmark results in which Summit Ridge, the high-end processor series based on this architecture, keeps pace with Intel’s Broadwell-E.
During the event, AMD took a closer look at the architecture behind Zen, which has to scale from laptop to server. Naturally, AMD worked on performance, since the Bulldozer architecture could not compete with Intel’s processors, but power consumption was also addressed.
The Bulldozer design used in AMD FX processors was built from modules in which two integer units shared a single floating-point unit, so every two ‘cores’ had only one FPU between them. Each Zen core has its own integer and floating-point unit. Each integer unit has four ALUs for computation and two address generation units (AGUs) for memory addressing. The FPUs have two addition and two multiplication units.
To keep these execution units fed with instructions, AMD has improved the branch prediction, which predicts which instructions are likely to follow so that they can be prepared in advance. A micro-op cache has also been added, which Bulldozer lacked and which should provide significant gains. Zen can place four instructions in the queue per clock cycle, after which up to six micro-ops can be dispatched to the schedulers. According to AMD, that delivers a 50 to 75 percent improvement in the pipeline.
The caches that supply the cores have also been improved. The very fast L1 level has separate instruction and data caches, while AMD uses a rather large unified L2 cache that holds both data and instructions. This 512KB L2 cache is slower than the L1 caches, which are 32KB for data and 64KB for instructions. There is some uncertainty about the L3 cache, which would be 8MB in size and shared between four or eight cores. Compared to the Bulldozer architecture, the L1 caches are twice as fast and the data cache has doubled in size. Combined, the caches should provide five times more bandwidth per core.
Unlike Bulldozer’s two pseudo-cores per module, which had no equivalent of Intel’s Hyper-Threading, the Zen cores support true simultaneous multithreading (SMT), in which each core works on two threads at the same time.
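The cache figures above can be turned into a rough total for an eight-core chip. The sketch below is only back-of-the-envelope arithmetic based on the sizes quoted in this article; the names `ZenCacheConfig`, `total_cache_bytes` and `logical_cores` are hypothetical, and because it is not yet clear whether one 8MB L3 block serves four or eight cores, both cases are computed.

```python
from dataclasses import dataclass

KB = 1024
MB = 1024 * KB


@dataclass(frozen=True)
class ZenCacheConfig:
    """Cache sizes as quoted in the article (not confirmed final specs)."""
    l1_instruction: int = 64 * KB   # per core
    l1_data: int = 32 * KB          # per core
    l2_unified: int = 512 * KB      # per core, holds data and instructions
    l3_block: int = 8 * MB          # one shared L3 block


def total_cache_bytes(cores: int, cores_per_l3: int,
                      cfg: ZenCacheConfig = ZenCacheConfig()) -> int:
    """Total cache when each L3 block is shared by `cores_per_l3` cores.

    `cores_per_l3` is an assumption: the article mentions both 4 and 8.
    """
    if cores % cores_per_l3 != 0:
        raise ValueError("cores must be a multiple of cores_per_l3")
    private = cores * (cfg.l1_instruction + cfg.l1_data + cfg.l2_unified)
    shared = (cores // cores_per_l3) * cfg.l3_block
    return private + shared


def logical_cores(physical_cores: int, threads_per_core: int = 2) -> int:
    """Logical cores exposed when every core runs `threads_per_core` threads."""
    return physical_cores * threads_per_core


if __name__ == "__main__":
    for share in (4, 8):
        total = total_cache_bytes(cores=8, cores_per_l3=share)
        print(f"L3 shared by {share} cores: {total / MB:.2f} MB of cache in total")
    print("Logical cores with SMT:", logical_cores(8))
```

Under these assumptions an eight-core chip would carry 4.75MB of private L1 and L2 cache, plus either 8MB or 16MB of L3 depending on how the L3 is shared.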
A Zen Summit Ridge processor with eight physical cores therefore presents sixteen logical cores. To tackle energy consumption, traditionally another tricky issue for AMD, the company switched to a 14nm FinFET process. That process alone can provide significant savings compared to the 32nm processors of the past, but AMD has also applied aggressive clock gating to cut the power drawn by unused parts of the processor, and made many other tweaks to limit consumption. That should make the Zen cores a lot more economical than Excavator cores, while the IPC would increase by 40 percent. Combined, the Zen cores would take a big step forward in performance per watt.
During the demonstration, an engineering sample of an eight-core Zen Summit Ridge processor was clocked at 3GHz. As a comparison, AMD showed a Broadwell-E processor, the Core i7-6900K, which normally runs at 3.2GHz but was also set to 3GHz. Both CPUs have eight cores with Hyper-Threading or SMT. In a Blender benchmark, which relies heavily on floating-point calculations, the Zen processor proved slightly faster than Intel’s processor.
The first Zen processors for the high-end Summit Ridge platform should be released by the end of this year, followed by mainstream and mobile processors. Only then can we test the actual performance of the new architecture with our own benchmarks.
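To put the clock-matched demonstration and the claimed IPC gain into rough numbers, the sketch below uses a simple first-order model in which single-thread throughput scales with IPC times clock speed. That model is an assumption for illustration, not something AMD presented, and the function names are hypothetical; it ignores memory behaviour, turbo clocks and differences between workloads.

```python
def relative_throughput(ipc_ratio: float, clock_ratio: float = 1.0) -> float:
    """First-order estimate: throughput scales with IPC x clock (assumption)."""
    return ipc_ratio * clock_ratio


def clock_change_percent(stock_ghz: float, test_ghz: float) -> float:
    """Percentage change in clock speed going from stock to test settings."""
    return (test_ghz - stock_ghz) / stock_ghz * 100.0


if __name__ == "__main__":
    # Intel chip in the demo: 3.2GHz stock, set to 3GHz for the comparison.
    change = clock_change_percent(stock_ghz=3.2, test_ghz=3.0)
    print(f"Intel clock change for the demo: {change:.2f}%")

    # AMD's claim: 40 percent higher IPC than Excavator, here at equal clocks.
    gain = relative_throughput(ipc_ratio=1.40)
    print(f"Zen vs Excavator per-core throughput at equal clock: {gain:.2f}x")
```

In this model, setting the Intel chip to 3GHz costs it about 6 percent of its stock clock speed, which is worth keeping in mind when reading the Blender result.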