Arm: Cortex-A77 delivers 20 percent higher single-threaded performance than Cortex-A76
Arm has announced the Cortex-A77, the microarchitecture on which next year’s high-end Snapdragon and Exynos SoCs will be based. With this design, Arm has focused on increasing the number of instructions per clock cycle, or IPC.
According to Arm, the IPC improvements give the Cortex-A77 20 percent higher single-threaded performance than the A76. That figure applies at the same clock speed, with both chips produced on a 7nm process. By comparison, the performance gain of the A76 over the A75 was 35 percent, but that claim also included a higher clock speed and the switch from 10nm to 7nm. This time, Arm compares the A76 on the first 7nm generation with the A77 on the optimized second 7nm generation.
The A77 performs 20 percent better in integer calculations and 35 percent better in floating-point calculations. In addition, memory bandwidth has increased by 15 percent. Clock speeds have not increased and no improvements in battery life are expected. Arm focused strongly on that with last year’s A76. Furthermore, the performance improvements come at a price: the A77 is 17 percent larger than the A76.
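The figures above can be put side by side with a little arithmetic. The sketch below assumes the common simplification that single-threaded performance scales as IPC times clock speed, and it treats performance per unit of core size as a rough efficiency measure. Neither the model nor the function name comes from Arm; they are illustrative assumptions only.

```python
def relative_performance(ipc_ratio: float, clock_ratio: float) -> float:
    """Simplified model (an assumption): performance ~ IPC x clock speed."""
    return ipc_ratio * clock_ratio


# Arm's claim for A77 versus A76: same clock speed, so the whole gain is IPC.
a77_vs_a76 = relative_performance(ipc_ratio=1.20, clock_ratio=1.0)

# The A77 is 17 percent larger than the A76, according to the article.
size_ratio = 1.17

# Hypothetical metric: performance gained per unit of added core size.
perf_per_size = a77_vs_a76 / size_ratio

print(f"A77 vs A76 performance: {a77_vs_a76:.2f}x")
print(f"Performance per unit of size: {perf_per_size:.3f}x")
```

Under these assumptions, performance grows slightly faster than core size, so the extra silicon is roughly break-even in performance per unit of size.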
The performance improvements have been achieved, among other things, by introducing a macro-op cache, which should provide higher fetch bandwidth and lower latency. Arm has also added a fourth arithmetic logic unit, or ALU, and a second branch unit. Finally, there is a second pipeline for hardware acceleration of AES encryption.
The Cortex-A77 is intended for systems-on-a-chip for smartphones, laptops and other devices. Judging by the time between announcement and market launch in earlier generations, the first of these are likely to reach the market next year.
As usual, Arm is also announcing a new generation of Mali GPUs. With the Mali-G77, Arm is switching from Bifrost, the architecture it used for three years, to the new Valhall architecture. Overall, a 40 percent improvement in peak performance in mobile devices can be expected, and energy efficiency is on average 1.3 times higher. Arm also points to a 60 percent improvement in machine learning, which is important in image and sound recognition, among other things.
Arm has made significant changes to the architecture. The G77, for example, includes two units of 16 FMAs for a total of 32 FMA lanes for floating-point calculations, where the G76 had 24. In addition, the throughput of the texture mapper has been doubled.
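As a quick check on the FMA figures, the sketch below multiplies out the lane counts quoted above and compares them with the G76 total. The function name is hypothetical, and the result is only a ratio of lane counts. It says nothing about real-world performance, which also depends on clock speed, utilization and memory behavior.

```python
def fma_lanes(units: int, lanes_per_unit: int) -> int:
    """Total FMA lanes, given the number of units and the lanes in each."""
    return units * lanes_per_unit


g77_lanes = fma_lanes(units=2, lanes_per_unit=16)  # two units of 16 FMAs
g76_lanes = 24                                     # total quoted for the G76

lane_ratio = g77_lanes / g76_lanes

print(f"G77 FMA lanes: {g77_lanes}")
print(f"G77 vs G76 lane count: {lane_ratio:.2f}x")
```

On lane count alone, the G77 has about a third more FMA capacity than the G76.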