Bug in AMD EPYC 7002 can cause server processor to crash after 1044 days
A bug in the AMD EPYC 7002 server processors can cause one of the computing cores to crash after 1044 days of operation. According to AMD documentation, restarting the server before that period elapses is enough to avoid the problem. The American company has not planned a fix.
The editorial staff of Tom’s Hardware discovered the bug in the revision guide for the EPYC 7002 server processors, a document AMD released in April. It states that a computing core of the EPYC 7002 is at risk of crashing after approximately 1044 days of uptime because it can then no longer wake up from the CC6 sleep mode.
According to AMD, the exact period depends, among other things, on the reference clock with which the processor keeps track of time. The US company states that users can avoid the bug by disabling the EPYC 7002’s CC6 sleep mode, or by restarting their server before it reaches approximately 1044 days of operating time. There is some discussion on the internet about that figure. According to a Reddit user, the actual period is approximately 1042 days of operating time.
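To make the dependence on the reference clock concrete, here is a minimal Python sketch. It rests on an assumption the article does not confirm: that the limit corresponds to a counter running out after 2^53 ticks of a 100 MHz reference clock. Those two values, and the function names `days_for_ticks` and `remaining_days`, are purely illustrative; they happen to yield roughly 1042.5 days, close to the figure the Reddit user mentions.

```python
SECONDS_PER_DAY = 86_400

# Hypothetical values, not taken from AMD's documentation.
ASSUMED_TICKS = 2 ** 53
ASSUMED_REFCLK_HZ = 100_000_000  # 100 MHz


def days_for_ticks(ticks: int, clock_hz: float) -> float:
    """Convert a tick count at a given clock frequency into days."""
    return ticks / clock_hz / SECONDS_PER_DAY


def remaining_days(uptime_seconds: float, limit_days: float) -> float:
    """Days of uptime left before the given limit is reached."""
    return limit_days - uptime_seconds / SECONDS_PER_DAY


def read_uptime_seconds(path: str = "/proc/uptime") -> float:
    """Read the system uptime in seconds (Linux only)."""
    with open(path) as handle:
        return float(handle.read().split()[0])
```

On a Linux server, `remaining_days(read_uptime_seconds(), 1042.0)` would give a conservative estimate of how long a machine can keep running before a planned reboot is due. A slightly faster or slower reference clock shifts the result of `days_for_ticks`, which fits AMD's statement that the exact period varies.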
The editors of Tom’s Hardware report that it is not unusual for (server) processors to contain bugs. These are usually resolved, although according to Tom’s Hardware that depends on the severity of the bug. According to the editors, some bugs or vulnerabilities only surface over time and are therefore difficult to predict in advance.
AMD released the EPYC 7002 server processors in 2019. At that time, the company released a total of nineteen processors in the EPYC 7002 series. Like the Ryzen 3000 processors, they are based on the Zen 2 architecture, with CPU chiplets built on a 7 nm process.
Update, 11.45 am: ‘The American company will not provide a fix for the time being.’ in the introduction has been changed to ‘The American company has not planned a fix.’