The lead up to Meltdown and Spectre – What happened behind the scenes?

By Faryad On Aug 15, 2018

Much has been written about Spectre and Meltdown, which emerged early this year after a somewhat hurried announcement by several companies. They had worked hard behind the scenes to come up with solutions as quickly as possible. These vulnerabilities, referred to by some as the most serious of the past decade, were bound to feature at the annual Black Hat security conference in Las Vegas. No fewer than three presentations were devoted to the topic. This article draws on those presentations, which covered some of the events leading up to the announcement. Together they offer an interesting look at what went on behind the scenes of these high-profile discoveries and at the coordination that took place between some of those involved.

One of the presentations was a panel session with representatives from Google, Microsoft and Red Hat, led by someone from the CERT at Carnegie Mellon University in the US. That is indeed the CERT that initially wrote that the only countermeasure against Spectre and Meltdown was to “replace the CPU hardware”, but later turned this into a more nuanced recommendation. A possible reason for that drastic advice emerges from the panel discussion.

Google

Matt Linton, sporting a pink mohawk and boxer shorts, is on hand to explain Google’s perspective from his position as a senior security engineer, aka chaos specialist. It is not surprising that he kicks off the discussion, since Google’s Project Zero was heavily involved in the discovery of the vulnerabilities. The timeline of the disclosure process began in June 2017, when the researchers behind the discovery decided to notify the CPU manufacturers of their findings.

Linton: “It is not the case that Project Zero members inform Google in advance about vulnerabilities they find. In this case, Project Zero informed Intel, stating that Intel then also had to inform Google. Due to a miscommunication, that did not happen right away.” When it did, Google engaged senior employees from various departments to determine the severity and impact of the vulnerabilities. According to Linton, a long back-and-forth with Intel followed. “There were various countermeasures being sent back and forth between Google and Intel all the time until October, but they were all shot down.”

At one point, a meeting took place of all the industry players involved. At that meeting, for example, Microsoft came up with a working proof-of-concept of an attack from the browser, says Linton. That is when the Chrome team got involved, some of whom had by then been working on the site isolation feature for five years. Linton asked the Chrome team how soon that feature could be ready, and they gave an estimate of three to four quarters to get to a beta release. At that time, the embargo date had already been set for January 9. According to Linton, the team eventually managed to complete the beta before January.

In the end, it proved impossible to keep the embargo until January 9, because there had already been several signals that something big was being worked on behind the scenes. These signals were eventually tied together by The Register. Subsequently, on January 2, the first proof-of-concept of a working attack appeared, which meant that the embargo had to be lifted, because waiting any longer was simply not an option. That happened on January 3.

Microsoft

Microsoft, represented on the panel by security response team leader Eric Doerr, brought in an outside expert to help determine the impact of the vulnerabilities. That expert was Anders Fogh of the security company G-Data. Doerr states that the story for Microsoft also started in June and that it was first necessary to determine what the company was dealing with. “We first checked if it was reproducible on Windows Server and we soon found out that it was indeed a real thing. Then we decided to involve people from different departments.”

He compares the process to peeling an onion, with each layer revealing more complexity. It soon became clear that major changes were needed within Microsoft’s products, which suddenly made the January deadline feel much closer. Doerr is positive about the meeting of the organizations involved, which he says was held in November 2017. “It’s funny to see how seldom such a meeting takes place,” he says. “I was blown away by the collaboration that took place at the meeting, because who shares vulnerability mitigations with their direct competitors?”

He cites as an example that Google unveiled a full countermeasure called retpoline. “It was a tipping point, where collaboration went up by a huge factor,” says Doerr. At the end of his introduction, he adds: “In the last two weeks before publication, I exchanged more messages with people working on solutions than with my wife.”

Red Hat

For Red Hat, represented by Christopher Robinson, aka CRob, things went a little differently. In November he received a call telling him to expect a call from Intel, so he had much less time to take action. “We still had a significant amount of work to do.” He underlines that Red Hat is an open source company that works with many external people. “Fun fact,” adds Robinson, “Nobody in open source likes to be told to do a certain job.” He is referring to the fact that people were assigned specific tasks to implement patches. “On January 3, when I thought I had a week left to write my documentation, I got a message asking if I was ready to go public in an hour.”

Who is allowed ‘in the tent’?

One of the questions raised during the discussion is how it was determined who was notified of the researchers’ findings. According to Red Hat’s Robinson, that consideration came down to whether someone could contribute to a solution. Doerr, from Microsoft, says the problem is that the more people are notified, the more likely it is that something will leak. “It is actually insane that this has been hidden for six months,” says Doerr. According to Googler Linton, the embargo was in the hands of the chipmakers. In his view, this meant that there were people who should have been ‘in the tent’ but were not. “Google used the following criterion: are you maintaining an operating system, are you working on a virtualization stack or drivers, then you belong.” Art Manion, senior vulnerability analyst for the American CERT and panel leader, also wants to know why he himself was not informed in advance. That would explain why the CERT came up with such drastic advice on the day the embargo was lifted.

Another issue raised is why the US government was not notified, while Chinese companies were. To that the Google representative responds: “That’s a very difficult part of the puzzle, where the question is whether an entity can contribute in a meaningful way.” He says that he also disagrees with the way in which this fact was framed on Twitter, among other places. By this he refers to the suggestion that informing Chinese companies, such as Lenovo, would be equivalent to informing the Chinese government. “It was necessary to notify Lenovo because it could push microcode updates to millions of people.” At that point, someone from Lenovo comes forward in the audience, stating that everything related to Spectre and Meltdown was coordinated from the US by a team of about 12 people, in addition to someone in Japan for ThinkPad. Lenovo was reported to have been notified on November 30.

Finally, the question is what people would have done differently and what they have learned. According to Robinson, there has been collaboration on variants of the original vulnerabilities in the meantime. This collaboration went considerably better, because it was now clear who had to contact whom. According to Microsoft, the close collaboration should have started much earlier.

What’s to come?

Some of the discoverers of the vulnerabilities, Daniel Gruss, Moritz Lipp and Michael Schwarz from Graz University of Technology, have their own presentation about Meltdown during Black Hat, in which they explain how it works in a kind of role-playing game. In it, they state, among other things, that the Exynos SoC in Samsung’s Galaxy S7 was also vulnerable to Meltdown and that the South Korean manufacturer released patches in July. They also expect that more attacks exploiting performance optimizations in processors will be found in the future. Given the variants that have surfaced so far, shown in a table below, the researchers are probably not wrong.

According to them, processor design needs to be rethought and a better balance between performance and security needs to be struck. “Think of the car industry, for example. We didn’t always get faster cars, but safety became more important,” illustrates one of them. In closing, they also want to clear up the misconception that Meltdown and Spectre are side-channel attacks. Although they use such a channel to transmit the leaked data, that does not mean that the entire attack should be classified as one.