Putting Surface Reliability in Perspective

This week’s blockbuster report on Surface reliability triggered some fascinating discussions. But now I have some new information to share.
To recap, Microsoft communicated the following points internally in the wake of a damning report from Consumer Reports, in which the trusted consumer advocacy organization removed its recommendation of Surface products:
- Microsoft acknowledges internally that Surface Book and Surface Pro 4 launched with unprecedented reliability problems. And the Consumer Reports data is skewed by that fact, as it should be, since so many existing Surface users did experience problems over the past year or more.
- Microsoft’s internal data shows that Surface reliability has improved since then. This is true of both the devices that were impacted by reliability issues and of new devices like Surface Laptop and the new Surface Pro.
- Microsoft’s internal data shows that Surface customer satisfaction rates are very high. I have made the case that customer satisfaction is not the same as reliability, but it’s fair to say, too, that these things are related.
I also theorized that the reliability issues are the real reason that Microsoft shipped Surface Laptop and the new Surface Pro sans modern technology like USB-C/Thunderbolt 3: It was afraid that using a new platform could trigger another round of reliability issues, so it stuck with the well-understood (if out-of-date) Surface Connect/USB 3.0 platform it used in prior-generation products instead. This is just my own theory, but I believe it’s supported by the facts.
Since then, I’ve received a bit more context about the information in my report. And I’d like to share that with you now.
First, Skylake.
Based on background conversations with multiple highly placed Microsoft executives in late 2015 and throughout 2016, I placed the blame for Surfacegate firmly in Intel’s lap. Microsoft, by being first out of the gate with this then-new chipset, suffered from an unprecedentedly buggy new generation of Core processors, with the now well-understood results. No other PC makers, I noted, experienced these kinds of issues.
But in my most recent report, I noted that a different trusted source at Microsoft had a different story: The real problem was Surface-specific custom drivers and settings that the Microsoft hardware team cooked up.
As it turns out, both of these stories are, in effect, true. And the blame for Surfacegate can be split somewhat evenly between Intel and Microsoft.
That is, Skylake really was very buggy. So buggy, in fact, that a key processor feature, its compatibility with the Windows 10 “Instant On” (previously Connected Standby) power management functionality, was literally broken. And other PC makers did experience some reliability issues with their Skylake-based PCs too. Just not to the level that Microsoft did.
And that is so because the other PC makers, long used to how Intel does things, simply did not enable Instant On in their Skylake-based PCs. They did not ship PCs using a feature that they knew would not work properly.
Microsoft, a relative newcomer to the PC business, still trusted Intel, its biggest partner, and believed the chipmaker when it said that the power management issues would be fixed. So it shipped Surface Book and Surface Pro 4 with Instant On enabled, even though it did not work. The theory was that Intel would quickly fix this issue. Which it did not.
This, I think, explains Microsoft’s anger at Intel. But it’s fair to note that Microsoft was naive to trust Intel too. I believe they did this because it would have been embarrassing to both companies for Microsoft to ship its first Windows 10 PCs without enabling a key new Windows 10 feature. But they paid the price.
Second, the Lenovo story.
I also buried a story, of sorts, when I noted that Microsoft CEO Satya Nadella met with key Lenovo executives and asked them how they were faring with all the Skylake issues. Lenovo was confused, I wrote. No one was having any issues, he was told. And then I theorized that this must have triggered some interesting conversations inside of Microsoft.
This story is true, and I now believe that the person Nadella spoke with was, in fact, the CEO of Lenovo. But regardless, some context will help explain why this conversation shouldn’t be surprising.
Inside of Lenovo, there is a large group of teams that works on the firm’s PCs. With every new Intel CPU generation, that group, like its counterparts at other PC makers, tests the new chips before implementing them in its products. And like the other PC makers, Lenovo, of course, discovered that Instant On wasn’t working. So it disabled that functionality and shipped PCs that were much more reliable than Microsoft’s.
But why would the CEO of Lenovo, or any other high-level executives at that firm, be aware of such implementation details for specific PC models or configurations? After all, there are bugs in all Intel chips, and those bugs are corrected with firmware issued later by Intel, over time, or by software made by the PC makers. From Lenovo’s perspective, it shipped new PCs like it always does, with whatever features. And those PCs achieved whatever level of reliability and market acceptance. To Lenovo’s leadership, everything proceeded normally. It’s no wonder they were confused by Nadella’s question.
Ultimately, what we’re left with here is that Microsoft suffered from some major reliability issues with Surface Book and Surface Pro 4. And it feels that it has corrected those issues since then, though we will need a lot more time and data before we know that to be true.
More important, Microsoft clearly cares deeply about Surface reliability and about its customers having a great experience. And that is quite heartening. I’ve seen other blogs try to undercut the Consumer Reports data, which is a losing strategy given that publication’s decades of unbiased experience testing consumer products. Microsoft, to its credit, is not engaging in that effort, at least not directly. (I can only imagine what cherry-picked data it might have provided to less sophisticated bloggers to help them make its case for them.) Instead, it accepts its role in the reliability issues of the past, and it contends that it will simply continue to try to do better.
And that, folks, is the Microsoft I know.