Steven Levy has published an in-depth article about Apple's artificial intelligence and machine learning efforts, after meeting with senior executives Craig Federighi, Eddy Cue, Phil Schiller, and two Siri scientists at the company's headquarters.
Apple gave Levy a closer look at how deeply machine learning is integrated into its software and services, led by Siri, which the article reveals has been powered by a neural network-based system since 2014. Apple said the backend change greatly improved the personal assistant's accuracy.
"This was one of those things where the jump was so significant that you do the test again to make sure that somebody didn’t drop a decimal place," says Eddy Cue, Apple’s senior vice president of internet software and services.
Alex Acero, who leads the Siri speech team at Apple, said Siri's error rate has been cut by a factor of two in every language, and by more than that in many cases.
“The error rate has been cut by a factor of two in all the languages, more than a factor of two in many cases,” says Acero. “That’s mostly due to deep learning and the way we have optimized it — not just the algorithm itself but in the context of the whole end-to-end product.”
Acero told Levy he was able to work directly with Apple's silicon design team and the engineers who write the firmware for iOS devices to maximize the neural network's performance. Federighi added that building both hardware and software gives Apple an "incredible advantage" in the space.
"It's not just the silicon," adds Federighi. "It's how many microphones we put on the device, where we place the microphones. How we tune the hardware and those mics and the software stack that does the audio processing. It's all of those pieces in concert. It's an incredible advantage versus those who have to build some software and then just see what happens."
Apple's machine learning efforts extend far beyond Siri, as evidenced by several examples shared by Levy:
You see it when the phone identifies a caller who isn’t in your contact list (but did email you recently). Or when you swipe on your screen to get a shortlist of the apps that you are most likely to open next. Or when you get a reminder of an appointment that you never got around to putting into your calendar. Or when a map location pops up for the hotel you’ve reserved, before you type it in. Or when the phone points you to where you parked your car, even though you never asked it to. These are all techniques either made possible or greatly enhanced by Apple’s adoption of deep learning and neural nets.
Machine learning also underpins the Apple Pencil, where a "palm rejection" model lets the screen distinguish between a swipe, a touch, and a pencil input:
In order for Apple to include its version of a high-tech stylus, it had to deal with the fact that when people wrote on the device, the bottom of their hand would invariably brush the touch screen, causing all sorts of digital havoc. Using a machine learning model for “palm rejection” enabled the screen sensor to detect the difference between a swipe, a touch, and a pencil input with a very high degree of accuracy. “If this doesn’t work rock solid, this is not a good piece of paper for me to write on anymore — and Pencil is not a good product,” says Federighi. If you love your Pencil, thank machine learning.
On the iPhone, machine learning is enabled by a localized dynamic cache or "knowledge base" that Apple says is around 200MB in size, depending on how much personal information is stored.
This includes information about app usage, interactions with other people, neural net processing, a speech modeler, and "natural language event modeling." It also has data used for the neural nets that power object recognition, face recognition, and scene classification.
"It's a compact, but quite thorough knowledge base, with hundreds of thousands of locations and entities. We localize it because we know where you are," says Federighi. This knowledge base is tapped by all of Apple's apps, including the Spotlight search app, Maps, and Safari. It helps on auto-correct. "And it's working continuously in the background," he says.
For example, a neural network-trained system analyzes the words iPhone users type on the standard QuickType keyboard.
Other information Apple stores on devices includes probably the most personal data that Apple captures: the words people type using the standard iPhone QuickType keyboard. By using a neural network-trained system that watches while you type, Apple can detect key events and items like flight information, contacts, and appointments — but information itself stays on your phone.
Apple insists that much of this machine learning happens entirely on the device, with no personal information sent back to its servers.
"Some people perceive that we can't do these things with AI because we don't have the data," says Cue. "But we have found ways to get that data we need while still maintaining privacy. That's the bottom line."
"We keep some of the most sensitive things where the ML is occurring entirely local to the device," Federighi says. As an example, he cites app suggestions, the icons that appear when you swipe right.
The full-length article on Backchannel provides several more details about how machine learning and artificial intelligence work at Apple.