Contractors who work on Siri regularly hear confidential medical information, drug deals, recordings of couples having sex, and other private material, according to a report from The Guardian that shares details provided by a contractor who works on one of Apple's Siri teams.
The employee who shared the info is one of many contractors around the world who listen to Siri voice data collected from customers to improve the Siri voice experience and help Siri better understand incoming commands and queries.
According to The Guardian, the employee shared the information because he or she was concerned about Apple's lack of disclosure about the human oversight, though Apple has confirmed several times in the past that human review takes place, and the practice has been outlined in past reports as well.
The whistleblower said: "There have been countless instances of recordings featuring private discussions between doctors and patients, business deals, seemingly criminal dealings, sexual encounters and so on. These recordings are accompanied by user data showing location, contact details, and app data."
In a statement, Apple confirmed to The Guardian that a small number of anonymized Siri requests are analyzed for the purpose of improving Siri. A small, random subset (less than 1 percent) of daily Siri activations is used for grading, with each clip lasting only a few seconds.
"A small portion of Siri requests are analysed to improve Siri and dictation. User requests are not associated with the user's Apple ID. Siri responses are analysed in secure facilities and all reviewers are under the obligation to adhere to Apple's strict confidentiality requirements."
Apple has not made its human-based Siri analysis a secret, but its extensive privacy terms don't appear to explicitly state that Siri information is listened to by humans. The employee said that Apple should "reveal to users" that human oversight exists.
The contractor who spoke to The Guardian said that "the regularity of accidental triggers on the watch is incredibly high," and that some snippets were up to 30 seconds in length. Employees listening to Siri recordings are encouraged to report accidental activations as a technical problem, but aren't told to report on the content of what they hear.
Apple has an extensive privacy policy related to Siri and says it anonymizes all incoming data so that it's not linked to an Apple ID and provides no information about the user. Still, the contractor claims that user data showing location, contact details, and app data is shared, and that names and addresses are sometimes disclosed when they're spoken aloud. Apple, for its part, says that all Siri data is assigned a random identifier and does not include the location or contact details described by the contractor.
As well as the discomfort they felt listening to such private information, the contractor said they were motivated to go public about their job because of their fears that such information could be misused. "There's not much vetting of who works there, and the amount of data that we're free to look through seems quite broad. It wouldn't be difficult to identify the person that you're listening to, especially with accidental triggers - addresses, names and so on."
While Apple's Siri privacy policy and security documents do not mention human oversight specifically, they are detailed and provide information on how Siri recordings are used.
As stated in Apple's security white paper, for example, user voice data is saved for a six-month period so that the recognition system can use it to better understand a person's voice. The voice data that's saved is identified using a random identifier that's assigned when Siri is turned on, and it is never linked to an Apple ID. After six months, a second copy is saved sans any identifier and is used by Apple for improving Siri for up to two years. A small number of recordings, transcripts, and associated data without identifying information are sometimes used by Apple for ongoing improvement of Siri beyond two years.
Apple's privacy website has a Siri section that offers up more info, explaining that all Siri queries are assigned a random identifier not associated with an Apple ID. The identifier is reset whenever Siri is turned off and then on again, and turning Siri off deletes all user data associated with a Siri identifier.
"When we do send information to a server, we protect your privacy by using anonymized rotating identifiers so that searches and locations can't be traced to you personally. And you can disable Location Services, our proactive features, or the proactive features' use of your location at any time."
Those concerned about Siri triggering accidentally on devices like the iPhone, Apple Watch, and HomePod can turn off the "Hey Siri" feature and can instead activate Siri manually, and Siri can also be turned off entirely.