Apple is taking a new approach to training its AI models – one that avoids collecting or copying user content from iPhones or Macs.
According to a recent blog post, the company plans to continue to rely on synthetic data (constructed data that is used to mimic user behaviour) and differential privacy to improve features like email summaries, without gaining access to personal emails or messages.
For users who opt in to Apple’s Device Analytics program, the company’s AI models will compare synthetic email-like messages against a small sample of a real user’s content stored locally on the device. The device then identifies which of the synthetic messages most closely matches its user sample, and sends information about the selected match back to Apple. No actual user data leaves the device, and Apple says it receives only aggregated information.
The technique will allow Apple to improve its models for longer-form text generation tasks without collecting real user content. It’s an extension of the company’s long-standing use of differential privacy, which introduces randomised data into broader datasets to help protect individual identities. Apple has used this method since 2016 to understand use patterns, in line with the company’s safeguarding policies.