Updated: Mar 22
Privacy for machine learning (ML) and other data-intensive applications is increasingly threatened by sophisticated methods of re-identifying anonymized data. In addition, although encryption protects data stored on hard drives or moving over networks, the data must be decrypted to be operated on, exposing it to malicious actors and inadvertent leakage.
Privacy-Preserving Machine Learning (PPML), which draws on rapid advances in cryptography, statistics, and other building-block technologies, provides powerful new ways to maintain anonymity and safeguard privacy.
Advancing Privacy for Machine Learning
Traditional approaches to privacy in ML rely on removing identifiable information, encrypting data at rest and in flight, and limiting data sharing to a small set of trusted partners. But ML often involves massive data volumes and numerous parties, with separate organizations owning or providing the ML models, training data, inference data, infrastructure, and ML service. Collaboration has meant risking exposure of the data to all of these partners, as well as working on unencrypted data.
PPML combines complementary technologies to address these privacy challenges. Working together, these technologies make it possible to learn about a population while protecting data about the individuals within the data set. For example:

Federated learning and multi-party computation let institutions collaboratively study data without sharing it and potentially losing control of it. In addition to bringing together previously siloed data, these technologies help provide secure access to the massive quantities of data needed to increase model training accuracy and generate novel insights. They also avoid the costs and headaches of moving huge data sets among partners.

Homomorphic encryption (HE) is a public/private key cryptosystem that allows applications to perform inference, training and other computation on encrypted data, without exposing the data itself. Dramatic performance advances are making HE practical for mainstream use.

Differential privacy adds mathematical noise to personal data, protecting individual privacy while enabling insights into patterns of group behavior.
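As a minimal sketch of the federated pattern described above, each institution trains on its own records and shares only model parameters, which a coordinator averages. The client data, update rule, and helper names below are illustrative assumptions, not part of any Intel tool:

```python
# Minimal federated-averaging sketch: clients share model weights, never raw data.
# The data sets, learning rate, and round count are hypothetical.

def local_update(weights, data):
    # Stand-in for a real local training step: nudge each weight
    # toward the mean of this client's private data.
    target = sum(data) / len(data)
    return [w + 0.1 * (target - w) for w in weights]

def federated_average(client_weights):
    # The coordinator sees only weight vectors, not the underlying records.
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]

global_weights = [0.0, 0.0]
clients = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]  # private local data sets

for _ in range(5):  # a few federated rounds
    updates = [local_update(global_weights, d) for d in clients]
    global_weights = federated_average(updates)
```

The raw values in `clients` never leave their owners; only the averaged weights circulate, which is what lets siloed institutions train a shared model.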
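Differential privacy's noise injection can be sketched with the classic Laplace mechanism for a counting query. The epsilon value, data set, and helper names are illustrative assumptions:

```python
import math
import random

def laplace_noise(scale):
    # Sample from a Laplace(0, scale) distribution via the inverse CDF.
    u = random.uniform(-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(records, predicate, epsilon):
    # A counting query has sensitivity 1: adding or removing one person
    # changes the count by at most 1, so the noise scale is 1 / epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

ages = [23, 35, 41, 29, 57, 62, 38]  # hypothetical personal data
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5)
```

The noisy answer stays close to the true count on average, so group-level patterns survive, while the randomness masks whether any one individual is in the data set.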
Expanding ML Collaboration
The advances provided by PPML and its component technologies can enable a new class of private ML services that let competitors collaborate to achieve mutual benefit without losing a competitive advantage.
These new methods of collaboration present particularly exciting opportunities in healthcare, financial services and retail—fields that collect highly sensitive data about their clients and customers and that comprise 22 percent of the US Gross Domestic Product (GDP). Rival banks could create joint models for combating money laundering, potentially reducing fraud.
Hospitals could use remote, third-party analytics on patient data, potentially leading to new clinical insights and breakthrough treatments. Retailers could monetize their purchase data while protecting user privacy and retaining their ability to develop unique products and services.
Faster Time-to-Value for Machine Learning
Intel is working on multiple fronts to accelerate progress on PPML. Data owners can use HE-Transformer, the open source backend to our nGraph neural network compiler, to gain valuable insights without exposing the underlying data. Alternatively, model owners can use HE-Transformer to deploy their models in encrypted form, helping protect their intellectual property.
Researchers, developers and institutions can accelerate their PPML building blocks and protect their federated learning environments by running them in a trusted execution environment (TEE) such as Intel® Software Guard Extensions (Intel SGX).
PPML and its component technologies will bring new power to AI and ML services while strengthening protections for sensitive data.