Preserving Patient Privacy with Homomorphic Encryption

Author(s):

  • Francis Dutil, Applied Research Scientist
  • Tess Berthier, Research Lead
  • Lisa Di Jorio, Director of AI Research & Strategy 

Health research necessitates access to vast quantities of medical data and personal health information (PHI). From a patient perspective, the privacy of PHI is of the utmost importance to preserve and protect individual interests. In an ideal world, this would mean that no health data leaves health institutions, and that research facilities – whether public, such as universities, or private, such as pharmaceutical companies – would need to bring their intellectual property on-premise (e.g. a machine learning model), potentially compromising the confidentiality of what is often among an organization's most valuable assets.

At Imagia, we collaborate on multi-institution projects, which puts privacy challenges at the top of our priorities. Although it is possible to undertake distributed research where each data provider represents an independent node, there is mounting scientific evidence that federating data analytics and learning across institutions delivers better chances of generalizability (1, 2). Since generalizability is core to applying AI in healthcare, it is no surprise that we are witnessing a lot of research at the intersection of federated learning and data privacy.

One of the most secure strategies is to operate directly on encrypted data. In this scenario, a local institution (a hospital, for example) would encrypt its data before sharing it with collaborators, and all computation and manipulation would be performed on the encrypted data. This contrasts with the current way of working with encrypted data, where the collaborator decrypts the data inside a secure enclave and then works directly on the decrypted version, risking a potential privacy breach and opening the door to data leaks.

Fig 1 – The left side depicts the current way data is transferred from a local institution to a collaborator. The right side depicts the new mechanism working with encrypted data

Imagia is leveraging advanced cryptography research and combining it with its AI expertise to interact with health data securely.

How Homomorphic Encryption solves the Privacy Preservation Challenge
Homomorphic Encryption (HE) is an emerging technology that allows computation on data that remains encrypted. More than that, the result of the computation also remains encrypted, and only the party holding the original encryption key can decrypt it. With the appropriate design, HE therefore solves the privacy challenges described above. HE is also particularly appealing because it is resistant to most forms of attack, including quantum attacks.
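To make this concrete, here is a minimal sketch of computing on encrypted data using the open-source TenSEAL library and the CKKS scheme. TenSEAL, the encryption parameters, and the toy values below are illustrative assumptions, not a description of the library or settings behind our solution; the point is simply that the collaborator manipulates ciphertexts, and only the key holder can read the result.

```python
import tenseal as ts

# Create a CKKS context; the secret key stays with the data owner.
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40
context.generate_galois_keys()

# The hospital encrypts its measurements before sharing them.
measurements = [0.5, 1.5, 2.5]
encrypted = ts.ckks_vector(context, measurements)

# A collaborator computes on the ciphertext without ever seeing the values.
encrypted_result = encrypted * 2.0 + [1.0, 1.0, 1.0]

# Only the key holder can decrypt the (still encrypted) result.
print(encrypted_result.decrypt())  # approximately [2.0, 4.0, 6.0]
```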

However, even though homomorphic encryption has shown promising results in multiple areas (3, 4), its use in combination with the latest artificial intelligence (AI) breakthroughs, such as deep learning-based methods, remains a challenge. Due to the heavy computational requirements of neural networks, there is a hard limit on the complexity of the networks that can be used before homomorphic encryption makes the training process impractical. For example, the first techniques proposed using HE computed a prediction on an encrypted monochrome image of 28×28 pixels in more than 3 minutes (compared with a fraction of a second on the same unencrypted image). Within the last few months, Imagia has been making progress to scale this solution up to high-resolution medical imaging.

In which scenario is homomorphic encryption the most useful?
Although there is a myriad of applications where homomorphic encryption would bring a lot of value, in this blog post we focus on two scenarios that are of particular importance for the adoption of AI in the medical community: prediction and federated learning.

Using HE to Predict on Medical Images
In the prediction scenario, a hospital encrypts its medical data (an image, a CT scan, etc.) and sends it to a third party, which runs an AI-based model (a neural network) to produce an encrypted prediction. At this stage, the server is not able to decrypt the prediction and has to return it to the hospital, the only place where it can be decrypted. For example, a hospital could send an encrypted X-ray to a cloud-based service to predict whether the patient has a specific disease. The cloud-based service uses a proprietary algorithm to infer whether or not the patient has the disease, but is unable to make sense of its own prediction, preventing unauthorized secondary use of the data. By design, the hospital is the only party able to decrypt the data, keeping it safe even if the server is compromised. Additionally, the cloud-based service remains in control of its IP.

Fig 2 – Prediction over medical images with HE
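A minimal sketch of the exchange in Fig 2 is shown below, again using TenSEAL as an assumed stand-in for the actual stack: the hospital encrypts a made-up feature vector extracted from an image, the cloud service applies its plaintext model weights to the ciphertext, and only the hospital can decrypt the returned score. The feature size, weights, and bias are purely illustrative.

```python
import tenseal as ts

# --- Hospital side: create keys and encrypt the input ---------------------
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40
context.generate_galois_keys()

features = [0.1, 0.7, 0.3, 0.9]            # e.g. pre-extracted image features
enc_features = ts.ckks_vector(context, features)

# --- Cloud side: the provider's model weights (its IP) stay in plaintext,
# --- while the patient data stays encrypted. In a real deployment the server
# --- would only receive a public copy of the context, never the secret key.
weights = [0.25, -0.5, 0.8, 0.1]
bias = 0.05
enc_score = enc_features.dot(weights) + bias   # encrypted prediction

# --- Hospital side: only the key holder can read the prediction -----------
print(enc_score.decrypt())
```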

At Imagia, we use our HE-based solution to predict disease from Optical Coherence Tomography (OCT) images. OCT is a noninvasive procedure that uses light waves to take cross-section pictures of the retina. These images map and measure the thickness of the retina and help the ophthalmologist diagnose a variety of diseases.

We prepared a small model, specifically optimized for this task, and used the OCTID dataset [REF]. Our experiments show that we can produce a prediction for a 112×112 image in 20 minutes, which is acceptable from a clinician's perspective. Moreover, we demonstrate that we do not lose prediction accuracy when predicting over encrypted data. Because this clinical task does not require real-time assessment, it is an ideal use case for hardening privacy through homomorphic encryption.

Fig 3 – Prediction of the OCT result with HE

Using HE with Federated Learning
As explained in our previous post, federated learning (FL) allows experts to build machine learning models without sharing any data across servers. In addition to being a strong first step towards data privacy (the data never leaves the site; only the learned parameters are transferred), FL enables valuable collaborations that build more efficient AI models, improve generalization, and even help reduce bias in learned models.

In short, in a federated learning scenario, multiple hospitals can collaborate to train a model for a common task (such as predicting a specific clinical outcome) without sharing any data. Each hospital trains the same model locally on its own data and then sends the trained weights to a third-party server. This server aggregates them before sending the result back to each hospital.
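As a toy illustration of the aggregation step, the sketch below averages the parameter vectors sent by three hypothetical hospitals; a flat NumPy array stands in for a real network's weights.

```python
import numpy as np

# Toy stand-ins for each hospital's locally trained parameters.
hospital_weights = [
    np.array([0.10, 0.50, -0.20]),   # hospital A
    np.array([0.12, 0.48, -0.25]),   # hospital B
    np.array([0.09, 0.55, -0.18]),   # hospital C
]

# Federated averaging: the server combines parameters, never raw patient data.
global_weights = np.mean(hospital_weights, axis=0)
print(global_weights)   # sent back to every hospital for the next round
```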

In this situation, encryption happens when each client (hospital) sends its learned parameters (or weights) to the server, preventing any potential data breach through the model. With homomorphic encryption, the server is able to compute the usual aggregation directly on the encrypted parameters. Lastly, an encrypted model is returned to each hospital and, once decrypted, can be further trained locally before the next round of federation.

Fig 4 – Using HE with Federated Learning
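Continuing the toy example above, the sketch below shows the same aggregation once the parameter vectors are encrypted with TenSEAL before leaving each hospital: the server sums ciphertexts it cannot read, and each hospital decrypts the aggregate and averages it. As before, the library and the tiny weight vectors are illustrative assumptions, not our production code.

```python
import tenseal as ts

# Shared CKKS context; in practice the secret key would be held by the
# hospitals (or a key authority), and the server would only receive a
# public copy of the context.
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40

local_weights = [
    [0.10, 0.50, -0.20],   # hospital A
    [0.12, 0.48, -0.25],   # hospital B
    [0.09, 0.55, -0.18],   # hospital C
]

# Each hospital encrypts its parameters before sending them to the server.
encrypted_updates = [ts.ckks_vector(context, w) for w in local_weights]

# Server side: sum the ciphertexts without ever decrypting them.
encrypted_sum = encrypted_updates[0]
for update in encrypted_updates[1:]:
    encrypted_sum = encrypted_sum + update

# Hospital side: decrypt the aggregate and average it.
averaged = [v / len(local_weights) for v in encrypted_sum.decrypt()]
print(averaged)
```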

Our use case is a comprehensive federated learning experiment in which a model is trained to localize nodules in chest CT scans. The model for this experiment was generated through our SELF framework, an Efficient Neural Architecture Search algorithm, and contains 400,000 learnable parameters. We demonstrated that we are able to aggregate and learn successfully over encrypted parameters: this new layer of security does not impact accuracy in the federated learning scenario. Moreover, the impact on training time is negligible. In practice, this means that AI models can be trained on local data, with privacy guarantees and without loss of efficiency or accuracy.

Interested in learning more? Check out our white paper here!

We have designed our EVIDENS™ solution so that it is easy to set up and run and, importantly, places patient privacy at the center of our work. To stay updated on our efforts to reinforce patient privacy through HE and other privacy-preserving measures, subscribe to our newsletter, or reach out directly at [email protected].

References
[1] Qicheng Lao, Xiang Jiang, Mohammad Havaei: Hypothesis Disparity Regularized Mutual Information Maximization. AAAI 2021: 8243-8251

[2] Naichen Shi, Fan Lai, Raed Al Kontar, and Mosharaf Chowdhury. Fed-ensemble: Improving Generalization through Model Ensembling in Federated Learning. arXiv preprint arXiv:2107.10663, 2021.

[3] Frederik Armknecht, Colin Boyd, Christopher Carr, Kristian Gjøsteen, Angela Jäschke, Christian A. Reuter, and Martin Strand. A Guide to Fully Homomorphic Encryption. IACR Cryptol. ePrint Arch., 2015:1192, 2015.

[4] Joppe W. Bos, Kristin Lauter, and Michael Naehrig. Private Predictive Analysis on Encrypted Medical Data. Journal of Biomedical Informatics, 50:234-243, 2014.

 

 

 
