Federated learning for protecting patient privacy

Jonathan Kanevsky, MD, FRCSC
Director – Clinical Innovations [email protected]

Sophie Laplante, MD, FRCPC, Radiology
Special collaboration [email protected]

The application of Machine Learning (ML) in healthcare presents unique challenges. For instance, building a training dataset may require data to be migrated to, and shared from, a single server or datacenter. In a multi-center healthcare environment, moving protected health information (PHI) across different servers in this way creates privacy concerns.

Ensuring patient data privacy is paramount for companies developing AI healthcare tools, as PHI must be protected whenever patient data is used to train algorithms. Conventional machine learning technologies can lead to lapses in patient privacy and data traceability. Privacy breaches in machine learning are well documented [1],[2] – and this is why we must place privacy at the heart of our solutions to healthcare AI.


Our patient-first approach 

Our “patient-first” approach is built around preserving patient privacy while working on decentralized data, using technology based on federated learning. Federated learning allows experts to build machine learning models without sharing any data across servers: it is the data that contains individual PHI; the model does not. With Imagia’s solutions, PHI can’t be traced back to any individual patient.

In our algorithms, the patient-safe model (not the data itself) travels between different locations called nodes. The model is trained on local datasets, all safely within the local firewall. What is exchanged across the nodes, via a central server, are the parameters (what is actually “learned”).

Consider the patient-safe model akin to a new medical resident who improves her performance by doing training rotations across different hospitals. The resident continuously encounters PHI within the firewall of each hospital, but what she takes from each rotation is the learning from a different patient population. The specifics of each patient encounter remain secure, yet the resident continues to learn.
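To make the mechanics concrete, here is a minimal sketch of one federated training cycle in Python. This is not Imagia’s implementation: the toy linear model, the `local_step` helper and the synthetic hospital datasets are all illustrative assumptions. The property to notice is that only model parameters ever cross a firewall; the data never moves.

```python
# A minimal sketch of a federated training loop, assuming a toy
# linear model and synthetic data. Illustrative only, not EVIDENS(tm).
import numpy as np

rng = np.random.default_rng(0)

def local_step(weights, X, y, lr=0.1):
    """One gradient-descent step on a node's private data.
    Only the updated weights leave the node, never X or y."""
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

# Each "hospital" node holds its own data behind its own firewall.
hospitals = [
    (rng.normal(size=(50, 3)), rng.normal(size=50)),
    (rng.normal(size=(80, 3)), rng.normal(size=80)),
]

global_weights = np.zeros(3)
for _ in range(20):
    # Nodes train locally; only parameters travel to the server.
    local_weights = [local_step(global_weights, X, y) for X, y in hospitals]
    # The central server averages the parameters and redistributes them.
    global_weights = np.mean(local_weights, axis=0)

print(global_weights)
```

Averaging the locally trained parameters is the simplest possible aggregation rule (the idea behind “federated averaging”); production systems typically layer additional safeguards, such as secure aggregation, on top.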

Imagia utilizes an advanced federated learning approach that makes data sharing unnecessary and avoids compromising patient privacy. Our privacy-preserving methodologies, combined with federated learning, ensure that the data cannot be traced back to an individual patient. Imagia’s EVIDENS™ platform ensures privacy by design for every patient.

Let’s further understand how patient privacy concerns are eliminated through federated learning using EVIDENS™:

Data from each hospital stays within the hospital firewall.

Collaborative and transfer learning across institutions

In a multi-center healthcare system, one center (or hospital) may not have enough data to train an algorithm, while a second center (or multiple other centers) has adequate training data. We can weight the contribution of each data center, including the center that has little or minimal data. Each center benefits from the knowledge acquired at the other centers, which is especially valuable for the center with insufficient data. This is one way Imagia’s platform is useful for a hospital that does not have enough data.
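As a hedged illustration of how such weighting could work, the sketch below aggregates parameter updates in proportion to each center’s sample count, so a data-poor center still participates without dominating. The function name and all numbers are invented for this example.

```python
# A sketch of size-weighted aggregation: each center's update counts
# in proportion to how much data it contributed. Illustrative only.
import numpy as np

def weighted_aggregate(updates, sample_counts):
    """FedAvg-style aggregation: weight each center's parameters by
    its share of the total training samples."""
    total = sum(sample_counts)
    return sum((n / total) * u for n, u in zip(sample_counts, updates))

# Illustrative parameter vectors from three centers, one data-poor.
updates = [np.array([1.0, 2.0]), np.array([1.2, 1.8]), np.array([0.5, 3.0])]
counts = [1000, 800, 50]  # the third center contributed little data
print(weighted_aggregate(updates, counts))
```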

Even when data from a given center is scarce, it is never shared outside the firewall; Imagia’s platform can still be used to quickly train AI models and automate analysis at the local level. The platform becomes the central engine that aggregates and drives all of the insights gained from the data.

Alternatively, at Imagia, a single model can be trained with data from one center, then sent sequentially to other centers with different datasets but similar feature spaces, circling around all of the centers in this manner. Even hospitals with sparse data contribute, and only the model, not the PHI, is shared. We extract the most significant features to start learning. This type of “continual learning” can be used as an alternative to “horizontal federated learning”.
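Here is a minimal sketch of that sequential alternative, again under illustrative assumptions (a toy linear model, synthetic data, and a made-up `train_locally` helper): one model visits each center in turn, trains behind the local firewall, and carries only its learned parameters to the next stop.

```python
# A sketch of sequential ("continual") training across centers.
# All names and data are illustrative, not Imagia's implementation.
import numpy as np

rng = np.random.default_rng(1)

def train_locally(weights, X, y, lr=0.05, steps=25):
    """Plain gradient descent on one center's private data."""
    for _ in range(steps):
        weights = weights - lr * 2 * X.T @ (X @ weights - y) / len(y)
    return weights

# Three centers with different amounts of local data.
centers = [
    (rng.normal(size=(200, 4)), rng.normal(size=200)),  # data-rich
    (rng.normal(size=(30, 4)), rng.normal(size=30)),    # data-sparse
    (rng.normal(size=(120, 4)), rng.normal(size=120)),
]

model = np.zeros(4)
for X, y in centers:                    # circle the centers one by one
    model = train_locally(model, X, y)  # data stays behind the firewall
# Only `model` ever travelled between centers; the PHI never did.
print(model)
```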

[Figure: horizontal vs. vertical federated learning]

Source: crownpku.com/2019/03/13/A-Practical-Overview-of-Federated-Learning.html

Another option is for centers to agree to share a subset of the features for the same patient, which provides enough information to unlock insights. For example, if a patient had genomic sequencing and a symptom questionnaire done at a private clinic while his regular visits, lab results and radiology images were handled at another hospital, only a subset of the features is present at any given institution. This “vertical federated learning” is often used by financial institutions such as banks.
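The sketch below illustrates the vertical idea in a heavily simplified form: two institutions hold different feature subsets for the same matched patients, each computes a partial per-patient score locally, and only those scores are combined. A real vertical federated learning system would add secure entity alignment and encrypted computation, which this toy deliberately omits.

```python
# A simplified sketch of vertical federated learning. All feature
# counts, weights, and data here are invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
n_patients = 100  # the same patients, matched by a shared identifier

# The clinic holds genomics + questionnaire features for these patients...
clinic_features = rng.normal(size=(n_patients, 5))
# ...while the hospital holds labs + imaging features for the same patients.
hospital_features = rng.normal(size=(n_patients, 8))

# Each institution keeps its own model partition and its own raw features.
clinic_weights = rng.normal(size=5)
hospital_weights = rng.normal(size=8)

# Each party computes its partial per-patient score on-site...
clinic_scores = clinic_features @ clinic_weights
hospital_scores = hospital_features @ hospital_weights

# ...and only those scores, never the raw features, are combined.
risk = 1 / (1 + np.exp(-(clinic_scores + hospital_scores)))
print(risk[:5])  # joint risk estimate without pooling any raw data
```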

We have made sure that our EVIDENS™ solution is not only easy to set up and run, but also places patient privacy at the center of what we do. We will keep you updated with all our efforts to maintain patient privacy through these blogs. Stay tuned, stay connected!


References:

[1]  Nasr, M., Shokri, R. and Houmansadr, A., 2019, May. Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning. In 2019 IEEE Symposium on Security and Privacy (SP) (pp. 739-753). IEEE.

[2] Shokri, R., Stronati, M., Song, C. and Shmatikov, V., 2017, May. Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP) (pp. 3-18). IEEE.
