Machine Learning – Understanding the Differences Between Real-World Evidence and Clinical Trial Data for Better Patient Outcomes

Jonathan Kanevsky, MD, FRCSC
Director – Clinical Innovations [email protected]

Kam Kafi, MD, CM
Director – Oncology and Clinical Strategy [email protected]

Healthcare data volume has been skyrocketing. Large volumes of patient data are available from a variety of sources, ranging from electronic health records (EHRs) that capture events within a healthcare system to wearables and digital health apps that record day-to-day events. The real challenge, however, lies in extracting meaningful insights from that data. We can lean on artificial intelligence to identify key features, but we need to select data that is representative for a given AI task, and to keep its limitations in mind when interpreting results.

Real-world evidence (RWE) and randomized clinical trial (RCT) data are both valuable for advancing precision medicine, but there are three main differences between them that affect the output of machine learning algorithms trained on each. Because there is a wide variety of machine learning algorithms, the development and selection of an algorithm must account for these differences if it is to produce the insights that lead to better patient outcomes.

RWE in medicine means evidence obtained from real-world data (RWD): observational data generated by day-to-day activities. Traditionally this encompasses assessments from routine clinical practice, but increasingly it also includes patient-generated data from between care episodes. It is typically aggregated from electronic health records (EHRs) but can come from any source that can be digitally stored and queried. RWD can be collected after drugs are approved, capturing critically needed evidence on how drugs actually perform in the real world. It is diverse and episodic in nature and can therefore contain certain biases. For example, if we want to study the impact of a drug on a patient, RWD might not account for missed doses, mild side effects, adverse life events, or anything else that might influence the drug's effectiveness. These gaps can affect results, but RWE is typically based on data sets large enough that the impact of this "missing" data is less important. Overall, RWD is easier and less expensive to acquire, and querying RWE reveals a great deal of information, such as the real-world effectiveness of drugs, the identification of underserved populations, or the factors associated with clinical decision-making.

[Figure: From RWD to RWE]

RCTs, on the other hand, are specifically designed to answer a single clinical endpoint, an event or outcome that can be measured objectively, particularly by excluding potential confounders through rigorous requirements and protocols. RCTs cost significant time and money, and are designed to follow patients in a standardized fashion. The pecuniary burden usually falls on companies that can then use the data to have their products (e.g., a cancer drug) approved, marketed, and sold. Considering the cancer drug discussed above, an RCT must account for or eliminate missed doses; prior cancer history might be cause to exclude a set of trial patients; and all other medicines taken by trial patients are documented and mathematically considered in the outcomes. This highly controlled environment and highly selective population allow statistics to support or reject the specific question posed in the trial. Because of this, RCT data is often considered the gold standard (golden but not perfect) for clinical research and has the highest reliability.

[Figure: RCT vs RWD]

As applied to AI, the strengths and weaknesses of RCT versus RWD data follow the same trends. Because RCT data is generated under very specific conditions, some learnings do not easily generalize to real-life settings. A model predicting treatment response built on RCT data might therefore need to be fine-tuned down the line to capture the diversity of populations and practice settings encountered in the real world. On the other hand, the meticulous tracking and consistent quality of RCT data make it particularly practical for AI tasks that require extensive annotations. An interesting application of AI to RCT data is the generation of digital twins, which can accelerate clinical trials by reducing the need for real patients in a placebo arm.

1. Accuracy of data

The accuracy of RWE data is generally considered lower than that of RCT data, because investigators have less control over how the data is produced, collected, and stored: RCT data comes from an experimental setting rather than a real-world one. Details regarding the patient population, exposure to the treatment, and outcomes can be hard to assess. For instance, EHR data may include a prescription or a payment record for a cancer treatment drug. The RWE would assume that the patient took the drug, even though that may not be the case: the decision may have been made at the patient level, or the drug may never have been administered by a medical professional. In RCT data, meticulous attention to treatment is built into the protocols, and whether or not the patient actually took the drug is electronically documented. Furthermore, RCTs use protocolized visits and standardized questionnaires, which results in more consistent and standardized data. But this also introduces sampling bias and a risk of overfitting, since trials include only patients within a certain radius of the participating centers, which limits the generalizability of results.

RWD is collected from multiple centers and multiple care providers, as it reflects a population. The data is inherently incomplete and at times unreliable; the reasons can be technical, related to how the data is collected, or due to random errors. There are ways to lower the impact of incomplete and unreliable data. For example, expert adjudication is often required to curate RWE so that the data can be used by a learning algorithm; this is one of the biggest current bottlenecks in data preparation, a critical but time-consuming and costly step in achieving the data quality and readiness needed for model development. Human-in-the-loop learning, however, can be leveraged to speed up data preparation. Additional steps such as feasibility studies may be needed before developing machine learning algorithms if the data is unreliable.
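To make the adjudication step concrete, here is a minimal sketch, with entirely made-up fields and thresholds, of how incomplete or implausible RWD records might be flagged and routed to an expert reviewer rather than fed directly to a learning algorithm:

```python
import pandas as pd

# Toy RWD extract (hypothetical fields) with the gaps typical of EHR data.
records = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "drug_prescribed": ["A", "A", "B", "B"],
    "dose_mg": [50.0, None, 75.0, 75.0],                        # one missing dose
    "outcome": ["response", "response", None, "progression"],   # one missing label
})

# Flag records that need expert adjudication before model training:
# any missing field, or an out-of-range dose (threshold is illustrative).
needs_review = (records["dose_mg"].isna()
                | records["outcome"].isna()
                | (records["dose_mg"] > 1000))

clean = records[~needs_review]          # usable as-is
to_adjudicate = records[needs_review]   # routed to a clinical expert

print(len(clean), len(to_adjudicate))   # prints: 2 2
```

In a human-in-the-loop setup, only the flagged minority reaches the expert queue, which is what makes curation tractable at RWD scale.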


2. Data Preparation

In clinical settings, machine learning is often used for outcome prediction. Generally speaking, the more data available, the better machine learning algorithms perform. RWE often provides larger datasets for training and validating models, and its size, heterogeneity, and generalizability make it well suited for predictive clinical models.
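As a toy illustration of the train-and-validate workflow, the sketch below fits a predictive model on synthetic data standing in for a large RWE cohort; the features and outcome are placeholders, not real clinical variables:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a large RWE cohort: in practice the features
# could be labs, demographics, and diagnoses, and the label an outcome.
X, y = make_classification(n_samples=5000, n_features=20, n_informative=8,
                           random_state=0)

# Hold out a validation split, as one would for any predictive clinical model.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2,
                                                  random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Evaluate discrimination on held-out data.
auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
print(f"validation AUC: {auc:.2f}")
```

The held-out split is the minimal safeguard; with real RWD one would typically also validate across sites or time periods to check generalizability.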

For example, supervised learning in radiation oncology requires labels provided by clinical experts to predict patient outcomes or evaluate treatment plans. At present, this technology uses RWE data and can be applied to tumor response modeling or image-guided radiotherapy. Researchers hope that machine learning algorithms can eventually be applied to better inform RCT design, ultimately making clinical trials more efficient and effective. Strategies like population enrichment, whereby a study population is selected rather than randomized, can aid the development of AI-driven companion diagnostics to enhance treatment prediction.

3. Heterogeneity in responses

Machine learning algorithms can identify heterogeneity in responses to treatment. For instance, a recent study used machine learning methods to analyze the Swedish Heart Failure Registry, a nationwide registry of over 44,000 patients, to detect heterogeneity in treatment response. Surprisingly, the study observed that aldosterone antagonists, drugs commonly prescribed for heart failure that have proven beneficial in RCTs, did not show the same benefit in the RWE.

In another RWE study, this one of solid tumor images, machine learning was found to identify features useful for characterization and predictive quantification, including tumor heterogeneity, which is linked to treatment response and overall outcome.

This shows how the extreme standardization of data collected in RCTs may obscure heterogeneity that occurs in real life, leading to unexpected differences in treatment response after a drug is approved. RWE data, on the other hand, includes a broader scope of patients in real-world settings, drawn from heterogeneous groups that may have received treatment from many practitioners.
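A minimal sketch of what detecting treatment-response heterogeneity can look like, using a simulated registry in which the treatment benefits one subgroup but not the other (the data, subgroups, and effect sizes are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000

# Simulated registry: a binary treatment and one covariate that modifies
# the treatment effect -- the kind of heterogeneity ML methods look for.
subgroup = rng.integers(0, 2, n)   # 0 = subgroup A, 1 = subgroup B
treated = rng.integers(0, 2, n)
# Ground truth: treatment helps subgroup A (+0.3), does nothing for B.
outcome = 0.3 * treated * (subgroup == 0) + rng.normal(0, 1, n)

# Naive per-subgroup treatment-effect estimate (difference in means).
effects = {}
for g in (0, 1):
    mask = subgroup == g
    effects[g] = (outcome[mask & (treated == 1)].mean()
                  - outcome[mask & (treated == 0)].mean())
    print(f"subgroup {g}: estimated treatment effect {effects[g]:+.2f}")
```

A pooled (RCT-style) average over both subgroups would report a diluted effect of roughly half the true benefit in subgroup A; stratifying, as here, is what surfaces the heterogeneity.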

Conclusions

When research organizations develop AI solutions, these differences between the two types of data need to be considered in study and AI design. Depending on the intended outcome and use case, machine learning algorithms need to be trained accordingly to generate meaningful insights.

At Imagia, our expert team is experienced in working with both RCT data and RWE, and we understand how to best capitalize on both. We help organizations transform their patient data into actionable insights or knowledge. Stay tuned to learn more about how data science helps in improving patient outcomes.
