- Joint research between IBM Research and Sutter Health led to the development of methods for predicting heart failure using clues embedded in patient records in Epic EHR technology.
Researchers observed the performance of machine learning models designed to detect prediagnostic heart failure in primary care patients using longitudinal data in EpicCare EHRs.
“Information that can be gained on populations of patients from longitudinal EHR data can be used to individualize care for a given patient,” Ng et al. stated. “Access to these data in combination with the rapid evolution of modern machine learning and data mining techniques offers a potentially promising means to accelerate discoveries that can be readily translated to clinical practice.”
For the past three years, scientists have used artificial intelligence (AI) techniques, including natural language processing, machine learning, and big data analytics, to predict the likelihood of heart failure in patients.
Additionally, doctors use patient EHRs to record signs and symptoms of heart failure and order diagnostic tests to determine a patient’s likelihood of having the disease.
Even with these methods in place, patients are generally diagnosed with heart failure during a hospitalization, after an event indicating that the disease has already progressed to the point of potentially irreversible organ damage.
In an effort to detect heart failure before significant organ damage occurs, researchers determined the optimal way to train effective predictive models of disease onset using EHR data.
The team assessed model performance according to prediction window length (i.e., amount of time before clinical diagnosis), observation window length (i.e., duration of observation before the prediction window), and number of data domains (i.e., data diversity), among other factors.
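To make the two window definitions concrete, here is a minimal Python sketch of how an observation window could be derived from a diagnosis date. It is an illustration only, not the researchers' code: the function names, the event codes, and the patient record are hypothetical, and the window lengths are example values.

```python
from datetime import date, timedelta

def observation_window(diagnosis_date, prediction_days, observation_days):
    """Return (start, end) of the observation window.

    The window ends `prediction_days` before the diagnosis date and
    covers the `observation_days` leading up to that point.
    """
    end = diagnosis_date - timedelta(days=prediction_days)
    start = end - timedelta(days=observation_days)
    return start, end

def events_in_window(events, start, end):
    """Keep (event_date, code) pairs with start <= event_date < end."""
    return [(d, code) for d, code in events if start <= d < end]

# Hypothetical patient record: (date, event code) pairs.
events = [
    (date(2008, 3, 1), "dx:hypertension"),
    (date(2009, 6, 15), "rx:diuretic"),
    (date(2010, 11, 20), "lab:bnp"),
]
diagnosis = date(2011, 1, 1)

# Example: a one-year prediction window and a two-year observation window.
start, end = observation_window(diagnosis, prediction_days=365, observation_days=730)
selected = events_in_window(events, start, end)
```

Only events inside the observation window would feed the model. Anything recorded during the prediction window, such as the last lab result above, is excluded so the model cannot rely on information from the period immediately before diagnosis.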
Researchers observed 1,648 heart failure cases and 13,525 controls matched by sex, age category, and clinic for modeling.
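One simple way to build matched controls of this kind is to group candidate patients by the matching attributes and draw from the group that corresponds to each case. The Python sketch below assumes that approach. The field names, the `per_case` parameter, and the sample records are hypothetical and do not describe the study's actual matching procedure.

```python
import random
from collections import defaultdict

def match_controls(cases, candidates, per_case, seed=0):
    """Pick up to `per_case` controls per case sharing sex, age band, and clinic."""
    rng = random.Random(seed)
    pool = defaultdict(list)
    for person in candidates:
        pool[(person["sex"], person["age_band"], person["clinic"])].append(person)
    for group in pool.values():
        rng.shuffle(group)  # randomize which eligible controls are drawn

    matches = {}
    for case in cases:
        group = pool[(case["sex"], case["age_band"], case["clinic"])]
        # pop() removes the control so it is matched to at most one case
        matches[case["id"]] = [group.pop() for _ in range(min(per_case, len(group)))]
    return matches

# Hypothetical records for illustration.
cases = [{"id": "case1", "sex": "F", "age_band": "60-69", "clinic": "A"}]
candidates = [
    {"id": "c1", "sex": "F", "age_band": "60-69", "clinic": "A"},
    {"id": "c2", "sex": "F", "age_band": "60-69", "clinic": "A"},
    {"id": "c3", "sex": "M", "age_band": "60-69", "clinic": "A"},
    {"id": "c4", "sex": "F", "age_band": "70-79", "clinic": "A"},
]
matches = match_controls(cases, candidates, per_case=2)
```

Here only `c1` and `c2` share all three attributes with the case, so they are the only controls selected.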
For analysis, researchers extracted each data type from EpicCare EHRs at the Geisinger Clinic in Pennsylvania for all cases and controls. EHR data came from outpatient face-to-face patient-provider interactions and inpatient visits.
Ultimately, the research showed that model performance improved as the prediction window shortened to less than two years, and as the observation window lengthened, before leveling off after two years.
Additionally, model performance improved as the training dataset grew, leveling off after 4,000 patients; as more diverse data types were used; and when data were confined to patients with fewer than 10 phone or face-to-face provider interactions in two years.
“The most surprising finding in this study is how sensitive the model performance is to the size of the training set,” noted researchers. “Substantial improvements to our predictive model were achieved by increasing the training set size from 1000 patient records to 4000 records.”
With this recent research, IBM and Sutter Health scientists have gained insight into practical tradeoffs and types of data needed to train models and developed potential new application methods allowing future models to be more easily adopted.
Another valuable insight gleaned during the study is that the types of data formerly used to predict heart failure were not always relevant.
The team found only 6 of the 28 original risk factors within the Framingham Heart Failure Signs and Symptoms (FHFSS) were consistently shown to predict a future diagnosis of heart failure.
However, other data types routinely collected in EHRs, including disease diagnoses, medication prescriptions, and lab tests, proved to be effective predictors of a patient’s onset of heart failure when combined with FHFSS.
“Our findings are consistent with previous work using EHR data,” stated researchers. “In predicting mortality over different time horizons from end-stage renal disease, machine learning models that contain all available predictors outperform those that do not.”
IBM stated the group will continue to work together to improve the accuracy and clinical relevance of EHR data utilization in clinical care.
“There are many possible directions for future work,” wrote researchers. Larger patient data sets and more in-depth analysis of the risk factors captured by the predictive model will yield further insights in future studies.
IBM and Sutter Health scientists are optimistic this newest research may have wider implications for using EHR data to improve predictive medicine for other diseases.