IBM: Natural language, machine learning can flag heart disease

By Jennifer Bresnick

When two giants come together, there’s the potential for an epic clash of the titans – or in this case, an Epic collaboration between the EHR superstar and IBM, which has been turning its attention to healthcare in a big way.  Building off the foundations laid by the Watson supercomputer, IBM has been developing a suite of applications targeted towards bringing powerful analytics to providers in an intuitive manner.  As a test of its capabilities, IBM and Epic Systems conducted a trial program at Carilion Clinic in Roanoke, Virginia to identify patients at risk for developing heart disease using natural language found in physician notes and the creative learning prowess of its processors.

Paul Hake, of the IBM Smarter Care Analytics Group, spoke to EHRintelligence about the analytics behind the project, which was able to identify 8500 patients at risk of developing congestive heart failure within a year.

What are some of the successful highlights of the study?

Congestive heart failure is a big problem in health care, and anything we can do to use advanced analytics to help identify patients early in the disease progression cycle offers the opportunity to intervene.  We found that our Advanced Care Insights model helped us distinguish between the patients that had actually developed the disease and those that hadn’t yet.

Using unstructured data was found to be important in this project.  When physicians are recording information, they’ll just prefer to type everything in one place into the notes section of the EMR.  And so this information is kind of lost.  It’s then almost a manual process to map this unstructured information back into the EMR system so that we can then use it for analytics.  We can run natural language processing algorithms against this data and automatically extract these features or risk factors from the notes in the medical record.

Particularly noteworthy was the way that it could extract the social and behavioral factors about the patients, which often are missing in the EMR.  Those are some of the factors that are significant in terms of the risk factors.  Is the patient depressed?  What’s the living status of the patient?  Are they homeless?  These are some of the factors that turn out to be important in the model, but they are also things that can be missed from a traditional analysis that doesn’t consider this sort of unstructured data.

Additionally, the overall project was very fast.  We took this IBM research model and we deployed it very quickly.  Our project was only six weeks long, which is a very short time frame for a project like this.  We’re particularly excited about this.  The accuracy scores we achieved are quite impressive, but so is the prospect of being able to take this work to other joint customers with Epic and to open this capability up to a broader range of customers.

What specific risk factors did the analytics model take into account?

There are several known, standard risk factors, whether it’s from the Framingham criteria or other risk models that are out there.  We included those, and other co-morbidities like coronary artery disease, hypertension, diabetes, and age.


We combine that knowledge base with a data-driven approach, and what that does is it uncovers a number of other things.  Apart from the social factors and co-morbidities that I just mentioned, we’d see things like a certain pattern of medications that were prescribed.  And whilst not significant by themselves, in pattern and in combination with other factors, they turn out to be significant.  It’s really the pattern of these, I think, combined with other risk factors, which give the gain in this sort of model.
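
The point about factors that matter only in combination can be illustrated with a small sketch. The counts below are synthetic and invented purely for illustration; they are not results from the Carilion study, and the two medications (`on_drug_a`, `on_drug_b`) are hypothetical placeholders. The sketch compares the heart failure rate for patients on one medication with the rate for patients on both.

```python
# Synthetic counts invented for illustration; NOT results from the study.
# Key: (on_drug_a, on_drug_b) -> (patients, patients who developed heart failure)
cohort = {
    (0, 0): (40, 4),
    (1, 0): (40, 4),
    (0, 1): (40, 4),
    (1, 1): (5, 4),
}

def event_rate(groups):
    """Pooled heart failure rate across the given (drug_a, drug_b) groups."""
    patients = sum(cohort[g][0] for g in groups)
    events = sum(cohort[g][1] for g in groups)
    return events / patients

# Looking at drug A by itself: the rates differ only modestly.
rate_on_a = event_rate([(1, 0), (1, 1)])
rate_not_on_a = event_rate([(0, 0), (0, 1)])

# Looking at the combination: the rate is far higher.
rate_on_both = event_rate([(1, 1)])

print(f"on drug A: {rate_on_a:.2f}, not on drug A: {rate_not_on_a:.2f}")
print(f"on both drugs: {rate_on_both:.2f}")
```

In this toy cohort, either medication alone looks like a weak signal, while the pair stands out. A data-driven model that can represent such combinations has a chance of picking up patterns that a one-factor-at-a-time review would miss.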

How accurate are these algorithms when it comes to unstructured data extraction and understanding?

They’re not perfect, but they’re pretty good.  We’ve reached 85% accuracy when predicting heart failure.  It’s certainly a lot faster than doing it manually.  In this Carilion project, we processed 21 million records like this, and that just wouldn’t be feasible for Carilion to do on their own.  They would have to spend years doing that, and we can just run them through the NLP engine in a few hours, and it can process that enormous volume of notes.

Part of our collaboration with Epic is introducing this NLP concept into Epic’s products.  A physician can type notes into the text field and then hit this ‘Process NLP’ button and have the natural language processing tool extract the meaning automatically.  The idea is that it’s not like a regular search, where you have to spell everything correctly.  It can find terms that are ambiguous, and it has intelligence to understand terms like negation and context.  So if you say, “the patient does not report symptoms of restlessness,” it’s smart enough to figure that out.
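
As a rough illustration of what handling negation means in practice, here is a minimal sketch of negation-aware term matching. It is not IBM's or Epic's implementation: the term list, the negation cues, and the `extract_findings` function are all hypothetical, and a production NLP engine would be far more sophisticated. The idea shown is simply that a term preceded by a negation cue in the same sentence is recorded as negated rather than as a finding.

```python
import re

# Hypothetical term list and negation cues, chosen for illustration only.
RISK_TERMS = ["restlessness", "depression", "shortness of breath"]
NEGATION_CUES = {"no", "not", "denies", "without"}

def extract_findings(note):
    """Return {term: 'affirmed' or 'negated'} for each risk term found in the note."""
    findings = {}
    # Split into sentences so a negation cue only affects its own sentence.
    for sentence in re.split(r"[.!?]+", note.lower()):
        for term in RISK_TERMS:
            pos = sentence.find(term)
            if pos == -1:
                continue
            # Only words that appear before the term can negate it.
            preceding = re.findall(r"[a-z']+", sentence[:pos])
            negated = any(word in NEGATION_CUES for word in preceding)
            findings[term] = "negated" if negated else "affirmed"
    return findings

note = ("The patient does not report symptoms of restlessness. "
        "Reports shortness of breath.")
print(extract_findings(note))
```

A plain keyword search would flag "restlessness" in this note as a positive finding; the sketch instead marks it as negated while still affirming "shortness of breath".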

How does this project compare with the other work that Watson has been doing?

This solution shares the same foundational architecture that IBM Watson does.  The natural language processing engine is the same.  It’s obviously more straightforward and easier to implement than Watson is.  Watson is moving to a cloud model anyway, but in its older versions, it’s a complex system, right?  And so it’s not something you could implement in only six weeks.

These software components are not new; they have been around for ten years or so.  But packaging them up in this kind of configurable solution is new.  We’ve written a bunch of these rules that extract the text related to health care data, and we also have some custom model algorithms as well.  The combination of old software assets plus new content is really the innovative part of bringing analytics to the point of care.

What’s next on the horizon for this sort of analytics work?

The platform supports a range of different diseases, including diabetes and COPD.  We absolutely plan to extend this horizontally into different disease areas, and we already have models targeting diabetes, for example.  Also, our colleagues on the IBM Watson analytics team have a lot of these annotated rules written for extracting terms related to cancer, so we can leverage that work as well.

We want to expand this vertically, as well.  So when we look at heart failure, there are, of course, compelling issues with treating heart failure patients, including things like readmissions.  In addition to broadening the disease types, we also have this capability to expand into models for predicting readmissions and other sorts of adverse events as well.

And a lot of that capability exists today.  It’s just that we’re trying to round out the whole platform.  Carilion is one custom example.  In addition to Carilion, there are others that have been using this type of approach for a while, and they also use the technology in other disease areas.
