Are you mining unstructured data?

In the US healthcare industry, nearly 1.2 billion clinical documents are produced each year, and 60-80% of valuable clinical data is unstructured. This means that a significant amount of data that could be used to substantially improve outcomes is stored as narrative text.

Structured and Unstructured data –

Structured data can easily be searched and correlated by algorithms, and it can be laid out in an organized, detailed manner that allows relevant information to be accessed quickly and efficiently.

Unstructured data refers to data that is presented largely in the form of human language and is not easily searchable. Its information is scattered, but it contains valuable insights that form the basis for identifying patterns and developing new ideas.

Why does this matter?

Healthcare organizations are not leveraging unstructured data as much as they could. In a report published in July 2016, Gartner estimated that less than 5 percent of organizations were exploring unstructured data analytics. Many organizations still depend heavily on coded data. If most organizations are tapping into only the 20-40% of generated data that is structured, there is a huge opportunity to better understand patient populations by analyzing the unstructured remainder.

How can we process unstructured data?

Text mining (also called text analytics) helps us identify relevant information by transforming text into data. Frequently used techniques include sentiment analysis, natural language processing (NLP), and social media monitoring.

“Text mining is the process or practice of examining large collections of written resources in order to generate new information” – Oxford Definition
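To make "transforming text into data" concrete, here is a minimal sketch in Python. It assumes a made-up note and a few hand-written patterns; the note text, the field names, and the `extract_fields` helper are all hypothetical, and production clinical NLP relies on far more robust tooling than regular expressions.

```python
import re

# Hypothetical free-text note for illustration; not real patient data.
NOTE = (
    "Pt is a 67 yo male with hx of type 2 diabetes. "
    "BP 150/95 today. Denies chest pain. Reports leg pain when walking."
)


def extract_fields(note: str) -> dict:
    """Turn a narrative note into a small structured record."""
    record = {}

    # Age written as "67 yo", "67 y/o", or "67 year-old".
    age = re.search(r"(\d{1,3})\s*(?:yo|y/o|year[- ]old)", note, re.IGNORECASE)
    if age:
        record["age"] = int(age.group(1))

    # Blood pressure written as "BP 150/95".
    bp = re.search(r"\bBP\s*(\d{2,3})\s*/\s*(\d{2,3})", note, re.IGNORECASE)
    if bp:
        record["systolic"] = int(bp.group(1))
        record["diastolic"] = int(bp.group(2))

    # A simple keyword flag; it does not handle negation or synonyms.
    record["mentions_diabetes"] = bool(
        re.search(r"\bdiabetes\b", note, re.IGNORECASE)
    )
    return record


print(extract_fields(NOTE))
```

Once the note is reduced to fields like these, it can be searched and correlated in the same way as any other structured record.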

Application of Text Analytics in Healthcare –

There are several applications of text analytics in healthcare. A few examples include:

  • Clinical Documentation Improvement (CDI): Using text analytics, CDI teams can generate a narrow list of cases with potential documentation gaps within seconds, instead of spending hours or even days reviewing every chart. This helps organizations work more efficiently and can improve financial reimbursement.
  • Real-time data about patients: Organizations can use text analytics to continuously monitor patients across the health system and improve outcomes proactively. For example, symptoms of sepsis could be uncovered in real time, and alerts could be sent to care teams based on the data generated through text mining.
  • Improving Outcomes: Text analytics allows healthcare systems to better understand and analyze the patterns of certain diseases in a high-risk population. For instance, in Indiana, a team of research scientists launched a study to identify patients who suffered from peripheral arterial disease (PAD). The team realized that to understand high-risk patients, they needed to leverage text analytics. They integrated text analytics into their project and developed sophisticated algorithms to produce a comprehensive list of patients suffering from PAD. The final report gave details of over 41,000 patients with PAD, a number that is difficult to reach using traditional methods of data analysis.
  • Research: Text analytics combined with NLP allows us to extract specific attributes of complex illnesses, such as the stage of the illness, biomarkers, and histology, and it improves real-time monitoring so that clinical care can be better coordinated. For this reason, text analytics has been especially instrumental in cancer research.
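As a rough illustration of how a team might narrow a large set of charts to the ones worth reviewing, the sketch below flags notes that mention PAD-related terms and skips mentions that appear to be negated. It is not the method used in the Indiana study; the term list, negation cues, chart IDs, and note text are all hypothetical, and the negation check is deliberately naive.

```python
import re

# Hypothetical term list, negation cues, and notes, for illustration only.
PAD_TERMS = ["peripheral arterial disease", "claudication", "PAD"]
NEGATIONS = ["no", "denies", "without", "negative for"]

NOTES = {
    "chart-001": "Patient reports intermittent claudication in the left calf.",
    "chart-002": "Denies claudication. No evidence of peripheral arterial disease.",
    "chart-003": "Follow-up for hypertension. Medication adjusted.",
    "chart-004": "History of PAD, status post stent placement.",
}


def mentions_condition(sentence, terms, negations):
    """Return True if the sentence has a non-negated mention of any term."""
    for term in terms:
        # Match acronyms such as "PAD" case-sensitively so "pad" is ignored.
        flags = 0 if term.isupper() else re.IGNORECASE
        match = re.search(r"\b" + re.escape(term) + r"\b", sentence, flags)
        if not match:
            continue
        # Naive negation check: any cue word earlier in the same sentence.
        preceding = sentence[:match.start()].lower()
        negated = any(
            re.search(r"\b" + re.escape(cue) + r"\b", preceding)
            for cue in negations
        )
        if not negated:
            return True
    return False


def flag_charts(notes):
    """Return the IDs of charts with at least one affirmative mention."""
    flagged = []
    for chart_id, text in notes.items():
        sentences = re.split(r"(?<=[.!?])\s+", text)
        if any(mentions_condition(s, PAD_TERMS, NEGATIONS) for s in sentences):
            flagged.append(chart_id)
    return flagged


print(flag_charts(NOTES))  # ['chart-001', 'chart-004']
```

The same pattern, with a different term list, could produce a CDI review queue or feed a sepsis alert, although a real system would need validated vocabularies and clinical review of the results.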

As healthcare transitions from volume to value, organizations need to leverage text analytics to perform well on the outcomes that matter. Relying only on structured data for population health analytics is no longer an ideal option.
