Healthcare Data Lake: The Key to Operating a Data Informed Organization
The Data Informed Healthcare Organization is Here
Two decades ago, the ideas of one or two executive visionaries often drove an organization's entire business agenda. Today, the most successful healthcare organizations use data to validate ideas and refine them through advanced studies and predictive models.
The data informed healthcare organization has come of age through recent advances in data technologies, a surge in artificial intelligence and machine learning capabilities, and the availability of high-compute, storage-efficient hardware through commercial cloud providers (AWS, Azure, Google Cloud). This influx of technology and talent has lowered the barrier to entry for data informed intelligence. Market competition and large-scale innovation have reduced both the learning curve and the cost.
The Essential Role of a Strong Data Architecture – and How to Achieve It
Becoming a data informed healthcare organization starts with having a strong data architecture. Data must be secure, but readily available to those who need it. Data must be cheap to store in extremely large volumes, but systems must be able to search through it in seconds or less. Complex data like JSON or images must be accessible through standard query languages like SQL.
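As a rough illustration of querying JSON through SQL, the sketch below stores JSON documents as text and exposes one field to a SQL filter. It uses an in-memory SQLite database as a stand-in for a cloud data platform; the table `clinical_notes`, the helper `json_get`, and all field names and values are hypothetical and not part of any specific product.

```python
import json
import sqlite3


def json_get(document, key):
    """Return one top-level field from a JSON document stored as text."""
    return json.loads(document).get(key)


conn = sqlite3.connect(":memory:")
# Register the hypothetical helper so SQL queries can call it by name.
conn.create_function("json_get", 2, json_get)

conn.execute("CREATE TABLE clinical_notes (member_id TEXT, payload TEXT)")
records = [
    ("M001", json.dumps({"encounter_type": "office_visit", "systolic_bp": 128})),
    ("M002", json.dumps({"encounter_type": "emergency", "systolic_bp": 162})),
    ("M003", json.dumps({"encounter_type": "office_visit", "systolic_bp": 145})),
]
conn.executemany("INSERT INTO clinical_notes VALUES (?, ?)", records)

# Plain SQL filters on a value that lives inside the JSON payload.
query = """
    SELECT member_id
    FROM clinical_notes
    WHERE json_get(payload, 'systolic_bp') >= 140
    ORDER BY member_id
"""
high_bp_members = [row[0] for row in conn.execute(query)]
print(high_bp_members)
```

Many cloud data platforms offer their own built-in functions for semi-structured data; the registered Python function here only stands in for that idea.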
Enter the Healthcare Data Lake: a collection of datasets focused on patient claims history, analytical output from quality measurement and risk adjustment programs, clinical data from Electronic Health Record systems, and social determinants of health data. By removing the obstacles of siloed data sources in varying formats, the data lake creates one comprehensive, consolidated source of data that healthcare organizations can access on demand in support of a variety of clinical and business use cases.
Common Data Lake Misconceptions
When I first heard the term “Data Lake” and learned a little about it, the overarching promise of one all-encompassing data source sounded a bit intimidating, like something that would be very big, messy, and challenging to work with and gain value from. This perception is not uncommon, and it is not totally without merit. However, when implemented properly, a data lake delivers speed, accuracy, and ease of integration with the organization’s current tools and workflows. That experience runs counter to these top data lake misconceptions:
#1 – “A data lake is complex and with this volume of data it would take weeks to update.”
We have customers today who are getting data lake refreshes in a few hours. It used to take two weeks for them to populate the same data in their own on-prem data warehouse.
#2 – “This massive amount of data will be too hard to work with and understand.”
The data within the data lake is structured: all sources can be connected through common keys, and data dictionaries describe the data elements.
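To make the idea of common keys concrete, here is a minimal sketch that joins a claims table to an EHR-derived table on a shared `member_id`, with a small data dictionary alongside. SQLite stands in for the data lake, and every table, column, and value is a made-up example rather than an actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE claims (member_id TEXT, claim_id TEXT, diagnosis_code TEXT);
    CREATE TABLE ehr_labs (member_id TEXT, hba1c REAL);
    INSERT INTO claims VALUES ('M001', 'C1', 'E11.9'), ('M002', 'C2', 'I10');
    INSERT INTO ehr_labs VALUES ('M001', 8.2), ('M003', 5.4);
""")

# Hypothetical data dictionary describing each element an analyst will see.
data_dictionary = {
    "member_id": "Common key shared by every source",
    "diagnosis_code": "ICD-10 diagnosis code reported on the claim",
    "hba1c": "Most recent HbA1c lab result, in percent",
}

# The shared key lets claims and clinical data be combined in a single query.
joined = conn.execute("""
    SELECT c.member_id, c.diagnosis_code, l.hba1c
    FROM claims AS c
    JOIN ehr_labs AS l ON l.member_id = c.member_id
    ORDER BY c.member_id
""").fetchall()

for member_id, diagnosis_code, hba1c in joined:
    print(member_id, diagnosis_code, hba1c)
```

Only members present in both sources appear in the result, which is the behavior of an inner join; a left join would keep claims without a matching lab value.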
#3 – “We’ve already spent years and millions of dollars building our own analytics data warehouse and we don’t want to throw all that work away.”
This is not an either/or proposition. Technologies powering data lakes often use data sharing and replication to push data across regions and even across clouds or into private data centers. Data lakes can be an extension and enrichment of existing data warehouses.
#4 – “If I use a third-party data lake, my team can’t connect all their analytics tools to it.”
Tools such as SageMaker, SAS, or even business applications can securely connect to the data lake. At Inovalon, we consider the data lake an extension of our customers’ datasets and encourage direct connectivity when needed.
Delivering a Flexible, Scalable Healthcare Data Lake
Data lakes have historically been made up of raw structured and unstructured data, but the data in Inovalon’s Healthcare Data Lake is largely structured, making it easy to understand and consume. Customers can integrate supplemental data sources into their data lake instance as they see fit. The Healthcare Data Lake is not limited to a single, fixed data model, so every customer’s data lake looks different based on their unique needs and goals. Additionally, most of Inovalon’s solutions use the Healthcare Data Lake as their data store, so data from those solutions can be incorporated into the customer’s data lake with minimal effort.
Let’s explore some data lake use cases for healthcare:
- Leveraging clinical data to identify populations or diagnoses that may be under-reported for risk and quality programs
- Equipping care managers with access to real-time clinical data to proactively prevent avoidable emergency department visits, hospitalizations, and similar events
- Integrating meaningful clinical outcomes into provider report cards
- Monitoring opioid prescribing patterns to identify potential patient safety issues and detect potential instances of fraud, waste, and abuse
- Evaluating member care-seeking patterns for use in benefit design, network, and quality initiatives
Use Case Example: Improving Cancer Screening Rates in Older Adults
A health plan wants to understand where to focus its patient outreach campaigns to improve cancer screening rates in older adults. The data analyst logs into the data lake, pulls non-compliant patients for the relevant cancer screening measures using a basic SQL query, groups them by ZIP code, and views the results in table format. Using a visualization tool, the analyst then creates a heatmap showing where the patient-specific measure gaps are concentrated. The outreach manager can use this report to quickly identify a few locations on which to focus outreach and to inform the staffing model for interventions. As a result, a project that previously would have taken months can now be completed within days, delivering speed-to-value for both members and the organization.
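The analyst's query in this example might look something like the sketch below. It is illustrative only: SQLite stands in for the data lake, and the table `measure_results`, its columns, the measure names, the sample rows, and the age cutoff of 65 for "older adults" are all assumptions rather than the actual schema or measure definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE measure_results (
        member_id TEXT, zip_code TEXT, measure TEXT, age INTEGER, compliant INTEGER
    )
""")
# Made-up rows; compliant = 0 marks an open screening gap.
sample_rows = [
    ("M001", "20716", "colorectal_cancer_screening", 67, 0),
    ("M002", "20716", "colorectal_cancer_screening", 71, 0),
    ("M003", "20715", "colorectal_cancer_screening", 69, 1),
    ("M004", "20715", "breast_cancer_screening", 66, 0),
    ("M005", "20716", "breast_cancer_screening", 58, 0),
    ("M006", "20720", "colorectal_cancer_screening", 73, 0),
]
conn.executemany("INSERT INTO measure_results VALUES (?, ?, ?, ?, ?)", sample_rows)

# Count non-compliant older adults per ZIP code for the screening measures.
gaps_by_zip = conn.execute("""
    SELECT zip_code, COUNT(*) AS open_gaps
    FROM measure_results
    WHERE compliant = 0
      AND age >= 65  -- assumed cutoff for "older adults"
      AND measure IN ('colorectal_cancer_screening', 'breast_cancer_screening')
    GROUP BY zip_code
    ORDER BY open_gaps DESC, zip_code
""").fetchall()

# View the results in table format.
print(f"{'ZIP':<8}{'Open gaps':>10}")
for zip_code, open_gaps in gaps_by_zip:
    print(f"{zip_code:<8}{open_gaps:>10}")
```

The per-ZIP counts in `gaps_by_zip` are the kind of result that a visualization tool could turn into the heatmap described above.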
Discover the Value of Real-Time Access to Your Data, Enriched and at Your Fingertips
If your organization uses data to inform business decisions and you aren’t investing in a cloud-based data lake, now is the perfect time to get started.
Discover how Inovalon’s Healthcare Data Lake can drive speed-to-value for your organization – enabling you to confidently merge and enrich your complex, disparate data with applicable data from the industry’s largest primary source dataset to support customized analytics, business intelligence, and data exploration initiatives.
Schedule a demo today.