Big Data Platform
Throughout the healthcare industry, data is captured from many different sources, and while standards for exchanging information between healthcare applications are emerging, much of the data associated with population health remains in disparate silos, in various formats, or on paper, and is exchanged and processed without automation. Where investments have been made in the digitization of health data, many of the resulting solutions remain "walled gardens" of information: data that is static and not easily shared or interpreted.
Our cloud-based analytics platform was designed and developed to address these challenges. It enables integration of any data source, on any hardware platform, in any data format, at extremely high speeds. This approach is comprehensive in that it provides real-time capture, rapid analytical processing, and redistribution of health data. We believe that no other healthcare technology platform so effectively addresses the integration of payers, providers, pharma/life sciences, specialty pharmacy, and device manufacturers at high volume and rapid velocity, with the same depth of data.
We believe that our cloud-based capabilities enable us to receive, integrate, and process extremely large-scale data flows at industry-leading speeds, creating what we believe to be a material market differentiator and value creator for us and our clients. While data integration and processing at scale within the healthcare landscape (known for its highly disparate and "dirty" data) are key technology barriers for many organizations, we believe that we have made these capabilities a true differentiator: we are able to onboard clients and sustain high-velocity computation in industry-leading timeframes.
Our cloud-based analytics platform was created by coupling internally developed software with industry-leading, vendor-agnostic technology frameworks. We leverage modern big data frameworks such as the Hadoop Distributed File System, or HDFS, and the broader Hadoop ecosystem, which enable our platform to store structured and unstructured data while keeping it readily accessible to our analytics engine. Our big data processing capabilities dramatically shorten the cycle from data integration to analytical value, supporting intelligent product development through "real world" application. Our platform laid the foundation of a data fabric that integrates directly with our analytical capabilities: we have moved the analytics to the data instead of requiring the data to be brought to the analytics platform.
Our cloud-based analytics platform receives information from multiple external sources, and that information is loaded into our "data lake" in its native format. Files may be received through secure FTP, web services, and direct connections to external systems. Loading the data into the data lake in its native format ensures that we retain all data exactly as it is received and allows users to query the data directly in its structured or unstructured form.
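The landing step described above can be sketched in a few lines. This is a minimal illustration rather than our production pipeline; the function name, the source/date directory layout, and the checksum convention are assumptions made for the example.

```python
import hashlib
import shutil
from datetime import date
from pathlib import Path

def land_raw_file(source_name: str, incoming: Path, lake_root: Path) -> Path:
    """Copy an incoming file into the data lake unchanged, partitioned by
    source and receipt date, recording a checksum alongside it so the
    original bytes can always be verified later."""
    landing_dir = lake_root / source_name / date.today().isoformat()
    landing_dir.mkdir(parents=True, exist_ok=True)
    target = landing_dir / incoming.name
    shutil.copy2(incoming, target)  # native format preserved byte-for-byte
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    (landing_dir / (incoming.name + ".sha256")).write_text(digest)
    return target
```

Because the file is copied verbatim, downstream users can query the raw bytes directly or re-run mapping logic against them at any time.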
Processing data in its raw format presents many technological challenges. We have developed interactive data mapping technologies that map raw data files to staging structures, which our platform uses to convert data from its native format into a structured format usable by all processes on our platform. Once mapped, the data is run through multiple processes that standardize it and perform data verification and integrity checks. For example, one source may provide a person's gender using code values of "1" for male and "2" for female, while another source may use "M" and "F" to represent the same data. Similarly, one source may send a specific laboratory result value as 7.25 while another source may fill in significant digits and send 7250. Our platform applies our advanced data integrity analytics to convert the incoming data to values that are uniform across our entire platform.
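The standardization step can be illustrated with a small sketch using the gender-code and laboratory-value examples above. The source names, code maps, and scale factors below are hypothetical; in practice the mappings are driven by the interactive data mapping technologies described in the text.

```python
# Per-source code maps translate each feed's local conventions into one
# platform-wide vocabulary; scale factors bring numeric results onto a
# common basis. All source names and factors here are illustrative.
GENDER_MAPS = {
    "source_a": {"1": "M", "2": "F"},
    "source_b": {"M": "M", "F": "F"},
}
RESULT_SCALE = {
    "source_a": 1.0,    # already reports 7.25
    "source_b": 0.001,  # reports 7250 for the same result
}

def standardize(source: str, gender_code: str, result_value: float) -> tuple:
    """Return (uniform gender code, uniform result value) for one record."""
    gender = GENDER_MAPS[source].get(gender_code)
    if gender is None:
        # integrity check: unmapped codes are rejected, not guessed at
        raise ValueError(f"unmapped gender code {gender_code!r} from {source}")
    return gender, result_value * RESULT_SCALE[source]
```

With this table-driven approach, onboarding a new source is a matter of adding its maps rather than changing processing code.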
Our technology platform is built upon modern big data frameworks such as the Hadoop Distributed File System and Hadoop, which enable it to store structured and unstructured data while making it readily accessible to our analytics engine. Data access provided by our data lake leverages scalable application program interfaces, or APIs, and service-based architecture techniques, enabling access to the contextual data needed to perform many different types of analytics. An API is a software intermediary that makes it possible for disparate systems to communicate and function with each other. Ultimately, data is provided to the analytics process and results are stored via service-based requests, providing a scalable repository of source and results data.
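The service-based access pattern can be sketched as a thin facade over the lake: analytics request only the contextual data they need and persist results through the same interface, rather than touching raw files directly. The class and method names below are illustrative assumptions, and in-memory dictionaries stand in for the actual stores.

```python
class DataLakeAPI:
    """Hypothetical facade illustrating service-based access: callers
    fetch contextual data and store results through one interface."""

    def __init__(self):
        self._source = {}   # source data keyed by (member_id, domain)
        self._results = {}  # analytic results keyed by job id

    def load_source(self, member_id, domain, records):
        self._source[(member_id, domain)] = list(records)

    def context(self, member_id, domains):
        """Return only the slices of data a given analytic needs."""
        return {d: self._source.get((member_id, d), []) for d in domains}

    def store_result(self, job_id, result):
        self._results[job_id] = result

    def result(self, job_id):
        return self._results[job_id]
```

Because every interaction goes through service calls, the backing stores can scale or change without the analytics code changing.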
We believe that our track record of strong service is the result of our commitment to excellence and our devotion to maintaining one of the industry's most sophisticated technology infrastructures. We have made significant investments over the past decade to build an industry-leading, enterprise-scale infrastructure capable of managing the heavy computing and storage requirements of our data-driven business. Today, we employ a combination of owned, virtualized data centers along with hosted facilities to enable seamless, secure, and scalable solutions nationwide.
Our physical compute and storage infrastructure is deployed with a hybrid approach to cloud computing. Leveraging heavily virtualized infrastructure together with orchestration and automation tools, we have achieved a high degree of automation and elasticity within our private cloud environment. The following diagram provides a high-level overview of our key infrastructure elements.
Our data and compute capacity is maintained within an interconnected set of facilities made up of two principal data centers owned by us, in the Washington Metro and Atlanta Metro regions, and one co-located data center facility in Northern Virginia, with the ability to interconnect agnostically with third-party cloud capacity providers such as those shown in the diagram. This architecture enables us to maintain both enterprise-level capacity and redundancy while also achieving significant flexibility and cost effectiveness for burst capacity needs.
We have a proven track record in implementing virtualization: our current data centers are over 85% virtualized using VMware technologies. Operations of the virtualization technologies are streamlined by the orchestration, automation, and reporting capabilities provided by our private cloud and its integration with public cloud service providers. These technologies provide computing, storage, and networking components to the hosting environment and deliver operational efficiencies and cost optimization for the corporation.
In partnership with EMC, VMware, and Pivotal, we have implemented a sophisticated hybrid cloud and service-based application stack design, enabling a "burst" capacity architecture that allows provider-agnostic utilization of public cloud capacity when such capacity is required. Our virtualization technology has been integrated with automation and orchestration technology to create a cloud environment that provides both Infrastructure-as-a-Service and Platform-as-a-Service capabilities. These service-based capabilities allow us to dynamically expand our compute capacity in real time and provide the business with a cost-effective and nimble platform. By leveraging both private and public cloud offerings, we can provide efficient, elastic, and cost-effective compute resources based on the operational needs of our clients. We believe we are pioneers in the use of big data technology and a high-performance compute technology stack at the point of care in our industry.
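The "burst" idea reduces to a simple placement rule: satisfy demand from private capacity first and overflow the remainder to a public provider. The sketch below is a toy illustration of that rule under assumed inputs, not a depiction of the actual orchestration stack.

```python
def place_workload(cores_needed: int, private_free: int,
                   public_available: bool) -> tuple:
    """Toy burst-capacity rule: fill from the private cloud first,
    overflow the remainder to a public provider if one is configured.
    Returns (private_cores, public_cores)."""
    private = min(cores_needed, private_free)
    overflow = cores_needed - private
    if overflow and not public_available:
        raise RuntimeError("insufficient capacity: no public provider configured")
    return private, overflow
```

The economics follow directly: steady-state load runs on owned capacity, and only the variable peak is paid for at public cloud rates.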
Our platform is built utilizing an innovative enterprise infrastructure platform enabling robust performance scaling, strong security, high availability, and advanced business continuity options. The building blocks of this infrastructure consist of the following:
- Multiple data centers connected by redundant high-speed WAN connections;
- High competency and utilization of virtualization technologies;
- Rapid provisioning of computing capabilities to deliver the dynamic elasticity needed for the application's variable computing needs;
- Measured service to optimize resource utilization and provide transparency of the utilized services; and
- Available hosting facilities providing physical structure compliance with Federal Information Security Management Act, or FISMA, standards.
The following diagram provides a high-level view of our key platform elements.
Network Operations Center
We maintain a central network operations center, or NOC, where systems are monitored to ensure proper operation and capacity utilization. The NOC monitors and collects information about a multitude of technology operating metrics regarding system load and status. In conjunction with the rapid provisioning capability, automation, and standardization, the NOC provides us with the automated capabilities to oversee and manage our technology resources in order to meet business demands.
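The threshold-style evaluation the NOC performs over collected metrics can be sketched minimally as follows. The metric names and threshold values are illustrative assumptions, not our actual operating parameters.

```python
def evaluate_metrics(metrics: dict, thresholds: dict) -> list:
    """Compare collected operating metrics (e.g., utilization fractions)
    against capacity thresholds; return the names of resources that have
    crossed their threshold and need attention, such as provisioning."""
    return [name for name, value in metrics.items()
            if value >= thresholds.get(name, float("inf"))]
```

Feeding the returned names into the rapid-provisioning tooling is what turns monitoring into the automated capacity management described above.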
Infrastructure Certification and Compliance
We leverage third party attestations to test and validate our technology controls and operating framework. Among these attestations, a nationally recognized professional services firm has conducted an annual Statement on Standards for Attestation Engagements, or SSAE, No. 16, Reporting on Controls at a Service Organization audit of our toolsets and infrastructure for the last several years. We also undergo third party audits and assessments as required by our clients.
Privacy Management and Data Security
Protected health information is perhaps the most sensitive component of personal information, and it is critically important that information about an individual's healthcare be thoroughly protected from inappropriate access, use, and disclosure. Given the industry vertical in which we operate, we recognize the sensitivity of personal health information. We have been a trusted partner to our clients and are committed to ensuring the security and privacy of our client data, enterprise data, and our systems through the application of highly trained personnel, robust processes, and technology. Our privacy and security management includes:
- Governance, frameworks, and models to promote good decision making and accountability. Our comprehensive privacy and security program is based on industry practices, including those of the National Institute of Standards and Technology, the Control Objectives for Information and Related Technology, the Defense Information Systems Agency, and FISMA;
- An internal security council, which advises on and prioritizes the development of information security initiatives, projects, and policies;
- A layered approach to privacy and security management to avoid single points of failure;
- Ongoing evaluation of privacy and security practices to promote continuous improvement;
- Use of administrative, technical, and physical safeguards and controls;
- Collaboration with our clients on security and privacy best practices; and
- Working closely with leading researchers, thought leaders, and policy makers.