Every healthcare organization is trying to solve the problem of structuring its data to create a unified 360-degree view of its patients or members. Doing this correctly allows you to make better patient support decisions, operate more efficiently, and identify population health trends.
This is part one of a two-part blog series in which we demonstrate how a modern data architecture on AWS, combined with a data mesh strategy, is ideally suited for healthcare organizations to:
- Build data domains
- Enable cross-domain collaboration through centralized, federated governance
- Create and deliver data as a product
In this post, we show you how Amazon HealthLake makes it straightforward for data domains to extract meaning from unstructured data and to store, transform, query, and analyze large-scale health data to create a data product in the AWS Cloud. Finally, we show you how organizations can create a unified 360-degree view of their members (Member-360).
In the healthcare industry, Member-360 refers to a comprehensive database of patient information, which is unified to provide a 360-degree view of a member. Creating the member view helps organizations deliver the desired health outcomes. These outcomes could be safety of care, member or patient experience, timeliness and effectiveness of care, and many more for each member while capturing financial efficiencies.
Healthcare has recently been transformed by two remarkable innovations: medical interoperability and machine learning (ML).
For years, healthcare technology was built around individual business functions, which often created data silos. This approach is changing as the industry adopts cloud infrastructure and moves towards a cloud-based modern data architecture to break down those silos. This architecture covers:
- FHIR standard
- Data curation and harmonization
- Advanced analytics using AI and machine learning
Health systems produce petabytes of data every day. This data is highly contextual, heavily multi-modal, and growing exponentially. This mass of information comprises:
- Electronic medical records
- Electronic health records (for example, clinical notes)
- Documents (for example, PDF laboratory reports)
- Forms (for example, insurance claims)
- Images (for example, X-ray, MRI)
- Audio (for example, recorded conversations)
- Time series data (for example, heart ECG or brain EEG traces)
- Streaming data from wearables and more
Adoption of interoperability standards, such as Fast Healthcare Interoperability Resources (FHIR), aims to provide a consistent format to describe and exchange structured data across health systems. However, a significant amount of information is unstructured data, which means the data needs to be extracted and transformed before it can be searched and analyzed.
The process to extract this information is labor-intensive and error-prone. The cost and operational complexity can also be a challenge for most health organizations.
To overcome these challenges, healthcare organizations must modernize their data architecture. The modern data architecture acknowledges the idea that taking a one-size-fits-all approach to implementing a Member-360 solution eventually leads to compromises. To make business decisions driven by data, you can become agile and productive by adopting a mindset that delivers data products from/for specialized teams.
For example, a claims data product helps a payor or provider understand patient interactions with healthcare claims, and a clinical data product could provide insights into healthcare facilities. Data products enable healthcare professionals to make data-driven decisions about the effectiveness of patient engagement programs and similar healthcare initiatives.
To create these data products, organizations should build their data architecture on the data mesh principles:
- Domain-oriented decentralized data ownership
- Self-serve data infrastructure as a platform
- Data as a product
- Federated computational governance
A health data platform built on data mesh principles enables health organizations to share health data efficiently and securely. This could improve quality of care, patient support, observability and enhance collaboration on R&D use cases.
Fig. 1 – A representation of sources which Amazon HealthLake can ingest information from
Amazon HealthLake (HealthLake) takes data from diverse sources (such as hospitals, pharmaceutical companies, providers, patients, health plans, and labs) and creates a FHIR-compliant data store. HealthLake also provides you with tools to import, export, and query your data store.
Furthermore, HealthLake transforms unstructured text using natural language processing (NLP) models into FHIR resources. By leveraging these features, healthcare organizations can define and publish the data products which could benefit:
- Clinical research
- Billing and claims
- Population health management
- Patient engagement
- Healthcare informatics
- Practice management and other fields
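To make the import step more concrete, here is a minimal sketch of how a producer domain might prepare FHIR data and describe an import job. It assumes HealthLake import jobs read newline-delimited JSON from Amazon S3 and that the request follows the `StartFHIRImportJob` API. The patient record, job name, data store ID, bucket names, key ARN, and role ARN are all made-up placeholders, and the code only builds the request rather than calling AWS.

```python
import json
import uuid


def to_ndjson(resources):
    """Serialize FHIR resources as newline-delimited JSON, one resource per line."""
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in resources)


def build_import_job_request(datastore_id, input_s3_uri, output_s3_uri,
                             kms_key_arn, role_arn):
    """Build keyword arguments for a HealthLake FHIR import job (not submitted here)."""
    return {
        "JobName": "member-domain-import",  # hypothetical job name
        "DatastoreId": datastore_id,
        "InputDataConfig": {"S3Uri": input_s3_uri},
        "JobOutputDataConfig": {
            "S3Configuration": {"S3Uri": output_s3_uri, "KmsKeyId": kms_key_arn}
        },
        "DataAccessRoleArn": role_arn,
        "ClientToken": str(uuid.uuid4()),  # idempotency token
    }


# A made-up FHIR R4 Patient resource, for illustration only.
patient = {
    "resourceType": "Patient",
    "id": "member-001",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "gender": "female",
    "birthDate": "1980-04-12",
}

ndjson_body = to_ndjson([patient])

# Every identifier below is a placeholder, not a real resource.
request = build_import_job_request(
    datastore_id="EXAMPLE_DATASTORE_ID",
    input_s3_uri="s3://example-producer-bucket/fhir/input/",
    output_s3_uri="s3://example-producer-bucket/fhir/output/",
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE",
    role_arn="arn:aws:iam::111122223333:role/ExampleHealthLakeImportRole",
)

# With AWS credentials configured, the request could be submitted like this:
#   import boto3
#   boto3.client("healthlake").start_fhir_import_job(**request)
```

Keeping the request builder separate from the API call makes it possible to review and test the job definition before any data leaves the producer account.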
Fig. 2 – High-level flow – how Amazon HealthLake works
Fig. 3 – A representation of Medical Chronology
Fig. 4 – Distribution of medical conditions across the population by encounter type
The data is indexed and structured in chronological order in the cloud (as shown in Fig. 3). These chronological insights help healthcare organizations:
- Explore trends about patients or population and personalize care at the individual level using Amazon QuickSight (as shown in Fig. 4).
- Amazon QuickSight is a cloud-based business intelligence (BI) service that makes it straightforward to build a dashboard in the cloud. It can connect to your data sources in AWS, on premises, or in other public cloud providers to build a customizable dashboard that uses ML integration for insights.
- Build, train, and deploy ML models with Amazon SageMaker to make predictions and use the results to recommend early intervention, improve care, and reduce overall cost. For example, you can predict a patient’s risk of heart failure from sepsis, or optimize and improve patient flow.
- Amazon SageMaker is a fully managed machine learning service. With SageMaker, data scientists and developers can quickly build and train machine learning models, and then directly deploy them into a production-ready hosted environment.
- Share and access data in a secure, compliant and auditable way using standard APIs (create, update, delete, search, read, export).
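As a rough illustration of the standard APIs mentioned above, the sketch below builds FHIR REST URLs for a read and a search against a data store. The endpoint shape, region, data store ID, and member ID are assumptions made for the example. Real requests also need to be signed with AWS Signature Version 4, which is not shown here.

```python
from urllib.parse import urlencode

# FHIR interactions and the HTTP verbs they conventionally map to.
FHIR_HTTP_VERBS = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",
    "delete": "DELETE",
    "search": "GET",
}


def fhir_url(endpoint, resource_type, resource_id=None, search_params=None):
    """Build a FHIR REST URL for a read (by ID) or a search (by parameters)."""
    base = endpoint.rstrip("/") + "/" + resource_type
    if resource_id is not None:
        return base + "/" + resource_id
    if search_params:
        return base + "?" + urlencode(search_params)
    return base


# Placeholder endpoint: the region and data store ID are made up.
ENDPOINT = (
    "https://healthlake.us-east-1.amazonaws.com"
    "/datastore/EXAMPLE_DATASTORE_ID/r4/"
)

# Read one member's Patient resource.
read_url = fhir_url(ENDPOINT, "Patient", resource_id="member-001")

# Search for all Condition resources that reference that member.
search_url = fhir_url(
    ENDPOINT, "Condition", search_params={"subject": "Patient/member-001"}
)

print(FHIR_HTTP_VERBS["read"], read_url)
print(FHIR_HTTP_VERBS["search"], search_url)
```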
Specifically for the healthcare industry, AWS supports a number of global certifications, accreditations, and regulations such as HIPAA, HITRUST, GDPR, and others. Services that are compliant allow organizations to store, process, or transmit sensitive data in the cloud and improve their security and compliance posture.
Solution overview and architecture flow
The following architecture uses a data mesh to build a Member-360 solution. The solution uses three AWS accounts:
- Producer Account where the data products reside
- Central Governance Account for managing the access to the producer and consumer accounts, and to centrally manage data governance
- Consumer Account to build a Member-360 solution using Amazon HealthLake
Fig. 5 – Architecture diagram for building a unified Member-360 view using data mesh and Amazon HealthLake
- Using data mesh, healthcare payor or provider organizations organize their data around various domains like members, claims, billings, medical devices, benefits and more.
- With this approach, each domain owns its data assets end-to-end and is responsible for building, operating, and serving them, and for resolving any issues arising from their usage.
- Each producer domain leverages a modern data platform by creating domain-based data lakes using AWS Lake Formation (Lake Formation). Lake Formation collects and catalogs data from various sources and moves it to Amazon Simple Storage Service (Amazon S3), where producers store raw and transformed data.
- Amazon S3 is a storage service that offers different tiers for storing hot, warm, and cold data. Storing data in the tier that matches its usage pattern reduces infrastructure cost. On top of that, you can manage data at any scale with robust access control, flexible replication tools, and organization-wide visibility. Amazon S3 supports virtually all file formats that are used in the healthcare industry, such as HL7, FHIR, EDI, XML, JSON, and CSV.
- The domain data is then ingested into Amazon HealthLake. Amazon HealthLake makes it straightforward to work with health data and extract relevant data points from unstructured clinical texts.
- Deep learning modeling techniques give us options to build more accurate models with less feature engineering effort. AWS technologies make it possible to visualize model interpretations with a lightweight front-end solution.
- HealthLake supports interoperable standards such as FHIR format and uses natural language processing trained to understand medical terminology to enrich unstructured data with standardized labels (such as for medications, conditions, diagnoses, and procedures). All this information is normalized and added to the member’s record providing a complete view of all the member attributes.
- Each domain producer then registers and creates catalog entries in a centralized data governance account using AWS Glue. AWS Glue is a serverless data integration service that makes it straightforward to discover, prepare, and combine data for analytics, machine learning, and application development.
- The Central Governance Account uses AWS Lake Formation to centrally define security, governance and auditing policies in one place. Lake Formation also provides uniform access control for enterprise-wide data sharing through resource shares with centralized governance and auditing. Each consumer obtains access to shared resources from the Central Governance Account in the form of resource links. These are available in the consumer’s local Lake Formation and AWS Glue Data Catalog, allowing database and table access that can be managed by consumer admins.
- Consumers then run analytical services such as Amazon Athena and Amazon QuickSight on the data catalog to build a Member-360 view dashboard.
- Amazon Athena is a serverless interactive query service that makes it easy to analyze data using standard SQL. It also supports federated query that allows you to query data in sources other than Amazon S3, and you can visualize that data using Amazon QuickSight.
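The governance and consumption steps above can be sketched as two request builders: one for a Lake Formation table grant from the Central Governance Account to a consumer account, and one for an Athena query that joins two shared tables. This is an illustration under stated assumptions, not the implementation behind the architecture. The account IDs, database, table and column names, and the S3 output location are hypothetical, and nothing here calls AWS.

```python
def build_table_grant(catalog_account_id, database, table, consumer_account_id):
    """Keyword arguments for a Lake Formation grant on one catalog table."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": consumer_account_id},
        "Resource": {
            "Table": {
                "CatalogId": catalog_account_id,
                "DatabaseName": database,
                "Name": table,
            }
        },
        "Permissions": ["SELECT", "DESCRIBE"],
        # Lets the consumer admin delegate access to local users and roles.
        "PermissionsWithGrantOption": ["SELECT", "DESCRIBE"],
    }


def build_member_360_query(database, output_s3_uri):
    """Keyword arguments for an Athena query over hypothetical shared tables."""
    sql = (
        "SELECT m.member_id, m.full_name, COUNT(c.claim_id) AS claim_count "
        "FROM members m "
        "LEFT JOIN claims c ON c.member_id = m.member_id "
        "GROUP BY m.member_id, m.full_name"
    )
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


# Placeholder account IDs and names, for illustration only.
grant = build_table_grant(
    catalog_account_id="111122223333",   # Central Governance Account
    database="member_domain",
    table="members",
    consumer_account_id="444455556666",  # Consumer Account
)
query = build_member_360_query(
    database="member_360",
    output_s3_uri="s3://example-consumer-bucket/athena-results/",
)

# With AWS credentials configured in the respective accounts:
#   import boto3
#   boto3.client("lakeformation").grant_permissions(**grant)
#   boto3.client("athena").start_query_execution(**query)
```

In practice, the consumer account would query through resource links in its own Data Catalog, so the database and table names used in the query are the local link names rather than the producer's names.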
Handling PII and PHI data
Source data contains protected health information (PHI) and personally identifiable information (PII) related to members, claims, and financial transactions. The solution leverages AWS services that are HIPAA eligible and encrypts data at rest and in transit:
- AWS Lake Formation is a HIPAA-eligible service which helps you build a secure data lake in a few steps.
- AWS Glue and Amazon S3 can be configured to encrypt data at rest and in transit. For masking PII data, you can use AWS Glue DataBrew, which is a visual data preparation tool.
- Amazon HealthLake is a GDPR-compliant, HIPAA-eligible service that meets rigorous security and access control standards to help protect patients’ sensitive health data and meet regulatory requirements. Customer data is encrypted at all times, in transit and at rest, and can be encrypted using customer managed keys (CMKs).
- With AWS Key Management Service (AWS KMS) you can create and control the cryptographic keys that are used to protect your data. This will help you securely generate hash-based message authentication codes (HMACs) that ensure message integrity and authenticity.
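To show what an HMAC provides, the following sketch computes and verifies an HMAC-SHA-256 tag locally with the Python standard library. It is a stand-in for the concept only: with AWS KMS, the equivalent `GenerateMac` and `VerifyMac` operations run inside the service, so the key material never reaches the application. The message content is made up.

```python
import hashlib
import hmac
import secrets


def sign(key, message):
    """Return the HMAC-SHA-256 tag for a message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """Check a tag in constant time to avoid leaking timing information."""
    return hmac.compare_digest(sign(key, message), tag)


# Local random key for the demo. With AWS KMS, the key would stay in the service.
key = secrets.token_bytes(32)

# Made-up message, for illustration only.
message = b'{"member_id": "member-001", "claim_id": "claim-42"}'
tag = sign(key, message)

# The untouched message verifies, while any modification invalidates the tag.
print(verify(key, message, tag))
print(verify(key, message + b" ", tag))
```

A consumer that receives a message with its tag can therefore detect tampering (integrity) and confirm that the sender had access to the key (authenticity).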
In this post, we shared a high-level data mesh architecture that can simplify the process of healthcare specific system integration, derive meaning from unstructured data, and create a Member-360 view.
By leveraging this architecture healthcare organizations can innovate faster within their respective business domains, increase cross-domain collaboration, and create aggregated data products that can accelerate member or patient outcomes.
In our next post we will show you the technical implementation of how a data producer leverages Amazon HealthLake to create a data product, and how a data consumer accesses the data through a centralized federated governance layer.
To learn more about this solution or to find out what AWS can do for you, contact your AWS representative.
- Design a data mesh architecture using AWS Lake Formation and AWS Glue
- Build and train ML models using a data mesh architecture on AWS
- Unlock patient data insights using Amazon HealthLake
- Population health applications with Amazon HealthLake
- Build a patient outcome prediction application using Amazon HealthLake and Amazon SageMaker