Data fragmentation in European healthcare systems is a growing challenge. Data mesh technology can help to build a data collaboration platform that connects the central and regional government health agencies, insurers, academic medical centers (AMC) and various care providers.
The healthcare industry is modernizing at a rapid speed. New innovations such as wearable health devices, Internet of Medical Things (IoMT), smartphone apps, and telemedicine/virtual care are resulting in more data fragmentation in healthcare systems. Furthermore, new organizations, which want to compete and be successful in their business model through data, are designed not to share their health data assets. This is leading to even further fragmentation in the current national healthcare data landscape.
The COVID-19 pandemic added to these information fractures. Across geographies, governments suddenly needed to envision a post-pandemic healthcare system. They realized the importance of healthcare data sharing mechanisms in strengthening the overall agility and readiness of their public health systems. For example, as a part of the 2022 EU4Health Work Programme, the European Union (EU) is planning to invest €5.3 billion to enhance health systems in European countries, including around €77 million for digital and data initiatives.
In this post, we explore how data mesh technology can help achieve data federation by design and enhance collaboration within healthcare systems. Through data mesh principles and supporting technology building blocks, healthcare systems can build decentralized, domain-driven data platforms that enable data-driven decision making and collaboration across the healthcare journey.
Data Fragmentation in European Health Systems
Health systems have evolved in response to many urgencies, innovations, and social needs. In the current state of healthcare, patient care is organized around episodes of care, not around collective care needs such as population health, chronic disease, or an aging population.
Figure 1: National healthcare system and data landscape
The above diagram provides a bird’s-eye view of a few main stakeholders and data owners in healthcare systems. In the majority of EU countries, the government monitors the healthcare system through the Ministry of Health (MoH). To govern and make policy decisions, national health systems receive data from payer organizations and care providers. Payer organizations hold the majority of data in their systems because primary, secondary, and tertiary care providers share patient care invoices and details with their contracted insurers. Healthcare providers use their local information and communication technology systems to store patient Electronic Health Records (EHR) and to coordinate care. Ownership of the patient data also lies with these healthcare providers.
Data mesh technology for European Healthcare Systems
We propose that a data mesh for national healthcare systems be based on the data mesh enterprise data architecture (Dehghani, 2019) and on the FAIR guiding principles of Findability, Accessibility, Interoperability, and Reusability (Wilkinson et al., 2016). The following diagrams depict such a data mesh at a high level.
Figure 2: Outline of a national health data mesh
Figure 3: High-level data mesh architecture on AWS
The following four layers of a healthcare data mesh make it well suited to creating a decentralized platform for sharing data within a European country’s health system. We mention only a few AWS services here, but a variety of AWS services can be combined to realize these layers with the necessary capability, accessibility, and governance.
Decentralized Data ownership
Data meshes are designed for decentralized data ownership, enabling the organizations or business units that are most knowledgeable about specific data sets to own and manage their data. In a national healthcare data mesh, the central and regional government health agencies, insurers, and academic medical centers each manage their own data landscape and act as a “node” in the network (the data mesh). Aligning the structure of the data landscape with the organization of the broader healthcare system allows for innovation within individual data domains, while allowing for smoother governance of data. Anonymized data that is ready for external consumption (clinical notes, medical images, EHR, lab and omics data, and device data) is exposed by each domain as ‘data products’.
As an example, a primary healthcare provider can own its own systems for managing patient data while still publishing EHR data in an open standard such as HL7 FHIR R4 for authorized consumers. The MoH also has its own data landscape, with the analytical capabilities necessary to understand national health trends. The MoH can focus on aggregating published EHR data products from multiple care providers to deduce national health trends. It can then publish those findings as its own data products, which other healthcare systems can consume for their needs.
Customers are already using AWS Data Exchange to make healthcare datasets widely available, so it can be a key enabler for a data mesh spanning multiple organizations. Each node in the mesh can have its domain-owned data landscape on AWS, on premises, or on another cloud. These nodes interact with AWS Data Exchange using its public APIs for publishing and subscribing. Nodes that choose to build their data landscapes on AWS can do so by leveraging the services and architectural patterns for building a modern data landscape on AWS.
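As a rough illustration of what publishing in an open standard could involve, the following Python sketch maps a record from a provider's local system to a minimal HL7 FHIR R4 Patient resource. The local field names (`patient_ref`, `sex`, `dob`), the gender mapping, and the salted-hash step are assumptions made for this example. A salted hash is pseudonymization rather than full anonymization, so a real data product would follow whatever de-identification rules the mesh's governance body sets.

```python
import hashlib

# Hypothetical mapping from a local system's codes to FHIR administrative-gender codes.
GENDER_MAP = {"M": "male", "F": "female", "O": "other"}


def to_fhir_patient(local_record: dict, salt: str) -> dict:
    """Map a local record (assumed field names) to a minimal FHIR R4 Patient."""
    # Replace the local identifier with a salted hash so the published
    # resource does not expose the provider's internal patient reference.
    digest = hashlib.sha256((salt + local_record["patient_ref"]).encode()).hexdigest()
    return {
        "resourceType": "Patient",
        "id": digest[:32],
        # Unmapped or missing codes fall back to the FHIR code "unknown".
        "gender": GENDER_MAP.get(local_record.get("sex"), "unknown"),
        # Publish only the birth year (FHIR dates may be partial) to
        # reduce re-identification risk.
        "birthDate": local_record["dob"][:4],
    }
```

A provider node could run a mapping like this inside its own data pipeline before handing the result to the publishing step, so that its internal systems stay unchanged.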
Healthcare Data Products
Healthcare data products are easily consumable data artifacts published by the participants of a healthcare data mesh. These are data sets with clear metadata and schema descriptions, usage scenarios, code snippets, and regulatory information. They should be discoverable via a public/private catalogue and must be compliant to regulatory and quality metrics established by the data mesh governance body.
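To make the idea of a self-describing data product more concrete, here is a minimal Python sketch of a catalogue entry. The class name, its fields, and the completeness check are hypothetical; an actual mesh would define its own metadata schema through its governance body.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class DataProductDescriptor:
    """Hypothetical catalogue entry for a healthcare data product."""
    name: str
    owner_domain: str
    schema_uri: str
    data_standard: str
    usage_scenarios: list = field(default_factory=list)
    regulatory_notes: list = field(default_factory=list)

    def missing_fields(self) -> list:
        """Return the names of empty fields that would block publication."""
        return [key for key, value in asdict(self).items() if not value]
```

A publishing component could refuse to list a product while `missing_fields()` returns anything, which keeps incomplete entries out of the catalogue.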
Responsibility for the quality and trustworthiness of a data product lies entirely with the publishing parties and is enforced by the governance processes of the mesh itself. This allows consumers to focus on innovative usage of the data products and not on redundant quality checks of someone else’s data product.
For example, a hospital generally has EHR, Radiology Information System (RIS), Genomic Information System, Laboratory Information System (LIS), and Picture Archiving and Communication System (PACS) units. Aligned with the expertise and ownership of these units, a health data product could be defined and published for clinical and research purposes.
In our proposed architecture, a producer of a data product publishes the curated data to the catalogue on AWS Data Exchange, a consumer finds it by using the search facility, and an aggregator publishes new data products based on those already published on AWS Data Exchange. Additional details that enrich the data product with metadata and usage guidance can be provided through a healthcare-specific “Long Description” template in AWS Data Exchange.
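The producer side of this flow can be sketched in Python as follows. This is a minimal sketch, assuming the curated file already sits in an Amazon S3 bucket and that `dx` behaves like `boto3.client("dataexchange")`. The function name and the bucket and key arguments are placeholders. Listing the finalized revision as a product in the catalogue is a separate step that is not shown here.

```python
import time


def publish_revision(dx, name: str, description: str, bucket: str, key: str,
                     poll_seconds: float = 5.0) -> dict:
    """Create a data set, import one S3 object as an asset, and finalize the revision.

    `dx` is assumed to behave like boto3.client("dataexchange").
    """
    data_set = dx.create_data_set(AssetType="S3_SNAPSHOT", Name=name,
                                  Description=description)
    revision = dx.create_revision(DataSetId=data_set["Id"],
                                  Comment="Initial revision")

    job = dx.create_job(
        Type="IMPORT_ASSETS_FROM_S3",
        Details={
            "ImportAssetsFromS3": {
                "DataSetId": data_set["Id"],
                "RevisionId": revision["Id"],
                "AssetSources": [{"Bucket": bucket, "Key": key}],
            }
        },
    )
    dx.start_job(JobId=job["Id"])

    # Import jobs run asynchronously, so poll until a terminal state is reached.
    while True:
        state = dx.get_job(JobId=job["Id"])["State"]
        if state == "COMPLETED":
            break
        if state in ("ERROR", "CANCELLED", "TIMED_OUT"):
            raise RuntimeError(f"Import job ended in state {state}")
        time.sleep(poll_seconds)

    # A revision has to be finalized before it can be offered to subscribers.
    dx.update_revision(DataSetId=data_set["Id"], RevisionId=revision["Id"],
                       Finalized=True)
    return {"data_set_id": data_set["Id"], "revision_id": revision["Id"]}
```

Passing the client in as an argument keeps the function easy to exercise with a stub and lets each node decide how it obtains credentials.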
Self-service Infrastructure Services
Having a decentralized data landscape can quickly result in duplication of infrastructural components across the mesh. Through its “self-service infrastructural services”, a data mesh can generate infrastructural components for its members, thereby reducing duplication and supporting less technically advanced members.
Some examples of components that could be provisioned on demand are:
- Publishing/consumption components that help members to interact with the mesh – These could be lightweight agents that integrate with data pipelines to publish, update, or consume data products using the AWS Data Exchange API. The aws-dataexchange-api-samples repository can help with development of such components.
- Services to translate between data standards – AWS services like AWS Glue DataBrew and AWS Lambda can be used to build services that transform data between different data standards.
- Security services for encryption and decryption – AWS Key Management Service (AWS KMS) and its encryption and decryption APIs can be used to build security services for mesh participants who don’t have the capabilities to build them themselves.
- Quality of service (QoS) services for testing for compliance and generating quality metrics – Logic to validate a data set’s adherence to compliance and quality standards can be encapsulated in AWS Lambda functions and exposed to the node participants as APIs using Amazon API Gateway.
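The last item in the list above could look roughly like the following Python sketch of an AWS Lambda handler behind an Amazon API Gateway proxy integration. The required fields and the 5% threshold are hypothetical quality rules chosen only for illustration; in practice they would come from the mesh's governance body.

```python
import json

# Hypothetical quality rules for this sketch.
REQUIRED_FIELDS = ("resourceType", "id")
MAX_MISSING_RATIO = 0.05


def check_quality(records: list) -> dict:
    """Compute the share of records that lack a required field."""
    incomplete = sum(
        1 for record in records
        if any(not record.get(name) for name in REQUIRED_FIELDS)
    )
    # An empty data set is treated as non-compliant.
    ratio = incomplete / len(records) if records else 1.0
    return {
        "records": len(records),
        "incomplete": incomplete,
        "missing_ratio": ratio,
        "compliant": ratio <= MAX_MISSING_RATIO,
    }


def handler(event, context):
    """Lambda entry point; expects a JSON array of objects in the request body."""
    try:
        records = json.loads(event.get("body") or "[]")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "body must be JSON"})}
    if not isinstance(records, list) or not all(isinstance(r, dict) for r in records):
        return {"statusCode": 400,
                "body": json.dumps({"error": "body must be a JSON array of objects"})}
    return {"statusCode": 200, "body": json.dumps(check_quality(records))}
```

Keeping the rule evaluation in `check_quality` separate from the handler means the same logic can be reused in batch pipelines that do not go through the API.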
Furthermore, infrastructure as code (IaC) services like AWS CloudFormation, component repositories hosted on Amazon Simple Storage Service (Amazon S3), and automated deployment services like AWS CodeCommit, AWS CodeDeploy, and AWS CodePipeline can be leveraged to implement a feature-rich self-service layer.
This layer can also be used to provision end-to-end data pipelines and analytical components for the less tech-savvy members of the mesh. This reduces the overall costs of running a data mesh and lowers the barriers to entry for its members, thereby increasing adoption.
Federated Data Governance
A key consideration in the creation of a national health data mesh is how to tackle the complex socio-political and financial relationships between the participating entities. Having a representative data council with an agile approach to decision-making is key to the success of the data mesh.
The data council will be responsible for decisions including, but not limited to:
- Setting the vision and architectural principles for the data mesh
- Managing the roadmap for platform capabilities
- Managing funding and investment areas
- Separating governance decisions at the central level vs at the domain level
- Setting guidance on health data standards
- Setting out the security policies, quality metrics (SLOs) and regulatory compliance
Data meshes can automate governance processes through “policy as code” and digital execution. Creating self-service APIs that nodes use to plug into the governance processes, and offering them through the self-service layer, will accelerate automation of the governance layer.
On AWS, the governance layer can be automated by implementing “policy as code” solutions using AWS Cloud Development Kit (AWS CDK), AWS Lambda, and declarative policy tooling like Open Policy Agent, as described in the blog Cloud governance and compliance on AWS with policy as code.
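The core idea of policy as code can be shown with a small, self-contained Python sketch. It is a plain-Python stand-in for a policy engine and does not use Open Policy Agent or its Rego language. The policy ids, the metadata fields, and the allowed values are hypothetical examples of rules a data council might set.

```python
# Hypothetical policies expressed as data rather than as procedural checks.
POLICIES = [
    {"id": "encryption-at-rest", "field": "encrypted", "equals": True},
    {"id": "approved-standard", "field": "data_standard",
     "one_of": ["HL7 FHIR R4", "DICOM"]},
    {"id": "eu-residency", "field": "region",
     "one_of": ["eu-west-1", "eu-central-1"]},
]


def evaluate(product: dict, policies: list = POLICIES) -> list:
    """Return the ids of the policies that a data product's metadata violates."""
    violations = []
    for policy in policies:
        value = product.get(policy["field"])
        if "equals" in policy and value != policy["equals"]:
            violations.append(policy["id"])
        elif "one_of" in policy and value not in policy["one_of"]:
            violations.append(policy["id"])
    return violations
```

Because the rules are data, the data council can version and review them like any other code, and every node evaluates the same set before publishing.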
Envisioning Data Mesh for a European country’s healthcare system
Taking one European country as an example, each year there are approximately 234 million primary care medical consultations, 83 million hospital visits, 4 million hospital admissions and 23 million emergencies in the national health system. Associated with these services, a wide variety of diagnostic tests and prescriptions are generated and stored.
This clinical data is then used as primary data for patient care or secondary for research purposes. However, in the current state, all this data is fragmented across the different regional healthcare systems, and can’t be easily aggregated.
A healthcare data platform that follows data mesh principles could be created jointly by the central administration and the different regional services, with coordination at the national level. This will help to:
- Address the investment needs of participating regional healthcare systems
- Aggregate regional data to detect national health trends
- Unlock the potential for value generation through primary or secondary use cases for health data
- Give flexibility and autonomy to each regional and national producer and consumer in the health data mesh
The same listed AWS services could be applied here to create a seamless data mesh for European healthcare systems.
In this post, we shared how data mesh technology provides a mechanism to build data federation by design and to help remove data fragmentation in national health systems. Giving each individual healthcare organization complete control over its data landscape will promote innovation within each domain. A data mesh’s federated governance and self-service infrastructure provisioning help reduce infrastructural duplication, optimize the overall costs of running a data mesh, and lower the barriers to entry for its members, thereby increasing adoption.
At the same time, democratization of data through data products will promote innovative usage of data by the healthcare industry, pushing usage beyond the original intent of the data sets. The self-service platform can become a vehicle for distributing investments in the form of technical capabilities to the participating organizations, which can be of great benefit in the case of centralized funding.
A health data platform built on data mesh principles could enable health organizations to share health data efficiently and securely to improve quality of care, patient support (clinical use), and observability, and to enhance collaboration on research use cases. With the new innovations in healthcare that a data mesh enables, organizations can follow a patient journey longitudinally, thereby substantially improving care pathways.
We only named a few of the services that can achieve a data mesh for healthcare systems. However, we also know that not every healthcare system is alike. To that end, AWS has multiple ways and services to best fit each individual situation.
To discuss anything covered in this post, contact: Krishna Singh, Technical Business Development Manager – Healthcare email@example.com
- For designing a data mesh read:
- For building data mesh check out:
- For more about how Covid19 was a stressor on healthcare systems: