Enable Secure and Efficient Clinical Collaboration With Amazon S3 Access Points

September 6, 2022 Kai Xu

Securely sharing clinical research data with collaborators and efficiently managing data access is a recurring challenge that clinical research teams face. Amazon S3 Access Points enable clinical researchers and healthcare IT teams to efficiently create, share, and maintain scalable access to clinical data in a shared Amazon S3 bucket at the individual user level.

Introduction

Clinical research often requires collaboration from multiple stakeholders and organizations. Collaboration can foster a dynamic scientific community and expedite the development of research. One collaboration approach we often see from our Academic Medical Center customers is clinical data sharing. Clinical researchers usually need to receive clinical data from multiple collaborating institutions or researchers so that the data can be aggregated, transformed, and analyzed.

One key aspect of promoting better collaboration in clinical research is to remove barriers, including technical ones. Having IT teams provide a secure and efficient data sharing solution is critical to improving clinical collaborations for researchers and potentially speeding discoveries to market.

Healthcare IT teams that support the clinical research projects often use Amazon Simple Storage Service (Amazon S3) buckets as the storage solution for external collaborators to upload and share clinical data objects. Historically, they would decide between two options to secure and isolate the uploaded data:

  1. Create multiple Amazon S3 buckets, one for each collaborator. Each collaborator can then upload data only to their designated bucket, as enforced by that bucket’s S3 bucket policy.
  2. Create one shared Amazon S3 bucket and multiple folders, one folder for each collaborator. Then use an S3 bucket policy to grant permissions to each collaborator so they can access their designated bucket folder.

However, both options have limitations. By default, users can create only up to 100 buckets in each AWS account, and the best practice is to create an S3 bucket for each business function rather than for each user. On the other hand, a single bucket policy for one shared bucket that controls access for many users with different permission levels becomes complex. Such a policy is time-consuming to manage, must be audited to make sure that changes don’t have an unexpected impact on other users, and can easily exceed the 20 KB size limit of a bucket policy.
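To make the size concern concrete, the following is a rough Python sketch that builds a single shared bucket policy with one upload statement per collaborator and measures its serialized size. The statement shape, the `collaborator-NNN` user names, and the account ID `111122223333` are illustrative assumptions, and the sketch measures compact JSON only; it does not model exactly how Amazon S3 counts policy size.

```python
import json

BUCKET_POLICY_LIMIT_BYTES = 20 * 1024  # the 20 KB bucket policy limit cited above


def shared_bucket_policy(bucket, account_id, users):
    """Build one bucket policy with a PutObject statement per collaborator folder.

    The statement layout is an assumption made for illustration only.
    """
    statements = [
        {
            "Sid": f"Upload{index}",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:user/{user}"},
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{bucket}/{user}/*",
        }
        for index, user in enumerate(users)
    ]
    return {"Version": "2012-10-17", "Statement": statements}


def policy_size_bytes(policy):
    """Return the size of the policy serialized as compact JSON, in bytes."""
    return len(json.dumps(policy, separators=(",", ":")).encode("utf-8"))


if __name__ == "__main__":
    for count in (10, 50, 100, 200):
        users = [f"collaborator-{n:03d}" for n in range(count)]  # hypothetical names
        policy = shared_bucket_policy("BUCKET-NAME", "111122223333", users)
        size = policy_size_bytes(policy)
        print(count, size, size > BUCKET_POLICY_LIMIT_BYTES)
```

Even with a single short statement per user, the policy grows linearly with the number of collaborators; real policies with several permission levels per user would grow faster.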

A recommended solution is to use Amazon S3 Access Points, a feature of Amazon S3, for efficient data access control at the individual user level, which helps secure data sharing during clinical research collaborations. Each access point has its own access control policy that grants permissions to individual users. It is more efficient to scale access for hundreds of users by creating individualized access points, with names and permissions customized for each user, than to create and maintain a complex bucket policy.

Overview of solution

This solution demonstrates the use of Amazon S3 Access Points to allow a collaborator to access objects within a defined folder in an S3 bucket. An Amazon S3 bucket policy that enables the use of access points needs to be created only once. After that, the solution can scale bucket access to more collaborators by configuring additional access points and policies, one for each collaborator, without having to edit the Amazon S3 bucket policy.

Below is a diagram depicting the solution with an example use case.

Figure 1: Solution diagram with an example use case

Components of the solution

Below are some of the important components shown in the example use case to depict the recommended solution.

  1. Users and permissions

Alice and Bob are two collaborating users with their own AWS IAM user identities created in an AWS account.

An IAM Policy is attached to both user identities allowing them to list S3 buckets and access S3 access points.

An example policy in JSON format:

{
    "Version": "2012-10-17",
    "Statement": {
        "Action": [
            "s3:GetAccessPoint",
            "s3:ListAllMyBuckets",
            "s3:ListAccessPoints",
            "s3:ListBucket"
        ],
        "Resource": "*",
        "Effect": "Allow"
    }
}

  2. S3 bucket policy

A bucket policy is created to allow objects to be uploaded (s3:PutObject) through any access point owned by the AWS account in which we created the IAM users for Alice and Bob.

An example bucket policy in JSON format:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::BUCKET-NAME/*",
            "Condition": {
                "StringEquals": {
                    "s3:DataAccessPointAccount": "ACCOUNT-ID"
                }
            }
        }
    ]
}

To use this policy, replace the placeholder text BUCKET-NAME and ACCOUNT-ID with your own values.
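Filling in the placeholders can also be scripted. The following is a minimal Python sketch that returns the bucket policy with BUCKET-NAME and ACCOUNT-ID substituted; the function name and the sample values (`example-clinical-bucket`, `111122223333`) are hypothetical and not part of the original solution.

```python
import json


def delegating_bucket_policy(bucket_name, account_id):
    """Return the example bucket policy with its two placeholders filled in.

    The policy allows s3:PutObject on every object in the bucket, but only
    for requests made through an access point owned by `account_id`.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    "StringEquals": {"s3:DataAccessPointAccount": account_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Sample values are placeholders; substitute your own bucket and account.
    policy = delegating_bucket_policy("example-clinical-bucket", "111122223333")
    print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the bucket policy editor or saved to a file for use with other tooling.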

  3. Amazon S3 Access Points and access point policy

An access point is created for each user.

In Alice’s access point settings, an access point policy is created that allows only IAM user Alice to put objects in the bucket prefix assigned to Alice. (The folder/prefix should be created beforehand. See the Amazon S3 User Guide for how to create a folder in an S3 bucket.)

In Bob’s access point settings, an access point policy is created that allows only IAM user Bob to put objects in the bucket prefix assigned to Bob. (The folder/prefix should be created beforehand, following the same process as for Alice.)

An example access point policy in JSON format:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::ACCOUNT-ID:user/Alice"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:REGION:ACCOUNT-ID:accesspoint/ACCESS-POINT-NAME/object/Alice/*"
        }
    ]
}

To use this policy, replace the placeholder text ACCOUNT-ID, REGION, and ACCESS-POINT-NAME with your own values.
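Because the access point policy differs between collaborators only in the user name and access point name, it lends itself to being generated. The following is a minimal Python sketch under two assumptions: each collaborator’s prefix equals their IAM user name, and each access point is named `<user>-access-point` in lowercase (matching alice-access-point and bob-access-point in the test steps). The function names, Region, and account ID are hypothetical.

```python
import json


def access_point_policy(region, account_id, access_point_name, user):
    """Return an access point policy letting one IAM user upload to their prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Statement1",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:user/{user}"},
                "Action": "s3:PutObject",
                "Resource": (
                    f"arn:aws:s3:{region}:{account_id}:accesspoint/"
                    f"{access_point_name}/object/{user}/*"
                ),
            }
        ],
    }


def policies_for(region, account_id, users):
    """Map an assumed access point name to its policy, one entry per collaborator."""
    return {
        f"{user.lower()}-access-point": access_point_policy(
            region, account_id, f"{user.lower()}-access-point", user
        )
        for user in users
    }


if __name__ == "__main__":
    # Region and account ID are sample values; substitute your own.
    for name, policy in policies_for("us-east-1", "111122223333", ["Alice", "Bob"]).items():
        print(name)
        print(json.dumps(policy, indent=4))
```

Adding a collaborator then means adding one name to the list, while the bucket policy stays untouched.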

This is an example AWS CloudFormation template that can be downloaded and deployed in the CloudFormation console of an AWS account to test out the solution.

In the Parameters section of the AWS CloudFormation Specify stack details page, enter passwords for the two IAM users to be created. The passwords must comply with the password policy defined in the AWS account. See Figure 2 below:

Figure 2: Enter passwords for the IAM users on the stack details page

Using access points

You can access the objects in an Amazon S3 bucket through an access point using the AWS Management Console, AWS Command Line Interface (AWS CLI), AWS SDKs, or the Amazon S3 REST APIs. Each access point has an Amazon Resource Name (ARN) and a bucket-style alias, either of which can be used for data access.

The following example shows a command for uploading a file using the AWS CLI. Here we upload an image file, my-image.jpg, through the access point alias my-access-point-hrzrlukc5m36ft7okagglf3gmwluquse1b-s3alias:

aws s3api put-object --bucket my-access-point-hrzrlukc5m36ft7okagglf3gmwluquse1b-s3alias --key my-image.jpg --body my-image.jpg
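When uploads are scripted, the same command can be assembled programmatically. The following is a small Python sketch that builds the `aws s3api put-object` argument list for either an access point alias or an access point ARN. It only constructs the command and does not run it; the helper names, Region, and account ID are hypothetical.

```python
import shlex


def access_point_arn(region, account_id, name):
    """Build an access point ARN, using the same format as the policy Resource above."""
    return f"arn:aws:s3:{region}:{account_id}:accesspoint/{name}"


def put_object_command(access_point, key, body_path):
    """Return the AWS CLI arguments for uploading a file through an access point.

    `access_point` may be an access point alias or an access point ARN; both
    are passed in the --bucket parameter.
    """
    return [
        "aws", "s3api", "put-object",
        "--bucket", access_point,
        "--key", key,
        "--body", body_path,
    ]


if __name__ == "__main__":
    alias = "my-access-point-hrzrlukc5m36ft7okagglf3gmwluquse1b-s3alias"
    print(shlex.join(put_object_command(alias, "my-image.jpg", "my-image.jpg")))

    # Uploading into Alice's prefix through her access point ARN (sample values).
    arn = access_point_arn("us-east-1", "111122223333", "alice-access-point")
    print(shlex.join(put_object_command(arn, "Alice/my-image.jpg", "my-image.jpg")))
```

The returned list can be passed to `subprocess.run` on a machine where the AWS CLI is installed and credentials are configured.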

Testing the solution

We have shown a solution for managing clinical data uploads to a shared Amazon S3 bucket with access points. To test the example use case:

  1. Sign in to the AWS Management Console with IAM user name Alice and password configured in the passwdAlice parameter of the CloudFormation stack. Open the Amazon S3 console.
  2. On the S3 Bucket screen, access the shared Amazon S3 bucket named s3-ap-stack-ACCOUNT_ID-s3-access-point. Try to upload a file. The upload should fail because the bucket policy only allows user Alice to upload objects through Amazon S3 Access Points.
  3. Access the bucket’s access point that was created for Alice, alice-access-point. Try to upload a file to the folder created for Alice. The upload should be successful.

Figure 3: Accessing Amazon S3 Access Points within the Amazon S3 console

  4. In the same access point for Alice, try to upload a file to the folder created for user Bob. The upload should fail because the access point policy only allows user Alice to put objects in Alice’s folder.
  5. Now have user Alice try to use the access point created for Bob, bob-access-point. Try to upload a file to the folder created for user Bob. The upload should fail because the access point policy only allows user Bob to put objects in Bob’s folder.
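The expected outcomes of these steps can be summarized in a toy Python model. This is a simplified sketch for reasoning about the test cases, not a reproduction of how AWS evaluates policies: it assumes each access point policy names exactly one user and one prefix, and that a direct bucket upload is denied because no policy grants it outside an access point. The object key names are hypothetical.

```python
from fnmatch import fnmatchcase

# Simplified view of each access point policy: one allowed user and one
# object-key pattern. This mirrors the example policies, not the AWS engine.
ACCESS_POINT_POLICIES = {
    "alice-access-point": {"principal": "Alice", "key_pattern": "Alice/*"},
    "bob-access-point": {"principal": "Bob", "key_pattern": "Bob/*"},
}


def upload_allowed(user, access_point, key):
    """Return True if the toy model expects the upload to succeed.

    `access_point` is None for a direct upload to the bucket, which the
    bucket policy in this solution does not permit.
    """
    if access_point is None:
        return False
    policy = ACCESS_POINT_POLICIES[access_point]
    return user == policy["principal"] and fnmatchcase(key, policy["key_pattern"])


if __name__ == "__main__":
    scenarios = [
        ("Alice", None, "Alice/scan.jpg"),                  # step 2: direct to bucket
        ("Alice", "alice-access-point", "Alice/scan.jpg"),  # step 3: own folder
        ("Alice", "alice-access-point", "Bob/scan.jpg"),    # step 4: Bob's folder
        ("Alice", "bob-access-point", "Bob/scan.jpg"),      # step 5: Bob's access point
    ]
    for user, access_point, key in scenarios:
        print(user, access_point, key, upload_allowed(user, access_point, key))
```

Only the upload in step 3 is expected to succeed; the other three fail for the reasons given in the corresponding steps.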

Conclusion

Securely and efficiently controlling the access to shared datasets is a challenge for clinical collaborations. A couple of options to implement data sharing were discussed – managing multiple S3 buckets or maintaining complex S3 bucket policies. Both options present challenges for maintainability. The solution proposed shows how to leverage Amazon S3 Access Points, a feature of Amazon S3, to manage data access in a shared S3 bucket at an individual user level. With this feature, healthcare IT teams can empower clinical researchers to share clinical data with collaborators efficiently. This enables removing technical barriers to unlock new discoveries, speed them to market, and reach a new level of collaboration.

For more information about managing data access with Amazon S3 Access Points, please check out the Amazon S3 User Guide.
