Last Updated: June 2, 2020


The Globus AWS S3 storage connector can be used for access and sharing of data on AWS S3. The connector is available as an add-on subscription to organizations with a Globus Standard subscription - please contact us for pricing.

This document describes how to use the AWS S3 Connector to configure AWS S3 Storage Gateways and Collections. After these steps are complete, any Globus user you have authorized can register a credential to access the AWS S3 buckets they have access to and, if enabled, can create guest collections to share access using those credentials by following the instructions in How To Share Data Using Globus.

This document assumes that you or another administrator has already installed Globus Connect Server v5 on one or more data transfer nodes, and that you have an administrator role on that endpoint.

The installation must be done by a system administrator, and consists of the following steps:

  • Create a storage gateway on the endpoint configured to use the AWS S3 Connector.

  • Create a mapped collection using the AWS S3 Storage Gateway to provide access to AWS S3 data.

Please contact us at support@globus.org if you have questions or need help with installation and use of the AWS S3 Connector.


S3 Connector Virtual Filesystem

AWS S3 is a distributed object store, where each data object is accessed based on a bucket name and an object name.

The S3 connector attempts to make this look like a regular filesystem, by treating the bucket name as the name of a directory in the root of the storage gateway’s file system. For example, if a user has access to buckets bucket1 and bucket2, then those buckets would show up as directories when listing /.

The S3 connector also treats the / character as a delimiter in the S3 API so that it can present something that looks like subdirectories. For example, the object object1 in bucket1 would appear as /bucket1/object1 in the S3 connector, and the object object2/object3 in bucket2 would appear as a file called object3 in the directory /bucket2/object2.
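The mapping described above can be modeled in a few lines of Python. This is a minimal sketch for illustration only, not the connector's implementation; the function names to_virtual_path and from_virtual_path are hypothetical.

```python
from typing import Optional, Tuple


def to_virtual_path(bucket: str, key: str = "") -> str:
    """Join a bucket name and an object key into a filesystem-style path."""
    return "/" + bucket + ("/" + key if key else "")


def from_virtual_path(path: str) -> Tuple[Optional[str], str]:
    """Split a path into (bucket, key). The root "/" yields (None, "")."""
    parts = path.lstrip("/").split("/", 1)
    bucket = parts[0] or None          # first component is the bucket name
    key = parts[1] if len(parts) > 1 else ""  # the rest is the object key
    return bucket, key


print(to_virtual_path("bucket1", "object1"))
print(to_virtual_path("bucket2", "object2/object3"))
print(from_virtual_path("/bucket2/object2/object3"))
```

Because the first path component is always the bucket, a listing of / corresponds to a list of buckets rather than a list of objects.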

Authenticated and Anonymous Access

Each S3 Storage Gateway can be configured to perform either authenticated or unauthenticated access to S3 data. When creating an S3 Storage Gateway, you must choose which type of access to require.

authenticated

Globus users must register an S3 Credential with Globus Connect Server in order to access data on its collections. The credential must be associated with a policy that allows the IAM permissions used by the AWS S3 Connector.

unauthenticated

Globus users can only access public AWS Buckets.

Storage Gateway

An S3 Storage Gateway is created with the command globus-connect-server storage-gateway create s3, and can be updated with the command globus-connect-server storage-gateway update s3.

Before looking into the policy options specific to the AWS S3 Connector, please familiarize yourself with the Globus Connect Server v5 Data Access Guide which describes the steps to create and update a storage gateway, using the POSIX connector as an example. The commands to create and update a storage gateway for the AWS S3 Connector are similar.

S3 Storage Gateway Policies

The --s3-user-credential, --s3-unauthenticated, --bucket, and --s3-endpoint command-line options control access to an Amazon S3 or compatible resource.

Endpoint

The --s3-endpoint command-line option specifies the URL that Globus Connect Server uses to contact the S3 API when accessing data on this storage gateway. This may be an Amazon S3 URL, a regional Amazon S3 URL, or the URL endpoint of another compatible storage system.

Example 1. Selecting an S3 API Endpoint

For our example, we’ll use Amazon S3’s standard US-East-1 regional endpoint, which is located at https://s3.amazonaws.com.

--s3-endpoint https://s3.amazonaws.com
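For other Amazon regions, the regional URL generally follows the pattern https://s3.REGION.amazonaws.com. The sketch below builds such a URL; it assumes the standard AWS partition (other partitions and S3-compatible systems publish their own endpoint URLs), and the function name regional_s3_endpoint is hypothetical.

```python
def regional_s3_endpoint(region: str) -> str:
    """Return the regional Amazon S3 URL for a region such as "us-west-2".

    Assumes the standard AWS partition; this is not valid for every
    partition or for non-Amazon S3-compatible storage systems.
    """
    if not region or not region.replace("-", "").isalnum():
        raise ValueError(f"unexpected region name: {region!r}")
    return f"https://s3.{region}.amazonaws.com"


# Print the option as it could be passed on the command line.
print("--s3-endpoint", regional_s3_endpoint("us-west-2"))
```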

Access Mode

The --s3-user-credential and --s3-unauthenticated command-line options are mutually exclusive.

If the --s3-user-credential command-line option is enabled, then each user accessing collections on this storage gateway must register an S3 key_id and secret_key with the storage gateway.

If the --s3-unauthenticated command-line option is enabled, then all accesses to collections on this storage gateway will be done using unauthenticated access. In this case, the root of the S3 Connector Virtual Filesystem will only be able to list buckets that are explicitly made visible by using the --bucket command-line option.

Example 2. Choosing an Access Mode

For our example, we’ll create a Storage Gateway that provides authenticated access to data buckets. Users will need to register credentials with this endpoint using the Globus web application.

--s3-user-credential

Bucket Restrictions

The argument to the --bucket command-line option is the name of a bucket that this storage gateway is allowed to access. The option may be given more than once to allow several buckets.

Example 3. Restricting Access to Buckets

For our example, we’ll create a Storage Gateway that restricts access to two buckets owned by our organization: research-data-bucket-1 and research-data-bucket-2. Users will be restricted to only those buckets when using collections created on this storage gateway, and only if their credential has permissions to access them.

--bucket research-data-bucket-1 --bucket research-data-bucket-2

If no buckets are configured, then any buckets accessible using the user’s registered S3 key_id and secret_key may be accessed by collections on this storage gateway. If any are configured, then they restrict which buckets are visible and accessible on collections on this storage gateway.
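The rule for authenticated access can be summarized as a simple filter. The following Python sketch only models the behavior described in this section and is not the connector's code; the function name visible_buckets and its parameters are hypothetical.

```python
from typing import Iterable, List


def visible_buckets(credential_buckets: Iterable[str],
                    configured_buckets: Iterable[str] = ()) -> List[str]:
    """Model which buckets a user would see on the storage gateway.

    credential_buckets: buckets the user's registered credential can access.
    configured_buckets: buckets named with --bucket (empty means no restriction).
    """
    allowed = set(configured_buckets)
    buckets = sorted(set(credential_buckets))
    if not allowed:
        # No --bucket options: everything the credential can reach is visible.
        return buckets
    # Otherwise only buckets that are both reachable and configured remain.
    return [b for b in buckets if b in allowed]


print(visible_buckets(
    ["research-data-bucket-1", "personal-bucket"],
    ["research-data-bucket-1", "research-data-bucket-2"],
))
```

In this model a configured bucket that the credential cannot access never appears, which matches the statement that access is granted only if the user's credential has the necessary permissions.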

Creating the Storage Gateway

Now that we have decided on all our policies, we’ll use the command to create the storage gateway.

% globus-connect-server storage-gateway create s3 \
    "S3 Storage Gateway" \
    --domain example.org \
    --s3-endpoint https://s3.amazonaws.com \
    --s3-user-credential \
    --bucket research-data-bucket-1 --bucket research-data-bucket-2

Storage Gateway Created: 7187a9a0-68e4-48ea-b3b9-7fd06630f8ab

The command succeeded and printed the ID of the new storage gateway (7187a9a0-68e4-48ea-b3b9-7fd06630f8ab in this case) for our reference. Note that this will be a different, unique value each time you run the command. If you forget the ID of a storage gateway, you can always use the command globus-connect-server storage-gateway list to get a list of the storage gateways on the endpoint.
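If you script the creation step, you may want to capture the ID from the command output. The sketch below assumes the output contains a line in exactly the form shown in this document ("Storage Gateway Created: ID"); the format may differ between versions, and the function name parse_gateway_id is hypothetical.

```python
import re
import uuid

# Matches the confirmation line shown in the example output above.
_CREATED = re.compile(r"^Storage Gateway Created:\s*(\S+)\s*$", re.MULTILINE)


def parse_gateway_id(output: str) -> str:
    """Extract the storage gateway ID from captured command output."""
    match = _CREATED.search(output)
    if match is None:
        raise ValueError("no 'Storage Gateway Created' line found")
    # uuid.UUID raises ValueError if the captured text is not a valid UUID.
    return str(uuid.UUID(match.group(1)))


sample = "Storage Gateway Created: 7187a9a0-68e4-48ea-b3b9-7fd06630f8ab\n"
print(parse_gateway_id(sample))
```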

You can also add other policies to configure additional identity mapping and path restriction policies as described in the Globus Connect Server v5 Data Access Guide.

Note that this creates the storage gateway, but does not yet make it accessible via Globus and HTTPS. You’ll need to follow the steps in the next section.

Collection

An AWS S3 Collection is created with the command globus-connect-server collection create, and can be updated with the command globus-connect-server collection update.

As the AWS S3 Connector does not introduce any policies beyond those used by the base collection type, you can follow the sequence in the Collections Section of the Globus Connect Server v5 Data Access Guide. Recall, however, that the paths are interpreted as described above in S3 Connector Virtual Filesystem.

User Credential

As mentioned above, when the storage gateway is configured to provide authenticated access to AWS S3, users must register their own S3 keys. These keys must have the following IAM permissions when accessing S3:

Required S3 Permissions

To access S3 resources on a user’s behalf, the AWS S3 Connector requires credentials that have been granted the following S3 permissions:

  • s3:ListAllMyBuckets and s3:GetBucketLocation on the * resource.

  • s3:ListBucket and s3:ListBucketMultipartUploads on the buckets: resource arn:aws:s3:::[bucket-name].

  • s3:GetObject, s3:PutObject, s3:DeleteObject, s3:ListMultipartUploadParts, and s3:AbortMultipartUpload on the objects: resource arn:aws:s3:::[bucket-name]/*.

Example JSON policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllBuckets",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:GetBucketLocation"
            ],
            "Resource": "*"
        },
        {
            "Sid": "Bucket",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:ListBucketMultipartUploads"
            ],
            "Resource": "arn:aws:s3:::example-bucket"
        },
        {
            "Sid": "Objects",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:ListMultipartUploadParts",
                "s3:AbortMultipartUpload"
            ],
            "Resource": "arn:aws:s3:::example-bucket/*"
        }
    ]
}
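When a credential needs access to several buckets, the same three statements can be generated with a list of resource ARNs. The following Python sketch produces a policy with the permissions listed above; the function name build_policy is hypothetical, and the generated policy should be reviewed against your organization's requirements before use.

```python
import json
from typing import Dict, List


def build_policy(buckets: List[str]) -> Dict:
    """Build an IAM policy granting the permissions listed above."""
    if not buckets:
        raise ValueError("at least one bucket name is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets", "s3:GetBucketLocation"],
                "Resource": "*",
            },
            {
                "Sid": "Bucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:ListBucketMultipartUploads"],
                # One bucket-level ARN per bucket.
                "Resource": [f"arn:aws:s3:::{b}" for b in buckets],
            },
            {
                "Sid": "Objects",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:ListMultipartUploadParts",
                    "s3:AbortMultipartUpload",
                ],
                # One object-level ARN pattern per bucket.
                "Resource": [f"arn:aws:s3:::{b}/*" for b in buckets],
            },
        ],
    }


policy = build_policy(["research-data-bucket-1", "research-data-bucket-2"])
print(json.dumps(policy, indent=4))
```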

Appendix A: Document Types for the AWS S3 Connector

S3StoragePolicies Document

The S3StoragePolicies document describes S3-specific configuration policies. These policies contain information about how to contact the S3 API and also restrictions on which S3 buckets are available for access via this storage gateway.

| Name | Type | Description |
|---|---|---|
| DATA_TYPE | string s3_storage_policies#1.0.0 | Type of this document |
| s3_endpoint | string <uri> | URL of the S3 API endpoint |
| s3_buckets | array (string) | List of buckets not owned by the collection owner that will be shown in the root of collections created at the base of this Storage Gateway. |
| s3_user_credential_required | boolean | Flag indicating if a Globus User must register a user credential in order to create a Guest Collection on this Storage Gateway. |

{
  "DATA_TYPE": "s3_storage_policies#1.0.0",
  "s3_endpoint": "https://s3.amazonaws.com",
  "s3_buckets": [
    "awsexamplebucket1"
  ],
  "s3_user_credential_required": true
}
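A document of this shape can be assembled and sanity-checked programmatically. The sketch below is illustrative only: the function name make_s3_storage_policies is hypothetical, and the check that the endpoint is an http or https URL is an assumption of this example rather than a documented rule.

```python
import json
from typing import Dict, Iterable
from urllib.parse import urlparse


def make_s3_storage_policies(endpoint: str,
                             buckets: Iterable[str] = (),
                             user_credential_required: bool = True) -> Dict:
    """Build an S3StoragePolicies document from its three fields."""
    parsed = urlparse(endpoint)
    # Assumption of this sketch: endpoints are absolute http(s) URLs.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an http(s) URL: {endpoint!r}")
    return {
        "DATA_TYPE": "s3_storage_policies#1.0.0",
        "s3_endpoint": endpoint,
        "s3_buckets": list(buckets),
        "s3_user_credential_required": bool(user_credential_required),
    }


doc = make_s3_storage_policies("https://s3.amazonaws.com", ["awsexamplebucket1"])
print(json.dumps(doc, indent=2))
```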

S3UserCredential Document

The S3UserCredential document describes S3-specific configuration policies. Currently this contains the key information needed to contact the S3 API for this account.

| Name | Type | Description |
|---|---|---|
| DATA_TYPE | string s3_user_credential_policies#1.0.0 | Type of this document |
| s3_key_id | string | Access Key ID to use with the S3 API to access your buckets and objects. |
| s3_secret_key | string | Secret Key to use with the S3 API to access your buckets and objects. |

{
  "DATA_TYPE": "s3_user_credential_policies#1.0.0",
  "s3_key_id": "AKIAIOSFODNN7EXAMPLE",
  "s3_secret_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
}
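To avoid hard-coding secrets, a script could assemble this document from environment variables. The sketch below assumes the keys are stored in AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, the variable names commonly used by AWS tools; that choice and the function name make_s3_user_credential are assumptions of this example. How the resulting document is submitted is outside the scope of the sketch.

```python
import os
from typing import Dict, Mapping


def make_s3_user_credential(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build an S3UserCredential document from environment variables."""
    try:
        key_id = env["AWS_ACCESS_KEY_ID"]
        secret_key = env["AWS_SECRET_ACCESS_KEY"]
    except KeyError as missing:
        raise ValueError(f"missing environment variable: {missing}") from None
    return {
        "DATA_TYPE": "s3_user_credential_policies#1.0.0",
        "s3_key_id": key_id,
        "s3_secret_key": secret_key,
    }


# Example using the placeholder keys from this document, not real credentials.
example_env = {
    "AWS_ACCESS_KEY_ID": "AKIAIOSFODNN7EXAMPLE",
    "AWS_SECRET_ACCESS_KEY": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
}
credential = make_s3_user_credential(example_env)
print(sorted(credential))  # print field names only, never the secret itself
```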
