
Google Cloud Storage Connector

The Google Cloud Storage connector allows Globus Connect Server to access Google Cloud Storage buckets associated with Google accounts. Access to Google Cloud Storage through the Google Cloud Storage connector is facilitated by the creation of Google Cloud Storage storage gateways on an endpoint. The Google Cloud Storage connector is available as an add-on subscription to organizations with a Globus Standard subscription - please contact us for pricing.

This document describes how to create Google Cloud Storage storage gateways and collections. After the installation is complete, any authorized user can register credentials to access data on Google Cloud Storage buckets, or create a guest collection by following the steps in this How To.

The installation must be done by a system administrator and consists of the following steps:

  • Register the endpoint with Google to obtain credentials that let the endpoint securely use the Google Cloud Storage APIs for accessing data, or create a Google service account key.

  • Create a storage gateway on the endpoint configured to use the Google Cloud Storage connector and the credentials from Google.

  • Create a mapped collection using the storage gateway to allow access to Google Cloud Storage data.

Please contact us at support@globus.org if you have questions or need help with installation and use of the Google Cloud Storage connector.


Table of Contents
  • Google Cloud Storage Virtual Filesystem
  • Google Cloud Storage IPv6 Support
  • Registration of endpoint with Google
    • Prerequisites
    • Steps
  • Google Cloud Storage Configuration Encryption
  • Storage Gateway
    • Google Cloud Storage Connector Storage Gateway Policies
    • Creating the Storage Gateway
  • Collection
    • Collection Policies
    • Create a collection
  • User Credential
  • Appendix A: Document Types for the Google Cloud Storage Connector
    • GoogleCloudStoragePolicies Document
    • GoogleCloudStorageCollectionPolicies Document
    • GoogleCloudStorageUserCredentialPolicies Document
  • Appendix B: Google Cloud Storage Connector items of note

Google Cloud Storage Virtual Filesystem

Google Cloud Storage is a distributed object store, where each data object is accessed based on a bucket name and an object name.

The Google Cloud Storage Connector attempts to make this look like a regular filesystem by treating the bucket name as the name of a directory in the root of the storage gateway’s file system. For example, if the configured project contains bucket1 and bucket2, then those buckets would show up as directories when listing /. If a project is not configured, the root directory cannot be listed, but accessible bucket paths can be accessed directly.

The Google Cloud Storage Connector also treats the / character as a delimiter in the API so that it can present something that looks like subdirectories. For example, the object object1 in bucket1 would appear as /bucket1/object1 to the Google Cloud Storage connector, and the object object2/object3 in bucket2 would appear as a file called object3 in the directory /bucket2/object2.
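
The naming rule above can be sketched in shell. This is only an illustration of how bucket and object names correspond to paths, not the connector’s implementation; the function names gcs_virtual_path and gcs_split_path are hypothetical.

```shell
# Hypothetical helpers illustrating the bucket/object-to-path rule described
# above. They are not part of Globus Connect Server.

# Build the virtual path for an object: /<bucket>/<object name>
gcs_virtual_path() {
    printf '/%s/%s\n' "$1" "$2"
}

# Split a virtual path back into its bucket and object name.
gcs_split_path() {
    gsp_path="${1#/}"             # drop the leading slash
    gsp_bucket="${gsp_path%%/*}"  # text before the first remaining slash
    gsp_object="${gsp_path#*/}"   # text after the first remaining slash
    # A path with no slash after the bucket names the bucket itself.
    [ "$gsp_object" = "$gsp_path" ] && gsp_object=""
    printf 'bucket=%s object=%s\n' "$gsp_bucket" "$gsp_object"
}

gcs_virtual_path bucket1 object1           # /bucket1/object1
gcs_virtual_path bucket2 object2/object3   # /bucket2/object2/object3
gcs_split_path /bucket2/object2/object3    # bucket=bucket2 object=object2/object3
```

Only the first path component is treated as the bucket; everything after it, including any further / characters, belongs to the object name.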

Google Cloud Storage IPv6 Support

The Google Cloud Storage Connector supports data transfer to Google using IPv6. No connector-specific configuration is needed to enable this.

Registration of endpoint with Google

The Globus Connect Server v5 endpoint needs to be registered as an application with Google so that users can authorize the endpoint to access Google Cloud Storage on their behalf. The following steps describe how the endpoint can be registered as a Google OAuth client to obtain a client id and secret from Google.

Note

The same client id and secret may be used for both the Google Cloud Storage and Google Drive connectors — it is not necessary to register twice. However, you must enable the APIs and Scopes noted in each connector’s registration documentation.

Prerequisites

These steps must be performed on a fully functional Globus Connect Server v5 endpoint.

You will need a Google account to complete these steps, and the registration will be stored under that Google account. This account is only for registration of the application and has no bearing on Google accounts that will be allowed to use this endpoint to access data. An administrator may use an existing Google account.

Note

When configuring a service account key for Google Cloud Storage, it is not necessary to complete these steps. See Google’s service account documentation for more info on creating a service account key.

Steps

  1. To register the endpoint with Google, go to the Google Developer Console.

  2. If you have never created a project with Google, you will be prompted to create one. If you create a project, you do not have to change the default permissions for the project when given the option to do so. The project that you create should be associated with your Google/GSuite organization.

  3. After you have created or selected a project, you will use the Google API Dashboard to enable APIs, configure the OAuth consent screen, and create credentials for use with your endpoint.

  4. You must enable this project to use the APIs required to interact with Google Cloud Storage. Select the "Library" menu.

    1. Repeat the following steps for these API names: Cloud Storage, Cloud Storage API, Google Cloud Storage JSON API, and Cloud Resource Manager API.

    2. Search for the API name and select the matching result.

    3. Once on the API page, select "Enable".

  5. Select the "OAuth consent screen" menu to configure the OAuth consent screen that will be shown to users.

    1. When prompted for the "User Type", we recommend that you select "Internal" when possible. You should only use "External" if you need to allow access to accounts outside of your Google/GSuite organization, or if you are not part of a Google/GSuite organization. Select "Create".

      Note

      External apps will be subject to Google’s unverified app restrictions.

    2. For the "Application name", enter "Globus Connect Server".

    3. For the "User Support email" select the appropriate value from the dropdown.

    4. App Domain section

      1. For the below fields enter a URL from your own domain, or "https://globus.org":

        1. "Application Homepage"

        2. "Application Privacy Policy"

        3. "Application terms of service"

    5. For "Authorized domains", add globus.org and your own domain.

    6. For the "Developer contact information" field provide your e-mail address.

    7. Other fields are optional.

    8. Select "Save and Continue".

    9. In the "Scopes" section, select "Add or Remove Scopes", then copy and paste the following scopes into the "Manually add scopes" section before selecting "UPDATE":

       https://www.googleapis.com/auth/cloudplatformprojects.readonly
       https://www.googleapis.com/auth/devstorage.read_write
    10. Select "Save and Continue".

  6. Select the "Credentials" button on the left-hand navigation menu.

  7. Select "Create Credentials", and then the "OAuth client ID" option.

    1. You will be prompted to select an application type. Choose "Web application" and configure it as follows:

      1. Name: set a descriptive name to be able to identify the registration of this endpoint in your projects on the Google API Manager. For example, the endpoint Display Name can be used for this.

      2. Authorization redirect URIs: set to the value that was displayed when the endpoint was created. If you don’t have that value handy, you can run the command

        globus-connect-server endpoint show

        You’ll see output that looks something like this:

        Display Name:    Test Endpoint
        ID:              669ec822-ca79-455c-89a7-cccb7aefbf8e
        Subscription ID: 6e62e6d7-e368-45f4-a23d-fb41243e8005
        Public:          True
        GCS Manager URL: https://21542.data.globus.org
        Network Use:     normal

        You can construct the auth callback URL by appending /api/v1/authcallback_google to the value of the GCS Manager URL. In this example case, the result is https://21542.data.globus.org/api/v1/authcallback_google.

      3. Select "Create".

  8. Make note of the client ID and secret you get from Google for this application, as you will need them to configure the storage gateway. The registration is complete.
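
The redirect URI from step 7 can also be derived in a script. The sketch below works on the sample output shown above, stored in a variable so that the block runs without an endpoint. The function name auth_callback_url is hypothetical; on a real node you would pass in the output of globus-connect-server endpoint show instead of the sample text.

```shell
# Sample output copied from the example in step 7. On a real node, replace
# this with: endpoint_info="$(globus-connect-server endpoint show)"
endpoint_info='Display Name:    Test Endpoint
ID:              669ec822-ca79-455c-89a7-cccb7aefbf8e
Subscription ID: 6e62e6d7-e368-45f4-a23d-fb41243e8005
Public:          True
GCS Manager URL: https://21542.data.globus.org
Network Use:     normal'

# Hypothetical helper: find the "GCS Manager URL:" line and append the
# Google auth callback path to its value.
auth_callback_url() {
    printf '%s\n' "$1" |
        sed -n 's|^GCS Manager URL:[[:space:]]*\(.*\)$|\1/api/v1/authcallback_google|p'
}

auth_callback_url "$endpoint_info"
# prints https://21542.data.globus.org/api/v1/authcallback_google
```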

Google Cloud Storage Configuration Encryption

All configuration information, including Google Cloud Storage secrets and user credential information, is encrypted with a secret key on the node servicing the request before storing it locally and uploading it to GCS cloud services for distribution to other nodes in the endpoint. The encryption key is only available locally to the node and is secured such that only the node admin has access.

Storage Gateway

A Google Cloud Storage Connector Storage Gateway is created with the command globus-connect-server storage-gateway create google-cloud-storage, and can be updated with the command globus-connect-server storage-gateway update google-cloud-storage.

Before looking into the policy options specific to the Google Cloud Storage Connector, please familiarize yourself with the Globus Connect Server v5 Data Access Guide which describes the steps to create and update a storage gateway, using the POSIX connector as an example. The commands to create and update a storage gateway for the Google Cloud Storage Connector are similar.

Google Cloud Storage Connector Storage Gateway Policies

The Google Cloud Storage Connector has policies to manage application credentials and to control access to an enumerated set of buckets and Google Cloud Storage projects.

Application Credentials

The --google-client-id and --google-client-secret command-line options provide information for Globus Connect Server to authenticate with Google Cloud Storage. These values must be configured in order for users to be able to access Google Cloud Storage data using their own credentials.

The values for GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET are acquired when setting up the application project, as described in the Google Cloud Storage Connector configuration guide.

Example 1. Setting Google Cloud Storage Connector Application Credentials.

For this example, we’ll assume we’ve obtained credentials as described above. We’ll use the command-line options --google-client-id and --google-client-secret to configure these on our storage gateway.

    --google-client-id GOOGLE_CLIENT_ID \
    --google-client-secret GOOGLE_CLIENT_SECRET

You may instead configure a single service account key via --google-service-account-key. In this mode, all users that access a collection of this storage gateway will use those service credentials. --google-client-id and --google-client-secret are not used with --google-service-account-key.

Service account keys can be created in the Google Developer Console and downloaded as a JSON file. See Google’s service account documentation for more info on creating a service account key.

Example 2. Setting a Google Cloud Storage Connector Service Account Key.

We’ll use the command-line option --google-service-account-key to configure service account credentials on our storage gateway.

    --google-service-account-key file:google-service-key.json
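
Before passing a key file to the storage gateway, it can help to confirm that the file is a service account key and not some other kind of Google credential. The check below is a hypothetical pre-flight helper, not a Globus Connect Server feature. It relies on the fact that key files downloaded from Google contain a "type" field with the value "service_account"; the demo file is a stand-in with only two fields, and a real key contains more.

```shell
# Hypothetical pre-flight check; not part of Globus Connect Server.
# Succeeds when the file is readable and declares "type": "service_account".
is_service_account_key() {
    [ -r "$1" ] && grep -Eq '"type"[[:space:]]*:[[:space:]]*"service_account"' "$1"
}

# Stand-in key file for demonstration only; a real key has more fields.
demo_key="$(mktemp)"
printf '{\n  "type": "service_account",\n  "project_id": "green-data-13843"\n}\n' > "$demo_key"

if is_service_account_key "$demo_key"; then
    echo "looks like a service account key"
else
    echo "not a service account key"
fi

rm -f "$demo_key"
```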

Bucket Restrictions

The argument to the --bucket command-line option is the name of a bucket that this storage gateway is allowed to access. The option can be repeated to allow more than one bucket.

Example 3. Restricting Access to Buckets

For our example, we’ll create a Storage Gateway that restricts access to two buckets owned by our organization: research-data-bucket-1 and research-data-bucket-2. Users will be restricted to those buckets when using collections created on this storage gateway, and can access them only if their credentials have permission to do so.

--bucket research-data-bucket-1 --bucket research-data-bucket-2
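
The effect of the bucket list can be pictured as an allowlist check on the first component of each path. The sketch below is an illustration only, with a hypothetical function name; it models just the case where --bucket has been given, and it says nothing about the permissions of the user’s own credential, which still apply.

```shell
# Illustration only: succeeds when the bucket named by a virtual path
# appears in the list of allowed buckets given as the remaining arguments.
bucket_allowed() {
    ba_path="${1#/}"            # drop the leading slash
    ba_bucket="${ba_path%%/*}"  # first path component is the bucket
    shift
    for ba_allowed in "$@"; do
        if [ "$ba_bucket" = "$ba_allowed" ]; then
            return 0
        fi
    done
    return 1
}

for path in /research-data-bucket-1/results.csv /other-bucket/results.csv; do
    if bucket_allowed "$path" research-data-bucket-1 research-data-bucket-2; then
        echo "$path: allowed by the gateway's bucket list"
    else
        echo "$path: rejected"
    fi
done
```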

Google Cloud Storage Gateway Projects

The argument to the --google-cloud-storage-project command-line option is the name of a Google project that may be used to create collections on this storage gateway. The option can be repeated to allow more than one project.

If no projects are configured for a Google Cloud Storage Connector Storage Gateway, then any project name can be used when creating a mapped collection. Otherwise, the project must be a member of the configured project list.

Example 4. Restricting Access to Projects

For our example, we’ll create a Storage Gateway that restricts access to two projects: green-data-13843 and orange-storage-2749994. Each collection created on this storage gateway must be associated with one of those projects.

--google-cloud-storage-project green-data-13843 \
--google-cloud-storage-project orange-storage-2749994
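
The project rule has one case that is easy to miss: an empty project list means that any project is accepted. A minimal sketch of that rule follows, using a hypothetical function name; it is not how Globus Connect Server implements the check.

```shell
# Illustration only. Usage: project_allowed PROJECT [CONFIGURED_PROJECT...]
# Succeeds when no projects are configured, or when PROJECT is in the list.
project_allowed() {
    pa_project="$1"
    shift
    # No configured projects: any project name is accepted.
    [ "$#" -eq 0 ] && return 0
    for pa_configured in "$@"; do
        [ "$pa_project" = "$pa_configured" ] && return 0
    done
    return 1
}

if project_allowed green-data-13843 green-data-13843 orange-storage-2749994; then
    echo "green-data-13843 may be used for a mapped collection"
fi
if ! project_allowed blue-data-1 green-data-13843 orange-storage-2749994; then
    echo "blue-data-1 is not in the configured project list"
fi
```

The project name blue-data-1 is made up for the rejected case.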

Creating the Storage Gateway

Now that we have decided on all our policies, we’ll use the command to create the storage gateway (Note: You can find further details for the various Storage Gateway options here).

% globus-connect-server storage-gateway create google-cloud-storage \
    "Google Cloud Storage Gateway" \
    --domain example.org \
    --google-client-id GOOGLE_CLIENT_ID \
    --google-client-secret GOOGLE_CLIENT_SECRET \
    --bucket research-data-bucket-1 --bucket research-data-bucket-2 \
    --google-cloud-storage-project green-data-13843 \
    --google-cloud-storage-project orange-storage-2749994

Storage Gateway Created: 7187a9a0-68e4-48ea-b3b9-7fd06630f8ab

This was successful and outputs the ID of the new storage gateway (7187a9a0-68e4-48ea-b3b9-7fd06630f8ab in this case) for our reference. Note that the ID will be different each time you run the command. If you forget the ID of a storage gateway, you can always use the command globus-connect-server storage-gateway list to get a list of the storage gateways on the endpoint.
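
When scripting the setup, the ID can be captured from the command output for use in the collection step. The sketch below parses the sample output line shown above, so it runs without an endpoint; the variable names are our own, and it assumes the output keeps the "Storage Gateway Created: " prefix shown in this example.

```shell
# Sample output line from the example above. In a real script this would be
# the captured output of the storage-gateway create command.
create_output='Storage Gateway Created: 7187a9a0-68e4-48ea-b3b9-7fd06630f8ab'

# Strip the label and keep only the ID.
gateway_id="$(printf '%s\n' "$create_output" | sed -n 's/^Storage Gateway Created: //p')"

echo "$gateway_id"
# prints 7187a9a0-68e4-48ea-b3b9-7fd06630f8ab
```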

You can also add other policies to configure additional identity mapping and path restriction policies as described in the Globus Connect Server v5 Data Access Guide.

Note that this creates the storage gateway, but does not yet make it accessible via Globus and HTTPS. You’ll need to follow the steps in the next section.

Collection

A Google Cloud Storage Collection is created with the command globus-connect-server collection create, and can be updated with the command globus-connect-server collection update.

Collection Policies

Every Google Cloud Storage Collection can optionally be associated with one Google Cloud Storage project. For a mapped collection, this must be set at creation time (though it may be updated later). For a guest collection, it does not need to be set. If it is included in the guest collection creation API call, it must match the project of the mapped collection on which the guest collection is being created.

Google Cloud Storage Collection Project

The argument to the --google-project-id command-line option is the name of a Google project that will be used for all bucket operations via the Google Cloud Storage API. Additionally, users accessing the collection must be members of that project.

If the storage gateway has values set as described in Google Cloud Storage Gateway Projects, then the value must be a member of the list. If not, any existing project name may be used. If the storage gateway has exactly one project configured, it will be used by default for the collection policy.

If a project is not set in either the storage gateway or collection policy, any users that match the domain and mapping policies will be able to access the collection, and will be able to access buckets that their Google account is permitted to access. It will not be possible to list buckets via a root directory listing, but bucket paths can be accessed directly.

Example 5. Selecting a Project

For our example, we’ll use the green-data-13843 project we associated with our storage gateway as the project to use for our collection.

--google-project-id green-data-13843
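
The defaulting behaviour described above can be summarized in a few lines of shell. This is a sketch of the stated rules under a hypothetical function name, not the server’s implementation: an explicit project wins, a gateway with exactly one configured project supplies the default, and otherwise the collection has no project. Membership of an explicit project in the gateway’s list is not checked here.

```shell
# Illustration only.
# Usage: resolve_collection_project REQUESTED [GATEWAY_PROJECT...]
# REQUESTED may be empty when --google-project-id is not given.
resolve_collection_project() {
    rcp_requested="$1"
    shift
    if [ -n "$rcp_requested" ]; then
        # An explicitly requested project is used as given.
        printf '%s\n' "$rcp_requested"
    elif [ "$#" -eq 1 ]; then
        # Exactly one project on the gateway: it becomes the default.
        printf '%s\n' "$1"
    else
        # No project: the root directory cannot be listed, but bucket
        # paths can still be accessed directly.
        printf '\n'
    fi
}

resolve_collection_project "" green-data-13843
# prints green-data-13843
resolve_collection_project orange-storage-2749994 green-data-13843 orange-storage-2749994
# prints orange-storage-2749994
```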

Create a collection

The Google Cloud Storage Connector can use all of the policy options described in the Collections section of the Globus Connect Server v5 Data Access Guide. Recall, however, that paths are interpreted as described above in Google Cloud Storage Virtual Filesystem. For our example, we’ll adapt those policies to suit a Google Cloud Storage collection. In particular, we’ll set the base path to / and change the sharing path restrictions to allow read-only sharing of research-data-bucket.

The --google-project-id command-line option can be omitted if you do not want to restrict access to users belonging to a Google Cloud project. Note that when no project is configured, buckets cannot be listed to form the root directory, but bucket paths can be accessed directly.

Example 6. Create a Collection

% globus-connect-server collection create \
    7187a9a0-68e4-48ea-b3b9-7fd06630f8ab \
    / collection_name \
    --organization 'Example organization' \
    --contact-email support@example.org \
    --info-link https://example.org/storage/info \
    --description "Google Cloud Storage for Project green-data-13843" \
    --keywords example.org,home \
    --allow-guest-collections \
    --sharing-restrict-paths '{
        "DATA_TYPE": "path_restrictions#1.0.0",
        "read": ["/research-data-bucket"]
    }' \
    --google-project-id green-data-13843
Collection ID: 56c3dff0-d827-4f11-91f3-b0704c53aa4c

This was successful and outputs the ID of the new collection (56c3dff0-d827-4f11-91f3-b0704c53aa4c in this case) for our reference. Note that the ID will be different each time you run the command. If you forget the ID of a collection, you can always use the command globus-connect-server collection list to get a list of the collections on the endpoint.
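
The value passed to --sharing-restrict-paths in the example is a small JSON document, and because of the virtual filesystem the read path is simply a bucket name with a leading slash. A hypothetical helper that produces the same document for a given bucket is sketched below. It performs no JSON escaping, so it is only suitable for plain bucket names such as those used in this guide.

```shell
# Hypothetical helper: print a path_restrictions document that allows
# read-only sharing of one bucket. No JSON escaping is performed.
read_only_sharing_policy() {
    printf '{"DATA_TYPE": "path_restrictions#1.0.0", "read": ["/%s"]}\n' "$1"
}

read_only_sharing_policy research-data-bucket
# prints {"DATA_TYPE": "path_restrictions#1.0.0", "read": ["/research-data-bucket"]}
```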

You can use this value as an endpoint for the Globus transfer service and web application, or when editing or deleting this collection.

This command has many policy-related options. They are documented in full in the reference manual, and many are discussed in later sections of this document.

User Credential

As mentioned above, except when using service account authentication, access to Google Cloud Storage mapped collections will require users to register credentials. These credentials are created by performing an authentication flow with Google. This is initiated by visiting the Credentials tab of the collection. The user is directed to that page when they first attempt to access that collection.

The user’s Google account must match the username mapped from their Globus identity, unless the storage gateway’s --google-allow-any-account command-line option is set. If the --google-project-id command-line option is set on the collection, the user must be a member of that project.

Appendix A: Document Types for the Google Cloud Storage Connector

GoogleCloudStoragePolicies Document

Connector-specific storage gateway policies for the Google Cloud Storage connector

One of the following schemas:

  • GoogleCloudStoragePolicies_1_0_0

  • GoogleCloudStoragePolicies_1_1_0

{
  "DATA_TYPE": "google_cloud_storage_policies#1.0.0",
  "auth_callback": "string",
  "buckets": [
    "string"
  ],
  "client_id": "string",
  "projects": [
    "string"
  ],
  "secret": "string",
  "service_account_key": {},
  "user_credential_required": true
}
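
As a concrete illustration, the policies chosen in the storage gateway example earlier in this guide might look as follows when expressed as a GoogleCloudStoragePolicies document. The correspondence between command-line options and fields (--bucket to buckets, --google-client-id to client_id, --google-client-secret to secret, --google-cloud-storage-project to projects) is our assumption based on the field names, and the sketch does not show which fields are required. The function name is hypothetical.

```shell
# Hypothetical example document; the option-to-field mapping is an assumption
# based on the field names in the schema above.
example_gcs_policies() {
    cat <<'EOF'
{
  "DATA_TYPE": "google_cloud_storage_policies#1.0.0",
  "buckets": ["research-data-bucket-1", "research-data-bucket-2"],
  "client_id": "GOOGLE_CLIENT_ID",
  "projects": ["green-data-13843", "orange-storage-2749994"],
  "secret": "GOOGLE_CLIENT_SECRET"
}
EOF
}

example_gcs_policies
```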

GoogleCloudStorageCollectionPolicies Document

Connector-specific collection policies for the Google Cloud Storage connector

One of the following schemas:

  • GoogleCloudStorageCollectionPolicies_1_0_0

{
  "DATA_TYPE": "google_cloud_storage_collection_policies#1.0.0",
  "project": "string"
}

GoogleCloudStorageUserCredentialPolicies Document

Connector-specific user credential policies for the Google Cloud Storage connector

One of the following schemas:

  • GoogleCloudStorageUserCredentialPolicies_1_0_0

{
  "DATA_TYPE": "google_cloud_storage_user_credential_policies#1.0.0",
  "access_token": "string",
  "email": "string",
  "projects": [
    {
      "name": "string",
      "projectId": "string"
    }
  ],
  "refresh_token": "string",
  "scopes": [
    "string"
  ],
  "sub": "string",
  "token_expiry": "2019-08-24T14:15:22Z"
}

Appendix B: Google Cloud Storage Connector items of note

  1. Because modification times cannot be changed on Google Cloud Storage, the timestamp of an object transferred into a Google Cloud Storage-based collection reflects the date and time the object was written to Google Cloud Storage. However, once the object is transferred out of Google Cloud Storage (to a filesystem that supports setting modification times), its timestamp reflects the object’s original mod-time from when it was initially transferred in.

  2. The original mod-time for objects is stored in the mtime metadata tag on the object(s).

  3. Timestamp preservation is currently only supported for files.

© 2010- The University of Chicago Legal Privacy Accessibility