Google Drive Connector
- Google Drive Connector Virtual Filesystem
- Google Drive IPv6 Support
- Registration of endpoint with Google
- Google Drive Configuration Encryption
- Storage Gateway
- Collection
- User Credential
- Exporting Google Document Types (Beta - new in 5.4.80)
- Appendix A: Limitations
- Appendix B: Google Drive API Information
- Appendix C: Document Types for the Google Drive Connector
The Globus Google Drive storage connector can be used for access and sharing of data on Google Drive. The connector is available as an add-on subscription to organizations with a Globus Standard subscription - please contact us for pricing.
This document describes how to use the Google Drive Connector to configure Google Drive Storage Gateways and Collections. After these steps are complete, any Globus user you have authorized can register a credential to access Google Drive files that they have access to and, if enabled, can create guest collections for sharing access using those credentials by following the instructions in How To Share Data Using Globus.
This document assumes that you or another administrator has already installed Globus Connect Server v5 on one or more data transfer nodes, and that you have an administrator role on that endpoint.
The installation must be done by a system administrator, and has the following distinct set of steps:
-
Create a storage gateway on the endpoint configured to use the Google Drive Connector.
-
Create a mapped collection using the Google Drive Storage Gateway to provide access to Google Drive Storage Gateway data.
Please contact us at support@globus.org if you have questions or need help with installation and use of the Google Drive Connector.
Google Drive Connector Virtual Filesystem
Google Drive presents its data as a hierarchical list of files owned by folders, but has different semantics than a POSIX filesystem, such as associating the file name with the data rather than the directory entry, allowing multiple drives, allowing multiple objects with the same name in the same folder, and attributes to allow files owned by others to become visible in another user’s file space.
The Google Drive Connector provides these subdirectories of the root directory of a Google Drive Connector storage gateway:
- /My Drive
-
Files owned by the user’s Google account that are located in the user’s root directory. This is treated as the user’s home directory on collections created using the Google Drive Connector.
- /Shared With Me
-
Files and directories owned by others which have been shared with the user’s Google account.
- /Starred
-
Files and directories to which the user’s Google account has added the starred attribute.
- /Team Drives
-
Directories which are Google Shared Drives (formerly called Team Drives) which the user’s Google account has been granted access to.
- /Trash
-
Files and directories which the user’s Google account has deleted.
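As a rough illustration of the layout above, the following Python sketch splits a collection path into its top-level virtual directory and the remainder. The helper name `split_virtual_path` is hypothetical and is not part of Globus Connect Server; only the five directory names are taken from the list above.

```python
from typing import Optional, Tuple

# Top-level virtual directories documented for the Google Drive Connector.
VIRTUAL_ROOTS = ("My Drive", "Shared With Me", "Starred", "Team Drives", "Trash")


def split_virtual_path(path: str) -> Tuple[Optional[str], str]:
    """Return (virtual_root, remainder) for an absolute collection path.

    virtual_root is None when the first path component is not one of the
    documented top-level directories. Hypothetical helper, for illustration.
    """
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] not in VIRTUAL_ROOTS:
        return None, "/".join(parts)
    return parts[0], "/".join(parts[1:])


print(split_virtual_path("/My Drive/results/run1.csv"))
print(split_virtual_path("/Team Drives/Lab Data"))
```

A path such as `/Other/file.txt` yields `None` as its root, since only the five directories above exist directly under the root of the storage gateway.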
Google Drive IPv6 Support
The Google Drive Connector supports data transfer to Google using IPv6. No connector-specific configuration is needed to enable this.
Registration of endpoint with Google
The Globus Connect Server v5 endpoint needs to be registered as an application with Google so that users can authorize the endpoint to access Google Drive on their behalf. The following steps describe how the endpoint can be registered as a Google OAuth client to obtain a client id and secret from Google.
Prerequisites
It is necessary that these steps be performed on a fully functional Globus Connect Server 5 endpoint, as discussed above.
You will need a Google account to complete these steps, and the registration will be stored under that Google account. This account is only for registration of the application and has no bearing on Google accounts that will be allowed to use this endpoint to access data. An administrator may use an existing Google account.
Steps
-
To register the endpoint with Google, go to the Google Developer Console.
-
If you have never created a project with Google, you will be prompted to create one. If you create a project, you do not have to change the default permissions for the project when given the option to do so. The project that you create should be associated with your Google/GSuite organization.
-
After you have created or selected a project, you will use the Google API Dashboard to enable APIs, configure the OAuth consent screen, and create credentials for use with your endpoint.
-
You must enable this project to use the APIs required to interact with Google Drive. Select the "Library" menu.
-
Search for the API name Google Drive API and select the matching result.
-
Once on the API page, select "Enable".
-
-
Select the "OAuth consent screen" menu to configure the OAuth consent screen that will be shown to users.
-
When prompted for the "User Type", we recommend that you select "Internal" when possible. You should only use "External" if you need to allow access to accounts outside of your Google/GSuite organization, or if you are not part of a Google/GSuite organization. Select "Create".
-
For the "Application name", enter "Globus Connect Server".
-
For the "User Support email" select the appropriate value from the dropdown.
-
App Domain section
-
For the below fields enter a URL from your own domain, or "https://globus.org":
-
"Application Homepage"
-
"Application Privacy Policy"
-
"Application terms of service"
-
-
-
For "Authorized domains", add globus.org and your own domain
-
For the "Developer contact information" field provide your e-mail address.
-
Other fields are optional.
-
Select "Save and Continue".
-
In the "Scopes" section, select "Add or Remove Scopes", then copy and paste the following scopes into the "Manually add scopes" section before selecting "UPDATE":
https://www.googleapis.com/auth/drive.appdata
https://www.googleapis.com/auth/drive
-
Select "Save and Continue".
-
-
Select the "Credentials" button in the left-hand navigation menu.
-
Select "Create Credentials", and then the "OAuth client ID" option.
-
You will be prompted to select an application type. Choose "Web application" and configure it as follows:
-
Name: set a descriptive name to be able to identify the registration of this endpoint in your projects on the Google API Manager. For example, the endpoint Display Name can be used for this.
-
Authorized redirect URIs: set to the value that was displayed when the endpoint was created. If you don’t have that value handy, you can run the command:
globus-connect-server endpoint show
You’ll see output that looks something like this:
Display Name: Test Endpoint
ID: 669ec822-ca79-455c-89a7-cccb7aefbf8e
Subscription ID: 6e62e6d7-e368-45f4-a23d-fb41243e8005
Public: True
GCS Manager URL: https://21542.data.globus.org
Network Use: normal
You can construct the auth callback URL by appending
/api/v1/authcallback_google
to the value of the GCS Manager URL. In this example case, the result is https://21542.data.globus.org/api/v1/authcallback_google.
-
Select "Create".
-
-
-
Make note of the client ID and secret you get from Google for this application, as you will need them to configure the storage gateway. The registration is complete.
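The construction of the redirect URI described in the steps above can be sketched in Python. This is a minimal illustration, not a Globus tool: the function names are hypothetical, and the parser assumes that `globus-connect-server endpoint show` prints one `Key: value` pair per line, as in the sample text embedded below.

```python
# Sample text modelled on the example output shown in the steps above.
SAMPLE_OUTPUT = """\
Display Name: Test Endpoint
ID: 669ec822-ca79-455c-89a7-cccb7aefbf8e
Subscription ID: 6e62e6d7-e368-45f4-a23d-fb41243e8005
Public: True
GCS Manager URL: https://21542.data.globus.org
Network Use: normal
"""

# Path that the documentation says to append to the GCS Manager URL.
CALLBACK_SUFFIX = "/api/v1/authcallback_google"


def gcs_manager_url(show_output: str) -> str:
    """Extract the GCS Manager URL field (assumes one 'Key: value' per line)."""
    for line in show_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "GCS Manager URL":
            return value.strip()
    raise ValueError("GCS Manager URL not found in output")


def auth_callback_url(manager_url: str) -> str:
    """Append the Google auth callback path to the GCS Manager URL."""
    return manager_url.rstrip("/") + CALLBACK_SUFFIX


print(auth_callback_url(gcs_manager_url(SAMPLE_OUTPUT)))
```

The resulting string is the value to enter as the redirect URI when creating the OAuth client ID.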
Google Drive Configuration Encryption
All configuration information, including Google Drive secrets and user credential information, is encrypted with a secret key on the node servicing the request before storing it locally and uploading it to GCS cloud services for distribution to other nodes in the endpoint. The encryption key is only available locally to the node and is secured such that only the node admin has access.
Storage Gateway
A Google Drive Storage Gateway is created with the command globus-connect-server storage-gateway create google-drive, and can be updated with the command globus-connect-server storage-gateway update google-drive.
Before looking into the policy options specific to the Google Drive Connector, please familiarize yourself with the Globus Connect Server v5 Data Access Guide which describes the steps to create and update a storage gateway, using the POSIX connector as an example. The commands to create and update a storage gateway for the Google Drive Connector are similar.
Google Drive Connector Storage Gateway Policies
The Google Drive Connector has policies to manage application credentials and to set the user API rate quota.
Application Credentials
The --google-client-id and --google-client-secret command-line options provide the credentials that Globus Connect Server uses to authenticate with Google. These values must be configured in order to access data on collections created with the Google Drive Connector type.
For our example, we’ll assume we’ve obtained credentials as described above. We’ll use the command-line options --google-client-id and --google-client-secret to configure these on our storage gateway.
--google-client-id GOOGLE_CLIENT_ID \
--google-client-secret GOOGLE_CLIENT_SECRET
User API Quota
The --google-drive-user-api-rate-quota command-line option allows you to configure a value for the User API Quota in order to try to avoid issues when interacting with the Google Drive API. This can be helpful to increase the quota if you have negotiated a larger API quota on your Google Drive API project or to decrease the value if you are experiencing frequent API quota errors when performing transfers. The value of the setting is a number of API operations per 100 seconds per user.
For our example, we’ll assume we’ve negotiated a larger API rate quota of 5000 operations per 100 seconds per user. We’ll set this using the --google-drive-user-api-rate-quota option.
--google-drive-user-api-rate-quota 5000
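To make the unit of this setting concrete, the following Python sketch models a limit of N operations per 100 seconds with a sliding window. It only illustrates what such a quota means; it is an assumption-laden model and does not describe how the connector or Google actually enforces the limit. The class name `SlidingWindowQuota` is hypothetical, and timestamps are passed in explicitly so the behavior is deterministic.

```python
from collections import deque
from typing import Deque

WINDOW_SECONDS = 100.0  # the quota is expressed per 100 seconds


class SlidingWindowQuota:
    """Allow at most `limit` operations in any 100-second window (illustrative)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._stamps: Deque[float] = deque()

    def try_acquire(self, now: float) -> bool:
        """Record an operation at time `now` if the window has room for it."""
        # Forget operations that have left the 100-second window.
        while self._stamps and now - self._stamps[0] >= WINDOW_SECONDS:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


quota = SlidingWindowQuota(limit=5000)
# Attempt 5001 operations at the same instant and count how many are admitted.
allowed = sum(quota.try_acquire(now=0.0) for _ in range(5001))
print(allowed)
# Once 100 seconds have passed, the earlier operations no longer count.
print(quota.try_acquire(now=100.0))
```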
Creating the Storage Gateway
Now that we have decided on all our policies, we’ll use the command to create the storage gateway.
% globus-connect-server storage-gateway create google-drive \
    "Google Drive Storage Gateway" \
    --domain example.org \
    --google-client-id GOOGLE_CLIENT_ID \
    --google-client-secret GOOGLE_CLIENT_SECRET \
    --google-drive-user-api-rate-quota 5000
Storage Gateway Created: 7187a9a0-68e4-48ea-b3b9-7fd06630f8ab
This was successful and outputs the ID of the new storage gateway (7187a9a0-68e4-48ea-b3b9-7fd06630f8ab in this case) for our reference. Note that this will always be a unique value if you run the command. If you forget the id of a storage gateway, you can always use the command globus-connect-server storage-gateway list to get a list of the storage gateways on the endpoint.
You can also add other policies to configure additional identity mapping and path restriction policies as described in the Globus Connect Server v5 Data Access Guide.
Note that this creates the storage gateway, but does not yet make it accessible via Globus and HTTPS. You’ll need to follow the steps in the next section.
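For administrators who script endpoint setup, the command shown in this section could be assembled as an argument list in Python. This is a sketch only: the function name is hypothetical, and the only options used are the ones that appear in the example above.

```python
import shlex
from typing import List


def storage_gateway_create_argv(
    display_name: str,
    domain: str,
    client_id: str,
    client_secret: str,
    rate_quota: int,
) -> List[str]:
    """Build the argument list for the create command shown in this section."""
    return [
        "globus-connect-server", "storage-gateway", "create", "google-drive",
        display_name,
        "--domain", domain,
        "--google-client-id", client_id,
        "--google-client-secret", client_secret,
        "--google-drive-user-api-rate-quota", str(rate_quota),
    ]


argv = storage_gateway_create_argv(
    "Google Drive Storage Gateway",
    "example.org",
    "GOOGLE_CLIENT_ID",      # placeholder, as in the example above
    "GOOGLE_CLIENT_SECRET",  # placeholder, as in the example above
    5000,
)
# shlex.join quotes the display name so the line can be pasted into a shell.
print(shlex.join(argv))
```

Keeping the arguments as a list avoids shell quoting problems with the display name. On an endpoint node the list could be passed to `subprocess.run(argv, check=True)`.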
Collection
A Google Drive Collection is created with the command globus-connect-server collection create, and can be updated with the command globus-connect-server collection update.
As the Google Drive Connector does not introduce any policies beyond those used by the base collection type, you can follow the sequence in the Collections Section of the Globus Connect Server v5 Data Access Guide. Recall, however, that the paths are interpreted as described above in Google Drive Connector Virtual Filesystem.
User Credential
As mentioned above, access to mapped collections on a Google Drive storage gateway requires users to register credentials. These credentials are created by performing an authentication flow with Google. This is initiated by visiting the Credentials tab of the collection. The user is directed to that page when they first attempt to access that collection.
The user’s Google account must match the username mapped from their Globus identity, unless the storage-gateway --google-allow-any-account command-line option is set.
Exporting Google Document Types (Beta - new in 5.4.80)
Certain Google document types can be exported to a variety of standard formats which are supported by applications such as Microsoft Office.
By default, the document export feature is disabled. To enable export of a
document type, modify the configuration file
/etc/globus/globus-gridftp-server-google-drive.conf
to set configuration keys from the table below to one of the supported
extensions. The extension will determine the format of the file on the
destination. The details of each format can be found
here.
The configuration must be set in the same way on all nodes of an endpoint.
Document Type | Configuration Key | Supported Extensions |
---|---|---|
Document | export_document_ext | docx, odt, rtf, pdf, txt, zip, md |
Spreadsheet | export_spreadsheet_ext | xlsx, ods, pdf, zip, csv, tsv |
Presentation | export_presentation_ext | pptx, odp, pdf, txt |
Drawing | export_draw_ext | pdf, jpg, png, svg |
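Before editing the configuration file on each node, it may help to check the intended settings against the table above. The Python sketch below does only that check. The function name `check_export_settings` is hypothetical, and the sketch makes no assumption about the syntax of the configuration file itself; the settings are passed in as a plain dictionary.

```python
from typing import Dict, List

# Configuration keys and supported extensions, copied from the table above.
SUPPORTED_EXPORTS = {
    "export_document_ext": {"docx", "odt", "rtf", "pdf", "txt", "zip", "md"},
    "export_spreadsheet_ext": {"xlsx", "ods", "pdf", "zip", "csv", "tsv"},
    "export_presentation_ext": {"pptx", "odp", "pdf", "txt"},
    "export_draw_ext": {"pdf", "jpg", "png", "svg"},
}


def check_export_settings(settings: Dict[str, str]) -> List[str]:
    """Return a list of problems found in the intended export settings."""
    problems = []
    for key, ext in settings.items():
        allowed = SUPPORTED_EXPORTS.get(key)
        if allowed is None:
            problems.append(f"unknown configuration key: {key}")
        elif ext not in allowed:
            problems.append(f"{key}: unsupported extension {ext!r}")
    return problems


print(check_export_settings({"export_document_ext": "docx",
                             "export_draw_ext": "docx"}))
```

Running the same check with the same dictionary for every node is one way to keep the settings identical across the endpoint, as required above.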
Appendix A: Limitations
Google Document Types
In versions of GCS older than 5.4.80, the Google Drive connector is not intended to transfer Google document types (doc, sheet, slides, etc.) between different Google accounts or outside of Google Drive. The content of a Google document is never downloaded or uploaded. When downloading a Google document, the data that gets created on the destination endpoint is simply a metadata file that contains the Google document id. When uploading, a Google document is created which is a copy of that id, but this requires that your Google account already has permission to access the original document id.
Transfers of Google document metadata back into Google Drive do not support checksum verification, and will fail if verification is enabled for that task.
Appendix B: Google Drive API Information
The connector uses the Google Drive v3 REST API for all storage interactions. The following API calls are used.
API Calls
| API Call | Purpose | Frequency | Notes |
|---|---|---|---|
| | File and folder listings and other metadata | Multiple calls per listing request, depending on path length | When resolving paths to object ids, requests are batched when possible to query multiple levels at once. Results are cached within a given session. |
| | Download a file | Per file | Ranged requests may be used if partial download is needed. |
| | Create a file or folder | Per file | Files are created as resumable so that uploads can be restarted. |
| | Trash or rename a file or folder, or update other metadata | Per file | This updates a file or folder’s parent when renaming to a different folder, or the modification time when transferring with the preserve modification time option. This is also used to mark a file as Trashed when deleting. |
| | Delete a file or folder from Trash | Per file | Files are only permanently deleted when deleted from Trash. |
| | Create a copy of a Google document when uploading a Google document stub file | Per stub upload | |
| | Shared drive listings | When listing the /Team Drives path | Shared drives are listed under the /Team Drives path. |
| | Create a shared drive | Per drive created | |
| | Delete a shared drive | Per drive deleted | |
| | Checking token validity | Per GCS login | |
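The note on ranged requests for downloads refers to standard HTTP byte ranges. As a small illustration, the Python sketch below builds a `Range` header for a partial download. The helper name is hypothetical, and how the connector itself issues such requests is not described in this document.

```python
from typing import Dict


def range_header(offset: int, length: int) -> Dict[str, str]:
    """Build an HTTP Range header covering `length` bytes starting at `offset`.

    HTTP byte ranges are inclusive on both ends, so the last byte requested
    is offset + length - 1.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


# Header for the second mebibyte of a file.
print(range_header(offset=1024 * 1024, length=1024 * 1024))
```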
Appendix C: Document Types for the Google Drive Connector
GoogleDriveStoragePolicies Document
Connector-specific storage gateway policies for the Google Drive connector
One of the following schemas:
{
  "DATA_TYPE": "google_drive_storage_policies#1.0.0",
  "auth_callback": "string",
  "client_id": "string",
  "secret": "string",
  "user_api_rate_quota": 0
}
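A document of this shape could be assembled and serialized in Python as follows. This is a sketch under assumptions: the function name is hypothetical, the values are placeholders, and the document does not state which fields a client may set and which are filled in by the service.

```python
import json
from typing import Any, Dict


def google_drive_storage_policies(
    client_id: str,
    secret: str,
    auth_callback: str,
    user_api_rate_quota: int,
) -> Dict[str, Any]:
    """Fill in the fields of the GoogleDriveStoragePolicies schema shown above."""
    return {
        "DATA_TYPE": "google_drive_storage_policies#1.0.0",
        "auth_callback": auth_callback,
        "client_id": client_id,
        "secret": secret,
        "user_api_rate_quota": user_api_rate_quota,
    }


doc = google_drive_storage_policies(
    client_id="GOOGLE_CLIENT_ID",      # placeholder value
    secret="GOOGLE_CLIENT_SECRET",     # placeholder value
    auth_callback="https://21542.data.globus.org/api/v1/authcallback_google",
    user_api_rate_quota=5000,
)
print(json.dumps(doc, indent=2))
```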
GoogleDriveUserCredentialPolicies Document
Connector-specific user credential policies for the Google Drive connector
One of the following schemas:
{
  "DATA_TYPE": "google_drive_user_credential_policies#1.0.0",
  "access_token": "string",
  "email": "string",
  "refresh_token": "string",
  "scopes": [
    "email",
    "profile",
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/drive.appfolder"
  ],
  "sub": "string",
  "token_expiry": "2019-08-24T14:15:22Z"
}
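The `token_expiry` field in the schema above is shown as an ISO 8601 timestamp with a `Z` (UTC) suffix. The Python sketch below shows one way to compare such a value against the current time. The function name is hypothetical, and the sketch assumes the field always uses the format shown in the example.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def token_expired(credential: Dict[str, Any], now: datetime) -> bool:
    """Return True if the credential's token_expiry is at or before `now`."""
    # datetime.fromisoformat() in older Python versions does not accept a
    # trailing "Z", so it is rewritten as an explicit UTC offset first.
    expiry = datetime.fromisoformat(
        credential["token_expiry"].replace("Z", "+00:00")
    )
    return now >= expiry


credential = {
    "DATA_TYPE": "google_drive_user_credential_policies#1.0.0",
    "token_expiry": "2019-08-24T14:15:22Z",
}
print(token_expired(credential, now=datetime.now(timezone.utc)))
```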