HTTPS Access to Collections
1. Overview
Globus Connect Server v5 includes support for basic HTTPS access to Globus collections. This access is limited to downloading and uploading files, and may be used programmatically or through links and downloads in a web application. This guide describes how to access data on a Globus collection via HTTPS.
2. Determining the Collection HTTPS Base URL
Information about Globus Connect Server v5 collections is available through the Globus Transfer API. This information includes the base URL that can be used to access collection data.
To determine this information, look for the https_server property in the collection’s Endpoint document. (If querying GCS directly, look for the https_url property in the collection’s document.)
2.1. From the command-line
% globus gcs collection show "6c54cade-bde5-45c1-bdea-f4bd71dba2cc" \
    --jq 'https_url' -F unix
https://m-d3a2c3.collection1.tutorials.globus.org
2.2. Using the Python SDK
from globus_sdk import TransferClient, AccessTokenAuthorizer

# Access token for the Globus Transfer API (elided here)
TRANSFER_ACCESS_TOKEN = '...'
transfer_client = TransferClient(
    authorizer=AccessTokenAuthorizer(TRANSFER_ACCESS_TOKEN))

# Fetch the collection's Endpoint document and read its HTTPS base URL
endpoint_id = '6c54cade-bde5-45c1-bdea-f4bd71dba2cc'
endpoint = transfer_client.get_endpoint(endpoint_id)
https_server = endpoint['https_server']
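Once the base URL is known, the URL of an individual file can be formed by appending the file's path on the collection. The following is a minimal sketch under that assumption. The helper name `collection_file_url` and the example path are hypothetical and are not part of the Globus SDK.

```python
from urllib.parse import quote


def collection_file_url(base_url: str, path: str) -> str:
    """Join a collection's HTTPS base URL with a file path (hypothetical helper).

    Assumes the file is served at <base URL>/<path on the collection>.
    """
    # Percent-encode characters such as spaces, but keep "/" as the path separator.
    encoded_path = quote(path.lstrip("/"), safe="/")
    return base_url.rstrip("/") + "/" + encoded_path


# Base URL from the command-line example; the path is made up for illustration.
url = collection_file_url(
    "https://m-d3a2c3.collection1.tutorials.globus.org",
    "/home/share/my data.txt",
)
print(url)
```

Encoding the path before joining avoids malformed URLs when file names contain spaces or other reserved characters.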
2.3. Using the globus.org web app
Visit the collection in the file browser, select a file, then click on "Get Link" on the right panel.
For this example endpoint: https://app.globus.org/file-manager?origin_id=6c54cade-bde5-45c1-bdea-f4bd71dba2cc
3. Authorization
3.1. Mapped Collections
Mapped collections require authenticated access using a Globus Auth identity.
- You must have a linked identity in one or more of the allowed domains configured for the mapped collection.
- You must have authenticated with the linked identity within the authentication timeout configured for the collection (default: 11 days).
- The collection administrator must provide an account mapping that maps your linked identity to a valid account on the storage service.
3.2. Guest Collections
Guest collections allow access based on user-specified Access Control Lists. An access rule may grant access to any Globus Auth identity, a specific Globus Auth identity, or a Globus group, or it may make the data completely public with no authentication. Guest collections do not require the user to have an account in the allowed domains for the collection, nor do they impose authentication timeout requirements (unless the collection is High Assurance, as noted below).
3.3. High Assurance Collections
All High Assurance collections, either mapped or guest, have additional requirements for access.
- All access must be authenticated.
- The identity used to access the collection must have authenticated in the current session within the authentication timeout configured for the collection.
3.4. Support for Portal Access
Collections can be configured to allow access from an external application, such as a science portal. This subtly changes the authorization requirements for access.
For mapped collections, a system administrator may create a mapping from the application identity client_id@clients.auth.globus.org to a local account in the gridmap file. The system administrator must also add the application identity’s domain (clients.auth.globus.org) to the storage gateway’s allowed domains.
For guest collections, the collection administrator may create permissions for the identity of an application based on its Globus Auth client_id (by allowing access to client_id@clients.auth.globus.org).
In either case (mapped or guest collection), the client identity is not subject to authentication timeout or session requirements necessary for user access.
If the collection requires an authenticated identity, your application may use an OAuth client_credentials grant to obtain an access token and present that token to the HTTPS service.
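The client_credentials grant mentioned above can be sketched with the standard library alone. This example only builds the token request and does not send it. The token endpoint URL, the use of HTTP Basic authentication for the client, and the placeholder credentials are assumptions to verify against the Globus Auth documentation. The requested scope follows the format described under "Required scopes".

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Assumed Globus Auth token endpoint; verify against the Globus Auth documentation.
TOKEN_URL = "https://auth.globus.org/v2/oauth2/token"


def build_token_request(client_id: str, client_secret: str, scopes: list[str]) -> Request:
    """Build (but do not send) a client_credentials token request."""
    # The client authenticates with HTTP Basic auth using its ID and secret.
    credentials = f"{client_id}:{client_secret}".encode()
    basic = base64.b64encode(credentials).decode()
    # OAuth2 expects multiple scopes as one space-separated string.
    body = urlencode({"grant_type": "client_credentials", "scope": " ".join(scopes)})
    return Request(
        TOKEN_URL,
        data=body.encode(),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Placeholder credentials; substitute your application's own client ID and secret.
request = build_token_request(
    "my-client-id",
    "my-client-secret",
    ["https://auth.globus.org/scopes/60a0c6af-3f73-453c-afbe-c8504fc428b6/https"],
)
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` should return a JSON document whose `access_token` field is the token to present to the HTTPS service. The Globus SDK's `ConfidentialAppAuthClient` wraps the same grant and is likely the more convenient choice in practice.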
4. Accessing Data
4.1. Supported HTTP Methods
The HTTPS server supports the OPTIONS, HEAD, GET, PUT, and DELETE methods, provided the client identity is authorized for that operation on the particular path.
Note that the HTTPS service does not support directory listings.
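As an illustration of the supported methods, the sketch below builds a request with the standard library and rejects any method outside the supported set before it reaches the server. The helper name `build_request` and the file URL are hypothetical, and the request is built but not sent.

```python
from typing import Optional
from urllib.request import Request

# Methods listed as supported by the HTTPS server.
SUPPORTED_METHODS = {"OPTIONS", "HEAD", "GET", "PUT", "DELETE"}


def build_request(method: str, url: str, token: Optional[str] = None,
                  data: Optional[bytes] = None) -> Request:
    """Build (but do not send) a request to a collection's HTTPS interface."""
    method = method.upper()
    if method not in SUPPORTED_METHODS:
        raise ValueError(f"{method} is not supported by the HTTPS server")
    headers = {}
    if token is not None:
        # Authenticated access presents the token as a Bearer credential.
        headers["Authorization"] = f"Bearer {token}"
    return Request(url, data=data, headers=headers, method=method)


# Hypothetical file URL on the example collection.
FILE_URL = "https://m-d3a2c3.collection1.tutorials.globus.org/home/share/notes.txt"
upload = build_request("PUT", FILE_URL, token="...", data=b"hello")
print(upload.get_method(), upload.full_url)
```

Because directory listings are not supported, the URL in each request is expected to name a file path rather than a directory.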
4.2. Programmatic Access
When accessing the resource programmatically, the HTTPS interface responds with HTTP status codes (e.g., 200, 401, 403) that allow you to interpret the status of the authorization for the request. Programmatic responses are enabled by either of the following:
- the presence of an OAuth2 access token in the Authorization header, or
- the presence of the X-Requested-With: XMLHttpRequest header
For example, either of these headers will enable programmatic responses:
Authorization: Bearer AgK6JMDBlM0Ey87qJavoPok3kk8xxg3E2MY9K6136G7m2kBYlzceCe4
X-Requested-With: XMLHttpRequest
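A client might choose between these two headers and classify the resulting status code roughly as follows. This is a sketch only: the helper names are hypothetical, and the reading of 401 and 403 follows general HTTP semantics rather than a statement of exactly when the service returns each code.

```python
from typing import Optional


def programmatic_headers(token: Optional[str] = None) -> dict:
    """Return headers that enable programmatic (non-redirect) responses."""
    if token is not None:
        # An OAuth2 access token in the Authorization header is sufficient.
        return {"Authorization": f"Bearer {token}"}
    # Without a token, this header still requests programmatic responses.
    return {"X-Requested-With": "XMLHttpRequest"}


def describe_status(status: int) -> str:
    """Map a status code to a coarse outcome (standard HTTP semantics)."""
    if 200 <= status < 300:
        return "ok"
    if status == 401:
        return "authentication required"
    if status == 403:
        return "not authorized for this path or operation"
    return "other"


print(programmatic_headers())
print(describe_status(401))
```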
4.2.1. Access Tokens for HTTPS
If the collection requires an authenticated Globus Auth identity to access the endpoint, your application must present an access token with the required scopes in the Authorization HTTP header.
Required scopes
All authenticated, programmatic access to a collection through its HTTPS interface requires the https scope. Mapped (non-High Assurance) collections additionally require the data_access scope.
The full name of each scope is based on the ID of the collection using this format:
https://auth.globus.org/scopes/COLLECTION_ID/SCOPE_SUFFIX
For example, authenticated, programmatic access to a guest collection with collection ID 60a0c6af-3f73-453c-afbe-c8504fc428b6 would require the https scope:
https://auth.globus.org/scopes/60a0c6af-3f73-453c-afbe-c8504fc428b6/https
And authenticated, programmatic access to a mapped collection with collection ID 1aaef64c-d812-408c-b33c-d49d3973ecbd would require the https and data_access scopes:
https://auth.globus.org/scopes/1aaef64c-d812-408c-b33c-d49d3973ecbd/https
https://auth.globus.org/scopes/1aaef64c-d812-408c-b33c-d49d3973ecbd/data_access
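The scope rules above can be expressed as a small helper. This is a sketch that follows the naming format and the two examples given here; the function name `required_scopes` and its parameters are hypothetical.

```python
SCOPE_PREFIX = "https://auth.globus.org/scopes"


def required_scopes(collection_id: str, mapped: bool,
                    high_assurance: bool = False) -> list[str]:
    """Return the full scope names for HTTPS access (hypothetical helper)."""
    # Every authenticated, programmatic request needs the https scope.
    suffixes = ["https"]
    # Mapped collections that are not High Assurance also need data_access.
    if mapped and not high_assurance:
        suffixes.append("data_access")
    return [f"{SCOPE_PREFIX}/{collection_id}/{suffix}" for suffix in suffixes]


guest_scopes = required_scopes("60a0c6af-3f73-453c-afbe-c8504fc428b6", mapped=False)
mapped_scopes = required_scopes("1aaef64c-d812-408c-b33c-d49d3973ecbd", mapped=True)
print(guest_scopes)
print(mapped_scopes)
```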
4.3. Browser Access
The collection HTTPS interface can be accessed directly using a web browser, for example, by following a link shared via email. This method first attempts anonymous access to the resource. If authorization fails, the HTTPS response redirects the browser (HTTP status 302) through an OpenID Connect flow to authenticate the user, and the request is then retried as an authenticated request.
4.3.1. Request a Browser Download
By default, the HTTPS service includes the Content-Disposition header set to inline so that the data is loaded directly into the browser and can be viewed. To request that the browser download the data instead, append the query parameter download when requesting the object. This changes the service response to include a Content-Disposition header set to attachment with a suggested filename.
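A link that requests a download can be produced by adding the parameter to an existing file URL. The sketch below assumes that a bare `download` parameter with no value is sufficient, since the text names only the parameter. The helper name `as_download_url` and the file URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def as_download_url(url: str) -> str:
    """Append the download query parameter to a file URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep any existing query string and add the bare "download" parameter.
    query = f"{parts.query}&download" if parts.query else "download"
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


print(as_download_url(
    "https://m-d3a2c3.collection1.tutorials.globus.org/home/share/notes.txt"
))
```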