API Overview
1. Overview
The Transfer API provides remote data access via GridFTP to file systems that have been configured with collections within the Transfer service. Collections are generally created using the Globus Connect Server or Globus Connect Personal software packages.
Most requests to the Transfer API fall under three categories: configuring and managing Globus Connect Personal collections, synchronously accessing data on a collection such as directory listings, and submitting and managing asynchronous tasks between collections such as file transfers.
This documentation assumes a basic familiarity with HTTP, including the GET, POST, PUT, and DELETE request methods, the Content-Type and Accept headers, and standard status codes.
Python users may wish to use the Globus Python SDK when making requests to the Transfer API, as it handles authentication and exposes methods for most Transfer API functionality.
2. Transfer API Terminology
- task - a batch of asynchronous file transfer or delete operations that were submitted together, identified by a unique ID.
- collection - a definition for a source / destination for data access with a convenient name and unique ID.
- endpoint - a definition for an installation of Globus Connect Server used as an interface for management and creation of collections.
- consent - a string associated with an OAuth2 token that tells services what access the user has agreed to. In addition to the transfer scope that users consent to in order to make requests to the Transfer API, Globus Connect Server v5 data access requires dependent consents to access data on some collections.
- resource - a URL-addressable part of the API, which can be interacted with using a subset of the GET, POST, PUT, and DELETE HTTP methods.
- document - a representation of data, returned by resources as output and accepted by resources as input. There are several standard document types, and some types include sub-documents (for example, the task_list type is a container for many documents of type task).
2.1. Legacy Endpoint Terminology
The Transfer API largely predates the above definitions of endpoint and collection, and frequently uses the term "endpoint" in URLs, fields, error codes, and messages to refer to entities and resources that are actually collections in the current terminology.
For example, the endpoint or collection document will always have a DATA_TYPE of "endpoint" regardless of what type of entity the document actually refers to.
This documentation will use the above endpoint and collection terminology and note locations where legacy endpoint terminology is being used by the API to attempt to reduce confusion.
3. Making Requests
3.1. Base URL
All the URLs in the Transfer documentation are relative to a base URL for the Transfer service:
https://transfer.api.globusonline.org/v0.10
For example, the full URL to /task_list will be:
https://transfer.api.globusonline.org/v0.10/task_list
Clients should store the base URL in one place and use it when constructing resource URLs, to simplify changing versions.
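As a minimal sketch of that advice, the base URL can be kept in a single constant and joined with resource paths by one helper. The names BASE_URL and resource_url are illustrative and not part of the Transfer API.

```python
from urllib.parse import quote

# Base URL as given in this section; kept in one place so the version
# prefix can be changed without touching callers.
BASE_URL = "https://transfer.api.globusonline.org/v0.10"


def resource_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and a resource path such as "/task_list"."""
    # quote() leaves "/" intact and percent-encodes unsafe characters.
    return base_url.rstrip("/") + "/" + quote(path.lstrip("/"))
```

With this helper, resource_url("/task_list") yields the full task list URL shown above.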
3.2. Document Formats
The API uses JSON for all input and output, including error documents. As such, the Content-Type header should be set to application/json when making POST requests.
Note that application/x-www-form-urlencoded is not supported. The body should contain the actual JSON data, not a form encoded version of that data.
The JSON representation uses a "DATA_TYPE" key to specify the type of resource and a "DATA" key containing a list of sub-documents, if any. For example, the task document type is described in detail in the task reference documentation.
3.3. Authentication
Authentication to the Transfer API requires using Globus Auth to obtain an access token. The Globus Auth developer guide describes the steps to register a client and request tokens for that client to make requests to Globus services.
When requesting tokens, request the Transfer all scope in order to get an access token scoped to make requests to the Transfer API:
urn:globus:auth:scope:transfer.api.globus.org:all
If you know you will be using specific Globus Connect Server mapped collections, you can also proactively request their dependent scopes when first getting tokens. See data access consent for details.
Once obtained, the access token needs to be passed to the Transfer API in the Authorization header with the method Bearer:
Authorization: Bearer TOKEN
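The header requirements above can be sketched with the Python standard library. The helper name build_request is hypothetical; it only constructs the request object and does not send anything.

```python
import json
import urllib.request
from typing import Any, Optional


def build_request(url: str, token: str, body: Optional[Any] = None,
                  method: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a Transfer API request with a Bearer token."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        # The body is raw JSON, not a form-encoded version of it.
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

Passing the result to urllib.request.urlopen would perform the request; when a body is supplied and no method is given, urllib defaults to POST.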
3.4. Linked Identities
Authorization is based on the capabilities granted to any of the linked identities associated with the Globus Auth token used to authenticate to the REST API, combined with capabilities granted to any of the groups that any of the linked identities belong to. For example, a private collection will be visible if any of the linked identities owns the collection, or if any of the linked identities has an appropriate effective role on the collection.
3.5. Errors
When an error occurs, an HTTP status code >= 400 will be returned. The body of the response will be a JSON document with details about the error, including code and message fields. The error code will also be provided in the "X-Transfer-API-Error" header. Note that requests outside the API path version prefix may return an HTML or plaintext error body instead. Here is an example EndpointNotFound error:
{
"code": "EndpointNotFound",
"message": "No such endpoint '23c1a962-7e68-11e5-ac37-f0def10a689e'",
"request_id": "HrbjJy3QJ",
"resource": "/endpoint/23c1a962-7e68-11e5-ac37-f0def10a689e"
}
A 404 status code is used for this response. The code field has the same value as the X-Transfer-API-Error header, for convenient access.
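A client might normalize error responses along the following lines. This is a sketch under stated assumptions: TransferError and parse_error are illustrative names, the headers argument is assumed to be a plain mapping with the exact header spelling used above, and a non-JSON body is kept as the message.

```python
import json
from typing import Mapping, NamedTuple, Optional


class TransferError(NamedTuple):
    status: int
    code: Optional[str]
    message: Optional[str]
    request_id: Optional[str]


def parse_error(status: int, body: str,
                headers: Optional[Mapping[str, str]] = None) -> TransferError:
    """Extract the error fields from a response with status >= 400."""
    headers = headers or {}
    # The header carries the same value as the "code" field, so it is
    # used as a fallback when the body is not a JSON error document.
    code = headers.get("X-Transfer-API-Error")
    try:
        doc = json.loads(body)
    except ValueError:
        doc = None  # HTML or plaintext body
    if not isinstance(doc, dict):
        return TransferError(status, code, body.strip() or None, None)
    return TransferError(status, doc.get("code", code),
                         doc.get("message"), doc.get("request_id"))
```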
3.6. Example Requests
The following examples are a series of Transfer requests to submit and view a transfer from Globus Tutorial Collection 1 to Globus Tutorial Collection 2.
curl is used to give low-level, language-agnostic examples. For routine command-line usage, the Globus CLI is recommended. Python users may also want to reference the Globus Python SDK examples.
The TOKEN variable in the below examples is an access token granted by Globus Auth; see Authentication for details. Note that since the Globus Tutorial Collections are Globus Connect Server mapped collections, the token’s scopes will need to include dependent scopes for data access consent.
3.6.1. Get a submission id from Globus Transfer
This example is a GET against the /submission_id resource, which gets a submission_id needed to submit a task.
curl --request GET --header "Authorization: Bearer $TOKEN" \
https://transfer.api.globus.org/v0.10/submission_id
{
"DATA_TYPE": "submission_id",
"value": "f0930d8a-e83c-448c-9c43-a63296e541d3"
}
3.6.2. Submit the transfer request
This example is a POST against the /transfer resource with a JSON body containing the submission_id from the above example and the fields needed to transfer file1.txt from Globus Tutorial Collection 1’s /home/share/godata/ directory to the user’s home directory (/~/) on Globus Tutorial Collection 2.
Note: The fields source_endpoint and destination_endpoint refer to the source and destination collections due to legacy endpoint terminology.
curl --request POST --header "Authorization: Bearer $TOKEN" \
--header "Content-Type: application/json" \
--data '{
"DATA": [
{
"DATA_TYPE": "transfer_item",
"destination_path": "file1.txt",
"source_path": "/home/share/godata/file1.txt"
}
],
"DATA_TYPE": "transfer",
"destination_endpoint": "31ce9ba0-176d-45a5-add3-f37d233ba47d",
"source_endpoint": "6c54cade-bde5-45c1-bdea-f4bd71dba2cc",
"submission_id": "f0930d8a-e83c-448c-9c43-a63296e541d3"
}' \
https://transfer.api.globus.org/v0.10/transfer
{
"DATA_TYPE": "transfer_result",
"code": "Accepted",
"message": "The transfer has been accepted and a task has been created and queued for execution",
"request_id": "gOvWY74zr",
"resource": "/transfer",
"submission_id": "f0930d8a-e83c-448c-9c43-a63296e541d3",
"task_id": "2c594ded-b6ae-4957-b706-ffe34d307c6b",
"task_link": {
"DATA_TYPE": "link",
"href": "task/2c594ded-b6ae-4957-b706-ffe34d307c6b?format=json",
"rel": "related",
"resource": "task",
"title": "related task"
}
}
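For clients that assemble the request body in code, the transfer document from the curl example can be sketched as a small Python function. The function name and its argument names are illustrative; only the JSON keys come from the example above.

```python
from typing import Any, Dict, Iterable, Tuple


def build_transfer_document(submission_id: str, source_collection: str,
                            destination_collection: str,
                            items: Iterable[Tuple[str, str]]) -> Dict[str, Any]:
    """Build a "transfer" document from (source_path, destination_path) pairs."""
    return {
        "DATA_TYPE": "transfer",
        "submission_id": submission_id,
        # Legacy terminology: these fields hold collection IDs.
        "source_endpoint": source_collection,
        "destination_endpoint": destination_collection,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": source_path,
                "destination_path": destination_path,
            }
            for source_path, destination_path in items
        ],
    }
```

Serializing the returned dictionary with json.dumps gives a body equivalent to the one passed to curl with --data.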
3.6.3. Check the task’s status
This example is a GET against the /task/<task_id> resource with the task_id URL variable set to the task_id from the above example’s response in order to view that task’s status.
curl --request GET --header "Authorization: Bearer $TOKEN" \
https://transfer.api.globus.org/v0.10/task/2c594ded-b6ae-4957-b706-ffe34d307c6b
{
"DATA_TYPE": "task",
"bytes_checksummed": 0,
"bytes_transferred": 4,
"canceled_by_admin": null,
"canceled_by_admin_message": null,
"command": "API 0.10",
"completion_time": "2024-01-29T16:52:56+00:00",
"deadline": "2024-01-30T16:52:55+00:00",
"delete_destination_extra": false,
"destination_base_path": null,
"destination_endpoint": "u_4ms37xszkjddtk3w5gtgrttr4i#476a7a90-80a0-11ee-8c55-fd88ce9321ad",
"destination_endpoint_display_name": "Globus Tutorial Collection 2",
"destination_endpoint_id": "31ce9ba0-176d-45a5-add3-f37d233ba47d",
"destination_local_user": null,
"destination_local_user_status": null,
"directories": 0,
"effective_bytes_per_second": 3,
"encrypt_data": false,
"fail_on_quota_errors": false,
"fatal_error": null,
"faults": 0,
"files": 1,
"files_skipped": 0,
"files_transferred": 1,
"filter_rules": null,
"history_deleted": false,
"is_ok": null,
"is_paused": false,
"label": null,
"nice_status": null,
"nice_status_details": null,
"nice_status_expires_in": null,
"nice_status_short_description": null,
"owner_id": "b0febcb2-c10b-4ec2-b03b-d97f6ada6a33",
"preserve_timestamp": false,
"recursive_symlinks": "ignore",
"request_time": "2024-01-29T16:52:55+00:00",
"skip_source_errors": false,
"source_base_path": null,
"source_endpoint": "u_eyljfjd6jfg67nm6zp6gly4qpu#e09c6728-80a0-11ee-bddb-c52a29481bea",
"source_endpoint_display_name": "Globus Tutorial Collection 1",
"source_endpoint_id": "6c54cade-bde5-45c1-bdea-f4bd71dba2cc",
"source_local_user": null,
"source_local_user_status": null,
"status": "SUCCEEDED",
"subtasks_canceled": 0,
"subtasks_expired": 0,
"subtasks_failed": 0,
"subtasks_pending": 0,
"subtasks_retrying": 0,
"subtasks_skipped_errors": 0,
"subtasks_succeeded": 2,
"subtasks_total": 2,
"symlinks": 0,
"sync_level": null,
"task_id": "2c594ded-b6ae-4957-b706-ffe34d307c6b",
"type": "TRANSFER",
"username": "exampleuser",
"verify_checksum": false
}
4. Data Access Consent
Standard Globus Connect Server v5 mapped collections require users to consent to a collection-specific dependent data_access scope in order to access data on that collection. Every such scope will be in the format:
urn:globus:auth:scope:transfer.api.globus.org:all[*https://auth.globus.org/scopes/COLLECTION_UUID/data_access]
Where COLLECTION_UUID is the uuid of a collection that requires the data_access scope. You may request multiple dependent scopes by separating them with spaces:
urn:globus:auth:scope:transfer.api.globus.org:all[*https://auth.globus.org/scopes/COLLECTION_UUID_1/data_access *https://auth.globus.org/scopes/COLLECTION_UUID_2/data_access]
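The scope format above is mechanical enough to build in code. The following is a small sketch; the constant and function names are illustrative, and the string layout follows the two examples above.

```python
from typing import Iterable

TRANSFER_SCOPE = "urn:globus:auth:scope:transfer.api.globus.org:all"


def transfer_scope_with_data_access(collection_ids: Iterable[str]) -> str:
    """Return the Transfer scope with a data_access dependency per collection."""
    dependencies = " ".join(
        f"*https://auth.globus.org/scopes/{collection_id}/data_access"
        for collection_id in collection_ids
    )
    if not dependencies:
        return TRANSFER_SCOPE  # no dependent scopes requested
    return f"{TRANSFER_SCOPE}[{dependencies}]"
```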
If you know ahead of time that you will be accessing a mapped collection that requires a data_access dependent scope, you should request those consents when you first request tokens for authenticating with Transfer.
However, if you don’t know ahead of time which collections you will be accessing, you can add error handling logic to handle ConsentRequired errors raised when Transfer is unable to get the dependent tokens needed to access data. The body of these error responses will contain a required_scopes field with a list of scopes to which the user must consent in order to complete the failing request.
{
"code": "ConsentRequired",
"message": "Missing required data_access consent",
"request_id": "WmMV97A1w",
"required_scopes": [
"urn:globus:auth:scope:transfer.api.globus.org:all[*https://auth.globus.org/scopes/ea4c8cf2-d3a2-4c1a-92c2-59925fe7ac5b/data_access *https://auth.globus.org/scopes/0357da44-c908-47e5-8351-ba5c695d11a7/data_access]"
],
"resource": "/transfer"
}
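Error handling for this case might start by pulling the scopes out of the error document, as sketched below. The function name is hypothetical, and how the application then re-prompts the user for consent is outside the scope of this sketch.

```python
import json
from typing import List


def required_scopes_from_error(error_body: str) -> List[str]:
    """Return the scopes listed in a ConsentRequired error, else an empty list."""
    try:
        doc = json.loads(error_body)
    except ValueError:
        return []  # not a JSON error document
    if not isinstance(doc, dict) or doc.get("code") != "ConsentRequired":
        return []
    return list(doc.get("required_scopes", []))
```

The returned scopes can then be requested in a new Globus Auth flow before the failing request is retried.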
5. Common Query Parameters
Most resources support field selection using the fields parameter. Most list resources support pagination using one of the paging methods described below. Some list resources also support filtering on certain fields using a filter parameter, and sorting on certain fields using orderby.
5.1. Paging
The Transfer service uses multiple types of paging for controlling results depending on the resource used. Resource documentation will specify which type of paging that resource uses.
5.1.1. Offset Paging
Resources which use this type of paging use the offset and limit query parameters to guide pagination. The default offset is 0, while the default limit and the maximum offset and limit vary among resources.
Typical usage involves starting with offset 0, choosing a page size (limit=PAGE_SIZE), and incrementing offset by PAGE_SIZE to display successive pages. The limit and offset values will be echoed in the response body, and some resources will also include a has_next_page value, which will be true if making a query at the next offset would yield more results.
For example, with a page size of 50:
# page 1
GET /task_list?offset=0&limit=50
# page 2
GET /task_list?offset=50&limit=50
# page 3
GET /task_list?offset=100&limit=50
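Offset paging can be sketched as a generator. Here fetch_page is a hypothetical caller-supplied function that performs the GET with the given offset and limit and returns the decoded JSON document. The sketch assumes the items are in the "DATA" list and, when has_next_page is absent, treats a short page as the last one.

```python
from typing import Any, Callable, Dict, Iterator


def iter_offset_pages(fetch_page: Callable[..., Dict[str, Any]],
                      page_size: int = 50) -> Iterator[Any]:
    """Yield every item of a list resource that uses offset paging."""
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=page_size)
        items = page.get("DATA", [])
        yield from items
        has_next = page.get("has_next_page")
        if has_next is None:
            # Assumption: a full page suggests more results may follow.
            has_next = len(items) == page_size
        if not has_next or not items:
            break
        offset += page_size
```

Because the maximum offset varies among resources, a real client would also need to stop or narrow its query once that limit is reached.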
5.1.2. Marker Paging
Resources which use this type of paging use the marker query parameter and the marker and next_marker response body fields to guide pagination. If the next_marker field in a response is not null, it can be passed as the marker query parameter on the next request to fetch the next page. If the next_marker field in a response is null, there are no further results. The marker query parameter used for that request will be echoed in the response body. Page size cannot be controlled for resources using this paging method.
For example:
# page 1
GET /task/<task_id>/successful_transfers
{ "marker": null, "next_marker": 123, ... }
# page 2
GET /task/<task_id>/successful_transfers?marker=123
{ "marker": 123, "next_marker": 456, ... }
# page 3
GET /task/<task_id>/successful_transfers?marker=456
{ "marker": 456, "next_marker": null, ... }
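The marker loop can be sketched the same way. As before, fetch_page is a hypothetical caller-supplied function that issues the GET (omitting the marker parameter when it is None) and returns the decoded JSON document; items are assumed to be in the "DATA" list.

```python
from typing import Any, Callable, Dict, Iterator


def iter_marker_pages(fetch_page: Callable[..., Dict[str, Any]]) -> Iterator[Any]:
    """Yield every item of a list resource that uses marker paging."""
    marker = None  # the first request is made without a marker
    while True:
        page = fetch_page(marker=marker)
        yield from page.get("DATA", [])
        marker = page.get("next_marker")
        if marker is None:
            break  # a null next_marker means there are no further results
```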
5.1.3. Last Key Paging
Resources which use this type of paging use the last_key query parameter along with the last_key and has_next_page response body fields to guide pagination. If has_next_page is true, the last_key response field can be passed as a query parameter to fetch the next page. If has_next_page is false, there are no more results at the time of the request. Page size can be controlled using the limit query parameter, which will be echoed in the response body.
For example:
# page 1
GET /endpoint_manager/task_list?limit=100
{ "has_next_page": true, "last_key": "abc", "limit": 100, ... }
# page 2
GET /endpoint_manager/task_list?limit=100&last_key="abc"
{ "has_next_page": false, "last_key": null, "limit": 100, ... }
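A sketch of the last key loop follows, with the same assumptions as for the other paging methods: fetch_page is a hypothetical caller-supplied function that returns the decoded JSON document, and items are read from the "DATA" list.

```python
from typing import Any, Callable, Dict, Iterator


def iter_last_key_pages(fetch_page: Callable[..., Dict[str, Any]],
                        limit: int = 100) -> Iterator[Any]:
    """Yield every item of a list resource that uses last key paging."""
    last_key = None  # the first request is made without last_key
    while True:
        page = fetch_page(last_key=last_key, limit=limit)
        yield from page.get("DATA", [])
        if not page.get("has_next_page"):
            break  # no more results at the time of the request
        last_key = page["last_key"]
```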
5.1.4. Next Token Paging
Resources which use this type of paging use the next_token query parameter to guide pagination. Responses will include a next_token value that will either be null if there are no more results, or should be passed as the next_token query parameter to fetch the next page of results. Page size can be controlled using the max_results query parameter.
For example:
# page 1
GET /endpoint/<endpoint_or_collection_id>/shared_endpoint_list?max_results=100
{ "next_token": "abc", ... }
# page 2
GET /endpoint/<endpoint_or_collection_id>/shared_endpoint_list?max_results=100&next_token=abc
{ "next_token": null, ... }
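Next token paging follows the same shape. In this sketch fetch_page is again a hypothetical caller-supplied function returning the decoded JSON document, and the "DATA" list is assumed to hold the items.

```python
from typing import Any, Callable, Dict, Iterator


def iter_next_token_pages(fetch_page: Callable[..., Dict[str, Any]],
                          max_results: int = 100) -> Iterator[Any]:
    """Yield every item of a list resource that uses next token paging."""
    next_token = None  # the first request is made without next_token
    while True:
        page = fetch_page(next_token=next_token, max_results=max_results)
        yield from page.get("DATA", [])
        next_token = page.get("next_token")
        if next_token is None:
            break  # a null next_token means there are no more results
```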
5.2. Filtering
Only certain fields support filtering; see the reference documentation for a full list. There are several types of filters, including date range, a single value, or a list of values. See the field documentation for descriptions and examples.
This example for the task list returns ACTIVE and SUCCEEDED tasks submitted before December 20, 2010:
GET /task_list?filter=status:ACTIVE,SUCCEEDED/request_time:,2010-12-20 00:00:00
The new convention for filters is to use a separate parameter for each filter, of the form filter_NAME; see Endpoint and Collection Search for an example.
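The older single-parameter filter syntax can be assembled and percent-encoded as sketched below. This assumes, based only on the task list example above, that clauses are joined with "/", that a field and its values are separated by ":", and that values are joined with ","; the function name is illustrative.

```python
from typing import Mapping, Sequence
from urllib.parse import quote, urlencode


def build_filter_query(filters: Mapping[str, Sequence[str]]) -> str:
    """Encode {field: [values...]} as a single filter query parameter."""
    clauses = [f"{name}:{','.join(values)}" for name, values in filters.items()]
    # quote_via=quote encodes spaces as %20 rather than "+".
    return urlencode({"filter": "/".join(clauses)}, quote_via=quote)
```

A date range with an open start, as in the example, is passed as an empty first value followed by the end timestamp, for instance ["", "2010-12-20 00:00:00"].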
5.3. Sorting
The orderby parameter sets a sort field and direction. Only fields which support filtering are sortable. The value is a comma-separated list of field names, with an optional direction specifier. For example:
GET /task_list?orderby=status,request_time desc
returns tasks first ordered by status, in ascending alphabetical order, then within tasks with the same status sorts by request_time, with newer tasks first (descending).
5.4. Limiting Result Fields
The fields query parameter can be used to limit which fields are included in the response, for example:
GET /task_list?fields=task_id,status
will return a task list with only task_id and status fields in each task. This can save bandwidth and parsing time if you know you only need certain fields.
Field selection can also be done on sub-documents, by prefixing the field name with the document type name. For example:
GET /endpoint_search?filter_scope=my-endpoints&fields=id,display_name
will include only the id and display_name of each endpoint or collection.
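The orderby and fields parameters can be built together in one helper, sketched here with the standard library. The function name and argument shapes are illustrative; the output is a percent-encoded query string to append after "?".

```python
from typing import Optional, Sequence, Tuple
from urllib.parse import quote, urlencode


def build_list_query(fields: Optional[Sequence[str]] = None,
                     orderby: Optional[Sequence[Tuple[str, Optional[str]]]] = None) -> str:
    """Encode field selection and sort order as query parameters."""
    params = {}
    if fields:
        params["fields"] = ",".join(fields)
    if orderby:
        # Each entry is (field, direction); direction may be None.
        params["orderby"] = ",".join(
            name if direction is None else f"{name} {direction}"
            for name, direction in orderby
        )
    return urlencode(params, quote_via=quote)
```

For example, build_list_query(orderby=[("status", None), ("request_time", "desc")]) encodes the sort order from the Sorting section.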
6. Request Rate Limiting
This service will return an error if too many requests are received. The HTTP status is 429 and the error document code is RateLimitExceeded. Upon receiving this error, a client MUST use an exponential backoff (delay) before retrying. If a client does not use a delay before retrying, that client and/or user may be locked out from the service.
The rate limit is 20 requests per second, per effective identity. Metering is done on 10 second periods to allow for some bursts.
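One way to implement the required backoff is sketched below. The send argument is a hypothetical caller-supplied function that performs one request and returns a (status, body) pair; the delay values and the added random jitter are illustrative choices, not values specified by the service.

```python
import random
import time
from typing import Any, Callable, Tuple


def call_with_backoff(send: Callable[[], Tuple[int, Any]],
                      max_attempts: int = 6,
                      base_delay: float = 1.0,
                      max_delay: float = 60.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, Any]:
    """Retry a request on HTTP 429, doubling the delay after each attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Jitter spreads out retries from clients that were limited together.
        sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

The sleep parameter exists so the delay can be replaced in tests; production callers can leave it at the default.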