API Overview

1. Overview

The Transfer API provides remote data access via GridFTP to file systems that have been configured with collections within the Transfer service. Collections are generally created using the Globus Connect Server or Globus Connect Personal software packages.

Most requests to the Transfer API fall under three categories: configuring and managing Globus Connect Personal collections, synchronously accessing data on a collection such as directory listings, and submitting and managing asynchronous tasks between collections such as file transfers.

This documentation assumes a basic familiarity with HTTP, including the GET, POST, PUT, and DELETE request methods, Content-Type and Accept headers, and standard status codes.

Python users may wish to use the Globus Python SDK when making requests to the Transfer API, as it handles authentication and exposes methods for most Transfer API functionality.

2. Transfer API Terminology

task - a batch of asynchronous file transfer or delete operations that were submitted together, identified by a unique ID.
collection - a definition for a source / destination for data access with a convenient name and unique ID.
endpoint - a definition for an installation of Globus Connect Server used as an interface for management and creation of collections.
consent - a string associated with an OAuth2 token that tells services what access the user has agreed to. In addition to the transfer scope that users consent to in order to make requests to the Transfer API, Globus Connect Server v5 data access requires dependent consents to access data on some collections.
resource - a URL addressable part of the API, which can be interacted with using a subset of the GET, POST, PUT, and DELETE HTTP methods.
document - a representation of data, returned by resources as output and accepted by resources as input. There are several standard document types, and some types include sub-documents (for example, the task_list type is a container for many documents of type task).

2.1. Legacy Endpoint Terminology

The Transfer API largely predates the above definitions of endpoint and collection, and frequently uses the term "endpoint" in URLs, fields, error codes, and messages to refer to entities and resources that are actually collections in the current terminology.

For example, the endpoint or collection document will always have a DATA_TYPE of "endpoint" regardless of what type of entity the document actually refers to.

This documentation will use the above endpoint and collection terminology and note locations where legacy endpoint terminology is being used by the API to attempt to reduce confusion.

3. Making Requests

3.1. Base URL

All the URLs in the Transfer documentation are relative to a base URL for the Transfer service:

https://transfer.api.globusonline.org/v0.10

so the full URL to /task_list will be:

https://transfer.api.globusonline.org/v0.10/task_list

Clients should store the base URL in one place and use it when constructing resource URLs, to simplify changing versions.
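
As a minimal sketch of this advice, the snippet below keeps the base URL in a single constant and derives resource URLs from it. The names BASE_URL and resource_url are illustrative only and are not part of the Transfer API or the Globus SDK.

```python
# Base URL as shown in section 3.1; change it here to switch API versions.
BASE_URL = "https://transfer.api.globusonline.org/v0.10"


def resource_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and a resource path such as '/task_list'."""
    # Normalize slashes so both '/task_list' and 'task_list' work.
    return base_url.rstrip("/") + "/" + path.lstrip("/")


print(resource_url("/task_list"))
```

Keeping the version prefix in one constant means a future version change touches a single line.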

3.2. Document Formats

The API uses JSON for all input and output, including error documents. As such, the Content-Type header should be set to application/json when making POST requests.

Note that application/x-www-form-urlencoded is not supported. The body should contain the actual JSON data, not a form encoded version of that data.

The JSON representation uses a "DATA_TYPE" key to specify the type of resource and a "DATA" key containing a list of sub-documents, if any. For example, the task document type is described in detail here:

Task Document

3.3. Authentication

Authentication to the Transfer API requires using Globus Auth to obtain an access token. The Globus Auth developer guide describes the steps to register a client and request tokens for that client to make requests to Globus services.

When requesting tokens, request the Transfer all scope in order to get an access token scoped to make requests to the Transfer API:

urn:globus:auth:scope:transfer.api.globus.org:all

If you know you will be using specific Globus Connect Server mapped collections, you can also proactively request their dependent scopes when first getting tokens. See data access consent for details.

Once obtained, the access token needs to be passed to the Transfer API in the Authorization header with the method Bearer:

Authorization: Bearer TOKEN
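
The header and body conventions described so far can be sketched with the Python standard library. The helper name build_json_request is hypothetical, and the request is only constructed here, not sent; obtaining the token from Globus Auth is assumed to have happened elsewhere.

```python
import json
import urllib.request


def build_json_request(url, token, document=None, method="GET"):
    """Build (but do not send) a Transfer API request.

    The access token goes in the Authorization header as a Bearer token.
    When a document is given, it is sent as a JSON body with
    Content-Type: application/json (never form encoded).
    """
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if document is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(document).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method=method)


request = build_json_request(
    "https://transfer.api.globus.org/v0.10/submission_id", "TOKEN"
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would perform the call; any other HTTP client can be used in the same way.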

3.4. Linked Identities

Authorization is based on the capabilities granted to any of the linked identities associated with the Globus Auth token used to authenticate to the REST API, combined with capabilities granted to any of the groups that any of the linked identities belong to. For example, a private collection will be visible if any of the linked identities own the collection, or if any of the linked identities has an appropriate effective role on the collection.

3.5. Errors

When an error occurs, an HTTP status code >=400 will be returned. The body of the response will be a JSON document with details about the error, including code and message fields. The error code will also be provided in the "X-Transfer-API-Error" header. Note that requests outside the API path version prefix may return an HTML or plaintext error body instead. Here is an example EndpointNotFound^[1] error:

{
  "code": "EndpointNotFound",
  "message": "No such endpoint '23c1a962-7e68-11e5-ac37-f0def10a689e'",
  "request_id": "HrbjJy3QJ",
  "resource": "/endpoint/23c1a962-7e68-11e5-ac37-f0def10a689e"
}

A 404 status code is used for this response. The code field has the same value as the X-Transfer-API-Error header, for convenient access.
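
A client might handle these responses along the following lines. This is a sketch, not a prescribed implementation: parse_error is a hypothetical helper, headers is assumed to be a plain dict keyed exactly as "X-Transfer-API-Error", and the fallback branch covers the HTML or plaintext bodies mentioned above.

```python
import json


def parse_error(status, headers, body):
    """Return (code, message) for an error response, or None on success."""
    if status < 400:
        return None
    # The header carries the same value as the "code" field in the body.
    code = headers.get("X-Transfer-API-Error")
    try:
        document = json.loads(body)
    except (ValueError, TypeError):
        document = None
    if isinstance(document, dict):
        return document.get("code", code), document.get("message")
    # Not a JSON error document: keep the raw body as the message.
    return code, body


example_body = json.dumps({
    "code": "EndpointNotFound",
    "message": "No such endpoint '23c1a962-7e68-11e5-ac37-f0def10a689e'",
    "request_id": "HrbjJy3QJ",
    "resource": "/endpoint/23c1a962-7e68-11e5-ac37-f0def10a689e",
})
print(parse_error(404, {"X-Transfer-API-Error": "EndpointNotFound"}, example_body))
```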

3.6. Example Requests

The following examples are a series of Transfer requests to submit and view a transfer from Globus Tutorial Collection 1 to Globus Tutorial Collection 2.

curl is used to give low-level, language-agnostic examples. For routine command-line usage, the Globus CLI is recommended. Python users may also want to reference the Globus Python SDK examples.

The TOKEN variable in the examples below is an access token granted by Globus Auth; see Authentication for details. Note that since the Globus Tutorial Collections are Globus Connect Server mapped collections, the token’s scopes will need to include dependent scopes for data access consent.

3.6.1. Get a submission id from Globus Transfer

This example is a GET against the /submission_id resource which gets a submission_id needed to submit a task.

Request:

curl --request GET --header "Authorization: Bearer $TOKEN" \
https://transfer.api.globus.org/v0.10/submission_id

Response:

{
  "DATA_TYPE": "submission_id",
  "value": "f0930d8a-e83c-448c-9c43-a63296e541d3"
}

3.6.2. Submit the transfer request

This example is a POST against the /transfer resource. The JSON body contains the submission_id from the above example and the fields needed to transfer file1.txt from Globus Tutorial Collection 1’s /home/share/godata/ directory to the user’s home directory (/~/) on Globus Tutorial Collection 2.

Note: The fields source_endpoint and destination_endpoint refer to the source and destination collections due to legacy endpoint terminology.

Request:

curl --request POST --header "Authorization: Bearer $TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "DATA": [
    {
      "DATA_TYPE": "transfer_item",
      "destination_path": "file1.txt",
      "source_path": "/home/share/godata/file1.txt"
    }
  ],
  "DATA_TYPE": "transfer",
  "destination_endpoint": "31ce9ba0-176d-45a5-add3-f37d233ba47d",
  "source_endpoint": "6c54cade-bde5-45c1-bdea-f4bd71dba2cc",
  "submission_id": "f0930d8a-e83c-448c-9c43-a63296e541d3"
}' \
https://transfer.api.globus.org/v0.10/transfer

Response:

{
  "DATA_TYPE": "transfer_result",
  "code": "Accepted",
  "message": "The transfer has been accepted and a task has been created and queued for execution",
  "request_id": "gOvWY74zr",
  "resource": "/transfer",
  "submission_id": "f0930d8a-e83c-448c-9c43-a63296e541d3",
  "task_id": "2c594ded-b6ae-4957-b706-ffe34d307c6b",
  "task_link": {
    "DATA_TYPE": "link",
    "href": "task/2c594ded-b6ae-4957-b706-ffe34d307c6b?format=json",
    "rel": "related",
    "resource": "task",
    "title": "related task"
  }
}
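
For clients that assemble the request body in code rather than by hand, the transfer document used above could be built roughly as follows. The function name build_transfer_document is an illustration only; the field names and values are taken from the curl example, and only the fields shown there are set.

```python
import json


def build_transfer_document(submission_id, source_collection,
                            destination_collection, items):
    """Build a transfer document from (source_path, destination_path) pairs.

    The collection IDs go in the legacy-named source_endpoint and
    destination_endpoint fields.
    """
    return {
        "DATA_TYPE": "transfer",
        "submission_id": submission_id,
        "source_endpoint": source_collection,
        "destination_endpoint": destination_collection,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": source_path,
                "destination_path": destination_path,
            }
            for source_path, destination_path in items
        ],
    }


document = build_transfer_document(
    "f0930d8a-e83c-448c-9c43-a63296e541d3",
    "6c54cade-bde5-45c1-bdea-f4bd71dba2cc",
    "31ce9ba0-176d-45a5-add3-f37d233ba47d",
    [("/home/share/godata/file1.txt", "file1.txt")],
)
print(json.dumps(document, indent=2, sort_keys=True))
```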

3.6.3. Check the task’s status

This example is a GET against the /task/<task_id> resource with the task_id URL variable set to the task_id from the above example’s response in order to view that task’s status.

Request:

curl --request GET --header "Authorization: Bearer $TOKEN" \
https://transfer.api.globus.org/v0.10/task/2c594ded-b6ae-4957-b706-ffe34d307c6b

Response:

{
  "DATA_TYPE": "task",
  "bytes_checksummed": 0,
  "bytes_transferred": 4,
  "canceled_by_admin": null,
  "canceled_by_admin_message": null,
  "command": "API 0.10",
  "completion_time": "2024-01-29T16:52:56+00:00",
  "deadline": "2024-01-30T16:52:55+00:00",
  "delete_destination_extra": false,
  "destination_base_path": null,
  "destination_endpoint": "u_4ms37xszkjddtk3w5gtgrttr4i#476a7a90-80a0-11ee-8c55-fd88ce9321ad",
  "destination_endpoint_display_name": "Globus Tutorial Collection 2",
  "destination_endpoint_id": "31ce9ba0-176d-45a5-add3-f37d233ba47d",
  "destination_local_user": null,
  "destination_local_user_status": null,
  "directories": 0,
  "effective_bytes_per_second": 3,
  "encrypt_data": false,
  "fail_on_quota_errors": false,
  "fatal_error": null,
  "faults": 0,
  "files": 1,
  "files_skipped": 0,
  "files_transferred": 1,
  "filter_rules": null,
  "history_deleted": false,
  "is_ok": null,
  "is_paused": false,
  "label": null,
  "nice_status": null,
  "nice_status_details": null,
  "nice_status_expires_in": null,
  "nice_status_short_description": null,
  "owner_id": "b0febcb2-c10b-4ec2-b03b-d97f6ada6a33",
  "preserve_timestamp": false,
  "recursive_symlinks": "ignore",
  "request_time": "2024-01-29T16:52:55+00:00",
  "skip_source_errors": false,
  "source_base_path": null,
  "source_endpoint": "u_eyljfjd6jfg67nm6zp6gly4qpu#e09c6728-80a0-11ee-bddb-c52a29481bea",
  "source_endpoint_display_name": "Globus Tutorial Collection 1",
  "source_endpoint_id": "6c54cade-bde5-45c1-bdea-f4bd71dba2cc",
  "source_local_user": null,
  "source_local_user_status": null,
  "status": "SUCCEEDED",
  "subtasks_canceled": 0,
  "subtasks_expired": 0,
  "subtasks_failed": 0,
  "subtasks_pending": 0,
  "subtasks_retrying": 0,
  "subtasks_skipped_errors": 0,
  "subtasks_succeeded": 2,
  "subtasks_total": 2,
  "symlinks": 0,
  "sync_level": null,
  "task_id": "2c594ded-b6ae-4957-b706-ffe34d307c6b",
  "type": "TRANSFER",
  "username": "exampleuser",
  "verify_checksum": false
}

4. Data Access Consent

Standard Globus Connect Server v5 mapped collections require users to consent to a collection-specific dependent data_access scope in order to access data on that collection. Every such scope will be in the format:

urn:globus:auth:scope:transfer.api.globus.org:all[*https://auth.globus.org/scopes/COLLECTION_UUID/data_access]

Where COLLECTION_UUID is the uuid of a collection that requires the data_access scope. You may request multiple dependent scopes by separating them with spaces:

urn:globus:auth:scope:transfer.api.globus.org:all[*https://auth.globus.org/scopes/COLLECTION_UUID_1/data_access *https://auth.globus.org/scopes/COLLECTION_UUID_2/data_access]
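
The scope format above is easy to get wrong by hand, so a small helper can assemble it. This sketch follows the format shown in this section literally; the function name transfer_scope_with_data_access is hypothetical.

```python
TRANSFER_SCOPE = "urn:globus:auth:scope:transfer.api.globus.org:all"


def transfer_scope_with_data_access(collection_ids):
    """Return the Transfer scope with data_access dependent scopes attached."""
    if not collection_ids:
        return TRANSFER_SCOPE
    # One "*https://auth.globus.org/scopes/<uuid>/data_access" entry per
    # collection, separated by spaces inside the square brackets.
    dependents = " ".join(
        f"*https://auth.globus.org/scopes/{collection_id}/data_access"
        for collection_id in collection_ids
    )
    return f"{TRANSFER_SCOPE}[{dependents}]"


print(transfer_scope_with_data_access(["COLLECTION_UUID_1", "COLLECTION_UUID_2"]))
```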

If you know ahead of time that you will be accessing a mapped collection that requires a data_access dependent scope, you should request those consents when you first request tokens for authenticating with Transfer.

However, if you don’t know ahead of time which collections you will be accessing, you can add error handling logic to handle ConsentRequired errors raised when Transfer is unable to get the dependent tokens needed to access data. The body of these error responses will contain a required_scopes field with a list of scopes to which the user must consent in order to complete the failing request.

{
  "code": "ConsentRequired",
  "message": "Missing required data_access consent",
  "request_id": "WmMV97A1w",
  "required_scopes": [
    "urn:globus:auth:scope:transfer.api.globus.org:all[*https://auth.globus.org/scopes/ea4c8cf2-d3a2-4c1a-92c2-59925fe7ac5b/data_access *https://auth.globus.org/scopes/0357da44-c908-47e5-8351-ba5c695d11a7/data_access]"
  ],
  "resource": "/transfer"
}
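
The error handling logic described above might start by extracting the scopes from the error body. In this sketch, required_scopes_from_error is a hypothetical helper; what the client does next (typically sending the user through a new login flow that requests the returned scopes) is outside its scope.

```python
import json


def required_scopes_from_error(body):
    """Return the scopes listed in a ConsentRequired error body.

    Returns an empty list for any other error document.
    """
    document = json.loads(body)
    if document.get("code") != "ConsentRequired":
        return []
    return list(document.get("required_scopes", []))


error_body = json.dumps({
    "code": "ConsentRequired",
    "message": "Missing required data_access consent",
    "required_scopes": [
        "urn:globus:auth:scope:transfer.api.globus.org:all"
        "[*https://auth.globus.org/scopes/COLLECTION_UUID/data_access]"
    ],
    "resource": "/transfer",
})
for scope in required_scopes_from_error(error_body):
    print("Need consent for:", scope)
```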

5. Common Query Parameters

Most resources support field selection using the fields parameter. Most list resources support pagination using one of the paging methods described below. Some list resources also support filtering on certain fields using a filter parameter, and sorting on certain fields using orderby.

5.1. Paging

The Transfer service uses multiple types of paging for controlling results depending on the resource used. Resource documentation will specify which type of paging that resource uses.

5.1.1. Offset Paging

Resources which use this type of paging use the offset and limit query parameters to guide pagination. The default offset is 0, while the default limit and maximum offset and limit vary among resources. Typical usage involves starting with offset 0, choosing a page size (limit=PAGE_SIZE), and incrementing offset by PAGE_SIZE to display successive pages. The limit and offset values will be echoed in the response body, and some resources will also include a has_next_page value which will be true if making a query at the next offset would yield more results.

For example, with a page size of 50:

# page 1
GET /task_list?offset=0&limit=50

# page 2
GET /task_list?offset=50&limit=50

# page 3
GET /task_list?offset=100&limit=50
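
A client-side loop over offset pages could look like the sketch below. It makes several assumptions: fetch_page is a hypothetical callable that performs the GET with the given offset and limit and returns the parsed JSON document, the items are in the "DATA" list, and resources that omit has_next_page are treated as exhausted once a page comes back shorter than limit.

```python
def iter_offset_pages(fetch_page, limit=50):
    """Yield every item from an offset-paged list resource."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        items = page.get("DATA", [])
        yield from items
        has_next = page.get("has_next_page")
        if has_next is None:
            # Fallback when the resource does not report has_next_page.
            has_next = len(items) == limit
        if not has_next:
            break
        offset += limit


# Stand-in for real HTTP calls: 120 fake tasks served 50 at a time.
tasks = [{"task_id": str(n)} for n in range(120)]


def fake_fetch(offset, limit):
    return {"DATA": tasks[offset:offset + limit], "offset": offset, "limit": limit}


print(len(list(iter_offset_pages(fake_fetch, limit=50))))
```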

5.1.2. Marker Paging

Resources which use this type of paging use the marker query parameter and the marker and next_marker response body fields to guide pagination. If the next_marker field in a response is not null, it can be passed as the marker query parameter on the next request to fetch the next page. If the next_marker field in a response is null, there are no further results. The marker query parameter used for that request will be echoed in the response body. Page size cannot be controlled for resources using this paging method.

For example:

# page 1
GET /task/<task_id>/successful_transfers
{
  "marker": null,
  "next_marker": 123,
  ...
}

# page 2
GET /task/<task_id>/successful_transfers?marker=123
{
  "marker": 123,
  "next_marker": 456,
  ...
}

# page 3
GET /task/<task_id>/successful_transfers?marker=456
{
  "marker": 456,
  "next_marker": null,
  ...
}
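
The marker protocol above can be followed with a short loop. As a sketch under stated assumptions: fetch_page is a hypothetical callable that issues the GET (omitting the marker parameter when it is None) and returns the parsed response, and the items are assumed to be in the "DATA" list.

```python
def iter_marker_pages(fetch_page):
    """Yield every item from a marker-paged list resource."""
    marker = None  # the first request is made without a marker
    while True:
        page = fetch_page(marker)
        yield from page.get("DATA", [])
        marker = page.get("next_marker")
        if marker is None:
            # A null next_marker means there are no further results.
            break


# Stand-in responses mirroring the three pages in the example above.
fake_pages = {
    None: {"marker": None, "next_marker": 123, "DATA": ["a", "b"]},
    123: {"marker": 123, "next_marker": 456, "DATA": ["c"]},
    456: {"marker": 456, "next_marker": None, "DATA": ["d"]},
}
print(list(iter_marker_pages(lambda marker: fake_pages[marker])))
```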

5.1.3. Last Key Paging

Resources which use this type of paging use the last_key query parameter along with the last_key and has_next_page response body fields to guide pagination. If has_next_page is true, the last_key response field can be passed as a query parameter to fetch the next page. If has_next_page is false, there are no more results at the time of the request. Page size can be controlled using the limit query parameter which will be echoed in the response body.

For example:

# page 1
GET /endpoint_manager/task_list?limit=100
{
    "has_next_page": true,
    "last_key": "abc",
    "limit": 100,
    ...
}

# page 2
GET /endpoint_manager/task_list?limit=100&last_key=abc
{
    "has_next_page": false,
    "last_key": null,
    "limit": 100,
    ...
}
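
A loop for this paging method differs from marker paging mainly in its stop condition. In this sketch, fetch_page is a hypothetical callable taking the last_key (None on the first request) and the limit, and the items are assumed to be in the "DATA" list.

```python
def iter_last_key_pages(fetch_page, limit=100):
    """Yield every item from a last_key-paged list resource."""
    last_key = None  # the first request is made without last_key
    while True:
        page = fetch_page(last_key, limit)
        yield from page.get("DATA", [])
        if not page.get("has_next_page"):
            # No more results at the time of this request.
            break
        last_key = page["last_key"]


# Stand-in responses mirroring the two pages in the example above.
fake_pages = {
    None: {"has_next_page": True, "last_key": "abc", "limit": 100, "DATA": [1, 2]},
    "abc": {"has_next_page": False, "last_key": None, "limit": 100, "DATA": [3]},
}
print(list(iter_last_key_pages(lambda key, limit: fake_pages[key])))
```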

5.1.4. Next Token Paging

Resources which use this type of paging use the next_token query parameter to guide pagination. Responses will include a next_token value that will either be null if there are no more results, or should be passed as the next_token query parameter to fetch the next page of results. Page size can be controlled using the max_results query parameter.

For example:

# page 1
GET /endpoint/<endpoint_or_collection_id>/shared_endpoint_list?max_results=100
{
    "next_token": "abc",
    ...
}

# page 2
GET /endpoint/<endpoint_or_collection_id>/shared_endpoint_list?max_results=100&next_token=abc
{
    "next_token": null,
    ...
}
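
Next token paging follows the same pattern. In this sketch, fetch_page is a hypothetical callable taking the next_token (None on the first request) and max_results, and the items are assumed to be in the "DATA" list.

```python
def iter_next_token_pages(fetch_page, max_results=100):
    """Yield every item from a next_token-paged list resource."""
    token = None  # the first request is made without next_token
    while True:
        page = fetch_page(token, max_results)
        yield from page.get("DATA", [])
        token = page.get("next_token")
        if token is None:
            # A null next_token means there are no more results.
            break


# Stand-in responses mirroring the two pages in the example above.
fake_pages = {
    None: {"next_token": "abc", "DATA": ["x", "y"]},
    "abc": {"next_token": None, "DATA": ["z"]},
}
print(list(iter_next_token_pages(lambda token, max_results: fake_pages[token])))
```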

5.2. Filtering

Only certain fields support filtering; see the reference documentation for a full list. There are several types of filters, including date range, a single value, or a list of values. See the field documentation for descriptions and examples.

This example for the task list returns ACTIVE and SUCCEEDED tasks submitted before December 20 2010:

GET /task_list?filter=status:ACTIVE,SUCCEEDED/request_time:,2010-12-20 00:00:00

The new convention for filters is to use a separate parameter for each filter, of the form filter_NAME; see Endpoint and Collection Search for an example.
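
The older single-parameter filter syntax can be assembled programmatically. The sketch below infers the syntax from the task list example only (fields separated by "/", a colon between name and value, commas between list values), so treat it as an assumption rather than a full grammar; build_filter is a hypothetical helper, and SUCCEEDED is the status value shown in the task document example.

```python
from urllib.parse import urlencode


def build_filter(**fields):
    """Build a filter value such as 'status:ACTIVE/request_time:,2010-12-20'."""
    parts = []
    for name, value in fields.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(value)  # list of accepted values
        parts.append(f"{name}:{value}")
    return "/".join(parts)


filter_value = build_filter(
    status=["ACTIVE", "SUCCEEDED"],
    request_time=",2010-12-20 00:00:00",  # empty start: no lower bound
)
print(filter_value)
# urlencode percent-encodes the value for use in a query string.
print("/task_list?" + urlencode({"filter": filter_value}))
```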

5.3. Sorting

The orderby parameter sets a sort field and direction. Only fields which support filtering are sortable. The value is a comma separated list of field names, with an optional direction specifier. For example:

GET /task_list?orderby=status,request_time desc

returns tasks first ordered by status, in ascending alphabetical order, then within tasks with the same status sorts by request_time, with newer tasks first (descending).

5.4. Limiting Result Fields

The fields query parameter can be used to limit which fields are included in the response, for example:

GET /task_list?fields=task_id,status

will return a task list with only task_id and status fields in each task. This can save bandwidth and parsing time if you know you only need certain fields.

Field selection can also be done on sub-documents, by prefixing the field name with the document type name. For example:

GET /endpoint_search?filter_scope=my-endpoints&fields=id,display_name

will include only the id and display_name of each endpoint or collection.
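
The orderby and fields parameters are both comma separated lists, so a query string combining them can be built as in this sketch. The helper name list_query is hypothetical; urlencode takes care of encoding the comma and the space in "request_time desc".

```python
from urllib.parse import urlencode


def list_query(fields=None, orderby=None):
    """Build a query string selecting fields and sort order."""
    params = {}
    if fields:
        params["fields"] = ",".join(fields)
    if orderby:
        # Each entry is a field name with an optional direction, "name desc".
        params["orderby"] = ",".join(orderby)
    return urlencode(params)


print("/task_list?" + list_query(
    fields=["task_id", "status"],
    orderby=["status", "request_time desc"],
))
```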

6. Request Rate Limiting

This service will send an error if too many requests are being received. The HTTP status is 429 and the error document code is RateLimitExceeded. Upon receiving this error, a client MUST use an exponential backoff (delay). If a client does not use a delay before retrying, that client and/or user may be locked out from the service.

The rate limit is 20 requests per second, per effective identity. Metering is done over 10-second periods to allow for some bursts.
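
One possible shape for the required backoff is sketched below. The names and defaults (request_with_backoff, five attempts, a one second initial delay that doubles each time) are illustrative choices, not values specified by the service. The send argument is a hypothetical callable that performs the request and returns a (status, body) pair.

```python
import time


def request_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry with exponential backoff on HTTP 429."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Delay doubles on each retry: 1s, 2s, 4s, ...
            sleep(base_delay * (2 ** attempt))
    # Still rate limited after the final attempt; let the caller decide.
    return status, body


# Stand-in for real requests: rate limited twice, then successful.
responses = iter([(429, "RateLimitExceeded"), (429, "RateLimitExceeded"), (200, "ok")])
delays = []
print(request_with_backoff(lambda: next(responses), sleep=delays.append))
print(delays)
```

Adding a small random jitter to each delay is a common refinement when many clients may retry at the same moment.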

1. This use of the term "endpoint" is a case of legacy endpoint terminology; it can also, or exclusively, refer to collections.