Task Management
- 1. Overview
- 2. Document Types
- 3. Path Arguments
- 4. Common Query Parameters
- 5. Common Errors
- 6. Operations
1. Overview
Transfer and delete are asynchronous operations, and result in a background task being created. The task id is returned from successful submission, and can be used to monitor the progress of the task.
For tasks older than one month, details of the task are no longer available and the history_deleted flag will be set on the task document.
1.1. Task Visibility
A task will be visible if the current primary identity is the same as the primary identity when the task was submitted. The only time this will not be the case is when two separate identities with existing task history are linked. When that happens, the task history for the linked identity that is no longer primary will not be visible. It can still be recovered and viewed by unlinking the identity and signing in to that identity separately.
2. Document Types
2.1. Task Document
The "task" document type represents a single transfer or delete submission. Other task types may be added in the future.
2.1.1. Task Fields
Field Name | JSON Type | Description |
---|---|---|
DATA_TYPE | string | Always has value "task" to indicate this document type. |
task_id | string | Globally unique uuid for this task. |
type | string | The type of task - either "TRANSFER" or "DELETE". |
status | string | One word indicating the progress of the task: |
fatal_error | object | For tasks with "FAILED" status, this is an object with |
label | string | User defined label to make finding tasks simpler. |
username | string | [DEPRECATED] Use |
owner_id | string | Identity id of the task owner. If the task was submitted with multiple linked identities, the owner will be the primary identity. |
request_time | string | The date and time the task was created, in ISO 8601 format. |
completion_time | string | The date and time the task was completed or failed, in ISO 8601 format. If the task is still in progress (status "ACTIVE" or "INACTIVE"), this will be null. |
deadline | string | If set when the task is created, cancel the task if it’s not finished when this date and time is reached. By default transfer tasks will not be canceled as long as they are making progress. Delete tasks default to 24 hours after task submission. |
source_endpoint[1] | string | [DEPRECATED] Use |
source_endpoint_id[1] | string | |
source_endpoint_display_name[1] | string | |
destination_endpoint[1] | string | [DEPRECATED] Use |
destination_endpoint_id[1] | string | |
destination_endpoint_display_name[1] | string | Like |
sync_level | integer | The sync level used for the transfer, or null if not used. Always null for delete tasks. |
encrypt_data | boolean | For transfer tasks, this will be true if the data channel is encrypted. This can happen in two ways: if one of the collections or involved in the transfer has |
verify_checksum | boolean | The verify_checksum option used for the transfer. Always false for delete tasks. |
delete_destination_extra | boolean | The delete_destination_extra option used for the transfer. Always false for delete tasks. |
recursive_symlinks | string | The |
preserve_timestamp | boolean | The preserve_timestamp option used for the transfer. Always false for delete tasks. |
skip_source_errors | boolean | The skip_source_errors option used for the transfer. Always false for delete tasks. |
fail_on_quota_errors | boolean | The fail_on_quota_errors option used for the transfer. Always false for delete tasks. |
command | string | If the task was submitted via the CLI, this field will contain the original command line, including options, that created the task. If submitted via the Globus Web App or directly via the Transfer API, this will contain the string "API", followed by the API version, and optionally followed by a short string representing the client that submitted the request. The format of this field is subject to change and should not be relied upon. |
history_deleted | boolean | This flag will be set for tasks older than one month, and indicates that details of the task are no longer available. In particular, subtasks and successful transfers will not be available for the task if this is true, so clients should always check this before querying for subtasks and successful transfers. |
faults | int | The number of errors this task encountered. Note that certain types of faults are not fatal (for example, network communication errors) and can be successfully retried. A CANCELED or EXPIRED event is not included in this fault count. |
files | int | The total number of files in a transfer task. This can grow as the task recursively scans directories. |
directories | int | The total number of directories in a transfer task. This can grow as the task recursively scans directories. |
symlinks | int | The total number of kept symlinks (not copied or ignored) in a transfer task. This can grow as the task recursively scans directories. |
files_skipped | int | The number of files skipped because no changes were detected. This will always be zero for non-sync transfer tasks (with null |
files_transferred | int | The number of files actually transferred over the network. For a successful transfer, |
subtasks_total | int | Total number of subtasks. This includes file transfer or delete subtasks and helper subtasks such as directory expansion, so is not a reliable measure of the number of files being transferred. It can also grow over time as new files and directories are discovered in directory expansions. |
subtasks_pending | int | Number of subtasks which are still in progress. |
subtasks_retrying | int | This field is deprecated, and may be 0 or removed in a future release. |
subtasks_succeeded | int | Number of subtasks which have completed successfully. |
subtasks_expired | int | Number of subtasks which expired and were not completed. |
subtasks_canceled | int | Number of subtasks which were canceled. |
subtasks_failed | int | Number of subtasks which failed for reasons other than expiring or being cancelled. |
subtasks_skipped_errors | int | Number of subtasks that were skipped due to skip_source_errors being set on the task. This will equal the total number of discovered files and directories skipped from the source collection. Note that if a directory is skipped then the files and directories under it will not be discovered and not included in this count. |
bytes_transferred | int | The total number of bytes transferred summed across all subtasks. |
bytes_checksummed | int | If sync level 3 is used, the number of bytes checksummed while determining which files need to be transferred. |
effective_bytes_per_second | int | A simplistic calculation of bytes/second based on the start time of the task and its completion time, if applicable, or the current time. Valid for transfer tasks. Always 0 for other task types. |
nice_status | string | For tasks with status |
nice_status_details | string | DEPRECATED This field is always null. Use the task |
nice_status_short_description | string | 3-4 word description of nice_status code |
nice_status_expires_in | string | Seconds until any credential required for the task expires (-1=never, 0=expired) |
canceled_by_admin | string | If the task completes successfully, is canceled by the task owner using the standard cancel resource, or hits the deadline before completing, this will be null. It is set only for tasks canceled by a collection activity manager using the Advanced Collection Management resources. For such tasks, it will have one of the following values: |
canceled_by_admin_message | string | For tasks with |
is_paused | boolean | "true" if the task is in progress (status "ACTIVE" or "INACTIVE") and has been paused by the activity manager of the source or destination collection, "false" if the task has not been paused or is complete (status "SUCCEEDED" or "FAILED"). Use Get task pause info to get information about why the task is paused. |
filter_rules | list | List of filter_rules applied on this task during recursive expansion. Can be null. See task submit for details. |
source_local_user | string | The value of the |
source_local_user_status | string | If |
destination_local_user | string | Like |
destination_local_user_status | string | Like |
source_base_path | string | If |
destination_base_path | string | If |
{
"DATA_TYPE": "task",
"bytes_checksummed": 0,
"bytes_transferred": 4,
"canceled_by_admin": null,
"canceled_by_admin_message": null,
"command": "API 0.10",
"completion_time": "2023-12-13T22:30:04+00:00",
"deadline": "2023-12-14T22:30:02+00:00",
"delete_destination_extra": false,
"destination_base_path": null,
"destination_endpoint": "u_4ms37xszkjddtk3w5gtgrttr4i#476a7a90-80a0-11ee-8c55-fd88ce9321ad",
"destination_endpoint_display_name": "Globus Tutorial Collection 2",
"destination_endpoint_id": "31ce9ba0-176d-45a5-add3-f37d233ba47d",
"destination_local_user": null,
"destination_local_user_status": null,
"directories": 0,
"effective_bytes_per_second": 2,
"encrypt_data": false,
"fail_on_quota_errors": false,
"fatal_error": null,
"faults": 0,
"files": 1,
"files_skipped": 0,
"files_transferred": 1,
"filter_rules": null,
"history_deleted": false,
"is_ok": null,
"is_paused": false,
"label": null,
"nice_status": null,
"nice_status_details": null,
"nice_status_expires_in": null,
"nice_status_short_description": null,
"owner_id": "8b386098-7951-4956-9e49-b1dfe507cdc9",
"preserve_timestamp": false,
"recursive_symlinks": "ignore",
"request_time": "2023-12-13T22:30:02+00:00",
"skip_source_errors": false,
"source_base_path": null,
"source_endpoint": "u_eyljfjd6jfg67nm6zp6gly4qpu#e09c6728-80a0-11ee-bddb-c52a29481bea",
"source_endpoint_display_name": "Globus Tutorial Collection 1",
"source_endpoint_id": "6c54cade-bde5-45c1-bdea-f4bd71dba2cc",
"source_local_user": null,
"source_local_user_status": null,
"status": "SUCCEEDED",
"subtasks_canceled": 0,
"subtasks_expired": 0,
"subtasks_failed": 0,
"subtasks_pending": 0,
"subtasks_retrying": 0,
"subtasks_skipped_errors": 0,
"subtasks_succeeded": 2,
"subtasks_total": 2,
"symlinks": 0,
"sync_level": null,
"task_id": "2715582a-9a07-11ee-87f6-a52c65340a88",
"type": "TRANSFER",
"username": "exampleuser",
"verify_checksum": true
}
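The status and history_deleted fields described above are the ones a client typically checks before asking for task details. The following is a minimal sketch, assuming the task document has already been fetched and decoded into a Python dict; the helper names are illustrative and not part of the API.

```python
# Statuses this document names for finished tasks.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def is_complete(task: dict) -> bool:
    """Return True once the task has finished (status SUCCEEDED or FAILED)."""
    return task.get("status") in TERMINAL_STATUSES


def can_query_successful_transfers(task: dict) -> bool:
    """Return True if the successful transfers list is expected to be available.

    The list is only offered for completed TRANSFER tasks, and it is no longer
    available once history_deleted has been set on the task.
    """
    return (
        task.get("type") == "TRANSFER"
        and is_complete(task)
        and not task.get("history_deleted", False)
    )
```

Checking history_deleted first avoids a request that is expected to fail for tasks older than one month.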
2.2. Event Document
Events are logged as a task makes progress or runs into errors.
{
"DATA_TYPE": "event",
"code": "PERMISSION_DENIED",
"description": "Permission denied",
"details": "Error (transfer)\nServer: ballen#uc-laptop (Globus Connect)\nFile: /~/Downloads/plus-plan-exposure.png\nCommand: STOR ~/Downloads/plus-plan-exposure.png\nMessage: Fatal FTP response\n---\n500 Command failed. : Path not allowed.\n",
"is_error": true,
"time": "2014-07-08 18:50:18+00:00"
}
2.2.1. Event Fields
Field Name | JSON Type | Description |
---|---|---|
DATA_TYPE | string | Always has value "event" to indicate this document type. |
code | string | A code indicating the type of the event. |
is_error | boolean | true if event is an error event |
description | string | A description of the event. |
details | string | Type specific details about the event. |
time | string | The date and time the event occurred, in ISO 8601 format (YYYY-MM-DD HH:MM:SS) and UTC. |
The full list of event codes is below.
ACL_CHANGED AMBIGUOUS_PATH AUTH CANCELED CLOCK CONNECT_FAILED CONNECTION_BROKEN CONNECTION_RESET ENDPOINT_ERROR ENDPOINT_TOO_BUSY EXPIRED EXTERNAL_CHECKSUM_MISMATCH FILE_NOT_FOUND FILE_SIZE_CHANGED GC_NOT_CONNECTED GC_PAUSED GSI_DN_MISMATCH INTERNAL_ERROR INVALID_PATH_NAME INVALID_SERVICE_CREDENTIAL INVALID_SYMLINK IS_A_DIRECTORY IS_A_FILE LIMIT_EXCEEDED MPU_NOT_FOUND NO_APPEND_FILESYSTEM NO_CREDENTIALS NO_SPACE_LEFT NOT_A_DIRECTORY NOT_A_FILE NOT_A_SYMLINK PAUSED PERMISSION_DENIED PROGRESS QUOTA_EXCEEDED STARTED SUCCEEDED TIMEOUT UNKNOWN UNPAUSED VERIFY_CHECKSUM
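Event documents can be filtered and ordered on the client using the is_error and time fields. This is a rough sketch, assuming the events have already been decoded into Python dicts shaped like the example above; the function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_event_time(value: str) -> datetime:
    """Parse an event "time" string such as "2014-07-08 18:50:18+00:00"."""
    parsed = datetime.fromisoformat(value)
    # Event times are documented as UTC; attach UTC if no offset is present.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def error_events(events: list) -> list:
    """Keep only events flagged as errors, newest first."""
    errors = [event for event in events if event.get("is_error")]
    return sorted(errors, key=lambda e: parse_event_time(e["time"]), reverse=True)
```

Sorting on the parsed datetime rather than the raw string keeps the ordering correct even if a timestamp carries a different offset.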
2.3. Limited pause rule document
The limited pause rule document is a subset of the full pause rule document.
It does not contain sensitive fields, in particular modified_by, which is
only viewable by collection activity monitors using the Advanced Collection
Management resources. The DATA_TYPE is "pause_rule_limited" instead of
"pause_rule", to indicate the exclusions. See the pause_rule document for
details.
4. Common Query Parameters
Name | Type | Description |
---|---|---|
fields | string | Comma separated list of fields to include in the response. This can be used to save bandwidth on large list responses when not all fields are needed. |
limit | int | For paged resources, change the page size. For |
offset | int | For paged resources, specify an offset within the full result set. Typically a fixed page size is specified with limit, and offset is incremented by the page size to fetch each page. |
orderby | string | For paged resources, a comma separated list of order by options. Each order by option is either a field name, or a field name followed by space and 'ASC' or 'DESC' for ascending and descending; ascending is the default. Note that only certain fields are supported for ordering; see the specific operation documentation for details. |
filter | string | For paged resources, return only resources that match all of the specified filter criteria. The value must be a "/" separated list of "FIELD_NAME:FIELD_FILTER" strings. See the |
5. Common Errors
Code | HTTP Status | Description |
---|---|---|
TaskNotFound | 404 | If task specified by <task_id> is not found |
Conflict | 409 | If task is complete and can’t be updated. |
ServiceUnavailable | 503 | If the service is down for maintenance. |
6. Operations
6.1. Get task list
Get a list of tasks submitted by the current user.
This resource uses offset paging.
URL | /task_list |
---|---|
Method | GET |
Response Body | { "DATA_TYPE": "task_list", "length": 2, "limit": 10, "offset": 20, "total": 125, "DATA": [<task document>, ...] } |
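Offset paging over this response can be sketched as a loop that advances offset by the page size until total is reached. The sketch below assumes a caller-supplied fetch_page(offset, limit) callable (hypothetical) that performs GET /task_list with those query parameters and returns the decoded JSON body.

```python
from typing import Callable, Iterator


def iter_tasks(fetch_page: Callable[[int, int], dict], limit: int = 10) -> Iterator[dict]:
    """Yield every task document by walking offset pages.

    fetch_page is a hypothetical callable supplied by the caller; it is not
    part of the API described here.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page["DATA"]
        offset += limit
        # Stop on an empty page or once the offset passes the full result set.
        if not page["DATA"] or offset >= page["total"]:
            break
```

Keeping the HTTP call behind a callable leaves authentication and transport to the caller and makes the loop easy to exercise with canned pages.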
6.1.1. Filter and Order By Options
Fields allowed in the filter and orderby query parameters.
GET /task_list?filter=status:ACTIVE,INACTIVE/label:~experiment1*
Get tasks that are still running (status ACTIVE or INACTIVE), and have a label that begins with the string "experiment1".
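A request like the one above can be assembled programmatically. This is a small sketch, assuming the filter value is built by joining "FIELD_NAME:FIELD_FILTER" pairs with "/" as described under Common Query Parameters; the helper names are illustrative, and URL encoding is delegated to the standard library.

```python
from urllib.parse import urlencode


def build_filter(criteria: dict) -> str:
    """Join FIELD_NAME:FIELD_FILTER pairs with "/" into one filter value."""
    return "/".join(f"{name}:{value}" for name, value in criteria.items())


def build_query(fields=None, limit=None, offset=None, orderby=None, criteria=None) -> str:
    """Build a URL-encoded query string from the common query parameters."""
    params = {}
    if fields:
        params["fields"] = ",".join(fields)
    if limit is not None:
        params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    if orderby:
        params["orderby"] = ",".join(orderby)
    if criteria:
        params["filter"] = build_filter(criteria)
    return urlencode(params)
```

For example, build_filter({"status": "ACTIVE,INACTIVE", "label": "~experiment1*"}) produces the filter value used in the request above, before URL encoding.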
Name | Type | Description |
---|---|---|
endpoint_id[1] | string list | Comma separated list of UUID strings. Return only tasks run against the specified collection ids. Only returns tasks owned by the current user. |
task_id | string list | Comma separated list of UUID strings. Return tasks matching the specified task ids. Only returns tasks owned by the current user. |
type | string list | Comma separated list of type names (TRANSFER, DELETE). Return only tasks of the specified type(s). The default is all tasks. |
status | string list | Comma separated list of status codes (ACTIVE, INACTIVE, FAILED, SUCCEEDED). Return only tasks with one of the specified statuses. |
label | pattern list | Comma separated list of patterns to match against the label field. Returns tasks that match any of the patterns. Each pattern is an operator, followed by a string. The operator is one of Backslashes are used to escape ambiguous literal characters in the label filter. |
request_time | datetime range | Accepts a time range, specified by a comma separated list of two ISO 8601 date/time strings. If one of the dates is omitted, it forms an open range, so "dt," returns all records with date greater or equal to dt, and ",dt" returns all records with dates less than or equal to dt. If there is no comma, it is treated in the same way as "dt,". If the time is omitted from a date/time, it’s assumed to be 00:00. |
completion_time | datetime range | Like the request time filter, but for the completion time. |
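The datetime range syntax used by request_time and completion_time can be produced with a small formatter. This is a sketch under the assumption that Python's isoformat() output is an acceptable ISO 8601 date/time string for the service; the function name is hypothetical.

```python
from datetime import datetime
from typing import Optional


def time_range(start: Optional[datetime] = None, end: Optional[datetime] = None) -> str:
    """Format a datetime range filter value as "start,end".

    Omitting one side produces an open range ("dt," or ",dt").
    """
    if start is None and end is None:
        raise ValueError("at least one bound is required")
    left = start.isoformat() if start else ""
    right = end.isoformat() if end else ""
    return f"{left},{right}"
```

The result is intended to be used as the FIELD_FILTER part of a filter criterion, for example "request_time:" followed by the returned string.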
6.2. Get task by id
Get a single task by task id. All fields are included by default,
but the fields query parameter can be used to fetch only specific fields.
URL | /task/<task_id> |
---|---|
Method | GET |
Response Body | |
6.3. Update task by id
Update a single task by task id. Only the label and deadline fields can be
updated, and only on tasks that are still running. If the task is complete a
"Conflict" error will be returned. A copy of the task body with one of those
fields modified can be used, or a partial document containing only DATA_TYPE
and the fields to be modified.
URL | /task/<task_id> |
---|---|
Method | PUT |
Response Body | Result resource |
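The partial document form of the update body can be sketched as follows. This assumes the deadline is passed as an already formatted date/time string like the one in the task example; the function name is illustrative.

```python
import json
from typing import Optional


def build_task_update(label: Optional[str] = None, deadline: Optional[str] = None) -> str:
    """Return a JSON body containing DATA_TYPE plus only the fields to change."""
    if label is None and deadline is None:
        raise ValueError("nothing to update: pass label and/or deadline")
    body = {"DATA_TYPE": "task"}
    if label is not None:
        body["label"] = label
    if deadline is not None:
        body["deadline"] = deadline
    return json.dumps(body)
```

Sending only the fields being changed avoids accidentally echoing back stale values from an older copy of the task document.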
6.4. Cancel task by id
Submit a cancel request for an active task, by id. Cancel requests are
processed asynchronously, but this API call will wait up to 10 seconds for the
cancel request to be completed before returning a response. If the task was
already complete, result code "TaskComplete" is returned. If the cancel request
is processed within 10 seconds, result code "Canceled" is returned. Note that
when "Canceled" is returned, it’s still possible that the task completed
successfully just as the request was processed but after the check was made to
see if the task was already complete. Clients should always check the status
field of the task to verify what happened if they care about whether the task
succeeded or failed. What will always be true when "Canceled" or "TaskComplete"
is returned is that the task is no longer active. If the cancel request can’t
be processed in 10 seconds, code "CancelAccepted" is returned, and the client
can use "Get task by id" to fetch the task and see when its status changes
from "ACTIVE" to "FAILED" or "SUCCEEDED".
Only the owner of a task can cancel it via this API resource. If the owner is
an activity manager on one of the collections involved in the task, tasks canceled
with this resource will still NOT be marked as canceled_by_admin. This
resource is designed for when the user is acting as a normal user, regardless
of any higher level authority they have been granted.
URL | /task/<task_id>/cancel |
---|---|
Method | POST |
Response Body | |
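The three result codes described for cancel requests lead to two client behaviours: stop, or keep polling. This is a minimal sketch that takes the result code as a plain string; how the code is read out of the response body is not shown here and is left to the caller, and the function names are hypothetical.

```python
# Result codes named in the cancel description for tasks that are no longer active.
NO_LONGER_ACTIVE = {"Canceled", "TaskComplete"}


def needs_polling(result_code: str) -> bool:
    """Return True if the caller should keep polling "Get task by id"."""
    if result_code in NO_LONGER_ACTIVE:
        return False
    if result_code == "CancelAccepted":
        return True
    raise ValueError(f"unexpected cancel result code: {result_code!r}")


def final_outcome(task: dict) -> str:
    """Report what actually happened, based on the task's status field."""
    status = task["status"]
    if status in ("SUCCEEDED", "FAILED"):
        return status
    return "STILL_RUNNING"
```

Because a "Canceled" result does not rule out a task that finished successfully at the last moment, final_outcome reads the status field instead of trusting the result code.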
6.5. Remove task by id
Remove a task by id. Only tasks against High Assurance collections are eligible for removal. Tasks must first be either SUCCEEDED or FAILED. On success, a 200 is returned. After removal, any attempts by the task owner to access the task or information about the task (skipped_errors, successful_transfers) will return a 404. The task will remain visible to admins and manager roles via the Advanced Collection Management resources.
If a task is still running or paused, any attempt to remove the task will fail with an HTTP 409. In this circumstance, it may be necessary to cancel the task first before proceeding to remove it.
If a task is not against a High Assurance collection and a removal is attempted, an HTTP 403 will be returned.
If a task does not exist, or has already been removed, an HTTP 404 will be returned.
Only the owner of a task can remove a task.
URL | /task/<task_id>/remove |
---|---|
Method | POST |
Response Body | |
6.5.1. Errors
Code | HTTP Status | Description |
---|---|---|
PermissionDenied | 403 | Task is not eligible for removal, for example if it does not involve a HA collection. |
TaskNotFound | 404 | Task does not exist, or has already been removed |
Conflict | 409 | Task is still active (paused or running). Canceling the task will resolve this. |
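The HTTP statuses listed for task removal can be mapped to client actions. The sketch below only interprets a status code that the caller has already obtained; the wording of the messages and the function names are illustrative.

```python
# HTTP statuses documented for the remove operation.
REMOVE_OUTCOMES = {
    200: "removed",
    403: "not eligible for removal (task does not involve a High Assurance collection)",
    404: "task does not exist or has already been removed",
    409: "task is still active; cancel it before removing",
}


def describe_remove_result(http_status: int) -> str:
    """Return a short description of a remove response status."""
    return REMOVE_OUTCOMES.get(http_status, f"unexpected HTTP status {http_status}")


def should_cancel_then_retry(http_status: int) -> bool:
    """Return True when the documented remedy is to cancel the task first."""
    return http_status == 409
```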
6.6. Get event list
Get a list of all events, including error and info events. The results
are ordered by time descending (newest first).
This resource uses offset paging.
URL | /task/<task_id>/event_list |
---|---|
Method | GET |
Response Body | |
6.6.1. Filter Options
Fields allowed in the filter query parameter.
GET /task/<task_id>/event_list?filter=is_error:1
Get only error events.
Name | Type | Description |
---|---|---|
is_error | boolean | "1" for true, "0" for false. If true, return only events that are classified as errors. If false, return only events that are classified as non-error (informational). By default, returns all events. |
6.7. Get task successful transfers
For "TRANSFER" tasks that are completed (have status "SUCCEEDED" or "FAILED"),
get a list of files that were successfully transferred. The list includes the
source and destination paths of each file. Note that this does not include
files that were checked but skipped as part of a sync transfer or due to the
skip_source_errors flag being set to true, only files that were actually
transferred. The list does not include any directories.
This resource uses marker paging.
Returns 404 "ClientError.NotFound" if history has been deleted (history is kept for only one month). Returns 400 "ClientError.BadRequest" if the task is not yet complete or is not of type "TRANSFER".
URL | /task/<task_id>/successful_transfers [?marker=MARKER] |
---|---|
Method | GET |
Response Body | |
6.7.1. Successful Transfer Fields
Field Name | JSON Type | Description |
---|---|---|
DATA_TYPE | string | Always has value "successful_transfer" to indicate this document type. |
source_path | string | The path to the file on the source. |
destination_path | string | The path to the file on the destination. |
checksum | string | The file checksum, if one was used to verify the result of the transfer. |
checksum_algorithm | string | The checksum algorithm, if one was used to verify the result of the transfer. |
dynamic | boolean | True if the file was dynamically exported from the source, such as a Google Doc being exported to a PDF. |
size | int | The size (bytes) of the file that was transferred. |
6.8. Get task skipped errors
For "TRANSFER" tasks that are completed (have status "SUCCEEDED" or "FAILED"),
get a list of discovered paths that were skipped due to the
skip_source_errors flag being set to true. Additional paths may have remained
undiscovered and therefore also skipped as a result of a skipped error on a
parent directory.
The list will contain information about the error and information needed to create a transfer_item or transfer_symlink_item to resubmit the path in a new task.
This resource uses marker paging.
A result set may be too large to fit into a single response. In this case, the
response will have the next_marker field set to a non-empty, opaque integer
token. Pass this token as the marker query parameter in another request to
get the next chunk of data. A client must not attempt to generate or assume
knowledge of a marker’s format. Note that this differs from the offset based
paging supported by some other resources.
Returns 404 "ClientError.NotFound" if history has been deleted (history is kept for only one month). Returns 400 "ClientError.BadRequest" if the task is not yet complete or is not of type "TRANSFER".
URL | /task/<task_id>/skipped_errors [?marker=MARKER] |
---|---|
Method | GET |
Response Body | |
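Marker paging, as described for this resource, can be sketched as a loop that feeds each next_marker back as the marker parameter. The sketch assumes a caller-supplied fetch_page(marker) callable (hypothetical) that returns the decoded JSON body, and it assumes the items are listed under a "DATA" key as in the task list response; that key is not confirmed here for this resource.

```python
from typing import Any, Callable, Iterator


def iter_marker_pages(fetch_page: Callable[[Any], dict]) -> Iterator[dict]:
    """Yield every item by following next_marker until it is absent or empty.

    fetch_page is a hypothetical callable; pass None as the marker for the
    first request. The marker is treated as opaque and never inspected.
    """
    marker = None
    while True:
        page = fetch_page(marker)
        # "DATA" as the item list key is an assumption, see the note above.
        yield from page.get("DATA", [])
        marker = page.get("next_marker")
        if marker is None or marker == "":
            break
```

The loop never builds or modifies a marker itself, in line with the rule that clients must not assume knowledge of the marker format.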
6.8.1. Skipped Error Fields
Field Name | JSON Type | Description |
---|---|---|
DATA_TYPE | string | Always has value "skipped_error" to indicate this document type. |
error_code | string | The code of the skipped error. Either PERMISSION_DENIED or FILE_NOT_FOUND. |
error_details | string | Additional information from the collection about the skipped error. |
source_path | string | The path on the source collection that hit the error. |
destination_path | string | The path on the destination collection the item would have been transferred to. |
is_directory | boolean | True if the source_path is for a directory. |
is_symlink | boolean | True if the source_path is for a symlink. |
is_delete_destination_extra | boolean | True if the delete destination extra stage of a task was skipped for this path. Note that delete destination extra is a separate task stage. This error indicates a problem traversing the source directory path and that the checks required to compare source and destination were incomplete. Partial work including deletions may have occurred. Some transfers may have also occurred for files beneath this path. |
external_checksum | string | The external_checksum submitted with the item. |
checksum_algorithm | string | The checksum_algorithm submitted with the item. |
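A skipped_error document carries the information needed to resubmit a path in a new task. The sketch below shows one way that mapping might look. The layout of the transfer_item and transfer_symlink_item documents (including the "recursive" field) is an assumption that is not defined in this section, so it should be checked against the task submission documentation.

```python
def resubmit_item(skipped: dict) -> dict:
    """Build an item for a new task from a skipped_error document.

    The output field layout is an assumption, not taken from this section.
    """
    if skipped.get("is_symlink"):
        return {
            "DATA_TYPE": "transfer_symlink_item",
            "source_path": skipped["source_path"],
            "destination_path": skipped["destination_path"],
        }
    item = {
        "DATA_TYPE": "transfer_item",
        "source_path": skipped["source_path"],
        "destination_path": skipped["destination_path"],
        # Assumed: directories are resubmitted as recursive items.
        "recursive": bool(skipped.get("is_directory")),
    }
    # Carry over checksum information only when it was submitted with the item.
    for key in ("external_checksum", "checksum_algorithm"):
        if skipped.get(key):
            item[key] = skipped[key]
    return item
```

Entries with is_delete_destination_extra set describe an incomplete comparison stage and not a single missed file, so they probably deserve manual review before being resubmitted this way.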
6.9. Get task pause info
Get details about why a task is paused (or possibly about to be paused). This includes pause rules on source and destination collections that affect the owner of the task, and per-task pause flags set by source and destination collection activity managers. Any pause rules that have been overridden by an administrator are not listed.
If the task is not paused, this may still return pause rules that have been created but not yet applied to the task. This is because pause rules are processed asynchronously.
If the task is complete, this will return an empty result set, meaning that
the pause_rules list will be empty and both pause messages will be null.
Requires the user to be the owner of the task. To access pause info as an activity manager effective role, use the collection manager pause info operation.
A pause rule is set by an activity manager of a collection and causes all matching tasks to or from that collection to be paused. The rules returned by this operation have some sensitive fields removed; see the pause_rule_limited document.
URL | /task/<task_id>/pause_info |
---|---|
Method | GET |
Response Body | { "DATA_TYPE": "pause_info_limited", "pause_rules": [... list of pause_rule_limited documents...], "source_pause_message": null, "destination_pause_message": "Disk problems, pausing all tasks until we resolve", "source_pause_message_share": null, "destination_pause_message_share": null } |
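A client can reduce the pause info response to the messages that are actually set. This is a minimal sketch, assuming the response has been decoded into a Python dict with the keys shown in the response body above; the helper names are hypothetical.

```python
# Message fields shown in the pause_info_limited response body.
MESSAGE_KEYS = (
    "source_pause_message",
    "destination_pause_message",
    "source_pause_message_share",
    "destination_pause_message_share",
)


def pause_messages(info: dict) -> list:
    """Return (field name, message) pairs for every non-null pause message."""
    return [(key, info[key]) for key in MESSAGE_KEYS if info.get(key)]


def has_pause_info(info: dict) -> bool:
    """Return True if any pause rule or pause message is present.

    A True result does not prove the task is paused: rules may have been
    created but not yet applied. Check is_paused on the task document as well.
    """
    return bool(info.get("pause_rules")) or bool(pause_messages(info))
```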