Task Behavior
1. Overview
After submission through the task submit resources, tasks are asynchronously executed by the Globus Transfer backend. This document is meant to help users understand how these tasks behave during asynchronous execution.
2. Queuing
The Transfer backend will only work on 3 transfer tasks and 3 delete tasks per user at a time. Additional tasks will be queued until other tasks stop working due to completion, pausing, or backing off from an error.
The nice_status of a queued task’s task document will be "Queued" unless it also hit an error within the past 15 minutes, in which case the nice_status will instead show error information.
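The queuing rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the backend's implementation: the function names, the `last_error` tuple, and the exact handling of the 15-minute boundary are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

MAX_ACTIVE_PER_TYPE = 3                       # 3 transfer tasks and 3 delete tasks per user
RECENT_ERROR_WINDOW = timedelta(minutes=15)   # errors newer than this replace "Queued"


def must_queue(active_same_type: int) -> bool:
    """Return True if a new task of this type would have to wait in the queue.

    `active_same_type` is the number of the user's tasks of the same type
    (transfer or delete) that the backend is currently working on.
    """
    return active_same_type >= MAX_ACTIVE_PER_TYPE


def queued_nice_status(now: datetime,
                       last_error: Optional[Tuple[datetime, str]] = None) -> str:
    """Return the nice_status a queued task would show under the rule above.

    `last_error` is a hypothetical (time, error information) pair describing
    the most recent error the task hit, or None if it has not hit one.
    """
    if last_error is not None:
        error_time, error_info = last_error
        if now - error_time <= RECENT_ERROR_WINDOW:
            return error_info
    return "Queued"
```

Transfer and delete tasks are counted separately, so a user with 3 active transfers can still have a delete task start immediately.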
3. Deadlines
Transfer tasks submitted without a user-specified deadline will have their deadlines automatically extended if they are making progress. Delete tasks default to 24 hours and do not have their deadlines automatically extended.
Automatic transfer deadline extension behavior follows these rules:
- The deadline starts at 24 hours after submission.
- If the transfer is making progress, it will regularly have its deadline pushed to 72 hours after its most recent successfully completed batch of work.
- If the transfer is queued and hasn’t hit any errors in the past 24 hours, its deadline will regularly be pushed to 24 hours away.
Paused transfers will not have their deadlines automatically extended, but a paused task will have its deadline set to 90 days after the pause time. When a task is unpaused its deadline is set to 24 hours after the unpause time.
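As a rough illustration, the deadline rules for a transfer without a user-specified deadline could be modeled as follows. The function and parameter names are hypothetical, and how often the backend re-evaluates a deadline ("regularly") is not specified here, so the sketch only computes what the deadline would be at a given moment.

```python
from datetime import datetime, timedelta
from typing import Optional

INITIAL_WINDOW = timedelta(hours=24)    # deadline at submission
PROGRESS_WINDOW = timedelta(hours=72)   # after the last successful batch
QUEUED_WINDOW = timedelta(hours=24)     # for error-free queued transfers
PAUSED_WINDOW = timedelta(days=90)      # set when the task is paused
UNPAUSED_WINDOW = timedelta(hours=24)   # set when the task is unpaused


def initial_deadline(submitted_at: datetime) -> datetime:
    """Deadline assigned at submission."""
    return submitted_at + INITIAL_WINDOW


def extended_deadline(current_deadline: datetime,
                      now: datetime,
                      *,
                      making_progress: bool = False,
                      last_batch_success: Optional[datetime] = None,
                      queued: bool = False,
                      last_error: Optional[datetime] = None) -> datetime:
    """Apply the automatic extension rules; return the (possibly new) deadline."""
    if making_progress and last_batch_success is not None:
        return last_batch_success + PROGRESS_WINDOW
    if queued:
        error_free = last_error is None or now - last_error > QUEUED_WINDOW
        if error_free:
            return now + QUEUED_WINDOW
    # No rule applies (for example, a queued task with a recent error).
    return current_deadline


def deadline_on_pause(paused_at: datetime) -> datetime:
    """Paused tasks are not auto-extended; the deadline is set once at pause time."""
    return paused_at + PAUSED_WINDOW


def deadline_on_unpause(unpaused_at: datetime) -> datetime:
    """Deadline assigned when the task is unpaused."""
    return unpaused_at + UNPAUSED_WINDOW
```

For example, a transfer whose last batch completed 4 hours after submission would have its deadline pushed to 76 hours after submission the next time the rules are applied.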
4. Errors
When the Transfer backend hits an error while working on a task, it categorizes the error, emits an event to the task’s event list, and handles the error in one of the following ways.
4.1. Skippable Errors
Transfer tasks with the skip_source_errors flag set to true will skip paths hitting errors classified as PERMISSION_DENIED, FILE_NOT_FOUND, or AMBIGUOUS_PATH on the source collection.
When one or more paths hit skippable errors during a batch of work, the task’s subtasks_skipped_errors in the task document will be incremented by the number of paths skipped.
Skippable errors will not count against the backoff described under Retry Behavior, but the Transfer backend will still wait 10 seconds before starting the task again to avoid overwhelming the destination collection. It is possible for the task to become queued during this time.
4.2. Fatal Errors
Some error codes will cause a task to immediately fail as they cannot be recovered from.
These are AMBIGUOUS_PATH (when not skipped due to skip_source_errors), INVALID_PATH_NAME, INVALID_SYMLINK, and EXTERNAL_CHECKSUM_MISMATCH.
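One way to picture how the skippable and fatal categories interact is the small classifier below. It is a sketch for illustration only: the `classify_error` function, its `on_source` parameter, and the returned labels are hypothetical, while the error codes are the ones listed in this section. Any code that is neither skipped nor fatal falls through to the retry behavior described in the next section.

```python
SKIPPABLE_CODES = frozenset({"PERMISSION_DENIED", "FILE_NOT_FOUND", "AMBIGUOUS_PATH"})
FATAL_CODES = frozenset({
    "AMBIGUOUS_PATH",
    "INVALID_PATH_NAME",
    "INVALID_SYMLINK",
    "EXTERNAL_CHECKSUM_MISMATCH",
})


def classify_error(code: str, *, skip_source_errors: bool = False,
                   on_source: bool = True) -> str:
    """Return "skip", "fail", or "retry" for an error code.

    Skipping is checked first, which is why AMBIGUOUS_PATH is only fatal
    when it is not skipped due to skip_source_errors.
    """
    if skip_source_errors and on_source and code in SKIPPABLE_CODES:
        return "skip"
    if code in FATAL_CODES:
        return "fail"
    return "retry"
```

Note that PERMISSION_DENIED and FILE_NOT_FOUND are not fatal, so without skip_source_errors they are retried rather than failing the task immediately.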
4.3. Retry Behavior
All other errors that are not fatal or skipped will cause the task to back off before retrying the operation that hit the error. It is possible for the task to become queued during this time.
The backoff time is based on the number of errors the task has seen in the past 2 hours, starting at 30 seconds and growing exponentially to a maximum of 5 minutes.
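The following sketch shows one plausible shape for such a backoff. Only the 2-hour window, the 30-second starting value, and the 5-minute cap come from the description above; the growth factor of 2 per additional error is an assumption, as is the `backoff_seconds` function itself, so treat the intermediate values as illustrative.

```python
from datetime import datetime, timedelta
from typing import Iterable


def backoff_seconds(error_times: Iterable[datetime],
                    now: datetime,
                    *,
                    base: float = 30.0,       # documented starting backoff
                    cap: float = 300.0,       # documented maximum (5 minutes)
                    factor: float = 2.0,      # ASSUMPTION: doubling per extra error
                    window: timedelta = timedelta(hours=2)) -> float:
    """Return a backoff in seconds based on errors seen within `window` of `now`."""
    recent = sum(1 for t in error_times if timedelta(0) <= now - t <= window)
    if recent == 0:
        return 0.0
    delay = base
    # Grow step by step and stop at the cap, so large error counts cannot overflow.
    for _ in range(recent - 1):
        delay *= factor
        if delay >= cap:
            return cap
    return min(delay, cap)
```

Under these assumptions, the first error in the window yields a 30-second backoff and the cap is reached at the fifth error; errors older than 2 hours no longer contribute.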
If the error never resolves, the task will not be able to make progress and will eventually fail when it reaches its deadline.
The subtasks_retrying field of the task document is deprecated and does not accurately reflect information about a task’s retry behavior.