POSIX Staging Connector
The Globus Connect Server Posix Staging Connector provides access to POSIX file systems that cache data from a tertiary storage system. In most respects, it acts like the POSIX connector, supporting all policies provided by the POSIX connector, but includes additional policies to control data staging.
Globus may interleave staging and transfer operations while processing a transfer task to improve performance.
This document describes the data types supported by this version of the connector.
Posix Staging Connector Virtual Filesystem
The Posix Staging Connector filesystem reflects the file system hierarchy on the data transfer nodes that the collection is visible on. If there are multiple data transfer nodes, they must use a shared file system to provide a coherent view of the file system.
When accessing data on a POSIX collection, if the storage gateway’s restrict_paths property or a mapped collection’s sharing_restrict_paths property is set to disallow all access to a file or directory, those directory entries will not be visible in the collection.
Also, the collection_base_path value is set on collection creation and acts as the root of the collection’s virtual filesystem, similar to a POSIX chroot.
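The chroot-like behavior described above can be pictured with a small path-mapping sketch. This is only an illustration of the idea, not how Globus Connect Server implements it; the function name to_local_path and the example paths are hypothetical.

```python
import posixpath


def to_local_path(collection_base_path, virtual_path):
    """Map a path inside the collection to a path on the data transfer node.

    Illustrative only: the virtual path is normalized as if it were rooted
    at "/", so ".." components cannot climb above the base path.
    """
    relative = posixpath.normpath("/" + virtual_path).lstrip("/")
    if not relative:
        return collection_base_path
    return posixpath.join(collection_base_path, relative)
```

For example, with a base path of /data/proj, the collection path /a/b.txt would correspond to /data/proj/a/b.txt, and the collection root / would correspond to /data/proj itself.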
Stage App
The stage app is used by Globus Connect Server to stage files prior to accessing them via Globus. It is run by the local account of the Globus user who is transferring files.
The stage app can be any script or binary executable, and will be called for each file that is a part of a transfer task. The following variables are defined in the environment when Globus invokes the stage app.
GLOBUS_STAGE_PATH - The full path of the file to stage.
GLOBUS_STAGE_TASKID - The Task ID of the Globus transfer task attempting to transfer the file.
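As a minimal sketch, a stage app written in Python might read these two variables as shown here. The helper name read_stage_request is hypothetical, and raising an error when the path is missing is an assumption made for this example.

```python
import os


def read_stage_request(environ=os.environ):
    """Return (path, task_id) taken from the stage app's environment."""
    path = environ.get("GLOBUS_STAGE_PATH")
    task_id = environ.get("GLOBUS_STAGE_TASKID")
    if not path:
        # Assumption for this sketch: treat a missing path as an error.
        raise ValueError("GLOBUS_STAGE_PATH is not set")
    return path, task_id
```

Passing the environment as a parameter makes the helper easy to exercise outside of Globus with a plain dictionary.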
Stage App Responses
The stage app may return information indicating that the staging request is processed but not completed, completed, or could not be processed.
Success
If the staging is successfully processed, the stage app must exit with status 0 and print one of the following strings to standard output:
resident - The file is resident on disk storage and may be accessed.
archived - The file is still being retrieved.
When the stage app returns archived, Globus will periodically call the stage app with that file path until it returns resident.
Failure
If the stage app returns any other response with a 0 exit code, or exits with a non-zero exit code, Globus Connect Server will trigger a fault in the transfer task.
If the stage app exits with a non-zero value, the data that it writes to standard error (up to KiB) is read and returned as a transfer fault message, which is visible to the user who initiated the transfer. Do not include any confidential information in that response.
If the stage app exits with 0 but without a valid output string, the output and error values are ignored and a generic staging fault is triggered.
Globus will continue to retry the transfer according to its normal retry policies.
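When testing a stage app locally, it can help to model how its result would be interpreted under the rules above. The following sketch is a local test helper, not Globus code; the name classify_result is hypothetical, and ignoring surrounding whitespace in the output is an assumption.

```python
def classify_result(exit_code, stdout, stderr):
    """Classify a stage app run as ("resident" | "archived" | "fault", message)."""
    if exit_code != 0:
        # A non-zero exit reports standard error as the fault message.
        return ("fault", stderr)
    word = stdout.strip()  # assumption: surrounding whitespace is ignored
    if word in ("resident", "archived"):
        return (word, None)
    # Exit 0 without a valid string: output is ignored, generic fault.
    return ("fault", "generic staging fault")
```

A test script could run the stage app with subprocess.run, pass the captured exit code and streams to this helper, and check that each scenario produces the expected outcome.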
Timeout
The stage app should avoid long run times and return as quickly as possible. An app that does not respond within the required time will be terminated, and a timeout fault will be triggered. The current time limit is 45 seconds.
If extended processing is necessary, an app may wish to start subprocesses in the background and immediately return archived. It is important that any subprocesses redirect their standard file descriptors (stdin/stdout/stderr); if they are inherited from the main staging app, the stage request will block until the subprocess completes, possibly resulting in a timeout fault.
Staging Queue
Currently, Globus supports 64 outstanding staging requests per transfer task. Each time the stage app returns resident for a path, Globus will call the stage app with a new file path, until all files for the task are successfully staged and transferred. The policy regarding the maximum number of outstanding staging requests is subject to change.
If a stage app wishes to batch multiple stage requests, it should keep state locally, adding new paths from each stage request while immediately returning archived. When a task has fewer than 64 files, the maximum outstanding requests will have been reached when a duplicate path is seen.
Posix Staging Connector Storage Gateway Policies
The Posix Staging Connector has policies to configure POSIX group-level access controls that complement the user-based access controls in the base storage gateway document. See the storage gateway create reference manual for information about how these policies interact with the storage gateway policies.
The Posix Staging Connector also has policies for configuring the stage app and its environment.
Groups Allow
The groups_allow property is used to allow access to users who are not explicitly allowed or denied by the storage gateway user policy if their account is a member of one of the named POSIX groups.
Groups Deny
The groups_deny property is used to deny access to users who are not explicitly allowed by the storage gateway user policy if their account is a member of one of the named POSIX groups.
The Posix Staging Connector has policies to configure the staging command and the environment in which it runs.
Stage App
The stage_app property is used to specify the path of the program that implements the Stage App interface.