POSIX Staging connector
The Posix Staging Connector provides access to POSIX file systems that cache data from a tertiary storage system. In most respects, it acts like the POSIX connector. It invokes an administrator-specified command to stage data to the cache.
The connector is available as part of a Globus Standard subscription.
This document describes how to install and configure the Posix Staging Connector as well as create a mapped collection. After these configuration steps are completed, any authorized user can access their data using the Globus web application, SDK or CLI, and if policies permit, users can create guest collections to share their data with other users.
The installation must be done by a system administrator, and has the following steps:
- Install a Globus Connect Server version 5 endpoint. Globus Connect Server version 5.4.10 or greater is required to support the Posix Staging Connector, and installation instructions are here. The rest of this document assumes a functional Globus Connect Server version 5 endpoint.
- Write a stage app to initiate data staging.
- Create a storage gateway on the endpoint configured to use the Posix Staging Connector with this stage app.
- Create a Posix Staging mapped collection.
Please contact us at support@globus.org if you have questions or need help with installation or use of the Posix Staging Connector.
The Posix Staging Connector requires a functional Globus Connect Server 5 endpoint. Instructions for installing and configuring an endpoint using Globus Connect Server 5 can be found here. The rest of this document assumes that a functional Globus Connect Server 5 endpoint is being used when configuring the Posix Staging Connector.
Posix Staging Connector Virtual Filesystem
The Posix Staging Connector filesystem reflects the file system hierarchy on the data transfer nodes that the collection is visible on. If there are multiple data transfer nodes, they must use a shared file system to provide a coherent view of the file system.
When accessing data on a POSIX collection, if the storage gateway’s restrict_paths property or a mapped collection’s sharing_restrict_paths property is set to disallow all access to a file or directory, that directory entry will not be visible in the collection.
Also, the collection_base_path value is set on collection creation and acts as the root of the collection’s virtual filesystem, similar to a POSIX chroot.
Stage App
The stage app is used by Globus Connect Server to stage files before they are accessed via Globus. It runs under the local account of the Globus user who is transferring files.
The stage app can be any script or binary executable, and will be called for each file that is a part of a transfer task. The following variables are defined in the environment when Globus invokes the stage app.
- GLOBUS_STAGE_PATH: The full path of the file to stage.
- GLOBUS_STAGE_TASKID: The Task ID of the Globus transfer task attempting to transfer the file.
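As a minimal sketch, a stage app written in Python might read these two variables as follows. The function name read_stage_request is illustrative and not part of Globus Connect Server, and treating a missing path as an error is an assumption.

```python
import os


def read_stage_request(environ=None):
    """Return (path, task_id) taken from the stage app's environment."""
    env = os.environ if environ is None else environ
    path = env.get("GLOBUS_STAGE_PATH")
    task_id = env.get("GLOBUS_STAGE_TASKID")
    if not path:
        # Assumption: treat a missing path as an error rather than guessing.
        raise ValueError("GLOBUS_STAGE_PATH is not set")
    return path, task_id
```

Passing a dictionary instead of relying on os.environ makes the function easy to exercise outside of a transfer.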
Stage App Responses
The stage app's response indicates one of three outcomes: the staging request has completed, the request was processed but has not yet completed, or the request could not be processed.
Success
If the staging request is successfully processed, the stage app must exit with status 0 and print one of the following strings to standard output:
- resident: The file is resident on disk storage and may be accessed.
- archived: The file is still being retrieved.
When the stage app returns archived, Globus will periodically call the stage app with that file path until it returns resident.
Failure
If the stage app returns any other response with a 0 exit code, or exits with a non-zero exit code, Globus Connect Server will trigger a fault in the transfer task.
If the stage app exits with a non-zero value, the data that it writes to standard error (up to KiB) is read and Globus will return that message as a transfer fault, which is visible to the user who initiated the transfer. Do not include any confidential information in that response.
If the stage app exits with 0 but without a valid output string, the output and error values are ignored and a generic staging fault is triggered.
Globus will continue to retry the transfer according to its normal retry policies.
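The success and failure rules above can be sketched as a small Python function. This is an illustration under stated assumptions, not a reference implementation: respond and the check_resident callable are hypothetical names, and how residency is actually determined depends on the site's storage system. The sketch also assumes that a trailing newline after the response string is acceptable.

```python
import sys


def respond(path, check_resident, out=sys.stdout, err=sys.stderr):
    """Write the stage app response for path and return the exit status.

    check_resident is a hypothetical callable supplied by the site. It
    returns True when the file is on disk, False while the file is still
    being recalled, and raises OSError when the request cannot be processed.
    """
    try:
        resident = check_resident(path)
    except OSError as exc:
        # Non-zero exit: this text is shown to the user as a transfer
        # fault, so keep it short and free of confidential details.
        err.write(f"unable to stage file: {exc.strerror or 'I/O error'}\n")
        return 1
    # Exit status 0 must be paired with exactly one of these two strings.
    out.write("resident\n" if resident else "archived\n")
    return 0
```

A stage app built on this sketch would pass the value of GLOBUS_STAGE_PATH and its own residency check to respond, then exit with the returned status.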
Timeout
The stage app should avoid long run times and return as quickly as possible. An app that does not respond within the required time will be terminated, and a timeout fault will be triggered. The current time limit is 45 seconds.
If extended processing is necessary, an app may wish to start subprocesses in the background and immediately return archived. It is important that any subprocesses redirect their standard file descriptors (stdin/stdout/stderr); if they are inherited from the main stage app, the stage request will block until the subprocess completes, possibly resulting in a timeout fault.
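One way to start such a background process from a Python stage app is sketched below. The helper name start_background_recall is hypothetical, and the command it runs stands in for whatever recall tool the site uses. The point of the sketch is that all three standard file descriptors are redirected so nothing is inherited from the stage app.

```python
import subprocess


def start_background_recall(command):
    """Start command in the background, detached from the stage app's stdio."""
    return subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,   # do not inherit the stage app's stdin
        stdout=subprocess.DEVNULL,  # do not hold the stage app's stdout open
        stderr=subprocess.DEVNULL,  # do not hold the stage app's stderr open
        start_new_session=True,     # run in its own session and process group
    )
```

After calling this helper, the stage app can print archived and exit with status 0 without waiting for the recall to finish.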
Staging Queue
Currently, Globus supports 64 outstanding staging requests per transfer task. Each time the stage app returns resident for a path, Globus will call the stage app with a new file path, until all files for the task are successfully staged and transferred. The policy regarding the maximum number of outstanding staging requests is subject to change.
If a stage app wishes to batch multiple stage requests, it should keep state locally, adding new paths from each stage request while immediately returning archived. When a task has fewer than 64 files, the stage app can tell that the maximum number of outstanding requests has been reached when it sees a duplicate path.
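The local state described above could be kept in one file per task, as in the following sketch. Everything here is an assumption made for illustration: the name record_request, the state directory layout, the use of the task ID as a file name, and the one-path-per-line format (which does not handle paths containing newlines). A file lock is taken because the stage app may be invoked concurrently.

```python
import fcntl
import os


def record_request(state_dir, task_id, path):
    """Add path to the task's pending list.

    Returns True if the path was already recorded (a duplicate request),
    and False if it was newly added.
    """
    os.makedirs(state_dir, exist_ok=True)
    queue_file = os.path.join(state_dir, f"{task_id}.pending")
    with open(queue_file, "a+") as f:
        # Serialize concurrent stage app invocations for the same task.
        fcntl.flock(f, fcntl.LOCK_EX)
        f.seek(0)
        pending = f.read().splitlines()
        if path in pending:
            return True
        f.write(path + "\n")  # append mode always writes at end of file
        return False
```

A stage app could submit the accumulated paths to the storage system as one batch the first time record_request returns True for a task, and return archived in every case until the files are on disk.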
Storage Gateway
Before reading below about the policy options specific to the Posix Staging Connector, please familiarize yourself with the Globus Connect Server v5 Data Access Guide, which describes the steps to create and update a storage gateway and set storage gateway policies.
Posix Staging Connector Storage Gateway Policies
The Posix Staging Connector has policies to configure POSIX group-level access controls that complement the user-based access controls described in the base storage gateway document. See the storage gateway create reference manual for information about how these policies interact with the storage gateway policies.
The Posix Staging Connector also has policies for configuring the stage app and its environment.
Groups Allow
The --posix-group-allow command-line option allows access for users who are not explicitly allowed or denied by the storage gateway user policy, provided that their account is a member of one of the named POSIX groups.
For our example, we’ll allow accounts that are members of the GROUP_ALLOW_NAME group to have access to the storage gateway.
--posix-group-allow GROUP_ALLOW_NAME
Groups Deny
The --posix-group-deny command-line option denies access to users who are not explicitly allowed or denied by the storage gateway user policy if their account is a member of one of the named POSIX groups.
For our example, we’ll deny access to the storage gateway for accounts that are members of the GROUP_DENY_NAME group.
--posix-group-deny GROUP_DENY_NAME
The Posix Staging Connector has policies to configure the staging command and the environment in which it runs.
Stage App
The --posix-stage-app command-line option is used to specify the path of the program that implements the Stage App interface.
For our example, we’ll use the stage app /usr/local/bin/globus-stage-data. This application must be installed on each data transfer node.
--posix-stage-app /usr/local/bin/globus-stage-data
Stage App Environment Variables
The Posix Staging Connector allows the administrator to set additional environment variables in the environment of the stage app. These are specified by the --posix-staging-environment command-line option.
For our example, we’ll add the variable VAR=VALUE to the stage app’s environment.
We’ll use this in the storage gateway create command-line options.
--posix-staging-environment VAR=VALUE
Creating the Storage Gateway
A Posix Staging Connector Storage Gateway is created with the command globus-connect-server storage-gateway create posix-staging, and can be updated with the command globus-connect-server storage-gateway update posix-staging.
Now that you understand the policies specific to the Posix Staging Connector, you can create the storage gateway.
% globus-connect-server storage-gateway create posix-staging \
    "Posix-Staging Storage Gateway" \
    --domain example.org \
    --posix-group-allow GROUP_ALLOW_NAME \
    --posix-group-deny GROUP_DENY_NAME \
    --posix-stage-app /usr/local/bin/globus-stage-data \
    --posix-staging-environment VAR=VALUE
Storage Gateway Created: 7187a9a0-68e4-48ea-b3b9-7fd06630f8ab
If storage gateway creation is successful, the unique ID of the new storage gateway (7187a9a0-68e4-48ea-b3b9-7fd06630f8ab in this case) will be displayed for your reference.
If you forget the ID of a storage gateway, you can use the command globus-connect-server storage-gateway list to retrieve the IDs of the storage gateways on the endpoint.
You can also add other policies to configure additional identity mapping and path restriction policies as described in the Globus Connect Server v5 Data Access Guide.
Note that you have created the storage gateway, but it is not yet accessible via Globus or HTTPS. To enable data access, you will need to create a mapped collection by following the instructions in the Collections section of the data access guide.
Appendix A: Document Types for the Posix Staging Connector
PosixStagingStoragePolicies Document
Connector-specific storage gateway policies for the POSIX Staging connector
One of the following schemas:
{
  "DATA_TYPE": "posix_staging_storage_policies#1.0.0",
  "environment": [
    {
      "name": "string",
      "value": "string"
    }
  ],
  "groups_allow": [
    "string"
  ],
  "groups_deny": [
    "string"
  ],
  "stage_app": "string"
}
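For illustration, a document matching this schema could be assembled in Python as shown below. The helper name posix_staging_policies is hypothetical, and the values are the placeholder examples used earlier in this guide. The sketch only builds and prints the JSON; it does not submit it to Globus Connect Server.

```python
import json


def posix_staging_policies(stage_app, environment=None, groups_allow=(), groups_deny=()):
    """Build a posix_staging_storage_policies#1.0.0 document as a dict."""
    return {
        "DATA_TYPE": "posix_staging_storage_policies#1.0.0",
        # Each environment variable becomes a {"name", "value"} object.
        "environment": [
            {"name": name, "value": value}
            for name, value in (environment or {}).items()
        ],
        "groups_allow": list(groups_allow),
        "groups_deny": list(groups_deny),
        "stage_app": stage_app,
    }


doc = posix_staging_policies(
    "/usr/local/bin/globus-stage-data",
    environment={"VAR": "VALUE"},
    groups_allow=["GROUP_ALLOW_NAME"],
)
print(json.dumps(doc, indent=2))
```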