Last Updated: November 11, 2020


The Posix Staging Connector provides access to POSIX file systems that cache data from a tertiary storage system. In most respects it acts like the POSIX connector, except that it invokes an administrator-specified command to stage data to the cache before the data is accessed.

The connector is available as part of a Globus Standard subscription.

This document describes how to install and configure the Posix Staging Connector and how to create a mapped collection. After these configuration steps are completed, any authorized user can access their data using the Globus web application, SDK, or CLI. If policies permit, users can also create guest collections to share their data with other users.

The installation must be done by a system administrator, and has the following steps:

  1. Install a Globus Connect Server version 5 endpoint. Globus Connect Server version 5.4.10 or greater is required to support the Posix Staging Connector, and installation instructions are here. The rest of this document assumes a functional Globus Connect Server version 5 endpoint.

  2. Write a stage app to initiate data staging.

  3. Create a storage gateway on the endpoint configured to use the Posix Staging Connector with this staging command.

  4. Create a Posix Staging mapped collection.

Please contact us at support@globus.org if you have questions or need help with installation or use of the Posix Staging Connector.


The Posix Staging Connector requires a functional Globus Connect Server 5 endpoint. Instructions for installing and configuring an endpoint using Globus Connect Server 5 can be found here. The rest of this document assumes that a functional Globus Connect Server 5 endpoint is being used when configuring the Posix Staging Connector.

Posix Staging Connector Virtual Filesystem

The Posix Staging Connector filesystem reflects the file system hierarchy on the data transfer nodes that the collection is visible on. If there are multiple data transfer nodes, they must use a shared file system to provide a coherent view of the file system.

When accessing data on a POSIX collection, if the storage gateway’s restrict_paths property or a mapped collection’s sharing_restrict_paths property is set to disallow all access to a file or directory, those directory entries will not be visible in the collection.

Also, the collection_base_path value is set on collection creation and acts as the root of the collection’s virtual filesystem, similar to a POSIX chroot.
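The chroot-like behavior described above can be illustrated with a minimal sketch. It assumes a purely lexical mapping and ignores symbolic links; the function name to_filesystem_path is hypothetical and is not part of Globus Connect Server.

```python
import posixpath


def to_filesystem_path(collection_base_path, collection_path):
    """Map a path inside the collection to a path on the data transfer node.

    Illustration only: this is a lexical mapping (an assumption) and does
    not resolve symbolic links.
    """
    # Normalize against a virtual root so ".." cannot climb above it.
    virtual = posixpath.normpath("/" + collection_path.lstrip("/"))
    relative = virtual.lstrip("/")
    base = collection_base_path.rstrip("/") or "/"
    return posixpath.join(base, relative) if relative else base
```

With a collection_base_path of /data/cache, the collection path /projects/a.dat would correspond to /data/cache/projects/a.dat under these assumptions, and a path such as /../etc/passwd stays inside the base directory.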

Stage App

The stage app is used by Globus Connect Server to stage files prior to accessing them via Globus. It is run by the local account of the Globus user who is transferring files.

The stage app can be any script or binary executable, and will be called for each file that is a part of a transfer task. The following variables are defined in the environment when Globus invokes the stage app.

GLOBUS_STAGE_PATH

The full path of the file to stage.

GLOBUS_STAGE_TASKID

The Task ID of the Globus transfer task attempting to transfer the file.
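As a rough sketch, a stage app written in Python might read the two documented variables as follows. The helper name read_stage_request is an assumption made for illustration, as is the choice to treat a missing path as an error.

```python
import os


def read_stage_request(environ=os.environ):
    """Return (path, task_id) taken from the documented environment variables."""
    path = environ.get("GLOBUS_STAGE_PATH")
    task_id = environ.get("GLOBUS_STAGE_TASKID")
    if not path:
        # Assumption: a stage app cannot do anything useful without a path.
        raise ValueError("GLOBUS_STAGE_PATH is not set")
    return path, task_id
```

Passing the environment as a parameter makes the function easy to exercise outside of a transfer, for example with a plain dictionary.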

Stage App Responses

The stage app’s response indicates one of three outcomes: the staging request has completed, the request has been processed but staging is not yet complete, or the request could not be processed.

Success

If the staging is successfully processed, the stage app must exit with status 0 and print one of the following strings to standard output:

resident

The file is resident on disk storage and may be accessed.

archived

The file is still being retrieved.

When the stage app returns archived, Globus will periodically call the stage app with that file path until it returns resident.
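The success responses can be sketched as a small Python stage app. How a site decides whether a file is resident depends on its storage system, so the is_resident callable here is a hypothetical placeholder supplied by the administrator; the function names are likewise assumptions.

```python
import sys


def stage_status(path, is_resident):
    """Return the status string for path.

    is_resident is a hypothetical, site-specific callable that reports
    whether the file is already on disk storage.
    """
    return "resident" if is_resident(path) else "archived"


def main(environ, is_resident, out=sys.stdout):
    """Print the status for the requested file and return exit status 0."""
    print(stage_status(environ["GLOBUS_STAGE_PATH"], is_resident), file=out)
    return 0

# A real stage app would end with something like:
#   sys.exit(main(os.environ, my_site_check))
```

Keeping the decision in stage_status separate from the printing makes it straightforward to test the logic without running a transfer.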

Failure

If the stage app returns any other response with a 0 exit code, or exits with a non-zero exit code, Globus Connect Server will trigger a fault in the transfer task.

If the stage app exits with a non-0 value, the data that it writes to standard error (up to KiB) is read and Globus will return that message as a transfer fault, which is visible to the user who initiated the transfer. Do not include any confidential information in that response.

If the stage app exits with 0 but without a valid output string, the output and error values are ignored and a generic staging fault is triggered.

Globus will continue to retry the transfer according to its normal retry policies.
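When testing a stage app locally, it can help to check its result against the rules above. The following sketch is only a reading of those rules, not the connector’s implementation: the function name and the "fault" label are invented for illustration, and stripping surrounding whitespace from the output is an assumption.

```python
def classify_stage_result(exit_code, stdout, stderr):
    """Return (outcome, message) for one stage app run.

    outcome is "resident", "archived", or "fault" (a label chosen here).
    """
    if exit_code != 0:
        # Non-zero exit: standard error becomes the user-visible message.
        return ("fault", stderr.strip())
    answer = stdout.strip()  # assumption: surrounding whitespace is ignored
    if answer in ("resident", "archived"):
        return (answer, "")
    # Exit status 0 without a valid string: output and error are ignored.
    return ("fault", "generic staging fault")
```

For instance, an exit status of 0 with the output "done" would be classified as a generic fault, while an exit status of 1 would carry whatever the app wrote to standard error.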

Timeout

The stage app should avoid long run times and return as quickly as possible. An app that does not respond within the required time will be terminated, and a timeout fault will be triggered. The current time limit is 45 seconds.

If extended processing is necessary, an app may wish to start subprocesses in the background and immediately return archived. It is important that any subprocesses redirect their standard file descriptors (stdout/stderr/stdin); if they are inherited from the main staging app, the stage request will block until the subprocess completes, possibly resulting in a timeout fault.
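One way to follow this advice in Python is sketched below. The command passed in argv stands for a hypothetical site-specific recall tool; the function names are assumptions. The important part is that all three standard file descriptors are redirected so the stage app can return immediately.

```python
import subprocess
import sys


def start_background_stage(argv):
    """Launch argv without sharing the stage app's standard file descriptors."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # run in its own session, apart from the stage app
    )


def main(argv, out=sys.stdout):
    """Start the long-running work, report archived, and return right away."""
    start_background_stage(argv)
    print("archived", file=out)
    return 0
```

Because the child process does not inherit the stage app’s stdout or stderr, the stage request should not block while the background work continues.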

Staging Queue

Currently, Globus supports 64 outstanding staging requests per transfer task. Each time the stage app returns resident for a path, Globus will call the stage app with a new file path, until all files for the task are successfully staged and transferred. The policy regarding the maximum number of outstanding staging requests is subject to change.

If a stage app wishes to batch multiple stage requests, it should keep state locally, adding new paths from each stage request while immediately returning archived. When a task has fewer than 64 files, the maximum outstanding requests will have been reached when a duplicate path is seen.
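The local state described above could be kept in a per-task file, as in this minimal sketch. The state directory layout, the file naming, and the function name are all assumptions, and the code assumes paths do not contain newline characters.

```python
import fcntl
import os


def record_stage_request(state_dir, task_id, path):
    """Add path to the task's pending list.

    Returns True if the path is new, or False if it was already recorded,
    which suggests Globus is asking again about an earlier request.
    """
    os.makedirs(state_dir, exist_ok=True)
    queue_file = os.path.join(state_dir, f"{task_id}.paths")
    with open(queue_file, "a+", encoding="utf-8") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # serialize concurrent stage app runs
        handle.seek(0)
        seen = handle.read().splitlines()
        if path in seen:
            return False
        handle.write(path + "\n")  # append mode always writes at the end
        return True
```

A stage app using this helper would print archived after each call and submit the accumulated paths as one batch once a duplicate is seen or enough paths have been collected.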

Storage Gateway

Before reading below about the policy options specific to the Posix Staging Connector, please familiarize yourself with the Globus Connect Server v5 Data Access Guide, which describes the steps to create and update a storage gateway and set storage gateway policies.

Posix Staging Connector Storage Gateway Policies

The Posix Staging Connector has policies to configure POSIX group-level access controls that complement the user-based access controls in the base storage gateway document. See the storage gateway create reference manual for information about how these policies interact with the storage gateway policies.

The Posix Staging Connector also has policies for configuring the stage app and its environment.

Groups Allow

The --posix-group-allow command-line option allows access for users who are not explicitly allowed or denied by the storage gateway user policy, provided their account is a member of one of the named POSIX groups.

Example 1. Allowing members of a group

For our example, we’ll allow accounts that are members of the GROUP_ALLOW_NAME group to have access to the storage gateway.

--posix-group-allow GROUP_ALLOW_NAME

Groups Deny

The --posix-group-deny command-line option denies access to users who are not explicitly allowed or denied by the storage gateway user policy if their account is a member of one of the named POSIX groups.

Example 2. Denying members of a group

For our example, we’ll deny access to the storage gateway for accounts that are members of the GROUP_DENY_NAME group.

--posix-group-deny GROUP_DENY_NAME

The Posix Staging Connector has policies to configure the staging command and the environment in which it runs.

Stage App

The --posix-stage-app command-line option is used to specify the path of the program that implements the Stage App interface.

Example 3. Specifying the stage app

For our example, we’ll use the stage app /usr/local/bin/globus-stage-data. This application must be installed on each data transfer node.

--posix-stage-app /usr/local/bin/globus-stage-data

Stage App Environment Variables

The Posix Staging Connector allows the administrator to set additional environment variables in the environment of the stage app. These are specified by the --posix-staging-environment command-line option.

Example 4. Set Stage App Environment

For our example, we’ll add the variable VAR with the value VALUE to the stage app’s environment. We’ll use this option in the storage gateway create command-line options.

--posix-staging-environment VAR=VALUE

Creating the Storage Gateway

A Posix Staging Connector Storage Gateway is created with the command globus-connect-server storage-gateway create posix-staging, and can be updated with the command globus-connect-server storage-gateway update posix-staging.

Now that you understand the policies specific to the Posix Staging Connector, you can create the storage gateway.

% globus-connect-server storage-gateway create posix-staging \
    "Posix-Staging Storage Gateway" \
    --domain example.org \
    --posix-group-allow GROUP_ALLOW_NAME \
    --posix-group-deny GROUP_DENY_NAME \
    --posix-stage-app /usr/local/bin/globus-stage-data \
    --posix-staging-environment VAR=VALUE

Storage Gateway Created: 7187a9a0-68e4-48ea-b3b9-7fd06630f8ab

If storage gateway creation is successful, the unique ID of the new storage gateway (7187a9a0-68e4-48ea-b3b9-7fd06630f8ab in this case) will be displayed for your reference. If you forget the ID of a storage gateway, you can use the command globus-connect-server storage-gateway list to retrieve the IDs of the storage gateways on the endpoint.

You can also add other policies to configure additional identity mapping and path restriction policies as described in the Globus Connect Server v5 Data Access Guide.

Note that you have created the storage gateway, but it is not yet accessible via Globus or HTTPS. To enable data access, you will need to create a mapped collection by following the instructions in the Collections section of the data access guide.

Appendix A: Document Types for the Posix Staging Connector

PosixStagingStoragePolicies 1.0.0 Document

This document contains version 1.0.0 of the POSIX staging storage gateway policies with an explicit DATA_TYPE value.

| Name | Type | Description |
|---|---|---|
| DATA_TYPE | string (posix_staging_storage_policies#1.0.0) | Type of this document |
| groups_allow | array (string) | List of POSIX group IDs allowed to access this Storage Gateway. [Private] |
| groups_deny | array (string) | List of POSIX group IDs denied access to this Storage Gateway. [Private] |
| stage_app | string | Path to the stage app. [Private] |
| environment | array (object) | Variables to set in the environment when executing the stage_app. [Private] |

{
  "DATA_TYPE": "posix_staging_storage_policies#1.0.0",
  "groups_allow": [
    "globus"
  ],
  "groups_deny": [
    "nonglobus"
  ],
  "stage_app": "/usr/local/bin/globus-stage-data",
  "environment": [
    {
      "name": "VOLUME",
      "value": "/vol/0"
    }
  ]
}
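For administrators who generate this document programmatically, the following sketch builds a dictionary with the same fields as the example above. The function name build_policies and its parameters are hypothetical; only the field names and the DATA_TYPE value come from this appendix.

```python
import json


def build_policies(stage_app, groups_allow=(), groups_deny=(), environment=None):
    """Return a PosixStagingStoragePolicies 1.0.0 document as a dict."""
    return {
        "DATA_TYPE": "posix_staging_storage_policies#1.0.0",
        "groups_allow": list(groups_allow),
        "groups_deny": list(groups_deny),
        "stage_app": stage_app,
        # Each variable becomes an object with "name" and "value" keys.
        "environment": [
            {"name": name, "value": value}
            for name, value in (environment or {}).items()
        ],
    }
```

Calling json.dumps(build_policies("/usr/local/bin/globus-stage-data", ["globus"], ["nonglobus"], {"VOLUME": "/vol/0"}), indent=2) would produce a document equivalent to the example shown above.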