Last Updated: September 3, 2018

1. Introduction

This installation guide provides an overview of Globus Connect Server version 5 for system administrators who will install and operate this version of Globus Connect Server.

Globus Connect Server version 5 is the next evolution of the server. It provides new capabilities and enhancements for both administrators and users, as well as platform features for building data management solutions.

Important: Globus Connect Server version 5 does not yet have all of the features available in version 4. Until it does, we are offering a series of point releases (e.g., version 5.2), each release adding incrementally more capabilities. At this time, there is no path to upgrade or migrate from version 4 to version 5. When version 5 has feature parity with version 4, we will provide an upgrade mechanism and instructions. In the meantime, Globus Connect Server version 5 releases are in limited production and are intended for organizations that need to use specific features that aren’t available in version 4. Others should continue to use Globus Connect Server version 4 until version 5 is fully featured and ready for broad deployment.

The latest version, 5.2, supports the following features:

  • Google Drive storage

  • POSIX storage

  • AWS S3 storage (new in version 5.2.49+)

  • Ceph storage (new in version 5.2.49+)

  • Guest collections (equivalent to v4 shared endpoints)

  • Mapped collections (need local account for data access)

  • High assurance data handling

  • HTTPS access to data - for direct access from browsers and other HTTPS clients

  • GridFTP access to data - for reliable, bulk data transfer via the Globus transfer service

2. Globus Connect Server Version 5 Terminology

The Globus Connect Server architecture has evolved to support several new capabilities. This section provides an overview of the components in Globus Connect Server version 5 and how they relate to version 4 components.

  • Endpoint (changed from version 4): The endpoint is a deployment of Globus Connect Server version 5. A single endpoint may optionally include multiple data transfer nodes (DTNs). (Multiple DTNs are not supported in versions 5.1 or 5.2.) The endpoint provides the interface for server management and configuration.

  • Storage connector: A storage connector allows the endpoint to use a particular type of storage (e.g., a POSIX file system or Google Drive). You may configure multiple storage connectors for a single endpoint, allowing simultaneous access to all connectors.

  • Storage gateway: Storage gateways provide the storage access policies for the endpoint’s connected storage systems. A storage gateway is a named, discoverable interface by which authorized users can create and manage collections on a connected storage system. A connected storage system may have multiple storage gateways.

  • Collection: Collections provide the data access interfaces for an endpoint. In version 4, these are called "endpoints." In Globus Connect Server version 5, a collection is a named set of files (or blobs), hierarchically organized in folders, associated with a specific storage gateway. Collections can be accessed via HTTPS (client/server access), GridFTP (asynchronous bulk transfer), and REST API (for advanced operations). Access to a collection is authenticated with Globus Auth-issued OAuth2 access tokens, with data access policies defined in the collection itself. Globus Connect Server version 5 supports two types of collections:

    • Mapped collection: Each user accessing the collection must have a local account on the storage system. Their Globus identity is mapped to their local account. In version 4, these are called “host endpoints.”

    • Guest collection: Users can access the collection without a local account on the storage system. Access is based on permissions granted by an authorized user via Globus. In version 4, these are called “shared endpoints.”

[Figure: Globus Connect Server version 5.1]

With the above architecture, Globus Connect Server version 5 supports many new features including:

  • Multiple storage types connected to the same endpoint

  • Multiple storage gateways against the same storage type

  • Clear separation between management and configuration, and data access interfaces

2.1. Installation summary

In this section, we summarize the steps for creating an endpoint and making it accessible to users with Globus Connect Server version 5.

  1. The server administrator installs the Globus Connect Server version 5 software and uses it to create the endpoint. The endpoint includes the configuration for the server and its network use.

  2. The administrator registers the endpoint with Globus so that Globus can be used to secure access to the endpoint.

  3. The administrator creates one or more storage gateways (see terminology above) to define access policies for the endpoint’s storage.

  4. The administrator may also create mapped collections that allow data access by users who have local accounts.

With the above in place, authorized users interact with the endpoint as follows.

  1. Discover storage gateways and create new guest collections as allowed by storage gateway policies.

  2. Access data on existing collections using the GridFTP and/or HTTPS protocols. They may use a web browser (for HTTPS links), the Globus Web app, the Globus command-line interface (CLI), the Globus software development kit (SDK), or the Globus REST APIs.

Note: Globus Connect Server version 5.2 only supports use of a single server (data transfer node) and should be used only for high assurance data access. Subsequent releases will support multiple servers (data transfer nodes).

3. Prerequisites

Important: The prerequisites listed in this section must be met before you begin to install Globus Connect Server version 5 on your system. Contact us if you have any questions regarding the prerequisites.

3.1. Supported Linux distributions

Globus Connect Server version 5 is currently supported on the following Linux distributions:

  • CentOS 7

  • Red Hat Enterprise Linux 7

  • Ubuntu 16.04 LTS

  • Debian 9

Note: Globus Connect Server version 5.2 cannot be run on the same machine as Globus Connect Server version 4.

3.2. Administrator privileges

You must have administrator (root) privileges on your system to install Globus Connect Server version 5; sudo can be used to perform the installation.

3.3. Globus subscription

You must have a Globus subscription to install Globus Connect Server version 5.2 because this version only supports premium features.

3.4. System time synchronization

Your system must be running ntpd or another daemon for synchronizing with standard time servers.
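
One way to confirm this is to look for a known time daemon in the process list. The following is a minimal sketch, not an official check: the helper name find_time_daemon is hypothetical, and it only recognizes ntpd and chronyd, so any other synchronization daemon would need to be added to the match.

```shell
# Hypothetical helper: read process names (one per line) on stdin and
# print the first recognized time-synchronization daemon.
# Returns non-zero if none is found.
find_time_daemon() {
    awk '$1 == "ntpd" || $1 == "chronyd" { print $1; found = 1; exit }
         END { exit !found }'
}

# Typical use on the server:
#   ps -e -o comm= | find_time_daemon || echo "no time daemon found"
```

A daemon that is running is not necessarily synchronized, so treat a match as a first check only.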

3.5. Internet-accessible system

Other hosts on the Internet must be able to initiate connections to the system where you will be installing Globus Connect Server version 5. If your system is behind a network address translation (NAT) firewall/router, you may not be able to use the default configuration to install Globus. Please see the NAT/Firewall configuration instructions in the globus-connect-server-setup Command Reference. Otherwise, perform the checks shown below to confirm that your system meets the default accessibility requirements. If you are installing on an Amazon EC2 instance, you can skip ahead to the Open TCP ports section.

If you run into problems, your network administrator may be able to help, or you can contact us.

3.5.1. Check local hostname

Execute this command on the system where you plan to install Globus Connect Server version 5:

$ hostname -f

Confirm that a fully qualified domain name (FQDN) is returned (e.g., 'ep1.transfer.globus.org').
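
If you want to script this check, a syntactic test along the following lines may help. It is a sketch under stated assumptions: the helper name is_fqdn is hypothetical, and the pattern only checks that the name has at least two dot-separated labels made of letters, digits, and hyphens. It does not query DNS.

```shell
# Hypothetical helper: succeed only if the argument looks like a fully
# qualified domain name (two or more dot-separated, non-empty labels).
# This is a syntactic check only; it does not query DNS.
is_fqdn() {
    printf '%s\n' "$1" | grep -E -q \
        '^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?)+$'
}

# Typical use on the server:
#   is_fqdn "$(hostname -f)" || echo "hostname -f did not return an FQDN"
```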

3.5.2. Check external DNS resolution

Use a public DNS server operated by a different organization to verify that the returned FQDN is publicly resolvable. More concretely, you can use nslookup to check that your server’s FQDN resolves against one of Google’s public DNS servers:

$ nslookup 'ep1.transfer.globus.org' 8.8.4.4

If you get a message of the form "server can’t find ep1.transfer.globus.org: NXDOMAIN", your system’s hostname is not resolvable via public DNS and you need to address the issue before continuing with the installation.
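
The NXDOMAIN check can also be automated. The sketch below assumes only what is described above, namely that a failed lookup includes the text "NXDOMAIN"; the helper name publicly_resolvable is hypothetical. It does not detect other failures such as timeouts, so read the full nslookup output if the result is unexpected.

```shell
# Hypothetical helper: read nslookup output on stdin and fail if it
# reports NXDOMAIN (the name does not exist in public DNS).
publicly_resolvable() {
    ! grep -q 'NXDOMAIN'
}

# Typical use (8.8.4.4 is the public DNS server used in the example above):
#   nslookup ep1.transfer.globus.org 8.8.4.4 | publicly_resolvable \
#       || echo "hostname is not resolvable via public DNS"
```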

3.6. Open TCP ports

If your system is behind a firewall, several TCP ports must be opened for Globus to work. You may need to coordinate with your network or security administrator to open the ports.

The TCP ports that must be open for the default Globus Connect Server version 5 installation are as follows.

  • Ports 50000-51000 inbound and outbound to/from ANY

    • Used for GridFTP data channel traffic.

    • The use of the default port range is strongly recommended (you can read why here).

    • Data channel traffic is sent directly between endpoints—it is not relayed by the Globus service.

  • Port 443 inbound from ANY

    • Used by the Globus Connect Server version 5 Manager Service.

    • Used for GridFTP control channel traffic.

    • Used for HTTPS access to collections.

  • Port 443 outbound to ANY

    • Used to communicate with the Globus service via its REST API.

    • Used to communicate with Google Drive servers.

    • Used to pull Globus Connect Server version 5 packages from the Globus repository.
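
If the server runs a host firewall, the inbound ports listed above could be opened with commands like those printed by the sketch below. This is an illustration, not part of the Globus tooling: the helper name print_firewall_commands is hypothetical, it assumes the host firewall is managed by firewalld or ufw, and it covers inbound rules only (both tools normally allow outbound traffic by default). A separate site firewall still needs to be configured by your network administrator.

```shell
# Hypothetical helper: print (but do not run) firewall commands that open
# the default inbound ports listed above. Review the output, then run the
# commands as root.
# Argument: "firewalld" or "ufw".
print_firewall_commands() {
    case "$1" in
        firewalld)
            echo "firewall-cmd --permanent --add-port=443/tcp"
            echo "firewall-cmd --permanent --add-port=50000-51000/tcp"
            echo "firewall-cmd --reload"
            ;;
        ufw)
            echo "ufw allow 443/tcp"
            echo "ufw allow 50000:51000/tcp"
            ;;
        *)
            echo "unsupported firewall tool: $1" >&2
            return 1
            ;;
    esac
}

print_firewall_commands firewalld
```

Printing the commands instead of running them keeps the sketch safe to try on any machine.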

4. Installation and setup

This section covers the installation and setup of Globus Connect Server version 5, including endpoint configuration, a basic POSIX storage gateway, and an initial guest collection. As we walk through each part of this installation, links to alternate configurations (e.g., Google Drive storage gateway) will be provided. You can fine-tune this configuration to your specific needs later without doing a reinstallation.

Before continuing, please confirm that the prerequisites detailed in the previous section have been met.

4.1. Install Globus Connect Server version 5 software

Skip to the appropriate section for your Linux distribution and follow the instructions to install Globus Connect Server version 5 on your system.
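
If you are unsure which section applies, the ID field in /etc/os-release identifies the distribution. The following sketch maps that field to a package family; the helper name package_family is hypothetical, and it only knows the four distributions listed in the prerequisites. It does not check the distribution version.

```shell
# Hypothetical helper: report which package family applies, based on the
# ID field of an os-release file (default: /etc/os-release).
package_family() {
    # Run in a subshell so sourcing the file does not leak variables.
    (
        ID=""
        . "${1:-/etc/os-release}" || exit 1
        case "$ID" in
            centos|rhel)   echo "rpm" ;;
            ubuntu|debian) echo "deb" ;;
            *)             echo "unsupported"; exit 1 ;;
        esac
    )
}

# Typical use on the server:
#   package_family    # "rpm": see the CentOS/RHEL steps; "deb": Ubuntu or Debian
```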

4.1.1. CentOS and Red Hat Enterprise Linux

$ wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
$ sudo yum install epel-release-latest-7.noarch.rpm
$ sudo yum install http://downloads.globus.org/toolkit/gt6/stable/installers/repo/rpm/globus-toolkit-repo-latest.noarch.rpm
$ sudo yum-config-manager --enable Globus-Connect-Server-5-Stable
$ sudo yum-config-manager --enable Globus-Toolkit-6-Stable
$ sudo yum install yum-plugin-priorities
$ sudo yum install globus-connect-server52

4.1.2. Ubuntu

$ sudo curl -LOs http://downloads.globus.org/toolkit/gt6/stable/installers/repo/deb/globus-toolkit-repo_latest_all.deb
$ sudo dpkg -i globus-toolkit-repo_latest_all.deb
$ sudo sed -i /etc/apt/sources.list.d/globus-toolkit-6-stable*.list \
        -e 's/^# deb /deb /'
$ sudo sed -i /etc/apt/sources.list.d/globus-connect-server-stable*.list \
        -e 's/^# deb /deb /'
$ sudo apt-get install software-properties-common
$ sudo add-apt-repository ppa:certbot/certbot
$ sudo apt-get update
$ sudo apt-get install globus-connect-server52

4.1.3. Debian

$ echo 'deb http://ftp.debian.org/debian stretch-backports main' \
        | sudo tee /etc/apt/sources.list.d/stretch-backports.list
$ sudo curl -LOs http://downloads.globus.org/toolkit/gt6/stable/installers/repo/deb/globus-toolkit-repo_latest_all.deb
$ sudo dpkg -i globus-toolkit-repo_latest_all.deb
$ sudo sed -i /etc/apt/sources.list.d/globus-toolkit-6-stable*.list \
        -e 's/^# deb /deb /'
$ sudo sed -i /etc/apt/sources.list.d/globus-connect-server-stable*.list \
        -e 's/^# deb /deb /'
$ sudo apt-get update
$ sudo apt-get install globus-connect-server52

4.2. Create the endpoint

With the Globus Connect Server version 5 software installed on your server, the next step is to establish the endpoint on your system. You will register the endpoint to get credentials to access Globus services, edit the globus-connect-server.conf config file, then run the globus-connect-server-setup command to create the endpoint. After the endpoint is created, your institution’s subscription manager must set the endpoint as managed by your High Assurance or HIPAA BAA subscription.

4.2.1. Register the endpoint and obtain credentials

The first step in establishing your endpoint is to register it with Globus and obtain credentials for the server itself. These credentials allow the endpoint to securely identify itself to—and interact with—Globus services.

  1. Log into the Globus Developers Console.

  2. If this is your first time using the Developers Console, you’ll be asked to create a project before proceeding. Otherwise, click "Add another project."

  3. Fill out the form to create your project. This project will be used to track your Globus Connect Server registrations. Keep it separate from any other projects you might have. Pay special attention to the Project Admins section. You must authenticate to one of the checked identities (or any you add later) in order to manage your project once it is created.

  4. You may have to log out and then return and log in with a specific authorized identity to manage the project after you create it. (This is part of the high-assurance features.)

  5. Use the "Add…" menu to add other appropriate users in your organization as administrators of the project. Adding other administrators helps your organization avoid losing administrative control should any one administrator leave your organization.

  6. From the "Add…" menu for the project click "Add a new Globus Connect Server" and fill out the form. The display name will be used to identify this endpoint to users when they access it for the first time. Use the same name here that you plan to use in later steps so your users will have a consistent experience.

  7. Click "Generate a New Client Secret" and fill out the form.

  8. Save the Client ID and Client Secret values. You will use them soon when editing your globus-connect-server.conf file.

Note: Each new endpoint requires a new Globus Connect Server version 5 registration with its own Client ID and Client Secret. These registrations can be within the same project.

4.2.2. Configure the endpoint

To configure your new endpoint, edit the /etc/globus-connect-server.conf file as follows.

  1. Set the [Globus].ClientId value to your endpoint’s Client ID from the registration step above.

  2. Set the [Globus].ClientSecret value to your endpoint’s Client Secret from the registration step above.

  3. Change the [Endpoint].Name value to your endpoint’s name. It is best if this name matches the display name you picked during the registration step.

  4. Change the [Endpoint].ServerName value to the public DNS name of your server.

  5. Set the [LetsEncrypt].Email value to the email address of the endpoint administrator. This address will receive expiration warnings/notices regarding your endpoint’s HTTPS certificates, so it should be an email address that is regularly checked.

  6. Set the [LetsEncrypt].AgreeToS value to True.
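
Before running setup, it can be useful to confirm that none of the six settings above was left empty. The sketch below is one way to do that. It assumes the file uses "Key = value" lines under "[Section]" headers, which the section.key notation above suggests but this guide does not show, and the helper names conf_get and check_conf are hypothetical.

```shell
# Hypothetical helper: print the value of KEY in [SECTION] of an INI file.
# Assumes "Key = value" lines under "[Section]" headers.
# Usage: conf_get SECTION KEY FILE
conf_get() {
    awk -F '=' -v section="[$1]" -v key="$2" '
        /^[ \t]*\[/ { gsub(/[ \t]/, ""); current = $0; next }
        current == section {
            k = $1; gsub(/[ \t]/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v; exit
            }
        }
    ' "$3"
}

# Report any of the six settings described above that are missing or empty.
# Usage: check_conf FILE
check_conf() {
    missing=0
    for item in Globus:ClientId Globus:ClientSecret Endpoint:Name \
                Endpoint:ServerName LetsEncrypt:Email LetsEncrypt:AgreeToS; do
        if [ -z "$(conf_get "${item%%:*}" "${item#*:}" "$1")" ]; then
            echo "not set: [${item%%:*}].${item#*:}"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use on the server:
#   sudo sh -c '. ./conf-check.sh; check_conf /etc/globus-connect-server.conf'
```

The script file name in the usage comment is a placeholder for wherever you save these functions.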

Note: The globus-connect-server-setup command reference provides an overview of the settings in the /etc/globus-connect-server.conf file. A detailed description of every setting can be found in the comments within the globus-connect-server.conf file that gets created on your system during the Globus Connect Server version 5 install process.

4.2.3. Run globus-connect-server-setup

When you are finished editing the configuration file, run the globus-connect-server-setup command to create the endpoint.

$ sudo globus-connect-server-setup

You’ll need the output of this command for later steps. Take note of the values for “Deployed GCS Manager”, “Google Drive Redirect URL”, and “Created GCS Endpoint.”

  1. The “Deployed GCS Manager” value is the address for your endpoint’s GCS Manager service. Globus support may ask for this if you submit a support ticket.

  2. You’ll need the “Google Drive Redirect URL” value if you configure the Google Drive connector on this endpoint.

  3. The “Created GCS Endpoint” value will include your endpoint’s display name as well as its UUID, both of which are important for identifying your endpoint.
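
Because the UUID is needed again when requesting managed status, it may be convenient to save the setup output and extract the UUID from it. The sketch below assumes only that the "Created GCS Endpoint" line contains the UUID in the usual 8-4-4-4-12 hexadecimal form; the exact layout of that line is not shown in this guide, and the helper name endpoint_uuid and the file name are hypothetical.

```shell
# Hypothetical helper: read saved setup output on stdin and print the
# first UUID found on the "Created GCS Endpoint" line.
endpoint_uuid() {
    grep 'Created GCS Endpoint' \
        | grep -E -o '[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}' \
        | head -n 1
}

# Typical use: keep a copy of the output, then pull out the UUID.
#   sudo globus-connect-server-setup | tee setup-output.txt
#   endpoint_uuid < setup-output.txt
```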

4.3. Set the endpoint as managed

Endpoints that require premium functionality—such as high assurance data handling, guest collections and premium connectors—must be managed by a Globus subscription. Globus Connect Server version 5.2 only supports premium features, so your endpoint must be associated with your organization’s subscription.

To make your endpoint managed, contact a Globus subscription manager at your institution and ask them to set the endpoint as managed under the High Assurance or HIPAA BAA subscription. If you need help identifying subscription managers for your institution, please submit a support request to Globus. (Instructions for the subscription manager are provided below.) When you contact your subscription manager, include the following information in your request.

  1. State that you are requesting designation of a new managed Globus Connect Server version 5.2 endpoint.

  2. Include the UUID of your endpoint as shown in the globus-connect-server-setup command output.

After submitting the request, you’ll need to wait for a response that your endpoint has been made managed before proceeding.

The subscription manager at your institution can go to the Endpoints section of the Globus web app and search for the UUID you provided. When the endpoint is found, click the "Update Managed Status" button to set the endpoint managed with the High Assurance or HIPAA BAA subscription ID.

[Figure: Update managed status]

Important: Globus issues two subscription IDs to institutions with the High Assurance or HIPAA BAA tiers on their subscriptions. One is for standard Globus uses and the other is specifically for High Assurance or HIPAA BAA uses. Endpoints to be used with protected data must be managed by the High Assurance or HIPAA BAA subscription ID.

4.4. Assign the administrator role

Once your endpoint has been designated as managed by your Globus subscription, you must assign the Administrator role for the endpoint to your Globus identity—and any others who should be able to configure your endpoint. Run the following command, where USER@DOMAIN is your Globus identity.

$ sudo globus-connect-server-config endpoint admin add-role USER@DOMAIN

Do the same for any other users in your organization who should be able to configure your endpoint. We recommend assigning multiple administrators to prevent your organization from losing administrative control of your endpoint should any one administrator leave your organization.
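
When several administrators need the role, the command can be wrapped in a loop. The following is a sketch only: the wrapper name add_endpoint_admins and the DRY_RUN switch are hypothetical conveniences, and the identities shown are placeholders. The underlying command is the one shown above.

```shell
# Hypothetical wrapper: assign the Administrator role to each identity
# given as an argument. With DRY_RUN=1 the commands are printed, not run.
add_endpoint_admins() {
    for identity in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo globus-connect-server-config endpoint admin add-role $identity"
        else
            sudo globus-connect-server-config endpoint admin add-role "$identity"
        fi
    done
}

# Preview the commands for two placeholder identities.
DRY_RUN=1 add_endpoint_admins alice@example.edu bob@example.edu
```

Running the same call without DRY_RUN=1 executes the commands on the endpoint server.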

4.5. Create a storage gateway (POSIX)

Now that you’ve created, configured, and registered your endpoint, you’re ready to add the data access interfaces. The first step is to create the policies for storage access. You’ll do this by creating a storage gateway. We’ll use a high-assurance POSIX storage gateway for our examples.

The POSIX storage connector allows Globus Connect Server version 5 to use POSIX file systems mounted on the host server. This connector is pre-installed. A storage gateway allows authorized users to create collections that allow portions of the storage to be accessed via HTTPS and GridFTP.

Note: Your version 5.2 endpoint also supports Google Drive storage gateways. To create a Google Drive storage gateway, follow the instructions in the Google Drive storage connector documentation.

Use the globus-connect-server-config storage-gateway create command to create a new POSIX storage gateway on your endpoint. You can use the command to set up policies, including whether the storage gateway supports guest or mapped collections, or both.

Note: The Managing storage gateways section provides full details on managing storage gateways, and the POSIX connector section describes the options unique to POSIX storage gateways. The globus-connect-server-config Reference lists all of the available commands and their options.

The following example shows how to create a high-assurance POSIX gateway.

$ sudo globus-connect-server-config storage-gateway create \
  --connector POSIX --display-name "Data Storage Gateway" \
  --root /data --high-assurance --domain example.edu \
  --authentication-assurance-timeout 720 \
  --allow-guest-collections --disallow-mapped-collections

Storage Gateway Created: c81cf69c-e494-465c-83f3-7baf272de1c0

The command above creates a storage gateway on the endpoint with the following attributes.

  • It uses POSIX storage.

  • It has a display name of “Data Storage Gateway”.

  • Only folders and files within the /data directory on the server are accessible via the gateway.

  • The storage gateway enforces high-assurance features for accessing protected data.

  • Globus users must have an identity from the example.edu identity provider in their Globus Account in order to create collections using this storage gateway.

  • Globus users must re-authenticate after 12 hours in each application/device to create collections or access existing collections.

  • Guest collections may be created, but mapped collections are not allowed.

On success, the command provides the ID of the new storage gateway.
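
The example pairs a timeout value of 720 with a 12-hour re-authentication window, which suggests the option is expressed in minutes. Under that assumption, a small conversion helper (the name hours_to_minutes is hypothetical) avoids arithmetic slips when choosing a value:

```shell
# Hypothetical helper: convert whole hours to minutes, the unit that
# --authentication-assurance-timeout appears to use (720 = 12 hours).
hours_to_minutes() {
    echo $(( $1 * 60 ))
}

hours_to_minutes 12   # prints 720
```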

You can now list the storage gateways on your endpoint as follows.

$ sudo globus-connect-server-config storage-gateway list

ID | Storage Type | Display Name | Root
c81cf69c-e494-465c-83f3-7baf272de1c0 | POSIX | Data Storage Gateway | /data

Given the ID of a storage gateway, you can display its configuration details as follows. (If no ID is given as an argument, the configuration information for all existing storage gateways will be shown.)

$ sudo globus-connect-server-config storage-gateway show c81cf69c-e494-465c-83f3-7baf272de1c0

c81cf69c-e494-465c-83f3-7baf272de1c0
--identity-provider 927d2228-f927-42b2-9ace-c523fa2ba34e
--domain example.edu
--connector "POSIX"
--display-name "Data Storage Gateway"
--authentication-assurance-timeout 720
--allow-guest-collections
--disallow-mapped-collections
--high-assurance
--root "/data"

The following example creates a high-assurance POSIX storage gateway that supports only mapped collections. It also uses a trick to take users directly to their home directories on the endpoint.

$ sudo globus-connect-server-config storage-gateway create \
    --root '$HOME' --connector "POSIX" --display-name "Home directories" -d uchicago.edu \
    --allow-mapped-collections --disallow-guest-collections \
    --high-assurance --authentication-assurance-timeout 3600

Storage Gateway Created: f97ca3dc-96b2-4923-97ac-3291ffefe74f

The --root configuration above is notable. Because it is specified as a quoted environment variable, the variable isn’t evaluated until a mapped collection exists (see below) and a user accesses it. When a user accesses the mapped collection, the user will automatically be taken to his or her $HOME directory on the endpoint system.
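
The quoting matters because of how the shell treats the two quote styles. This short illustration, independent of Globus, shows the difference; the variable names literal and expanded are chosen only for this example.

```shell
# Single quotes keep the text literal; double quotes let the shell expand it.
literal='$HOME'     # the five characters: $ H O M E
expanded="$HOME"    # the current user's home directory at the time of typing

printf 'single-quoted: %s\n' "$literal"
printf 'double-quoted: %s\n' "$expanded"
```

With double quotes or no quotes, the gateway root would be fixed to the home directory of whoever ran the create command instead of being resolved per user.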

4.6. Create a mapped collection (POSIX)

Mapped collections must be created by an endpoint administrator. Globus users with accounts on the endpoint system can then use a mapped collection to access data. Mapped collections can be accessed via HTTPS and GridFTP and found in the Globus Web application, allowing users to browse their contents, transfer files to/from the collection, and perform other operations on the collections' files.

To access a mapped collection, Globus users must have accounts on the endpoint system and their usernames on the endpoint system must match the usernames of their Globus identities in the domain specified by the storage gateway. For example, if the storage gateway requires an identity in the campus.edu domain, and a user with username bob on the endpoint system wishes to access the mapped collection, Globus will require the user to authenticate to the identity bob@campus.edu.
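
The matching rule can be pictured as a simple string operation. The sketch below only illustrates the rule as described in this section; it is not how Globus Connect Server implements the mapping, and the helper name local_username_for is hypothetical.

```shell
# Hypothetical helper illustrating the matching rule described above:
# given a Globus identity and the gateway's required domain, print the
# local username it corresponds to, or fail if the domain differs.
local_username_for() {
    identity=$1
    required_domain=$2
    case "$identity" in
        *@"$required_domain") printf '%s\n' "${identity%@*}" ;;
        *) return 1 ;;
    esac
}

local_username_for bob@campus.edu campus.edu   # prints "bob"
```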

Use the globus-connect-server-config collection create command to create a new mapped collection on a storage gateway.

$ sudo globus-connect-server-config collection create \
    --storage-gateway-id f97ca3dc-96b2-4923-97ac-3291ffefe74f \
    --display-name "Mapped Home Directories"

Created collection 1e16f4c0-a7cd-4c58-9e6a-3684d92a3de9

The command above creates a mapped collection with the following attributes.

  • It inherits access policies from the storage gateway with ID f97ca3dc-96b2-4923-97ac-3291ffefe74f.

  • It has the display name Mapped Home Directories.

On success, the command provides the ID of the new mapped collection.

Because this mapped collection was created using a high-assurance storage gateway, user access is subject to high-assurance rules configured by the storage gateway. Specifically, Globus users must authenticate to an identity from the provider required by the storage gateway. (Authenticating to a linked identity isn’t sufficient.) Authentication in Globus is only valid for as long as the storage gateway’s authentication assurance timeout value, after which the user must re-authenticate. Users must authenticate separately for each application session (e.g., web browser session) and on each device used to access the collection.

The steps for discovering and using mapped collections to access data are described in the how-to guide, "Find and use a mapped collection from the Globus web app."

4.7. Create a guest collection (POSIX)

If the storage gateway allows guest collections, then authorized users may create guest collections using the endpoint’s POSIX storage. Guest collections can be accessed via HTTPS and GridFTP and found in the Globus Web application, allowing users to browse their contents, transfer files to/from the collection, and perform other operations on the collections' files. (The collection’s access policies—which may allow other users to access the collection’s data—are set within the collection itself.)

Because this is a high-assurance storage gateway, both collection management and data access are subject to high-assurance rules, configured by the storage gateway. Specifically, collection managers must authenticate to identities from the required identity provider and must have a local account on the endpoint system with a matching username. (See the POSIX connector section for additional POSIX storage gateway configuration options.) To access data in the guest collection, Globus users must authenticate to a specific identity mentioned in the collection’s access control rules. (Authenticating to a linked identity isn’t sufficient.) Authentication in Globus is only valid for as long as the storage gateway’s authentication assurance timeout value, after which the user must re-authenticate. Users must authenticate separately for each application session (e.g., web browser session) and on each device used to access the collection.

The how-to guides "Creating a collection from a high assurance storage gateway" and "Access and share data from a guest collection" provide more details on using guest collections in the high-assurance environment.

4.8. Next steps

With the creation of your first collection, you’ve completed and verified a basic installation of Globus Connect Server version 5.2. The installation includes a POSIX storage connector and a high-assurance POSIX storage gateway. You can now fine-tune the configuration of your endpoint by editing the /etc/globus-connect-server.conf file or by adding, deleting, and modifying storage gateways.

5. Managing storage gateways

Storage gateways define policies for access and use of storage systems. Once a storage gateway has been created on an endpoint, authorized users may create collections as permitted by the storage gateway. These collections can then be accessed via HTTPS and GridFTP and found in the Globus Web application, allowing authorized users to browse their contents, transfer files to/from the collection, and perform other operations on the collections' files.

Each storage gateway is associated with a specific storage connector, which determines the type of storage system it addresses. Each storage connector offers a specific set of policy and access options which are described in the connector’s documentation.

The globus-connect-server-config storage-gateway command is used to manage storage gateways on an endpoint. This section covers the general features of the globus-connect-server-config storage-gateway command that work with all storage connectors. Refer to the documentation for each storage connector for connector-specific details.

5.1. Create a storage gateway

The globus-connect-server-config storage-gateway create command creates a new storage gateway. This command supports the following general options. Connector-specific options are added for each storage connector. See the connector documentation for details.

  • The --connector option specifies the storage connector to be addressed by the storage gateway.

  • The --display-name option specifies the name of the storage gateway when displayed in the Globus service.

  • The --root option specifies the path within the storage system that can be addressed by the storage gateway. All collections created using this storage gateway will be located within this path and cannot access other parts of the storage system.

  • The --domain option specifies the domain required for identities allowed to create collections using the storage gateway. A Globus user must have an identity from this domain in their Globus Account in order to be able to create collections using the storage gateway. Authorization rules vary for each storage connector. See the connector documentation for details.

  • The --restrict-paths option places further restrictions on access to directories within the storage system for which the storage gateway is configured. Paths are given as a comma-separated list, each prefixed by the permission for that path: R (read), RW (read/write), or N (no access). For example, consider a storage gateway rooted at /data that should generally be read/write for users, except that /data/static should be read-only and /data/secret should not be accessible through this storage gateway at all. This could be accomplished by setting the --restrict-paths option to RW/data,R/data/static,N/data/secret.

  • The --help option displays help text explaining the use and options of the globus-connect-server-config storage-gateway create command.

On success, the command provides the ID of the new storage gateway.
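
A --restrict-paths value is easy to mistype, so it can help to expand it into one permission and path per line before using it. The sketch below does that for the example value given for the --restrict-paths option above. The helper name show_restrict_paths is hypothetical, and the sketch only splits the string; it does not validate the permissions or decide which rule applies to a given path.

```shell
# Hypothetical helper: split a --restrict-paths value into
# "permission path" pairs, one per line, for review.
show_restrict_paths() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        perm=${entry%%/*}      # text before the first "/": R, RW, or N
        path=/${entry#*/}      # the rest, with its leading "/" restored
        printf '%s %s\n' "$perm" "$path"
    done
}

show_restrict_paths 'RW/data,R/data/static,N/data/secret'
# prints:
#   RW /data
#   R /data/static
#   N /data/secret
```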

5.2. List storage gateways

Use the globus-connect-server-config storage-gateway list command to see the currently configured storage gateways on an endpoint.

$ sudo globus-connect-server-config storage-gateway list

ID | Storage Type | Display Name | Root
b4be2c2e-1d13-4591-b984-746e32fb655b | POSIX | posix-storage-gateway-demo | /shared
c81cf69c-e494-465c-83f3-7baf272de1c0 | POSIX | Data Storage Gateway | /data

Note: This command is useful for finding a storage gateway’s ID.
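
To look up an ID by display name in a script, the list output can be filtered with awk. This sketch assumes the " | "-separated layout shown in the sample output above stays the same; the helper name gateway_id_by_name is hypothetical.

```shell
# Hypothetical helper: read "storage-gateway list" output on stdin and
# print the ID of the gateway whose display name matches exactly.
# Assumes the " | "-separated columns: ID, Storage Type, Display Name, Root.
gateway_id_by_name() {
    awk -F ' [|] ' -v name="$1" '$3 == name { print $1 }'
}

# Typical use:
#   sudo globus-connect-server-config storage-gateway list \
#       | gateway_id_by_name "Data Storage Gateway"
```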

5.3. View details of a storage gateway

Use the globus-connect-server-config storage-gateway show command to view the configuration details of a storage gateway.

$ sudo globus-connect-server-config storage-gateway show c81cf69c-e494-465c-83f3-7baf272de1c0

c81cf69c-e494-465c-83f3-7baf272de1c0
--identity-provider 927d2228-f927-42b2-9ace-c523fa2ba34e
--domain example.edu
--connector "POSIX"
--display-name "Data Storage Gateway"
--authentication-assurance-timeout 720
--allow-guest-collections
--disallow-mapped-collections
--high-assurance
--root "/data"

The argument to this command is a storage gateway’s ID. If no ID is provided, the configuration information for all storage gateways on the endpoint will be shown.

5.4. Change a storage gateway

Use the globus-connect-server-config storage-gateway update command to change the configuration of a storage gateway. You must supply the ID of the storage gateway that you wish to change. This command supports the same options as the globus-connect-server-config storage-gateway create command.

$ sudo globus-connect-server-config storage-gateway update --root /new-data c81cf69c-e494-465c-83f3-7baf272de1c0

$ sudo globus-connect-server-config storage-gateway show c81cf69c-e494-465c-83f3-7baf272de1c0

c81cf69c-e494-465c-83f3-7baf272de1c0
--identity-provider 927d2228-f927-42b2-9ace-c523fa2ba34e
--domain example.edu
--connector "POSIX"
--display-name "Data Storage Gateway"
--authentication-assurance-timeout 720
--allow-guest-collections
--disallow-mapped-collections
--high-assurance
--root "/new-data"
Important: Changing the configuration of a storage gateway with existing collections can easily break those collections. We do not recommend changing the configuration of a storage gateway that already has collections.

5.5. Delete a storage gateway

Use the globus-connect-server-config storage-gateway delete command to delete a storage gateway. You must supply the ID of the storage gateway that you wish to delete. If the storage gateway has collections, you will be prompted to delete those collections in order to continue deleting the storage gateway.

$ sudo globus-connect-server-config storage-gateway delete c81cf69c-e494-465c-83f3-7baf272de1c0

Storage System c81cf69c-e494-465c-83f3-7baf272de1c0 in use
Delete collection "test-share-001" (48840f6d-ef3f-4039-b8b7-31a3064b8fa3) [y/N]: y
Deleted storage gateway c81cf69c-e494-465c-83f3-7baf272de1c0
Important: Collections that are deleted by globus-connect-server-config storage-gateway delete cannot be recovered.

6. POSIX storage connector

The POSIX storage connector allows Globus Connect Server version 5 to access POSIX storage systems mounted on the endpoint server. The policies for accessing these storage systems are configured by POSIX storage gateways.

6.1. Creating a storage gateway using the POSIX connector

Use the globus-connect-server-config storage-gateway create command to create a new POSIX storage gateway on an endpoint. The example below illustrates a minimal definition.

$ sudo globus-connect-server-config storage-gateway create --connector POSIX --display-name "Data Storage Gateway" --root /data --domain example.edu

Storage Gateway Created: c81cf69c-e494-465c-83f3-7baf272de1c0

The example above creates a storage gateway on the endpoint with the following attributes.

  • It uses the POSIX storage connector.

  • It has a display name of “Data Storage Gateway”.

  • Only folders and files within the /data directory on the server are accessible via the gateway.

  • Globus users must have an identity from the example.edu identity provider in their Globus Account in order to create collections using this storage gateway. Only users who have local accounts on the server with matching usernames will be able to create collections.

The ID of the new storage gateway is given in the output.

The globus-connect-server-config storage-gateway create command supports the following options for storage gateways configured to use the POSIX connector, in addition to the common options supported for all storage connectors.

  • The --domain option specifies the domain required for identities allowed to create collections using the storage gateway. A Globus user must have an identity from this domain in their Globus Account in order to be able to create collections using the storage gateway. In addition to this requirement, for POSIX storage gateways, the username part of the user’s identity must correspond to a local account on the server hosting the endpoint. For example, given a storage gateway with the domain set to "abc.edu" and a user with a "bob@abc.edu" identity linked to their Globus account, in order for this user to create collections, the endpoint server must also have a local "bob" account.

  • The --users-deny option is used to specify a list of local user accounts that are explicitly forbidden from creating collections using this storage gateway. Users are specified as a comma separated list. This option takes precedence over the --users-allow, --groups-deny, and --groups-allow options.

  • The --users-allow option is used to specify the complete list of local users that are allowed to create collections using this storage gateway. If --users-allow is used, only the specified local accounts are permitted to create collections via this gateway. Users are specified as a comma separated list. This option takes precedence over the --groups-deny and --groups-allow options, but is overridden by --users-deny.

  • The --groups-deny option is used to specify a list of local groups whose members are explicitly forbidden from creating collections using this storage gateway. Groups are specified as a comma separated list. This option takes precedence over the --groups-allow option but is overridden by --users-deny and --users-allow.

  • The --groups-allow option is used to specify the complete list of local groups whose members are allowed to create collections using this storage gateway. If --groups-allow is used, only members of the specified groups are permitted to create collections via this gateway. Groups are specified in a comma separated list. This option is overridden by --users-deny, --users-allow, and --groups-deny.

The precedence rules of the allow/deny options above might look overwhelming, but in practice, most storage gateways will only need one of these options at a time. For example, consider a scenario in which the system administrator wants to allow a specific set of users to create collections. The administrator is familiar with creating and managing groups for purposes like this. The following command would serve this purpose. (Note the additional --groups-allow setting at the end of the command.)

$ sudo globus-connect-server-config storage-gateway create --connector POSIX --display-name "Data Storage Gateway" --root /data --domain example.edu --groups-allow datamgrs

This example is just like the example above, with the exception that it restricts collection creation to local users who are members of the datamgrs group on the endpoint server. The --groups-allow option prohibits anyone not in the named group(s) from creating collections. To allow someone new to be able to create collections, the administrator would add the new user to the datamgrs group.
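
The precedence of the four allow/deny options can also be written out as a small decision procedure. The shell sketch below is a hypothetical model for reasoning about a configuration, not Globus code. The function names are invented, and it encodes one reading of the rules above: options are checked in the order --users-deny, --users-allow, --groups-deny, --groups-allow, and a user who matches nothing is refused whenever any allow list is set.

```shell
# Hypothetical model of the documented precedence; not part of Globus.
# Every list argument is comma separated; pass "" for an unused option.

in_list() {       # in_list ITEM LIST: succeeds if ITEM is an element of LIST
    case ",$2," in *",$1,"*) return 0 ;; esac
    return 1
}

any_in_list() {   # any_in_list ITEMS LIST: succeeds if the lists share an element
    old_ifs=$IFS
    IFS=,
    for item in $1; do
        if in_list "$item" "$2"; then IFS=$old_ifs; return 0; fi
    done
    IFS=$old_ifs
    return 1
}

# may_create_collections USER USER_GROUPS USERS_DENY USERS_ALLOW GROUPS_DENY GROUPS_ALLOW
may_create_collections() {
    user=$1; user_groups=$2
    users_deny=$3; users_allow=$4; groups_deny=$5; groups_allow=$6
    in_list "$user" "$users_deny" && return 1            # highest precedence
    in_list "$user" "$users_allow" && return 0
    any_in_list "$user_groups" "$groups_deny" && return 1
    any_in_list "$user_groups" "$groups_allow" && return 0
    # Assumption: a non-empty allow list is exhaustive, so an unmatched user is refused.
    [ -z "$users_allow$groups_allow" ]
}
```

For the datamgrs example, may_create_collections alice datamgrs "" "" "" datamgrs succeeds, while the same call for a user whose only group is staff fails.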

6.2. Creating collections via a POSIX storage gateway

A POSIX storage gateway allows your users to create collections that allow access to the associated storage system via HTTPS and GridFTP. Collections make specific portions of the storage visible in the Globus Web app. The collection’s access policies—which may allow other users to access the data in the collection—are set within the collection itself.

To understand how this works for your authorized users, follow the instructions in the How To article Creating a collection via a POSIX storage gateway to create a collection for your POSIX storage.

6.3. Precedence of local POSIX permissions

For collections within a POSIX storage gateway, the Globus ACLs for the collection aren’t the only rules that determine access. Globus ACLs can never result in greater access than the local POSIX permissions provide to the user who created the collection. The effective permissions that a Globus user has to a collection are determined by the most restrictive of the Globus ACLs and the local POSIX permissions granted to the collection’s creator.

To illustrate, consider a collection that provides access to the /data/public/ local path on the endpoint server. The collection was created by the "alice" local user (Alice), and Alice has read-only permissions to this path. Alice creates an ACL entry for the collection that grants read and write permissions for the collection to "bob@abc.edu" (Bob). Despite the fact that an ACL entry grants both read and write permissions to Bob, Bob only has effective read-only permissions for the collection because the collection owner, Alice, is limited to read-only access to the /data/public/ path on the endpoint server.
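
The "most restrictive wins" rule amounts to intersecting the two sets of permissions. The sketch below illustrates only that idea; effective_permission is a hypothetical function, and the "r"/"w" notation is an assumption made for this example rather than a Globus format.

```shell
# Hypothetical illustration. Permissions are written as "rw", "r", "w", or "".
# Usage: effective_permission ACL_PERMS CREATOR_POSIX_PERMS
# Prints only the permissions that appear in both arguments.
effective_permission() {
    result=""
    for flag in r w; do
        case "$1" in
            *"$flag"*)
                case "$2" in
                    *"$flag"*) result=$result$flag ;;
                esac
                ;;
        esac
    done
    printf '%s\n' "$result"
}
```

In the example above, Bob’s ACL entry grants rw and Alice’s local permissions are r, so effective_permission rw r prints r.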

7. Management Console

The management console, available on managed endpoints, provides a graphical web interface that can be used to monitor endpoint activity and to identify and troubleshoot faults that may indicate underlying infrastructure issues. An Administrator for an endpoint decides who has access to the Management Console for an endpoint via the assignment of the Activity Manager or Activity Monitor role to users, as appropriate. Instructions on how to manage and assign roles for an endpoint can be found here.

You can read about the details and benefits of the management console here.

8. Roles and privileges

Users (or groups) can be granted various roles on any managed endpoint, with each role granting the user (or group) different privileges with respect to that endpoint. All roles can be managed via the Transfer API or the Roles tab on the Endpoints page on the Globus webapp.

The following roles are supported on managed endpoints. These roles need to be explicitly set and none of the privileges are inherited.

  • Administrator

    • Has full control over the endpoint definition of the endpoint.

    • Can delete endpoint definition

    • Can see endpoint definition even if set to private

    • Can manage roles for endpoint

    • An administrator for an S3 endpoint or a share also has all of the abilities of an Access Manager

    • Can be granted by other administrators. The creator of the endpoint is granted Administrator role by default

    • Does not have Activity Manager and/or Activity Monitor capabilities without being granted such explicitly

  • Activity Manager

    • Has full access to the Management Console for the endpoint

    • Can see endpoint definition even if set to private

    • Can be granted by any user who has Administrator role on the endpoint

  • Activity Monitor

    • Has read only access to the Management Console for the endpoint

    • Can see endpoint definition even if set to private

    • Can be granted by any user who has Administrator role on the endpoint

  • Access Manager

    • Can manage permissions on any endpoint that supports sharing (S3 or shared endpoints)

    • Has read/write access to folders and files on the endpoint

    • If the endpoint is set to private (in the case of S3 endpoint), cannot see the endpoint.

    • Can be granted by any user who has Administrator role on the endpoint

9. Generating and Monitoring Log Files

9.1. GridFTP Server Log Files

By default, the GridFTP log is located at:

/var/log/gridftp.log

The configuration settings for the GridFTP log file are found in this file:

/etc/gridftp.d/globus-connect-server

Logging for the GridFTP service is enabled by default. Additional details concerning logging for the GridFTP server are available in the globus-gridftp-server man page here.

10. Getting Help

10.1. Troubleshooting Common Problems

This section describes some basic tests you can run when you experience problems with a transfer or an endpoint. These tests can help you narrow down the potential causes of the issue and simplify troubleshooting.

10.1.1. Test Basic Endpoint Functionality

An important verification of endpoint health is to confirm that the endpoint is able to successfully participate in transfers from and to other endpoints. Globus maintains two test endpoints, Globus Tutorial Endpoint 1 and Globus Tutorial Endpoint 2, that are always available for users to access when checking the functionality of their own endpoints. First, attempt to transfer the contents of the /share/godata/ directory on the Globus Tutorial Endpoint 1 endpoint to your own endpoint. After that, attempt to transfer those same files to the /~/ directory on the Globus Tutorial Endpoint 2 endpoint. If these tests both succeed, then your endpoint is functional and able to serve as the destination and the source of transfers. For more detailed instructions on how to use the Globus service to transfer files, see here.

10.1.2. Verify globus-gridftp-server Service

Another important check on servers hosting a Globus endpoint is to verify that the globus-gridftp-server service has properly started and is running. To do this, first use the ps command to see if there is an instance of globus-gridftp-server running:

# ps aux | grep globus-gridftp-server
root       604  0.0  0.7  97924  7312 ?        Ss   14:18   0:00 /usr/sbin/globus-gridftp-server -c /etc/gridftp.conf -C /etc/gridftp.d -pidfile /var/run/globus-gridftp-server.pid -no-detach -config-base-path /

If you do not see an instance of globus-gridftp-server running, then the service has not started. You can try to start it by executing the globus-connect-server-setup command and then checking to see if an instance of globus-gridftp-server appears in the ps output. If you still don’t see an instance of globus-gridftp-server running after issuing the globus-connect-server-setup command, you can take a look in the logs for clues as to what might be wrong.

10.2. Globus Help Resources

10.2.1. Documentation Website

This website (docs.globus.org) contains a wealth of information about configuring and using the Globus service. Many common issues can be resolved quickly by browsing our frequently asked questions and reading the relevant guides and how-to’s. We recommend consulting these resources first when looking for fast resolution to any issue you are having with the Globus service.

10.2.2. Mailing Lists

If you use Globus, then participating in one or more of the public email lists is an excellent way to keep in touch with your peers in the Globus Community. For questions about managing your Globus deployment, e.g. installing software for a Globus endpoint, configuring your firewall, and integrating your institution’s identity system, subscribe to the admin list. For other inquiries and discussions, try the user or developer lists. For more information on mailing lists and how to subscribe, click here.

10.2.3. Globus Support

Questions or issues that pertain to Globus Connect Server version 5 installation or to any client or service that is used in the Globus software-as-a-service (SaaS) or platform-as-a-service (PaaS) offering can be directed to the Globus support team by submitting a ticket. Subscriptions include a guaranteed support service level.

When submitting a ticket for an issue with Globus Connect Server, please include the endpoint name, a description of your issue, and screenshot/text dumps of any errors you are seeing. Please also include the output of the following commands, run as root, from the server hosting the endpoint:

uname -a
sestatus
ifconfig
ping $(hostname -f)
cat /etc/os-release; cat /etc/redhat-release
cat /etc/gridftp.d/*
cat /etc/gridftp.conf
cat /var/lib/globus-connect-server/endpoint-uuid.txt
globus-gridftp-server --version
grep -v "^$\|^;" /etc/globus-connect-server.conf

Appendix A: Understanding Data Channel Traffic

When the Globus transfer service moves data between two endpoints, it uses the high-performance GridFTP protocol. As part of this protocol, a GridFTP data channel connection is established between the two endpoints. (The transfer service initiates the data channel, but the data channel is a direct connection from one endpoint to the other.) The default port range used for data channel connections is TCP 50000 to 51000. We strongly recommend that all endpoints be configured to use the default data port range, as this will provide maximum compatibility with other endpoints that are also configured to use the default data port range and have their firewall rules configured to allow traffic in this range. If your endpoint uses a non-default data port range, then you are, in effect, requiring other sites to create additional firewall rules in order to communicate with your endpoint. Many sites will not want to do this, which limits the ability of your endpoint to interoperate with the majority of endpoints, which are configured to use the default port range.

If two endpoints (ep1 and ep2) are to be able to successfully conduct transfers, then those endpoints must each be able to connect to each other in their configured data port ranges. For example, consider the following:

Globus Connect Server ep1 uses data port range 40000 to 41000

Globus Connect Server ep2 uses data port range 50000 to 51000

When two Globus Connect Server endpoints attempt to conduct a transfer, the endpoint that will be the recipient in that transfer picks a port (or ports) in its configured data port range on which it will listen to receive the transfer from the sender endpoint. This port value is communicated from the receiver endpoint to the sender endpoint via GridFTP control channel data mediated by the Globus service, which both the sender and recipient are listening to on port 443. Once the sender endpoint receives the port information for the recipient endpoint, it initiates an outbound connection to that port (or ports) on the recipient to conduct the actual data transfer.

To illustrate, consider the case of ep1 and ep2 mentioned above. If ep1 wanted to send ep2 a file, then ep2 would pick out a port (or ports) in its configured data port range of 50000 to 51000. For the sake of example let’s say that port 50021 has been chosen. This value would then get communicated from ep2 to ep1, via the Globus service through the GridFTP control channel that both ep1 and ep2 are listening to. At that point, ep1 would then initiate a connection out to port 50021 on ep2.

To further illustrate, consider again the case of ep1 and ep2 mentioned above. If ep2 wanted to send ep1 a file, then ep1 would pick out a port (or ports) in its configured data port range of 40000 to 41000. For the sake of example let’s say that port 40331 has been chosen. This value would then get communicated from ep1 to ep2, via the Globus service through the GridFTP control channel that both ep1 and ep2 are listening to. At that point, ep2 would then initiate a connection out to port 40331 on ep1.

It is also important to consider what happens in cases where one endpoint is a Globus Connect Server endpoint and the other endpoint is a Globus Connect Personal endpoint. In such cases, the Globus Connect Personal endpoint will always initiate the connection to the Globus Connect Server endpoint for the transfer. Thus, it will always be the Globus Connect Server endpoint that picks the port (or ports) on which it will listen for that connection. This is the case irrespective of which endpoint is the sender or the recipient. As discussed previously, this information gets communicated from the Globus Connect Server endpoint to the Globus Connect Personal endpoint via the Globus service.

From this example we can see that, in terms of firewall rules, the outbound rules for ep1 must allow it to connect to ep2 on ep2’s configured data port range if ep1 is to send files to ep2. The inbound rules for ep1 must allow it to accept connections on its own configured data port range if it is to receive files from other endpoints. The same holds for any endpoint: its firewall must allow outbound connections to the configured data port range of each remote endpoint it sends files to, and inbound connections to its own configured data port range so that it can receive files from other endpoints.

As illustrated, an endpoint must be able to receive inbound connections on its own configured data port range, as well as be able to make outbound connections to the data port range of any endpoint it wishes to communicate with. If all Globus Connect Server admins pick their own custom port ranges, then this quickly leads to a situation in which site firewall policies become littered with custom rules for these various port ranges and endpoints. However, if everyone uses the default data port range, then firewall rules are much more predictable and manageable. It is for this reason that we recommend that everyone use the default data port range for their endpoint. Those who use a custom data port range may find that they have problems with their endpoint being able to communicate with other endpoints, for the reasons detailed above. Those using custom data port ranges may also find that the admins of other sites and endpoints may not be willing to set up custom firewall rules to accommodate custom data port range choices.
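
The firewall reasoning in this appendix reduces to a range check: the sender’s outbound rules must cover every port the receiver might choose. The sketch below is a hypothetical aid for reasoning about such rules. It does not inspect a real firewall or open any connection, and range_covers is an invented name.

```shell
# Hypothetical helper for reasoning about firewall rules; no network access.
# Usage: range_covers ALLOWED_LO ALLOWED_HI NEEDED_LO NEEDED_HI
# Succeeds if the needed port range lies entirely inside the allowed range.
range_covers() {
    [ "$1" -le "$3" ] && [ "$4" -le "$2" ]
}
```

Suppose a site allows outbound data connections only to the default range 50000 to 51000. Then range_covers 50000 51000 50000 51000 succeeds, so that site can send to ep2, but range_covers 50000 51000 40000 41000 fails, so it cannot send to ep1 without an additional rule.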

Appendix B: How to update a Globus Connect Server 5.2 install

If you are using a version of Globus Connect Server released prior to Globus Connect Server 5.2 — including all versions of Globus Connect Server 4.x and Globus Connect Server 5.1 — then you cannot upgrade to version 5.2 with the instructions given here. Please contact us to discuss your options for migrating to Globus Connect Server version 5.2.

If you are using Globus Connect Server version 5.2, use the following instructions to update your installation.

Red Hat Enterprise Linux, CentOS

$ sudo yum update \*globus\*

Debian, Ubuntu

$ sudo apt-get update
$ sudo apt-get install --only-upgrade ".*globus.*"

After updating your packages, run the following command to restart the services and ensure that the update takes full effect.

$ sudo globus-connect-server-setup

Appendix C: Setting Endpoint Network Use Options

Globus transfer uses the configured network use level and location of an endpoint to determine the performance parameters for transfers involving that endpoint. Administrators of an endpoint may override the default values to best suit their deployment and needs. The configuration settings of the source and destination endpoints together determine the concurrency and parallelism used for a given transfer, which makes use of the available transfer capacity without overwhelming smaller-capacity endpoints during transfers with larger-capacity endpoints.

The location parameter is used to estimate the distance, and hence the expected latency, between the two endpoints, and is used in the automatic tuning of transfers. By default, the value of the location parameter is determined automatically by Globus, but the endpoint administrator can set it to explicit coordinates (in decimal degrees). This parameter cannot be set for S3 endpoints or shared endpoints.

Network use is set to "Normal" level by default. An administrator of a managed endpoint can set the network use levels for transfers against their endpoint. Endpoints that have multiple physical servers, and good end to end connectivity (network and storage) can set higher network use to ensure that Globus uses the bandwidth available, while smaller deployments can set this to lower levels.

Three preset options are provided for the endpoint administrator, which have the following values:

Option | MaxConcurrency | PreferredConcurrency | MaxParallelism | PreferredParallelism
Minimal | 1 | 1 | 1 | 1
Normal (Default) | number of servers * 4 | number of servers * 2 | 8 | 4
Aggressive | number of servers * 8 | number of servers * 4 | 16 | 4

Note: S3 endpoints do not support parallelism options, only concurrency.

In addition to the above, an administrator can choose the "Custom" option, which lets them set absolute values for both concurrency and parallelism. All of these options have a limit of 64 for MaximumConcurrency and MaximumParallelism. These values can be modified by using the --network-use option on the endpoint-modify command in the Globus CLI.

For a given transfer, the concurrency is the smallest of the following values: the MaximumConcurrency of each endpoint, and the larger of the two endpoints’ PreferredConcurrency values. Parallelism is calculated similarly, with one additional consideration: for high-latency transfers (such as trans-oceanic transfers), the parallelism is set to the minimum of the two endpoints’ MaximumParallelism values.
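
As a worked example of the concurrency rule, the sketch below computes the value for two endpoints. It is an illustration of one reading of the rule, namely the minimum of both maximum values and the larger preferred value, and is not the Globus implementation. The function names are hypothetical, and the high-latency adjustment for parallelism is not modeled.

```shell
# Hypothetical helpers; integer arguments only.
min_int() { if [ "$1" -le "$2" ]; then echo "$1"; else echo "$2"; fi; }
max_int() { if [ "$1" -ge "$2" ]; then echo "$1"; else echo "$2"; fi; }

# Usage: transfer_concurrency MAX_A PREFERRED_A MAX_B PREFERRED_B
# Assumed reading: min(MAX_A, MAX_B, max(PREFERRED_A, PREFERRED_B)).
transfer_concurrency() {
    cap=$(min_int "$1" "$3")      # neither endpoint's maximum may be exceeded
    want=$(max_int "$2" "$4")     # the larger of the two preferred values
    min_int "$cap" "$want"
}
```

For a "Normal" endpoint with one server (maximum 4, preferred 2) and an "Aggressive" endpoint with two servers (maximum 16, preferred 8), transfer_concurrency 4 2 16 8 prints 4: the larger preferred value is 8, but it is capped by the smaller maximum.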

Glossary

Access Manager

The access manager role grants the ability to control read and/or write access permissions for other Globus users on a shared endpoint. You can read a more in-depth discussion here.

Collection

Collections provide the data access interfaces for an endpoint. In version 4, these are called "endpoints." In Globus Connect Server version 5, a collection is a named set of files (or blobs), hierarchically organized in folders, associated with a specific storage gateway. Collections can be accessed via HTTPS (client/server access), GridFTP (asynchronous bulk transfer), and REST API (for advanced operations). Access to a collection is authenticated via Globus Auth-issued OAuth2 access tokens, with data access policies defined in the collection itself. Globus Connect Server version 5 supports two types of collections:

  • Mapped collection: Each user accessing the collection must have a local account on the storage system. Their Globus identity is mapped to their local account. In version 4, these are called "host endpoints."

  • Guest collection: Users can access the collection without a local account on the storage system. Access is based on permissions granted by an authorized user via Globus. In version 4, these are called "shared endpoints."

Endpoint

Endpoint (Changed from version 4): A deployment of Globus Connect Server version 5, optionally across multiple data transfer nodes. The endpoint provides the interface for management and configuration. An endpoint can be configured with more than one storage connector to allow the endpoint to use multiple storage systems simultaneously (e.g., POSIX file system, Google Drive).

Endpoint Definition

This term refers to the metadata about the endpoint, stored as an object in the Globus.org database, used to simplify using and referring to the endpoint for users. Much of the information in the endpoint definition is sent to Globus when the globus-connect-server-setup command is run.

Globus Account

A Globus Account is the set of a user’s linked identities in Globus. The first identity in the user’s identity set in the Globus account is the primary identity, and all subsequent identities added to the Globus account will be linked identities. For example, if Bob has a "bob@globusid.org" identity as their primary identity, and also has a "bob@abc.edu" identity as a linked identity, then Bob’s Globus account contains both the "bob@globusid.org" identity and "bob@abc.edu" identity in its identity set. A user can view and manage the identities in their Globus account here.

GridFTP

GridFTP is an extension of the standard File Transfer Protocol (FTP) for high-speed, reliable, and secure data transfer. See the GridFTP documents for more information.

Managed Endpoint

A managed endpoint is an endpoint that is covered under a subscription and allows advanced features to be enabled. To convert an existing endpoint into a managed endpoint see this writeup.

Storage Connector

A plug-in installed on a Globus Connect Server version 5 node that allows it to support a particular storage type, for example the Google Drive connector or the HPSS connector.

Storage Gateway

Storage gateways provide the storage access policies for the endpoint’s connected storage systems. A storage gateway is a named, discoverable interface by which authorized users can create and manage collections on a connected storage system. A connected storage system may have multiple storage gateways.


© 2010- The University of Chicago Legal