Migrating an Endpoint from Globus Connect Server v4 to v5.4

Last Updated: Oct 31, 2022

1. Introduction

This document describes the procedure to convert an endpoint from Globus Connect Server v4 to v5.4. It assumes you have an existing Globus Connect Server v4 endpoint with configuration state that you want to preserve when converting to the latest Globus Connect Server software. If you are installing Globus Connect Server for the first time, please see the GCS v5.4 installation guide.

1.1. Process Overview

1.1.1. Multiple data transfer nodes

For an endpoint with multiple data transfer nodes, you’ll select one of the data transfer nodes on which to perform the migration. This node will be taken out of the endpoint configuration while you perform the migration. You will be able to test out the new configuration on the migration data transfer node while the others continue to service Globus requests. At the final step, the new configuration will replace the old endpoint in the Globus transfer service, and the other data transfer nodes will cease working with Globus. You’ll then need to replace Globus Connect Server v4 with v5.4 on those nodes and add them to the new endpoint.

1.1.2. Single data transfer node

For an endpoint with a single data transfer node, you’ll need to stop the Globus services on your endpoint for a longer period of time and will not be able to have both the old and new configurations running simultaneously. Other than that, the process is similar.

1.1.3. Downtime

Running gcs-migration on the Globus Connect Server v4 endpoint does not disrupt the endpoint’s operations, so no downtime is needed for that step. Running globus-connect-server endpoint migrate4 will impact services on the Globus Connect Server v5 endpoint, and when using the --finalize option it also impacts services on the Globus Connect Server v4 endpoint, so downtime should be planned accordingly. While globus-connect-server endpoint migrate4 is running, the local node will not be able to service any user requests.

The runtime of globus-connect-server endpoint migrate4 depends on the number of v4 shares that are migrating and the operation being performed:

  • On the first run of globus-connect-server endpoint migrate4 (without --finalize), all components in the migration plan must be created on the Globus Connect Server v5 endpoint. This process takes approximately 10 minutes for 100 v4 shares.

  • On successive runs of globus-connect-server endpoint migrate4 (without --finalize), all components in the migration plan are updated on the Globus Connect Server v5 endpoint. This process takes approximately 2 minutes for 100 v4 shares.

  • On the first run of globus-connect-server endpoint migrate4 --finalize, all Globus Connect Server v4 endpoint IDs must be migrated. This process takes approximately 15 minutes for 100 v4 shares. If this is also the first run of globus-connect-server endpoint migrate4, then the expected time is approximately 20 minutes for 100 v4 shares.

  • On successive runs of globus-connect-server endpoint migrate4 --finalize, the endpoint ID migration is reverified. This process takes approximately 2 minutes for 100 v4 shares.

1.1.4. User impact

All transfers that are active during the finalization step, including paused transfers, will be terminated by the Globus Transfer service, and users will be notified that their jobs were cancelled.

1.1.5. Pause Rules

Pause rules associated with the v4 endpoint are not transferred to the v5 endpoint.

1.1.6. Deletion of v4 endpoint and configuration

The v4 endpoint configuration in the Globus Transfer service is deleted when migration finalization is completed. The local v4 configuration is also removed when the globus-connect-server-cleanup command is run during migration.

1.2. Prerequisites

To migrate Globus Connect Server you will need access to the existing data transfer nodes, as well as access to the globusid account used to create the v4 endpoint. You will be required to log in with those administrator credentials in order to read the state necessary to migrate the endpoint.

You must have root access to the machine on which you will be migrating the endpoint. This is needed to install the Globus Connect Server v5.4 software as well as configure and enable the services.

1.3. Supported Linux distributions

Globus Connect Server version 5 is currently supported on the following Linux distributions:

  • CentOS 7, 8 Stream, 9 Stream

  • Rocky Linux 8, 9

  • AlmaLinux 8, 9

  • Springdale Linux 8, 9

  • Oracle Linux 8, 9

  • Debian 11, Debian 12

  • Fedora 37, 38

  • Red Hat Enterprise Linux 7, 8, 9

  • Ubuntu 20.04 LTS, 22.04 LTS, 22.10, 23.04

2. Migration Steps

The steps to migrate an endpoint are:

  • Install migration tools and use them to create endpoint configuration migration documents.

  • Uninstall Globus Connect v4 from the endpoint you will be performing the migration on.

  • Install a new Globus Connect v5.4 endpoint.

  • Configure identity mappings to use with your Globus Connect Server v5.4 endpoint.

  • Use the globus-connect-server endpoint migrate4 command to transfer the endpoint configuration to your new endpoint.

  • Finalize the migration by resetting the Globus Transfer endpoint record to point to your new endpoint.

  • Update other data transfer nodes to Globus Connect Server v5.4 and add them as endpoint nodes.

These steps are described in more detail in the following sections.

2.1. Install the gcs-migration package

Install the following package to be able to create and configure your migration plan. This command sequence creates a Python virtual environment, activates it, and installs the gcs-migration tool into it.

Install gcs-migration
python3 -m venv gcs-migration
. ./gcs-migration/bin/activate
pip install --upgrade pip
pip install --extra-index-url \
    "https://downloads.globus.org/globus-connect-server/stable/wheels/" \
    gcs-migration
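The gcs-migration commands in the rest of this guide assume the virtual environment is active in your current shell. If you open a new shell, reactivate it first, or invoke the tool by its full path (as in the sudo examples later in this guide); the --help flag here is assumed standard CLI behavior, used only as a smoke test:

. ./gcs-migration/bin/activate            # re-activate in a new shell
./gcs-migration/bin/gcs-migration --help  # or invoke by full path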

2.2. Create the migration plan

2.2.1. Log in to Globus with gcs-migration

Log in to Globus with the migration tool to obtain a token so that the tool can access the configuration of the existing v4 endpoint. Use the command gcs-migration login to perform this step.

This command displays an https link to the Globus login page to perform authentication. Open that link in a browser, then use the Globus ID that was used to create the v4 endpoint to log in. Paste the resulting authentication code to the gcs-migration command to complete the login process.

Important

You MUST log in with the Globus ID used to create the v4 endpoint. If you use your normal login identity, you may not be able to read all of the configuration state of the endpoint and the migration may be incomplete. It may be helpful to follow the link in a private browsing tab to avoid linking your current Globus identity with the v4 endpoint’s globusid.
gcs-migration login
Please authenticate with Globus here:
------------------------------------
https://auth.sandbox.globuscs.info/v2/oauth2/authorize?client_id=bce18c17-43c6-42cd-9158-b0706f481fb6&redirect_uri=https%3A%2F%2Fauth.sandbox.globuscs.info%2Fv2%2Fweb%2Fauth-code&scope=openid+profile+email+urn%3Aglobus%3Aauth%3Ascope%3Atransfer.api.globus.org%3Aall&state=_default&response_type=code&access_type=offline&prompt=login&session_required_single_domain=globusid.org
------------------------------------

Enter the resulting Authorization Code here: Equ2ooyayu8BooitaeK3ohOdah8Xie

You can later remove the access token from the local cache by using the gcs-migration logout command.
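For example, once migration is complete:

gcs-migration logout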

2.2.2. Create migration plan documents

Create the migration plan documents for your endpoint. This step parses the local globus-connect-server.conf and GridFTP configuration files on your host, and also obtains information from the Globus Transfer service about the endpoint and all shared endpoints created on it.

gcs-migration create
Parsing endpoint state...
Parsing endpoint roles...
Enumerating shared endpoints...
Parsing shared endpoint state...
Writing endpoint state to disk...

At any point after this is completed, you can create a snapshot of the migration plan so you can roll back any changes you make. See Snapshots for more information.

2.3. Update Sharing State Files

When you create the migration plan, you may see the following diagnostic in the output:

Unable to read sharing state files. In most cases this is not a problem, but
please read the documentation at
https://docs.globus.org/globus-connect-server/v5.4/migration4-guide/#sharing-state-files
to determine if you need to update your migration plan to set the collection
base paths for your guest collections

If you did not see this diagnostic, you may skip to the next section.

Sharing state files are created on the endpoint when a shared endpoint is created. Each file contains the path that acts as the root of the shared endpoint. This information is also stored in the Globus Transfer service. In most cases, these two places have the same information. However, in some cases when a directory tree was moved, Globus Support may have instructed a user or admin to change the contents of the sharing state file. This is not a common occurrence, but it has happened on some endpoints.

When the gcs-migration create command runs, it tries to read the sharing state files. This will not work if the account running that command does not have access to read the files or they do not exist. In that case, the migration tools fall back to the value stored in the Globus Transfer service. If a shared collection’s root has been changed manually, the migrated values will not be correct.

To mitigate this, you can use the gcs-migration sharing-state load command to load the sharing state files for shared endpoints.

Depending on the site configuration, these files may be located in a user’s home directory or in a shared configuration directory, and will be owned by users who have created shares.

2.3.1. Sharing state is readable by root

If the sharing state files are on a filesystem that is readable by the root user, you can use the command:

sudo -E ./gcs-migration/bin/gcs-migration sharing-state load
Loaded sharing state files for shared endpoints:
    aab540c0-a5bf-4544-a9a7-c3ba98bd1da4
    22142700-de5a-4354-901a-23cfd9082d23

2.3.2. Sharing state is not readable by root

If the filesystem is not readable by root, you can use the following command to enumerate the unparsed shared state files:

gcs-migration sharing-state list
User      | ID                                   | Path                                                             | Base Path
--------- | ------------------------------------ | ---------------------------------------------------------------- | ---------------
username1 | aab540c0-a5bf-4544-a9a7-c3ba98bd1da4 | $HOME/.globus/sharing/share-aab540c0-a5bf-4544-a9a7-c3ba98bd1da4 | /home/username1

This displays the local account that owns the shared endpoint, the shared endpoint ID, and the path to the sharing state file. You can use either of the two following methods to parse the contents of the sharing state file.

Sudo to user

You can then run the command gcs-migration sharing-state read as the local user to obtain the values for the base path of those endpoints. The output of the command is the value of the base path. Note that since this is done as another user, it cannot update the migration plan, so you’ll need to perform that step next.

sudo -E -u username1 ./gcs-migration/bin/gcs-migration sharing-state read aab540c0-a5bf-4544-a9a7-c3ba98bd1da4
/project1
Parse manually

Alternatively, you can read the file at the given path out of band (noting that $HOME, ~, and $USER must be interpreted based on the POSIX home directory and username of the collection owner). The contents of the file contain some comments and then a line with the string share_path followed by a quoted string containing the actual path.

From the previous example, this would look like

#
# This file is required in order to enable GridFTP file sharing.
# If you remove this file, file sharing will no longer work.
#

share_path "/project1"
Update migration plan

After obtaining the correct value for the base path, you can then use the gcs-migration sharing-state set-base-path command to set the base path for a collection in the migration plan.

gcs-migration sharing-state set-base-path aab540c0-a5bf-4544-a9a7-c3ba98bd1da4 /project1
Important

Prior to migrating an endpoint with a single data transfer node, you should make backups of your system configuration, especially the files in

  • /etc/gridftp.d

  • /etc/grid-security

  • /etc/myproxy.d

  • /var/lib/globus-connect-server

You can restore a v4 service using those backups prior to the finalization step.
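A minimal sketch of such a backup (the archive name is arbitrary; the paths are those listed above, and any that do not exist on your system can be omitted):

sudo tar czf gcsv4-config-backup.tar.gz \
    /etc/gridftp.d \
    /etc/grid-security \
    /etc/myproxy.d \
    /var/lib/globus-connect-server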

2.4. Remove data transfer node from the endpoint

On the data transfer node, run the following command to remove it from the endpoint configuration. If you use a non-standard location for your globus-connect-server.conf file, replace the path in the command with the path to your configuration file. You MUST NOT use the -d option to globus-connect-server-cleanup as that will completely delete your endpoint and make migration impossible.

Remove data transfer node
sudo globus-connect-server-cleanup -c /etc/globus-connect-server.conf

2.5. Uninstall Globus Connect Server v4

On that node, uninstall the Globus packages from your system.

CentOS, Red Hat Enterprise Linux, and Fedora
sudo yum remove '*globus*' 'myproxy*'
Debian and Ubuntu
sudo apt-get remove '.*globus.*' '^myproxy.*'

2.6. Install Globus Connect Server v5.4 software

Skip to the appropriate section for your Linux distribution and follow the instructions to install Globus Connect Server v5.4 on your system.

CentOS 7 and Red Hat Enterprise Linux 7

Install the EPEL and Globus repositories:

sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo yum install https://downloads.globus.org/globus-connect-server/stable/installers/repo/rpm/globus-repo-latest.noarch.rpm

Install Globus Connect Server:

sudo yum install globus-connect-server54

CentOS 8 Stream, Rocky Linux 8, AlmaLinux 8, Springdale Linux 8, Oracle Linux 8, and Red Hat Enterprise Linux 8

Install the EPEL and Globus repositories:

sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo dnf install https://downloads.globus.org/globus-connect-server/stable/installers/repo/rpm/globus-repo-latest.noarch.rpm

Ensure the mod_auth_openidc module stream is disabled, as it will conflict with packages in the Globus repository:

sudo dnf module disable mod_auth_openidc

Install the DNF config manager:

sudo dnf install 'dnf-command(config-manager)'

Install Globus Connect Server:

sudo dnf install globus-connect-server54

CentOS 9 Stream, Rocky Linux 9, AlmaLinux 9, Springdale Linux 9, Oracle Linux 9, and Red Hat Enterprise Linux 9

Install the EPEL and Globus repositories:

sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
sudo dnf install https://downloads.globus.org/globus-connect-server/stable/installers/repo/rpm/globus-repo-latest.noarch.rpm

Install the DNF config manager:

sudo dnf install 'dnf-command(config-manager)'

Install Globus Connect Server:

sudo dnf install globus-connect-server54

Fedora

Install the Globus repository:

sudo dnf install https://downloads.globus.org/globus-connect-server/stable/installers/repo/rpm/globus-repo-latest.noarch.rpm

Install Globus Connect Server:

sudo dnf install globus-connect-server54

Debian and Ubuntu

Install the Globus repository and Globus Connect Server:

curl -LOs https://downloads.globus.org/globus-connect-server/stable/installers/repo/deb/globus-repo_latest_all.deb
sudo dpkg -i globus-repo_latest_all.deb
sudo apt-key add /usr/share/globus-repo/RPM-GPG-KEY-Globus
sudo apt update
sudo apt install globus-connect-server54

2.7. Create the endpoint

To create the endpoint, run the globus-connect-server endpoint setup command.

Example
globus-connect-server endpoint setup "My GCSv5.4 Endpoint" \
    --organization "Example Organization" \
    --owner admin@example.edu \
    --contact-email support@example.edu

The command returns information about the endpoint that may be useful for additional configuration later, including the domain name of the endpoint, a link to send to subscription managers to set the endpoint as managed, and the redirect URI needed if Google Drive or Cloud connectors will be used with this endpoint.

2.8. Set up services on the data transfer node

Run the globus-connect-server node setup command to configure and start the Globus services on the data transfer node. This command must be run as the root user, as it enables and starts systemd services. The deployment-key.json file from the previous step will be used by this command.

Example
sudo globus-connect-server node setup

2.9. Set the endpoint as managed

If your v4 endpoint is managed, you (or your organization’s subscription manager) can use the Globus web application to set the v5.4 endpoint as managed. This should be done now, especially if you have roles for your endpoint or have shared endpoints created on it.

Have your subscription manager visit the endpoint page at https://app.globus.org/file-manager/collections/ENDPOINT-ID/overview/subscription with ENDPOINT-ID being the value printed after running globus-connect-server endpoint setup. If you do not know who your subscription manager is, please email support@globus.org.

2.10. (Optional) Configure the Globus OIDC Server

If your Globus Connect Server v4 endpoint used either the MyProxy or MyProxy OAuth identity providers, and you want your users to be able to authenticate with your endpoint using the PAM login configuration on the data transfer node, you will probably want to configure the Globus OIDC server. This service provides functionality similar to the MyProxy CA, but uses the standard OpenID Connect protocol.

You may also register an existing OIDC server to allow access to your endpoint from its asserted domains.

If your users already log in to Globus another way (for example, your institutional login page is available from the auth.globus.org login page, or your users log in to Globus using Google), then you may not need to configure this service.

If you are unsure whether you need to configure this service, please contact support@globus.org.

To install this service, follow the steps in the Globus OIDC Installation Guide.

2.11. Configure Identity Mapping

Globus Connect Server v5.4 uses Globus Auth for all authentication, instead of the mix of CILogon, MyProxy, MyProxy OAuth, and per-user X.509 certificates supported by v4. It also uses a different method for mapping identities from the Globus Auth identity namespace to the local system identities. To migrate the endpoint, you’ll need to configure how identities are mapped. First, use the gcs-migration identity-mapping list command to list the types of identities you had configured on your v4 endpoint. Each of those will need to be handled in slightly different ways.

2.11.1. CILogon

If your endpoint is configured to use a CILogon provider, the output of the command gcs-migration identity-mapping list will look something like this:

gcs-migration identity-mapping list
mapping 0: (requires attention)
    type: CILogon
    subject: /DC=org/DC=cilogon/C=US/O=University of Example
    organization: University of Example
    domains: [example.edu*]

This example shows a v4 endpoint that uses the CILogon provider for "University of Example", which uses the domain example.edu. The domain is marked with a *, which means it is not currently configured for use by your v5 endpoint, so you’ll need to configure the endpoint to allow users from this domain to access the endpoint. There may be one or more domains listed.

Use the command gcs-migration identity-mapping cilogon to configure this mapping.

Note

If there is only one CILogon mapping listed and it has only one organization domain, there is no need to pass any other options to this command.
gcs-migration identity-mapping cilogon

If there are multiple CILogon mappings listed or the CILogon organization provides multiple domains, you can use the following command-line options to customize how the identity mapping is configured:

--organization ORGANIZATION, -o ORGANIZATION

This option sets the CILogon organization to add a mapping for. This option is only needed if your Globus Connect Server v4 endpoint is manually configured with multiple CILogon callouts.

--domain DOMAIN, -d DOMAIN

This option may be passed multiple times to select which of the domains used by an organization are allowed to access your endpoint. This is only needed if the organization supports multiple domains. In that case, you can select a subset of those domains by using this option. If there is only one domain used by the organization, this is not needed.

--identity-mapping JSON_OR_FILE, -i JSON_OR_FILE

Add an identity mapping document to the storage gateway configuration. This may be needed if you have a complicated configuration with multiple domains that have overlapping identities or if you need a special mapping between Globus Auth usernames and the local system usernames. The argument to this option is either a path to a json file containing the mapping, or the text of the mapping itself. This option may be passed multiple times to add multiple mappings to the storage gateway.

--dont-update-guest-collections, -D

This option is a flag to skip updating the guest collections to set their owners to identities from the CILogon domain.

2.11.2. MyProxy and MyProxy OAuth

If your endpoint is configured to use MyProxy CA or MyProxy OAuth to issue certificates for your users, the output of the command gcs-migration identity-mapping list will look something like this:

gcs-migration identity-mapping list
mapping 0: (requires attention)
    type: MyProxy
    subject: /O=Globus Connect Server/CN=endpoint.example.org
    domains: []

This example shows a v4 endpoint that uses the myproxy server with the subject /O=Globus Connect Server/CN=endpoint.example.org. Unlike CILogon, where the migration tool can normally determine the organization and domains used by the identity provider, for a MyProxy or MyProxy OAuth configuration, you’ll need to pass information about the domains to use for authentication to the migration tools. In this case, you see that the domain list is empty, so you’ll need to configure this mapping.

The domain may come from an existing Globus Auth identity provider, or from a Globus OIDC server running either on this or another endpoint operated by your organization. In either case, you’ll need to ensure that you know the following:

  • The domain or domains which the identity provider issues identities for.

    • A provider may issue identities from multiple domains. For example, a university’s identity provider may have distinct domains for faculty and students. In these cases, you’ll need to determine which domain or domains are appropriate to map for your endpoint.

  • The mapping from the identity data to the storage system’s user namespace. In some cases, there are trivial mappings (username@example.edu maps to username) and in some cases you’ll need to supply either an expression-based mapping or provide a callout program to provide the mapping for a given identity. See the Identity Mapping Guide for information about configuring identity mappings in Globus Connect Server v5.4.
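For instance, the trivial mapping described above (username@example.edu maps to username) could be written as an expression-based identity mapping document like the following sketch; the example.edu domain is illustrative, and the document format follows the expression_identity_mapping example shown later in this guide:

{
  "DATA_TYPE": "expression_identity_mapping#1.0.0",
  "mappings": [
    {
      "source": "{username}",
      "match": "(.*)@example\\.edu",
      "output": "{0}"
    }
  ]
}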

If you are using a new identity provider (such as a Globus OIDC server), some of the user identities may not yet be created in the Globus Auth system. In that case, you can use the command line option --provision to ensure that identities are issued for the usernames used by the endpoint’s shares.

As an example, if we want to configure our endpoint to use a Globus OIDC server that we set up to provide identities in the example.org domain, in place of the MyProxy or MyProxy OAuth server configured in the previous example, we can use the following command:

gcs-migration identity-mapping myproxy --domain example.org --provision

If there are multiple MyProxy OAuth servers configured on your endpoint, or you need to use multiple domains, you can use the following options to customize the configuration:

--subject SUBJECT, -s SUBJECT

The X.509 Subject name of the MyProxy CA. This is used to determine which MyProxy configuration from your Globus Connect Server v4 endpoint to replace the identity mappings for. This may be omitted if your configuration only supports a single MyProxy CA.

--domain DOMAIN, -d DOMAIN

This option sets the domain or domains that user identities will be from in place of the MyProxy CA. For example, if you are replacing the MyProxy configuration with an organization-wide identity provider that is already registered with Globus Auth, you would provide the domain name of identities issued by the provider. Likewise, if you are using the Globus OIDC service, you would provide the domain name of that service. If you are replacing MyProxy with multiple identity providers, or if your organization’s identity provider issues tokens from multiple domains, you can pass this option multiple times. In that case, you may need to use the identity mapping option to disambiguate identity usernames.

--identity-mapping JSON_OR_FILE, -i JSON_OR_FILE

Add an identity mapping document to the storage gateway configuration. This may be needed if you have a complicated configuration with multiple domains that have overlapping identities or if you need a special mapping between Globus Auth usernames and the local system usernames. The argument to this option is either a path to a json file containing the mapping, or the text of the mapping itself. This option may be passed multiple times to add multiple mappings to the storage gateway.

--dont-update-guest-collections, -D

This option is a flag to skip updating the guest collections to set their owners to identities from the configured domains.

--provision, -p

Provision IDs for users which have not yet accessed the identity provider.

2.11.3. Gridmap Entries

Globus Connect Server v4 also supported explicitly mapping users based on the X.509 certificate subjects to particular local accounts.

gcs-migration identity-mapping list
mapping 0: (requires attention)
    type: Gridmap
        "/O=Example Organization/CN=Some User" "local_user1"
        "/O=Example Organization/CN=Some Other User" "local_user2"
    allowed_domains: []
    v5 mappings: []

To migrate this type of configuration to v5.4, you’ll need to know the Globus identity that the users owning those certificates use to log into Globus and then create identity mappings for those users.

The gridmap file can contain multiple mappings for a given subject name. If that is the case in your gridmap file, you’ll need to use the --full-mapping option and supply subject name, local username, and the Globus identity username to create the new mapping.

If each gridmap entry is for a single local username, you can use the --subject-mapping option instead, so that you don’t need to manually specify the local username for each mapping.

As an example, if we know that Some User’s Globus account is some_user@example.org and that Some Other User’s Globus account is s.o.user@example.org, you can use the command

gcs-migration identity-mapping gridmap \
    --subject-mapping "/O=Example Organization/CN=Some User" some_user@example.org \
    --subject-mapping "/O=Example Organization/CN=Some Other User" s.o.user@exmaple.org

The gcs-migration identity-mapping gridmap command takes several options:

--full-mapping SUBJECT GLOBUS_USERNAME LOCAL_USERNAME, -f SUBJECT GLOBUS_USERNAME LOCAL_USERNAME

Add an identity mapping for the gridmap entry mapping SUBJECT to LOCAL_USERNAME, using GLOBUS_USERNAME as the Globus Auth identity username. For example, if your gridmap has an entry

"/C=US/O=Example/CN=Joe User" juser,joe

You can use the command-line option -f "/C=US/O=Example/CN=Joe User" joe.user@example.org joe to configure a mapping for the Globus Auth identity joe.user@example.org to the local user joe. This will also be able to update the guest collections created from that user’s shared endpoints to be owned by the joe.user@example.org identity.
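Spelled out as a full command, that option usage looks like this:

gcs-migration identity-mapping gridmap \
    -f "/C=US/O=Example/CN=Joe User" joe.user@example.org joe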

--subject-mapping SUBJECT GLOBUS_USERNAME, -s SUBJECT GLOBUS_USERNAME

Add an identity mapping for the gridmap entry for SUBJECT, using GLOBUS_USERNAME as the Globus Auth identity username. This form works when there is only a single mapping for the subject in the gridmap file.

--dont-update-guest-collections, -D

This option is a flag to skip updating the guest collections to set their owners to identities from the configured domains.

2.11.4. Custom Identity Mapping

If your endpoint uses a custom identity mapping callout or identity configuration, you may need to provide a custom identity mapping configuration. This allows you to add a mapping configuration into the migration plan which will be used by the v5.4 storage gateway.

As an example, if you know that there is a Globus Auth identity domain custom-identity.example.edu that represents the same user namespace as a GSI authorization callout, you can use the following to create a new mapping in the storage gateway configuration. For this you’ll need to specify the domain associated with the custom identity mapping, as well as an identity mapping document (either a filename containing a json document or the json value inline).

Contents of identity-map.json
{
  "DATA_TYPE": "expression_identity_mapping#1.0.0",
  "mappings": [
    {
      "source": "{username}",
      "match": "(.*)@custom-identity\\.example\\.edu",
      "output": "{0}"
    }
  ]
}
gcs-migration identity-mapping custom \
    --domain custom-identity.example.edu --identity-mapping identity-map.json

The gcs-migration identity-mapping custom command takes several options:

--domain DOMAIN, -d DOMAIN

Identity Provider username domain to map to local accounts. This may be provided multiple times in a single invocation of this command.

--identity-mapping JSON_OR_FILE, -i JSON_OR_FILE

Either the path to a JSON file containing the identity mapping document or an identity-mapping document.

For more information about identity mapping, see the GCSv5 Identity Mapping guide.

2.12. Check migration plan

Assuming you have added mappings for all of the types of identities you will be using on your endpoint, you can now use the gcs-migration check command to check the migration plan for completeness. If there are any issues, such as missing domains or identity mappings, or unsupported custom GridFTP configuration, they will be displayed along with a diagnostic code. The following sections describe solutions for those diagnostic codes.


AC01 (Missing principal)
AC02 (Invalid principal)
AC03 (Missing principal_type)
AC04 (Invalid principal_type)
AC05 (Missing path)
AC06 (Missing permissions)
AC07 (Invalid permissions)

The ACL has a missing or invalid value. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed ACL. If that fails to rectify the problem, contact support@globus.org for instructions on fixing the ACL file.


GC01 (Guest Collection definition is empty)
GC02 (Guest Collection definition is missing owner_id)
GC03 (Guest Collection definition has an invalid owner_id)
GC05 (Guest Collection definition is missing a user credential)
GC06 (Guest Collection definition is missing the host_path property)

The guest collection has a missing or invalid value. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed guest collection. If that fails to rectify the problem, contact support@globus.org for instructions on fixing the guest collection configuration.

GC04 (Guest Collection definition has an invalid owner_id)

The owner of the guest collection is mapped from a Globus Auth identity that is no longer valid. Either delete the v4 share and then run gcs-migration update to remove it from the migration plan, or contact support@globus.org for instructions on changing the ownership of the collection during migration.


MC01 (Mapped collection definition is empty)
MC02 (Mapped collection is missing the display_name property)
MC03 (Mapped collection is missing the force_verify property)
MC04 (Mapped collection is missing the disable_verify property)
MC05 (Mapped collection is missing the enable_https property)
MC06 (Mapped collection is missing the allow_guest_collections property)

The mapped collection has a missing or invalid value. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed mapped collection. If that fails to rectify the problem, contact support@globus.org for instructions on fixing the mapped collection configuration.


MC07 (Mapped collection has sharing disabled but sharing_users_allow is set)
MC08 (Mapped collection has sharing disabled but sharing_users_deny is set)
MC09 (Mapped collection has sharing disabled but sharing_groups_allow is set)
MC10 (Mapped collection has sharing disabled but sharing_groups_deny is set)

The mapped collection configuration is inconsistent. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed mapped collection. If that fails to rectify the problem, contact support@globus.org for instructions on fixing the mapped collection configuration.


MC11 (Mapped collection is allowing a custom GridFTP client certificate)

The GridFTP configuration specified a custom sharing certificate subject. This is not supported in GCSv5.


MC12 (Mapped collection uses an unsupported connector)

The GridFTP configuration specified a connector that is not supported for migration at this time. This is not supported by the migration tool. Contact support@globus.org to receive information on when migrating that connector will become available.


MC13 (GridFTP configuration uses functionality that is changed in GCSv5)

The GridFTP configuration used a feature that is not supported in GCSv5, but there is similar functionality that can be implemented using the GCS Manager API. This diagnostic is non-fatal and migration can continue, but you will need to manually configure these features if you want them to be available in your GCSv5 endpoint.

Feature: banner
Configuration options: banner, banner_file, banner_terse, login_msg, login_msg_file
New feature: After migration is completed, use the globus-connect-server collection update command to set the user_message and/or user_message_link properties on the mapped collection.

Feature: network
Configuration options: control_interface, data_interface, hostname, port_range, $GLOBUS_TCP_SOURCE_RANGE
New feature: You can set these per data transfer node, either at setup via command-line options to globus-connect-server node setup, or after the node is configured via command-line options to globus-connect-server node update.
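As a sketch of the banner replacement, assuming the v5.4 collection update options --user-message and --user-message-link (COLLECTION_ID and the message text are placeholders):

globus-connect-server collection update COLLECTION_ID \
    --user-message "Scheduled maintenance this Saturday" \
    --user-message-link "https://example.edu/maintenance"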


MC14 (GridFTP configuration uses functionality that is not supported in GCSv5)

The GridFTP configuration used a feature that is not supported in GCSv5, but it should not conflict with GCSv5 operation. This diagnostic is non-fatal and migration can continue, but you will need to manually configure these features if you want them to be available in your GCSv5 endpoint.

If you want to use these features with GCSv5, you can enable them manually but their configuration will not be automatically synchronized between data transfer nodes by Globus Connect Server.

To enable them, write the configuration directives into new files in the /etc/gridftp.d/ directory after setting up the GCSv5 data transfer nodes. These files must not have names that start with the string globus-connect-server or they may be overwritten by GCSv5.
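For example, a sketch of carrying over a connection-tuning directive from the table below (the file name and value are illustrative; copy the actual directives from your v4 configuration):

sudo tee /etc/gridftp.d/site-tuning <<'EOF'
# Carried over from the v4 GridFTP configuration; not managed by GCSv5.
connections_max 200
EOF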

Feature                                         | Configuration options
----------------------------------------------- | ---------------------
Network manager                                 | xnetmgr
Logging configuration                           | log_module, log_unique, log_transfer, log_filemode
Debugging configuration                         | exec, fork, fork_fallback, single, debug, ignore_bad_threads, bad_signal_exit, $*_DEBUG
Custom usagestats target                        | usagestats
UDT protocol support                            | udt
Allowing users to be mapped to the root user    | allow_root
Allowing users with disabled accounts to log in | allow_disabled_login
Tuning network connections                      | connections_max, control_preauth_timeout, control_idle_timeout, connections_disabled, offline_msg


MC15 (GridFTP configuration uses functionality that conflicts with GCSv5 configuration)

The GridFTP configuration used a feature that conflicts with the rest of the GCSv5 system and will not be available in the migrated endpoint. You can continue to migrate your endpoint, but the functionality will differ from your GCSv4 endpoint. Contact support@globus.org if you have questions.

Feature                                                           | Configuration options
----------------------------------------------------------------- | ---------------------
SSH FTP                                                           | ssh
Running the GridFTP server in a chroot                            | chroot_path
Configuring the GridFTP server as a striped server                | ipc_interface, ipc_allow_from, ipc_deny_from, secure_ipc, ipc_auth_mode, ipc_user_name, ipc_subject, ipc_credential, ipc_cookie, ipc_port, ipc_idle_timeout, ipc_connect_timeout, remote_nodes, hybrid, data_node, stripe_blocksize, stripe_count, brain, stripe_layout, stripe_blocksize_locked, stripe_layout_locked, stripe_mode
Allowing anonymous FTP access to the GridFTP server               | allow_anonymous, anonymous_names_allowed, anonymous_user, anonymous_group
Password authentication to the GridFTP server                     | pw_file
Control-channel management of sharing policies and commands       | sharing_control, disable_command_list
Custom network and driver stacks                                  | allowed_modules, dc_whitelist, fs_whitelist, popen_whitelist, dc_default, fs_default
Custom authorization modules                                      | cas, acl
Custom process environment and user                               | auth_level, process_user, process_group, threads, inetd, daemon, detach, pidfile, use_home_dirs, home_dir
Custom allow/deny IP ranges for control connections               | allow_from, deny_from
Allowing symlinks to leave the path restrictions for a collection | rp_follow_symlinks
Setting a non-default globus location                             | globus_location


MC16 (GridFTP configuration uses process_user which conflicts with GCSv5 configuration)

The GridFTP configuration for a BlackPearl endpoint specifies a process_user for the GridFTP process. This functionality changed in GCSv5 and the process_user setting is no longer necessary. This is a non-fatal warning for informational purposes.


MP01 (Migration plan is missing)
MP02 (Storage gateway is missing from migration plan)
MP03 (Mapped collection is missing from migration plan)

Run gcs-migration update. If these errors persist, contact support@globus.org.


MP04 (Guest collections owned by identity not in allowed domains)

One or more guest collections in the migration plan have been configured with an owner identity that is not in the storage gateway’s allowed domains. Run gcs-migration set-guest-collection-owner to change the ownership of the guest collection or use gcs-migration identity-mapping to add the owner’s domain to the storage gateway’s allowed domains.


MP05 (Shared endpoints are not supported)

Migrating shared endpoints is not supported with this version of the migration tools.


MP06 (Migration plan has disabled guest collections)

The migration plan contains guest collections that were disabled using an older version of gcs-migration. That feature has been removed; all v4 shares must either be migrated or deleted before migration. Run gcs-migration update which will re-enable those guest collections. Optionally, you can delete the v4 share before running gcs-migration update which will remove the guest collection from the migration plan.


MP07 (Migration plan has greater than 100 guest collections)

This version of gcs-migration only supports migration of up to 100 GCSv4 shared endpoints.


RO01 (Role is missing a principal)
RO02 (Role has an invalid principal)
RO03 (Role is missing a principal_type)
RO04 (Role has an invalid principal_type)
RO05 (Role is missing a role type)
RO06 (Role has an invalid role type)

The given role is invalid. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed role list. If that fails to rectify the problem, contact support@globus.org for instructions on creating role definitions in your migration plan.


SG01 (Storage Gateway is empty)
SG02 (Storage Gateway is missing display_name)
SG03 (Storage Gateway has empty display_name)

The Storage Gateway definition is missing or incomplete. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed storage gateway. If that fails to rectify the problem, contact support@globus.org for instructions on fixing the storage gateway configuration.


SG04 (Storage Gateway is missing allowed_domains)
SG05 (Storage Gateway has empty allowed_domains)
SG06 (Storage Gateway has multiple allowed domains without configured mappings)

The Storage Gateway does not have any allowed domains configured. See the section Configure Identity Mapping to configure the identity mapping.


SG07 (Storage Gateway uses an unsupported connector)

The endpoint’s connector is not supported by the migration tool. Contact support@globus.org for information on when that connector will become available for migration.


SG08 (Storage Gateway is missing s3_endpoint)
SG09 (Storage Gateway is missing s3_user_credential_required)

The S3 configuration is missing a value for one of its configuration variables. Contact support@globus.org for help fixing this issue.


SG10 (Storage Gateway is missing bp_access_id_file)
SG11 (BlackPearl AccessIDFile is missing from the system)

The BlackPearl AccessIDFile either does not exist or its path has not been set in the migration plan. Make sure the AccessIDFile exists and run gcs-migration update.

SG12 (Storage Gateway ceph_admin_bucket is unsupported)

The ceph_admin_bucket property is set in the v4 connector configuration, but this is not yet supported on GCSv5. Contact support@globus.org if you need this supported and we can add it to our future plans.

SG13 (Storage Gateway is missing ceph_admin_key_id)
SG14 (Storage Gateway ceph_admin_key_id is empty)
SG15 (Storage Gateway is missing ceph_admin_secret_key)
SG16 (Storage Gateway ceph_admin_secret_key is empty)

The Ceph configuration file is missing a value for one of its configuration variables. Contact support@globus.org for help fixing this issue.

SG12 (Storage Gateway is missing login_name)
SG13 (Storage Gateway login_name is not valid)

Valid values for the HPSS login name changed since GCSv4. In GCSv5, the connector will use 'hpssftp' which is not configurable. However, if the value is not set in the migration plan, there may be an issue with the plan. Run gcs-migration update to fix it. If the GCSv4 endpoint used a different login_name, this message is only a warning.


SG14 (Storage Gateway is missing authentication_mech)
SG15 (Storage Gateway authentication_mech is not valid)

The authentication mechanism could not be determined. Set AuthenticationMech in /var/hpss/etc/gridftp_hpss_dsi.conf or set HPSS_API_AUTHN_MECH or HPSS_PRIMARY_AUTHN_MECH in /var/hpss/etc/env.conf and run gcs-migration update to fix it. The only supported authentication mechs in GCSv5 are 'unix' and 'krb5'. If your GCSv4 endpoint uses a different authentication mechanism, contact support@globus.org.


SG16 (Storage Gateway is missing authenticator)
SG17 (Storage Gateway authenticator is not valid)
SG18 (Storage Gateway authenticator is missing from the system)

The authenticator must use one of the values 'auth_keyfile:/<path>' or 'auth_keytab:<path>' and '<path>' must exist on the local system. Set Authenticator in /var/hpss/etc/gridftp_hpss_dsi.conf or set HPSS_PRIMARY_AUTHENTICATOR in /var/hpss/etc/env.conf and run gcs-migration update to fix it.


SG19 (Storage Gateway is missing uda_checksum)

The configuration for UDA checksums could not be determined. Check UDAChecksumSupport in /var/hpss/etc/gridftp_hpss_dsi.conf and run gcs-migration update to fix it.


UC01 (Credential is empty)
UC02 (Credential is missing username)

The User Credential is missing. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed credential. If that fails to rectify the problem, contact support@globus.org for instructions on creating the credential file.


UC03 (Credential is missing identity_id)
UC04 (Credential is missing identity_username)

The User Credential does not match any configured mapping. Review the steps in Configure Identity Mapping to configure a mapping for the guest collection.


UC05 (Credential maps to an invalid local user)

The User Credential is mapped to a local user account which does not exist. This may be because the user mapping is incorrect, or the user who created the guest collection no longer has an account on the system. Either delete the v4 share to prevent migrating it, add the local user, or contact support@globus.org for instructions on how to manually edit the credential for a user.


UC06 (Credential is missing s3_key_id)
UC07 (Credential is missing s3_secret_key)

The User Credential object for an S3 endpoint is missing a value for its key information. Use the command gcs-migration s3-credential-create to update the credential.


UC08 (Credential is missing access_id)
UC09 (Credential is missing secret_key)

The User Credential object for a BlackPearl endpoint is missing a value for its key information. Make sure the user has an entry in the AccessIDFile and use the command gcs-migration update.


If the check command completes without any diagnostic codes, or with only diagnostics MC13, MC14, MC15, or MC16, you can now migrate your endpoint.

2.13. Update the migration plan

Prior to performing Apply migration plan, if there are any changes to the endpoint, you can run the command gcs-migration update to fetch any changes, such as new roles or metadata updates. If you are performing the migration during a maintenance period where there are no changes, this can be skipped. We recommend that you run gcs-migration snapshot create prior to running update so that you have a version to roll back to in the case of errors. See Snapshots for more information on creating and rolling back to previous states.
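For example (the description text is illustrative):

gcs-migration snapshot create --description "Before final update"
gcs-migration update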

Important

This command must be run on the GCSv4 node.

2.14. Apply migration plan

Now, the plan is ready to be applied to your endpoint. Use the following command to create the collections, roles, and ACLs on your v5 endpoint.

If you are migrating to a different machine, you must now copy the migration plan to the GCSv5 endpoint. Ensure that the migration_plan directory and all of its contents are readable by the root user in order to be able to complete the next step.
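A sketch of copying the plan (the host name and account are placeholders; any method of copying the directory works):

scp -r migration_plan admin@v5-host.example.org:~/

Then, on the v5 endpoint, run the migration command below from the directory containing migration_plan (reading the plan from the working directory is an assumption; adjust if your deployment differs):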

sudo globus-connect-server endpoint migrate4

When this completes, your v5 endpoint now has all of the configuration properties transferred to it and should contain a mapped collection that has similar behavior to the v4 endpoint.

2.15. Test endpoint

You should be able to list directories and perform transfers against this endpoint. Test it out and make sure it behaves as expected. If not, you can revisit the previous steps to adjust the configuration and then rerun the commands in Apply migration plan. This will replace the configuration of the v5.4 endpoint with the updated configuration, adding or deleting components from your endpoint as needed.

2.16. Finalize the Globus Connect Server v5.4 migration

The final step in the migration replaces the temporary endpoint ID created during the migration with the ID of the old Globus Connect Server v4 endpoint. This will allow web bookmarks to continue to work against the endpoint. To do this, run the following command on the migrated data transfer node.

Finalize migration
sudo globus-connect-server endpoint migrate4 --finalize

2.17. Update other data transfer nodes

Now, the Globus Connect Server v5.4 endpoint is working and has replaced the old endpoint in the Globus Transfer service. The other data transfer nodes will no longer be used as part of the endpoint. To make them part of the new Globus Connect Server v5.4 endpoint, follow the steps in Uninstall Globus Connect Server v4 and Install Globus Connect Server v5.4 software on those nodes. Then copy the deployment key file from your migrated data transfer node to the other data transfer nodes and follow the steps in Set up services on the data transfer node on those nodes.
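A sketch of that last step for one additional node (host names are placeholders, and the --deployment-key option of node setup is assumed to accept the path of the copied file):

# On the migrated node: copy the deployment key to another data transfer node.
scp deployment-key.json admin@dtn2.example.org:~/

# On dtn2, after installing Globus Connect Server v5.4:
sudo globus-connect-server node setup --deployment-key deployment-key.json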

3. Post Migration

This section includes steps that cannot be automated during migration. They can be completed manually after the migration is finalized.

3.1. Set Collection Advertised Owner

A collection’s advertised owner allows users to find the collection more easily when using the search API in the Globus transfer service. The advertised owner string can only be set by a user with an admin role on the collection, and it can only be set to an identity belonging to the caller. This can be done either via the web app or the command line.

Web App

Visit the Overview page for the collection, then click on Edit Attributes. Select an identity from your identity set to update the Advertised Owner attribute.

Command Line

Use the globus-connect-server collection set-owner-string command to set the advertised owner on the migrated mapped collection.

Example globus-connect-server collection set-owner-string
$ globus-connect-server collection list
ID                                   | Display Name                    | Owner                   | Collection Type | Storage Gateway ID
------------------------------------ | ------------------------------- | ----------------------- | --------------- | ------------------------------------
26dca09d-4fc1-4ea1-8232-fd18432cae68 | Migration: v4_migrated_endpoint | joe.admin@example.org   | mapped          | e1c68cf2-f7b7-4d1b-98c8-5b8cff84cb5a
$ globus-connect-server collection set-owner-string 26dca09d-4fc1-4ea1-8232-fd18432cae68 joe.admin@example.org
Message: Updated collection owner_string to joe.admin@example.org (7bb58e97-a695-4bbb-a8b5-36b709c12ab6)

4. Advanced Migrations

4.1. Combining Multiple Endpoints

Since Globus Connect Server v5.4 supports having multiple storage systems on a single physical endpoint, it is possible to combine multiple v4 endpoints with different configurations into collections on a single v5 endpoint. This allows you to have a single endpoint containing multiple storage gateways with different policies.

After you’ve migrated the first v4 endpoint, you can then create a new migration plan for another v4 endpoint. You can use the same commands to configure identity mappings and check the migration plan. Once that is complete, copy the migration plan to the v5 endpoint and run the globus-connect-server endpoint migrate4 command as before to apply the plan to the v5 endpoint.

This creates a new independent storage gateway and mapped collection for the v4 endpoint, as well as guest collections, roles, and ACLs if there were shares on the original v4 endpoint.

Each migration plan can be finalized independently and in any order.

4.2. Migrating Endpoints using the S3 Connector

When migrating an endpoint which uses the S3 connector, the gcs-migration create command parses the S3 map file and creates migration state for keys used by owners of shared endpoints. Credentials which are not used by owners of shared endpoints are not added to the migration plan by default, since the Globus identities of the owners of those keys are not easily discovered in the absence of a shared endpoint.

Use the gcs-migration s3-credential-create command to add these other keys to the migration plan. This must be done after the identity mappings are configured for the migration plan by using the gcs-migration identity-mapping commands.

This command creates credential objects in the migration plan for one or more credentials located in the S3 map file or files. By default, it creates credentials for all users in the map file. To limit this to a subset of those entries, pass the local user names as arguments to the command.

These credentials can be associated with any globus account using a domain which is allowed by the migration plan. If you are allowing multiple domains to access the migrated endpoint, you’ll need to supply the --domain option to indicate which domain to associate with the user and credentials.

For example, if an endpoint is configured to map both example.org and department.example.org domains, you can use the command

gcs-migration s3-credential-create --domain example.org user1 user2 user3

to create credentials for the user1@example.org, user2@example.org, and user3@example.org accounts, provided there are credentials in the S3 map file.

Appendix A: Authentication and Authorization Changes in Globus Connect Server v5.4

Globus Connect Server v5.4 uses different technologies to implement user authentication and authorization than Globus Connect Server v4. Determining how to handle these differences is the most complicated part of the migration. First, let’s provide an overview of these changes.

Most significantly, users of Globus Connect Server v4 use different authentication methods when accessing Globus Transfer than when activating endpoints.

Globus Connect Server v5.4 uses Globus Auth for all authentication operations, so the user identity information is consistent across all services.

The way identities are mapped in the services is also different due to the difference in the type of identity information provided to the endpoint.

A.1. Globus Connect Server v4

To access a Globus Connect Server v4 endpoint, users first activate the endpoint, which delegates an X.509 certificate to Globus Transfer. This certificate is transmitted as part of the TLS protocol to the Globus Connect Server v4 endpoint, which then maps the identity of the certificate to a local user.

The mapping may be done programmatically using a callout which parses the certificate or a mapping file which matches the subject name of the certificate with a local username.

These different mapping types are described in more detail in the following sections.

A.1.1. MyProxy Callout

The administrator configures the MyProxy plugin on the endpoint to trust the X.509 certificates issued by a specific certificate authority. The plugin parses the user certificate, extracts the Subject Name attribute, and parses the Common Name relative distinguished name as the local user name. This works well with the MyProxy service which is included as a part of Globus Connect Server v4, but could be used with any certificate authority that supports this subject naming convention.

A.1.2. CILogon Callout

The administrator configures the CILogon plugin on the endpoint to trust the X.509 certificates issued by the CILogon certificate authority which contain a specific Organization relative distinguished name. The plugin extracts the eduPersonPrincipalName extension which is interpreted as the local user name.

A.1.3. Gridmap File

The administrator manages a mapping between the Subject Name object of the X.509 certificate to a local user name. This is done on a per-certificate basis. The administrator must ensure that all Globus users have an entry for their certificate subject in the gridmap file in order to have access.

A.2. Globus Connect Server v5.4

Globus Connect Server v5.4 uses Globus Auth to authorize users. The endpoint receives an identity token after validating the user’s access token, and processes that token to determine the local user name.

The mapping can be performed either by performing expression matching on the identity information, or by running a program that consumes the identity information and returns the local user information.

Globus Connect Server v5.4 has additional policy support to require users to log in using certain identity providers or to restrict access based on local user name or group membership that is returned from the mapping. These additional policies will default to the same policies as in the Globus Connect Server v4 endpoint but can be edited later using the globus-connect-server storage-gateway update command.

When migrating a Globus Connect Server v4 endpoint to Globus Connect Server v5.4, you will need to keep the following in mind:

  • Whether you will need to use the Globus OIDC service to implement authorization, register your own OIDC server, or use existing identity provider(s).

  • What Globus Auth identity provider(s) you will require your users to have identities from.

  • How to map identities from the chosen domains to the user namespace used by your endpoint.

Appendix B: Snapshots

At any time after creating the migration plan, you can use the gcs-migration snapshot create command to create a snapshot of the migration plan. This may be useful if you make a mistake in one of the following steps, or if you need to change the Globus identity provider used for one of your Globus Connect Server v4 identity sources.

Example gcs-migration snapshot create
# gcs-migration snapshot create --description "Initial migration plan"
Created snapshot 1

You can list existing snapshots using the command gcs-migration snapshot list.

Example gcs-migration snapshot list
# gcs-migration snapshot list
ID | Timestamp                        | Description
-- | -------------------------------- | ----------------------
 1 | 2021-06-15 10:02:32.356692-04:00 | Initial migration plan

You can restore the migration plan directory to the state of a snapshot by using the gcs-migration snapshot restore command. If given a numeric argument, it restores that particular snapshot. Otherwise, it restores the snapshot with the largest ID.

Example gcs-migration snapshot restore
# gcs-migration snapshot restore 1
Restored migration plan to "Initial migration plan"