Migrating an Endpoint from Globus Connect Server v4 to v5.4
- 1. Introduction
- 2. Migration Steps
- 2.1. Install the gcs-migration package
- 2.2. Create the migration plan
- 2.3. Update Sharing State Files
- 2.4. Remove data transfer node from the endpoint
- 2.5. Uninstall Globus Connect Server v4
- 2.6. Install Globus Connect Server v5.4 software
- 2.7. Create the endpoint
- 2.8. Set up services on the data transfer node
- 2.9. Set the endpoint as managed
- 2.10. (Optional) Configure the Globus OIDC Server
- 2.11. Configure Identity Mapping
- 2.12. Check migration plan
- 2.13. Update the migration plan
- 2.14. Apply migration plan
- 2.15. Test endpoint
- 2.16. Finalize the Globus Connect Server v5.4 migration
- 2.17. Update other data transfer nodes
- 3. Post Migration
- 4. Advanced Migrations
- Appendix A: Authentication and Authorization Changes in Globus Connect Server v5.4
- Appendix B: Snapshots
Last Updated: Oct 31, 2022
1. Introduction
This document describes the procedure to convert an endpoint from Globus Connect Server v4 to v5.4. It assumes you have an existing Globus Connect Server v4 endpoint with configuration state that you want to preserve when converting to the latest Globus Connect Server software. If you are installing Globus Connect Server for the first time, please see the GCS v5.4 installation guide.
1.1. Process Overview
1.1.1. Multiple data transfer nodes
For an endpoint with multiple data transfer nodes, you’ll select one of the data transfer nodes on which to perform the migration. This node will be taken out of the endpoint configuration while you perform the migration. You will be able to test out the new configuration on the migration data transfer node while the others continue to service Globus requests. At the final step, the new configuration will replace the old endpoint in the Globus transfer service, and the other data transfer nodes will cease working with Globus. You’ll then need to replace Globus Connect Server v4 with v5.4 and then add the nodes to the endpoint.
1.1.2. Single data transfer node
For an endpoint with a single data transfer node, you’ll need to stop the Globus services on your endpoint for a longer period of time and will not be able to have both the old and new configurations running simultaneously. Other than that, the process is similar.
1.1.3. Downtime
Running gcs-migration on the Globus Connect Server v4 endpoint does not disrupt the operation of the endpoint, so no downtime is needed for that step. Running globus-connect-server endpoint migrate4 will impact services on the Globus Connect Server v5 endpoint, and when using the --finalize option it also impacts services on the Globus Connect Server v4 endpoint, so downtime should be planned accordingly. While globus-connect-server endpoint migrate4 is running, the local node will not be able to service any user requests.
The runtime of globus-connect-server endpoint migrate4 depends on the number of v4 shares being migrated and the operation being performed:
- On the first run of globus-connect-server endpoint migrate4 (without --finalize), all components in the migration plan must be created on the Globus Connect Server v5 endpoint. This process takes approximately 10 minutes for 100 v4 shares.
- On successive runs of globus-connect-server endpoint migrate4 (without --finalize), all components in the migration plan are updated on the Globus Connect Server v5 endpoint. This process takes approximately 2 minutes for 100 v4 shares.
- On the first run of globus-connect-server endpoint migrate4 --finalize, all Globus Connect Server v4 endpoint IDs must be migrated. This process takes approximately 15 minutes for 100 v4 shares. If this is also the first run of globus-connect-server endpoint migrate4, then the expected time is approximately 20 minutes for 100 v4 shares.
- On successive runs of globus-connect-server endpoint migrate4 --finalize, the endpoint ID migration is reverified. This process takes approximately 2 minutes for 100 v4 shares.
1.1.4. User impact
All transfers that are active (including paused transfers) during the finalization step will be terminated by the Globus Transfer service, and users will be notified that their jobs were cancelled.
1.2. Prerequisites
To migrate Globus Connect Server you will need access to the existing data transfer nodes, as well as access to the globusid account used to create the v4 endpoint. You will be required to log in using those administrator credentials in order to read the state necessary to migrate the endpoint.
You must have root access to the machine on which you will be migrating the endpoint. This is needed to install the Globus Connect Server v5.4 software as well as configure and enable the services.
1.3. Supported Linux distributions
Globus Connect Server version 5 is currently supported on the following Linux distributions:
- CentOS 7, 8 Stream, 9 Stream
- Rocky Linux 8, 9
- AlmaLinux 8, 9
- Springdale Linux 8, 9
- Oracle Linux 8, 9
- Debian 11, Debian 12
- Fedora 37, 38
- Red Hat Enterprise Linux 7, 8, 9
- Ubuntu 20.04 LTS, 22.04 LTS, 22.10, 23.04
2. Migration Steps
The steps to migrate an endpoint are:
- Install the migration tools and use them to create endpoint configuration migration documents.
- Uninstall Globus Connect Server v4 from the node you will be performing the migration on.
- Install a new Globus Connect Server v5.4 endpoint.
- Configure identity mappings to use with your Globus Connect Server v5.4 endpoint.
- Use the globus-connect-server endpoint migrate4 tool to transfer the endpoint configuration to your new endpoint.
- Finalize the migration by resetting the Globus Transfer endpoint record to point to your new endpoint.
- Update other data transfer nodes to Globus Connect Server v5.4 and add them as endpoint nodes.
These steps are described in more detail in the following sections.
2.1. Install the gcs-migration package
Install the following package to be able to create and configure your migration plan. This command sequence creates a python virtual environment, activates it, and installs the gcs-migration tool in it.
python3 -m venv gcs-migration
. ./gcs-migration/bin/activate
pip install --upgrade pip
pip install --extra-index-url \
"https://downloads.globus.org/globus-connect-server/stable/wheels/" \
gcs-migration
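In any new shell session, re-activate the virtual environment before running migration commands. A quick sanity check is sketched below; the --help invocation is an assumption of standard command-line behavior, not taken from this guide.
# Re-activate the virtual environment (from the install sequence above)
. ./gcs-migration/bin/activate
# Confirm the tool is on the PATH (assumes a standard --help option)
gcs-migration --help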
2.2. Create the migration plan
2.2.1. Log in to Globus with gcs-migration
Log in to Globus with the migration tool to obtain a token so that the tool can access the configuration of the existing v4 endpoint. Use the command gcs-migration login to perform this step.
This command displays an https link to the Globus login page to perform authentication. Open that link in a browser, then use the Globus ID that was used to create the v4 endpoint to log in. Paste the resulting authentication code into the gcs-migration command to complete the login process.
gcs-migration login
Please authenticate with Globus here:
------------------------------------
https://auth.sandbox.globuscs.info/v2/oauth2/authorize?client_id=bce18c17-43c6-42cd-9158-b0706f481fb6&redirect_uri=https%3A%2F%2Fauth.sandbox.globuscs.info%2Fv2%2Fweb%2Fauth-code&scope=openid+profile+email+urn%3Aglobus%3Aauth%3Ascope%3Atransfer.api.globus.org%3Aall&state=_default&response_type=code&access_type=offline&prompt=login&session_required_single_domain=globusid.org
------------------------------------
Enter the resulting Authorization Code here: Equ2ooyayu8BooitaeK3ohOdah8Xie
You can later remove the access token from the local cache by using the gcs-migration logout command.
2.2.2. Create migration plan documents
Create the migration plan documents for your endpoint. This step parses the local globus-connect-server.conf and GridFTP configuration files on the host, and obtains information about the endpoint, its roles, and all shared endpoints created on it from the Globus Transfer service.
gcs-migration create
Parsing endpoint state...
Parsing endpoint roles...
Enumerating shared endpoints...
Parsing shared endpoint state...
Writing endpoint state to disk...
At any point after this is completed, you can create a snapshot of the migration plan so you can roll back any changes you make. See snapshots for information.
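For example, to record a baseline immediately after creating the plan (the snapshot subcommands are described in Appendix B; the description text here is illustrative):
gcs-migration snapshot create --description "Initial migration plan"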
2.3. Update Sharing State Files
When you create the migration plan, you may see the following diagnostic in the output:
Unable to read sharing state files. In most cases this is not a problem, but please read the documentation at https://docs.globus.org/globus-connect-server/v5.4/migration4-guide/#sharing-state-files to determine if you need to update your migration plan to set the collection base paths for your guest collections
If you did not see this message, you may skip to the next section.
Sharing state files are created on the endpoint when a shared endpoint is created. Each file contains the path that acts as the root of the shared endpoint. This information is also stored in the Globus Transfer service. In most cases, these two places have the same information. However, in some cases when a directory tree was moved, Globus Support may have instructed a user or admin to change the contents of the sharing state file. This is not a common occurrence, but has happened on some endpoints.
When the gcs-migration create command runs, it tries to read the sharing state files. This will not work if the account running that command does not have access to read the files or they do not exist. In that case, the migration tools fall back to the value stored in the Globus Transfer service. If a shared collection’s root has been changed manually, then the migrated values will not be correct.
To mitigate this, you can use the gcs-migration sharing-state load command to load the sharing state files for shared endpoints.
Depending on the site configuration, these files may be located in a user’s home directory or in a shared configuration directory, and will be owned by users who have created shares.
2.3.1. Sharing state is readable by root
If the sharing state files are on a filesystem that is readable by the root user, you can use the command:
sudo -E ./gcs-migration/bin/gcs-migration sharing-state load
Loaded sharing state files for shared endpoints:
aab540c0-a5bf-4544-a9a7-c3ba98bd1da4
22142700-de5a-4354-901a-23cfd9082d23
2.3.2. Sharing state is not readable by root
If the filesystem is not readable by root, you can use the following command to enumerate the unparsed sharing state files:
gcs-migration sharing-state list
User | ID | Path | Base Path
--------- | ------------------------------------ | ---------------------------------------------------------------- | ---------------
username1 | aab540c0-a5bf-4544-a9a7-c3ba98bd1da4 | $HOME/.globus/sharing/share-aab540c0-a5bf-4544-a9a7-c3ba98bd1da4 | /home/username1
This displays the local account that owns the shared endpoint, the shared endpoint ID, the path to the sharing state file, and the base path currently recorded in the plan. You can use either of the two following methods to parse the contents of the sharing state file.
Sudo to user
You can then run the command gcs-migration sharing-state read as the local user to obtain the value of the base path for those endpoints. The output of the command is the value of the base path. Note that since this is done as another user, it cannot update the migration plan, so you’ll need to perform that step next.
sudo -E -u username1 ./gcs-migration/bin/gcs-migration sharing-state read aab540c0-a5bf-4544-a9a7-c3ba98bd1da4
/project1
Parse manually
Alternatively, you can read the file at the given path out of band (noting that $HOME, ~, and $USER must be interpreted based on the POSIX home directory and username of the collection owner). The contents of the file contain some comments and then a line with the string share_path followed by a quoted string containing the actual path.
From the previous example, this would look like
#
# This file is required in order to enable GridFTP file sharing.
# If you remove this file, file sharing will no longer work.
#
share_path "/project1"
Update Migration plan
After obtaining the correct value for the base path, you can then use the command gcs-migration sharing-state set-base-path to set the base path for a collection in the migration plan.
gcs-migration sharing-state set-base-path aab540c0-a5bf-4544-a9a7-c3ba98bd1da4 /project1
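You can then re-run the listing command from earlier to confirm that the Base Path column reflects the corrected value:
gcs-migration sharing-state list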
Prior to migrating an endpoint with a single data transfer node, you should make backups of your system configuration, especially the files in:
- /etc/gridftp.d
- /etc/grid-security
- /etc/myproxy.d
- /var/lib/globus-connect-server
You can restore a v4 service using those backups prior to the finalization step.
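A minimal backup sketch using standard tools, assuming /root is an acceptable place to keep the archive (the paths are those listed above):
# Archive the GCSv4 configuration state before uninstalling
sudo tar -czf /root/gcsv4-backup.tar.gz \
    /etc/gridftp.d \
    /etc/grid-security \
    /etc/myproxy.d \
    /var/lib/globus-connect-server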
2.4. Remove data transfer node from the endpoint
On the data transfer node, run the following command to remove it from the endpoint configuration. If you use a non-standard location for your globus-connect-server.conf file, replace the path in the command with the path to your configuration file. You MUST NOT use the -d option to globus-connect-server-cleanup, as that would completely delete your endpoint and make migration impossible.
sudo globus-connect-server-cleanup -c /etc/globus-connect-server.conf
2.5. Uninstall Globus Connect Server v4
On that node, uninstall the Globus and MyProxy packages from your system. (The gcs-migration tool installed in its own virtual environment in section 2.1 is not affected by this step.)
# On RPM-based systems (CentOS, Fedora, RHEL, and derivatives):
sudo yum remove '*globus*' 'myproxy*'
# On Debian-based systems (Debian, Ubuntu):
sudo apt-get remove '.*globus.*' '^myproxy.*'
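To confirm that no v4 packages remain, you can query the package database with the standard tools for each family:
# RPM-based systems
rpm -qa | grep -i globus
# Debian-based systems
dpkg -l | grep -i globus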
2.6. Install Globus Connect Server v5.4 software
Skip to the appropriate section for your Linux distribution and follow the instructions to install Globus Connect Server v5.4 on your system.
CentOS 7 / Red Hat Enterprise Linux 7
Install the EPEL and Globus repositories:
sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo yum install https://downloads.globus.org/globus-connect-server/stable/installers/repo/rpm/globus-repo-latest.noarch.rpm
Install Globus Connect Server:
sudo yum install globus-connect-server54
CentOS 8 Stream / Rocky Linux 8 / AlmaLinux 8 / Springdale Linux 8 / Oracle Linux 8 / Red Hat Enterprise Linux 8
Install the EPEL and Globus repositories:
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo dnf install https://downloads.globus.org/globus-connect-server/stable/installers/repo/rpm/globus-repo-latest.noarch.rpm
Ensure the mod_auth_openidc module stream is disabled, as it will conflict with packages in the Globus repository:
sudo dnf module disable mod_auth_openidc
Install the DNF config manager:
sudo dnf install 'dnf-command(config-manager)'
Install Globus Connect Server:
sudo dnf install globus-connect-server54
CentOS 9 Stream / Rocky Linux 9 / AlmaLinux 9 / Springdale Linux 9 / Oracle Linux 9 / Red Hat Enterprise Linux 9
Install the EPEL and Globus repositories:
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
sudo dnf install https://downloads.globus.org/globus-connect-server/stable/installers/repo/rpm/globus-repo-latest.noarch.rpm
Install the DNF config manager:
sudo dnf install 'dnf-command(config-manager)'
Install Globus Connect Server:
sudo dnf install globus-connect-server54
Fedora
Install the Globus repository:
sudo dnf install https://downloads.globus.org/globus-connect-server/stable/installers/repo/rpm/globus-repo-latest.noarch.rpm
Install Globus Connect Server:
sudo dnf install globus-connect-server54
Debian / Ubuntu
Install the Globus repository and Globus Connect Server:
curl -LOs https://downloads.globus.org/globus-connect-server/stable/installers/repo/deb/globus-repo_latest_all.deb
sudo dpkg -i globus-repo_latest_all.deb
sudo apt-key add /usr/share/globus-repo/RPM-GPG-KEY-Globus
sudo apt update
sudo apt install globus-connect-server54
2.7. Create the endpoint
To create the endpoint, run the globus-connect-server endpoint setup command.
globus-connect-server endpoint setup "My GCSv5.4 Endpoint" \
    --organization "Example Organization" \
    --owner admin@example.edu \
    --contact-email support@example.edu
The command returns information about the endpoint that may be useful for additional configuration later, including the domain name of the endpoint, a link to send to subscription managers to set the endpoint as managed, and the redirect URI needed if Google Drive or Cloud connectors will be used with this endpoint. It also writes a deployment-key.json file, which is used in the next step.
2.8. Set up services on the data transfer node
Run the globus-connect-server node setup command to configure and start the Globus services on the data transfer node. This command must be run as the root user, as it enables and starts systemd services. The deployment-key.json file from the previous step will be used by this command.
sudo globus-connect-server node setup
2.9. Set the endpoint as managed
If your v4 endpoint is managed, you (or your organization’s subscription manager) can use the Globus web application to set the v5.4 endpoint as managed. This should be done now, especially if you have roles for your endpoint or have shared endpoints created on it.
Have your subscription manager visit the endpoint page at https://app.globus.org/file-manager/collections/ENDPOINT-ID/overview/subscription, with ENDPOINT-ID being the value printed after running globus-connect-server endpoint setup.
If you do not know who your subscription manager is, please email support@globus.org.
2.10. (Optional) Configure the Globus OIDC Server
If your Globus Connect Server v4 endpoint used either the MyProxy or MyProxy OAuth identity providers, and you want your users to be able to authenticate with your endpoint using PAM login configuration on the data transfer node, you will probably want to configure the Globus OIDC server. This service provides functionality similar to the MyProxy CA, but uses the standard OpenID Connect protocol.
You may also register an existing OIDC server to allow access to your endpoint from its asserted domains.
If there is another way your users log in to Globus (for example, your institutional login page is available from the auth.globus.org login page, or your users log in to Globus using Google), then you may not need to configure this service.
If you are unsure whether you need to configure this service, please contact support@globus.org.
To install this service, follow the steps in the Globus OIDC Installation Guide.
2.11. Configure Identity Mapping
Globus Connect Server v5.4 uses Globus Auth for all authentication, instead of supporting the mix of CILogon, MyProxy, MyProxy OAuth, and per-user X.509 certificates. It also uses a different method for mapping identities from the Globus Auth identity namespace to the local system identities. To migrate the endpoint, you’ll need to configure how identities are mapped. First, use the command gcs-migration identity-mapping list to list the types of identities you had configured on your v4 endpoint. Each of these will need to be handled in slightly different ways.
2.11.1. CILogon
If your endpoint is configured to use a CILogon provider, the output of the command gcs-migration identity-mapping list will look something like this:
gcs-migration identity-mapping list
mapping 0: (requires attention)
type: CILogon
subject: /DC=org/DC=cilogon/C=US/O=University of Example
organization: University of Example
domains: [example.edu*]
This example shows a v4 endpoint that uses the CILogon provider for "University of Example", which uses the domain example.edu. The domain is marked with a *, which means that the domain is not currently configured to be used by your v5 endpoint, so you’ll need to configure the endpoint to allow users from this domain to access it. There may be one or more domains listed.
Use the command gcs-migration identity-mapping cilogon to configure this mapping.
gcs-migration identity-mapping cilogon
If there are multiple CILogon mappings listed, or the CILogon provider supplies multiple domains, you can use the following command-line options to customize how the identity mapping is configured:
- --organization ORGANIZATION, -o ORGANIZATION
  This option sets the CILogon organization to add a mapping for. It is only needed if your Globus Connect Server v4 endpoint is manually configured with multiple CILogon callouts.
- --domain DOMAIN, -d DOMAIN
  This option may be passed multiple times to select which of the domains used by an organization are allowed access to your endpoint. This is only needed if the organization supports multiple domains. In that case, you can select a subset of those domains by using this option. If there is only one domain used by the organization, this is not needed.
- --identity-mapping JSON_OR_FILE, -i JSON_OR_FILE
  Add an identity mapping document to the storage gateway configuration. This may be needed if you have a complicated configuration with multiple domains that have overlapping identities, or if you need a special mapping between Globus Auth usernames and the local system usernames. The argument to this option is either a path to a JSON file containing the mapping, or the text of the mapping itself. This option may be passed multiple times to add multiple mappings to the storage gateway.
- --dont-update-guest-collections, -D
  This option is a flag to skip updating the guest collections to set their owners to identities from the CILogon domain.
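For example, using the organization and domain from the hypothetical listing above:
gcs-migration identity-mapping cilogon \
    --organization "University of Example" \
    --domain example.edu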
2.11.2. MyProxy and MyProxy OAuth
If your endpoint is configured to use MyProxy CA or MyProxy OAuth to issue certificates for your users, the output of the command gcs-migration identity-mapping list will look something like this:
gcs-migration identity-mapping list
mapping 0: (requires attention)
type: MyProxy
subject: /O=Globus Connect Server/CN=endpoint.example.org
domains: []
This example shows a v4 endpoint that uses the MyProxy server with the subject /O=Globus Connect Server/CN=endpoint.example.org. Unlike CILogon, where the migration tool can normally determine the organization and domains used by the identity provider, for a MyProxy or MyProxy OAuth configuration you’ll need to pass information about the domains to use for authentication to the migration tools. In this case, you see that the domain list is empty, so you’ll need to configure this mapping.
The domain may come from an existing Globus Auth identity provider, or from a Globus OIDC server running either on this or another endpoint operated by your organization. In either case, you’ll need to ensure that you know the following:
- The domain or domains which the identity provider issues identities for. A provider may issue identities from multiple domains. For example, a university’s identity provider may have distinct domains for faculty and students. In these cases, you’ll need to determine which domain or domains are appropriate to map for your endpoint.
- The mapping from the identity data to the storage system’s user namespace. In some cases, there are trivial mappings (username@example.edu maps to username) and in some cases you’ll need to supply either an expression-based mapping or a callout program to provide the mapping for a given identity. See the Identity Mapping Guide for information about configuring identity mappings in Globus Connect Server v5.4.
If you are using a new identity provider (such as a Globus OIDC server), some of the user identities may not yet be created in the Globus Auth system. In that case, you can use the command line option --provision to ensure that identities are issued for the usernames used by the endpoint’s shares.
As an example, if we wanted to configure our endpoint to use a Globus OIDC server we set up which provides identities in the example.org domain, in place of the MyProxy or MyProxy OAuth server configured in the previous example, we can use the following command:
gcs-migration identity-mapping myproxy --domain example.org --provision
If there are multiple MyProxy OAuth servers configured on your endpoint, or you need to use multiple domains, you can use the following options to customize the configuration:
- --subject SUBJECT, -s SUBJECT
  The X.509 subject name of the MyProxy CA. This is used to determine which MyProxy configuration from your Globus Connect Server v4 endpoint to replace the identity mappings for. This may be omitted if your configuration only supports a single MyProxy CA.
- --domain DOMAIN, -d DOMAIN
  This option sets the domain or domains that user identities will be from in place of the MyProxy CA. For example, if you are replacing the MyProxy configuration with an organization-wide identity provider that is already registered with Globus Auth, you would provide the domain name of identities issued by the provider. Likewise, if you are using the Globus OIDC service, you would provide the domain name of that service. If you are replacing MyProxy with multiple identity providers, or if your organization’s identity provider issues tokens from multiple domains, you can pass this option multiple times. In that case, you may need to use the identity mapping option to disambiguate identity usernames.
- --identity-mapping JSON_OR_FILE, -i JSON_OR_FILE
  Add an identity mapping document to the storage gateway configuration. This may be needed if you have a complicated configuration with multiple domains that have overlapping identities, or if you need a special mapping between Globus Auth usernames and the local system usernames. The argument to this option is either a path to a JSON file containing the mapping, or the text of the mapping itself. This option may be passed multiple times to add multiple mappings to the storage gateway.
- --dont-update-guest-collections, -D
  This option is a flag to skip updating the guest collections to set their owners to identities from the newly configured domains.
- --provision, -p
  Provision IDs for users which have not yet accessed the identity provider.
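For example, combining several of these options (the faculty and student domains and the mapping file name are hypothetical; the subject is from the listing above):
gcs-migration identity-mapping myproxy \
    --subject "/O=Globus Connect Server/CN=endpoint.example.org" \
    --domain faculty.example.org \
    --domain students.example.org \
    --identity-mapping mapping.json \
    --provision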
2.11.3. Gridmap Entries
Globus Connect Server v4 also supported explicitly mapping users to particular local accounts based on their X.509 certificate subjects.
gcs-migration identity-mapping list
mapping 0: (requires attention)
type: Gridmap
"/O=Example Organization/CN=Some User" "local_user1"
"/O=Example Organization/CN=Some Other User" "local_user2"
allowed_domains: []
v5 mappings: []
To migrate this type of configuration to v5.4, you’ll need to know the Globus identity that the users owning those certificates use to log into Globus and then create identity mappings for those users.
The gridmap file can contain multiple mappings for a given subject name. If that is the case in your gridmap file, you’ll need to use the --full-mapping option and supply the subject name, local username, and Globus identity username to create the new mapping.
If each gridmap entry is for a single local username, you can use the --subject-mapping option instead, so that you don’t need to manually specify the local username for each mapping.
As an example, if we know that Some User’s Globus account is some_user@example.org and that Some Other User’s Globus account is s.o.user@example.org, we can use the command:
gcs-migration identity-mapping gridmap \
--subject-mapping "/O=Example Organization/CN=Some User" some_user@example.org \
--subject-mapping "/O=Example Organization/CN=Some Other User" s.o.user@exmaple.org
The gcs-migration identity-mapping gridmap command takes several options:
- --full-mapping SUBJECT GLOBUS_USERNAME LOCAL_USERNAME, -f SUBJECT GLOBUS_USERNAME LOCAL_USERNAME
  Add an identity mapping for the gridmap entry mapping SUBJECT to LOCAL_USERNAME, using GLOBUS_USERNAME as the Globus Auth identity username. For example, if your gridmap has an entry
  "/C=US/O=Example/CN=Joe User" juser,joe
  you can use the command-line option -f "/C=US/O=Example/CN=Joe User" joe.user@example.org joe to configure a mapping for the Globus Auth identity joe.user@example.org to the local user joe. This will also update the guest collections created from that user’s shared endpoints to be owned by the joe.user@example.org identity.
- --subject-mapping SUBJECT GLOBUS_USERNAME, -s SUBJECT GLOBUS_USERNAME
  Add an identity mapping for the gridmap entry for SUBJECT, using GLOBUS_USERNAME as the Globus Auth identity username. This form works when there is only a single mapping for the subject in the gridmap file.
- --dont-update-guest-collections, -D
  This option is a flag to skip updating the guest collections to set their owners to the mapped Globus Auth identities.
2.11.4. Custom Identity Mapping
If your endpoint uses a custom identity mapping callout or identity configuration, you may need to provide a custom identity mapping configuration. This allows you to add a mapping configuration into the migration plan which will be used by the v5.4 storage gateway.
As an example, if you know that there is a Globus Auth identity domain custom-identity.example.edu that represents the same user namespace as a GSI authorization callout, you can use the following to create a new mapping in the storage gateway configuration. For this you’ll need to specify the domain associated with the custom identity mapping, as well as an identity mapping document (either a filename containing a JSON document or the JSON value inline).
identity-map.json
{
"DATA_TYPE": "expression_identity_mapping#1.0.0",
"mappings": [
{
"source": "{username}",
"match": "(.*)@custom-identity\\.example\\.edu",
"output": "{0}"
}
]
}
gcs-migration identity-mapping custom \
    --domain custom-identity.example.edu --identity-mapping identity-map.json
The gcs-migration identity-mapping custom command takes several options:
- --domain DOMAIN, -d DOMAIN
  Identity Provider username domain to map to local accounts. This may be provided multiple times in a single invocation of this command.
- --identity-mapping JSON_OR_FILE, -i JSON_OR_FILE
  Either the path to a JSON file containing the identity mapping document, or an identity mapping document itself.
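Since the --identity-mapping argument may also be the document text itself, the same configuration can be passed inline (a sketch equivalent to the file-based example above):
gcs-migration identity-mapping custom \
    --domain custom-identity.example.edu \
    --identity-mapping '{"DATA_TYPE": "expression_identity_mapping#1.0.0", "mappings": [{"source": "{username}", "match": "(.*)@custom-identity\\.example\\.edu", "output": "{0}"}]}'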
For more information about identity mapping, see the GCSv5 Identity Mapping guide.
2.12. Check migration plan
Assuming you have added mappings for all of the types of identities you will be using on your endpoint, you can now use the gcs-migration check command to check the migration plan for completeness. If there are any issues, such as missing domains or identity mappings, or unsupported custom GridFTP configuration, they will be displayed along with a diagnostic code. The following sections describe solutions for those diagnostic codes.
- AC01 (Missing principal)
- AC02 (Invalid principal)
- AC03 (Missing principal_type)
- AC04 (Invalid principal_type)
- AC05 (Missing path)
- AC06 (Missing permissions)
- AC07 (Invalid permissions)
  The ACL has a missing or invalid value. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed ACL. If that fails to rectify the problem, contact support@globus.org for instructions on fixing the ACL file.
- GC01 (Guest Collection definition is empty)
- GC02 (Guest Collection definition is missing owner_id)
- GC03 (Guest Collection definition has an invalid owner_id)
- GC05 (Guest Collection definition is missing a user credential)
- GC06 (Guest Collection definition is missing the host_path property)
  The guest collection has a missing or invalid value. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed guest collection. If that fails to rectify the problem, contact support@globus.org for instructions on fixing the guest collection configuration.
- GC04 (Guest Collection definition has an invalid owner_id)
  The owner of the guest collection is mapped from a Globus Auth identity that is no longer valid. Either delete the v4 share and then run gcs-migration update to remove it from the migration plan, or contact support@globus.org for instructions on changing the ownership of the collection during migration.
- MC01 (Mapped collection definition is empty)
- MC02 (Mapped collection is missing the display_name property)
- MC03 (Mapped collection is missing the force_verify property)
- MC04 (Mapped collection is missing the disable_verify property)
- MC05 (Mapped collection is missing the enable_https property)
- MC06 (Mapped collection is missing the allow_guest_collections property)
  The mapped collection has a missing or invalid value. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed mapped collection. If that fails to rectify the problem, contact support@globus.org for instructions on fixing the mapped collection configuration.
- MC07 (Mapped collection has sharing disabled but sharing_users_allow is set)
- MC08 (Mapped collection has sharing disabled but sharing_users_deny is set)
- MC09 (Mapped collection has sharing disabled but sharing_groups_allow is set)
- MC10 (Mapped collection has sharing disabled but sharing_groups_deny is set)
  The mapped collection configuration is inconsistent. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed mapped collection. If that fails to rectify the problem, contact support@globus.org for instructions on fixing the mapped collection configuration.
- MC11 (Mapped collection is allowing a custom GridFTP client certificate)
  The GridFTP configuration specified a custom sharing certificate subject. This is not supported in GCSv5.
- MC12 (Mapped collection uses an unsupported connector)
  The GridFTP configuration specified a connector that is not supported by the migration tool at this time. Contact support@globus.org to receive information on when migrating that connector will become available.
- MC13 (GridFTP configuration uses functionality that is changed in GCSv5)
  The GridFTP configuration used a feature that is not supported in GCSv5, but there is similar functionality that can be implemented using the GCS Manager API. This diagnostic is non-fatal and migration can continue, but you will need to manually configure these features if you want them to be available in your GCSv5 endpoint.
  Feature | Configuration options | New feature
  ------- | --------------------- | -----------
  banner  | banner, banner_file, banner_terse, login_msg, login_msg_file | After migration is completed, use the globus-connect-server collection update command to set the user_message and/or user_message_link properties on the mapped collection.
  network | control_interface, data_interface, hostname, port_range, $GLOBUS_TCP_SOURCE_RANGE | You can set these per data transfer node either at setup via command-line options to globus-connect-server node setup or after the node is configured via command-line options to globus-connect-server node update.
- MC14 (GridFTP configuration uses functionality that is not supported in GCSv5)
  The GridFTP configuration used a feature that is not supported in GCSv5, but should not conflict with GCSv5 operation. This diagnostic is non-fatal and migration can continue, but you will need to manually configure these features if you want them to be available in your GCSv5 endpoint.
  If you want to use these features with GCSv5, you can enable them manually, but their configuration will not be automatically synchronized between data transfer nodes by Globus Connect Server.
  To enable them, write the configuration directives into new files in the /etc/gridftp.d/ directory after setting up the GCSv5 data transfer nodes. These files must not have names that start with the string globus-connect-server or they may be overwritten by GCSv5.
  Feature | Configuration options
  ------- | ---------------------
  Network manager | xnetmgr
  Logging configuration | log_module, log_unique, log_transfer, log_filemode
  Debugging configuration | exec, fork, fork_fallback, single, debug, ignore_bad_threads, bad_signal_exit, $*_DEBUG
  Custom usagestats target | usagestats
  UDT protocol support | udt
  Allowing users to be mapped to the root user | allow_root
  Allowing users with disabled accounts to log in | allow_disabled_login
  Tuning network connections | connections_max, control_preauth_timeout, control_idle_timeout, connections_disabled, offline_msg
- MC15 (GridFTP configuration uses functionality that conflicts with GCSv5 configuration)
  The GridFTP configuration used a feature that conflicts with the rest of the GCSv5 system and will not be available in the migrated endpoint. You can continue to migrate your endpoint, but the functionality will be different than your GCSv4 endpoint. Contact support@globus.org if you have questions.
  Feature | Configuration options
  ------- | ---------------------
  SSH FTP | ssh
  Running the GridFTP server in a chroot | chroot_path
  Configuring the GridFTP server as a striped server | ipc_interface, ipc_allow_from, ipc_deny_from, secure_ipc, ipc_auth_mode, ipc_user_name, ipc_subject, ipc_credential, ipc_cookie, ipc_port, ipc_idle_timeout, ipc_connect_timeout, remote_nodes, hybrid, data_node, stripe_blocksize, stripe_count, brain, stripe_layout, stripe_blocksize_locked, stripe_layout_locked, stripe_mode
  Allowing anonymous FTP access to the GridFTP server | allow_anonymous, anonymous_names_allowed, anonymous_user, anonymous_group
  Password authentication to the GridFTP server | pw_file
  Control-channel management of sharing policies and commands | sharing_control, disable_command_list
  Custom network and driver stacks | allowed_modules, dc_whitelist, fs_whitelist, popen_whitelist, dc_default, fs_default
  Custom authorization modules | cas, acl
  Custom process environment and user | auth_level, process_user, process_group, threads, inetd, daemon, detach, pidfile, use_home_dirs, home_dir
  Custom allow/deny IP ranges for control connections | allow_from, deny_from
  Allowing symlinks to leave the path restrictions for a collection | rp_follow_symlinks
  Setting a non-default globus location | globus_location
- MC16 (GridFTP configuration uses process_user which conflicts with GCSv5 configuration)
  The GridFTP configuration for BlackPearl endpoints specifies a process_user for the GridFTP process. This functionality changes in GCSv5 and the process_user setting is no longer necessary. This is a non-fatal warning for informational purposes.
- MP01 (Migration plan is missing)
- MP02 (Storage gateway is missing from migration plan)
- MP03 (Mapped collection is missing from migration plan)
  Run gcs-migration update. If these errors persist, contact support@globus.org.
- MP04 (Guest collections owned by identity not in allowed domains)
  One or more guest collections in the migration plan have been configured with an owner identity that is not in the storage gateway’s allowed domains. Run gcs-migration set-guest-collection-owner to change the ownership of the guest collection, or use gcs-migration identity-mapping to add the owner’s domain to the storage gateway’s allowed domains.
- MP05 (Shared endpoints are not supported)
  Migrating shared endpoints is not supported with this version of the migration tools.
- MP06 (Migration plan has disabled guest collections)
  The migration plan contains guest collections that were disabled using an older version of gcs-migration. That feature has been removed; all v4 shares must either be migrated or deleted before migration. Run gcs-migration update, which will re-enable those guest collections. Optionally, you can delete the v4 share before running gcs-migration update, which will remove the guest collection from the migration plan.
- MP07 (Migration plan has greater than 100 guest collections)
  This version of gcs-migration only supports migration of up to 100 GCSv4 shared endpoints.
- RO01 (Role is missing a principal)
- RO02 (Role has an invalid principal)
- RO03 (Role is missing a principal_type)
- RO04 (Role has an invalid principal_type)
- RO05 (Role is missing a role type)
- RO06 (Role has an invalid role type)
  The given role is invalid. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed role list. If that fails to rectify the problem, contact support@globus.org for instructions on creating role definitions in your migration plan.
- SG01 (Storage Gateway is empty)
- SG02 (Storage Gateway is missing display_name)
- SG03 (Storage Gateway has empty display_name)
  The Storage Gateway is missing. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed storage gateway. If that fails to rectify the problem, contact support@globus.org for instructions on creating the storage gateway definition.
- SG04 (Storage Gateway is missing allowed_domains)
- SG05 (Storage Gateway has empty allowed_domains)
- SG06 (Storage Gateway has multiple allowed domains without configured mappings)
  The Storage Gateway does not have any allowed domains configured. See the section Configure Identity Mapping to configure the identity mapping.
- SG07 (Storage Gateway uses an unsupported connector)
  The endpoint’s connector is not supported by the migration tool. Contact support@globus.org for information on when that connector will become available for migration.
- SG08 (Storage Gateway is missing s3_endpoint)
- SG09 (Storage Gateway is missing s3_user_credential_required)
  The S3 configuration is missing a value for one of its configuration variables. Contact support@globus.org for help fixing this issue.
- SG10 (Storage Gateway is missing bp_access_id_file)
- SG11 (BlackPearl AccessIDFile is missing from the system)
  The BlackPearl AccessIDFile either does not exist or its path has not been set in the migration plan. Make sure the AccessIDFile exists and run gcs-migration update.
- SG12 (Storage Gateway ceph_admin_bucket is unsupported)
  The ceph_admin_bucket property is set in the v4 connector configuration, but this is not supported yet on GCSv5. Contact support@globus.org if you need this supported and we can add it to our future plans.
- SG13 (Storage Gateway is missing ceph_admin_key_id)
- SG14 (Storage Gateway ceph_admin_key_id is empty)
- SG15 (Storage Gateway is missing ceph_admin_secret_key)
- SG16 (Storage Gateway ceph_admin_secret_key is empty)
  The Ceph configuration file is missing a value for one of its configuration variables. Contact support@globus.org for help fixing this issue.
The following diagnostics apply to the HPSS connector:
- SG12 (Storage Gateway is missing login_name)
- SG13 (Storage Gateway login_name is not valid)
  Valid values for the HPSS login name changed since GCSv4. In GCSv5, the connector will use 'hpssftp', which is not configurable. However, if the value is not set in the migration plan, there may be an issue with the plan. Run gcs-migration update to fix it. If the GCSv4 endpoint used a different login_name, this message is only a warning.
- SG14 (Storage Gateway is missing authentication_mech)
- SG15 (Storage Gateway authentication_mech is not valid)
  The authentication mechanism could not be determined. Set AuthenticationMech in /var/hpss/etc/gridftp_hpss_dsi.conf, or set HPSS_API_AUTHN_MECH or HPSS_PRIMARY_AUTHN_MECH in /var/hpss/etc/env.conf, and run gcs-migration update to fix it. The only supported authentication mechs in GCSv5 are 'unix' and 'krb5'. If your GCSv4 endpoint uses a different authentication mechanism, contact support@globus.org.
- SG16 (Storage Gateway is missing authenticator)
- SG17 (Storage Gateway authenticator is not valid)
- SG18 (Storage Gateway authenticator is missing from the system)
  The authenticator must use one of the values 'auth_keyfile:<path>' or 'auth_keytab:<path>', and <path> must exist on the local system. Set Authenticator in /var/hpss/etc/gridftp_hpss_dsi.conf, or set HPSS_PRIMARY_AUTHENTICATOR in /var/hpss/etc/env.conf, and run gcs-migration update to fix it.
- SG19 (Storage Gateway is missing uda_checksum)
  The configuration for UDA checksums could not be determined. Check UDAChecksumSupport in /var/hpss/etc/gridftp_hpss_dsi.conf and run gcs-migration update to fix it.
- UC01 (Credential is empty)
- UC02 (Credential is missing username)
  The User Credential is missing. Run gcs-migration update as an endpoint administrator to update the plan with a well-formed credential. If that fails to rectify the problem, contact support@globus.org for instructions on creating the credential file.
- UC03 (Credential is missing identity_id)
- UC04 (Credential is missing identity_username)
  The User Credential does not match any configured mapping. Review the steps in Configure Identity Mapping to configure a mapping for the guest collection.
- UC05 (Credential maps to an invalid local user)
  The User Credential is mapped to a local user account which does not exist. This may be because the user mapping is incorrect, or the user who created the guest collection no longer has an account on the system. Either delete the v4 share to prevent migrating it, add the local user, or contact support@globus.org for instructions on how to manually edit the credential for a user.
- UC06 (Credential is missing s3_key_id)
- UC07 (Credential is missing s3_secret_key)
  The User Credential object for an S3 endpoint is missing a value for its key information. Use the command gcs-migration s3-credential-create to update the credential.
- UC08 (Credential is missing access_id)
- UC09 (Credential is missing secret_key)
  The User Credential object for a BlackPearl endpoint is missing a value for its key information. Make sure the user has an entry in the AccessIDFile and run gcs-migration update.
If the check command completes without any diagnostic codes or with only diagnostics MC13, MC14, MC15 or MC16, you can now migrate your endpoint.
2.13. Update the migration plan
Prior to performing Apply migration plan, if there have been any changes to the v4 endpoint, you can run the command gcs-migration update to fetch them, such as new roles or metadata updates. If you are performing the migration during a maintenance period where there are no changes, this can be skipped. We recommend that you run gcs-migration snapshot create prior to running update so that you have a version to roll back to in case of errors. See Snapshots for more information on creating and rolling back to previous states.
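A typical sequence (the description text is illustrative):
gcs-migration snapshot create --description "Before final update"
gcs-migration update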
2.14. Apply migration plan
Now the plan is ready to be applied to your endpoint. Use the following command to create the collections, roles, and ACLs on your v5 endpoint.
If you are migrating to a different machine, you must now copy the migration plan to the GCSv5 endpoint. Ensure that the migration_plan directory and all of its contents are readable by the root user in order to be able to complete the next step.
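A minimal sketch of that copy, assuming ssh access to the new node (the hostname and destination are hypothetical):
# From the v4 node, copy the plan directory to the GCSv5 node
scp -r migration_plan gcsv5-node.example.org:~/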
sudo globus-connect-server endpoint migrate4
When this completes, your v5 endpoint now has all of the configuration properties transferred to it and should contain a mapped collection that has similar behavior to the v4 endpoint.
2.15. Test endpoint
You should be able to list directories and perform transfers against this endpoint. Test it out and make sure that everything behaves as expected. If not, you can revisit the previous steps to adjust the configuration and then rerun the commands in Apply migration plan. This will replace the configuration of the v5.4 endpoint with the updated configuration, adding or deleting components from your endpoint as needed.
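For example, if you have the Globus CLI installed, you can spot-check the migrated mapped collection with a directory listing (substitute your own collection ID; the one shown is the illustrative ID used later in this guide):
globus ls 26dca09d-4fc1-4ea1-8232-fd18432cae68:/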
2.16. Finalize the Globus Connect Server v5.4 migration
The final step in the migration replaces the temporary endpoint ID created during the migration with the ID of the old Globus Connect Server v4 endpoint. This will allow web bookmarks to continue to work against the endpoint. To do this, run the following command on the migrated data transfer node.
sudo globus-connect-server endpoint migrate4 --finalize
2.17. Update other data transfer nodes
Now the Globus Connect Server v5.4 endpoint is working and has replaced the old endpoint in the Globus Transfer service. The other data transfer nodes will no longer be used as part of the endpoint. To configure them to be part of the new Globus Connect Server v5.4 endpoint, follow the steps in Uninstall Globus Connect Server v4 and Install Globus Connect Server v5.4 software on those nodes. Then copy the deployment key file from your migrated data transfer node to the other data transfer nodes and follow the steps in Set up services on the data transfer node on those nodes.
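A sketch of the key distribution, assuming deployment-key.json is in the directory where endpoint setup was run and that dtn2.example.org is one of your other nodes (both hypothetical):
# Copy the deployment key to another data transfer node
scp deployment-key.json dtn2.example.org:~/
# Then, on that node, run node setup from the directory containing the key
sudo globus-connect-server node setup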
3. Post Migration
This section includes steps that cannot be automated during migration. They can be completed manually after the migration is finalized.
3.1. Set Collection Advertised Owner
A collection’s advertised owner allows users to find the collection more easily when using the search API in the Globus Transfer service. The advertised owner string can only be set by a user with an admin role on the collection, and can only be set to an identity belonging to the caller. This can be done either via the web app or the command line.
Visit the Overview page for the collection, then click on Edit Attributes. Select an identity from your identity set to update the Advertised Owner attribute.
Use the globus-connect-server collection set-owner-string command to set the advertised owner on the migrated mapped collection.
globus-connect-server collection set-owner-string
$ globus-connect-server collection list
ID                                   | Display Name                    | Owner                 | Collection Type | Storage Gateway ID
------------------------------------ | ------------------------------- | --------------------- | --------------- | ------------------------------------
26dca09d-4fc1-4ea1-8232-fd18432cae68 | Migration: v4_migrated_endpoint | joe.admin@example.org | mapped          | e1c68cf2-f7b7-4d1b-98c8-5b8cff84cb5a

$ globus-connect-server collection set-owner-string 26dca09d-4fc1-4ea1-8232-fd18432cae68 joe.admin@example.org
Message: Updated collection owner_string to joe.admin@example.org (7bb58e97-a695-4bbb-a8b5-36b709c12ab6)
4. Advanced Migrations
4.1. Combining Multiple Endpoints
Since Globus Connect Server v5.4 supports having multiple storage systems on a single physical endpoint, it is possible to combine multiple v4 endpoints with different configurations into collections on a single v5 endpoint. This allows you to have a single endpoint containing multiple storage gateways with different policies.
After you’ve migrated the first v4 endpoint, you can then create a new migration plan for another v4 endpoint. You can use the same commands to configure identity mappings and check the migration plan. Once that is complete, copy the migration plan to the v5 endpoint and run the globus-connect-server endpoint migrate4 command as before to apply the plan to the v5 endpoint.
This creates a new independent storage gateway and mapped collection for the v4 endpoint, as well as guest collections, roles, and ACLs if there were shares on the original v4 endpoint.
Each migration plan can be finalized independently and in any order.
4.2. Migrating Endpoints using the S3 Connector
When migrating an endpoint which uses the S3 connector, the gcs-migration create command parses the S3 map file and creates migration state for keys used by owners of shared endpoints. Credentials which are not used by owners of shared endpoints are not added to the migration plan by default, since the Globus identities of the owners of those keys are not easily discovered in the absence of a shared endpoint.
Use the gcs-migration s3-credential-create command to add these other keys to the migration plan. This must be done after the identity mappings are configured for the migration plan by using the gcs-migration identity-mapping commands.
This command creates credential objects in the migration plan for one or more credentials located in the s3 map file or files. By default, it creates credentials for all users in the map file. Limit this to a subset of those entries by passing the local user names as arguments to the command.
These credentials can be associated with any Globus account using a domain which is allowed by the migration plan. If you are allowing multiple domains to access the migrated endpoint, you’ll need to supply the --domain option to indicate which domain to associate with the user and credentials.
For example, if an endpoint is configured to map both the example.org and department.example.org domains, you can use the command
gcs-migration s3-credential-create --domain example.org user1 user2 user3
to create credentials for the user1@example.org, user2@example.org, and user3@example.org accounts, provided there are credentials in the S3 map file.
Appendix A: Authentication and Authorization Changes in Globus Connect Server v5.4
Globus Connect Server v5.4 uses different technologies to implement user authentication and authorization than Globus Connect Server v4. Determining how to handle these differences is the most complicated part of the migration. First, let’s provide an overview of these changes.
Most significantly, users of Globus Connect Server v4 use different authentication methods when accessing Globus Transfer than when they activate endpoints.
Globus Connect Server v5.4 uses Globus Auth for all authentication operations, so the user identity information is consistent across all services.
The way identities are mapped in the services is also different due to the difference in the type of identity information provided to the endpoint.
A.1. Globus Connect Server v4
To access a Globus Connect Server v4 endpoint, users first activate the endpoint, which delegates an X.509 certificate to Globus Transfer. This certificate is transmitted as part of the TLS protocol to the Globus Connect Server v4 endpoint, which then maps the identity of the certificate to a local user.
The mapping may be done programmatically using a callout which parses the certificate or a mapping file which matches the subject name of the certificate with a local username.
These different mapping types are described in more detail in the following sections.
A.1.1. MyProxy Callout
The administrator configures the MyProxy plugin on the endpoint to trust the X.509 certificates issued by a specific certificate authority. The plugin parses the user certificate, extracts the Subject Name attribute, and interprets the Common Name relative distinguished name as the local user name. This works well with the MyProxy service which is included as part of Globus Connect Server v4, but could be used with any certificate authority that supports this subject naming convention.
A.1.2. CILogon Callout
The administrator configures the CILogon plugin on the endpoint to trust the X.509 certificates issued by the CILogon certificate authority which contain a specific Organization relative distinguished name. The plugin extracts the eduPersonPrincipalName extension which is interpreted as the local user name.
A.1.3. Gridmap File
The administrator manages a mapping between the Subject Name object of the X.509 certificate to a local user name. This is done on a per-certificate basis. The administrator must ensure that all Globus users have an entry for their certificate subject in the gridmap file in order to have access.
A.2. Globus Connect Server v5.4
Globus Connect Server v5.4 uses Globus Auth to authorize users. The endpoint receives an identity token after validating the user’s access token, and processes that token to determine the local user name.
The mapping can be performed either by performing expression matching on the identity information, or by running a program that consumes the identity information and returns the local user information.
Globus Connect Server v5.4 has additional policy support to require users to log in using certain identity providers, or to restrict access based on local user name or group membership that is returned from the mapping. These additional policies will default to the same policies as in the Globus Connect Server v4 endpoint, but can be edited later using the globus-connect-server storage-gateway update command.
When migrating a Globus Connect Server v4 endpoint to Globus Connect Server v5.4, you will need to keep the following in mind:
-
Whether you will need to use the Globus OIDC service to implement authorization, register your own OIDC server, or whether you will use existing identity provider(s).
-
What Globus Auth identity provider(s) you will require your users to have identities from.
-
How to map identities from the chosen domains to the user namespace used by your endpoint.
Appendix B: Snapshots
At any time after creating the migration plan, you can use the gcs-migration snapshot create command to create a snapshot of the migration plan. This may be useful if you make a mistake in one of the following steps, or if you need to change the Globus identity provider to be used for one of your Globus Connect Server v4 identity sources.
gcs-migration snapshot create
# gcs-migration snapshot create --description "Initial migration plan"
Created snapshot 1
You can list existing snapshots using the command gcs-migration snapshot list:
gcs-migration snapshot list
# gcs-migration snapshot list
ID | Timestamp                        | Description
-- | -------------------------------- | ----------------------
1  | 2021-06-15 10:02:32.356692-04:00 | Initial migration plan
You can restore the migration plan directory to the state of a snapshot by using the gcs-migration snapshot restore command. If given a numeric argument, it restores that particular snapshot. Otherwise, it restores the snapshot with the largest ID.
gcs-migration snapshot restore
# gcs-migration snapshot restore 1
Restored migration plan to "Initial migration plan"