Last Updated: Nov 05, 2019

The HPSS connector can be used to access and share data on an HPSS storage system. The connector is available as an add-on subscription to organizations with a Globus Standard subscription; please contact us for pricing.

This document describes the steps needed to install an endpoint and the HPSS connector needed to access the storage system. This installation should be done by a system administrator, and once completed, users can use the endpoint to access HPSS storage via Globus to transfer, share and publish data on the system.

Preinstallation Checklist

In order for the HPSS DSI to function properly on the HPSS client node, please verify the following items.

Review latest version of the release notes

Recently discovered issues and workarounds will be documented in the GitHub repository prior to inclusion in this document. See the repo Readme for details.

Local user accounts must match user accounts in HPSS

When a user accesses HPSS via GridFTP, home directory lookups and translations between usernames and user IDs are performed using the OS Name Service (e.g., /etc/passwd, LDAP, NIS). The HPSS password file (i.e., HPSS_UNIX_AUTH_PASSWD) is not used by Globus. This has a direct impact on authentication and file access. Verify that each HPSS user has the same UID and GID on the local system and within HPSS. For example, given an HPSS user hpssuser1, the UID and GID returned from the following command should match the UID and GID of the same account within HPSS:

$ getent passwd hpssuser1
hpssuser1:x:12345:1000:HPSS User:/home/hpssuser1:/bin/bash
Warning

Granting HPSS users the ability to log directly into the HPSS client node (e.g., using SSH) is unnecessary and discouraged.
Note

Although Globus does not make use of HPSS_UNIX_AUTH_PASSWD, the HPSS client API implementation does use the HPSS passwd file during the authentication process. See the required HPSS files.
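
The UID and GID comparison described above can be scripted. The following is a minimal sketch, assuming both sides can be expressed as passwd-format records; the helper names passwd_ids and ids_match are hypothetical, and the HPSS-side record is a made-up sample because how you export it depends on your HPSS setup.

```shell
#!/bin/sh
# Print "UID:GID" (fields 3 and 4) from a passwd-format record.
passwd_ids() {
    printf '%s\n' "$1" | awk -F: '{ print $3 ":" $4 }'
}

# Succeed when two passwd-format records carry the same UID and GID.
ids_match() {
    [ "$(passwd_ids "$1")" = "$(passwd_ids "$2")" ]
}

# Record taken from the getent example in this section.
local_entry='hpssuser1:x:12345:1000:HPSS User:/home/hpssuser1:/bin/bash'
# Hypothetical HPSS-side record; obtain the real one from your HPSS administrator.
hpss_entry='hpssuser1:x:12345:1000:HPSS User:/home/hpssuser1:/bin/ksh'

if ids_match "$local_entry" "$hpss_entry"; then
    echo "UID/GID match: $(passwd_ids "$local_entry")"
else
    echo "UID/GID mismatch" >&2
fi
```

On a real client node, local_entry would typically come from `getent passwd hpssuser1` rather than a literal string.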

hpssftp credentials must be accessible by local, unprivileged accounts

Globus uses the account hpssftp to access HPSS initially, then changes user ID to the authenticated HPSS user (e.g., hpssuser1). This removes the need to maintain per-user keytab files on the HPSS client node. However, this requires that the Globus process have access to the hpssftp keytab entry during the authentication phase, which runs under the authenticating user’s UID.

Assuming the keytab for hpssftp is stored in /var/hpss/etc/hpss.keytab:

$ chmod 644 /var/hpss/etc/hpss.keytab

HPSS installations configured for Kerberos authentication must also allow non-privileged users write access to the HPSS temporary Kerberos ticket cache, typically /var/hpss/cred:

$ chmod 1777 /var/hpss/cred
Note

The permissions above make the hpssftp keytab readable by any local account, so it must not be otherwise exposed to unprivileged users. Prevent local shell access by non-privileged HPSS users (e.g., using PAM).
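
The permission settings in this section can be checked with a short script. This is a sketch, assuming GNU coreutils `stat` (as shipped with RHEL/CentOS) and the paths used in this section; the helper names mode_of and check_mode are hypothetical.

```shell
#!/bin/sh
# Print the octal permission bits of a path (GNU coreutils stat).
mode_of() {
    stat -c '%a' "$1"
}

# Report whether a path has the expected octal mode.
# Returns 0 on match, 1 on mismatch, 2 if the path cannot be inspected.
check_mode() {
    actual=$(mode_of "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "OK   $1 ($actual)"
    else
        echo "FAIL $1 (have $actual, want $2)"
        return 1
    fi
}

# Paths and modes assumed in this section; adjust for your site.
for spec in /var/hpss/etc/hpss.keytab:644 /var/hpss/cred:1777; do
    path=${spec%%:*}
    want=${spec##*:}
    if [ -e "$path" ]; then
        check_mode "$path" "$want" || true
    fi
done
```

The loop skips paths that do not exist, so the script is harmless on a node without HPSS installed.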

Verify hpssftp access via scrub

As a non-privileged HPSS user on the local node, verify that the local account is able to authenticate successfully to HPSS. For example:

$ /opt/hpss/bin/scrub -a krb5 -p hpssftp -k -t /var/hpss/etc/hpss.keytab
scrub> quit

Verify basic operations via scrub

As a non-privileged HPSS user, log into HPSS and perform some basic directory and file operations. Unlike the previous step, make sure these operations are performed as a non-privileged user:

$ /opt/hpss/bin/scrub
/hpss/home/testuser1
scrub> mkdir testdir
scrub> rmdir testdir
scrub> open testfile wc
File created using COS 1 (Small File COS)
scrub> write 5k
.done (144.981 KB/sec)
scrub> close
scrub> unlink testfile
scrub> quit

Supported Linux Distributions

The HPSS DSI is compatible with the following Linux distributions:

  • RHEL/CentOS

Note

HPSS is supported on multiple minor versions of RHEL. Support of the Globus Connector is limited to the HPSS customer configurations that Globus has access to for validating new releases.

Supported Globus Connect Server Versions

The Globus Connect Server Installation Guide provides detailed documentation on the steps for installing and configuring a Globus endpoint. The HPSS DSI should be used with the latest version of Globus Connect Server 4.x.

Supported HPSS Versions

This connector has been verified against HPSS versions 7.3, 7.4 and 7.5. Building HPSS is beyond the scope of this guide; you should already have a working HPSS installation. The DSI requires either a full HPSS build or a clnt HPSS build.

Required Files

The following HPSS files located in /var/hpss/etc are known to be required for operation of the HPSS DSI:

  • auth.conf

  • authz.conf

  • env.conf

  • ep.conf

  • group

  • HPSS.conf

  • hpss.keytab (or hpss.unix.keytab)

  • ieee_802_addr

  • passwd

  • site.conf

The following HPSS issues severely impact performance, so applying the corresponding patches is highly recommended.

BZ2819 - PIO 60 second delay impacts small file performance. There is a small percentage chance that, after a transfer completes, HPSS PIO will wait 60 seconds before informing the client that the transfer has completed. This fix has been implemented in 7.3.3p9, 7.3.4, 7.4.1p1 and 7.4.2.

BZ2856 - Enabling HPSS_API_REUSE_CONNECTIONS returns address already in use. This issue limits how many active connections are possible. GridFTP and HPSS make considerable use of ephemeral TCP ports. Quick, successive file transfers can lead the system to run out of available ports. There is no fix for this HPSS issue at this time. The number of ephemeral ports can be increased and the amount of time a socket spends in the TIME_WAIT state can be decreased to help avoid this issue.
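
To judge how close a node is to running out of ports, it can help to know how many ephemeral ports are configured. The sketch below assumes a Linux node, where the range is exposed in /proc/sys/net/ipv4/ip_local_port_range as two numbers; the helper name port_count is hypothetical.

```shell
#!/bin/sh
# Count the ports in an inclusive "low high" range, the format used by
# /proc/sys/net/ipv4/ip_local_port_range on Linux.
port_count() {
    echo $(( $2 - $1 + 1 ))
}

range_file=/proc/sys/net/ipv4/ip_local_port_range
if [ -r "$range_file" ]; then
    # The file holds the lowest and highest ephemeral port numbers.
    read -r low high < "$range_file"
    echo "ephemeral ports available: $(port_count "$low" "$high")"
fi
```

An administrator could then widen the range with sysctl, for example `sudo sysctl -w net.ipv4.ip_local_port_range="1024 65535"`; those values are illustrative and should be chosen to suit the site.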

BZ7772 - PIO 5 second delay impacts small file performance. There is a high percentage chance that, after a transfer completes, HPSS PIO will wait 5 seconds before informing the client that the transfer has completed. This greatly impacts the performance of file retrieves and checksum operations. This fix has been implemented in 7.5.3+.

BZ7883 - Prevents successful transfers of files over 4GiB on HPSS versions 7.5.2+. Due to what appears to be a transfer length calculation error, transfers of files larger than 4GiB generate an EIO error at the 4GiB mark and the transfer terminates. This bug impacts all HPSS clients using the HPSS PIO interface. Upgrade to HPSS 7.5.2u5 / HPSS 7.5.3u1 to resolve this issue.

Performance

GridFTP installations benefit from and take full advantage of classes of service that use fixed length classic style allocation. In short, you’ll get the best performance from the GridFTP interface (actually any HPSS interface) if the segment count is below 32.

HPSS has multiple disk/tape allocation algorithms used to allocate space for incoming data. Fixed length allocation gives you equal size chunks to store data in. This was deemed wasteful because the last block was almost certainly never filled. Variable length allocation was created to solve this problem; it gives you increasingly larger segments as data is stored and truncates the last block. This is a win for most situations in which HPSS is unsure how much data is to be stored for the given file.

Using either of these allocation mechanisms (any variable length allocation, or fixed length without knowing the file size), HPSS is free to continue to allocate segments until all the data is stored. This has a definite performance impact because internally HPSS retrieves data in 32-segment chunks. This means when you request a file from HPSS, internally it breaks it up into multiple transfers, each of which is <= 32 segments. Functionally, this is transparent to the client. In terms of performance, the client will see a high load followed by a pause followed by a high load, etc.

In order to avoid the performance hit, you can use fixed length allocation with segment counts < 32 and take advantage of the fact that any WELL-BEHAVED GridFTP client will inform HPSS of the size of the incoming file before the transfer begins. In fact, the DSI is designed to require this. If a GridFTP client is not well behaved, the DSI will act as though a zero length transfer is about to occur and will handle it as such. So you’ll know if the client is not doing the right thing.
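
The segment-count guideline can be checked with simple arithmetic. The sketch below is a back-of-the-envelope calculation, assuming a single fixed segment size per file; it does not model how a particular class of service picks or rounds segment sizes. The helper names and the sample sizes are hypothetical.

```shell
#!/bin/sh
# Number of fixed-size segments needed to hold a file (ceiling division).
# $1 = file size in bytes, $2 = segment size in bytes
segment_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Succeed when the file fits in fewer than 32 segments.
fits_under_32() {
    [ "$(segment_count "$1" "$2")" -lt 32 ]
}

# Hypothetical values: a 10 GiB file stored with 512 MiB segments.
file_bytes=$(( 10 * 1024 * 1024 * 1024 ))
segment_bytes=$(( 512 * 1024 * 1024 ))

echo "segments: $(segment_count "$file_bytes" "$segment_bytes")"
if fits_under_32 "$file_bytes" "$segment_bytes"; then
    echo "below the 32-segment threshold"
else
    echo "at or above the 32-segment threshold"
fi
```

With these sample values the file needs 20 segments, so it stays under the threshold described above.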

How to Upgrade from Version 2.8 and Earlier

As of version 2.9, the HPSS Connector is installed from RPM instead of building from source. Because of this, several changes to the system for installation of previous versions need to be reversed so that they do not conflict with the RPM installation. This includes:

/etc/gridftp.d/hpss

Make sure this file does not exist. Also remove any other files within the same directory that supply configuration options for hpss_local.

/etc/ld.so.conf.d/gridftp_hpss_dsi.conf

Remove this file if it exists so that the GridFTP server can find the new HPSS DSI in /usr/lib64/. Be sure to run ldconfig to update system paths.

/var/hpss/etc/gridftp_hpss_dsi.conf

Delete this file, or move it elsewhere if you want to keep a copy, so that it does not conflict with the RPM-managed version of the file.

Installing the HPSS Connector

The HPSS Connector is installed by RPM in order to simplify installation and enforce version requirements with requisite software. Unfortunately, Globus does not have access to all HPSS and RHEL version combinations to provide RPMs for all installations. If an RPM is not available for your version, instructions below explain how to obtain the source and generate the RPM.

Note

If you are interested in supporting this development by providing access to an HPSS build environment, please email support@globus.org.

Visit the release page and find the latest release. Note that release candidates are also available from this page and are indicated with the Pre-release badge and have a Release Candidate name. For production use, make sure to choose the most recent official release which will have the Latest Release badge.

Go to the Assets section under the latest release and you’ll see that you have two choices for installation. When possible, RPMs for supported platforms will be available for your convenience. If an RPM is not available, you’ll need to create it using the instructions below.

The RPM has a naming scheme of:

globus-gridftp-server-hpss-7.5-2.9-1.el6.x86_64.rpm

globus-gridftp-server-hpss

package name

-7.5-

This package is for HPSS version 7.5.X

-2.9-1

This is the first release of this connector version 2.9

.el6.

This package is for RHEL 6.X.
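
The naming scheme above can be split into its parts mechanically, which is handy when checking that a downloaded RPM matches your HPSS and RHEL versions. This is a minimal sketch, assuming file names follow exactly the pattern shown; the helper name parse_rpm_name is hypothetical.

```shell
#!/bin/sh
# Split a connector RPM file name into its parts.
# Prints: <package> <hpss_version> <connector_version> <release> <dist> <arch>
# Prints nothing when the name does not follow the expected pattern.
parse_rpm_name() {
    printf '%s\n' "$1" | sed -n -E \
        's/^(globus-gridftp-server-hpss)-([0-9.]+)-([0-9.]+)-([0-9]+)\.(el[0-9]+)\.([^.]+)\.rpm$/\1 \2 \3 \4 \5 \6/p'
}

parse_rpm_name globus-gridftp-server-hpss-7.5-2.9-1.el6.x86_64.rpm
```

For the sample name, the fields are the package name, HPSS version 7.5, connector version 2.9, release 1, distribution el6 and architecture x86_64.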

Download the RPM and matching .asc file which will allow you to verify that the RPM has not changed since creation. Using a recent version of gpg (>= 2.0), import the public key used for signing:

$ gpg --keyserver hkp://keys.gnupg.net --recv-keys 1EA106A24003C353
gpg: requesting key 4003C353 from hkp server keys.gnupg.net
gpg: key 4003C353: "Jason Alt <jasonalt@globus.org>" imported
gpg: Total number processed: 1
gpg:               imported: 1

And then verify the downloaded RPM:

$ gpg --verify globus-gridftp-server-hpss-7.5-2.9-1.el7.x86_64.rpm.asc globus-gridftp-server-hpss-7.5-2.9-1.el7.x86_64.rpm
gpg: Signature made Wed 06 Nov 2019 09:45:26 PM UTC using RSA key ID 4003C353
gpg: Good signature from "Jason Alt <jasonalt@gmail.com>"
gpg:                 aka "Jason Alt <jasonalt@globus.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: C36C 826C 18ED 73C3 38DC  FA53 1EA1 06A2 4003 C353

And finally, install the downloaded RPM using YUM:

$ sudo yum install ./globus-gridftp-server-hpss-7.5-2.9-1.el7.x86_64.rpm
Note

RPM distribution for this connector is complicated by HPSS distribution restrictions. We hope to have these worked out eventually so that we can support distribution through normal Globus channels.

If an RPM is not available for your platform, you will need to build it yourself. Under Assets, download and un-tar the asset named Source Code (tar.gz).

Install the development prerequisite libraries:

$ sudo yum install openssl-devel \
                   globus-gridftp-server-devel \
                   hpss-lib-devel

From within the un-tar’ed source code directory, use the release make target to build the RPMs for your platform:

$ make -f Makefile.bootstrap release

Once the RPM is built, install it:

$ sudo yum install ./globus-gridftp-server-hpss-7.5-2.9-1.el7.x86_64.rpm

Configure the DSI to communicate with HPSS

Review /var/hpss/etc/gridftp_hpss_dsi.conf in the source directory for any changes you may wish to make for your site. You will likely leave most of these options commented out to use their default values.

LoginName <user>

(optional) This is the HPSS service user used to initially authenticate with HPSS. GridFTP requires a privileged user with control permission on the core server’s client interface in order to log into HPSS and then change its credentials to that of the connecting user. Defaults to hpssftp, which is also handled specially by HPSS with regard to gatekeeper operations.

AuthenticationMech [unix|krb5|gsi|spkm]

(optional) Defines the type of authentication that the DSI will perform when logging into HPSS. Note that this is not the authentication mechanism the GridFTP users will use; they always use GSI. Defaults to HPSS_API_AUTHN_MECH or HPSS_PRIMARY_AUTHN_MECH.

Authenticator [auth_keytab|auth_keyfile|auth_key|auth_passwd][:<file>]

(optional) Defines the location of credentials to be used by the DSI to authenticate to HPSS as LoginName. Defaults to HPSS_PRIMARY_AUTHENTICATOR. When this option points to a file, that file’s contents must be accessible to

  • For unix authentication, you can put the LoginName account credentials into its own file using hpss_unix_keytab and point Authenticator to that file instead of giving the GridFTP process read access to the target of HPSS_PRIMARY_AUTHENTICATOR.

  • For sites using Kerberos authentication with HPSS, you’ll need to create a Kerberos keytab file using the Kerberos utility ktutil if you wish to separate the LoginName credentials from the target of HPSS_PRIMARY_AUTHENTICATOR.

QuotaSupport [on|off]

(optional) This option is deprecated and will be removed in a future release.

UDAChecksumSupport [on|off]

(optional) Causes checksums to be stored within UDAs so that the checksum can be recalled later without bringing the file back from tape. It is recommended that you set this option to on to avoid unnecessary tape recalls. The default is off.
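
As an illustration of how the options above fit together, a site using Kerberos might end up with settings along these lines. This is a sketch only: it assumes one "Option value" pair per line and "#" for comments, as suggested by the option syntax above and the shipped file's commented-out defaults, and it reuses the keytab path assumed earlier in this guide. Compare against the comments in your installed gridftp_hpss_dsi.conf before relying on it.

```text
# Service account used for the initial HPSS login (the documented default).
LoginName hpssftp

# Mechanism the DSI uses to log into HPSS.
AuthenticationMech krb5

# Credential location for LoginName (path is an assumption for this example).
Authenticator auth_keytab:/var/hpss/etc/hpss.keytab

# Store checksums in UDAs to avoid unnecessary tape recalls.
UDAChecksumSupport on
```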

Async Stage Requests Cause Red-Ball-of-Doom

Recent changes to make use of the async stage request API for HPSS, made in order to avoid inundating the core server with duplicate stage requests, have exposed a deficiency for the DSI use case of HPSS. The HPSS async stage API expects the caller to be available long term in order to receive stage completion messages. However, the GridFTP/DSI use case is a short-lived transient environment; the GridFTP process cannot wait minutes, hours or days for stage completion messages. Users of DSI versions 2.6+ will see the impact as a red-ball-of-doom indicator in the HPSS GUI console. The warning is innocuous and can be ignored. IBM is aware of this issue and a change request has been created.

As a workaround, users of 2.6 should update to 2.7, and all users of 2.7+ can use the blackhole sync method. This configures nc (netcat) to listen for stage completion messages intended for the DSI and discard whatever it receives. nc should be launched on a highly available server reachable by the HPSS core servers (preferably directly on the core servers). Choose a port on which to receive callback notifications and run this command:

admin@hpss-core $ nc -v -v -k -l <port>

Once nc is running, add this to /etc/gridftp.d/hpss_issue_35 on the GridFTP nodes running the HPSS DSI:

$ASYNC_CALLBACK_ADDR <host>:<port>

HPSS Configuration Options

These options are generally set in /var/hpss/etc/env.conf and affect operation of data transfers. Some of these options may be required depending upon your configuration.

HPSS_API_HOSTNAME

This option selects the network interface used for data transfers between the Globus services you are configuring and the HPSS mover machine(s). If this is unset, data transfers use the default network interface. This option is generally necessary on multihomed nodes. It should be set on the node running GridFTP.

MVR_CLIENT_TIMEOUT

This controls the amount of time before a mover process will stop waiting for data from the Globus service in order to reclaim network resources. Default is 15 minutes. In very large file transfers, it is possible that movers may time out before the transfer reaches the data offsets for which those movers are responsible. This option is set on the mover nodes.
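
Whether a mover is likely to hit this timeout can be estimated from the offset it is responsible for and the sustained transfer rate. The sketch below is a rough estimate under stated assumptions: the offset at which a later mover set begins and the transfer rate are hypothetical values, and the helper names are invented for this example.

```shell
#!/bin/sh
# Seconds needed to reach a byte offset at a sustained rate (ceiling division).
# $1 = offset in bytes, $2 = rate in bytes per second
seconds_to_offset() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Succeed when the offset is reached before the timeout ($3, in seconds) expires.
reached_before_timeout() {
    [ "$(seconds_to_offset "$1" "$2")" -lt "$3" ]
}

# Hypothetical numbers: a later mover set starts at 1 TiB, the transfer
# sustains 500 MiB/s, and the timeout is the default 15 minutes (900 seconds).
offset=$(( 1024 * 1024 * 1024 * 1024 ))
rate=$(( 500 * 1024 * 1024 ))
timeout=900

echo "seconds to reach offset: $(seconds_to_offset "$offset" "$rate")"
if reached_before_timeout "$offset" "$rate" "$timeout"; then
    echo "mover should receive data before the timeout"
else
    echo "mover may time out; consider raising MVR_CLIENT_TIMEOUT"
fi
```

With these sample values the offset is reached after roughly 2098 seconds, well past the 900-second default.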

Note on Kerberos Configurations

Kerberos must be configured for access to the proper Kerberos realm that contains HPSS. The Kerberos configuration is usually kept in /etc/krb5.conf. You may need to enable the allow_weak_crypto option in the [libdefaults] section if the DSI module cannot talk to the HPSS servers.

Basic Endpoint Functionality Test

After completing the installation, you should do some basic transfer tests with your endpoint to ensure that it is working. We document a process for basic endpoint functionality testing here.

Troubleshooting

Below are some common issues encountered while using the Globus Transfer service with an endpoint running the HPSS connector along with possible resolutions to each problem.

Login Failure: No such file or directory

This error message indicates that hpss_LoadDefaultThreadState() has returned ENOENT, causing the login procedure to fail. This occurs when the UID of the authenticating user as known to the GridFTP process does not match the user’s ID as known by HPSS. See Local user accounts must match user accounts in HPSS.

Command Failed: Error (login)
Endpoint: xxxx
Server: xxxx
Message: Login Failed ---
Details: 530-Login incorrect. : GlobusError: v=1 c=PATH_NOT_FOUND\r\n530-GridFTP-Errno: 2\r\n530-GridFTP-Reason: System error in hpss_LoadDefaultThreadState()\r\n530-GridFTP-Error-String: No such file or directory\r\n530 End.\r\n

Login Failure: Operation not permitted

This error message indicates that hpss_SetLoginCred() failed with EPERM during the login procedure. This step in the login process accesses the keytab defined in Authenticator so that the DSI can connect to HPSS as user LoginName. The error value indicates that the GridFTP process was unable to access the keytab file. See hpssftp credentials must be accessible by local, unprivileged accounts.

Command Failed: Error (login)
Endpoint: xxxx
Server: xxxx
Message: Login Failed ---
Details: 530-Login incorrect. : GlobusError: v=1 c=INTERNAL_ERROR\r\n530-GridFTP-Errno: 1\r\n530-GridFTP-Reason: System error in hpss_SetLoginCred()\r\n530-GridFTP-Error-String: Operation not permitted\r\n530 End.\r\n

Login Failure: Cannot access config file

The following error implies that /var/hpss/etc/gridftp_hpss_dsi.conf does not exist.

 Error (login)
 Endpoint: XXX
 Server: XXX
 Message: Login Failed
---
Details: 530-Login incorrect. : GlobusError: v=1 c=PATH_NOT_FOUND\r\n530-GridFTP-Errno: 2\r\n530-GridFTP-Reason: System error in Can not access config file\r\n530-GridFTP-Error-String: No such file or directory\r\n530 End.\r\n

Login Failure: Invalid argument

If you receive this message, it is likely that /var/hpss/etc/site.conf is invalid.

 Error (login)
 Endpoint: XXX
 Server: XXX
 Message: Login Failed
---
Details: 530-Login incorrect. : GlobusError: v=1 c=INTERNAL_ERROR\r\n530-GridFTP-Errno: 22\r\n530-GridFTP-Reason: System error in hpss_LoadDefaultThreadState()\r\n530-GridFTP-Error-String: Invalid argument\r\n530 End.\r\n
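
When triaging login failures like the ones above, it can be convenient to pull the errno and reason out of the Details string. The following is a sketch that assumes the Details text separates reply lines with the literal characters \r\n, as in the samples in this section; the helper names split_details and gridftp_field are hypothetical.

```shell
#!/bin/sh
# Turn the literal "\r\n" separators in a Details string into real newlines.
split_details() {
    printf '%s\n' "$1" | awk '{ gsub(/\\r\\n/, "\n"); print }'
}

# Print the value of one GridFTP-* field, e.g. "Errno" or "Reason".
# $1 = Details string, $2 = field name without the "GridFTP-" prefix
gridftp_field() {
    split_details "$1" | sed -n "s/^[0-9]*-GridFTP-$2: //p"
}

# Sample copied from the "Invalid argument" login failure above.
details='530-Login incorrect. : GlobusError: v=1 c=INTERNAL_ERROR\r\n530-GridFTP-Errno: 22\r\n530-GridFTP-Reason: System error in hpss_LoadDefaultThreadState()\r\n530-GridFTP-Error-String: Invalid argument\r\n530 End.\r\n'

echo "errno:  $(gridftp_field "$details" Errno)"
echo "reason: $(gridftp_field "$details" Reason)"
echo "error:  $(gridftp_field "$details" Error-String)"
```

The errno value and the name of the failing HPSS call are usually enough to pick the matching troubleshooting entry.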

Transfer Error: Data channel authentication failed

When transfers of zero-length files fail frequently as shown below, it can generally be traced back to an invalid threads setting in /etc/gridftp.d/hpss. The HPSS DSI must be configured with a threads value of 2 or greater. See Configure GridFTP to Use the HPSS Connector for the proper configuration.

 Error (transfer)
 Endpoint: XXX
 Server: XXX
 File: /~/zero_length_file
 Command: RETR ~/zero_length_file
 Message: Data channel authentication failed
---
Details: 500-Command failed. : globus_xio: The GSI XIO driver failed to establish a secure connection. The failure occured during a handshake read.\r\n500-globus_xio: Operation was canceled\r\n500-globus_xio: Operation timed out\r\n500 End.\r\n

Transfer Error: Operation timed out

Large file transfers to/from HPSS tend to span multiple sets of HPSS mover processes. Each set is responsible for a large contiguous chunk of the file transfer. The first set transfers offsets 0-N, the second set transfers (N+1)-M, and so on. These mover sets are all initialized at the beginning of the transfer.

Any mover will time out after MVR_CLIENT_TIMEOUT seconds (defaults to 15 minutes). If a mover set does not start the transfer within this timeout, the entire transfer aborts. This is an HPSS issue, not a DSI issue.

This error condition is usually recognizable from errors like the following, issued at intervals of MVR_CLIENT_TIMEOUT plus 5 minutes. See MVR_CLIENT_TIMEOUT for more details.

 2019-06-12 14:29:39
 Error (transfer)
 Endpoint: XXXX HPSS Archive (e38ee901-6d04-11e5-ba46-22000b92c6ec)
 Server: XXXX:2811
 Command: STOR ~/scratch_backups/XXXX
 Message: The operation timed out
---
Details: Timeout waiting for response

 2019-06-12 14:49:47
 Error (transfer)
 Endpoint: XXXX HPSS Archive (e38ee901-6d04-11e5-ba46-22000b92c6ec)
 Server: XXXX:2811
 File: /~/scratch_backups/XXXX
 Command: STOR ~/scratch_backups/XXX
 Message: Fatal FTP response
---
Details: 451-GlobusError: v=1 c=INTERNAL_ERROR\r\n451-GridFTP-Errno: 5011\r\n451-GridFTP-Reason: System error in hpss_PIOExecute\r\n451-GridFTP-Error-String: \r\n451 End.\r\n

Debug Log Collection

If errors occur, the following information can be collected in order to better diagnose the issue.

Collect HPSS Debug Information

Create the file /etc/gridftp.d/hpss_debug with the following values, which cause the HPSS API to send verbose output to the file specified by <path>.

$HPSS_API_DEBUG 7
$HPSS_API_DEBUG_PATH <path>

Make sure the file exists prior to collecting debug information and is writable by the end user:

$ sudo touch <path>
$ sudo chmod 600 <path>

The server begins to write debug information on the next connection to the GridFTP server.

Mailing List

Releases, upcoming features and discussions take place on the mailing list: hpss-discuss@globus.org

© 2010- The University of Chicago