Last Updated: Sept 20, 2018
The HPSS connector can be used to access and share data on an HPSS storage system. The connector is available as an add-on subscription to organizations with a Globus Standard subscription; please contact us for pricing.
This document describes the steps needed to install an endpoint and the HPSS connector needed to access the storage system. This installation should be done by a system administrator, and once completed, users can use the endpoint to access HPSS storage via Globus to transfer, share and publish data on the system.
In order for the HPSS DSI to function properly on the HPSS client node, please verify the following items.
Local user accounts must match user accounts in HPSS
When a user accesses HPSS via GridFTP, home directory lookups and translations between usernames and user IDs are performed using the OS Name Service (e.g., /etc/passwd, LDAP, NIS). The HPSS password file (i.e., HPSS_UNIX_AUTH_PASSWD) is not used by Globus. This has a direct impact on authentication and file access. Verify that HPSS users have the same UID on the local system and within HPSS. For example, given an HPSS user hpssuser1, the UID and GID returned from the following command should match the UID and GID of the same account within HPSS:
$ getent passwd hpssuser1
hpssuser1:x:12345:1000:HPSS User:/home/hpssuser1:/bin/bash
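The comparison can also be scripted. The following is a minimal sketch, assuming the HPSS-side UID and GID have been obtained separately (for example, from your HPSS administrator). The function names passwd_uid_gid and ids_match are hypothetical helpers, not part of Globus or HPSS.

```shell
# Extract "UID GID" (fields 3 and 4) from one passwd-format line.
passwd_uid_gid() {
    printf '%s\n' "$1" | awk -F: '{ print $3, $4 }'
}

# Succeed only if the local account's UID and GID equal the expected values.
# $1: user name, $2: UID recorded in HPSS, $3: GID recorded in HPSS
ids_match() {
    entry=$(getent passwd "$1") || return 1
    [ "$(passwd_uid_gid "$entry")" = "$2 $3" ]
}

# Example: ids_match hpssuser1 12345 1000 && echo "IDs match"
```

Because the lookup goes through getent, it consults the same OS Name Service sources that GridFTP uses, rather than reading /etc/passwd directly.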
hpssftp credentials must be accessible by local, unprivileged accounts
Globus uses the account hpssftp to access HPSS initially, then changes user ID to the authenticated HPSS user (i.e., hpssuser1). This removes the need to maintain per-user keytab files on the HPSS client node. However, this requires that the Globus process have access to the hpssftp keytab entry during the authentication phase, which runs under the authenticating user’s UID.
Assuming the keytab for hpssftp is stored in /var/hpss/etc/hpss.keytab:
# chmod 644 /var/hpss/etc/hpss.keytab
HPSS installations configured for Kerberos authentication must also allow non-privileged users write access to the HPSS temporary Kerberos ticket cache, typically /var/hpss/cred:
# chmod 1777 /var/hpss/cred
The hpssftp keytab file must not be exposed to unprivileged users. Prevent local shell access by non-privileged HPSS users (e.g., via PAM).
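The permissions set above can be verified after the fact. This is a minimal sketch, assuming GNU coreutils stat (as found on typical Linux distributions) and the paths /var/hpss/etc/hpss.keytab and /var/hpss/cred used in this section. The helper names mode_of and has_mode are hypothetical.

```shell
# Print the octal permission bits of a path (GNU coreutils stat).
mode_of() {
    stat -c '%a' "$1"
}

# Succeed if the path has exactly the expected octal mode.
# $1: path, $2: expected mode such as 644 or 1777
has_mode() {
    [ "$(mode_of "$1")" = "$2" ]
}

# Example, using the paths assumed in this section:
#   has_mode /var/hpss/etc/hpss.keytab 644 || echo "keytab mode differs"
#   has_mode /var/hpss/cred 1777 || echo "ticket cache mode differs"
```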
Verify hpssftp access via scrub
As a non-privileged HPSS user on the local node, verify that the local account is able to authenticate successfully to HPSS. For example:
$ /opt/hpss/bin/scrub -a krb5 -p hpssftp -k -t /var/hpss/etc/hpss.keytab
scrub> quit
Verify basic operations via scrub
As a non-privileged HPSS user, log into HPSS and perform some basic directory and file operations. Unlike the previous step, make sure these operations are performed as a non-privileged user:
$ /opt/hpss/bin/scrub /hpss/home/testuser1
scrub> mkdir testdir
scrub> rmdir testdir
scrub> open testfile wc
File created using COS 1 (Small File COS)
scrub> write 5k
.done (144.981 KB/sec)
scrub> close
scrub> unlink testfile
scrub> quit
Functional HPSS and Globus Connect Server installations are required for installation and use of the HPSS connector. Globus Connect Server can be hosted on any machine that can connect to the HPSS core server and data movers.
Specifically, the latest versions of these packages are required:
hpss-lib-devel (replace with hpss-clnt-devel for HPSS 7.5)
Supported Linux Distributions
The HPSS DSI is compatible with the following Linux distributions:
Supported Globus Connect Server Versions
The Globus Connect Server Installation Guide provides detailed documentation on the steps for installing and configuring a Globus endpoint. The HPSS DSI should be used with the latest version of Globus Connect Server 4.x.
Supported HPSS Versions
This connector has been verified against HPSS versions 7.3 and 7.4. Building HPSS is beyond the scope of this guide; a working HPSS installation is assumed. The DSI requires either a full HPSS build or a clnt HPSS build.
The following HPSS files located in /var/hpss/etc are known to be required for operation of the HPSS DSI:
hpss.keytab (or hpss.unix.keytab)
Recommended HPSS Patches
These HPSS issues severely impact performance so the patches are highly recommended.
BZ2819 - PIO 60 second delay impacts small file performance. There is a small percentage chance that, after a transfer completes, HPSS PIO will wait 60 seconds before informing the client that the transfer has completed. This fix has been implemented in 7.3.3p9, 7.3.4, 7.4.1p1 and 7.4.2.
BZ2856 - Enabling HPSS_API_REUSE_CONNECTIONS returns address already in use. This issue limits the number of active connections. GridFTP and HPSS make considerable use of ephemeral TCP ports, and quick, successive file transfers can cause the system to run out of available ports. There is no fix for this HPSS issue at this time. To help avoid it, increase the number of ephemeral ports and decrease the amount of time a socket spends in the TIME_WAIT state.
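To judge how much headroom the system has, it can help to look at the current ephemeral port range. On Linux this range is exposed through the net.ipv4.ip_local_port_range sysctl. The sketch below is illustrative only: the helper names are hypothetical, and the widened range in the final comment is an example value, not a recommendation from HPSS or Globus.

```shell
# Number of ports in an inclusive range.
# $1: lowest port, $2: highest port
ephemeral_port_count() {
    echo $(( $2 - $1 + 1 ))
}

# Report the current Linux ephemeral port range, if the proc file is present.
show_ephemeral_ports() {
    f=/proc/sys/net/ipv4/ip_local_port_range
    [ -r "$f" ] || return 1
    read -r low high < "$f"
    echo "range $low-$high: $(ephemeral_port_count "$low" "$high") ports"
}

# To widen the range (illustrative values, run as root):
#   sysctl -w net.ipv4.ip_local_port_range="15000 65000"
```

How long sockets linger after closing depends on the kernel and site policy, so that tuning is left out of this sketch.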
GridFTP installations benefit from and take full advantage of classes of service that use fixed length classic style allocation. In short, you’ll get the best performance from the GridFTP interface (actually any HPSS interface) if the segment count is below 32.
HPSS has multiple disk/tape allocation algorithms used to allocate space for incoming data. Fixed length allocation gives you equal size chunks to store data in. This was deemed wasteful because the last block was almost never filled. Variable length allocation was created to solve this problem; it gives you increasingly larger segments as data is stored and truncates the last block. This is a win for most situations in which HPSS is unsure how much data is to be stored for the given file.
Using either of these allocation mechanisms (any variable length allocation, or fixed length without knowing the file size), HPSS is free to continue to allocate segments until all the data is stored. This has a definite performance impact because internally HPSS retrieves data in 32-segment chunks. This means that when you request a file from HPSS, internally it breaks the request up into multiple transfers, each of which is at most 32 segments. Functionally, this is transparent to the client. In terms of performance, the client will see a high load followed by a pause followed by a high load, etc.
In order to avoid the performance hit, you can use fixed length allocation with segment counts < 32 and take advantage of the fact that any WELL-BEHAVED GridFTP client will inform HPSS of the size of the incoming file before the transfer begins. In fact, the DSI is designed to require this. If a GridFTP client is not well behaved, the DSI will act as though a zero length transfer is about to occur and will handle it as such. So you’ll know if the client is not doing the right thing.
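The arithmetic behind this guidance can be sketched as follows. This is a simplified model, assuming fixed length segments of a single size and retrieval in groups of at most 32 segments as described above; actual behavior depends on how your classes of service are configured. The function names are hypothetical.

```shell
# Ceiling division for non-negative integers.
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Segments needed to hold a file of $1 bytes using fixed segments of $2 bytes.
segment_count() {
    ceil_div "$1" "$2"
}

# Internal transfers needed for $1 segments at up to 32 segments each.
internal_transfers() {
    ceil_div "$1" 32
}

# Example: a 10 GiB file stored in 256 MiB segments
#   segs=$(segment_count 10737418240 268435456)   # 40 segments
#   internal_transfers "$segs"                     # 2 transfers
```

In this model the 10 GiB file would need segments of at least 320 MiB to fit in 32 segments and be retrieved in a single internal transfer.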
Build and Install the DSI
Install the development prerequisite libraries:
$ sudo yum install openssl-devel \
    globus-gridftp-server-devel \
    globus-gridftp-server-control-devel \
    hpss-clnt-devel \
    libtirpc-devel
Configure the HPSS DSI with the following options:
$ ./configure --with-hpss=<hpss> --libdir=<dsi_target_directory>
Where <dsi_target_directory> is the location to install the DSI into, typically /usr/local/gridftp_hpss_dsi, and <hpss> is the location of the HPSS client build, typically /opt/hpss.
Build and install the DSI into libdir:
$ make
$ make install
Finally, since the DSI has been installed into a non-system location, it is necessary to configure the runtime linker to find the DSI when needed. Create /etc/ld.so.conf.d/gridftp_hpss_dsi.conf with a single line that is the location of the library within the DSI installation (for example, within /usr/local/gridftp_hpss_dsi if the DSI was installed there). Then run ldconfig to update the runtime linker:
$ sudo ldconfig
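To confirm that the linker cache picked up the DSI, the output of ldconfig -p can be searched for the library. This is a minimal sketch; the library name libglobus_gridftp_server_hpss_local is an assumption based on the usual GridFTP DSI naming pattern and the hpss_local module name used in this guide, so check the actual file name in your install directory. The helper name cache_lists_library is hypothetical.

```shell
# Succeed if `ldconfig -p`-style text on stdin mentions the named library.
# $1: library name (or a distinctive part of it)
cache_lists_library() {
    grep -F -q -- "$1"
}

# Example (library name is an assumption, see above):
#   ldconfig -p | cache_lists_library libglobus_gridftp_server_hpss_local \
#       || echo "DSI library not found in linker cache"
```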
Configure GridFTP to Use the HPSS Connector
Add these lines to /etc/gridftp.d/hpss:
load_dsi_module hpss_local
disable_command_list SCKS,APPE,REST
threads 2
Configure the DSI to communicate with HPSS
Review gridftp_hpss_dsi.conf in the source directory for any changes you may wish to make for your site.
This is the HPSS service user used to initially authenticate with HPSS. GridFTP requires a privileged user with control permission on the core server’s client interface in order to log into HPSS and then change its credentials to that of the connecting user. HPSS has a special privileged user named hpssftp which has the necessary permissions. Thus we recommend that you use hpssftp as the value of LoginName.
Defines the type of authentication HPSS has been configured for; this is not related to user authentication to the GridFTP server.
Defines the location of credentials to be used by the DSI to authenticate to HPSS as LoginName. <auth_file> must point to a file containing the credentials necessary for the DSI to connect to HPSS.
For Unix authentication sites, make sure the HPSS account hpssftp is valid and included in an HPSS keytab file (typically located in /var/hpss/etc/).
$ /opt/hpss/bin/hpss_unix_keytab -f /var/hpss/etc/gridftp.keytab add hpssftp
For sites using Kerberos authentication with HPSS, you’ll need to create and use a Kerberos keytab file (rather than a Unix keytab). The Kerberos utility ktutil can be used for that purpose.
Copy gridftp_hpss_dsi.conf into place on the target system hosting Globus Connect Server. The DSI will use the following search order for locating the configuration file:
1) $HPSS_PATH_ETC/gridftp_hpss_dsi.conf
2) /var/hpss/etc/gridftp_hpss_dsi.conf
Make sure the configuration file’s permissions allow for the GridFTP process to read it.
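The search order and the readability requirement can be checked together. The sketch below mirrors the two documented locations; it assumes that an unset HPSS_PATH_ETC is simply skipped and that the second location is tried when the first has no readable file, which may differ from the DSI's exact behavior. The function name find_dsi_conf and its optional argument are hypothetical.

```shell
# Print the first readable configuration file in the documented search order.
# $1 (optional): fallback directory, defaults to /var/hpss/etc
find_dsi_conf() {
    fallback_dir="${1:-/var/hpss/etc}"
    for dir in "${HPSS_PATH_ETC:-}" "$fallback_dir"; do
        # Skip the first candidate when HPSS_PATH_ETC is unset or empty.
        [ -n "$dir" ] || continue
        if [ -r "$dir/gridftp_hpss_dsi.conf" ]; then
            printf '%s\n' "$dir/gridftp_hpss_dsi.conf"
            return 0
        fi
    done
    return 1
}

# Example: find_dsi_conf || echo "no readable gridftp_hpss_dsi.conf found"
```

Run the check as the account the GridFTP process uses, since readability depends on who is asking.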
Note on Kerberos Configurations
Kerberos must be configured for access to the proper Kerberos realm that contains HPSS. The Kerberos configuration is usually kept in /etc/krb5.conf. You may need to enable the allow_weak_crypto option in the [libdefaults] section if the DSI module cannot talk to the HPSS servers.
Basic Endpoint Functionality Test
After completing the installation, you should do some basic transfer tests with your endpoint to ensure that it is working. A process for basic endpoint functionality testing is documented separately.
Below are some common issues encountered while using the Globus Transfer service with an endpoint running the HPSS connector along with possible resolutions to each problem.
Login Failure: No such file or directory
This error message indicates that hpss_LoadDefaultThreadState() has returned ENOENT, causing the login procedure to fail. This occurs when the UID of the authenticating user as known to the GridFTP process does not match the user’s ID as known by HPSS. See Local user accounts must match user accounts in HPSS.
Command Failed: Error (login)
Endpoint: xxxx
Server: xxxx
Message: Login Failed
---
Details: 530-Login incorrect. : GlobusError: v=1 c=PATH_NOT_FOUND\r\n530-GridFTP-Errno: 2\r\n530-GridFTP-Reason: System error in hpss_LoadDefaultThreadState()\r\n530-GridFTP-Error-String: No such file or directory\r\n530 End.\r\n
Login Failure: Operation not permitted
This error message indicates that hpss_SetLoginCred() failed with EPERM during the login procedure. This step in the login process accesses the keytab defined in AuthenticationMech so that the DSI can connect to HPSS as user LoginName. The error value indicates that the GridFTP process was unable to access the keytab file. See hpssftp credentials must be accessible by local unprivileged accounts.
Command Failed: Error (login)
Endpoint: xxxx
Server: xxxx
Message: Login Failed
---
Details: 530-Login incorrect. : GlobusError: v=1 c=INTERNAL_ERROR\r\n530-GridFTP-Errno: 1\r\n530-GridFTP-Reason: System error in hpss_SetLoginCred()\r\n530-GridFTP-Error-String: Operation not permitted\r\n530 End.\r\n
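When triaging many such reports, the GridFTP-Errno value can be pulled out of the message text to tell the two login failures apart. This is a minimal sketch that only covers the two errno values discussed in this section; the helper names gridftp_errno and login_hint are hypothetical.

```shell
# Print the number following "GridFTP-Errno: " in an error message, if any.
gridftp_errno() {
    printf '%s\n' "$1" | sed -n 's/.*GridFTP-Errno: \([0-9][0-9]*\).*/\1/p'
}

# Map the two login errno values covered in this section to a hint.
login_hint() {
    case "$(gridftp_errno "$1")" in
        2) echo "check that local and HPSS UIDs match" ;;
        1) echo "check keytab permissions" ;;
        *) echo "no hint for this error" ;;
    esac
}

# Example: login_hint "$(cat saved_error_message.txt)"
```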
Transfer Error: Operation timed out
When transfers of zero-length files fail frequently as shown below, it can generally be traced back to an invalid threads setting in /etc/gridftp.d/hpss. The HPSS DSI must be configured with a threads value of 2 or greater. See Configure GridFTP to Use the HPSS Connector for the proper configuration.
Error (transfer)
Endpoint: XXX
Server: XXX
File: /~/zero_length_file
Command: RETR ~/zero_length_file
Message: Data channel authentication failed
---
Details: 500-Command failed. : globus_xio: The GSI XIO driver failed to establish a secure connection. The failure occured during a handshake read.\r\n500-globus_xio: Operation was canceled\r\n500-globus_xio: Operation timed out\r\n500 End.\r\n
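The threads setting can be checked with a short script. This sketch assumes the configuration uses one "name value" pair per line and that the last threads line in the file is the one that takes effect, which is an assumption about how GridFTP reads the file. The helper names threads_value and threads_ok are hypothetical.

```shell
# Print the last "threads" value set in a GridFTP configuration file.
threads_value() {
    awk '$1 == "threads" { v = $2 } END { if (v != "") print v }' "$1"
}

# Succeed if the file sets threads to 2 or greater.
threads_ok() {
    v=$(threads_value "$1")
    [ -n "$v" ] && [ "$v" -ge 2 ]
}

# Example: threads_ok /etc/gridftp.d/hpss || echo "threads must be 2 or greater"
```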
Debug Log Collection
If errors occur, the following information can be collected in order to better diagnose the issue.
Collect HPSS Debug Information
Edit /etc/gridftp.d/hpss to include the following values, which will cause the HPSS API to send verbose output to the file specified by <path>.
$HPSS_API_DEBUG 7
$HPSS_API_DEBUG_PATH <path>
Make sure the file exists prior to collecting debug information and is writable:
$ sudo touch <path>
$ sudo chmod 600 <path>
The server begins writing debug information on the next connection to the GridFTP server.