Globus Connect Server Streaming Guide
1. Introduction
Globus streaming enables applications to stream data securely across wide area networks (WANs). As a capability of Globus Connect Server (GCS), the streaming connector creates secure tunnels using the same mechanisms used for establishing data transfer channels. Once established, these tunnels support bi-directional data streaming between the two resources.
An example use case is streaming data from scientific instruments to high-performance computing (HPC) centers for real-time processing. The bidirectional nature of these streams also enables feedback loops, supporting scenarios requiring near-real-time steering or control of remote instruments.
Administrators deploy stream gateways on the GCS endpoints and configure stream access points with access policies. These access points support the same authentication and authorization mechanisms available for the mapped collections used for data transfer, ensuring a consistent security posture across both.
Users authenticate to stream access points and establish secure tunnels between them. The tunnel information is then used by applications to seamlessly stream data to each other via the tunnel.
Globus provides a library that transparently handles tunnel communications. Applications simply read from and write to local ports while the library routes traffic through the secure tunnel without requiring application code modifications.
The Globus web application offers interfaces for discovering stream access points, creating tunnels, and monitoring and managing established tunnels.
1.1. Key Highlights
- Provides secure data streaming across WANs without requiring pre-deployed keys (e.g., SSH keys)
- Leverages the well-established mechanisms for secure wide area network connections used in Globus data transfer
- Offers a consistent security model, in which the GCS security configuration (authentication and authorization policies) is applied to streaming capabilities
- Includes Globus-provided tooling that requires minimal to no code changes in the applications that stream data
1.2. High Level Walk Through From a User Perspective
- The user discovers the stream access points on the resources they want to stream between: for example, one stream access point on a GCS deployment at an instrument facility and another at an HPC center. They then authenticate to meet the policy on each stream access point and create a tunnel between the two.
- The Globus transfer service uses its control channel connection to the GCS deployments at both sites to establish a secure tunnel between them.
- A tunnel identifier is returned to the user.
- The user configures their application to use the tunnel to stream data.
- Globus seamlessly routes application connections through the tunnel.
2. Admin Guide
2.1. Deploying Globus Connect Server with Streaming Support
2.1.1. Installation Instructions
Globus Connect Server with streaming support is currently available as a beta release from our preview repository. The installation procedure follows the standard documentation to create an endpoint and deploy a node, and then continues with the streaming-specific setup.
Set Up a New Endpoint
The initial installation and deployment steps follow the standard install process. Complete Section 4 (at least 4.1 through 4.5) of the Globus Connect Server installation guide before continuing to the next section.
Install Packages from the Streaming Preview Repository
After completing the standard deployment process and successfully logging in to your new endpoint, continue with the following instructions for your distribution.
On RPM-based distributions (those that use dnf), enable the preview repository using the stable repository as a template:
sed -e 's#/stable/#/preview/streaming/#' -e 's#Stable#Preview#' \
/etc/yum.repos.d/globus-connect-server-5-stable-*.repo | \
sudo tee /etc/yum.repos.d/globus-connect-server-5-preview.repo
Install the updated streaming packages:
sudo dnf --best install globus-connect-server54 globus-gridftp-server-tunnel
On Debian-based distributions (those that use apt), enable the preview repository using the stable repository as a template:
sed -e 's#/stable/#/preview/streaming/#' \
/etc/apt/sources.list.d/globus-connect-server-stable-*.list | \
sudo tee /etc/apt/sources.list.d/globus-connect-server-preview.list
sudo apt-get update
Install the updated streaming packages:
sudo apt-get install globus-connect-server54 globus-gridftp-server-tunnel
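To see what the sed substitutions above do before writing a new repository file, they can be run against sample text. This is a minimal sketch: the two input lines and the repo.example.org URL are made up for illustration and are not copied from a real Globus repository file.

```shell
# Hypothetical lines shaped like entries in a stable repository file.
# The same two sed expressions as in the RPM instructions are applied:
#   1. switch the path from the stable tree to the preview/streaming tree
#   2. rename "Stable" to "Preview" in the repository name
preview=$(printf '%s\n' \
  'name=Example Stable' \
  'baseurl=https://repo.example.org/stable/rpm/' |
  sed -e 's#/stable/#/preview/streaming/#' -e 's#Stable#Preview#')

# Show the rewritten text instead of writing it with tee.
printf '%s\n' "$preview"
```

Running the real sed command without the trailing `sudo tee` in the same way prints the generated file to the terminal so it can be reviewed first.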
2.2. Creating a Stream Gateway and Stream Access Point
Once installation and deployment are complete, a stream gateway can be created. The stream gateway controls access to a stream access point much like a storage gateway controls access to a mapped collection. When you create a stream gateway, a stream access point is automatically created.
2.2.1. Stream Gateway Configuration
A stream gateway is created with the globus-connect-server stream-gateway create command. The options include the familiar storage gateway authorization policies such as --domain, --identity-mapping, and --user-allow/--user-deny, as well as the streaming-specific policies --lan-name and --lan-secret-required.
We’ll create a basic stream gateway with the default streaming policies.
We’ll call the stream gateway Example Listener and allow access from all users with an example.org identity.
globus-connect-server stream-gateway create "Example Listener" --domain example.org
The stream gateway and stream access point setup is complete.
If you have access to another stream access point, continue with the User Guide to create a tunnel between that access point and the one you just created. Alternatively, repeat these steps to create another streaming endpoint for testing.
3. User Guide
To create a tunnel, you will need access to a stream access point on each side of the desired data stream.
3.1. Create a Secure Tunnel Between Two Stream Access Points
Start by visiting https://app.globus.org/streams. In the top right corner of the page, click the (+) Create Tunnel link. This will take you to https://app.globus.org/streams/create.
The first two fields define each end of the tunnel. In each box you can search for your stream access point by name.
- Initiator Access Point: the Stream Access Point ID of the side of your tunnel that will be making the active connection.
- Listening Access Point: the Stream Access Point ID of the side of your tunnel that will be listening for connections.
- Label: an optional friendly name for the tunnel.
- Lifetime: the number of minutes that the tunnel will be available. When the lifetime expires, the tunnel is stopped automatically.
3.2. Manage Your Tunnels
Once created, your tunnel will be displayed in a list at https://app.globus.org/streams.
Here you can monitor the state of your tunnels, stop your tunnels, and delete your tunnels. You can also add the application listener’s contact string by pressing the Play button. This will activate the tunnel. Tunnel activation can also be done with the Globus Tunnel tools as described in the next section.
3.3. Update Your Streaming Application to Use a Tunnel
Globus provides tooling to help adapt your application to stream data through a Globus tunnel. While this may not always be possible, our goal is to require zero changes to existing applications in order to use a Globus tunnel.
Any streaming workflow that makes use of Globus tunnels will consist of two applications, a listener and an initiator. The listener is the side that passively waits for an incoming TCP connection to accept. The initiator is the side that actively forms TCP connections. While the data flow between the applications can be bidirectional, the initiator must connect to the listener; the listener cannot connect to the initiator.
In the examples here, we will use telnet as the initiator application and netcat (nc -l <port>) as our listener. Telnet will initiate a stream to netcat through a Globus tunnel.
3.3.1. globus-tunnel CLI
The systems where the applications run will require the globus-tunnel CLI. This is a tool that helps set up your application environment for use with Globus tunnels. It logs the user into the Globus services and sets some environment variables. It uses the installed LD_PRELOAD library, which seamlessly intercepts your application’s socket calls and redirects them for use with Globus tunnels.
globus-tunnel can be installed from the preview repository into a python virtual environment using the following steps.
python -m venv ~/tunnel-cli
. ~/tunnel-cli/bin/activate
pip install --extra-index-url https://downloads.globus.org/globus-connect-server/preview/streaming/wheels/ globus-tunnel
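After installation, it can be useful to confirm that the tools this guide refers to are on the PATH of the current shell. The following is a minimal sketch; `have_tool` is a helper name introduced here for illustration, and it is an assumption that both tools are placed on the PATH by the virtual environment.

```shell
# Return success if the named command is found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Tool names are taken from this guide; adjust if your installation differs.
for tool in globus-tunnel globus-tunnel-run.sh; do
  if have_tool "$tool"; then
    echo "found: $tool"
  else
    echo "missing: $tool (is the virtual environment activated?)"
  fi
done
```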
3.3.2. Listener Application
In order to initialize the listener you will need the ID of the tunnel that you created. You may be asked to log in before the listener can be initialized. In order to avoid the possibility of multiple authentication flows, you may optionally provide the ID of the GCS endpoint that hosts the Listener Access Point.
In this example, the nc application will be listening on the IP address 10.0.2.164 on port 8888. The --listener-contact-string option tells Globus this will be the listener application and the contact string where it will be listening.
globus-tunnel environment initialize --listener-contact-string 10.0.2.164:8888 ${TUNNEL_ID}
Now that the environment is initialized, the listening application can be started. To start the application so that it uses the tunnel, run it using the globus-tunnel-run.sh helper script in the following way:
globus-tunnel-run.sh -p 8888 ${TUNNEL_ID} nc -l 8888
This runs your application in the Globus tunnel environment that you just established. The -p 8888 switch tells Globus that your application intends to listen on port 8888 for any connection request that comes through the tunnel identified by ${TUNNEL_ID}. The remaining arguments are exactly the arguments that you would normally use to run your application; in this case, netcat listening on port 8888.
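In this example the listener port appears in three places: the contact string, the -p switch, and the nc arguments. One way to keep them in agreement is to hold the address and port in shell variables. The sketch below only prints the two listener-side commands so they can be reviewed before being run; the variable names, the `print_listener_commands` function, and the placeholder tunnel ID are illustrative and not part of the Globus tooling.

```shell
# Address and port from the example above.
LISTEN_ADDR=10.0.2.164
LISTEN_PORT=8888
# Placeholder only: substitute the ID of the tunnel you created.
TUNNEL_ID=${TUNNEL_ID:-REPLACE-WITH-YOUR-TUNNEL-ID}

# Print (rather than run) the two listener-side commands from this section.
print_listener_commands() {
  printf 'globus-tunnel environment initialize --listener-contact-string %s:%s %s\n' \
    "$LISTEN_ADDR" "$LISTEN_PORT" "$TUNNEL_ID"
  printf 'globus-tunnel-run.sh -p %s %s nc -l %s\n' \
    "$LISTEN_PORT" "$TUNNEL_ID" "$LISTEN_PORT"
}

print_listener_commands
```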
3.3.3. Initiator Application
The initiator application environment is initialized in much the same way as the listener application environment.
globus-tunnel environment initialize ${TUNNEL_ID}
This time the listening address is not required, because this side makes outbound connections. It will retrieve the contact string at run time via globus-tunnel.
Note, however, the line in the output that reads "Your contact string is: globus.0e8a675b-6b84-4220-89e4-a6a7a0d823fb:3664". It is important to record this value. Whenever your application forms a connection through the tunnel, it must use globus.0e8a675b-6b84-4220-89e4-a6a7a0d823fb as the hostname and 3664 as the port.
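Because the contact string has the form hostname:port, a script can split it with standard shell parameter expansion. A minimal sketch, using the example value from this guide; the variable names are illustrative:

```shell
# Example contact string from this guide; yours will differ.
contact='globus.0e8a675b-6b84-4220-89e4-a6a7a0d823fb:3664'

# Everything before the last ':' is the hostname, everything after it is the port.
host=${contact%:*}
port=${contact##*:}

printf 'host=%s port=%s\n' "$host" "$port"
```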
We will use those values to connect using the telnet application. Start the telnet application in the following way, again using the globus-tunnel-run.sh helper script:
globus-tunnel-run.sh ${TUNNEL_ID} telnet globus.0e8a675b-6b84-4220-89e4-a6a7a0d823fb 3664
You should see a successful connection, and anything you type into telnet will appear in the output of the listener application.
4. Support
For questions on streaming, or to report issues with the preview release, please contact support@globus.org.