The Ceph Connector enables use of a Globus data access interface on a Ceph storage system, via the Ceph Object Gateway. This requires the installation of Globus Connect Server. The connector is available as an add-on subscription to organizations with a Globus Standard subscription - please contact us for pricing.

This document describes how to configure a Ceph Connector Storage Gateway and mapped collection. After these steps are complete, any authorized user can access Ceph buckets via that mapped collection or, if policies allow, create their own guest collections to share access to those buckets.

The installation must be done by a system administrator and consists of the following steps:

  • Create a Ceph Admin account to allow Globus to access user keys.

  • Create a Ceph storage gateway.

  • Create a Ceph mapped collection.

Please contact us at support@globus.org if you have questions or need help with installation and use of the Ceph Connector.


Ceph Connector Virtual Filesystem

Ceph provides a distributed object store, in which each data object is addressed by a bucket name and an object name.

The Ceph Connector attempts to make this look like a regular filesystem, by treating the bucket name as the name of a directory in the root of the storage gateway’s file system. For example, if a user has access to buckets bucket1 and bucket2, then those buckets would show up as directories when listing /.

The Ceph Connector also treats the / character as a delimiter in the Ceph API so that it can present something that looks like subdirectories. For example, the object object1 in bucket1 would appear as /bucket1/object1 to the Ceph Connector, and the object object2/object3 in bucket2 would appear as a file called object3 in the directory /bucket2/object2.
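The mapping described above can be illustrated with a small sketch. This is not the connector's implementation; the helper names (split_virtual_path, to_virtual_path, list_directory) and the in-memory dictionary standing in for the object store are hypothetical and exist only to show how bucket and object names translate into paths and directory listings.

```python
from typing import Dict, List, NamedTuple, Optional


class ObjectRef(NamedTuple):
    bucket: Optional[str]  # None means the storage gateway root "/"
    key: str               # object name inside the bucket ("" for the bucket itself)


def split_virtual_path(path: str) -> ObjectRef:
    """Split a virtual path such as /bucket2/object2/object3 into bucket and object name."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return ObjectRef(None, "")
    return ObjectRef(parts[0], "/".join(parts[1:]))


def to_virtual_path(bucket: str, key: str) -> str:
    """Build the path under which an object in a bucket would appear."""
    return "/" + bucket + ("/" + key if key else "")


def list_directory(objects: Dict[str, List[str]], path: str) -> List[str]:
    """List the entry names directly under a virtual directory.

    `objects` maps bucket name -> object names (a toy stand-in for the store).
    """
    bucket, prefix = split_virtual_path(path)
    if bucket is None:
        # Listing "/" shows the buckets themselves as directories.
        return sorted(objects)
    if prefix:
        prefix += "/"
    names = set()
    for key in objects.get(bucket, []):
        if key.startswith(prefix):
            # Keep only the first path component after the prefix, so that
            # deeper objects show up as a single subdirectory entry.
            name = key[len(prefix):].split("/", 1)[0]
            if name:
                names.add(name)
    return sorted(names)
```

With a store holding object1 in bucket1 and object2/object3 in bucket2, listing / yields bucket1 and bucket2, and listing /bucket2/object2 yields object3, matching the examples in the text.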

Ceph Admin User

The Ceph Connector requires a RADOS Gateway User with the users:read capability in order to map Globus users to Ceph keys.

Create a RADOS Gateway User with users:read capabilities

This identity is used by the Ceph Connector to look up keys associated with the Ceph user_id that the GridFTP session is authorized to run as.

This command must be run on a host with access to the ceph client.admin keyring in order to create the globus Ceph user_id:

$ radosgw-admin user create \
    --uid=globus \
    --display-name "Globus Ceph Connector" \
    --caps="users=read"

In the output of this command, note the access_key and secret_key fields of the entry in the keys list, as those will be needed when creating the storage gateway. If you forget to record them, you can use the following command to retrieve the same information:

$ radosgw-admin user info --uid=globus
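If you would rather pull the key pair out of the command output programmatically, a sketch along these lines may help. It assumes the command prints a JSON document with a keys list; the sample output below is illustrative rather than captured from a real system, the function name extract_admin_keys is hypothetical, and because the exact name of the secret field is an assumption, the code accepts either secret_key or secret_access_key.

```python
import json

# Illustrative output only; real output contains more fields and real keys.
SAMPLE_OUTPUT = """
{
  "user_id": "globus",
  "display_name": "Globus Ceph Connector",
  "keys": [
    {"user": "globus", "access_key": "EXAMPLEKEYID", "secret_key": "EXAMPLESECRET"}
  ],
  "caps": [{"type": "users", "perm": "read"}]
}
"""


def extract_admin_keys(output: str, uid: str = "globus") -> dict:
    """Return the key ID and secret key recorded for `uid` in the JSON output."""
    info = json.loads(output)
    for entry in info.get("keys", []):
        if entry.get("user") == uid:
            # Field name for the secret is an assumption; try both spellings.
            secret = entry.get("secret_key") or entry.get("secret_access_key")
            return {"key_id": entry["access_key"], "secret_key": secret}
    raise LookupError(f"no keys found for user {uid!r}")
```

In practice the output string could come from running the command above with subprocess and capturing its standard output.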

Ceph Configuration Encryption

All configuration information, including Ceph secrets and user credential information, is encrypted with a secret key on the node servicing the request before storing it locally and uploading it to GCS cloud services for distribution to other nodes in the endpoint. The encryption key is only available locally to the node and is secured such that only the node admin has access.

Storage Gateway

A Ceph Connector Storage Gateway is created with the command globus-connect-server storage-gateway create ceph, and can be updated with the command globus-connect-server storage-gateway update ceph.

Before looking into the policy options specific to the Ceph Connector, please familiarize yourself with the Globus Connect Server v5 Data Access Guide, which describes the steps to create and update a storage gateway using the POSIX connector as an example. The commands to create and update a storage gateway for the Ceph Connector are similar.

Ceph Connector Storage Gateway Policies

The Ceph Connector has policies to manage administrator credentials, to configure the URL of the S3-compatible API endpoint providing access to the Ceph RADOS Gateway, and to control access to an enumerated set of buckets and Ceph projects.

Endpoint

The --s3-endpoint command-line option specifies the address that Globus Connect Server uses to contact the S3-compatible API when accessing data on a Ceph system.

Example 1. Selecting an S3 API Endpoint

For our example, we’ll use an endpoint running on ceph.example.org (you must of course use the address of the Ceph Object Gateway run by your organization).

--s3-endpoint ceph.example.org

Administrator Credentials

The Ceph Connector uses administrator credentials to look up user credentials to access Ceph data. These credentials must belong to an account that has the users:read capability as described in Ceph Admin User.

The administrator credentials are configured using the --ceph-admin-secret-key and --ceph-admin-key-id command-line options.

Example 2. Set Administrator Credentials

For our example, we’ll assume the output of the commands in Ceph Admin User yielded the secret key SHOO1OOWOEY9OOJAbAP0 and the key ID Ang0eeCIAePh4eESaiv8AeB5TeI5ShaEziCe9oow. Use the actual values that were printed when you ran those commands. We’ll use these in the storage gateway create command-line options:

--ceph-admin-secret-key SHOO1OOWOEY9OOJAbAP0 \
--ceph-admin-key-id Ang0eeCIAePh4eESaiv8AeB5TeI5ShaEziCe9oow

Bucket Restrictions

The --bucket command-line option takes the name of a bucket that may be accessed through this storage gateway. The option can be repeated to allow more than one bucket.

Example 3. Restricting Access to Buckets

For our example, we’ll create a Storage Gateway that restricts access to two buckets owned by our organization: research-data-bucket-1 and research-data-bucket-2. Users will be restricted to only those buckets when using collections created on this storage gateway, and can access them only if their credential has permission to do so.

--bucket research-data-bucket-1 --bucket research-data-bucket-2

If no buckets are configured, then any buckets accessible using the user’s key may be accessed by collections on this storage gateway. If any are configured, then they act as restrictions to which buckets are visible and accessible on collections on this storage gateway.
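The rule in the preceding paragraph amounts to a simple filter. The following sketch is only a model of that behavior, not the connector's code; the function name visible_buckets and its arguments are hypothetical.

```python
from typing import Iterable, List


def visible_buckets(user_buckets: Iterable[str],
                    configured_buckets: Iterable[str] = ()) -> List[str]:
    """Model which buckets a collection on the storage gateway would expose.

    user_buckets: buckets reachable with the user's own key.
    configured_buckets: values given via --bucket (empty means no restriction).
    """
    allowed = set(configured_buckets)
    if not allowed:
        # No --bucket options: everything the user's key can reach is visible.
        return sorted(user_buckets)
    # Otherwise the configured list acts as a restriction; it never grants
    # access to a bucket that the user's key cannot reach.
    return sorted(b for b in user_buckets if b in allowed)
```

For example, a user whose key can reach research-data-bucket-1 and a personal bucket would see only research-data-bucket-1 on a gateway configured as in Example 3.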

Creating the Storage Gateway

Now that we have decided on all our policies, we’ll use the command to create the storage gateway.

$ globus-connect-server storage-gateway create ceph \
    "Ceph Storage Gateway" \
    --domain example.org \
    --s3-endpoint ceph.example.org \
    --ceph-admin-secret-key SHOO1OOWOEY9OOJAbAP0 \
    --ceph-admin-key-id Ang0eeCIAePh4eESaiv8AeB5TeI5ShaEziCe9oow \
    --bucket research-data-bucket-1 --bucket research-data-bucket-2

Storage Gateway Created: 7187a9a0-68e4-48ea-b3b9-7fd06630f8ab

The command succeeded and printed the ID of the new storage gateway (7187a9a0-68e4-48ea-b3b9-7fd06630f8ab in this case) for our reference. Note that the ID will be a different, unique value each time you run the command. If you forget the ID of a storage gateway, you can always use the command globus-connect-server storage-gateway list to get a list of the storage gateways on the endpoint.
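When scripting this step, it can be convenient to assemble the command from the chosen policies and to capture the ID from its output. The sketch below uses only the options shown in this document; the helper names build_create_command and parse_gateway_id are hypothetical, and the output format is assumed to match the "Storage Gateway Created:" line shown above.

```python
import re
import shlex
import uuid
from typing import Iterable, List


def build_create_command(display_name: str, domain: str, s3_endpoint: str,
                         admin_secret_key: str, admin_key_id: str,
                         buckets: Iterable[str] = ()) -> List[str]:
    """Assemble the argument list for creating a Ceph storage gateway."""
    argv = [
        "globus-connect-server", "storage-gateway", "create", "ceph",
        display_name,
        "--domain", domain,
        "--s3-endpoint", s3_endpoint,
        "--ceph-admin-secret-key", admin_secret_key,
        "--ceph-admin-key-id", admin_key_id,
    ]
    for bucket in buckets:
        # --bucket is repeated once per allowed bucket.
        argv += ["--bucket", bucket]
    return argv


def parse_gateway_id(output: str) -> uuid.UUID:
    """Extract the storage gateway ID from the command's output."""
    match = re.search(r"Storage Gateway Created:\s*(\S+)", output)
    if match is None:
        raise ValueError("no storage gateway ID found in output")
    return uuid.UUID(match.group(1))


if __name__ == "__main__":
    argv = build_create_command(
        "Ceph Storage Gateway", "example.org", "ceph.example.org",
        "<admin secret key>", "<admin key id>",
        ["research-data-bucket-1", "research-data-bucket-2"],
    )
    # Print a shell-quoted command line for review before running it.
    print(shlex.join(argv))
```

The argument list can be passed to subprocess.run as-is, which avoids shell quoting problems with display names that contain spaces.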

You can also configure additional identity mapping and path restriction policies as described in the Globus Connect Server v5 Data Access Guide.

Note that this creates the storage gateway, but does not yet make it accessible via Globus and HTTPS. You’ll need to follow the steps in the next section.

Collection

A Ceph Collection is created with the command globus-connect-server collection create, and can be updated with the command globus-connect-server collection update.

As the Ceph Connector does not introduce any policies beyond those used by the base collection type, you can follow the sequence in the collections section of the Globus Connect Server v5 Data Access Guide. Recall, however, that the paths are interpreted as described above in Ceph Connector Virtual Filesystem.

Appendix A: Document Types for the Ceph Connector

CephStoragePolicies Document

Connector-specific storage gateway policies for the Ceph connector

One of the following schemas:

{
  "DATA_TYPE": "ceph_storage_policies#1.0.0",
  "ceph_admin_key_id": "string",
  "ceph_admin_secret_key": "string",
  "s3_buckets": [
    "string"
  ],
  "s3_endpoint": "string"
}
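As an illustration of the document shape above, the following sketch builds a CephStoragePolicies document and performs basic type checks on it. The helper names are hypothetical, and since this appendix does not say which fields are required, the checker only verifies the types of fields that are present.

```python
import json
from typing import Iterable, List

STRING_FIELDS = ("ceph_admin_key_id", "ceph_admin_secret_key", "s3_endpoint")


def make_ceph_storage_policies(s3_endpoint: str, admin_key_id: str,
                               admin_secret_key: str,
                               buckets: Iterable[str] = ()) -> dict:
    """Build a policies document following the schema shown above."""
    return {
        "DATA_TYPE": "ceph_storage_policies#1.0.0",
        "ceph_admin_key_id": admin_key_id,
        "ceph_admin_secret_key": admin_secret_key,
        "s3_buckets": list(buckets),
        "s3_endpoint": s3_endpoint,
    }


def check_ceph_storage_policies(doc: dict) -> List[str]:
    """Return a list of type problems found in the document (empty if none)."""
    problems = []
    data_type = doc.get("DATA_TYPE")
    if not isinstance(data_type, str) or not data_type.startswith("ceph_storage_policies#"):
        problems.append("DATA_TYPE must start with 'ceph_storage_policies#'")
    for field in STRING_FIELDS:
        if field in doc and not isinstance(doc[field], str):
            problems.append(f"{field} must be a string")
    buckets = doc.get("s3_buckets", [])
    if not isinstance(buckets, list) or not all(isinstance(b, str) for b in buckets):
        problems.append("s3_buckets must be a list of strings")
    return problems


if __name__ == "__main__":
    doc = make_ceph_storage_policies(
        "ceph.example.org", "<admin key id>", "<admin secret key>",
        ["research-data-bucket-1", "research-data-bucket-2"],
    )
    print(json.dumps(doc, indent=2))
```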