Globus Connect Server Identity Mapping Guide
1. Identity Mapping Policies
Globus Connect Server v5.4 supports a flexible system for mapping user identity information in Globus to the local account needed to access data on a variety of storage systems. This includes a default mapping for cases where there is only one allowed domain, as well as pattern-based mappings and callouts to external programs for custom mapping algorithms.
1.1. Default Identity to Username Mapping
By default, if the storage gateway is configured to allow identities from a single domain, one of the following mappings is applied to translate the user’s identity in the allowed domain into the storage gateway user namespace:
- For connectors such as BlackPearl, Ceph, and POSIX, Globus Connect Server strips off everything from the @ character onward when it maps an identity to an account. The username user@example.org is therefore mapped to the account user.
- Some connectors (Box, Google Cloud Storage, Google Drive) require that the account name be qualified by a domain name. For these connectors, Globus Connect Server retains the entire username by default, so the username user@example.org is mapped to the account user@example.org.
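The two default behaviors can be illustrated with a minimal Python sketch. This is not Globus Connect Server code: the function name `default_account` and the set `DOMAIN_QUALIFIED_CONNECTORS` are hypothetical, and the connector names are used only as illustrative labels taken from the list above.

```python
# Connectors that, per the list above, keep the full username by default.
# Hypothetical constant for illustration; not read from Globus Connect Server.
DOMAIN_QUALIFIED_CONNECTORS = {"Box", "Google Cloud Storage", "Google Drive"}


def default_account(username: str, connector: str) -> str:
    """Return the account a single-domain gateway would map `username` to."""
    if connector in DOMAIN_QUALIFIED_CONNECTORS:
        # Domain-qualified connectors keep user@example.org unchanged.
        return username
    # Other connectors drop everything from the first @ character onward.
    return username.split("@", 1)[0]


print(default_account("user@example.org", "POSIX"))
print(default_account("user@example.org", "Google Drive"))
```

The first call prints `user` and the second prints `user@example.org`, matching the two cases described above.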
1.2. Custom Identity Mapping
Globus Connect Server provides two ways for you to implement a custom Globus identity to account mapping: expression-based and external program.
With expression-based mapping you can write rules that extract data from fields in the Globus identity document to form storage gateway-specific usernames. If there is a regular relationship between most of your users' identity information and their account names, this is probably the most direct way to accomplish the mapping. See the Expression-Based Account Mapping Reference section of this document for a reference on the syntax and features of this mapping system.
With external program mappings you can use any mechanism you like (static mapping, LDAP, database, etc.) to look up account information and return the mapped account user name. This may be necessary if your account system has usernames without a simple relationship to your users' Globus identities, or if the mapping requires interfacing with an accounting system. See the External Mapping Programs Reference section of this document for a reference on the expected inputs, outputs, and command-line options used in this system.
You can configure a storage gateway to use these custom mappings by passing a JSON document or a reference to a file containing a JSON document as the argument to the --identity-mapping command-line option to globus-connect-server storage-gateway create or globus-connect-server storage-gateway update. The properties of the JSON document are described in Identity Mapping Document Types, but are also illustrated in the following section showing common recipes.
2. Mapping Recipes
This section presents mapping recipes that illustrate how the mapping system works and provide a base for deploying your own custom mappings.
2.1. Map identity username
An administrator wants to map an identity username to an account name for a POSIX, Ceph, or BlackPearl system, where the names match the identity username property if the username is from the example.org domain:
{
  "DATA_TYPE": "expression_identity_mapping#1.0.0",
  "mappings": [
    {
      "source": "{username}",
      "match": "(.*)@example\\.org",
      "output": "{0}"
    }
  ]
}
For Storage Gateways using the Google Drive, Google Cloud Storage or Box Connectors, the expressions would be slightly different to include the domain name in the output:
{
  "DATA_TYPE": "expression_identity_mapping#1.0.0",
  "mappings": [
    {
      "source": "{username}",
      "match": "(.*)@example\\.org",
      "output": "{0}@example.org"
    }
  ]
}
2.2. Map identities from different domains
An administrator wants to map identities from domain example.org as in the previous example, but also support a list of users from another domain with explicit account mappings. This might be the situation where there is an identity provider that matches the local username list, but guests from other institutions might also have accounts based on their email address.
{
  "DATA_TYPE": "expression_identity_mapping#1.0.0",
  "mappings": [
    {
      "source": "{username}",
      "match": "(.*)@example\\.org",
      "output": "{0}"
    },
    {
      "source": "{email}",
      "match": "user@example\\.edu",
      "output": "example_edu_user",
      "ignore_case": true
    },
    {
      "source": "{email}",
      "match": "user@example\\.com",
      "output": "example_com_user",
      "ignore_case": true
    }
  ]
}
2.3. Map Application identities
An administrator wants to allow an application access to a specific account. The application is executed using an identity from https://developers.globus.org, and the administrator knows the client_id of this identity and wants to map that to the local account globus_app.
{
  "DATA_TYPE": "expression_identity_mapping#1.0.0",
  "mappings": [
    {
      "source": "{id}",
      "match": "bcbfd6cf-7a36-4090-bc55-39195db89d94",
      "output": "globus_app",
      "literal": true
    }
  ]
}
2.4. Map Identities to a GSuite or Box Domain
An administrator wants to grant access to a Google Drive Storage Gateway, but the GSuite or Box domain does not match the domain used in the username field of the identities from the site’s identity provider. For example, a site example.org uses gsuite.example.org as their GSuite domain.
{
  "DATA_TYPE": "expression_identity_mapping#1.0.0",
  "mappings": [
    {
      "source": "{username}",
      "match": "(.*)@example\\.org",
      "output": "{0}@gsuite.example.org"
    }
  ]
}
2.5. Example Mapping Program
This example mapping program loads a mapping file from disk based on the storage gateway ID and then compares the contents to the identity set to map the user. The following configuration uses this program, assuming it is installed at /opt/globus/mapapp-example.py and run with the default Python 3 interpreter:
{
  "DATA_TYPE": "external_identity_mapping#1.0.0",
  "command": [
    "/usr/bin/python3",
    "/opt/globus/mapapp-example.py"
  ]
}
3. Validating Identity Mappings
An administrator can use the globus-idm-validator tool to validate the behavior of an identity mapping configuration document against an identity document. This command takes two inputs: a JSON file containing an array of identity mappings and a JSON file containing an identity set.
As an example, to test the Map identity username recipe, do the following:
idmap.json
[
  {
    "DATA_TYPE": "expression_identity_mapping#1.0.0",
    "mappings": [
      {
        "source": "{username}",
        "match": "(.*)@example\\.org",
        "output": "{0}"
      }
    ]
  }
]
idset.json
{
  "identities": [
    {
      "email": "user@example.org",
      "id": "ce986100-ab06-4991-9fa8-5302d2215bee",
      "identity_provider": "2eb7f0cf-5c3c-4073-9fc7-8b3df5f0389f",
      "identity_type": "login",
      "name": "Joe User",
      "organization": "Example Organization",
      "status": "used",
      "username": "user@example.org"
    }
  ]
}
$ globus-idm-validator -c idmap.json -i idset.json
All mapped identities:
[
  {
    "ce986100-ab06-4991-9fa8-5302d2215bee": [
      "user"
    ]
  }
]
This shows that the identity mapping correctly maps user@example.org to user.
If there were an error in the mapping file it would be displayed. If the mapping yielded no result, the output array for the identity UUID would be empty.
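When the validator is run from a script, the empty-array case can be detected automatically. The following is a minimal sketch that assumes the output has exactly the shape shown above: one heading line followed by a JSON array of objects keyed by identity UUID. The function name `unmapped_identities` and the constant `SAMPLE_OUTPUT` are hypothetical, and the sketch does not invoke globus-idm-validator itself.

```python
import json

# Output copied from the example above (assumed to be representative).
SAMPLE_OUTPUT = """All mapped identities:
[
  {
    "ce986100-ab06-4991-9fa8-5302d2215bee": [
      "user"
    ]
  }
]
"""


def unmapped_identities(validator_output: str) -> list:
    """Return the identity UUIDs whose list of mapped accounts is empty."""
    # Assumption: the heading line contains no "[" and the rest is JSON.
    start = validator_output.index("[")
    entries = json.loads(validator_output[start:])
    return [
        uuid
        for entry in entries
        for uuid, accounts in entry.items()
        if not accounts
    ]


print(unmapped_identities(SAMPLE_OUTPUT))
```

For the sample output the list is empty, because the only identity maps to the account user.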
4. Expression-Based Account Mapping Reference
Admins may configure one or more expression-based mappings to map identity resources to storage gateway accounts. Each mapping consists of three expressions: the source expression, the match expression, and the output expression.
When mapping an identity to a storage gateway account, Globus Connect Server first verifies that the identity is in an allowed domain and, if applicable, that the login session satisfies the high assurance policies. It then does the following:
- Interpolate the identity data into the source string.
- Apply the match expression to the result.
- If matching succeeds, interpolate the match results into the output string.
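The three steps above can be sketched in Python to make the data flow concrete. This is an approximation under stated assumptions, not the Globus Connect Server implementation: the function name `map_identity` is hypothetical, Python's `str.format` stands in for the interpolation syntax, and Python's `re` module accepts more syntax than the match language described later in this section.

```python
import re


def map_identity(identity: dict, mapping: dict):
    """Apply one expression mapping to one identity.

    Returns the mapped account name, or None if the match fails.
    """
    # Step 1: interpolate identity properties into the source string.
    source = mapping["source"].format_map(identity)
    # Step 2: the match expression must match the entire source string.
    match = re.fullmatch(mapping["match"], source)
    if match is None:
        return None
    # Step 3: interpolate match groups ({0}, {1}, ...) and identity
    # properties into the output string.
    return mapping["output"].format(*match.groups(), **identity)


IDENTITY = {
    "id": "ce986100-ab06-4991-9fa8-5302d2215bee",
    "username": "user@example.org",
    "email": "user@email.example.org",
}
RULE = {
    "source": "{username}",
    "match": r"(.*)@example\.org",  # written as "(.*)@example\\.org" in JSON
    "output": "{0}@gsuite.example.org",
}

print(map_identity(IDENTITY, RULE))
```

With the rule from the GSuite recipe, the identity user@example.org yields user@gsuite.example.org. An identity from another domain would yield None, because the match is anchored at both ends of the source string.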
4.1. Source Expression
A source expression consists of a text string which can have properties from an identity resource interpolated into the string data. Any value available in the identity resource may be interpolated into the source string. In most cases, one of username, id, or email is used. The syntax to interpolate a property into the string is to surround the property name with curly braces.
4.1.1. Source Expression Example
These examples assume the following as the identity resource being interpolated into the source string:
{
  "id": "ce986100-ab06-4991-9fa8-5302d2215bee",
  "username": "user@example.org",
  "email": "user@email.example.org"
}
Source String | Interpolated Value |
---|---|
{username}:{id} | user@example.org:ce986100-ab06-4991-9fa8-5302d2215bee |
4.2. Match Expression
A match expression is a regular expression which is evaluated against the interpolated source string. It is matched against the entire string, implicitly anchored at the beginning and end of that string.
This expression supports the following special characters:
- . matches any single character
- ? for zero or one of the preceding match
- * for zero or more of the preceding match
- | for alternative match branches
- () to group a match and create a numbered back reference for output interpolation
- \ to escape any of the above, \, or " from special processing.
In the JSON serialization of a match expression, all \ characters must be doubled in order to be included in a string.
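One way to avoid doubling backslashes by hand is to build the mapping document in a program and let a JSON library serialize it. The sketch below uses Python's standard `json` module; the variable names are illustrative, and the document mirrors the Map identity username recipe.

```python
import json

# A raw string holds the expression exactly as the matcher should see it:
# a single backslash before the dot.
match_expression = r"(.*)@example\.org"

document = {
    "DATA_TYPE": "expression_identity_mapping#1.0.0",
    "mappings": [
        {"source": "{username}", "match": match_expression, "output": "{0}"}
    ],
}

# json.dumps escapes the backslash, so the serialized text contains \\.
serialized = json.dumps(document, indent=2)
print(serialized)

# Parsing the text again restores the single backslash.
print(json.loads(serialized)["mappings"][0]["match"] == match_expression)
```

The serialized text can then be saved to a file and supplied wherever a mapping document is expected.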
4.2.1. Match Expression Examples
These examples assume the following as the interpolated source expression (the final example from the previous section):
"user@example.org:ce986100-ab06-4991-9fa8-5302d2215bee"
| Match Expression | Matches Source | Description | Groups |
|---|---|---|---|
| | Yes | Match users in a domain, capturing the username portion as the first group | 0: |
| | Yes | Match a specific identity ID. | None |
| | Yes | Match a user with an identity from either the | 0: |
| | Yes | Match a specific username. | None |
| | Yes | Match a username whose domain may contain an optional subdomain | 0: |
| | No | Attempt to match a domain that doesn’t match the source | None |
4.3. Matching Flags
There are two matching flags supported by this language to make some common forms of matches easier to write.
4.3.1. Ignore Case
The ignore_case matching flag causes the match expression to be interpreted in a case-insensitive way. This may be necessary if user email addresses are not consistently capitalized by the identity provider.
4.3.2. Literal
The literal matching flag causes the match expression to be interpreted as a literal string match. All special characters described in Match Expression are treated literally and no special processing is done for them. This may be helpful if no grouping is needed when matching to simplify an expression containing literal . or other characters.
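The effect of the two flags can be approximated with Python's `re` module. This is a sketch for building intuition, not the server's matcher: the helper name `compile_match` is hypothetical, `re.IGNORECASE` stands in for ignore_case, and `re.escape` stands in for literal.

```python
import re


def compile_match(mapping: dict) -> "re.Pattern":
    """Compile a mapping's match expression, honoring the two flags."""
    pattern = mapping["match"]
    if mapping.get("literal", False):
        # Treat every character, including . and *, as itself.
        pattern = re.escape(pattern)
    flags = re.IGNORECASE if mapping.get("ignore_case", False) else 0
    return re.compile(pattern, flags)


guest = {"match": r"user@example\.edu", "ignore_case": True}
plain = {"match": "user@example.org"}
exact = {"match": "user@example.org", "literal": True}

# ignore_case: capitalization differences no longer prevent a match.
print(compile_match(guest).fullmatch("User@EXAMPLE.EDU") is not None)
# Without literal, the unescaped . matches any character.
print(compile_match(plain).fullmatch("user@exampleXorg") is not None)
# With literal, the . only matches a real dot.
print(compile_match(exact).fullmatch("user@exampleXorg") is not None)
```

The three lines print True, True, and False. The last two show why literal is convenient for fixed strings such as usernames: no escaping is needed to keep . from acting as a wildcard.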
4.4. Output Expression
An output expression consists of a text string which can have either properties from an identity resource or groups from the match expression interpolated into the string data. Any value available in the identity resource may be interpolated into the output string, as in the source expression. Groups are numbered from 0, starting at the leftmost parenthesis, and are referenced using a syntax similar to property interpolation. For example, the string {0} is replaced with the first group match.
4.4.1. Output Examples
These examples assume the following identity and match groups:
{
  "id": "ce986100-ab06-4991-9fa8-5302d2215bee",
  "username": "user@example.org",
  "email": "user@email.example.org"
}
Match Group | Value |
---|---|
0 | |
1 | |
| Expression | Description | Output |
|---|---|---|
| | Map to user without domain | |
| | Map to user without domain with some prefix character | |
| | Map to user with domain | |
| | Map to a fixed user name | |
| | Map to the email address of the identity | |
| | Replace the domain in the mapping | |
5. External Mapping Programs Reference
If the matching language is insufficient to process the identity to account mapping for a site, a developer may write a standalone program which Globus Connect Server calls when it needs to map an identity. This may be the case if the mapping requires a database lookup or other interpretation of the identity that cannot be implemented using the matching syntax.
The program is configured as a property of the storage gateway and may include a full command path as well as any required command-line options. When Globus Connect Server requires an identity to be mapped, the program is called, passing in a JSON document on stdin and expecting as output a JSON document on stdout in formats described below.
The program will run as the local GCS Manager account gcsweb. Ensure that the file and path permissions allow execution by gcsweb. If SELinux is enforcing, the program must be labeled bin_t in a directory also labeled bin_t.
5.1. Command Line Options
The external program must be able to handle the following command-line options:
| Option | Description |
|---|---|
| | Map accounts for the given connector. The ID is the UUID of the connector, as defined in the next table. This allows the program to implement different behavior based on the connector in use. This could be used, for example, to include the domain name in results from a Box mapping request. This option will always be present when called from the GCS Manager. |
| | Map accounts for the given storage gateway. This allows the program to support multiple storage gateways and implement different behavior for each. |
| -a | Return all matching accounts. If not present, return the first account which the identity matches. If present, return all accounts which the identity matches. |
Connector | Connector ID |
---|---|
ActiveScale | 7251f6c8-93c9-11eb-95ba-12704e0d6a4d |
Azure Blob | 9436da0c-a444-11eb-af93-12704e0d6a4d |
SpectraLogic BlackPearl | 7e3f3f5e-350c-4717-891a-2f451c24b0d4 |
Box | 7c100eae-40fe-11e9-95a3-9cb6d0d9fd63 |
Ceph | 1b6374b0-f6a4-4cf7-a26f-f262d9c6ca72 |
Dropbox | 49b00fd6-63f1-48ae-b27f-d8af4589f876 |
Google Cloud Storage | 56366b96-ac98-11e9-abac-9cb6d0d9fd63 |
Google Drive | 976cf0cf-78c3-4aab-82d2-7c16adbcc281 |
HPSS | fb656a17-0f69-4e59-95ff-d0a62ca7bdf5 |
iRODS | e47b6920-ff57-11ea-8aaa-000c297ab3c2 |
OneDrive | 28ef55da-1f97-11eb-bdfd-12704e0d6a4d |
POSIX | 145812c8-decc-41f1-83cf-bb2a85a2a70b |
POSIX Staging | 052be037-7dda-4d20-b163-3077314dc3e6 |
S3 | 7643e831-5f6c-4b47-a07f-8ee90f401d23 |
5.2. Input Document
The input document is a JSON object which includes the identities which the GCS Manager received from the client. The program receives the document on stdin. The document contains the following properties:
Property | Type | Description |
---|---|---|
DATA_TYPE | string | Type of this document. |
identities | array (object) | A list of identity resource objects that the caller has provided to Globus Connect Server that are to be mapped. |
{
  "DATA_TYPE": "identity_mapping_input#1.0.0",
  "identities": [
    {
      "id": "ce986100-ab06-4991-9fa8-5302d2215bee",
      "username": "user@example.org",
      "email": "user@email.example.org"
    },
    {
      "id": "2e5a9c8e-4261-4be3-b2ec-c499a4410d3e",
      "username": "user@example.edu",
      "email": "user@EXAMPLE.EDU"
    }
  ]
}
5.3. Output Document
The output document is a JSON object which includes the results of mapping the identities. It may include zero or one mappings if the -a option is not included in the program invocation, or zero or more mappings if the -a option is included. The program sends the document to stdout. The document must contain only the following properties; missing or unknown fields will result in failure:
Property | Type | Description |
---|---|---|
DATA_TYPE | string | Type of this document. |
result | array (object) | A list of result objects that the program has mapped the identity to. These objects require the following two fields: |
result[].id | string (uuid) | The input identity ID that the result matches. |
result[].output | string | The account name that the identity maps to. |
{
  "DATA_TYPE": "identity_mapping_output#1.0.0",
  "result": [
    {
      "id": "ce986100-ab06-4991-9fa8-5302d2215bee",
      "output": "user"
    },
    {
      "id": "2e5a9c8e-4261-4be3-b2ec-c499a4410d3e",
      "output": "another_account"
    }
  ]
}
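To show how the input and output documents fit together, here is a minimal sketch of an external mapping program in Python. It is an illustration under stated assumptions, not the example program referenced earlier: the `ACCOUNTS` table, `map_identities`, and `main` are hypothetical names, the lookup is a static dictionary keyed by username, and only the -a option is interpreted (any other options are accepted and ignored). In-memory streams stand in for stdin and stdout so the sketch runs on its own.

```python
import argparse
import io
import json

# Hypothetical static table: Globus username -> local accounts, in priority order.
ACCOUNTS = {
    "user@example.org": ["user"],
    "user@example.edu": ["another_account"],
}


def map_identities(document: dict, return_all: bool) -> dict:
    """Build an output document for the identities in an input document."""
    results = []
    for identity in document["identities"]:
        for account in ACCOUNTS.get(identity["username"], []):
            results.append({"id": identity["id"], "output": account})
    if not return_all:
        # Without -a the output may hold zero or one mapping.
        results = results[:1]
    return {"DATA_TYPE": "identity_mapping_output#1.0.0", "result": results}


def main(argv, stdin, stdout) -> int:
    parser = argparse.ArgumentParser()
    parser.add_argument("-a", dest="return_all", action="store_true")
    # Options not interpreted by this sketch are collected and ignored.
    args, _ignored = parser.parse_known_args(argv)
    json.dump(map_identities(json.load(stdin), args.return_all), stdout)
    return 0


# Demonstration with the input document shown in the previous section.
REQUEST = {
    "DATA_TYPE": "identity_mapping_input#1.0.0",
    "identities": [
        {"id": "ce986100-ab06-4991-9fa8-5302d2215bee",
         "username": "user@example.org", "email": "user@email.example.org"},
        {"id": "2e5a9c8e-4261-4be3-b2ec-c499a4410d3e",
         "username": "user@example.edu", "email": "user@EXAMPLE.EDU"},
    ],
}
reply = io.StringIO()
main(["-a"], io.StringIO(json.dumps(REQUEST)), reply)
print(reply.getvalue())
```

With -a the sketch produces the two results shown in the output document above; without it, only the first. A deployable version would replace the demonstration with a call such as `sys.exit(main(sys.argv[1:], sys.stdin, sys.stdout))` and would likely use the connector and storage gateway options to select a mapping table.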
6. Globus Help Resources
6.1. Documentation Website
This website (docs.globus.org) contains a wealth of information about configuring and using the Globus service. Many common issues can be resolved quickly by browsing our frequently asked questions and reading the relevant guides and how-tos. We recommend consulting these resources first when looking for fast resolution to any issue you are having with the Globus service.
6.2. Mailing Lists
If you use Globus, then participating in one or more of the public email lists is an excellent way to keep in touch with your peers in the Globus Community. For questions about managing your Globus deployment, e.g. installing software for a Globus endpoint, configuring your firewall, and integrating your institution’s identity system, subscribe to the admin list. For other inquiries and discussions, try the user or developer lists. For more information on mailing lists and how to subscribe, click here.
6.3. Globus Support
Questions or issues that pertain to Globus Connect Server v5 installation or to any client or service that is used in the Globus software-as-a-service (SaaS) or platform-as-a-service (PaaS) offering can be directed to the Globus support team by submitting a ticket. Subscriptions include a guaranteed support service level.
When submitting a ticket for an issue with Globus Connect Server, please include the endpoint name, a description of your issue, and screenshot/text dumps of any errors you are seeing. Please also include the output of Globus Connect Server’s self-diagnostic command, run as root, from the server hosting the endpoint:
globus-connect-server self-diagnostic
Appendix A: Identity Mapping Document Types
The JSON document types described in this appendix are the current types of data that can be set as the members of the identity_mappings array in a storage gateway configuration document.
7. IdentityMapping Document
Globus Connect Server provides two ways for you to implement a custom Globus identity to account mapping: expression-based and external program.
With expression-based mapping you can write rules that extract data from fields in the Globus identity document to form storage gateway-specific usernames. If there is a regular relationship between most of your users' identity information and their account names, this is probably the most direct way to accomplish the mapping.
With external program mappings you can use any mechanism you like (static mapping, LDAP, database, etc.) to look up account information and return the mapped account user name. This may be necessary if your account system has usernames without a simple relationship to your users' Globus identities, or if the mapping requires interfacing with an accounting system.
One of the following schemas:
{
  "DATA_TYPE": "external_identity_mapping#1.0.0",
  "command": [
    "string"
  ]
}