Consents and Resuming Runs
Introduction
This document explains how Flows interacts with tokens, clients, and consents, and offers guidance on working with inactive runs that require additional consents.
If you are new to the Globus ecosystem, you should also read about Clients, Scopes, and Consents.
Running a Flow
Whenever you run a flow for the first time using a particular client application (for example, the Globus Web App or the Globus CLI), you are asked to consent to that flow’s scope before the run can proceed. This consent authorizes the client to start that flow, and it additionally authorizes that flow to interact with specific other Globus services on your behalf.
In some cases, a flow may not know every consent that it needs at the start of a run.
In practice, this is nearly always caused by accessing a GCSv5 mapped collection during the course of a run.
(GCSv5 mapped collections have a particular scope, data_access, that a flow run often cannot know it needs until it interacts with that collection.)
When a flow encounters an additional required scope during a run, it will make that run inactive and prompt you for the missing consent (for instance, via an email notification).
Resuming a Run
While it may be possible to resume a flow run from a different client application than the one that started it, it’s recommended to resume it from the same application that started that run. The reason behind this recommendation is that when resuming the flow, the new consent you grant will only apply to:
- The specific run you are resuming
- Any future runs of the same flow started from the same client application you used to resume it
For example, imagine you have a flow that accesses a GCSv5 mapped collection during its run. You run this flow for the first time from the Globus Web App. The Globus Web App will prompt you to consent to run the flow using the scopes that it knows about in advance.
During the course of the run, this flow performs a transfer involving a GCSv5 mapped collection.
Because you have not previously run this flow from the Globus Web App, the flow does not have your consent to the data_access scope required to access that collection.
The flow makes the run inactive, notes the missing consent, and sends you an email notification with instructions for resuming the run.
Scenario 1: Return to the Same Client Application
You resume the run from the Globus Web App (the same application that you used to start the run) and consent to the missing data_access scope.
The run resumes.
This consent is now associated with the Web App and will apply to any future runs of the same flow started from the Web App.
Because of this, your runs started via the Web App will not become inactive when trying to access this particular collection.
Scenario 2: Use a Different Client Application
When you resume the run, you choose to resume from the Globus CLI. The run resumes. This consent is now associated with the Globus CLI, meaning that your runs started via the Globus CLI will not become inactive when trying to access this particular collection. However, this consent is not associated with the Web App, which means that future runs started in the Web App will still become inactive when trying to access this particular collection and prompt you for consent.
Explanation
The reason for this behavior is tied to the definition of a consent:
A consent is your authorization for a particular client application to perform a specific action—represented by a scope—on your behalf.
That means that consents are not global, but rather are specific to the particular client that you use to perform a given action.
Returning to the description of the effects of resuming indicated above, your consents apply to:
1. The specific run you are resuming
When the client application resumes the run, the flow replaces the credential (a token) originally used to start the run with the credential used to resume the run, allowing that run to proceed with the new consent.
2. Any future runs of the same flow started from the same client application you used to resume it
Because the consent is associated with the client application that you used to resume the run, future runs started from that client application will already have that consent when they attempt to access the GCSv5 mapped collection.
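The two scenarios can be sketched with a toy model in which a consent is just a (client, scope) pair. This is an illustration only and does not call any Globus service; the ConsentStore class, the client labels "web-app" and "cli", and the start_run helper are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentStore:
    """Toy model: each consent is a (client, scope) pair for a single user."""

    granted: set = field(default_factory=set)

    def grant(self, client: str, scope: str) -> None:
        self.granted.add((client, scope))

    def has(self, client: str, scope: str) -> bool:
        return (client, scope) in self.granted


def start_run(store: ConsentStore, client: str, needed_scope: str) -> str:
    """Return the status a new run reaches once it needs `needed_scope`."""
    return "ACTIVE" if store.has(client, needed_scope) else "INACTIVE"


store = ConsentStore()
DATA_ACCESS = "data_access for collection X"  # placeholder, not a real scope string

# First run from the Web App: no data_access consent exists yet.
first = start_run(store, "web-app", DATA_ACCESS)

# Scenario 2: the user resumes from the CLI, so the consent is recorded
# against the CLI only.
store.grant("cli", DATA_ACCESS)

later_cli = start_run(store, "cli", DATA_ACCESS)
later_web = start_run(store, "web-app", DATA_ACCESS)
print(first, later_cli, later_web)
```

In this model, later runs from the CLI stay active, while later runs from the Web App go inactive again because no ("web-app", scope) pair was ever granted.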
Developing Your Own Client Application
When you develop your own native client application, it’s important to consider how you will handle required scopes of which the Flows service doesn’t have advance knowledge.
In the scenario in the preceding section, when a user of your application starts a new flow run, your client application uses their access token to communicate with the Flows service. The Flows service in turn uses that access token to obtain its own refresh token, which it stores, and an access token that it can use to interact with each of the action providers (APs) in a flow.
This is where things get tricky: when a flow that your application starts requires additional consent to proceed (for example, because it accesses a GCSv5 mapped collection), that consent is associated specifically with the application used to resume the run. This means that if a user starts a flow run from your application and then resumes it from a different application, the consent they grant will not apply to future runs started from your application. In this case, the suggested options in the notification email will resume only that individual run; Flows will still prompt the user for additional consent every time they run that flow with your application, because those runs never start with the additional required consent.
In order to avoid this, you will need to consider this scenario and decide how you want to handle it. There are three potential ways to address this:
1 - Use Guest Collections
If you control the flow definition(s) that your application will run, you can use guest collections instead of GCSv5 mapped collections (at least, so long as you do not accept an arbitrary collection ID as input from your users). This is the simplest solution, but may not be possible in all cases.
2 - Request Additional Consents Prior to Starting the Run
If you know in advance that you will be accessing a particular GCSv5 mapped collection during a flow’s run, you can request consent for that scope at the time you run the flow.
This is represented in the example code below, which modifies the Globus SDK "Run a Flow" example to add a new command line argument, --collection-id, which can be specified multiple times in order to add consents for those collections.
The area of greatest interest is here:
if collection_ids:
    # Build a scope that will give the flow
    # access to specific mapped collections on your behalf
    transfer_scope = globus_sdk.TransferClient.scopes.make_mutable("all")
    transfer_action_provider_scope = MutableScope(
        TRANSFER_ACTION_PROVIDER_SCOPE_STRING
    )

    # If you declared any mapped collections above,
    # add them to the transfer scope
    for collection_id in collection_ids:
        gcs_data_access_scope = GCSCollectionScopeBuilder(
            collection_id
        ).make_mutable(
            "data_access",
            optional=True,
        )
        transfer_scope.add_dependency(gcs_data_access_scope)

    transfer_action_provider_scope.add_dependency(transfer_scope)
    all_scopes.add_dependency(transfer_action_provider_scope)
…at which point you can perform the login flow using all_scopes.
#!/usr/bin/env python

import argparse
import os
import sys

import globus_sdk
from globus_sdk.scopes import GCSCollectionScopeBuilder, MutableScope
from globus_sdk.tokenstorage import SimpleJSONFileAdapter

MY_FILE_ADAPTER = SimpleJSONFileAdapter(os.path.expanduser("~/.sdk-manage-flow.json"))

# Tutorial client ID
# We recommend replacing this with your own client for any production use-cases
CLIENT_ID = "61338d24-54d5-408f-a10d-66c06b59f6d2"
PAYLOAD = {
    "DATA": [
        {
            "source_path": "/home/globus-shared-user/ada-sandbox/test01/",
            "destination_path": "/home/globus-shared-user/ada-sandbox/test02/",
            "recursive": True,
        }
    ],
    "source_endpoint": "996383e6-0c85-4339-a5ea-c3cd855c2692",
    "destination_endpoint": "996383e6-0c85-4339-a5ea-c3cd855c2692",
}
NATIVE_CLIENT = globus_sdk.NativeAppAuthClient(CLIENT_ID)
TRANSFER_ACTION_PROVIDER_SCOPE_STRING = (
    "https://auth.globus.org/scopes/actions.globus.org/transfer/transfer"
)


def do_login_flow(scope):
    NATIVE_CLIENT.oauth2_start_flow(
        requested_scopes=scope,
        refresh_tokens=True,
    )
    authorize_url = NATIVE_CLIENT.oauth2_get_authorize_url()
    print(f"Please go to this URL and login:\n\n{authorize_url}\n")
    auth_code = input("Then, please enter the code here: ").strip()
    tokens = NATIVE_CLIENT.oauth2_exchange_code_for_tokens(auth_code)
    return tokens


def get_authorizer(flow_id, collection_ids=None):
    flow_scopes = globus_sdk.SpecificFlowClient(flow_id).scopes
    all_scopes = flow_scopes.make_mutable("user")

    # tag::pre-authorize[]
    if collection_ids:
        # Build a scope that will give the flow
        # access to specific mapped collections on your behalf
        transfer_scope = globus_sdk.TransferClient.scopes.make_mutable("all")
        transfer_action_provider_scope = MutableScope(
            TRANSFER_ACTION_PROVIDER_SCOPE_STRING
        )

        # If you declared any mapped collections above,
        # add them to the transfer scope
        for collection_id in collection_ids:
            gcs_data_access_scope = GCSCollectionScopeBuilder(
                collection_id
            ).make_mutable(
                "data_access",
                optional=True,
            )
            transfer_scope.add_dependency(gcs_data_access_scope)

        transfer_action_provider_scope.add_dependency(transfer_scope)
        all_scopes.add_dependency(transfer_action_provider_scope)
    # end::pre-authorize[]

    # Try to load the tokens from the file, possibly returning None
    if MY_FILE_ADAPTER.file_exists() and not collection_ids:
        tokens = MY_FILE_ADAPTER.get_token_data(flow_id)
    else:
        tokens = None

    if tokens is None:
        # Do a login flow, getting back initial tokens
        response = do_login_flow(all_scopes)
        # Now store the tokens and pull out the correct token
        MY_FILE_ADAPTER.store(response)
        tokens = response.by_resource_server[flow_id]

    return globus_sdk.RefreshTokenAuthorizer(
        tokens["refresh_token"],
        NATIVE_CLIENT,
        access_token=tokens["access_token"],
        expires_at=tokens["expires_at_seconds"],
        on_refresh=MY_FILE_ADAPTER.on_refresh,
    )


def get_flow_client(flow_id, collection_ids=None):
    authorizer = get_authorizer(flow_id, collection_ids=collection_ids)
    return globus_sdk.SpecificFlowClient(flow_id, authorizer=authorizer)


def run_flow(args):
    flow_client = get_flow_client(
        args.FLOW_ID,
        collection_ids=args.collection_ids,
    )
    print(flow_client.run_flow(PAYLOAD))


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("FLOW_ID", help="Flow ID to run")
    parser.add_argument(
        "--collection-id",
        action="append",
        dest="collection_ids",
        help="Collection ID to add consent for",
    )
    args = parser.parse_args()

    try:
        run_flow(args)
    except globus_sdk.FlowsAPIError as e:
        print(f"API Error: {e.code} {e.message}")
        print(e.text)
        sys.exit(1)


if __name__ == "__main__":
    main()
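To make the nesting of scopes in this approach easier to see, here is a minimal sketch that builds the requested scope as a plain string, without the SDK. It assumes the usual Globus Auth conventions: dependent scopes go in square brackets, and an optional dependency is marked with a leading asterisk. The helper names and the two UUIDs are hypothetical placeholders, and the exact string the SDK produces may differ in detail.

```python
TRANSFER_AP_SCOPE = (
    "https://auth.globus.org/scopes/actions.globus.org/transfer/transfer"
)
TRANSFER_ALL_SCOPE = "urn:globus:auth:scope:transfer.api.globus.org:all"


def flow_user_scope(flow_id: str) -> str:
    """The per-flow 'user' scope that authorizes running that flow."""
    name = f"flow_{flow_id.replace('-', '_')}_user"
    return f"https://auth.globus.org/scopes/{flow_id}/{name}"


def data_access_scope(collection_id: str) -> str:
    """Optional (asterisk-prefixed) data_access scope for one mapped collection."""
    return f"*https://auth.globus.org/scopes/{collection_id}/data_access"


def build_scope_string(flow_id: str, collection_ids: list) -> str:
    """Nest data_access under Transfer, under the Transfer AP, under the flow."""
    user_scope = flow_user_scope(flow_id)
    if not collection_ids:
        return user_scope
    deps = " ".join(data_access_scope(c) for c in collection_ids)
    return f"{user_scope}[{TRANSFER_AP_SCOPE}[{TRANSFER_ALL_SCOPE}[{deps}]]]"


# Hypothetical placeholder IDs, used for illustration only
FLOW_ID = "00000000-0000-0000-0000-000000000001"
COLLECTION_ID = "11111111-1111-1111-1111-111111111111"

scope_string = build_scope_string(FLOW_ID, [COLLECTION_ID])
print(scope_string)
```

Reading the result from the outside in mirrors the chain of services acting on your behalf: the flow, then the Transfer action provider, then the Transfer service, then the collection itself.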
3 - Add Resume Support to Your Application
If you don’t know which GCSv5 collection you’ll be accessing or want to be able to generically handle any missing consent encountered during the course of a run, you will want to support some method of resuming via your application (this could, for instance, be integrated as a distinct command when launching your application or performed by a separate script that uses the same client ID).
This is represented in the example script below, which modifies the Globus SDK "Create, Delete, and Run Flows" example to add a new command, resume, which requires a new --run-id argument.
Several areas of this script have been modified in important ways, but the area of greatest interest is:
def resume_run(args):
    # Get a client for the Flows service
    flows_client = get_flows_client()

    # Get the run to resume
    run = flows_client.get_run(args.run_id)
    flow_id = run.data["flow_id"]

    # Get the scope that is required in order to resume the run
    required_scope = run.data["details"].get("required_scope", None)

    # Have the user consent to the required scope
    flow_client = get_specific_flow_client(flow_id, scope=required_scope)

    # Resume the run
    print(flow_client.resume_run(args.run_id))
#!/usr/bin/env python

import argparse
import os
import sys

import globus_sdk
from globus_sdk.tokenstorage import SimpleJSONFileAdapter

MY_FILE_ADAPTER = SimpleJSONFileAdapter(os.path.expanduser("~/.sdk-manage-flow.json"))

# Tutorial client ID
# We recommend replacing this with your own client for any production use-cases
CLIENT_ID = "61338d24-54d5-408f-a10d-66c06b59f6d2"

NATIVE_CLIENT = globus_sdk.NativeAppAuthClient(CLIENT_ID)


def do_login_flow(scope):
    NATIVE_CLIENT.oauth2_start_flow(requested_scopes=scope, refresh_tokens=True)
    authorize_url = NATIVE_CLIENT.oauth2_get_authorize_url()
    print(f"Please go to this URL and login:\n\n{authorize_url}\n")
    auth_code = input("Then, please enter the code here: ").strip()
    tokens = NATIVE_CLIENT.oauth2_exchange_code_for_tokens(auth_code)
    return tokens


def get_authorizer(flow_id=None, force=False, scope=None):
    if flow_id:
        resource_server = flow_id
        scope = scope if scope else globus_sdk.SpecificFlowClient(flow_id).scopes.user
    else:
        resource_server = globus_sdk.FlowsClient.resource_server
        scope = (
            scope
            if scope
            else [
                globus_sdk.FlowsClient.scopes.manage_flows,
                globus_sdk.FlowsClient.scopes.run_manage,
            ]
        )

    # Try to load the tokens from the file, possibly returning None
    if MY_FILE_ADAPTER.file_exists():
        tokens = MY_FILE_ADAPTER.get_token_data(resource_server)
    else:
        tokens = None

    if tokens is None or force:
        # Do a login flow, getting back initial tokens
        response = do_login_flow(scope)
        # Now store the tokens and pull out the correct token
        MY_FILE_ADAPTER.store(response)
        tokens = response.by_resource_server[resource_server]

    return globus_sdk.RefreshTokenAuthorizer(
        tokens["refresh_token"],
        NATIVE_CLIENT,
        access_token=tokens["access_token"],
        expires_at=tokens["expires_at_seconds"],
        on_refresh=MY_FILE_ADAPTER.on_refresh,
    )


def get_flows_client():
    return globus_sdk.FlowsClient(authorizer=get_authorizer())


def get_specific_flow_client(flow_id, scope=None):
    if scope:
        authorizer = get_authorizer(flow_id, force=True, scope=scope)
    else:
        authorizer = get_authorizer(flow_id)
    return globus_sdk.SpecificFlowClient(flow_id, authorizer=authorizer)


def create_flow(args):
    flows_client = get_flows_client()
    print(
        flows_client.create_flow(
            title=args.title,
            definition={
                "StartAt": "DoIt",
                "States": {
                    "DoIt": {
                        "Type": "Action",
                        "ActionUrl": "https://actions.globus.org/hello_world",
                        "Parameters": {
                            "echo_string": "Hello, Asynchronous World!",
                        },
                        "End": True,
                    }
                },
            },
            input_schema={},
            subtitle="A flow created by the SDK tutorial",
        )
    )


def delete_flow(args):
    flows_client = get_flows_client()
    print(flows_client.delete_flow(args.flow_id))


def list_flows():
    flows_client = get_flows_client()
    for flow in flows_client.list_flows(filter_role="flow_owner"):
        print(f"title: {flow['title']}")
        print(f"id: {flow['id']}")
        print()


def run_flow(args):
    flow_client = get_specific_flow_client(args.flow_id)
    print(flow_client.run_flow({}))


# tag::resume-command[]
def resume_run(args):
    # Get a client for the Flows service
    flows_client = get_flows_client()

    # Get the run to resume
    run = flows_client.get_run(args.run_id)
    flow_id = run.data["flow_id"]

    # Get the scope that is required in order to resume the run
    required_scope = run.data["details"].get("required_scope", None)

    # Have the user consent to the required scope
    flow_client = get_specific_flow_client(flow_id, scope=required_scope)

    # Resume the run
    print(flow_client.resume_run(args.run_id))
# end::resume-command[]


def logout():
    for tokendata in MY_FILE_ADAPTER.get_by_resource_server().values():
        for tok_key in ("access_token", "refresh_token"):
            token = tokendata[tok_key]
            NATIVE_CLIENT.oauth2_revoke_token(token)

    os.remove(MY_FILE_ADAPTER.filename)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "action", choices=["logout", "create", "delete", "list", "run", "resume"]
    )
    parser.add_argument("-f", "--flow-id", help="Flow ID for delete and run")
    parser.add_argument("-r", "--run-id", help="Run ID for resume")
    parser.add_argument("-t", "--title", help="Name for create")
    args = parser.parse_args()

    try:
        if args.action == "logout":
            logout()
        elif args.action == "create":
            if args.title is None:
                parser.error("create requires --title")
            create_flow(args)
        elif args.action == "delete":
            if args.flow_id is None:
                parser.error("delete requires --flow-id")
            delete_flow(args)
        elif args.action == "list":
            list_flows()
        elif args.action == "run":
            if args.flow_id is None:
                parser.error("run requires --flow-id")
            run_flow(args)
        elif args.action == "resume":
            if args.run_id is None:
                parser.error("resume requires --run-id")
            resume_run(args)
        else:
            raise NotImplementedError()
    except globus_sdk.FlowsAPIError as e:
        print(f"API Error: {e.code} {e.message}")
        print(e.text)
        sys.exit(1)


if __name__ == "__main__":
    main()
In this case, the script first gets the run details from the Flows service to determine which scope is required, then uses that scope to prompt the user for consent. The resulting consent is associated with your client (the one identified by CLIENT_ID in the script).
The advantage of this method is that it could potentially handle scopes that are not known in advance by relying on the Flows service to report the consents it needs in order to continue.
The disadvantage is that it will require a second consent flow the first time that a run becomes inactive (but subsequent runs for the same user with this client will proceed without additional interruption, provided the user does not revoke those consents).
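If your application checks runs before offering to resume them, the decision can be isolated in a small helper that works on the run document as a plain dictionary. The sketch below assumes the run document carries a "status" field whose value is "INACTIVE" for a paused run, and that the missing scope appears under details.required_scope as in the script above. The helper name and both sample documents are hypothetical and are not real API output.

```python
from typing import Optional


def scope_needed_to_resume(run_doc: dict) -> Optional[str]:
    """Return the scope the user must consent to, or None if nothing is pending."""
    if run_doc.get("status") != "INACTIVE":
        return None
    details = run_doc.get("details") or {}
    return details.get("required_scope")


# Hypothetical run documents, trimmed to the fields this helper reads
inactive_run = {
    "run_id": "example-run-1",
    "status": "INACTIVE",
    "details": {"required_scope": "example-scope-string"},
}
active_run = {"run_id": "example-run-2", "status": "ACTIVE", "details": {}}

print(scope_needed_to_resume(inactive_run))
print(scope_needed_to_resume(active_run))
```

Keeping this check separate from the login flow lets the application skip the consent prompt entirely for runs that are not waiting on one.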