Scheduling a Repeating Transfer
The Globus Timers service allows users to schedule tasks to run at a future
date or on a recurring schedule.
In particular, Timers allows scheduled Transfer Tasks, which can be submitted in much the same way as submissions to the Globus Transfer service.
Creating a Transfer Timer
To set up recurring transfers, you will need to create a timer. The timer contains a Transfer Task submission, and it submits that Transfer Task each time it runs.
import datetime
import globus_sdk
from globus_sdk.experimental.globus_app import UserApp
# Tutorial Client ID - <replace this with your own client>
NATIVE_CLIENT_ID = "61338d24-54d5-408f-a10d-66c06b59f6d2"
USER_APP = UserApp("manage-timers-example", client_id=NATIVE_CLIENT_ID)
# Globus Tutorial Collection 1
# https://app.globus.org/file-manager/collections/6c54cade-bde5-45c1-bdea-f4bd71dba2cc
SRC_COLLECTION = "6c54cade-bde5-45c1-bdea-f4bd71dba2cc"
SRC_PATH = "/share/godata/file1.txt"
# Globus Tutorial Collection 2
# https://app.globus.org/file-manager/collections/31ce9ba0-176d-45a5-add3-f37d233ba47d
DST_COLLECTION = "31ce9ba0-176d-45a5-add3-f37d233ba47d"
DST_PATH = "/~/example-timer-destination.txt"
# as with an immediate data transfer, we take our input data and wrap them in
# a TransferData object, representing the transfer task
transfer_request = globus_sdk.TransferData(
    source_endpoint=SRC_COLLECTION,
    destination_endpoint=DST_COLLECTION,
)
transfer_request.add_item(SRC_PATH, DST_PATH)
# we'll define the timer as one which runs every hour for 3 days
# declare these data in the form of a "schedule" for the timer
#
# a wide variety of schedules are possible here; to set up a recurring timer:
# - you MUST declare an interval for the timer (`interval_seconds`)
# - you MAY declare an end condition (`end`)
schedule = globus_sdk.RecurringTimerSchedule(
    interval_seconds=3600,
    end={
        "condition": "time",
        "datetime": datetime.datetime.now() + datetime.timedelta(days=3),
    },
)
# create a TimersClient to interact with the service, and register any data_access
# scopes for the collections
timers_client = globus_sdk.TimersClient(app=USER_APP)
# Omit this step if each collection is either
# (1) a guest collection or (2) high assurance.
timers_client.add_app_transfer_data_access_scope((SRC_COLLECTION, DST_COLLECTION))
# submit the creation request to the service, printing out the ID of your new timer
# after it's created -- you can find it in https://app.globus.org/activity/timers
timer = timers_client.create_timer(
    globus_sdk.TransferTimer(
        name=(
            "create-timer-example "
            f"[created at {datetime.datetime.now().isoformat()}]"
        ),
        body=transfer_request,
        schedule=schedule,
    )
)
print("Finished submitting timer.")
print(f"timer_id: {timer['timer']['job_id']}")
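The schedule in the script above computes its end time from a naive local `datetime`. As a minimal sketch, assuming you would rather state the end time explicitly in UTC, the same `end` mapping can be built with the standard library alone. The helper names `build_time_end_condition` and `intervals_before_end` are hypothetical and not part of the SDK; only the `condition` and `datetime` keys come from the example above.

```python
import datetime
from typing import Optional


def build_time_end_condition(
    days: int, now: Optional[datetime.datetime] = None
) -> dict:
    """
    Hypothetical helper: build an `end` mapping shaped like the one in the
    script above, using a timezone-aware datetime.
    """
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    if now.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return {"condition": "time", "datetime": now + datetime.timedelta(days=days)}


def intervals_before_end(interval_seconds: int, days: int) -> int:
    """Count how many whole intervals fit into the given number of days."""
    return (days * 24 * 60 * 60) // interval_seconds


end = build_time_end_condition(days=3)
print(end["condition"], end["datetime"].isoformat())
# an hourly interval fits 72 times into 3 days
print(intervals_before_end(interval_seconds=3600, days=3))
```

The resulting mapping could be passed as the `end` argument in place of the inline dict. The interval count is plain arithmetic; it makes no claim about whether the service also fires at either boundary of the window.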
Discover Data Access Scopes and Create a Timer
Unlike direct Transfer Task submission, creating a Timer with unknown inputs won’t give you an error immediately if you need data_access scopes, because that information isn’t available until the Timer runs.
Therefore, if the input collections are variable, we need to enhance the previous example to automatically determine whether or not the data_access scope is needed.
We’ll do this with a new uses_data_access helper and a TransferClient:
transfer_client = globus_sdk.TransferClient(app=USER_APP)
def uses_data_access(collection_id: str) -> bool:
    """
    Use the `transfer_client` associated with the app to look up the given
    collection ID.
    Having looked up the record, return `True` if it uses a `data_access` scope
    and `False` otherwise.
    """
    doc = transfer_client.get_endpoint(collection_id)
    if doc["entity_type"] != "GCSv5_mapped_collection":
        return False
    if doc["high_assurance"]:
        return False
    return True
This will allow us to guard our use of the data_access scope as follows:
if uses_data_access(SRC_COLLECTION):
    timers_client.add_app_transfer_data_access_scope(SRC_COLLECTION)
if uses_data_access(DST_COLLECTION):
    timers_client.add_app_transfer_data_access_scope(DST_COLLECTION)
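The decision made by `uses_data_access` depends only on two fields of the collection record, so it can be checked offline. The following is a minimal sketch, assuming the record behaves like a plain mapping with `entity_type` and `high_assurance` keys, as in the helper above. The function name `record_uses_data_access` and the sample records are hypothetical, made up for illustration, and are not real API responses.

```python
from typing import Any, Mapping


def record_uses_data_access(doc: Mapping[str, Any]) -> bool:
    """
    Apply the same checks as `uses_data_access` to an already-fetched record.
    """
    # only GCSv5 mapped collections use the data_access scope
    if doc["entity_type"] != "GCSv5_mapped_collection":
        return False
    # high assurance collections do not use it
    if doc["high_assurance"]:
        return False
    return True


# hypothetical records for illustration only
samples = {
    "mapped": {
        "entity_type": "GCSv5_mapped_collection",
        "high_assurance": False,
    },
    "mapped, high assurance": {
        "entity_type": "GCSv5_mapped_collection",
        "high_assurance": True,
    },
    "guest": {
        "entity_type": "GCSv5_guest_collection",
        "high_assurance": False,
    },
}

for label, record in samples.items():
    print(f"{label}: {record_uses_data_access(record)}")
```

Separating the lookup from the decision in this way is a design choice for testability; the helper in the tutorial does both in one function.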
Note
Because the data_access requirement can’t be detected until after you have logged in to the app, it is possible for this to result in a “double login” scenario. First, you log in, providing consent for Timers and Transfer, but then a data_access scope is found to be needed. You then have to log in again to satisfy that requirement.
The UserApp will track the addition until you use app.logout, however, so this only happens the first time the script runs.
With these modifications in place, the resulting script looks like so:
import datetime
import globus_sdk
from globus_sdk.experimental.globus_app import UserApp
# Tutorial Client ID - <replace this with your own client>
NATIVE_CLIENT_ID = "61338d24-54d5-408f-a10d-66c06b59f6d2"
USER_APP = UserApp("manage-timers-example", client_id=NATIVE_CLIENT_ID)
# Globus Tutorial Collection 1
# https://app.globus.org/file-manager/collections/6c54cade-bde5-45c1-bdea-f4bd71dba2cc
SRC_COLLECTION = "6c54cade-bde5-45c1-bdea-f4bd71dba2cc"
SRC_PATH = "/share/godata/file1.txt"
# Globus Tutorial Collection 2
# https://app.globus.org/file-manager/collections/31ce9ba0-176d-45a5-add3-f37d233ba47d
DST_COLLECTION = "31ce9ba0-176d-45a5-add3-f37d233ba47d"
DST_PATH = "/~/example-timer-destination.txt"
# we need a TransferClient in this case
# see how it is used in the `uses_data_access` helper
transfer_client = globus_sdk.TransferClient(app=USER_APP)
def uses_data_access(collection_id: str) -> bool:
    """
    Use the `transfer_client` associated with the app to look up the given
    collection ID.
    Having looked up the record, return `True` if it uses a `data_access` scope
    and `False` otherwise.
    """
    doc = transfer_client.get_endpoint(collection_id)
    if doc["entity_type"] != "GCSv5_mapped_collection":
        return False
    if doc["high_assurance"]:
        return False
    return True
# as with an immediate data transfer, we take our input data and wrap them in
# a TransferData object, representing the transfer task
transfer_request = globus_sdk.TransferData(
    source_endpoint=SRC_COLLECTION,
    destination_endpoint=DST_COLLECTION,
)
transfer_request.add_item(SRC_PATH, DST_PATH)
# we'll define the timer as one which runs every hour for 3 days
# declare these data in the form of a "schedule" for the timer
#
# a wide variety of schedules are possible here; to set up a recurring timer:
# - you MUST declare an interval for the timer (`interval_seconds`)
# - you MAY declare an end condition (`end`)
schedule = globus_sdk.RecurringTimerSchedule(
    interval_seconds=3600,
    end={
        "condition": "time",
        "datetime": datetime.datetime.now() + datetime.timedelta(days=3),
    },
)
# create a TimersClient to interact with the service, and register any data_access
# scopes for the collections
timers_client = globus_sdk.TimersClient(app=USER_APP)
# Detect on each collection whether or not we need `data_access` scopes and apply
# if necessary
if uses_data_access(SRC_COLLECTION):
    timers_client.add_app_transfer_data_access_scope(SRC_COLLECTION)
if uses_data_access(DST_COLLECTION):
    timers_client.add_app_transfer_data_access_scope(DST_COLLECTION)
# submit the creation request to the service, printing out the ID of your new timer
# after it's created -- you can find it in https://app.globus.org/activity/timers
timer = timers_client.create_timer(
    globus_sdk.TransferTimer(
        name=(
            "create-timer-example "
            f"[created at {datetime.datetime.now().isoformat()}]"
        ),
        body=transfer_request,
        schedule=schedule,
    )
)
print("Finished submitting timer.")
print(f"timer_id: {timer['timer']['job_id']}")
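When the set of collections is not known ahead of time, the two `if` statements in the script can be generalized to a list of IDs. This is a sketch only, assuming the check is passed in as a callable so that the selection logic can run without a network connection. `select_data_access_collections` is a hypothetical name, and the IDs in the demonstration are placeholders rather than real collections.

```python
from typing import Callable, Iterable, List


def select_data_access_collections(
    collection_ids: Iterable[str], needs_scope: Callable[[str], bool]
) -> List[str]:
    """
    Return the unique IDs for which `needs_scope` is true, in first-seen order.
    Each distinct ID is checked at most once.
    """
    selected: List[str] = []
    seen = set()
    for collection_id in collection_ids:
        if collection_id in seen:
            continue
        seen.add(collection_id)
        if needs_scope(collection_id):
            selected.append(collection_id)
    return selected


# placeholder IDs and a stand-in check; in the script above the check would be
# the `uses_data_access` helper
demo_ids = ["collection-a", "collection-b", "collection-a"]
demo_result = select_data_access_collections(
    demo_ids, lambda collection_id: collection_id.endswith("-a")
)
print(demo_result)
```

In the script, a non-empty result could be passed to `timers_client.add_app_transfer_data_access_scope`, which the first example already calls with a tuple of collection IDs. Skipping duplicates avoids a second lookup when the source and destination are the same collection.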