Paging and Paginators¶
Globus SDK Client objects have paginated methods which return paginators.
A paginated API is one which returns data in multiple API calls. This is used in cases where the full set of results is too large to return all at once, or where getting all results is slow and a few results are wanted faster.
A good example of paginated data would be search results: the first “page” of data may be the first 10 results, and the next “page” consists of the next 10 results.
The number of results per call is the page size. Each page is an API response containing up to one page size of results; the final page may contain fewer.
Paging in the Globus SDK can be done by iterating over pages (responses) or by iterating over items (individual results).
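The difference between iterating over pages and iterating over items can be sketched without the SDK at all. The following is a minimal, self-contained illustration; RESULTS, PAGE_SIZE, fetch_page, iter_pages, and iter_items are hypothetical names invented for this sketch and are not part of the Globus SDK.

```python
from typing import Iterator, List

# Hypothetical stand-in for a remote data set; not part of the Globus SDK.
RESULTS = [f"result-{i}" for i in range(25)]
PAGE_SIZE = 10


def fetch_page(offset: int, limit: int = PAGE_SIZE) -> List[str]:
    """Simulate one API call returning at most `limit` results."""
    return RESULTS[offset : offset + limit]


def iter_pages() -> Iterator[List[str]]:
    """Yield one page per simulated API call until the data runs out."""
    offset = 0
    while True:
        page = fetch_page(offset)
        if not page:
            return
        yield page
        offset += len(page)


def iter_items() -> Iterator[str]:
    """Flatten the pages into a stream of individual results."""
    for page in iter_pages():
        yield from page


page_sizes = [len(page) for page in iter_pages()]
all_items = list(iter_items())
```

With 25 results and a page size of 10, this sketch produces three pages, the last of which is shorter than the page size.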
Paginators¶
A Paginator object is an iterable provided by the Globus SDK. Paginators support iteration over pages with the method pages() and iteration over items with the method items().
Paginators have fixed parameters which are set when the paginator is created. Once a method returns a paginator, you don’t need to pass it any additional data – pages() or items() will operate based on the original parameters to the paginator.
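To make the idea of fixed parameters concrete, here is a rough sketch of an object that captures its call arguments once and then pages without further input. It is an illustration only and does not reproduce the SDK's Paginator class; fake_search, SketchPaginator, and the "DATA" and "has_next_page" keys used here are assumptions made for the example.

```python
from typing import Any, Callable, Dict, Iterator


def fake_search(query: str, offset: int = 0, limit: int = 2) -> Dict[str, Any]:
    """Hypothetical client method: returns one page of matches as a dict."""
    names = ["tutorial-a", "tutorial-b", "tutorial-c", "other"]
    matches = [n for n in names if query in n]
    return {
        "DATA": matches[offset : offset + limit],
        "has_next_page": offset + limit < len(matches),
    }


class SketchPaginator:
    """Illustrative only; not the SDK's Paginator class."""

    def __init__(
        self, method: Callable[..., Dict[str, Any]], *args: Any, **kwargs: Any
    ) -> None:
        # The call parameters are captured once, at creation time.
        self._method = method
        self._args = args
        self._kwargs = kwargs

    def pages(self) -> Iterator[Dict[str, Any]]:
        offset = 0
        while True:
            page = self._method(*self._args, offset=offset, **self._kwargs)
            yield page
            if not page["has_next_page"]:
                return
            offset += len(page["DATA"])

    def items(self) -> Iterator[str]:
        for page in self.pages():
            yield from page["DATA"]

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        # Iterating the paginator directly is the same as iterating pages().
        return self.pages()


paginator = SketchPaginator(fake_search, "tutorial")
found = list(paginator.items())
page_count = len(list(paginator.pages()))
```

Neither pages() nor items() takes arguments here; both replay the query that was supplied when the sketch paginator was built.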
Making Paginated Calls¶
Globus SDK client objects define paginated variants of methods. The normal method is said to be “unpaginated”, and returns a single page of results. The paginated variant, prefixed with paginated., returns a paginator.
For example, globus_sdk.TransferClient has a paginated method, endpoint_search(). Once you have a client object, calls to the unpaginated method are done like so:
import globus_sdk

# for information on getting an authorizer, see the SDK Tutorial
tc = globus_sdk.TransferClient(authorizer=...)

# unpaginated calls can still return iterable results!
# endpoint_search() returns an iterable response
for endpoint_info in tc.endpoint_search("tutorial"):
    print("got endpoint_id:", endpoint_info["id"])
The paginated variant of this same method is accessed nearly identically. But instead of calling endpoint_search(...), we’ll invoke paginated.endpoint_search(...).
Here are three variants of code with the same basic effect:
# note the call to `items()` at the end of this line!
for endpoint_info in tc.paginated.endpoint_search("tutorial").items():
    print("got endpoint_id:", endpoint_info["id"])

# equivalently, call `pages()` and iterate over the items in each page
for page in tc.paginated.endpoint_search("tutorial").pages():
    for endpoint_info in page:
        print("got endpoint_id:", endpoint_info["id"])

# iterating on a paginator without calling `pages()` or `items()` is
# equivalent to iterating on `pages()`
for page in tc.paginated.endpoint_search("tutorial"):
    for endpoint_info in page:
        print("got endpoint_id:", endpoint_info["id"])
Do I need to use pages()? What is it for?¶
If your use-case is satisfied with items(), then stick with items()!
pages() iteration is important when there is useful data in the page other than the individual items.
For example, TransferClient.endpoint_search returns the total number of results for the search as a field on each page.
Most use-cases can be solved with items(), and pages() will be available to you if or when you need it.
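As a rough illustration of why page-level iteration matters, the sketch below reads a per-page field alongside the items. The pages are plain dictionaries produced by a hypothetical stub_pages function, and the "total" and "DATA" field names are assumptions made for this example, not a description of the actual Transfer response format.

```python
from typing import Any, Dict, Iterator, List


def stub_pages() -> Iterator[Dict[str, Any]]:
    """Hypothetical pages; the field names here are assumptions."""
    yield {"total": 3, "DATA": [{"id": "ep-1"}, {"id": "ep-2"}]}
    yield {"total": 3, "DATA": [{"id": "ep-3"}]}


seen: List[str] = []
reported_total = None
for page in stub_pages():
    # Page-level metadata is only visible when iterating over pages.
    reported_total = page["total"]
    for endpoint_info in page["DATA"]:
        seen.append(endpoint_info["id"])

summary = f"{len(seen)} of {reported_total} results"
```

Iterating over items alone would yield the same three entries but would never expose the page-level field.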
Typed Paginators with Paginator.wrap¶
This is an alternate syntax for getting a paginated call. It is more verbose, but preserves type annotation information correctly. It is therefore preferable for users who want to type-check their code with mypy.
Paginator.wrap converts any client method into a callable which returns a paginator. Its usage is very similar to the .paginated syntax.
import globus_sdk
from globus_sdk.paging import Paginator

tc = globus_sdk.TransferClient(...)

# convert `tc.endpoint_search` into a call returning a paginator
paginated_call = Paginator.wrap(tc.endpoint_search)

# now the result is a paginator and we can use `pages()` or `items()` as
# normal
for endpoint_info in paginated_call("tutorial").items():
    print("got endpoint_id:", endpoint_info["id"])
However, if using mypy to run reveal_type, the results of tc.paginated.task_successful_transfers and Paginator.wrap(tc.task_successful_transfers) are very different:
# def (task_id: Union[uuid.UUID, builtins.str], *, query_params: Union[builtins.dict[builtins.str, Any], None] =) -> globus_sdk.services.transfer.response.iterable.IterableTransferResponse
reveal_type(tc.task_successful_transfers)
# def [PageT <: globus_sdk.response.GlobusHTTPResponse] (*Any, **Any) -> globus_sdk.paging.base.Paginator[PageT`-1]
reveal_type(tc.paginated.task_successful_transfers)
# def (task_id: Union[uuid.UUID, builtins.str], *, query_params: Union[builtins.dict[builtins.str, Any], None] =) -> globus_sdk.paging.base.Paginator[globus_sdk.services.transfer.response.iterable.IterableTransferResponse*]
reveal_type(Paginator.wrap(tc.task_successful_transfers))
Paginator Types¶
globus_sdk.paging defines several paginator classes and methods. For the most part, you do not need to interact with these classes or methods except through pages() or items().
The paging subpackage also defines the PaginatorTable, which is used to define the paginated attribute on client objects.
- globus_sdk.paging.has_paginator(paginator_class, items_key=None, **paginator_params)[source]¶
Mark a callable – typically a client method – as having pagination parameters. Usage:
>>> class MyClient(BaseClient):
>>>     @has_paginator(MarkerPaginator)
>>>     def foo(...): ...
This will mark MyClient.foo as paginated with marker style pagination. It will then be possible to get a paginator for MyClient.foo via
>>> c = MyClient(...)
>>> paginator = c.paginated.foo()
- class globus_sdk.paging.Paginator(method, *, items_key=None, client_args, client_kwargs, **kwargs)[source]¶
Bases: Iterable[PageT]
Base class for all paginators. This guarantees that they have generator methods named pages and items. Iterating on a Paginator is equivalent to iterating on its pages.
- Parameters:
  - method (t.Callable[..., t.Any]) – A bound method of an SDK client, used to generate a paginated variant
  - items_key (str | None) – The key to use within pages of results to get an array of items
  - client_args (tuple[t.Any, ...]) – Arguments to the underlying method which are passed when the paginator is instantiated. i.e. given client.paginated.foo(a, b, c=1), this will be (a, b). The paginator will pass these arguments to each call of the bound method as it pages.
  - client_kwargs (dict[str, t.Any]) – Keyword arguments to the underlying method, like client_args above. client.paginated.foo(a, b, c=1) will pass this as {"c": 1}. As with client_args, it’s passed to each paginated call.
- items()[source]¶
items() of a paginator is a generator which yields each item in each page of results.
items() may raise a ValueError if the paginator was constructed without identifying a key for use within each page of results. This may be the case for paginators whose pages are not primarily an array of data.
- abstract pages()[source]¶
pages() yields GlobusHTTPResponse objects, each one representing a page of results.
- Return type: Iterator[PageT]
- classmethod wrap(method)[source]¶
This is an alternate method for getting a paginator for a paginated method which correctly preserves the type signature of the paginated method.
It should be used on instances of clients and only passed bound methods of those clients. For example, given usage
>>> tc = TransferClient()
>>> paginator = tc.paginated.endpoint_search(...)
a well-typed paginator can be acquired with
>>> tc = TransferClient()
>>> paginated_call = Paginator.wrap(tc.endpoint_search)
>>> paginator = paginated_call(...)
Although the syntax is slightly more verbose, this allows mypy and other type checkers to more accurately infer the type of the paginator.
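The relationship between items_key and the ValueError described for items() can be sketched as a small generator. This is an illustration of the described behavior under simple assumptions (pages as plain dictionaries, a hypothetical iter_items helper), not the SDK's code.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def iter_items(
    pages: Iterable[Dict[str, Any]], items_key: Optional[str]
) -> Iterator[Any]:
    """Yield each item from each page, using `items_key` to find the array."""
    if items_key is None:
        # Without a key there is no way to know where the items live.
        raise ValueError("cannot iterate items without an items_key")
    for page in pages:
        yield from page[items_key]


example_pages = [{"DATA": [1, 2]}, {"DATA": [3]}]
flat = list(iter_items(example_pages, "DATA"))

try:
    list(iter_items(example_pages, None))
    raised = False
except ValueError:
    raised = True
```

Because iter_items is a generator, the error in this sketch surfaces when iteration begins rather than when the function is called.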
- class globus_sdk.paging.PaginatorTable(client)[source]¶
Bases: object
A PaginatorTable maps multiple methods of an SDK client to paginated variants. Given a method, client.foo annotated with the has_paginator decorator, the table will gain a function attribute foo (name matching is automatic) which returns a Paginator.
Clients automatically build and attach paginator tables under the paginated attribute. That is, if client has two methods foo and bar which are marked as paginated, that will let us call
>>> client.paginated.foo()
>>> client.paginated.bar()
where client.paginated is a PaginatorTable.
Paginators are iterables of response pages, so ultimate usage is like so:
>>> paginator = client.paginated.foo()  # returns a paginator
>>> for page in paginator:  # a paginator is an iterable of pages (response objects)
>>>     print(json.dumps(page.data))  # you can handle each response object in turn
A PaginatorTable is built automatically as part of client instantiation. Creation of PaginatorTable objects is considered a private API.
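The pattern of a marking decorator plus a table that discovers marked methods can be sketched in a few lines. Everything below (mark_paginated, SketchTable, SketchClient, and the _is_paginated attribute) is hypothetical and chosen for illustration; the real has_paginator and PaginatorTable may work differently internally.

```python
from typing import Any, Callable, Iterator, List


def mark_paginated(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical stand-in for a decorator like has_paginator."""
    func._is_paginated = True  # type: ignore[attr-defined]
    return func


class SketchTable:
    """Illustrative only; exposes marked methods of `client` as attributes."""

    def __init__(self, client: Any) -> None:
        for name in dir(type(client)):
            attr = getattr(type(client), name, None)
            if getattr(attr, "_is_paginated", False):
                # Name matching is automatic: the table attribute reuses the method name.
                setattr(self, name, self._make_paginated(getattr(client, name)))

    @staticmethod
    def _make_paginated(
        bound_method: Callable[..., List[Any]]
    ) -> Callable[..., Iterator[List[Any]]]:
        def paginated_call(*args: Any, **kwargs: Any) -> Iterator[List[Any]]:
            # A real paginator would keep calling until the data is exhausted;
            # this sketch yields a single page.
            yield bound_method(*args, **kwargs)

        return paginated_call


class SketchClient:
    def __init__(self) -> None:
        # The table is built as part of client instantiation.
        self.paginated = SketchTable(self)

    @mark_paginated
    def foo(self) -> List[int]:
        return [1, 2, 3]

    def bar(self) -> List[int]:
        return [4]


client = SketchClient()
pages = list(client.paginated.foo())
```

In this sketch only foo appears on client.paginated, since bar was never marked.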
- class globus_sdk.paging.MarkerPaginator(method, *, items_key=None, marker_key='marker', client_args, client_kwargs)[source]¶
Bases: Paginator[PageT]
A paginator which reads has_next_page and marker from each payload and sets the marker query param to fetch the next page.
This is the default method for GCS pagination, so it’s very simple.
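Marker-style paging, as described for this class, amounts to a loop that feeds each response's marker back into the next request until has_next_page is false. The sketch below simulates that loop against an in-memory table; FAKE_PAGES, fake_list, marker_pages, and the "data" key are hypothetical and stand in for a real service.

```python
from typing import Any, Dict, Iterator, Optional

# Hypothetical server-side data keyed by marker; None selects the first page.
FAKE_PAGES: Dict[Optional[str], Dict[str, Any]] = {
    None: {"data": ["a", "b"], "has_next_page": True, "marker": "m1"},
    "m1": {"data": ["c", "d"], "has_next_page": True, "marker": "m2"},
    "m2": {"data": ["e"], "has_next_page": False, "marker": None},
}


def fake_list(marker: Optional[str] = None) -> Dict[str, Any]:
    """Simulate one API call; `marker` plays the role of the query param."""
    return FAKE_PAGES[marker]


def marker_pages() -> Iterator[Dict[str, Any]]:
    """Follow markers from page to page until has_next_page is false."""
    marker: Optional[str] = None
    while True:
        page = fake_list(marker=marker)
        yield page
        if not page["has_next_page"]:
            return
        marker = page["marker"]


letters = [item for page in marker_pages() for item in page["data"]]
```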
- class globus_sdk.paging.NextTokenPaginator(method, *, items_key=None, client_args, client_kwargs)[source]¶
Bases: Paginator[PageT]
A paginator which reads next_token from each payload and sets the next_token query param to fetch the next page.
Very similar to GCS’s marker paginator, but only used for Transfer’s get_shared_endpoint_list.
- class globus_sdk.paging.LastKeyPaginator(method, *, items_key=None, client_args, client_kwargs)[source]¶
Bases: Paginator[PageT]
- class globus_sdk.paging.HasNextPaginator(method, *, items_key=None, get_page_size, max_total_results, page_size, client_args, client_kwargs)[source]¶
Bases: _LimitOffsetBasedPaginator[PageT]