Layers#

Caution

🚧 IN CONSTRUCTION 🚧

This doc is incomplete, check back soon!

Transformations in subgrounds operate via layers, where one is applied after another. There are two types of transforms:

RequestTransform: This transform operates on a DataRequest and a DataResponse, which together represent the total query made by a single call to _execute.
DocumentTransform: This transform operates on a Document and a DocumentResponse, which (ignoring pagination) represent one part of the total query. Each Document corresponds to a single subgraph server.
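To make the relationship between the two kinds of transform concrete, here is a minimal sketch of how a document-level transform can be lifted to operate on a whole request by pinning it to one subgraph URL. Every name in it (`Doc`, `Request`, `UppercaseQuery`, `PerUrlRequestTransform`) and the example URLs are hypothetical stand-ins for illustration, not the subgrounds classes themselves.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a Document: one query bound to one subgraph URL."""
    url: str
    query: str


@dataclass(frozen=True)
class Request:
    """Hypothetical stand-in for a DataRequest: every document in one execution."""
    documents: tuple[Doc, ...]


class UppercaseQuery:
    """Hypothetical document-level transform: it only ever sees one document."""

    def transform_document(self, doc: Doc) -> Doc:
        return replace(doc, query=doc.query.upper())


@dataclass
class PerUrlRequestTransform:
    """Hypothetical adapter: applies a document-level transform to a whole
    request, touching only the documents that target `url`."""
    inner: UppercaseQuery
    url: str

    def transform_request(self, req: Request) -> Request:
        return Request(
            tuple(
                self.inner.transform_document(doc) if doc.url == self.url else doc
                for doc in req.documents
            )
        )


request = Request(
    (
        Doc("https://example.com/subgraph-a", "{ pairs { id } }"),
        Doc("https://example.com/subgraph-b", "{ swaps { id } }"),
    )
)
adapter = PerUrlRequestTransform(UppercaseQuery(), "https://example.com/subgraph-a")
transformed = adapter.transform_request(request)
```

Only the document aimed at `subgraph-a` is rewritten; the other passes through untouched, which is the sense in which a document transform covers "a part of the total query".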

Each layer contains two important functions:

transform_request() / transform_document(): Transforms the request on its way towards pagination.
transform_response() (for both transform types): Transforms the response on its way back from pagination.

When fully assembled, the layers act like an onion.
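The onion shape can be illustrated with a small sketch: requests pass through the layers from the outside in, and responses pass back from the inside out. The `Layer` class and `run_layers` helper are hypothetical and only record the order of calls; they are not part of subgrounds.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Hypothetical layer that records when each of its hooks is called."""
    name: str

    def transform_request(self, request: list[str]) -> list[str]:
        return request + [f"{self.name}:request"]

    def transform_response(self, response: list[str]) -> list[str]:
        return response + [f"{self.name}:response"]


def run_layers(layers: list[Layer], request: list[str]) -> list[str]:
    # Requests travel inwards: first layer first.
    for layer in layers:
        request = layer.transform_request(request)

    # Stand-in for pagination and execution at the centre of the onion.
    response = request + ["execute"]

    # Responses travel outwards: last layer first.
    for layer in reversed(layers):
        response = layer.transform_response(response)
    return response


trace = run_layers([Layer("outer"), Layer("inner")], [])
# trace == ["outer:request", "inner:request", "execute",
#           "inner:response", "outer:response"]
```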

A top-level showcase on how a request turns into a response#

The Pipeline#

Subgrounds supports any number of transformations for every query produced, which ends up being quite complex to fulfill! Generally, these are the steps we take when executing the transforms:

1. Convert all DocumentTransforms to DocumentRequestTransforms.
   - This greatly simplifies the transform pipeline.
2. Convert all transformation layers into generator "sandwiches".
   - Each "sandwich" is a generator that holds on to the context of its DataRequest, making it easier to transform both the DataRequest and the DataResponse.
   - We store these generators in a stack, which lets us iterate through them easily.
3. Waterfall the first DataRequest through all of the generators in the stack.
   - The request is transformed by each generator in the stack in turn.
4. Forward the final transformed DataRequest to pagination (and eventually execution).
5. Receive the raw DataResponse from pagination / execution.
6. In reverse, waterfall this DataResponse back up the generator stack.
7. Return this DataResponse as the result of the transformation pipeline.
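The "generator sandwich" idea from the steps above can be sketched as follows. This is a minimal illustration under assumptions: `sandwich`, `on_request` and `on_response` are hypothetical names, and the real helper in subgrounds may differ in detail. The point is that a single generator receives the request, hands back the transformed request, and later receives a response while still remembering the request it started with.

```python
from typing import Any, Callable, Generator


def sandwich(
    on_request: Callable[[Any], Any],
    on_response: Callable[[Any, Any], Any],
) -> Generator[Any, Any, None]:
    """Hypothetical sketch of one layer wrapped as a generator."""
    # Priming with next() stops here; the request then arrives via send().
    request = yield None
    # send(request) returns the transformed request to the caller.
    yield on_request(request)

    # Any number of responses (for example, one per page) can follow.
    while True:
        # next() stops here; the response then arrives via send().
        response = yield None
        # The original request is still in scope, so the response hook
        # can use it as context.
        yield on_response(request, response)


layer = sandwich(str.upper, lambda req, resp: f"{req} -> {resp}")

primed = next(layer)                 # advance to the first `yield None`
new_request = layer.send("query")    # transformed request: "QUERY"
waiting = next(layer)                # advance to the `yield None` in the loop
new_response = layer.send("data")    # transformed response: "query -> data"
```

Keeping the request inside the generator's local scope is what removes the need to pass it alongside every response.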

See also the implementation of this in apply_transforms()

This function is called within _execute() to assemble and execute the transformation process. We use the generator design to help implement the "sans-io" approach, which separates this transformation pipeline from the actual execution of the request (from a call-stack point of view).

def apply_transforms(
    request_transforms: list[RequestTransform],
    document_transforms: dict[str, list[DocumentTransform]],
    req: DataRequest,
) -> Generator[None | DataRequest | DataResponse, DataResponse, None]:
    """Apply all `RequestTransforms` and `DocumentTransforms` to a `DataRequest` and a
     corresponding `DataResponse`.

    This function abstractly applies a series of transforms onto a request and response.
    The execution of the request is handled outside this function (ala. sans-io), which
     allows this function to only work with the abstracted components.

    Note: For simplification, all `DocumentTransforms` stored at the subgraph are
     converted to specific `RequestTransforms` when applied.
    """

    unique_doc_urls = {doc.url for doc in req.documents}

    # Iterating through all unique document urls, get each document's transforms from
    #  the subgraph converting all `DocumentTransforms` into `DocumentRequestTransforms`
    # We do this to make it easier for us to only work with one type of transform.
    converted_transforms = (
        DocumentRequestTransform(transform, url)
        for url in unique_doc_urls
        for transform in document_transforms[url]
    )

    # Construct our list of generators from our two sets of transforms.
    # The request transforms are before the (converted) document transforms.
    stack: list[TransformGen] = list(
        request_transforms | chain_with(converted_transforms) | map(handle_transform)
    )

    # Go through every transform in the stack and transform the request
    for transform in stack:
        next(transform)  # ditch `None`
        req = cast(DataRequest, transform.send(req))

    # yield the final transformed request (valid "graphql")
    yield req

    # We enter an infinite loop here to allow multiple documents to be transformed
    #  back up the stack since `execute_iter` produces them page-by-page to be streamed.
    while True:
        # retrieve the response (from the `executor` governing this generator)
        resp = yield

        # Finally, using the response, iterate through the transforms in reverse order,
        #  transforming the raw response up back through the transforms
        for transform in reversed(stack):
            next(transform)  # ditch `None`
            resp = cast(DataResponse, transform.send(resp))

        # Take the final transformed response and send it back to the `executor`
        yield resp
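Because apply_transforms() performs no I/O itself, something outside it has to drive the generator. The sketch below shows one way a caller could do that, assuming the same yield protocol as the function above: the first value yielded is the transformed request, then each response is passed in with send() after advancing to the bare `yield`. `toy_pipeline`, `fake_execute` and `drive` are hypothetical stand-ins so the example runs on its own; they are not the subgrounds executor.

```python
from typing import Callable, Generator, Iterator, Optional


def toy_pipeline(request: str) -> Generator[Optional[str], str, None]:
    """Hypothetical pipeline that follows the same yield protocol."""
    # Hand the transformed request to whoever is driving the generator.
    yield request.strip()
    while True:
        # Wait for a raw response from the driver.
        response = yield
        # Hand the transformed response back to the driver.
        yield response.upper()


def fake_execute(request: str) -> Iterator[str]:
    """Hypothetical executor that streams two pages of results."""
    yield f"{request}: page 1"
    yield f"{request}: page 2"


def drive(
    pipeline: Generator[Optional[str], str, None],
    execute: Callable[[str], Iterator[str]],
) -> list[str]:
    request = next(pipeline)  # runs the request side, returns the final request
    results = []
    for raw in execute(request):
        next(pipeline)                      # advance to the bare `yield`
        results.append(pipeline.send(raw))  # run the response side
    return results


pages = drive(toy_pipeline("  query "), fake_execute)
# pages == ["QUERY: PAGE 1", "QUERY: PAGE 2"]
```

All network access lives in the `execute` callable, so the pipeline itself can be exercised without a subgraph server.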

Visual Guide#

This is the general flow of how a request gets transformed through the pipeline.