Request mapping

When you enable automatic request parameter mapping, whether through transparent mode or for a specific request, some Zyte API parameters are chosen automatically for you. You can then change them further if you wish.
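As a rough sketch of the two ways to enable mapping, the snippet below uses the ZYTE_API_TRANSPARENT_MODE setting and the zyte_api_automap request meta key. Neither name appears in this section, so treat both as assumptions to check against the settings and meta key reference. The meta dictionaries are shown as plain Python dicts so that the snippet runs without Scrapy installed.

```python
# settings.py: enable automatic mapping for every request (transparent mode).
# Assumption: ZYTE_API_TRANSPARENT_MODE is the setting that controls this.
ZYTE_API_TRANSPARENT_MODE = True

# Alternatively, enable mapping for a single request through its meta dict.
# Assumption: zyte_api_automap is the meta key that controls this.
automap_meta = {"zyte_api_automap": True}

# Passing a dict instead of True is assumed to adjust the automatically
# chosen parameters, here by requesting browser HTML.
automap_meta_with_override = {"zyte_api_automap": {"browserHtml": True}}
```

In a spider, such a dictionary would be passed as the meta argument of a Request.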

Automatic mapping

For example, the following Scrapy request:

from scrapy import Request

Request(
    method="POST",
    url="https://httpbin.org/anything",
    headers={"Content-Type": "application/json"},
    body=b'{"foo": "bar"}',
    cookies={"a": "b"},
)

results in a request to the Zyte API data extraction endpoint with the following parameters:

{
    "customHttpRequestHeaders": [
        {
            "name": "Content-Type",
            "value": "application/json"
        }
    ],
    "experimental": {
        "requestCookies": [
            {
                "name": "a",
                "value": "b",
                "domain": ""
            }
        ],
        "responseCookies": true
    },
    "httpResponseBody": true,
    "httpResponseHeaders": true,
    "httpRequestBody": "eyJmb28iOiAiYmFyIn0=",
    "httpRequestMethod": "POST",
    "url": "https://httpbin.org/anything"
}
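To see where two of these values come from, the following minimal sketch reproduces customHttpRequestHeaders and httpRequestBody from the request above using only the standard library. The helper functions are hypothetical and written for illustration; they are not the code that scrapy-zyte-api runs internally.

```python
import base64

# Values taken from the Scrapy request shown above.
headers = {"Content-Type": "application/json"}
body = b'{"foo": "bar"}'


def to_custom_http_request_headers(headers):
    """Illustrative helper: turn a header dict into a list of name/value objects."""
    return [{"name": name, "value": value} for name, value in headers.items()]


def to_http_request_body(body):
    """Illustrative helper: Base64-encode the raw body bytes as ASCII text."""
    return base64.b64encode(body).decode("ascii")


params = {
    "customHttpRequestHeaders": to_custom_http_request_headers(headers),
    "httpRequestBody": to_http_request_body(body),
}
print(params["httpRequestBody"])  # eyJmb28iOiAiYmFyIn0=
```

Decoding httpRequestBody with base64.b64decode returns the original body bytes, which is a quick way to check what was sent.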

Header mapping

When headers are mapped, some of them are dropped based on the values of the ZYTE_API_SKIP_HEADERS and ZYTE_API_BROWSER_HEADERS settings. With their default values, these settings drop the headers that Zyte API does not support.

Even if not defined in ZYTE_API_SKIP_HEADERS, additional headers may be dropped from HTTP requests (customHttpRequestHeaders):

To force the mapping of these headers, define the corresponding setting (if any), set them in the DEFAULT_REQUEST_HEADERS setting, or set them in Request.headers from a spider callback. They will be mapped even if defined with their default value.

Headers will also be mapped if set to a non-default value elsewhere, e.g. in a custom downloader middleware, as long as it is done before the scrapy-zyte-api downloader middleware, which is responsible for the mapping, processes the request. Here “before” means a lower value than 633 in the DOWNLOADER_MIDDLEWARES setting.
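The ordering requirement can be sketched as follows. The middleware class, its module path (myproject.middlewares) and the priority 600 are hypothetical, and Accept-Language is only an illustrative header name. The import path used for the scrapy-zyte-api middleware is an assumption that should match whatever your project already declares.

```python
class CustomHeaderMiddleware:
    """Hypothetical downloader middleware that sets a non-default header value."""

    def process_request(self, request, spider):
        # Set the header before the scrapy-zyte-api middleware sees the request.
        request.headers["Accept-Language"] = "fr"
        return None  # continue processing the request normally


# settings.py: a value lower than 633 makes the custom middleware process
# requests before the scrapy-zyte-api downloader middleware.
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.CustomHeaderMiddleware": 600,  # hypothetical path
    "scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware": 633,  # assumed path
}
```

With a value higher than 633, the header would be set only after the mapping has already happened, so it would not reach Zyte API.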

Similarly, you can add any of those headers to the ZYTE_API_SKIP_HEADERS setting to prevent their mapping.

Also note that Scrapy sets the Referer header by default in all requests that come from spider callbacks. To unset the header on a given request, set the header value to None on that request. To unset it from all requests, set the REFERER_ENABLED setting to False. To unset it only from Zyte API requests, add it to the ZYTE_API_SKIP_HEADERS setting and remove it from the ZYTE_API_BROWSER_HEADERS setting.
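The project-wide options above might look like the following settings fragment. It assumes that ZYTE_API_SKIP_HEADERS takes a list of header names whose default includes Cookie, and that ZYTE_API_BROWSER_HEADERS takes a dict whose default only contains an entry for Referer. Neither format is stated in this section, so verify both against the settings reference before copying this.

```python
# settings.py (sketch; pick one of the two options)

# Option 1: stop Scrapy from setting Referer on any request.
REFERER_ENABLED = False

# Option 2: keep Referer in general, but never map it for Zyte API requests.
# Assumption: the setting is a list of header names and "Cookie" is part of
# its default value, so it is kept here.
ZYTE_API_SKIP_HEADERS = ["Cookie", "Referer"]
# Assumption: the setting is a dict and Referer is its only default entry,
# so removing Referer leaves it empty.
ZYTE_API_BROWSER_HEADERS = {}
```

For a single request, the equivalent is passing headers={"Referer": None} when creating the Request.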

Unsupported scenarios

To maximize support for potential future changes in Zyte API, automatic request parameter mapping allows some parameter values and parameter combinations that Zyte API does not currently support, and may never support: