Settings

Settings for scrapy-zyte-api.

ZYTE_API_AUTOMAP_PARAMS

Default: {}

dict of parameters to be combined with automatic request parameters.

These parameters are merged with zyte_api_automap parameters. zyte_api_automap parameters take precedence.
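The precedence rule can be pictured as a plain dictionary merge. The following is a minimal sketch, not the plugin's actual implementation: the parameter values and the `merge_params` helper are hypothetical and only illustrate which side wins on a key collision.

```python
# settings.py value (hypothetical example parameters)
ZYTE_API_AUTOMAP_PARAMS = {"geolocation": "US"}

# Request-level zyte_api_automap parameters (hypothetical example values)
request_params = {"geolocation": "DE", "browserHtml": True}


def merge_params(setting_params, request_params):
    """Model of the merge: request-level keys override setting-level keys."""
    return {**setting_params, **request_params}


merged = merge_params(ZYTE_API_AUTOMAP_PARAMS, request_params)
print(merged)
```

In this model the request-level `geolocation` replaces the one from the setting, while keys defined on only one side are kept as they are.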

This setting has no effect on requests with manual request parameters.

When using transparent mode, be careful about which parameters you define in this setting. In transparent mode, all Scrapy requests go through Zyte API, even requests that Scrapy sends automatically, such as those for robots.txt files when ROBOTSTXT_OBEY is True, or those for sitemaps when using SitemapSpider. Certain parameters, like browserHtml or screenshot, are not meant to be used for every single request.

If zyte_api_default_params in Request.meta is set to False, this setting is ignored for that request.

See Default parameters.

ZYTE_API_BROWSER_HEADERS

Default: {"Referer": "referer"}

Determines headers that can be mapped as requestHeaders.

It is a dict where each key is a header name and each value is the key that represents that header in requestHeaders.
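To make the shape of this mapping concrete, here is a rough sketch of how such a dict could translate request headers into requestHeaders keys. The `to_request_headers` helper is hypothetical, and the exact-match header lookup is an assumption made for brevity; the plugin's own matching rules may differ.

```python
ZYTE_API_BROWSER_HEADERS = {"Referer": "referer"}


def to_request_headers(headers, browser_headers):
    """Keep only headers listed in the mapping, renamed to their requestHeaders key."""
    return {
        browser_headers[name]: value
        for name, value in headers.items()
        if name in browser_headers
    }


request_headers = to_request_headers(
    {"Referer": "https://example.com/", "Accept": "text/html"},
    ZYTE_API_BROWSER_HEADERS,
)
print(request_headers)
```

Headers that are not listed in the mapping, such as `Accept` above, are left out of the result in this model.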

ZYTE_API_DEFAULT_PARAMS

Default: {}

dict of parameters to be combined with manual request parameters.

You may set zyte_api to an empty dict to only use the parameters defined here for that request.

These parameters are merged with zyte_api parameters. zyte_api parameters take precedence.

This setting has no effect on requests with automatic request parameters.

If zyte_api_default_params in Request.meta is set to False, this setting is ignored for that request.
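The three cases described for this setting (empty zyte_api, overriding zyte_api, and opting out through zyte_api_default_params) can be modeled with plain dicts. This is a sketch under stated assumptions, not the plugin's code: `effective_params` is a hypothetical helper, and the parameter values are illustrative.

```python
ZYTE_API_DEFAULT_PARAMS = {"browserHtml": True, "geolocation": "US"}


def effective_params(default_params, meta):
    """Model how Request.meta interacts with ZYTE_API_DEFAULT_PARAMS."""
    manual = meta.get("zyte_api", {})
    # zyte_api_default_params=False opts this request out of the defaults.
    if meta.get("zyte_api_default_params", True) is False:
        return dict(manual)
    # Otherwise the defaults apply, and manual parameters win on conflicts.
    return {**default_params, **manual}


only_defaults = effective_params(ZYTE_API_DEFAULT_PARAMS, {"zyte_api": {}})
overridden = effective_params(
    ZYTE_API_DEFAULT_PARAMS, {"zyte_api": {"geolocation": "DE"}}
)
opted_out = effective_params(
    ZYTE_API_DEFAULT_PARAMS,
    {"zyte_api": {"httpResponseBody": True}, "zyte_api_default_params": False},
)
```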

See Default parameters.

ZYTE_API_ENABLED

Default: True

Can be set to False to disable scrapy-zyte-api.

ZYTE_API_EXPERIMENTAL_COOKIES_ENABLED

Default: False

See Automatic mapping.

ZYTE_API_FALLBACK_REQUEST_FINGERPRINTER_CLASS

Default: scrapy_poet.ScrapyPoetRequestFingerprinter if scrapy-poet is installed, else scrapy.utils.request.RequestFingerprinter

Request fingerprinter to use for requests that do not go through Zyte API. See Request fingerprinting.

ZYTE_API_KEY

Default: None

Your Zyte API key.

You can alternatively define an environment variable with the same name.

Tip

On Scrapy Cloud, this setting is defined automatically.

ZYTE_API_LOG_REQUESTS

Default: False

Set this to True and LOG_LEVEL to "DEBUG" to enable the logging of debug messages that indicate the JSON object sent on every Zyte API request.

For example:

Sending Zyte API extract request: {"url": "https://example.com", "httpResponseBody": true}

See also: ZYTE_API_LOG_REQUESTS_TRUNCATE.

ZYTE_API_LOG_REQUESTS_TRUNCATE

Default: 64

Determines the maximum length of any string value in the JSON object logged when ZYTE_API_LOG_REQUESTS is enabled, excluding object keys.

To disable truncation, set this to 0.
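As a rough illustration of the rule (string values are shortened, keys are not, and 0 disables truncation), the sketch below walks a JSON-like object. The `truncate_strings` helper is hypothetical, the limit of 8 is chosen only to keep the example short, and how the plugin marks a truncated string in the actual log output is not modeled here.

```python
# settings.py (illustrative values)
ZYTE_API_LOG_REQUESTS = True
LOG_LEVEL = "DEBUG"
ZYTE_API_LOG_REQUESTS_TRUNCATE = 8


def truncate_strings(obj, limit):
    """Shorten string values in a JSON-like object; keys are left untouched."""
    if limit <= 0:  # 0 disables truncation
        return obj
    if isinstance(obj, str):
        return obj[:limit]
    if isinstance(obj, dict):
        return {key: truncate_strings(value, limit) for key, value in obj.items()}
    if isinstance(obj, list):
        return [truncate_strings(value, limit) for value in obj]
    return obj


params = {"url": "https://example.com", "httpResponseBody": True}
logged = truncate_strings(params, ZYTE_API_LOG_REQUESTS_TRUNCATE)
```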

ZYTE_API_MAX_COOKIES

Default: 100

If the cookies to be set during request mapping exceed this limit, a warning is logged, and only as many cookies as the limit allows are set for the target request.

To silence this warning, set experimental.requestCookies manually, e.g. to an empty dict.

Alternatively, if experimental.requestCookies starts supporting more than 100 cookies, update this setting accordingly.
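The capping behavior can be sketched as follows. This is only a model: `cap_cookies` is a hypothetical helper, the warning text is invented for the example, and keeping the first cookies in the list is an assumption, since the text above does not say which cookies are kept.

```python
import logging

logger = logging.getLogger(__name__)

ZYTE_API_MAX_COOKIES = 100


def cap_cookies(cookies, max_cookies):
    """Warn and drop the excess when there are more cookies than the limit."""
    if len(cookies) > max_cookies:
        logger.warning(
            "Request has %d cookies, only %d will be set",
            len(cookies),
            max_cookies,
        )
        # Assumption: the first max_cookies entries are the ones kept.
        return cookies[:max_cookies]
    return cookies


cookies = [
    {"name": f"c{i}", "value": "x", "domain": "example.com"} for i in range(120)
]
kept = cap_cookies(cookies, ZYTE_API_MAX_COOKIES)
```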

ZYTE_API_MAX_REQUESTS

Default: None

When set to an integer value greater than 0, the spider closes once the number of Zyte API requests reaches that value.

Note that requests with error responses that cannot be retried or exceed their retry limit also count here.

ZYTE_API_PROVIDER_PARAMS

Default: {}

Defines additional request parameters to use in Zyte API requests sent by the scrapy-poet integration.

For example:

settings.py
ZYTE_API_PROVIDER_PARAMS = {
    "requestCookies": [
        {"name": "a", "value": "b", "domain": "example.com"},
    ],
}

ZYTE_API_RETRY_POLICY

Default: "zyte_api.aio.retry.zyte_api_retrying"

Determines the retry policy for Zyte API requests.

It must be a string with the import path of a tenacity.AsyncRetrying subclass.

Note

Settings must be picklable, and retry policies are not, so you cannot assign a retry policy directly to this setting. Use its import path as a string instead.
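To illustrate why a dotted path works where an object does not: a string pickles trivially and can be resolved to the object later. The sketch below uses only the standard library. `load_from_path` is a hypothetical helper, not the loader the plugin uses, and `json.JSONDecoder` stands in for a retry policy so that the example runs without zyte-api installed.

```python
import pickle
from importlib import import_module


def load_from_path(path):
    """Resolve a dotted import path such as 'package.module.name' to an object."""
    module_path, _, name = path.rpartition(".")
    return getattr(import_module(module_path), name)


# The setting holds a string, which survives a pickle round trip unchanged.
ZYTE_API_RETRY_POLICY = "zyte_api.aio.retry.zyte_api_retrying"
round_tripped = pickle.loads(pickle.dumps(ZYTE_API_RETRY_POLICY))

# Stand-in target from the standard library, resolved the same way.
resolved = load_from_path("json.JSONDecoder")
```

A custom policy would follow the same pattern: define it in a module of your project and point the setting at its dotted path.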

See Retries.

ZYTE_API_SKIP_HEADERS

Default: ["Cookie"]

Determines headers that must not be mapped as customHttpRequestHeaders.
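A minimal sketch of the filtering this setting describes is shown below. The `to_custom_http_request_headers` helper is hypothetical, and both the case-insensitive name comparison and the list-of-name/value output shape are assumptions made for illustration rather than a description of the plugin's internals.

```python
ZYTE_API_SKIP_HEADERS = ["Cookie"]


def to_custom_http_request_headers(headers, skip_headers):
    """Build a name/value list from headers, leaving out the skipped names."""
    # Assumption: header names are compared case-insensitively.
    skip = {name.lower() for name in skip_headers}
    return [
        {"name": name, "value": value}
        for name, value in headers.items()
        if name.lower() not in skip
    ]


mapped = to_custom_http_request_headers(
    {"Accept-Language": "en", "Cookie": "a=b"},
    ZYTE_API_SKIP_HEADERS,
)
```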

ZYTE_API_TRANSPARENT_MODE

Default: False

See Transparent mode.

ZYTE_API_USE_ENV_PROXY

Default: False

Set to True to make Zyte API requests respect system proxy settings. See Using a proxy.