Settings

Settings for scrapy-zyte-api.

ZYTE_API_AUTOMAP_PARAMS

Default: {}

dict of parameters to be combined with automatic request parameters.

These parameters are merged with zyte_api_automap parameters. zyte_api_automap parameters take precedence.

This setting has no effect on requests with manual request parameters.

When using transparent mode, be careful of which parameters you define in this setting. In transparent mode, all Scrapy requests go through Zyte API, even requests that Scrapy sends automatically, such as those for robots.txt files when ROBOTSTXT_OBEY is True, or those for sitemaps when using SitemapSpider. Certain parameters, like browserHtml or screenshot, are not meant to be used for every single request.

If zyte_api_default_params in Request.meta is set to False, this setting is ignored for that request.

See Default parameters.
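As a minimal sketch of the precedence rule and the transparent-mode advice above, the following keeps the project-wide setting lightweight and enables a heavier parameter on a single request only. The dict merge is an illustration of the documented precedence, not the library's actual implementation, and the geolocation values and the request_meta name are assumptions chosen for the example.

```python
# settings.py (sketch): keep project-wide automap parameters cheap, because in
# transparent mode they also apply to robots.txt and sitemap requests.
ZYTE_API_AUTOMAP_PARAMS = {"geolocation": "US"}

# Hypothetical per-request parameters, as they would appear in Request.meta.
request_meta = {"zyte_api_automap": {"geolocation": "DE", "browserHtml": True}}

# Illustration only: setting values first, per-request values second, so the
# per-request values win on conflicting keys.
effective = {**ZYTE_API_AUTOMAP_PARAMS, **request_meta["zyte_api_automap"]}
print(effective)
```

Only the request that asks for browserHtml pays for browser rendering; every other request gets the setting's parameters alone.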

ZYTE_API_BROWSER_HEADERS

Default: {"Referer": "referer"}

Determines headers that can be mapped as requestHeaders.

It is a dict where each key is a header name and each value is the key that represents that header in requestHeaders.
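The shape of this mapping can be illustrated with a small sketch. The to_request_headers helper below is hypothetical and is not part of scrapy-zyte-api; matching header names case-insensitively is an assumption made for the example.

```python
# Default value of the setting, as documented above.
ZYTE_API_BROWSER_HEADERS = {"Referer": "referer"}


def to_request_headers(headers, mapping):
    """Hypothetical helper: keep only headers listed in mapping, renamed to
    their requestHeaders keys. Case-insensitive matching is an assumption."""
    lowered = {name.lower(): key for name, key in mapping.items()}
    return {
        lowered[name.lower()]: value
        for name, value in headers.items()
        if name.lower() in lowered
    }


headers = {"Referer": "https://example.com", "User-Agent": "custom"}
request_headers = to_request_headers(headers, ZYTE_API_BROWSER_HEADERS)
print(request_headers)
```

Headers that are absent from the mapping, such as User-Agent here, are left out of the result.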

ZYTE_API_DEFAULT_PARAMS

Default: {}

dict of parameters to be combined with manual request parameters.

You may set zyte_api to an empty dict to only use the parameters defined here for that request.

These parameters are merged with zyte_api parameters. zyte_api parameters take precedence.

This setting has no effect on requests with automatic request parameters.

If zyte_api_default_params in Request.meta is set to False, this setting is ignored for that request.

See Default parameters.
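The three cases described above (an empty zyte_api dict, overriding a default, and opting out) can be sketched as follows. The effective_params helper is hypothetical and only mirrors the documented rules; it is not how scrapy-zyte-api is implemented, and the parameter values are assumptions.

```python
# settings.py (sketch)
ZYTE_API_DEFAULT_PARAMS = {"browserHtml": True, "geolocation": "US"}


def effective_params(meta, defaults):
    """Hypothetical helper mirroring the documented rules for manual parameters."""
    manual = meta["zyte_api"]
    if meta.get("zyte_api_default_params", True) is False:
        # The request opted out: the setting is ignored.
        return dict(manual)
    # Otherwise defaults are merged in and zyte_api values take precedence.
    return {**defaults, **manual}


only_defaults = effective_params({"zyte_api": {}}, ZYTE_API_DEFAULT_PARAMS)
overridden = effective_params(
    {"zyte_api": {"geolocation": "DE"}}, ZYTE_API_DEFAULT_PARAMS
)
opted_out = effective_params(
    {"zyte_api": {"httpResponseBody": True}, "zyte_api_default_params": False},
    ZYTE_API_DEFAULT_PARAMS,
)
```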

ZYTE_API_ENABLED

Default: True

Can be set to False to disable scrapy-zyte-api.

ZYTE_API_EXPERIMENTAL_COOKIES_ENABLED

Default: False

See Automatic mapping.

ZYTE_API_FALLBACK_REQUEST_FINGERPRINTER_CLASS

Default: scrapy_poet.ScrapyPoetRequestFingerprinter if scrapy-poet is installed, else scrapy.utils.request.RequestFingerprinter

Request fingerprinter to use for requests that do not go through Zyte API. See Request fingerprinting.

ZYTE_API_KEY

Default: None

Your Zyte API key.

You can alternatively define an environment variable with the same name.

Tip

On Scrapy Cloud, this setting is defined automatically.

ZYTE_API_LOG_REQUESTS

Default: False

Set this to True and LOG_LEVEL to "DEBUG" to enable the logging of debug messages that indicate the JSON object sent on every Zyte API request.

For example:

Sending Zyte API extract request: {"url": "https://example.com", "httpResponseBody": true}

ZYTE_API_LOG_REQUESTS_TRUNCATE

Default: 64

Determines the maximum length of any string value in the JSON object logged when ZYTE_API_LOG_REQUESTS is enabled, excluding object keys.

To disable truncation, set this to 0.
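To make the rule concrete, here is a rough sketch of truncating every string value in a nested JSON-like object while leaving keys untouched. The truncate_strings function is hypothetical; the library's real output may differ, for example by marking truncated values in some way.

```python
def truncate_strings(value, limit):
    """Hypothetical sketch: cut string values to at most `limit` characters,
    recursing into dicts and lists. Keys are never truncated."""
    if limit <= 0:
        # 0 disables truncation, as described above.
        return value
    if isinstance(value, str):
        return value[:limit]
    if isinstance(value, dict):
        return {key: truncate_strings(item, limit) for key, item in value.items()}
    if isinstance(value, list):
        return [truncate_strings(item, limit) for item in value]
    # Booleans, numbers and None are returned unchanged.
    return value


params = {"url": "https://example.com", "httpResponseBody": True}
print(truncate_strings(params, 8))
```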

ZYTE_API_MAX_COOKIES

Default: 100

If the cookies to be set during request mapping exceed this limit, a warning is logged, and only as many cookies as the limit allows are set for the target request.

To silence this warning, set experimental.requestCookies manually, e.g. to an empty dict.

Alternatively, if experimental.requestCookies starts supporting more than 100 cookies, update this setting accordingly.

ZYTE_API_MAX_REQUESTS

Default: None

When set to an integer greater than 0, the spider closes once the number of Zyte API requests reaches that value.

Note that requests with error responses that cannot be retried or exceed their retry limit also count here.

ZYTE_API_PROVIDER_PARAMS

Default: {}

Defines additional request parameters to use in Zyte API requests sent by the scrapy-poet integration.

For example:

settings.py

ZYTE_API_PROVIDER_PARAMS = {
    "requestCookies": [
        {"name": "a", "value": "b", "domain": "example.com"},
    ],
}

ZYTE_API_RETRY_POLICY

Default: "zyte_api.aio.retry.zyte_api_retrying"

Determines the retry policy for Zyte API requests.

It must be a string with the import path of a tenacity.AsyncRetrying subclass.

Note

Settings must be picklable, and retry policies are not, so you cannot assign a retry policy directly to this setting. You must use its import path as a string instead.

See Retries.
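The note above relies on the fact that a string always pickles cleanly and can be resolved to an object later. The sketch below illustrates that general mechanism with the standard library only. The load_by_path helper is hypothetical, and json.dumps is merely a stand-in for an object defined in your own project, not a retry policy.

```python
import json
import pickle
from importlib import import_module


def load_by_path(path):
    """Hypothetical helper: resolve a dotted import path to the object it names."""
    module_path, _, name = path.rpartition(".")
    return getattr(import_module(module_path), name)


# Stand-in import path; a real project would point at its own retry policy.
policy_path = "json.dumps"

# The string survives pickling unchanged.
restored_path = pickle.loads(pickle.dumps(policy_path))

# The object is only looked up when the path is resolved.
resolved = load_by_path(restored_path)
```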

ZYTE_API_SKIP_HEADERS

Default: ["Cookie"]

Determines headers that must not be mapped as customHttpRequestHeaders.

ZYTE_API_TRANSPARENT_MODE

Default: False

See Transparent mode.

ZYTE_API_USE_ENV_PROXY

Default: False

Set to True to make Zyte API requests respect system proxy settings. See Using a proxy.
