Retries

To make error handling easier, scrapy-zyte-api lets you handle successful Zyte API responses as usual, but implements a more advanced retry mechanism for rate-limiting and unsuccessful responses.

Retrying successful Zyte API responses

When a successful Zyte API response is received, a Scrapy response object is built based on the upstream website response (see Response mapping), and passed to your downloader middlewares and spider callback.

Usually, these responses do not need to be retried. If they do, you can retry them using Scrapy’s built-in retry middleware (RetryMiddleware) or its get_retry_request() function.

Retrying non-successful Zyte API responses

When a rate-limiting or an unsuccessful Zyte API response is received, no Scrapy response object is built. Instead, a retry policy is followed, and if the policy retries are exhausted, a zyte_api.RequestError exception is raised.

That zyte_api.RequestError exception is passed to the process_exception method of your downloader middlewares and, if you defined one for the request, to your spider errback. You could have RetryMiddleware retry that request by adding zyte_api.RequestError to the RETRY_EXCEPTIONS setting, but you are better off relying on the default retry policy or defining a custom retry policy instead.

Retry policy

Retry policies are a feature of the Python Zyte API client library, which scrapy-zyte-api uses underneath. See the upstream retry policy documentation to learn about the default retry policy and how to create a custom retry policy, including ready-to-use examples.

In scrapy-zyte-api, to override the default retry policy, set the ZYTE_API_RETRY_POLICY setting or the zyte_api_retry_policy Request.meta key to a custom retry policy or to its import path:

settings.py
ZYTE_API_RETRY_POLICY = "project.retry_policies.CUSTOM_RETRY_POLICY"