Retries
To make error handling easier, scrapy-zyte-api lets you handle successful Zyte API responses as usual, but implements a more advanced retry mechanism for rate-limiting and unsuccessful responses.
Retrying successful Zyte API responses
When a successful Zyte API response is received, a Scrapy response object is built based on the upstream website response (see Response mapping), and passed to your downloader middlewares and spider callback.
Usually, these responses do not need to be retried. If they do, you can retry them using Scrapy’s built-in retry middleware (RetryMiddleware) or its get_retry_request() function.
Retrying non-successful Zyte API responses
When a rate-limiting or an unsuccessful Zyte API response is received, no Scrapy response object is built. Instead, a retry policy is followed, and if the policy retries are exhausted, a zyte_api.RequestError exception is raised.
That zyte_api.RequestError exception is passed to the process_exception method of your downloader middlewares and to your spider errback if you defined one for the request. You could have RetryMiddleware retry that request by adding zyte_api.RequestError to the RETRY_EXCEPTIONS setting, but you are better off relying on the default retry policy or defining a custom retry policy instead.
Retry policy
Retry policies are a feature of the Python Zyte API client library, which scrapy-zyte-api uses underneath. See the upstream retry policy documentation to learn about the default retry policy and how to create a custom retry policy, including ready-to-use examples.
In scrapy-zyte-api, use the ZYTE_API_RETRY_POLICY setting or the zyte_api_retry_policy Request.meta key to point to a custom retry policy or to its import path, to override the default retry policy:
ZYTE_API_RETRY_POLICY = "project.retry_policies.CUSTOM_RETRY_POLICY"