The Referer header
By default, Scrapy automatically sets a Referer header on every request yielded from a callback (see the RefererMiddleware).
However, when using transparent mode or automatic request parameters, this behavior is disabled by default for Zyte API requests. When using manual request parameters, all request headers are always ignored for Zyte API requests.
Why is it disabled by default?
Misuse of the Referer header can increase the risk of bans.
By not setting the header, your Zyte API requests let Zyte API choose which value to use, if any, to minimize bans.
If you do set the header, Zyte API might still ignore your value to avoid bans, but it may also keep your value regardless of its impact on bans.
How to override?
To set the header anyway when using transparent mode or automatic request parameters, do any of the following:
- Set the ZYTE_API_REFERRER_POLICY setting or the referrer_policy request metadata key to "scrapy-default" or to some other value supported by the REFERRER_POLICY setting.
- Set the header through the DEFAULT_REQUEST_HEADERS setting or the Request.headers attribute.
- Set the header through the customHttpRequestHeaders field (for HTTP requests) or the requestHeaders field (for browser requests), using the ZYTE_API_AUTOMAP_PARAMS setting or the zyte_api_automap request metadata key.
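The options listed above could look roughly like the following settings.py fragment. This is a minimal sketch, not a definitive configuration: the example.com URL is a placeholder, the variables meta_policy and meta_automap are hypothetical names for illustration, and the shapes of the customHttpRequestHeaders value (a list of name/value objects) and the requestHeaders value (an object with a referer key) are assumptions about the Zyte API request format that you should check against its reference. The options are alternatives, so a real project would normally pick one.

```python
# settings.py fragment (sketch). Setting and key names come from the text
# above; the URL is a placeholder and the field shapes are assumptions.

# Option 1: restore Scrapy's default Referer behavior for Zyte API requests.
ZYTE_API_REFERRER_POLICY = "scrapy-default"

# Option 2: set the header like any other default Scrapy request header.
DEFAULT_REQUEST_HEADERS = {"Referer": "https://example.com/"}

# Option 3: set the header through automatic request parameters
# (customHttpRequestHeaders is the field for HTTP requests).
ZYTE_API_AUTOMAP_PARAMS = {
    "customHttpRequestHeaders": [
        {"name": "Referer", "value": "https://example.com/"},
    ],
}

# Per-request equivalents: dictionaries to pass as Request(url, meta=...).
# Both variable names are hypothetical.
meta_policy = {"referrer_policy": "scrapy-default"}
meta_automap = {
    "zyte_api_automap": {
        # requestHeaders is the field for browser requests.
        "requestHeaders": {"referer": "https://example.com/"},
    },
}
```

The settings apply project-wide, while the metadata keys affect only the request that carries them.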
When using manual request parameters, you always need to set the header through the customHttpRequestHeaders or requestHeaders field, using the ZYTE_API_DEFAULT_PARAMS setting or the zyte_api request metadata key.
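For manual request parameters, a sketch along these lines may help. It assumes that customHttpRequestHeaders takes a list of name/value objects and accompanies httpResponseBody, and that requestHeaders takes an object with a referer key and accompanies browserHtml. The manual_params helper, the REFERER constant, and the request_meta variable are hypothetical names introduced only for this illustration.

```python
REFERER = "https://example.com/"  # placeholder value


def manual_params(referer, browser=False):
    """Hypothetical helper: build manual Zyte API parameters with a Referer."""
    if browser:
        # Browser request: the header goes in the requestHeaders field.
        return {"browserHtml": True, "requestHeaders": {"referer": referer}}
    # HTTP request: the header goes in the customHttpRequestHeaders field.
    return {
        "httpResponseBody": True,
        "customHttpRequestHeaders": [{"name": "Referer", "value": referer}],
    }


# Project-wide default, as it might appear in settings.py.
ZYTE_API_DEFAULT_PARAMS = manual_params(REFERER)

# Per request: a dictionary to pass as Request(url, meta=request_meta).
request_meta = {"zyte_api": manual_params(REFERER, browser=True)}
```

Because request headers are ignored in this mode, setting Request.headers or DEFAULT_REQUEST_HEADERS has no effect here; only the parameter fields carry the value.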