Locale-adaptive redirects are one of those things that “worked fine” until the crawler ecosystem changed.

Search engines generally don’t send Accept-Language in crawl requests, but some AI crawlers do, often with default US English values. If your site redirects HTML requests based on Accept-Language, you can accidentally funnel bots into the wrong locale, reduce coverage of non-English pages, and make debugging harder (especially when rendering is involved).

This post explains what we observed in testing.

TL;DR

  • Most search engine bots don’t send Accept-Language, so these redirects often didn’t fire historically.
  • Many AI crawlers send browser-like defaults (often en-US,en;q=0.9), which is usually not user intent.
  • Redirecting HTML based on Accept-Language can skew discovery and indexing toward the wrong locale, and gets messier during rendering.

Accept-Language redirects used to be “fine”

For years, technical SEO has relied on a stable rule: don’t do locale-adaptive redirects for search engines.

In practice, that meant avoiding redirects based on IP geolocation, cookie state, or “smart” locale detection. Instead, create clean, crawlable URLs for each locale and add hreflang annotations when needed.

Accept-Language wasn’t part of this conversation for a long time – because major search engine bots simply don’t send it in crawl requests. Google states this explicitly in their locale-adaptive pages documentation.

But search engines aren’t the only crawlers anymore.

AI crawlers are increasingly common, and many behave differently. They’re often less mature than major search engines, more experimental, and more likely to trigger edge cases we never had to consider.

The edge case we’ll consider in this post: AI bots may send Accept-Language headers.

If your platform’s redirect rules are based on Accept-Language, you risk creating redirect loops or blocking certain bots from accessing specific language content.

Quick refresher on the Accept-Language header

Accept-Language is an HTTP request header used for content negotiation: the client tells the server which natural languages it prefers. Browsers typically set it based on UI language and user preferences.

The HTTP semantics are standardised in RFC 9110.

It’s a preference signal, not a command. Treat it as “this is what the client might prefer,” not “always redirect me.”
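To make the “preference, not command” point concrete, here is a minimal sketch of how a server could parse the header into a weighted preference list. The function name is hypothetical, and real content negotiation (per RFC 9110) involves more matching rules than this:

```python
# Hypothetical helper: parse an Accept-Language header into an ordered
# preference list. The header carries weighted preferences (q-values),
# not a single mandatory locale.
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        if ";q=" in part:
            lang, q = part.split(";q=", 1)
            try:
                weight = float(q)
            except ValueError:
                weight = 0.0
        else:
            lang, weight = part, 1.0  # quality defaults to 1.0
        prefs.append((lang.strip(), weight))
    # Highest quality first; ties keep original order (stable sort)
    return sorted(prefs, key=lambda p: p[1], reverse=True)

print(parse_accept_language("en-US,en;q=0.9"))
# [('en-US', 1.0), ('en', 0.9)]
```

Note that the common automated default, en-US,en;q=0.9, is just such a weighted list: it expresses “prefer en-US, accept en”, nothing stronger.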

The “old” best practices

Google’s documentation on locale-adaptive pages is clear and consistent:

  • If you return different content based on perceived country or language preference, Google might not crawl, index, or rank all variants.
  • Googlebot’s default crawling IPs appear US-based.
  • Googlebot sends HTTP requests without Accept-Language.
  • Google recommends separate locale URLs and hreflang annotations.

Other major search engines follow similar rules.

In the old SEO world, Accept-Language redirects were rarely a problem since there was no header to trigger them. IP-based detection was the more obvious real risk.

What changed (and what we tested)

We tested major search engines, AI platforms, and web-access LLMs to see how they fetch HTML documents in the wild and specifically which request headers they send and how they handle redirect chains.

Scope

  • Content type: HTML requests only
  • Signals observed: Accept-Language and redirect chains

Why focus on HTML?

HTML is where localisation decisions are usually enforced (often through redirects). It’s also the primary content search engines index and the main input LLM retrieval systems use for grounding.

What we observed

Across tested platforms, requests were typically sent either:

  1. without an Accept-Language header, or
  2. with a default “automated Chrome” header, usually: Accept-Language: en-US,en;q=0.9

Crucially, AI crawlers and retrieval systems didn’t actively adapt Accept-Language based on:

  • prompt language
  • user locale
  • browser language settings
  • conversation context

In other words: when the header exists, it’s rarely user intent — it’s an implementation default. And in some cases, even Googlebot can end up presenting the same default Accept-Language when it follows redirects during rendering.

Practical takeaway: redirecting HTML based on Accept-Language can reduce indexing quality and create “wrong language” retrieval for both search engines and LLM-driven systems.

Notable exceptions: Applebot and PetalBot behaved differently in our tests, apparently following custom internal logic.

Data Export

Below is the raw header behaviour we observed across platforms.

  • Googlebot (also used by Gemini): Absent. en-US is present when a JS redirect is followed during rendering.
  • Adsbot-Google: Absent. en-US is present when a JS redirect is followed during rendering.
  • Mediapartners-Google (AdSense): Absent. en-US is present when a JS redirect is followed during rendering.
  • Bingbot (also used by Microsoft / Copilot): Absent.
  • Adidxbot (Microsoft Ads): Absent.
  • YandexBot: Absent or en, *;q=0.01.
  • Yeti (Naver): Absent.
  • Baiduspider: Can be absent or zh-cn,zh-tw. zh-CN,en;q=0.9,en-GB;q=0.8,en-US;q=0.7,fr;q=0.6 is present when a JS redirect is followed during rendering.
  • Sogou: Can be absent, en-US,en;q=0.9, or zh-CN.
  • DuckDuckBot: Absent or en-US,en;q=0.8,zh;q=0.6,es;q=0.4.
  • DuckDuckBot-Https: Absent or en,*.
  • Applebot (also used by Apple Intelligence / Spotlight): Uses Accept-Language values that match the domain’s country-code top-level domain (ccTLD). For .com domains the header is absent; for localised domains like .de it sends de-DE, and for .co.jp it sends ja-JP.
  • OpenAI GPTBot: Absent.
  • OpenAI OAI-SearchBot: Absent.
  • OpenAI ChatGPT-User: Can be en-US,en;q=0.9 or absent.
  • Anthropic ClaudeBot: Absent.
  • Anthropic Claude-User: Absent.
  • Anthropic Claude-SearchBot: Absent.
  • Perplexity PerplexityBot: Absent.
  • Perplexity Perplexity-User: Absent.
  • MistralAI: Absent or en-US,en;q=0.9.
  • Bytespider: Can be absent, en-US,en;q=0.5, or zh,zh-CN;q=0.9.
  • DuckAssistBot: Absent.
  • meta-externalagent: Absent.
  • PetalBot: Absent most of the time. Inconsistently uses Accept-Language values that match the domain’s ccTLD: for .com domains the header is absent, .fr gets fr,en;q=0.8, but .at gets en.
  • Amazonbot: Absent.
  • TikTokSpider: Absent or en-US,en;q=0.5.
  • Pinterestbot: Absent.
  • CCBot: Absent or en-US,en;q=0.5.

Note: This table reflects data from February 2026 and may change over time. After we identified redirects linked to the Accept-Language header a few months ago, a subset of AI crawlers stopped including that header in their requests.

“Helpful” redirects that harm discovery

Accept-Language redirects are typically implemented to help users reach the right content immediately, bypassing language selector screens and additional interactions. For users, the success of such redirects can be measured through conversion rates and interaction metrics. For bots, the effects are more complex and subtle.

Here’s the typical pattern:

  1. Bot requests canonical URL (e.g., /product)
  2. Server redirects based on Accept-Language
  3. Bot lands on /en/product (or worse: a generic homepage)
  4. Indexing and retrieval now skew toward English, even if better alternatives exist

This creates downstream problems:

  • Partial indexing: If English is the default redirect target, you’re training both search engines and LLM retrieval systems to prefer English – regardless of user intent. This can also influence answer content and citations.
  • Crawl inefficiency: Every redirect adds an extra hop, consuming time and resources.
  • Complex debugging: Not all teams have access to request headers in logs. This adds an extra layer of complexity and uncertainty.

Accept-Language might matter for Googlebot during rendering

In our tests, Googlebot’s initial crawl fetch did not include Accept-Language (as expected). However, when a redirect was triggered during rendering, the follow-up requests inherited some request headers from the browser instance, including the default language preference (Accept-Language: en-US).

This creates a tricky edge case: when a platform redirects based on Accept-Language, the redirect doesn’t trigger during HTML fetches but may trigger during rendering fetches.

In the official Google documentation, one of the suggested solutions to avoid soft 404s for SPAs is to use a JavaScript redirect to a URL for which the server responds with a 404 HTTP status code (for example /not-found).

Depending on how the Accept-Language redirect is implemented, soft 404 handling can become inconsistent and indexing signals get muddy:

  1. Googlebot requests a soft 404 URL (e.g., /product)
  2. The page has a JavaScript redirect to /not-found
  3. The server intercepts the fetch with Accept-Language: en-US and redirects again based on internal rules
  4. Googlebot lands on /en/not-found, on a page that does not return a 404, or worse, on a generic homepage

Conclusion

Accept-Language isn’t a “bad” header; it’s a standard that platforms can use in multiple ways. What breaks is the assumption that crawlers behave like users.

The web now includes a wide range of crawlers, many of which send browser-like headers and aggressively explore edge cases.

Our internal testing across major bots and LLM platforms supports these statements:

  • Bots and LLM crawlers do not use Accept-Language as a localisation signal
  • When present, Accept-Language is typically default en-US,en;q=0.9
  • Therefore, Accept-Language based redirects for bots:
    • do not reliably improve user experience,
    • introduce content accessibility risks,
    • and can reduce indexing quality for both search engines and LLM retrieval systems.

LLMs naturally tend to prefer English over other languages – an emergent behaviour driven by English-heavy training data, tokeniser efficiency differences, and alignment signals. Redirecting based on Accept-Language reinforces this bias by forcing “English-only” content. This can unnecessarily exclude relevant non-English sources and create a partial, skewed view of available information.

We recommend avoiding redirects for bots based on the Accept-Language header. Build following internet standards, make URLs explicit, and keep redirects predictable.
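As one hedged sketch of that recommendation: answer the canonical URL with a 200 and expose the per-locale variants explicitly as hreflang alternates, instead of redirecting. The domain, locale set, and function name below are placeholders:

```python
# Hypothetical sketch: serve the requested URL with a 200 and list
# explicit per-locale URLs as hreflang alternates, rather than
# redirecting on Accept-Language.
SITE_LOCALES = {"en": "/en", "fr": "/fr", "de": "/de"}  # assumed locales

def hreflang_links(path: str) -> str:
    """Build <link rel="alternate" hreflang="..."> tags for a page."""
    base = "https://example.com"  # placeholder domain
    links = [
        f'<link rel="alternate" hreflang="{code}" href="{base}{prefix}{path}" />'
        for code, prefix in sorted(SITE_LOCALES.items())
    ]
    # x-default marks the unredirected canonical page (e.g. a selector).
    links.append(
        f'<link rel="alternate" hreflang="x-default" href="{base}{path}" />'
    )
    return "\n".join(links)

print(hreflang_links("/product"))
```

Every bot then sees every locale at a stable URL, and any Accept-Language value a user’s browser sends can still drive a banner or suggestion without a redirect.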