There's a setting buried in your CDN dashboard that decides whether ChatGPT, Perplexity, and Google's AI Mode can even see your content. Most marketers have never looked at it. In 2026, that's a problem.
Cloudflare now blocks AI crawlers like OpenAI's GPTBot and Anthropic's ClaudeBot by default for new sites on its network, and its Pay Per Crawl system lets publishers charge bots for access. The upshot: your infrastructure team might be making a visibility decision that your marketing team doesn't know about — and if a model's crawler can't read your pages, your brand can't be cited in its answers.
Cloudflare's AI Crawl Control gives a publisher three choices for every AI bot that knocks on the door. Allow grants free access. Charge requires payment at a price you set, using the rarely used HTTP 402 "Payment Required" response code. Block denies access with an HTTP 403, while leaving a hint that paid access could be negotiated later. (Search Engine Land)
Before this, the choice was binary and ugly: open your content to AI scrapers for free, or wall it off entirely. The "charge" option is the new middle path. Cloudflare introduced the HTTP 402 mechanism so a crawler that requests a paid URL gets a price header back; if the bot is configured to pay up to that amount, the content is served. (Cloudflare)
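To make the three outcomes concrete, here is a minimal sketch of the allow / charge / block decision, simulated in plain Python with no network calls. The header names, bot names, and price are hypothetical placeholders for illustration, not Cloudflare's actual field names or API.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional

# Hypothetical header names, used only for this illustration.
PRICE_HEADER = "x-example-crawl-price"
CHARGED_HEADER = "x-example-crawl-charged"

# Hypothetical per-bot policy a publisher might configure.
POLICY = {"FriendlyBot": "allow", "PaidBot": "charge", "ScraperBot": "block"}
PRICE_USD = Decimal("0.01")


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""


def serve(bot: str, max_price: Optional[Decimal] = None) -> Response:
    """Return the response a publisher's edge might send for one request."""
    action = POLICY.get(bot, "block")  # unknown bots are blocked in this sketch
    if action == "allow":
        return Response(200, body="page content")
    if action == "block":
        return Response(403)
    # "charge": serve only if the bot declared it will pay at least the price.
    if max_price is not None and max_price >= PRICE_USD:
        return Response(200, {CHARGED_HEADER: str(PRICE_USD)}, "page content")
    return Response(402, {PRICE_HEADER: str(PRICE_USD)})
```

A bot that sends no price, or offers too little, gets the 402 and the quoted price back. A bot willing to pay at least that amount gets the page.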
That sounds like a publisher win. For your AI visibility, it's a double-edged sword.
Here's the chain most teams miss. Models like ChatGPT and Perplexity surface your brand by retrieving and reading live web pages. If your crawler door is shut, you're not in the candidate set. No crawl, no read, no citation.
The scale of crawling is bigger than most realize. Bytespider, ByteDance's crawler, is the most active AI bot on Cloudflare's network, hitting more than 40.4% of all Cloudflare-protected domains. GPTBot, despite being the best-known AI crawler, is also one of the most frequently blocked. (Search Engine Land) Many publishers don't even know these bots are visiting.
So you can end up invisible in AI search two ways: deliberately, by blocking crawlers to protect content, or accidentally, because a default setting or a security rule already blocks them and nobody checked.
This is exactly the blind spot Sourceable was built to catch. It tracks when and how your brand shows up across ChatGPT, Claude, Gemini, and Perplexity — so if a crawler change quietly knocks you out of AI answers, you see the drop in mentions instead of guessing why traffic fell.
Blocking or charging AI crawlers protects your content and can create a licensing revenue stream. Major publishers have already joined Cloudflare's Pay Per Crawl program, betting that AI companies should pay for the work they train on and build answers from. (Search Engine Land)
But there's a cost. The same Cloudflare update, for the first time, lets publishers block or charge traditional search crawlers like Googlebot and Bingbot too. Almost nobody should. Google still drives 30–60% or more of most publishers' traffic, and The Wall Street Journal has reported real traffic declines tied to AI search experiences already. (Search Engine Land) Cutting Bing matters especially for AI: ChatGPT's web browsing leans on Bing's index, so blocking Bingbot can remove you from ChatGPT's reach even if you never touched GPTBot.
The decision isn't "protect content" vs. "give it away." It's: how much AI-answer visibility are you willing to trade for control or licensing dollars? For a media company with a paywall, blocking may pay. For a B2B SaaS brand that wants to be the named answer when someone asks ChatGPT "best tools for X," blocking is self-sabotage.
The choice is no longer all-or-nothing per bot. Cloudflare has proposed a Content Signals mechanism that lets publishers declare whether their content may be used for AI training, for search indexing, or for inference (the live answer generation that produces citations) — three separate permissions instead of one blunt switch. (InfoQ)
That granularity is the move smart brands should want. You can say "don't train on my content, but do use it to answer questions and cite me." It separates the thing you might want to protect (training) from the thing that drives visibility (inference and indexing).
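As a rough illustration of the three separate permissions, the sketch below parses a declaration of the form "Content-Signal: search=yes, ai-input=yes, ai-train=no". That line format and the key names are assumptions made for this example; check the current Content Signals documentation for the exact syntax before relying on it.

```python
def parse_content_signal(line: str) -> dict:
    """Parse 'Content-Signal: key=yes, key=no' into {key: bool}."""
    name, _, value = line.partition(":")
    if name.strip().lower() != "content-signal":
        raise ValueError("not a Content-Signal line")
    signals = {}
    for part in value.split(","):
        key, sep, flag = part.strip().partition("=")
        if not sep:
            continue  # skip entries without '='
        signals[key.strip().lower()] = flag.strip().lower() == "yes"
    return signals


# Assumed key names: search (indexing), ai-input (inference), ai-train (training).
policy = parse_content_signal(
    "Content-Signal: search=yes, ai-input=yes, ai-train=no"
)
```

Here `policy` expresses "cite me, but don't train on me": indexing and inference are permitted, training is not.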
Platforms are layering on more controls too — Cloudflare and newsletter platform beehiiv rolled out joint AI crawler controls for publishers in June 2026. (Search Engine Land)
Pull up your robots.txt and your CDN's bot rules. Confirm what's actually happening to GPTBot, ClaudeBot, Google-Extended, PerplexityBot, and Bingbot. "We never decided to block anything" is not the same as "nothing is blocked" — defaults change.
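The robots.txt half of that audit can be scripted with Python's standard library. This is a minimal sketch: the sample robots.txt and the example.com URL are made up, and in practice you would load your own file instead.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt for illustration; substitute your site's real file.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /private/
"""

BOTS = ["GPTBot", "ClaudeBot", "Google-Extended", "PerplexityBot", "Bingbot"]


def audit(robots_txt: str, url: str) -> dict:
    """Report, per bot token, whether robots.txt permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in BOTS}


report = audit(SAMPLE_ROBOTS, "https://example.com/blog/post")
for bot, allowed in report.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

This only covers robots.txt. A CDN or firewall rule can still block a bot that robots.txt allows, so the CDN's bot settings need a separate look.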
A practical default for most brands that want AI visibility:
Allow the inference and search crawlers that generate cited answers, such as GPTBot and PerplexityBot; leave the Google-Extended token unblocked in robots.txt; and keep Bingbot fully open.
Consider charging or blocking only the pure-scraper bots that hammer your site without sending traffic, like Bytespider, if bandwidth or content theft is a real concern.
Use Content Signals (where supported) to separate training permission from indexing and inference permission.
Monitor the outcome. A setting change is a hypothesis. Whether you actually gained or lost AI-answer presence shows up in your mention counts, not your firewall logs.
That last point is where most teams fly blind. You can audit access rules all day, but the proof is whether ChatGPT and Perplexity still name you. Sourceable closes that loop by watching your brand's presence across the major models over time, so a crawler tweak becomes a measurable cause-and-effect instead of a mystery.
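The before-and-after comparison itself is simple arithmetic. The sketch below assumes you have already collected a sample of AI answers to the same prompts before and after a crawler change. The brand name and answers are invented, and how the answers are gathered is out of scope here.

```python
def mention_rate(answers: list[str], brand: str) -> float:
    """Share of sampled AI answers that mention the brand (case-insensitive)."""
    if not answers:
        return 0.0
    hits = sum(brand.lower() in answer.lower() for answer in answers)
    return hits / len(answers)


# Invented sample data for illustration only.
before = [
    "Try Acme or Globex.",
    "Acme is a solid pick.",
    "Consider Initech.",
    "Acme leads here.",
]
after = [
    "Try Globex.",
    "Consider Initech.",
    "Acme is an option.",
    "Globex is popular.",
]

change = mention_rate(after, "Acme") - mention_rate(before, "Acme")
print(f"Mention rate change: {change:+.0%}")
```

With samples this small the result is only directional. A real comparison needs more prompts, repeated runs, and the same prompts on both sides of the change.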
Does blocking AI crawlers remove my brand from ChatGPT and Perplexity? It can. If a model's crawler can't fetch your pages, your content won't be in the candidate set it reads to build an answer, so you're far less likely to be cited. Blocking Bingbot is especially risky because ChatGPT's browsing relies on Bing's index.
What is HTTP 402 and why does it suddenly matter? HTTP 402 "Payment Required" is a long-dormant status code Cloudflare repurposed for Pay Per Crawl. When an AI bot requests a paid URL, the server returns 402 with a price; the bot pays if it's configured to. It's the technical basis for "charge" as a third option beyond allow and block. (Cloudflare)
Should a small business block AI crawlers to protect content? Usually no. Unless you sell access to proprietary content, the bigger risk is invisibility. Most small brands benefit more from being cited in AI answers than from the content protection that blocking provides or the small licensing fees that charging might bring.
How do I know if a crawler change hurt my AI visibility? Track your brand mentions across AI models before and after the change. A monitoring tool like Sourceable records how often ChatGPT, Claude, Gemini, and Perplexity reference you, so a drop is visible instead of hidden in analytics.
Pay Per Crawl reframed a question every brand now has to answer on purpose: block, charge, or allow. Get it wrong quietly and you can disappear from AI search without a single line of bad content. Decide it deliberately, set crawler rules that match your goals, and then watch your AI mentions to confirm the call was right.
Want to see whether AI models can actually find and cite your brand today? Check your AI search visibility with Sourceable.