You Ranked. Now What? How AI Agents Decide Which Pages to Actually Read

Key takeaways

Mueller is right that llms.txt does not affect search rankings. That is not what it is for.
The real dynamic is selection after discovery: which of the 10-15 ranked URLs does an agent actually read?
Google's own Lighthouse has merged an automatic llms.txt check into its agentic browsing config. The signal is clear.

John Mueller is Google's Senior Search Analyst and one of the most influential voices in SEO. For years he has been the person webmasters, SEO professionals, and site owners turn to when they want to understand what Google actually thinks about how the web should be built. When he speaks, the industry listens. So when he posted that llms.txt is not necessary, that LLMs can read HTML just fine, and that it is likely a temporary crutch while the technology matures, the SEO community largely nodded along. He is not wrong about any individual claim. But he is answering a question nobody serious is actually asking.

The debate has the wrong frame

Everyone arguing about llms.txt is arguing about discovery. Does it help you rank? Will Google use it as a ranking signal? Will Bing? That is the SEO frame, and it is the wrong one. Nobody is claiming llms.txt helps you show up in search results. That is SEO's job, and SEO still matters enormously. You still need to rank to even be considered.

The question is what happens after you rank.

How it actually plays out

A user asks an AI agent to recommend GPUs for their ML project. The agent calls a search API, gets back 10 to 15 URLs, and now faces a resource allocation problem: which of these do I actually spend time and tokens reading?

Search returns 10-15 URLs ranked by traditional SEO signals.
The agent checks: which of these have an llms.txt file?
Sites with llms.txt: structured context pulled in milliseconds. The agent knows what the site sells, what is relevant, what to trust.
Sites without llms.txt: the agent has to fetch the page, parse HTML, strip navigation and ads, and guess what matters.
Once the agent has gathered enough signal from the structured sites, it stops. The rest never get read.

This is not a theoretical edge case. It is the default behavior of any rational token-budgeted system. When you have 15 candidates and 8 of them offer a clean structured summary while 7 require expensive crawl-and-parse, you read the 8 first. If the 8 are enough to answer the query, the 7 never get touched.
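The selection logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular agent is implemented: the `Candidate` class, the `has_llms_txt` flag, the `choose_reads` function, and both cost constants are hypothetical names and numbers chosen only to make the ordering visible.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    url: str
    rank: int            # position in the search results (1 = best)
    has_llms_txt: bool   # hypothetical flag, set by an earlier probe


# Assumed per-source read costs in tokens. Illustrative numbers only.
COST_STRUCTURED = 500
COST_HTML = 6000


def choose_reads(candidates, token_budget, sources_needed):
    """Return (sources read, tokens spent), reading the cheap tier first."""
    # Structured sites sort ahead of unstructured ones; search rank breaks ties.
    ordered = sorted(candidates, key=lambda c: (not c.has_llms_txt, c.rank))
    chosen, spent = [], 0
    for c in ordered:
        if len(chosen) >= sources_needed:
            break  # enough signal gathered, stop reading
        cost = COST_STRUCTURED if c.has_llms_txt else COST_HTML
        if spent + cost > token_budget:
            continue  # cannot afford this source, try the next one
        chosen.append(c)
        spent += cost
    return chosen, spent


# 15 search results: 8 with llms.txt, 7 without (ranks are made up).
WITH_FILE = {3, 4, 6, 8, 9, 11, 13, 15}
results = [
    Candidate(f"https://site{r}.example", r, r in WITH_FILE)
    for r in range(1, 16)
]

chosen, spent = choose_reads(results, token_budget=10_000, sources_needed=5)
print([c.rank for c in chosen], spent)  # [3, 4, 6, 8, 9] 2500
```

In this toy run the sites ranked 1st and 2nd are never read, because five structured sources already satisfied the query at a fraction of the budget.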

The Lighthouse signal

Google's own team just merged a pull request into Lighthouse that adds an automatic llms.txt check to the agentic browsing config. The PR description is plain: it adds a gatherer and audit to check for a valid llms.txt file on the site and integrates it into the agentic browsing config. This is not a third-party tool or a startup's wishful thinking. This is Google's web quality framework treating llms.txt as a first-class signal for how well a site serves AI agents.

Mueller speaks from a search ranking perspective, which is exactly the right lens for that specific question. Lighthouse is Google's web quality tool, the one that tells developers whether their site is actually working well in practice. Mueller says llms.txt is not necessary for ranking. Lighthouse says it matters for agent readiness. Both are correct. The mistake is assuming they are talking about the same thing.

The two-tier web

What is forming right now is a clean split between two types of sites. Tier 1 sites can be understood instantly. An agent fetches one lightweight file and gets a compact, structured picture of the business as the site itself describes it. Tier 2 sites cost real compute to evaluate: HTML fetches, DOM parsing, navigation stripping, inference over noisy token streams. Agents gravitate to Tier 1 by default because time and tokens are finite.
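To make the "one lightweight file" concrete, here is a minimal parsing sketch. It assumes the file follows the layout in the public llms.txt proposal: a single H1 title, a blockquote summary, then H2 sections containing markdown link lists. The function name `parse_llms_txt`, the returned dictionary shape, and the sample store are all hypothetical.

```python
import re

# Matches list items such as "- [Title](https://example.com/page.md): note"
LINK = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Extract the title, summary, and linked sections from llms.txt-style text."""
    doc = {"title": None, "summary": None, "sections": {}}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()
        elif line.startswith("## "):
            section = line[3:].strip()
            doc["sections"][section] = []
        elif section is not None:
            m = LINK.match(line)
            if m:
                doc["sections"][section].append(
                    {"title": m["title"], "url": m["url"], "note": m["note"] or ""}
                )
    return doc


# Made-up sample file for a fictional store.
SAMPLE = """# Example GPU Store

> Hypothetical retailer of workstation and datacenter GPUs.

## Products

- [GPU catalog](https://example.com/gpus.md): specs and prices
- [Buying guide](https://example.com/guide.md)
"""

doc = parse_llms_txt(SAMPLE)
print(doc["title"], "|", len(doc["sections"]["Products"]), "links")
```

No DOM walk, no navigation stripping: a single pass over a few lines of text yields the site's name, its one-line description, and the pages it wants an agent to read.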

The interesting consequence is what happens to rankings. The site ranked 8th with a clean llms.txt can beat the site ranked 2nd without one, because the agent picks up the structured site first and may never get to the others. Traditional ranking still gets you into the pool. But it is no longer sufficient to guarantee you get read.

Ranking gets you on the list. llms.txt gets you read.

Mueller is technically right and practically wrong

The "LLMs can read HTML just fine" argument is technically accurate. Agents can crawl and parse HTML. They do it every day. The problem is the gap between "can" and "will." Yes, agents can read your HTML. Will they, when there are 8 other sites that delivered clean summaries in 200ms? Probably not.

The "temporary crutch" framing is where the argument breaks down completely. Token efficiency is not a transitional quirk waiting to be engineered away. It is a fundamental property of how these systems work. Compute costs money. Context windows have limits. Agents operating at scale will always prefer sources that cost less to process. That preference is not going to disappear when models get smarter. If anything, as agents become more capable and handle more complex multi-step tasks, the pressure to conserve tokens per step increases.

The physics of compute do not negotiate

This is the part that gets lost in the SEO debate framing. llms.txt is not a ranking trick. It is not a shortcut while AI figures itself out. It is a response to a permanent constraint: processing structured data is cheaper than inferring structure from unstructured markup, and cheaper sources get chosen more often. That is not a quirk of current models. That is arithmetic.
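A rough way to see the arithmetic is to compare how many tokens the same fact costs in each form. The sketch below uses a crude four-characters-per-token rule of thumb rather than a real tokenizer, and both the HTML page and the summary are fabricated for illustration; the helper names are hypothetical.

```python
CHARS_PER_TOKEN = 4  # assumed rule of thumb, not a real tokenizer


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def cost_ratio(html, summary):
    """How many times more tokens the HTML costs than the summary."""
    return estimate_tokens(html) / estimate_tokens(summary)


# A made-up page: 40 navigation links wrapped around one useful sentence.
nav = '<li><a class="nav-link" href="/category">Category</a></li>\n' * 40
html_page = (
    "<html><body><nav><ul>\n" + nav + "</ul></nav>"
    "<main><p>We sell GPUs for ML workloads.</p></main></body></html>"
)

# The same fact in llms.txt-style form.
summary = "# Example GPU Store\n\n> We sell GPUs for ML workloads.\n"

print(estimate_tokens(html_page), "tokens of HTML vs",
      estimate_tokens(summary), "tokens of summary")
print(f"ratio: {cost_ratio(html_page, summary):.0f}x")
```

The exact ratio depends entirely on the page, but the direction does not change: markup that exists for browsers is overhead for an agent that only needs the sentence inside it.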

Every site without llms.txt is implicitly choosing to compete only on traditional search rank and then hoping agents still bother to read it. Every site with llms.txt is competing at both layers: ranking in search, then being the cheapest and clearest site to read once the agent has its candidate list.

What this means in practice

Keep doing SEO. Rank well. That part of the argument is completely valid: you need to appear in the candidate list before any of this matters. But understand that getting into the list is now just the first filter. The second filter is agent efficiency, and it runs on different rules.

Run a Site Scan on your domain to see what an agent actually encounters when it finds you. Then ask whether you want to be Tier 1 or Tier 2 in the world that Lighthouse just formalized.