But that isn't that different from requesting the llms.txt version. Why not just...

fullstackchris · 2026-05-31T21:43:42 1780263822

How can it know which tokens not to read without reading them? And llms.txt is a single file for the whole site, so it's not the same thing.

k1m · 2026-06-01T06:35:12 1780295712

I was using llms.txt as the general idea of providing an alternative version of your content for agents - whether that's llms.txt for the entire site, my-article.md instead of my-article.html for a specific page, or via content-negotiation as your link prefers.

The content (HTML or Markdown) only becomes tokens when given to the model. Agents use parameters to limit the output from their tool calls all the time, precisely to reduce the number of tokens they have to pass to the model. So when an agent requests content for example.com/page and gets an 800KB response, those are not tokens yet. It could simply call a tool to extract the useful info before it gives the content to the model. That would effectively produce the same number of tokens as requesting example.com/page.md or example.com/page with request headers preferring markdown.
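To make the extraction step concrete, here is a minimal sketch of such a tool in Python, using only the standard library. It assumes the page marks up its main content with `<main>` or `<article>`; the class and function names (`MainTextExtractor`, `extract_useful_text`) are made up for illustration and are not from any particular agent framework.

```python
from html.parser import HTMLParser


class MainTextExtractor(HTMLParser):
    """Collects text inside <main>/<article>, skipping non-content tags.

    Assumption: the page uses semantic tags for its main content.
    """

    KEEP = {"main", "article"}
    SKIP = {"script", "style", "nav", "footer", "aside"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.keep_depth = 0   # how many KEEP elements we are inside
        self.skip_depth = 0   # how many SKIP elements we are inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.KEEP:
            self.keep_depth += 1
        elif tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.KEEP and self.keep_depth:
            self.keep_depth -= 1
        elif tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is inside main content and outside skipped tags.
        if self.keep_depth and not self.skip_depth:
            text = " ".join(data.split())
            if text:
                self.chunks.append(text)


def extract_useful_text(html: str) -> str:
    """Return the main-content text of an HTML page, one chunk per line."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

An agent would run something like `extract_useful_text(response_body)` inside the tool call and hand only the returned string to the model, so the navigation, scripts and styles in the 800KB response never become tokens.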

So why not just make sure the useful info is easily extractable from the same HTML? Less work, no content negotiation on the server side, no worrying about maintaining two similar versions of the same content.

As an aside, I've always been against content negotiation for different representations of content. So if you really must maintain two different versions of your content (HTML and Markdown, say) make them different URLs. I agree with Roy Fielding on this[1]:

> It is a bad design trade-off to send a bunch of header fields on every request just to tell the server all of the possible variations of preference held by the user, particularly when there is a very small chance that any of those dimensions are applicable to the target resource. It has been a bad design trade-off ever since the very brief period in 1993-94 when folks didn't know which image format would be usable on all UAs and there was no CSS or javascript to allow for client-side adaptation.

> ...The caching impact of proactive negotiation is far worse than the one extra round trip per site for reactive negotiation, and even that round-trip isn't necessary in formats that support client-side adaptation.

On the caching impact, see this from Simon Willison[2]:

> ...you can’t deploy an application that uses content negotiation via the Accept header behind the Cloudflare CDN — for example serving JSON or HTML for the same URL depending on the incoming Accept header. If you do, Cloudflare may serve cached JSON to an HTML client or vice-versa.
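The failure mode in that quote can be illustrated with a toy model of a cache. This is only a sketch under stated assumptions: `TinyCache`, `cache_key` and `origin` are hypothetical names, and the code does not describe how Cloudflare or any real CDN is implemented. It only shows what happens when the cache key is the URL alone versus the URL plus the `Accept` header.

```python
def cache_key(url, headers, vary=()):
    """Build a cache key from the URL plus any request headers named in vary."""
    return (url,) + tuple(headers.get(name, "") for name in vary)


class TinyCache:
    """Toy cache: keys on URL only, or on URL + Accept if honor_vary is set."""

    def __init__(self, honor_vary):
        self.honor_vary = honor_vary
        self.store = {}

    def fetch(self, url, headers, origin):
        # Normalize header names so lookups are case-insensitive.
        headers = {k.lower(): v for k, v in headers.items()}
        vary = ("accept",) if self.honor_vary else ()
        key = cache_key(url, headers, vary)
        if key not in self.store:
            self.store[key] = origin(url, headers)
        return self.store[key]


def origin(url, headers):
    """Hypothetical origin server doing content negotiation on Accept."""
    if "text/markdown" in headers.get("accept", ""):
        return "# Title"
    return "<h1>Title</h1>"


naive = TinyCache(honor_vary=False)
naive.fetch("/page", {"Accept": "text/markdown"}, origin)
# The HTML client now receives the cached Markdown body.
wrong = naive.fetch("/page", {"Accept": "text/html"}, origin)

careful = TinyCache(honor_vary=True)
careful.fetch("/page", {"Accept": "text/markdown"}, origin)
right = careful.fetch("/page", {"Accept": "text/html"}, origin)
```

With separate URLs (`/page` and `/page.md`) even the naive cache keeps the two representations apart, since the URL alone distinguishes them.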

[Edited to add: if the source of truth is already Markdown in your system, by all means expose that. What I'm discussing here is related to efforts to produce new Markdown or plain text output, in addition to HTML, specifically for agents]

[1] https://lists.w3.org/Archives/Public/ietf-http-wg/2013JanMar...

[2] https://simonwillison.net/2023/Nov/20/cloudflare-does-not-co...