Content Strategy

Write for the chunk: how AI engines actually read your site

AI engines do not read your page; they read chunks. Here is how retrieval splits the web into passages, why one paragraph gets quoted while the rest is ignored, and how to write answer-first, self-contained, entity-named passages that get cited.

June 16, 2026 · 6 min read · Tripcite

AI engines do not read your page. They read chunks. A live-retrieval engine never loads your beautifully designed page top to bottom and weighs it as a whole. It splits the web into passages, scores each one against the question, and quotes the single passage that wins, read in isolation from everything around it. If you want to be cited, you do not optimize the page. You optimize the chunk.

What a chunk actually is

When a retrieval engine like Perplexity, ChatGPT with web search, Gemini or Google AI Overviews answers a question, it runs a pipeline that looks roughly like this:

  • Split: your page is broken into passages, often a few sentences to a paragraph each.
  • Embed: each passage is turned into a vector, a numerical representation of its meaning.
  • Match: the question is embedded the same way, and the engine retrieves the passages whose vectors are closest to it.
  • Quote: the model writes an answer grounded in the top passages and cites the source.

The unit of competition is the passage, not the URL. One paragraph on your page can win the citation while the rest of the page is never seen. That is the whole game, and it changes how you should write.
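The split, embed and match steps can be sketched in a few lines of Python. This is a toy illustration, not how any particular engine works: it assumes passages are blank-line-separated paragraphs, and it stands in a simple word-count vector for the learned embeddings real engines use. The function names (`split_into_passages`, `embed`, `best_passage`) are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_passages(page_text):
    """Split: treat each blank-line-separated paragraph as one passage (an assumption)."""
    return [p.strip() for p in re.split(r"\n\s*\n", page_text) if p.strip()]


def embed(text):
    """Embed: a toy word-count vector. Real engines use learned dense embeddings."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_passage(page_text, question):
    """Match: score every passage against the question and return (score, passage) for the winner."""
    q = embed(question)
    scored = [(cosine(embed(p), q), p) for p in split_into_passages(page_text)]
    return max(scored, key=lambda pair: pair[0])


page = "\n\n".join([
    "There are a lot of factors to consider when choosing a solution. "
    "After weighing them, many teams find that our platform offers the best balance.",
    "Tripcite measures how often ChatGPT, Perplexity, Gemini and Claude "
    "cite your brand versus competitors.",
])

score, passage = best_passage(page, "How often does ChatGPT cite my brand?")
print(passage)
```

Each passage is scored on its own, so the first paragraph contributes nothing to the second one's score. That is the point of the sketch: the page never competes as a whole.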

Why most pages lose the chunk

A page can rank beautifully on Google and still never get quoted by an AI, because the two systems reward different things. Classic SEO rewards the page. Retrieval rewards the passage. The passages that lose tend to share three problems:

  • The answer is buried. The useful sentence sits three paragraphs into a section, after setup and throat-clearing. The retriever scores the passage on the question, and a passage that opens with context instead of the answer matches worse.
  • The passage cannot stand alone. It depends on the heading above it, the sentence before it, or a pronoun whose antecedent is two paragraphs back. Read in isolation, it means nothing, so the model cannot safely quote it.
  • The entity is missing. The passage says “our platform” or “the tool” instead of naming the brand. When the model lifts it, your name does not come with it, and an unnamed mention is not a citation.

The three rules of a citable chunk

Write every passage you care about so it survives being read on its own:

  • Answer-first. Put the answer in the first sentence, then explain. Lead with the claim, the number, the verdict. A passage that opens with the answer matches the question more closely than one that opens with setup.
  • Self-contained. The passage should make complete sense with zero surrounding context. No orphan pronouns, no “as mentioned above,” no dependency on the heading. Assume it will be read alone, because it will be.
  • Entity-named. Say the brand name explicitly, not “we” or “our product.” If a competitor names themselves clearly in their passage and you do not, the model can describe them and not you, and it quietly recommends the one it can explain.

Formats that are already shaped like a self-contained answer get quoted disproportionately: FAQ entries, comparison tables, “X vs Y” sections, and short definitional paragraphs. They are answer-first and standalone by construction, which is exactly what a retriever is looking for.

A quick before and after

Before (buried, dependent, unnamed): “There are a lot of factors to consider when choosing a solution. After weighing them, many teams find that our platform offers the best balance of price and features for their needs.”

After (answer-first, self-contained, entity-named): “Tripcite measures how often ChatGPT, Perplexity, Gemini and Claude cite your brand versus competitors, reported as a share-of-model percentage by engine and query.”

The second version can be lifted into an answer verbatim, and your name rides along with it.
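The three rules can be turned into a rough automated check. The sketch below is a heuristic lint, not a model of how any engine scores passages: the phrase lists are hypothetical starting points, and the answer-first test is only a proxy that flags common filler openers. It is run here on the two passages from this section.

```python
import re

# Hypothetical phrase lists; tune them for your own content.
FILLER_OPENERS = ("there are", "when it comes to", "in today's")
DEPENDENT_PHRASES = ("as mentioned above", "as noted earlier", "see above")
ORPHAN_OPENERS = ("it", "this", "that", "they", "these", "those")
GENERIC_ENTITY_PHRASES = ("our platform", "our product", "the tool")


def lint_passage(passage, brand):
    """Return a list of warnings for one passage, checked in isolation."""
    warnings = []
    lowered = passage.lower().lstrip("“\"' ")
    words = re.findall(r"[a-z']+", lowered)

    # Answer-first (rough proxy): flag passages that open with filler.
    for opener in FILLER_OPENERS:
        if lowered.startswith(opener):
            warnings.append(f"opens with filler: '{opener}'")

    # Self-contained: flag back-references and a pronoun as the first word.
    for phrase in DEPENDENT_PHRASES:
        if phrase in lowered:
            warnings.append(f"depends on context: '{phrase}'")
    if words and words[0] in ORPHAN_OPENERS:
        warnings.append(f"opens with a pronoun: '{words[0]}'")

    # Entity-named: the brand must appear by name, not as a generic reference.
    if brand.lower() not in lowered:
        warnings.append(f"brand '{brand}' is not named")
    for phrase in GENERIC_ENTITY_PHRASES:
        if phrase in lowered:
            warnings.append(f"generic reference: '{phrase}'")

    return warnings


before = ("There are a lot of factors to consider when choosing a solution. "
          "After weighing them, many teams find that our platform offers the "
          "best balance of price and features for their needs.")
after = ("Tripcite measures how often ChatGPT, Perplexity, Gemini and Claude "
         "cite your brand versus competitors, reported as a share-of-model "
         "percentage by engine and query.")

for label, text in (("before", before), ("after", after)):
    print(label, lint_passage(text, "Tripcite"))
```

A clean result does not mean a passage will be cited. It only means the passage avoids the three failure modes this check knows how to spot.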

The chunk is one lever, not the whole game

Writing for the chunk wins you live-retrieval engines, the ones reading the web right now. It does not, by itself, win the engines answering from memory with no web access, which name brands they already know from training. That is a separate lever, entity authority, and the two work together. We cover the full picture in “What is Generative Engine Optimization (GEO)?” and the strategic split in “SEO vs GEO: what changes, and how to attack each.”

Start by seeing which chunks already win

Before you rewrite anything, find out what the engines quote about your category today, and whether it is you or a competitor. That baseline is your share of model. See how to measure it in “How to measure your AI search visibility,” or look at two real, public audits we ran: MyCantera and Odisea Tours. Tripcite tracks it across ChatGPT, Perplexity, Gemini and Claude, and shows you the exact citation sources behind every answer, so you know which chunks to win.
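As a rough illustration of the arithmetic, share of model can be computed from a log of answers. The sketch below assumes one simple definition, the percentage of an engine's answers that cite the brand; that definition, the `share_of_model` function, the record layout and the sample data are all hypothetical and are not a description of Tripcite's method.

```python
from collections import defaultdict


def share_of_model(answers, brand):
    """Return {engine: percent of that engine's answers that cite `brand`}.

    `answers` is an iterable of (engine, query, cited_brands) records.
    Assumed definition: share = answers citing the brand / all answers, per engine.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for engine, _query, cited_brands in answers:
        totals[engine] += 1
        if brand in cited_brands:
            hits[engine] += 1
    return {engine: 100.0 * hits[engine] / totals[engine] for engine in totals}


# Made-up records, for illustration only.
answers = [
    ("ChatGPT", "best tour operator", {"BrandA", "BrandB"}),
    ("ChatGPT", "tour operator pricing", {"BrandB"}),
    ("Perplexity", "best tour operator", {"BrandA"}),
    ("Perplexity", "tour operator pricing", {"BrandA", "BrandB"}),
]

shares = share_of_model(answers, "BrandA")
for engine, percent in shares.items():
    print(f"{engine}: {percent:.0f}%")
```

Grouping by `(engine, query)` instead of `engine` alone would give the per-query breakdown, which shows which specific questions a competitor is winning.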
