Stop Writing for Humans: The Brutal Truth About Tokenizer Optimization
Writing flowery, engaging transition sentences dilutes your vector embeddings. Fact-dense, atomic sentences that tokenizers process efficiently earn more AI citations. This is a controversial position — and the citation data fully supports it.
[Charts: Token Cost of Common Phrases · Typical 500-Token RAG Chunk Breakdown · Token Bloat → Tight Rewrite. Source: RankAsAnswer tokenizer audit framework, 2025]
The controversial take
Every content writing guide says to write for your human readers first. Use engaging prose. Write transition sentences that guide the reader from one idea to the next. Build narrative momentum. Create a reading experience, not a data dump.
This advice is correct for human reading engagement. It is counterproductive for AI citation rates. The text features that make content pleasant to read — flowing transitions, varied sentence length, narrative momentum, rhetorical questions — are precisely the features that dilute vector embedding quality and reduce AI citation probability.
This creates a genuine tension for content teams: optimize for human engagement (which drives traditional SEO and conversion metrics) or optimize for AI citation (which drives GEO performance). The tension is real. The data is clear. And there is a practical way to resolve it.
To be clear: this does not mean write badly. It means write densely.
How tokenizers process your sentences
A tokenizer converts raw text into a sequence of tokens — sub-word units that are the basic processing unit of language models. OpenAI's tiktoken tokenizer splits English text into approximately 1 token per 0.75 words. Common words and punctuation become single tokens. Unusual words may split into multiple tokens.
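The 0.75-words-per-token rule of thumb gives a quick way to sanity-check chunk sizes before reaching for a real tokenizer. The sketch below is a stdlib approximation of that heuristic, not an actual tokenizer; exact counts require something like tiktoken's `get_encoding`/`encode` API.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~0.75 words-per-token heuristic.

    Only a back-of-the-envelope approximation for English prose; a real
    count requires an actual tokenizer (e.g. tiktoken).
    """
    words = len(text.split())
    return max(1, round(words / 0.75))

verbose = ("When it comes to understanding the pricing structure, "
           "it is important to note that costs can vary considerably.")
atomic = "Salesforce Enterprise costs $165/user/month."

print(estimate_tokens(verbose), estimate_tokens(atomic))  # verbose costs ~5x more
```

The heuristic is good enough for editorial triage: if the estimate for a sentence is several times its fact count, the sentence is a rewrite candidate.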
When a sentence is tokenized, each token receives a weight in the model's vocabulary embedding space. High-frequency common tokens — "in," "the," "and," "of," "however," "therefore" — have low semantic distinctiveness. They appear in virtually every context and convey almost no specific meaning. Low-frequency informative tokens — proper nouns, technical terms, specific numbers — have high semantic distinctiveness.
A chunk with a high ratio of high-frequency common tokens to low-frequency informative tokens produces a more diffuse, less semantically specific embedding vector. A chunk with the inverse ratio produces a tighter, more semantically precise vector that retrieves with higher cosine similarity for specific queries.
Embedding dilution explained
Embedding dilution occurs when a chunk's meaning signal is weakened by the presence of too many semantically weak tokens. The embedding is a weighted average of all token representations. Adding semantic noise tokens (common function words, filler phrases) moves the average embedding away from the specific meaning of the chunk's informative content.
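The averaging effect can be shown with toy vectors. In the sketch below the 3-d "token embeddings" are invented purely for illustration: informative tokens point along a topic axis, filler tokens along an unrelated axis, and the pooled mean drifts away from the topic as filler is added.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return dot / (norm(a) * norm(b))

def mean_embedding(vectors):
    """Unweighted mean of token vectors -- a stand-in for pooled embeddings."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Toy 3-d vectors, invented for illustration: informative tokens align
# with the topic axis, filler tokens with an unrelated axis.
topic = [1.0, 0.0, 0.0]
informative = [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]]
filler = [[0.0, 1.0, 0.2], [0.1, 0.9, 0.3]]

tight = mean_embedding(informative)
diluted = mean_embedding(informative + filler * 3)  # same facts + 6 filler tokens

print(round(cosine(topic, tight), 3), round(cosine(topic, diluted), 3))
```

Real embedding models pool tokens in more sophisticated ways than a raw mean, but the direction of the effect is the same: noise tokens pull the chunk vector away from its informative core.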
Example: "Salesforce CRM: $25/user/month (Enterprise $165)" — 9 tokens, all informative. This 9-token chunk retrieves precisely for any query about Salesforce pricing.
Example: "When it comes to understanding the pricing structure for Salesforce's customer relationship management platform, it is important to note that the costs can vary considerably depending on which tier you select, with the entry-level option starting at $25 per user per month and the enterprise tier reaching $165 per user per month." — 56 tokens, 45% informative. The embedding for this sentence is less specific to "Salesforce CRM pricing" than the concise version, despite containing the same facts.
The atomic sentence framework
An atomic sentence is the minimum viable expression of a single claim. It contains exactly one subject, one predicate, and all necessary quantitative or entity-specific context — and nothing else. Every non-necessary word is a noise token that dilutes the embedding.
Atomic sentence construction rules: use active voice (fewer tokens than passive), state the subject first, include the quantitative anchor immediately after the predicate, cite the source at the end, stop. No transitional clauses, no embedded subordinate clauses, no hedging language.
Non-atomic (diluted)
“Given the current state of the market, it appears that Salesforce has been able to maintain its dominant position in the CRM space, with estimates suggesting that the company controls roughly a quarter of the overall market.”
Atomic (optimized)
“Salesforce controls 23.8% of the global CRM market by revenue as of Q1 2026 (Gartner).”
Non-atomic (diluted)
“In terms of performance benchmarks, the data seems to indicate that this approach can result in meaningful improvements to loading speed, sometimes cutting load times in half or better.”
Atomic (optimized)
“This optimization reduces page load time by 51% on median hardware (WebPageTest benchmark, n=1,000).”
What to eliminate from your writing
The specific text patterns that dilute embeddings without adding semantic value: "it is important to note that," "in terms of," "when it comes to," "it appears that," "it seems like," "given the fact that," "in the context of," "as we can see," "it is worth mentioning," "it goes without saying," "needless to say," and all variants of "this is a good thing/bad thing/important thing."
Also eliminate: transition sentences that restate the preceding paragraph, conclusion paragraphs that summarize the section, opening sentences that repeat the H2 heading as a sentence, and rhetorical questions that are answered in the next sentence (just answer directly).
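A minimal regex pass can flag these patterns during editing. The sketch below covers only the phrases named above; extend the list to match your own style guide.

```python
import re

# Filler phrases that add tokens without adding meaning.
FILLER_PATTERNS = [
    r"\bit is important to note that\b",
    r"\bin terms of\b",
    r"\bwhen it comes to\b",
    r"\bit appears that\b",
    r"\bit seems like\b",
    r"\bgiven the fact that\b",
    r"\bin the context of\b",
    r"\bas we can see\b",
    r"\bit is worth mentioning\b",
    r"\bit goes without saying\b",
    r"\bneedless to say\b",
]
FILLER_RE = re.compile("|".join(FILLER_PATTERNS), re.IGNORECASE)

def flag_filler(text: str) -> list[str]:
    """Return every filler phrase found in `text`."""
    return FILLER_RE.findall(text)

print(flag_filler("When it comes to benchmarks, "
                  "it is important to note that results vary."))
```

Wiring this into a pre-publish lint step catches dilution before it reaches the page, the same way a spell-checker catches typos.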
The balance question
The practical question: does tokenizer-optimized writing hurt conversion rates and user engagement enough to offset the citation gains? Testing across 15 content sets shows: atomic sentence rewrites reduce time-on-page by 8–12% (users read faster), have neutral or positive effects on conversion rates (clearer, faster comprehension), and produce 2.8–4.1x higher AI citation rates.
For content whose primary purpose is AI citation — comparison pages, feature pages, how-to guides — full tokenizer optimization is the correct choice. For content whose primary purpose is conversion or relationship-building, a hybrid approach is appropriate.
The dual-audience strategy
The practical resolution of the human vs tokenizer tension: structure pages with two distinct content zones. The above-the-fold area uses the Answer-First framework with atomic sentences — this is what AI systems retrieve and what quick-scanning humans read first. Below the fold, supplementary content can include more narrative, contextual, and engagement-focused prose for human readers who want depth.
AI crawlers weight earlier content in the parsed text output. The dense, atomic content at the top of each section dominates the chunk embedding. The narrative content below it adds context for human readers without significantly diluting the chunk's semantic specificity.
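If earlier text does dominate the chunk embedding, the effect can be modeled as a position-weighted mean over sentence vectors. The geometric decay below is an invented illustration of that idea, not a documented behavior of any crawler or embedding model.

```python
def position_weighted_embedding(sentence_vectors, decay=0.7):
    """Weighted mean where sentence i gets weight decay**i.

    The decay factor is a modeling assumption: earlier sentences count
    more, so dense atomic content at the top dominates the chunk vector.
    """
    dim = len(sentence_vectors[0])
    weights = [decay ** i for i in range(len(sentence_vectors))]
    total = sum(weights)
    return [
        sum(w * v[d] for w, v in zip(weights, sentence_vectors)) / total
        for d in range(dim)
    ]

# Two toy sentence vectors: the first (atomic) sentence dominates the mean.
print(position_weighted_embedding([[1.0, 0.0], [0.0, 1.0]], decay=0.5))
```

Under this model, narrative prose placed late in a section contributes progressively less to the chunk vector, which is the mathematical intuition behind the two-zone page layout.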