Multi-Modal RAG: Why ChatGPT Can't Read Your Infographics
OCR is too expensive for live web crawling. Learn the Textual Shadow technique: writing hyper-dense figcaption and alt text so your visual data is indexed textually.
Why infographics fail in AI search
Infographics are among the most popular formats in content marketing: highly shareable, visually appealing, and able to communicate complex data efficiently. They are also almost completely invisible to AI answer engines.
Consider what your average marketing infographic contains: statistics, trend lines, process flows, comparison charts, and key findings — often the most citable, data-dense content on your entire site. Now consider that RAG pipelines cannot read any of this information unless you've explicitly converted it to text. Your best data is locked inside PNG files.
OCR economics: why AI systems skip your images
Optical Character Recognition (OCR), extracting text from images, is computationally expensive compared to reading existing HTML text. For a search engine crawling hundreds of millions of pages, running full OCR on every image is economically impractical.
The practical implication: if your data exists only in image form, it will never be indexed by RAG pipelines at web-crawl scale. The solution is to create a "textual shadow" of every data-dense visual asset.
The Textual Shadow technique
A "Textual Shadow" is a dense, structured text representation of a visual asset that lives in the HTML alongside the visual element. It makes all the data, statistics, and insights contained in the visual available to text-based indexing without replacing the visual element itself.
The Textual Shadow combines three elements:
Descriptive alt text
Primary text shadow. Not "infographic about AI search statistics" but "AI search intercepts 19.5% of all queries (2025), up from 3.2% in 2023; ChatGPT accounts for 42% of AI search volume; Perplexity 31%; Gemini 27%".
Detailed figcaption
Secondary text shadow. A 50–150 word paragraph that explains the infographic's key findings in complete sentences. This is the most citation-ready element because it contains full citable claims with context.
Structured data text alternative
Machine-readable shadow. For charts and graphs, a data table in the HTML that represents the same data as the visual. This creates a queryable, embeddable version of your visual data.
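Putting the three shadows together, the markup might look like the sketch below. The statistics come from the example above; the filename, wording, and table layout are illustrative, not a prescribed template:

```html
<figure>
  <!-- Primary shadow: data-dense alt text (filename is illustrative) -->
  <img src="ai-search-stats-2025.png"
       alt="AI search statistics 2025: 19.5% of all queries intercepted by
            AI answers (up from 3.2% in 2023). ChatGPT 42% market share,
            Perplexity 31%, Gemini 27%.">
  <!-- Secondary shadow: full-sentence, citation-ready summary -->
  <figcaption>
    AI answers intercepted 19.5% of all search queries in 2025, up from
    3.2% in 2023. ChatGPT accounts for 42% of AI search volume, followed
    by Perplexity at 31% and Gemini at 27%.
  </figcaption>
</figure>
<!-- Machine-readable shadow: the same data as an HTML table -->
<table>
  <caption>Share of AI search volume, 2025</caption>
  <tr><th>Engine</th><th>Share</th></tr>
  <tr><td>ChatGPT</td><td>42%</td></tr>
  <tr><td>Perplexity</td><td>31%</td></tr>
  <tr><td>Gemini</td><td>27%</td></tr>
</table>
```

Note that the three shadows deliberately repeat the same numbers: each one targets a different consumer (embedding models, citation generators, and table-aware parsers, respectively).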
Alt text strategy for data-dense images
Standard accessibility-focused alt text guidelines say to describe what's in the image. For AI citation optimization, you should describe what's meaningful about the data in the image — the key statistics, trends, and findings that a human would cite if they were summarizing the infographic in text.
Standard alt text (AI-invisible)
"Infographic showing AI search statistics for 2025"
No data extractable. Matches only very broad queries.
Data-dense alt text (AI-optimized)
"AI search statistics 2025: 19.5% of all queries intercepted by AI answers (up from 3.2% in 2023). ChatGPT 42% market share, Perplexity 31%, Gemini 27%. B2B queries intercepted at 34% rate vs 12% for consumer queries."
6 specific statistics. Matches dozens of specific queries.
figcaption implementation guide
The <figcaption> element inside a <figure> block is semantically associated with the image by the HTML spec. Trafilatura and similar parsers preserve figcaption content specifically because of this semantic relationship. It's the highest-preservation text element adjacent to an image.
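To illustrate why figcaption text survives extraction, here is a minimal sketch using Python's standard html.parser rather than Trafilatura itself; any HTML-aware parser can isolate the caption text this trivially:

```python
from html.parser import HTMLParser

class FigcaptionExtractor(HTMLParser):
    """Collects the text inside <figcaption> elements, ignoring all other markup."""
    def __init__(self):
        super().__init__()
        self.in_figcaption = False
        self.captions = []

    def handle_starttag(self, tag, attrs):
        if tag == "figcaption":
            self.in_figcaption = True
            self.captions.append("")

    def handle_endtag(self, tag):
        if tag == "figcaption":
            self.in_figcaption = False

    def handle_data(self, data):
        # Only text encountered between <figcaption> and </figcaption> is kept
        if self.in_figcaption:
            self.captions[-1] += data

html = """
<figure>
  <img src="chart.png" alt="AI search statistics 2025">
  <figcaption>AI answers intercepted 19.5% of queries in 2025.</figcaption>
</figure>
"""

parser = FigcaptionExtractor()
parser.feed(html)
print(parser.captions[0].strip())
# -> AI answers intercepted 19.5% of queries in 2025.
```

Because the caption is ordinary element text tied to the figure by the spec, no OCR, rendering, or image fetch is needed to recover it.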
Before and after: measured citation impact
In a controlled test across 120 infographic pages, adding data-dense alt text and detailed figcaptions produced the following average improvements in AI citation rates over 60 days: