A year ago, "ranking in AI search" meant showing up in a weird preview snippet at the top of Google results. Today it means being cited by ChatGPT when someone asks about your industry, appearing in Claude's web-browsing responses, getting quoted in Perplexity's answer engine, and surviving Google's AI Overview as a linked source rather than getting summarised and skipped. Each of those surfaces behaves differently. And collectively, they now intercept roughly 30% of informational searches before a user ever clicks a traditional blue link.
The discipline of optimising for them has a clumsy name — Generative Engine Optimisation, or GEO — and a short but growing body of evidence about what actually works. This guide covers what we have learned running SEO audits across 50+ client sites in the last year, watching which content gets cited by AI systems and which gets ignored. It is long on specifics and short on platitudes, because platitudes are everywhere and none of them are actually testable.
What Changed in 18 Months
Three architectural shifts happened in search between mid-2024 and now, and they all compound.
- Google rolled out AI Overviews globally. These AI-generated summaries now appear on roughly 30% of informational queries. They pull content from 3 to 8 cited sources — sometimes visible, sometimes absorbed into the summary without attribution.
- ChatGPT, Claude, and Gemini gained web browsing and retrieval. When you ask ChatGPT "what is the best CRM for a 10-person agency," it fetches real web pages, reads them, and writes an answer with citations. The citations behave exactly like rankings — click-through follows the order.
- Perplexity and similar answer engines became genuinely mainstream. Perplexity's monthly active users reportedly passed 15 million in late 2025. These users often skip Google entirely.
The practical takeaway is brutal for traditional SEO: a well-ranked Google position is now only one of three outcomes that matter. The article that ranks #3 on Google but never gets cited by ChatGPT is losing 40% of its potential reach. The article that ranks #8 but gets cited in every Perplexity answer on its topic is winning.

How LLMs Actually Pick What to Cite

This is the first question we get from clients, and the honest answer is: nobody outside the labs knows for sure. But we have run enough experiments — and the published literature has enough hints — that we can describe the pattern with reasonable confidence.
When an LLM answers a question by browsing or via retrieval-augmented generation, it goes through something like the following steps. First, the user's question is turned into a search query (sometimes multiple queries). Second, those queries hit a search index — usually Google or Bing under the hood — and return the top N results. Third, the model reads the pages, one at a time, and picks which sources to quote, paraphrase, or drop.
The third step is where GEO actually lives. Because the model is *reading the full page* and deciding, in effect, "is this a good source for this specific question," the properties of the page that matter are different from classical SEO. Page authority matters less. Structural clarity matters more. Specific answers to specific questions matter enormously.
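The three retrieval steps above can be sketched as a loop. A minimal sketch, assuming a lot: every helper here (`search`, `fetch`, `llm`) is a stand-in for infrastructure an answer engine would supply, not a real API.

```python
# Illustrative sketch of the browse/RAG loop described above.
# `search`, `fetch`, and `llm` are hypothetical callables standing in
# for the search index, page fetcher, and model an answer engine uses.

def answer_with_citations(question, search, fetch, llm, n_results=8):
    # Step 1: turn the user's question into one or more search queries.
    queries = llm(f"Rewrite as search queries: {question}").splitlines()

    # Step 2: hit the search index and collect the top-N candidate URLs.
    candidates = []
    for q in queries:
        candidates.extend(search(q)[:n_results])

    # Step 3: read each page (deduplicated, order preserved) and let the
    # model decide which sources to quote, paraphrase, or drop.
    sources = [(url, fetch(url)) for url in dict.fromkeys(candidates)]
    context = "\n\n".join(f"[{u}]\n{text[:4000]}" for u, text in sources)
    return llm(f"Answer '{question}' citing [url] sources:\n{context}")
```

The point of the sketch is step 3: the page text itself, not just its rank, is what the model reasons over before deciding whether to cite it.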
In practice, the following properties predict LLM citation most reliably:
- The first sentence of a relevant section directly answers the user's sub-question. Not "In this article we will explore the concept of…" Direct: "INP replaced FID as Google's responsiveness metric in March 2024."
- Specific data points — numbers, dates, percentages. "Roughly 30%" beats "a significant portion." "March 2024" beats "recently."
- Clear H2 structure — each H2 reads like an answerable question, and the paragraph below delivers the answer self-contained. The LLM reads H2 → paragraph as a discrete unit.
- Named, attributable claims — "Google's Search Quality Rater Guidelines say X" is more citable than "experts agree X." LLMs prefer sources that make their sourcing explicit.
- Reasonable recency — LLMs that see two candidate pages, one dated 2022 and one dated 2026, will usually cite the 2026 page unless the older one has overwhelmingly more authority.
The 8-Point GEO Checklist
This is what we add to every content engagement now, on top of normal SEO. Nothing in this list is optional if the content is competing for questions LLMs will be asked.
1. Open every article with a 50-word direct answer. The first paragraph should be able to stand alone as the response to the title posed as a question. LLMs frequently lift this paragraph verbatim.
2. Use question-shaped H2s. "How Does [X] Work?" outperforms "[X] Overview." The model can match query to heading directly.
3. Put one killer data point in every section. Statistics are citation bait. If you don't have original data, cite and attribute someone else's.
4. Define your terms explicitly, in the same paragraph where you first use them. "Interaction to Next Paint (INP), which measures user-perceived responsiveness…" not "INP, see our glossary."
5. Publish an llms.txt file at the root of your site (the emerging spec). It indexes your pillar content for LLM crawlers.
6. Use FAQ schema on any page with 3+ question-answer pairs. AI Overviews over-index on FAQ-schema content for long-tail queries.
7. Keep pages at stable URLs with accurate lastmod dates in your sitemap. All else equal, LLMs prefer fresh, authoritative sources over stale ones.
8. Write named-entity-heavy prose. Mention specific companies, people, products, and dates. Entity density is a strong signal of topical authority.
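Point 6 refers to schema.org's FAQPage structured data, embedded in a `<script type="application/ld+json">` tag. A minimal block looks like this; the question and answer text are placeholders to adapt to your own page:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How does Generative Engine Optimisation differ from SEO?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "GEO optimises content to be cited by LLM answer engines, whereas classical SEO optimises for ranking in a list of links."
      }
    }
  ]
}
```

Add one Question object per question-answer pair on the page, and keep the Answer text identical to the visible on-page answer.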


The llms.txt Convention
The llms.txt spec is to LLM crawlers roughly what robots.txt is to search engines. You place a Markdown file at the root of your domain (e.g. socialscript.in/llms.txt) that lists your site's most important URLs with short descriptions. Crawlers on the LLM side, including Perplexity's, ChatGPT's, and Claude's retrieval layer, can use this as a starting point for understanding your site.
We ship llms.txt on every client site now. The overhead is low, it takes under an hour to write a good one, and the upside is real: when an AI crawler summarises your site, it is far more likely to see the pages you actually care about rather than whatever random URLs Google happens to surface.
Minimum viable llms.txt:
- An H1 with your brand name
- A blockquote with a one-sentence summary
- A paragraph of 100–150 words framing what your site covers
- Sections for your main offerings, each with a Markdown link plus a one-line description: [Service Name](https://www.yourbrand.com/services/x): one-line summary
Keep it under 200 lines. Anything longer gets truncated by most crawlers. For depth, publish a companion llms-full.txt with longer descriptions.
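Putting those four pieces together, a minimal llms.txt might look like the following. The brand, URLs, and descriptions are placeholders, and the framing paragraph is shortened here for space:

```markdown
# YourBrand

> YourBrand helps service businesses get found in search and cited in AI answers.

YourBrand is a digital marketing studio focused on SEO and Generative Engine
Optimisation. The pages linked below are the canonical starting points for
understanding what we do, who we work with, and how to engage us.

## Services

- [SEO Audits](https://www.yourbrand.com/services/seo-audit): automated site audit with a prioritised fix list
- [Content Strategy](https://www.yourbrand.com/services/content): pillar-post planning and GEO-ready content briefs
```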

AI Overviews — Google's Specific Quirks

Google's AI Overview has its own behaviour worth understanding. Three patterns we have observed consistently:
- It heavily over-indexes on recency. A month-old article on a trending topic will beat a year-old industry standard page most of the time.
- It favours listicles and structured answers. 'Top 10' and 'Step 1, Step 2' formats get cited more than narrative essays, even when the narrative is better written.
- It summarises aggressively. The cited sources often get one sentence of reference and a linked icon. Click-through from AI Overviews runs about 50% lower than from classical blue-link rankings, so being cited is worth less than being the first blue link below.
The playbook that follows from this: for queries where an AI Overview already appears, aim to be cited AND to rank #1 below it. Getting cited alone is a Pyrrhic win, because users read the summary and leave. Ranking alone misses the traffic that would have clicked the summary source. Both together is the goal.
What This Means For Content Strategy
The shift has reshaped how we plan content calendars. Specifically:
- Pillar posts matter more. A 3,000-word in-depth guide on your core topic will out-earn five shallow posts. LLMs prefer comprehensive, authoritative sources.
- FAQ content is high leverage. A FAQ-heavy page that answers 20 common questions about your niche gets cited constantly — one citation per question, many questions per page.
- Listicles are not dead. The 'Top N' format survives because it maps cleanly to how LLMs structure answers. Rewrite sparingly but keep them in the mix.
- Publish a position. LLMs cite articles that take clear stances more than they cite 'balanced' think-pieces. Hedging reads as low signal.
- Re-publish old hits. If a 2022 article is getting LLM citations, update the publish date, add 3-4 new paragraphs with fresh data, and republish. The signal refresh is worth the work.

Measuring GEO Without Good Tools
Tracking is the weak link. Google Search Console shows you clicks from Google blue links but not from AI Overviews. ChatGPT, Claude, and Perplexity do not publish referrer analytics. The best signals we have found, ranked by usefulness:
- Direct user reports. "I found you through ChatGPT" in your signup form. Add a "how did you hear about us" field and check it monthly.
- Referrer analytics. Not all AI systems strip the referrer. perplexity.ai and similar sometimes send identifiable traffic; check your analytics for these referrers.
- Brand search volume. If your brand name's monthly search volume is rising faster than your content output, LLMs are probably citing you (people then Google the brand to verify).
- Ahrefs / Semrush AI-mode tracking. Both launched rough AI-visibility features in 2025; neither is accurate enough to be the primary signal yet, but both are improving.
- Direct testing. Ask the top 20 questions your customers ask, in the top 3 LLMs, once a month. Track whether you appear. Manual but accurate.
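The referrer check is the easiest of these to automate. A minimal sketch, with one loud caveat: the hostname list below is illustrative, not exhaustive, and you should verify against what actually appears in your own analytics.

```python
from urllib.parse import urlparse

# Referrer hostnames reportedly seen from AI answer engines.
# Illustrative only; audit your own logs before relying on this list.
AI_REFERRERS = {
    "perplexity.ai": "Perplexity",
    "www.perplexity.ai": "Perplexity",
    "chatgpt.com": "ChatGPT",
    "gemini.google.com": "Gemini",
}

def classify_referrer(referrer_url):
    """Return the AI engine name for a referrer URL, or None if it is
    empty or not a known AI answer-engine hostname."""
    if not referrer_url:
        return None
    host = urlparse(referrer_url).netloc.lower()
    return AI_REFERRERS.get(host)
```

Run this over your access log or analytics export monthly and chart the counts per engine; even undercounted, the trend line is useful.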
The Three Failure Modes
Before closing, here are three patterns we see that consistently underperform in LLM citation:
- Gated content. LLMs cannot crawl what they cannot see. If your best content sits behind a paywall or login wall, it will not be cited.
- Heavy client-side rendering with no server-rendered fallback. AI crawlers are getting better at JavaScript, but most still read server-rendered HTML preferentially. If your site ships an empty shell to bots, you are invisible.
- Over-optimised keyword-stuffed copy. LLMs have been trained on piles of spammy SEO content. They pattern-match it and down-weight. Ironically, writing for humans performs better than writing for SEO bots.
Run a GEO Audit on Your Site
SocialScript's free SEO audit at socialscript.in/seo-audit now checks for several GEO-adjacent signals: canonical tags, FAQ schema, llms.txt presence, Article schema, publishing freshness, heading structure, and mobile readability. It takes under 90 seconds and the report includes a prioritised fix list. If you are serious about showing up in AI answers over the next twelve months, start with that — the foundation is the same as classical SEO, and the audit will surface whatever is broken.
Generative Engine Optimisation is SEO for a world where the index is an LLM. The core discipline — writing specific, structured, authoritative, well-sourced content for humans — is unchanged. What is changing is that your readers are increasingly delegating their search to an AI, and the AI's taste is slightly different from Google's. Optimising for both at once is the work of the next three years. The agencies that figure it out first will win the next growth cycle. The agencies that wait will spend it chasing a ranking system that is already fading.


