What GPTBot sees before your React app hydrates

Lighthouse scores the hydrated DOM. Many AI crawlers stop at the first HTML response.

You ship a React app. The page loads, the headings show up, Lighthouse gives you a green SEO score. Looks fine.

Now fetch the same URL the way a lot of bots do — no JavaScript, just the first HTTP response. Often you get a shell: <div id="root"></div>, a couple of script tags, maybe a title. The copy your users actually read never appeared in that response.

That split — raw HTML vs Live DOM — is what we call DOM drift. Your app can look perfect in Chrome and still ship an empty document to anything that stops at step one.

Figure: Same URL, two documents. Hydrated Live DOM above, raw initial HTML below.

The empty `#root` problem

Most SPAs follow the same sequence: serve minimal HTML, download JS, mount into #root, and only then show content. Steps three and four happen after the initial HTML has already gone over the wire. GPTBot, ClaudeBot, and plenty of other agents fetch the URL, respect robots.txt, and parse what came back, which is often much closer to that first HTML than to the tree you inspect in DevTools.

Crawler behavior is not uniform. Googlebot renders JavaScript for indexing (with limits and delays). Many AI fetchers do not run your bundle at all, or run far less of it. The practical takeaway for a dev shipping a CSR app: if your <h1> only exists after hydration, do not assume every agent saw it.

Typical CSR response:

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Acme — AI-native analytics</title>
  </head>
  <body>
    <div id="root"></div>
    <script src="/assets/main.js"></script>
  </body>
</html>

After hydration:

<main>
  <h1>AI-native analytics for product teams</h1>
  <p>Track agent-readability alongside human UX…</p>
</main>

Users never notice. Lighthouse might not either — it scores the page once your framework has run. An audit that compares both surfaces will.

You see this on Vite + React SPAs, Vue CLI apps, and client-only Next.js routes. SSR helps, but a client-only wrapper on one route brings the gap back. Analytics look fine, the SEO category passes, and curl against the marketing site still returns an empty shell.

When to care: pricing tables, docs headings, anything you'd be embarrassed to paste from curl into a ticket. If the value proposition lives in JS, it probably isn't in the initial HTML.
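To see the first surface for yourself, a rough sketch like the one below fetches the initial response without running any JavaScript and summarizes what text is actually in it. It assumes a runtime with a global fetch (Node 18+ or a browser). The function names and the user-agent string are made up for illustration, and the regex stripping is a crude stand-in for a real HTML parser, not a description of how any particular crawler works.

```typescript
// Hypothetical helper: fetch only the first HTTP response body, no JS execution.
// Assumes a global fetch (Node 18+ or a browser). The user agent is a placeholder.
async function fetchInitialHtml(url: string, userAgent = "example-bot/0.1"): Promise<string> {
  const res = await fetch(url, { headers: { "User-Agent": userAgent } });
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  return res.text();
}

interface InitialHtmlSummary {
  title: string | null;
  scriptCount: number;
  visibleText: string;
}

// Rough, regex-based summary of what the raw response contains.
// Entities are not decoded and malformed markup is not handled.
function summarizeInitialHtml(html: string): InitialHtmlSummary {
  const titleMatch = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  const scriptCount = (html.match(/<script\b/gi) ?? []).length;
  const bodyMatch = html.match(/<body[^>]*>([\s\S]*?)<\/body>/i);
  const body = bodyMatch ? bodyMatch[1] : html;
  const visibleText = body
    .replace(/<(script|style|template)\b[\s\S]*?<\/\1>/gi, " ") // drop non-visible blocks
    .replace(/<[^>]+>/g, " ")                                    // drop remaining tags
    .replace(/\s+/g, " ")
    .trim();
  return { title: titleMatch ? titleMatch[1].trim() : null, scriptCount, visibleText };
}
```

For the CSR response shown earlier, `visibleText` is an empty string and `scriptCount` is 1. Something like `summarizeInitialHtml(await fetchInitialHtml(url))` gives the same summary for a live route.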

Wrong tool for the job

When “SEO” comes up, teams reach for tools they already have. Each one is useful. None of them diff raw HTML against the hydrated DOM.

Tool	What it actually tells you
Lighthouse	Performance, human a11y, basic meta — on the hydrated page
Cloud GEO dashboards	Whether ChatGPT cited your brand this week
Site crawlers	Rankings, links, campaigns across the domain

Lighthouse is great at what it does. Ahrefs and Semrush are great at theirs. BotScore checks something else: llms.txt, AI bot robots rules, landmarks, JSON-LD, and whether the HTML that shipped over the wire matches what Chrome shows after your bundle runs.

If you want to know whether ChatGPT mentioned you yesterday, get a monitoring tool. If you want to fix what GPTBot can read on this tab before you merge, you need a local audit that mirrors crawler constraints.


Side-by-side: raw HTML vs Live DOM

BotScore runs in the browser on the active tab. For DOM drift it fetches initial HTML for the same URL and diffs visible text and landmarks against the Live DOM after hydration.

Try it:

Open a CSR-heavy page in Firefox (Chrome listing pending as of June 2026 — botscore.io has current store links).
Open the side panel.
Toggle Raw HTML and Live DOM.
Look for a missing <h1>, an empty <main>, body copy that only shows up hydrated.
Check Machine Readability in the breakdown — DOM drift failures land there.

In the split view, ask four questions:

Is there an <h1> in raw HTML, or only after JS?
Does <main> exist in the first response, or is everything inside #root with no semantic shell?
Pick a paragraph users can read — same text in the raw panel?
Primary nav links — real <a href> in HTML, or injected client-side?
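The four questions can also be asked of a raw HTML string in code. This is a minimal sketch, not BotScore's implementation: `answerFourQuestions` and its return shape are hypothetical names, and the regexes will miscount on unusual markup (for example, tags that appear inside inline scripts).

```typescript
interface RawHtmlAnswers {
  hasH1: boolean;         // is there an <h1> in the raw HTML?
  hasMain: boolean;       // is there a <main> in the raw HTML?
  hasSampleText: boolean; // does a user-visible paragraph appear in the raw HTML?
  navLinkCount: number;   // <a href> links inside <nav> in the raw HTML
}

// Hypothetical check over the initial HTML only. Regexes stand in for a parser.
function answerFourQuestions(rawHtml: string, sampleText: string): RawHtmlAnswers {
  const normalize = (s: string): string => s.replace(/\s+/g, " ").trim().toLowerCase();
  const textOnly = normalize(
    rawHtml
      .replace(/<(script|style|template)\b[\s\S]*?<\/\1>/gi, " ")
      .replace(/<[^>]+>/g, " ")
  );
  const sample = normalize(sampleText);
  // Count anchors that carry an href inside each <nav>...</nav> block.
  const navBlocks = rawHtml.match(/<nav\b[\s\S]*?<\/nav>/gi) ?? [];
  let navLinkCount = 0;
  for (const nav of navBlocks) {
    navLinkCount += (nav.match(/<a\b[^>]*\bhref\s*=/gi) ?? []).length;
  }
  return {
    hasH1: /<h1\b/i.test(rawHtml),
    hasMain: /<main\b/i.test(rawHtml),
    hasSampleText: sample.length > 0 && textOnly.includes(sample),
    navLinkCount,
  };
}
```

Pass it the raw response and a sentence copied from the rendered page. An empty shell answers false, false, false, and zero.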

A page can pass three Lighthouse SEO checks and still fail all four of these questions. That's the point: this is not another score but a literal diff between what shipped and what Chrome shows.

Figure: Toggle Raw HTML and Live DOM in the extension side panel. The raw view highlights the missing h1.

SEM-DOM-DELTA

We encode this as rule SEM-DOM-DELTA:

Fetch initial HTML for the active URL.
Compare visible text and landmarks (h1, main, key regions) to the hydrated DOM.
Fail when roughly 40% or more of the visible text is client-only, or when structural landmarks are missing from the initial HTML.
Warn when roughly 15% or more of the visible text is client-only.
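The thresholds above can be sketched as a small function. Only the 40% and 15% cutoffs and the landmark condition come from the rule. How client-only text is measured is not specified here, so the word-level comparison below is an assumption, as are all the names and the reading of "missing landmark" as "present after hydration but absent from the initial HTML".

```typescript
type DeltaVerdict = "pass" | "warn" | "fail";

interface DeltaInput {
  rawText: string;         // visible text extracted from the initial HTML
  liveText: string;        // visible text extracted from the hydrated DOM
  rawLandmarks: string[];  // landmarks found in the initial HTML, e.g. ["h1", "main"]
  liveLandmarks: string[]; // landmarks found after hydration
}

// Cutoffs from the rule description (approximate in the source).
const FAIL_RATIO = 0.4;
const WARN_RATIO = 0.15;

// Assumption: ASCII word tokens are a good enough proxy for "text".
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Fraction of live-DOM words that never appear in the raw HTML text.
function clientOnlyRatio(rawText: string, liveText: string): number {
  const live = tokenize(liveText);
  if (live.length === 0) return 0;
  const raw = new Set(tokenize(rawText));
  const missing = live.filter((word) => !raw.has(word)).length;
  return missing / live.length;
}

function evaluateDomDelta(input: DeltaInput): {
  verdict: DeltaVerdict;
  ratio: number;
  missingLandmarks: string[];
} {
  const ratio = clientOnlyRatio(input.rawText, input.liveText);
  const rawSet = new Set(input.rawLandmarks);
  const missingLandmarks = input.liveLandmarks.filter((name) => !rawSet.has(name));
  let verdict: DeltaVerdict = "pass";
  if (ratio >= FAIL_RATIO || missingLandmarks.length > 0) verdict = "fail";
  else if (ratio >= WARN_RATIO) verdict = "warn";
  return { verdict, ratio, missingLandmarks };
}
```

A set-based word comparison ignores word order and repetition, so it understates drift when the raw HTML happens to contain the same vocabulary in a different place. A production rule would likely compare text per region instead.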

Fixes depend on your stack: SSR or SSG for headings and body copy, pre-render critical routes at build time, or move <h1> / <main> into the shell even if styling still hydrates client-side. SPAs are fine for humans. Agent-readable HTML is a separate deliverable from “works in Chrome after JS runs.”

Discoverability checks that pair with DOM drift

DOM drift is the loud failure on CSR sites. Agent-readability is wider — BotScore runs 27 rules across five categories. A few that show up alongside drift:

Check	Why it matters
llms.txt / llms-full.txt	Machine-readable site summary for agents
AI bot robots	Whether GPTBot / ClaudeBot may fetch your URLs
Landmarks	`<h1>`, `<main>`, heading hierarchy without CSS layout
JSON-LD	Structured context for answer engines

Example on our site: botscore.io/llms.txt

Figure: robots.txt AI bot rules and an llms.txt excerpt.
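The AI bot robots check in the table comes down to reading robots.txt the way a named agent would. The sketch below is a simplified reading of the rules, assuming exact user-agent matching, longest-path-wins with Allow winning ties, and no support for `*` or `$` wildcards in paths. `parseRobots` and `isAllowed` are illustrative names, not BotScore's API.

```typescript
interface RobotsRule {
  allow: boolean;
  path: string;
}

interface RobotsGroup {
  agents: string[];
  rules: RobotsRule[];
}

// Simplified parser: groups consecutive User-agent lines with the rules that follow.
function parseRobots(txt: string): RobotsGroup[] {
  const groups: RobotsGroup[] = [];
  let current: RobotsGroup | null = null;
  let lastWasAgent = false;
  for (const rawLine of txt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // strip comments
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (key === "user-agent") {
      if (!current || !lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if ((key === "allow" || key === "disallow") && current) {
      // An empty Disallow blocks nothing, so empty paths are skipped.
      if (value !== "") current.rules.push({ allow: key === "allow", path: value });
      lastWasAgent = false;
    }
  }
  return groups;
}

// True when the named bot may fetch the path under the simplified rules above.
function isAllowed(robotsTxt: string, bot: string, path: string): boolean {
  const groups = parseRobots(robotsTxt);
  const name = bot.toLowerCase();
  const specific = groups.filter((g) => g.agents.indexOf(name) !== -1);
  const chosen = specific.length > 0 ? specific : groups.filter((g) => g.agents.indexOf("*") !== -1);
  let best: RobotsRule | null = null;
  for (const group of chosen) {
    for (const rule of group.rules) {
      if (!path.startsWith(rule.path)) continue;
      const longer = best === null || rule.path.length > best.path.length;
      const tieAllow = best !== null && rule.path.length === best.path.length && rule.allow;
      if (longer || tieAllow) best = rule;
    }
  }
  return best === null ? true : best.allow;
}
```

With a file that contains `User-agent: GPTBot` followed by `Disallow: /`, `isAllowed(file, "GPTBot", "/pricing")` returns false, while an agent that only matches a permissive `*` group is still allowed.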

Follow-ups later: an llms.txt checklist and JSON-LD for SaaS landing pages. This piece stays on the CSR gap.

Score, fix, re-audit

The workflow is tab-native — audit the page you're editing, not a crawl report that lands next week.

Install from botscore.io (Firefox live; Chrome in review).
Run an audit — 27 rules, 0–100 GEO score, five categories.
Read on-page highlights on failing elements.
Copy Agent Brief — Markdown for a ticket or PR.
Fix — SSR, meta, llms.txt, robots, landmark HTML.
Re-audit in the panel.

Figure: Copy an Agent Brief for tickets, PRs, or docs. The copy button sits above the issues list in the extension side panel.

Free: full audit, GEO score, Agent Brief, highlights, CSV — no account. Pro ($19/mo · $189/yr): PDF export, bulk audits (eight tabs), saved custom rules. License check hits our server; audit content does not. No Team tier at launch. This is not a citation-tracking dashboard.

Privacy

Audits run locally in your browser. Page content is not uploaded for scoring. Pro validates your license key over the network; that is the only request that reaches BotScore servers. The checks fetch initial HTML, robots.txt, and llms.txt from the site you're auditing (same origin as that page, not an upload to us). Snapshots stay in browser storage.

Full policy: botscore.io/privacy

The extension is not open source. We do not claim zero network traffic: license validation uses the network. We do claim zero telemetry on audit content.

Next steps

Install from botscore.io — Firefox Add-ons is live; Chrome link goes up when Google approves the listing.

If you ship SPAs, treat initial HTML as a first-class artifact. When your <h1> only appears after createRoot, assume a chunk of agents never saw it. Run raw vs live on the tab you're about to ship — locally, before the PR merges.

Planned next: llms.txt checklist — what belongs in /llms.txt, how it differs from llms-full.txt, validating both without a cloud crawl.

Updates: @BotScore_io