Extract inline CSS into <link> tags for LLM crawlability #3359

Merged
kennethkalmer merged 1 commit into main from extract-css-from-body
May 1, 2026

Conversation

@kennethkalmer
Member

@kennethkalmer kennethkalmer commented Apr 29, 2026

Description

Related: DX-537

LLM/AI crawlers consistently bail before reaching meaningful content when the document <head> is dominated by inline CSS. Gatsby's default behaviour inlines every chunk of route CSS as <style data-href="…">…</style> — on /docs/basics that's ~241 KB before the <body> even starts.

This PR adds an onPreRenderHTML hook in gatsby-ssr.tsx that swaps each Gatsby-emitted <style data-href="…"> for a <link rel="stylesheet" href="…"> pointing at the same already-emitted CSS file in public/. Production-only — dev keeps inline styles for HMR. Sandpack's inline <style> (no data-href) is left untouched.

This mirrors the same workaround already running in apps/voltaire, originally added there to lower Total Blocking Time. Same mechanism, different motivation here (crawlability rather than TBT). Reference: gatsbyjs/gatsby#1526.
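
The swap described above can be sketched without any Gatsby or React dependency. This is a minimal illustration, not the merged code: `HeadNode`, `styleToLink`, and `transformHead` are hypothetical names, and head components are modelled as plain objects rather than React elements.

```typescript
// Simplified stand-in for a React head element (assumption: the real hook
// receives React nodes from Gatsby, not plain objects like these).
type HeadNode = { type: string; props?: Record<string, string | undefined> };

// Swap a <style data-href="..."> node for a <link rel="stylesheet"> node.
// Nodes without data-href (for example Sandpack's inline <style>) are
// returned unchanged.
function styleToLink(node: HeadNode): HeadNode {
  const href = node.type === 'style' ? node.props?.['data-href'] : undefined;
  if (href === undefined) {
    return node;
  }
  return { type: 'link', props: { rel: 'stylesheet', href } };
}

// Production-only: outside production the head is returned as-is,
// so dev keeps its inline styles for HMR.
function transformHead(head: HeadNode[], nodeEnv: string): HeadNode[] {
  return nodeEnv === 'production' ? head.map(styleToLink) : head;
}

const head: HeadNode[] = [
  { type: 'meta', props: { charset: 'utf-8' } },
  { type: 'style', props: { 'data-href': '/styles.abc123.css' } }, // hypothetical hash
  { type: 'style', props: { id: 'sandpack' } },
];

const prodHead = transformHead(head, 'production');
const devHead = transformHead(head, 'development');
```

In a real `gatsby-ssr.tsx`, the equivalent step would presumably read the nodes with `getHeadComponents()` and hand the mapped result to `replaceHeadComponents()`, both of which Gatsby passes to `onPreRenderHTML`.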

Impact (measured on /docs/basics)

| Metric | Before (live) | After (this PR) |
|---|---|---|
| <style data-href> tags | 1 (large) | 0 |
| Inline CSS bytes | 241,138 | 4,321 |
| <head> size | ~247 KB | ~8 KB |
| Byte offset of <body> | 246,849 | 8,250 |

Visual rendering is unchanged — the same CSS still loads, just via a <link>.

Checklist

Test plan

  • yarn build succeeds
  • Built public/docs/basics/index.html contains zero <style data-href> and a <link rel="stylesheet" href="/styles.<hash>.css">
  • Sandpack <style id="sandpack"> block still present
  • yarn serve + Chrome DevTools — page renders identically to live ably.com/docs/basics
  • Reviewer: spot-check a page with code samples (Sandpack) and a page with the dark-themed nav to confirm no FOUC
  • Reviewer: confirm no regression in dev mode (yarn develop) — the hook is a no-op outside production
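
The checks on the built HTML in the test plan could be scripted roughly as follows. This is a sketch under stated assumptions: `inspectHtml` and `HeadReport` are hypothetical names, the regular expressions are a loose approximation rather than a full HTML parser, and the sample strings stand in for the contents of `public/docs/basics/index.html`.

```typescript
type HeadReport = {
  inlinedChunks: number;      // number of <style ... data-href=...> tags
  hasStylesheetLink: boolean; // at least one <link ... rel="stylesheet">
  bodyByteOffset: number;     // UTF-8 byte offset of "<body", or -1 if absent
};

function inspectHtml(html: string): HeadReport {
  const inlinedChunks = (html.match(/<style[^>]*\sdata-href=/g) ?? []).length;
  const hasStylesheetLink = /<link[^>]*\srel="stylesheet"/.test(html);
  const bodyIndex = html.indexOf('<body');
  // Measure bytes rather than UTF-16 code units, since the PR reports byte offsets.
  const bodyByteOffset =
    bodyIndex === -1 ? -1 : new TextEncoder().encode(html.slice(0, bodyIndex)).length;
  return { inlinedChunks, hasStylesheetLink, bodyByteOffset };
}

// Hypothetical "after" output: one stylesheet link plus the Sandpack block.
const afterHtml =
  '<html><head><link rel="stylesheet" href="/styles.abc123.css"/>' +
  '<style id="sandpack">p{}</style></head><body></body></html>';

// Hypothetical "before" output: the CSS chunk is inlined in the head.
const beforeHtml =
  '<html><head><style data-href="/styles.abc123.css">a{}</style></head><body></body></html>';

const afterReport = inspectHtml(afterHtml);
const beforeReport = inspectHtml(beforeHtml);
```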

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Improvements
    • Enhanced production build CSS handling by automatically converting inline styles to external stylesheets while preserving styling integrity.
    • Optimized CSS cascade order to ensure correct styling precedence in rendered HTML.

@coderabbitai

coderabbitai Bot commented Apr 29, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6d61d521-d6ae-432d-8973-e07cfd91b829

Walkthrough

Updated SSR configuration in gatsby-ssr.tsx to strengthen typing with the GatsbySSR interface, added a production-only onPreRenderHTML hook that transforms inlined <style> elements into stylesheet links, and enforces CSS cascade order by prioritizing global stylesheets in the head.

Changes

| Cohort / File(s) | Summary |
|---|---|
| SSR Hook Enhancement: gatsby-ssr.tsx | Added onPreRenderHTML hook with production-only style transformation logic that converts inlined <style> elements to <link> tags, sorts global stylesheets for proper cascade order, and strengthened typing for both onRenderBody and new onPreRenderHTML exports. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 ✨ Stylesheets now shine, transformed with care,
Inlined styles swap for links in the air,
Global cascades sorted, just right,
Production HTML, pristine and bright! 🎨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: extracting inline CSS into link tags to improve LLM crawler crawlability, which matches the primary objective of the pull request. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


@kennethkalmer kennethkalmer added the review-app Create a Heroku review app label Apr 29, 2026
@kennethkalmer kennethkalmer self-assigned this Apr 29, 2026
@ably-ci ably-ci temporarily deployed to ably-docs-extract-css-f-gimcw9 April 29, 2026 12:24 Inactive
@kennethkalmer
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 29, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
gatsby-ssr.tsx (1)

52-63: Remove any from the type guard and model href as optional.

At Lines 60-63, the lint suppression and any-based guard reduce TS safety; also at Line 56, href is typed as required even though style nodes (e.g., Sandpack) may not have it.

Proposed refactor
 type StyleComponent = {
   type: 'style';
   props?: {
     'data-href'?: string;
-    href: string;
+    href?: string;
   };
 };
 
-/* eslint-disable @typescript-eslint/no-explicit-any */
-const isStyleComponent = (node: any): node is StyleComponent =>
-  typeof node === 'object' && node != null && (node as any).type === 'style';
-/* eslint-enable @typescript-eslint/no-explicit-any */
+const isStyleComponent = (node: React.ReactNode): node is StyleComponent =>
+  typeof node === 'object' &&
+  node != null &&
+  'type' in node &&
+  (node as { type?: unknown }).type === 'style';

As per coding guidelines, **/*.{ts,tsx}: Avoid using 'any' type in TypeScript; use specific types instead.
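
A dependency-free variant of the guard can be sketched with `unknown` in place of `React.ReactNode`. This is an illustration of the idea only; the `StyleComponent` shape follows the proposed refactor, and the sample values are made up.

```typescript
type StyleComponent = {
  type: 'style';
  props?: {
    'data-href'?: string;
    href?: string; // optional: Sandpack-style nodes carry no href
  };
};

// Narrow an unknown head node without using `any`.
const isStyleComponent = (node: unknown): node is StyleComponent =>
  typeof node === 'object' &&
  node !== null &&
  (node as { type?: unknown }).type === 'style';

const candidates: unknown[] = [
  { type: 'style', props: { 'data-href': '/styles.abc123.css' } }, // hypothetical hash
  { type: 'style', props: { id: 'sandpack' } },
  { type: 'link' },
  'text',
  null,
];

const results = candidates.map((c) => isStyleComponent(c));
```

Only the first two candidates satisfy the guard; whether a style node is worth extracting is a separate question that depends on it having a `data-href` or `href`.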

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gatsby-ssr.tsx` around lines 52 - 63, Change StyleComponent.props.href to be
optional and replace the `any` in the type guard with a safer type (use unknown
or a DOM/ReactNode union) so TypeScript can narrow correctly; update
isStyleComponent signature from (node: any) to (node: unknown): node is
StyleComponent, remove the eslint-disable comments, and adjust the runtime
checks inside isStyleComponent to first ensure node is an object/non-null and
then check (node as Partial<StyleComponent>).type === 'style' and optional
props/href presence accordingly so no use of `any` remains and href is optional
on StyleComponent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gatsby-ssr.tsx`:
- Around line 83-92: The comparator passed to Array.sort currently takes a
single parameter and ignores the second, breaking the comparator contract;
change the sort callback to compare two nodes (e.g., .sort((a, b) =>
Number(isGlobalStyleNode(b)) - Number(isGlobalStyleNode(a)))) and extract the
style-detection logic into a helper isGlobalStyleNode that uses the existing
isStyleComponent and the regex on data-href/href; also update the type guard
signature from (node: any) to (node: unknown) in isStyleComponent (or related
type-guard function) to remove the any usage and improve type safety.
- Line 98: The returned link element in gatsby-ssr.tsx currently includes
media="screen", which restricts the stylesheet to screen media; remove the
media="screen" attribute from the JSX return in the function that maps/returns
style links (the line returning <link key={styleHref} href={styleHref}
rel="stylesheet" type="text/css" ... />) so the linked stylesheet inherits the
default (all media) behavior and preserves original styling scope.
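
The comparator fix in the first item can be illustrated in isolation. This is a sketch: `StyleNode` and `globalFirst` are hypothetical names, the predicate is reduced to a regex on a plain `href` field, and the file names are invented.

```typescript
type StyleNode = { href: string };

// Assumption: the global stylesheet is the one named /styles.<hash>.css.
const isGlobalStyleNode = (n: StyleNode): boolean =>
  /^\/styles\.[a-zA-Z0-9]+\.css$/.test(n.href);

// A two-argument comparator: global stylesheets sort first. Nodes of equal
// rank keep their relative order because Array.prototype.sort is stable
// (guaranteed since ES2019).
const globalFirst = (a: StyleNode, b: StyleNode): number =>
  Number(isGlobalStyleNode(b)) - Number(isGlobalStyleNode(a));

const nodes: StyleNode[] = [
  { href: '/component---a.css' },
  { href: '/styles.abc123.css' },
  { href: '/component---b.css' },
];

// Copy before sorting so the original array is not mutated.
const sortedHrefs = [...nodes].sort(globalFirst).map((n) => n.href);
```

With this input the global stylesheet moves to the front while the two component chunks stay in their original order.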


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b6cd4faa-e512-4f0f-ab2f-0f19754a65e1

📥 Commits

Reviewing files that changed from the base of the PR and between f06e5c4 and deccd61.

📒 Files selected for processing (1)
  • gatsby-ssr.tsx

Comment thread gatsby-ssr.tsx Outdated
Comment thread gatsby-ssr.tsx Outdated
@kennethkalmer kennethkalmer temporarily deployed to ably-docs-extract-css-f-gimcw9 April 29, 2026 12:51 Inactive
@kennethkalmer
Member Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 29, 2026

✅ Actions performed

Full review triggered.

Contributor

Copilot AI left a comment

Pull request overview

Adds a Gatsby SSR hook to reduce inline CSS in the rendered <head> by converting Gatsby-emitted <style data-href="…"> tags into <link rel="stylesheet" href="…"> in production, improving crawlability for LLM/AI crawlers.

Changes:

  • Type the existing onRenderBody hook using Gatsby’s GatsbySSR API types.
  • Add onPreRenderHTML to replace Gatsby inline stylesheet <style> nodes with external <link rel="stylesheet"> tags (prod-only).
  • Add helper type guards and ordering logic intended to keep global styles first.

Comment thread gatsby-ssr.tsx Outdated
Comment thread gatsby-ssr.tsx Outdated
Comment thread gatsby-ssr.tsx Outdated
Comment thread gatsby-ssr.tsx Outdated
@kennethkalmer kennethkalmer temporarily deployed to ably-docs-extract-css-f-gimcw9 April 29, 2026 13:13 Inactive
@kennethkalmer kennethkalmer force-pushed the extract-css-from-body branch from ec9249e to 1d7af61 Compare April 29, 2026 13:20
@kennethkalmer kennethkalmer temporarily deployed to ably-docs-extract-css-f-gimcw9 April 29, 2026 13:20 Inactive
@sacOO7 sacOO7 requested a review from Copilot April 30, 2026 09:38
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread gatsby-ssr.tsx Outdated
return false;
}
const styleHref = node.props?.['data-href'] ?? node.props?.href ?? '';
return /^\/styles\.[a-zA-Z0-9]+\.css$/.test(styleHref);

Copilot AI Apr 30, 2026

isGlobalStyleNode only matches hrefs that start with /styles.<hash>.css. This repo config supports assetPrefix (e.g. Heroku PR deploys), which typically causes Gatsby to emit absolute asset URLs; in that case the global stylesheet won’t be detected and won’t be sorted first as intended. Consider normalizing styleHref to a pathname (e.g. via new URL(styleHref, 'http://localhost').pathname) before applying the /styles.<hash>.css check, so it works for both absolute and relative URLs.

Suggested change
-  return /^\/styles\.[a-zA-Z0-9]+\.css$/.test(styleHref);
+  try {
+    const stylePathname = new URL(styleHref, 'http://localhost').pathname;
+    return /^\/styles\.[a-zA-Z0-9]+\.css$/.test(stylePathname);
+  } catch {
+    return false;
+  }

Member Author

Good catch — confirmed by inspecting the Heroku review app earlier in this PR (it served https://ably-docs-extract-css-f-gimcw9.herokuapp.com/styles.<hash>.css, which the previous regex did not match, so the global-first sort was silently a no-op there). Fixed in e692544 by normalizing through new URL(href, "http://localhost").pathname before matching, with a try/catch in case the value is not URL-parseable.
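
The normalization step can be tried out on its own. The function below wraps the suggested change in a standalone helper; the name `isGlobalStyleHref` is hypothetical, and the sample component path is invented for illustration.

```typescript
// Returns true when the href points at the global /styles.<hash>.css file,
// whether Gatsby emitted it as a relative path or as an absolute URL
// (as happens when assetPrefix is set).
function isGlobalStyleHref(styleHref: string): boolean {
  try {
    // The base URL is only used to resolve relative paths; absolute URLs ignore it.
    const pathname = new URL(styleHref, 'http://localhost').pathname;
    return /^\/styles\.[a-zA-Z0-9]+\.css$/.test(pathname);
  } catch {
    // Not URL-parseable: treat as "not the global stylesheet".
    return false;
  }
}

const relativeMatch = isGlobalStyleHref('/styles.6abe1dcd792cb717c21c.css');
const absoluteMatch = isGlobalStyleHref(
  'https://ably-docs-extract-css-f-gimcw9.herokuapp.com/styles.6abe1dcd792cb717c21c.css',
);
const componentMatch = isGlobalStyleHref('/component---src-pages-index.css');
```

Both the relative and the absolute form resolve to the same pathname, so the regex no longer depends on how the asset URL was emitted.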

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread gatsby-ssr.tsx Outdated
Comment on lines +92 to +96
const transformedHeadComponents = headComponents.map((node) => {
  const reorderedNode = isStyleComponent(node) ? sortedStyleComponents[sortedStyleIndex++] : node;

  if (isStyleComponent(reorderedNode)) {
    const styleHref = reorderedNode.props?.['data-href'] ?? reorderedNode.props?.href;

Copilot AI Apr 30, 2026

This reordering logic operates over all <style> nodes (filter(isStyleComponent)), including the Sandpack <style id="sandpack"> (which has no data-href). Because you then substitute style nodes from the sorted list, non-Gatsby inline styles can be swapped/reordered relative to Gatsby’s emitted CSS chunks, changing cascade and contradicting the “leave Sandpack untouched” goal.

Consider only sorting/replacing the subset of style nodes that have data-href/href (Gatsby-emitted stylesheet chunks), and leave other <style> nodes in-place (don’t include them in the sorted list / index).

Member Author

You are right — the previous version pulled Sandpack into the sorted list, and although the sort itself is stable, the substitution-into-original-positions step could swap Sandpack with a Gatsby data-href node if their relative order in the sorted list ended up different from the original.

Fixed in e692544 by introducing isExtractableStyleNode = isStyleComponent && getStyleHref !== undefined. Only nodes with a data-href/href are filtered, sorted, and used as substitution targets. Sandpack (and any other inline <style> without an href) is now passed through unchanged in the .map, so its original position is guaranteed.
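
The fix can be pictured with plain objects in place of React nodes. This is a reconstruction for illustration, not the merged code: the names `getStyleHref` and `isExtractableStyleNode` come from the comment above, but their bodies, the `HeadNode` type, and `reorderExtractable` are assumptions.

```typescript
type HeadNode = {
  type: string;
  props?: { id?: string; 'data-href'?: string; href?: string };
};

const getStyleHref = (n: HeadNode): string | undefined =>
  n.props?.['data-href'] ?? n.props?.href;

// Only <style> nodes that carry an href are candidates for sorting.
const isExtractableStyleNode = (n: HeadNode): boolean =>
  n.type === 'style' && getStyleHref(n) !== undefined;

const isGlobal = (n: HeadNode): boolean =>
  /^\/styles\.[a-zA-Z0-9]+\.css$/.test(getStyleHref(n) ?? '');

function reorderExtractable(head: HeadNode[]): HeadNode[] {
  const sorted = head
    .filter(isExtractableStyleNode)
    .sort((a, b) => Number(isGlobal(b)) - Number(isGlobal(a)));
  let next = 0;
  // Extractable slots are refilled from the sorted list; every other node
  // (Sandpack included) keeps its original index.
  return head.map((node) =>
    isExtractableStyleNode(node) ? sorted[next++] ?? node : node,
  );
}

const head: HeadNode[] = [
  { type: 'style', props: { 'data-href': '/component---a.css' } }, // hypothetical chunk
  { type: 'style', props: { id: 'sandpack' } },
  { type: 'style', props: { 'data-href': '/styles.abc123.css' } }, // hypothetical hash
];

const labels = reorderExtractable(head).map(
  (n) => n.props?.id ?? getStyleHref(n) ?? n.type,
);
```

In this example the global stylesheet takes the first extractable slot and the component chunk the last one, while the Sandpack node stays at index 1.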

Contributor

@umair-ably umair-ably left a comment

nice!

@kennethkalmer kennethkalmer enabled auto-merge April 30, 2026 15:12
@kennethkalmer kennethkalmer disabled auto-merge April 30, 2026 15:13
@kennethkalmer kennethkalmer temporarily deployed to ably-docs-extract-css-f-gimcw9 April 30, 2026 15:19 Inactive
@kennethkalmer kennethkalmer force-pushed the extract-css-from-body branch from e692544 to b24bdf8 Compare April 30, 2026 15:31
@kennethkalmer kennethkalmer temporarily deployed to ably-docs-extract-css-f-gimcw9 April 30, 2026 15:32 Inactive
@kennethkalmer
Member Author

Final review-app verification

Compared the freshly built review app against live ably.com/docs/basics. The two render visually identically (the review app is missing the search box and cookie banner because they sit behind feature flags / consent that aren't set under basic auth; this is unrelated to this change).

Document-level metrics on /docs/basics

| Metric | Live (ably.com) | Review app | Δ |
|---|---|---|---|
| <style data-href> tags | 1 (large) | 0 | extracted |
| Inline CSS bytes (review app = Sandpack only) | 241,138 | 4,321 | −98.2% |
| <head> size | 246,728 B | 9,022 B | −96.3% |
| <body> byte offset | 246,734 | 9,028 | 27× sooner |
| Gatsby <link rel="stylesheet"> | 0 | 1 | added |
| Stylesheet media attribute | n/a | none (default = all media) | print-safe |
| <style id="sandpack"> position vs Gatsby link | n/a | after (5443 > 963) | original order preserved |

A crawler now reaches the body within the first ~9 KB of the document instead of past 246 KB.
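
The Δ column can be re-derived from the raw byte counts in the table. A small sketch; the variable and function names are illustrative only.

```typescript
// Byte counts copied from the table above.
const live = { inlineCss: 241138, headBytes: 246728, bodyOffset: 246734 };
const reviewApp = { inlineCss: 4321, headBytes: 9022, bodyOffset: 9028 };

// Percentage reduction from `before` to `after`, to one decimal place.
const reduction = (before: number, after: number): string =>
  ((1 - after / before) * 100).toFixed(1);

const inlineCssReduction = reduction(live.inlineCss, reviewApp.inlineCss); // "98.2"
const headReduction = reduction(live.headBytes, reviewApp.headBytes);      // "96.3"

// How many times earlier the <body> tag starts, rounded to the nearest integer.
const bodySpeedup = Math.round(live.bodyOffset / reviewApp.bodyOffset);    // 27
```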

assetPrefix correctness

The review app emits its stylesheet as an absolute URL — https://ably-docs-extract-css-f-gimcw9.herokuapp.com/styles.6abe1dcd792cb717c21c.css — because Heroku PR deploys set assetPrefix. The URL-normalization step (new URL(href, "http://localhost").pathname) means the global-first sort still recognizes it as the global stylesheet on the review app, not just locally where the path is relative.

Sandpack untouched

isExtractableStyleNode excludes inline <style> tags without a data-href/href, so Sandpack is not pulled into the sorted list and is not used as a substitution target. Its position relative to the Gatsby <link> is the same as in the unmodified Gatsby output.

@kennethkalmer
Member Author

@sacOO7 I addressed Copilot's feedback from the review you requested.

LLM/AI crawlers bail before reaching content when the document head
contains a wall of inline CSS. Replace Gatsby's `<style data-href>`
head components with `<link rel="stylesheet">` pointing at the same
already-emitted CSS files. Production-only; dev keeps inline styles
for HMR. Sandpack and any other inline `<style>` without an href are
left in place.

Mirrors the workaround in apps/voltaire (originally added there for
Total Blocking Time, same mechanism applies here). Reference:
gatsbyjs/gatsby#1526.

The global stylesheet href is normalized through the URL constructor
before matching, so the cascade-preserving sort works whether Gatsby
emits relative paths or the absolute URLs produced when assetPrefix
is set (Heroku review apps).

On /docs/basics: head shrinks from ~247 KB to ~8 KB, body tag
reachable 30x sooner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kennethkalmer kennethkalmer force-pushed the extract-css-from-body branch from b24bdf8 to 54ad04a Compare May 1, 2026 13:24
@kennethkalmer kennethkalmer enabled auto-merge May 1, 2026 13:24
@kennethkalmer kennethkalmer temporarily deployed to ably-docs-extract-css-f-gimcw9 May 1, 2026 13:24 Inactive
@kennethkalmer kennethkalmer merged commit 671ba86 into main May 1, 2026
7 checks passed
@kennethkalmer kennethkalmer deleted the extract-css-from-body branch May 1, 2026 13:34
@sacOO7
Contributor

sacOO7 commented May 1, 2026

Related to #3015

Labels

review-app Create a Heroku review app

5 participants