This document describes the SEO infrastructure for anyplot.ai, including bot detection, dynamic meta tags, branded og:images, and sitemap generation.
anyplot.ai is a React SPA (Single Page Application). SPAs have a fundamental SEO challenge: most social media bots and link-preview crawlers do not execute JavaScript, so they see an empty page without the per-route meta tags.
Our solution uses nginx-based bot detection to serve pre-rendered HTML with correct og:tags to bots, while regular users get the full SPA experience.
```
┌─────────────────────┐
│  Social Media Bot   │
│ (Twitter, FB, etc)  │
└──────────┬──────────┘
           │
           ▼
┌──────────────────────────────────────────────────────────────────────┐
│                           nginx (Frontend)                           │
│                                                                      │
│  1. Check User-Agent against bot list                                │
│  2. If bot   → proxy to api.anyplot.ai/seo-proxy/*                   │
│  3. If human → serve React SPA (index.html)                          │
└──────────────────────────────────────────────────────────────────────┘
            │                                   │
            │ Bot                               │ Human
            ▼                                   ▼
┌───────────────────────────┐      ┌───────────────────────────┐
│   Backend API (FastAPI)   │      │        React SPA          │
│                           │      │                           │
│   /seo-proxy/*            │      │   Client-side routing     │
│   Returns HTML with:      │      │   Dynamic content         │
│   - og:title              │      │   Full interactivity      │
│   - og:description        │      │                           │
│   - og:image (branded)    │      │                           │
└───────────────────────────┘      └───────────────────────────┘
            │
            │ og:image URL
            ▼
┌───────────────────────────┐
│  /og/{spec_id}.png        │  ← Collage (2x3 grid, top 6 by quality)
│  /og/{spec_id}/{lib}.png  │  ← Single branded implementation
│                           │
│  Dynamically generated    │
│  1-hour cache             │
└───────────────────────────┘
```
nginx detects 27 bots via User-Agent matching, organized by category:
Social Media:
| Bot | User-Agent Pattern |
|---|---|
| Twitter/X | twitterbot |
| Facebook | facebookexternalhit |
| LinkedIn | linkedinbot |
| Pinterest | pinterestbot |
| Reddit | redditbot |
| Tumblr | tumblr |
| Mastodon | mastodon |
Messaging Apps:
| Bot | User-Agent Pattern |
|---|---|
| Slack | slackbot |
| Discord | discordbot |
| Telegram | telegrambot |
| WhatsApp | whatsapp |
| Signal | signal |
| Viber | viber |
| Skype/Teams | skypeuripreview |
| Microsoft Teams | microsoft teams |
| Snapchat | snapchat |
Search Engines:
| Bot | User-Agent Pattern |
|---|---|
| Google | googlebot |
| Bing | bingbot |
| Yandex | yandexbot |
| DuckDuckGo | duckduckbot |
| Baidu | baiduspider |
| Apple | applebot |
Link Preview Services:
| Bot | User-Agent Pattern |
|---|---|
| Embedly | embedly |
| Quora | quora link preview |
| Outbrain | outbrain |
| Rogerbot | rogerbot |
| Showyoubot | showyoubot |
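For reference, the same matching can be sketched in Python — an illustrative mirror of the nginx map (not code from the repo), handy for tests or log analysis:

```python
import re

# Illustrative mirror of the nginx $is_bot map: the same
# case-insensitive substring patterns, one per detected crawler.
BOT_PATTERNS = [
    # Social media
    "twitterbot", "facebookexternalhit", "linkedinbot", "pinterestbot",
    "redditbot", "tumblr", "mastodon",
    # Messaging apps
    "slackbot", "discordbot", "telegrambot", "whatsapp", "signal",
    "viber", "skypeuripreview", "microsoft teams", "snapchat",
    # Search engines
    "googlebot", "bingbot", "yandexbot", "duckduckbot", "baiduspider",
    "applebot",
    # Link preview services
    "embedly", "quora link preview", "outbrain", "rogerbot", "showyoubot",
]

_BOT_RE = re.compile("|".join(re.escape(p) for p in BOT_PATTERNS),
                     re.IGNORECASE)

def is_bot(user_agent: str) -> bool:
    """Return True when the User-Agent matches any of the 27 patterns."""
    return bool(_BOT_RE.search(user_agent or ""))
```

Like nginx's `~*` operator, this is a case-insensitive substring match, so `Twitterbot/1.0` and `twitterbot` both hit.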
Located in app/nginx.conf:

```nginx
# Bot detection map
map $http_user_agent $is_bot {
    default 0;
    ~*twitterbot 1;
    ~*facebookexternalhit 1;
    # ... more bots
}

# SPA routing with bot detection
location / {
    error_page 418 = @seo_proxy;
    if ($is_bot) {
        return 418;  # Trigger proxy to backend
    }
    try_files $uri $uri/ /index.html;
}

# Named location for bot proxy
location @seo_proxy {
    proxy_pass https://api.anyplot.ai/seo-proxy$request_uri;
}
```

Backend endpoints that serve HTML with correct meta tags for bots.
Router: api/routers/seo.py
| Endpoint | Purpose | og:image |
|---|---|---|
| GET /seo-proxy/ | Home page | Default (og-image.png) |
| GET /seo-proxy/plots | Plots page | Default |
| GET /seo-proxy/specs | Specs page | Default |
| GET /seo-proxy/legal | Legal page | Default |
| GET /seo-proxy/{spec_id} | Spec overview (cross-language hub) | Collage (2x3 grid) |
| GET /seo-proxy/{spec_id}/{language} | 301 → /seo-proxy/{spec_id} (consolidated) | — |
| GET /seo-proxy/{spec_id}/{language}/{library} | Implementation | Single branded |
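The tier-to-image mapping can be sketched as a small helper. This is a hypothetical illustration, not code from api/routers/seo.py; the copy strings and the `API_BASE` / `DEFAULT_IMAGE` constants are assumptions:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PageMeta:
    title: str
    description: str
    image: str

# Illustrative constants (assumed, not from the repo).
API_BASE = "https://api.anyplot.ai"
DEFAULT_IMAGE = "https://anyplot.ai/og-image.png"

def meta_for(spec_id: Optional[str] = None,
             library: Optional[str] = None) -> PageMeta:
    if spec_id is None:
        # Static pages (home, /plots, /specs, /legal): default image.
        return PageMeta("anyplot.ai", "Beautiful Python plotting made easy.",
                        DEFAULT_IMAGE)
    if library is None:
        # Spec overview hub: 2x3 collage of the top implementations.
        return PageMeta(f"{spec_id} | anyplot.ai",
                        f"Implementations of {spec_id} across plotting libraries.",
                        f"{API_BASE}/og/{spec_id}.png")
    # Implementation page: single branded image.
    return PageMeta(f"{spec_id} with {library} | anyplot.ai",
                    f"{spec_id} implemented with {library}.",
                    f"{API_BASE}/og/{spec_id}/{library}.png")
```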
All SEO proxy endpoints return minimal HTML with meta tags:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <title>{title}</title>
  <meta name="description" content="{description}" />
  <meta property="og:title" content="{title}" />
  <meta property="og:description" content="{description}" />
  <meta property="og:image" content="{image}" />
  <meta property="og:url" content="{url}" />
  <meta property="og:type" content="website" />
  <meta property="og:site_name" content="anyplot.ai" />
  <meta name="twitter:card" content="summary_large_image" />
  <meta name="twitter:title" content="{title}" />
  <meta name="twitter:description" content="{description}" />
  <meta name="twitter:image" content="{image}" />
  <link rel="canonical" href="{url}" />
</head>
<body><h1>{title}</h1><p>{description}</p></body>
</html>
```

Dynamically generated preview images with anyplot.ai branding.
Router: api/routers/og_images.py
Image Processing: core/images.py
| Endpoint | Description | Dimensions |
|---|---|---|
| GET /og/{spec_id}.png | Collage of top 6 implementations | 1200x630 |
| GET /og/{spec_id}/{library}.png | Single branded implementation | 1200x630 |
Single implementation layout:
- anyplot.ai logo (centered, MonoLisa font 42px, weight 700)
- Tagline: "Beautiful Python plotting made easy."
- Plot image in rounded card with shadow
- Label: {spec_id} · {library}

Collage layout:
- anyplot.ai logo (centered, MonoLisa font 38px)
- Tagline
- 2x3 grid of top 6 implementations (sorted by quality_score descending)
- Each plot in 16:9 rounded card with label below

Caching:
- TTL: 1 hour (3600 seconds)
- Cache key: og:{spec_id}:{library} or og:{spec_id}:collage
- Storage: in-memory API cache
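A minimal sketch of this caching scheme, assuming a plain in-memory dict (the real API cache may be implemented differently):

```python
import time
from typing import Dict, Optional, Tuple

# Minimal in-memory TTL cache sketch; keys follow the
# og:{spec_id}:{library} / og:{spec_id}:collage convention above.
TTL_SECONDS = 3600
_CACHE: Dict[str, Tuple[float, bytes]] = {}

def og_cache_key(spec_id: str, library: Optional[str] = None) -> str:
    return f"og:{spec_id}:{library or 'collage'}"

def cache_put(key: str, png: bytes) -> None:
    _CACHE[key] = (time.monotonic() + TTL_SECONDS, png)

def cache_get(key: str) -> Optional[bytes]:
    entry = _CACHE.get(key)
    if entry is None:
        return None
    expires_at, png = entry
    if time.monotonic() >= expires_at:
        del _CACHE[key]  # expired: evict so the image is re-rendered
        return None
    return png
```

Using `time.monotonic()` avoids TTL glitches when the wall clock is adjusted.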
Uses MonoLisa variable font (commercial, not in repo):
- Downloaded from GCS: gs://anyplot-static/fonts/MonoLisaVariableNormal.ttf
- Cached locally in /tmp/anyplot-fonts/
- Fallback: DejaVuSansMono-Bold
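The resolution order can be sketched as follows. This is illustrative; the actual loader in core/images.py may differ in details:

```python
import os

# Font locations described above.
FONT_DIR = "/tmp/anyplot-fonts"
MONOLISA = os.path.join(FONT_DIR, "MonoLisaVariableNormal.ttf")
FALLBACK = "DejaVuSansMono-Bold"

def resolve_font() -> str:
    """Prefer the locally cached MonoLisa file; otherwise fall back.

    Real code would attempt the one-time download from
    gs://anyplot-static/fonts/ into FONT_DIR before giving up
    and using DejaVuSansMono-Bold.
    """
    if os.path.isfile(MONOLISA):
        return MONOLISA
    return FALLBACK
```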
Static file at app/public/robots.txt:

```
User-agent: *
Allow: /
Disallow: /debug

Sitemap: https://anyplot.ai/sitemap.xml
```

Dynamic endpoint at GET /robots.txt (on the API domain):

```
User-agent: *
Disallow: /
```

Why block the API?
- APIs should not be indexed by search engines
- Prevents crawling of debug endpoints, docs, and API responses
- Social media bots (WhatsApp, Twitter, etc.) are unaffected — they fetch og:images directly
Dynamic XML sitemap for search engine indexing.
GET /sitemap.xml (proxied from frontend nginx to backend)
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://anyplot.ai/</loc></url>
  <url><loc>https://anyplot.ai/plots</loc></url>
  <url><loc>https://anyplot.ai/specs</loc></url>
  <url><loc>https://anyplot.ai/legal</loc></url>
  <!-- For each spec with implementations: -->
  <url><loc>https://anyplot.ai/{spec_id}</loc></url>
  <url><loc>https://anyplot.ai/{spec_id}/{language}/{library}</loc></url>
  <!-- ... -->
</urlset>
```

The /{spec_id}/{language} tier is intentionally not listed: language filtering is served as /{spec_id}?language={language} (the hub with a filter query param, same canonical as the unfiltered hub), so listing it would create duplicate-content entries for Google.
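The generation logic can be sketched roughly like this (an illustrative stand-in for the real generator in api/routers/seo.py; the function name and input shape are assumptions):

```python
from typing import Iterable, Tuple
from xml.sax.saxutils import escape

BASE = "https://anyplot.ai"
STATIC_PAGES = ("/", "/plots", "/specs", "/legal")

def build_sitemap(implementations: Iterable[Tuple[str, str, str]]) -> str:
    """Render sitemap XML from (spec_id, language, library) triples.

    Lists each spec hub once plus every implementation URL; the
    /{spec_id}/{language} tier is deliberately never emitted.
    """
    urls = [BASE + p for p in STATIC_PAGES]
    seen = set()
    for spec_id, language, library in implementations:
        if spec_id not in seen:
            seen.add(spec_id)
            urls.append(f"{BASE}/{spec_id}")
        urls.append(f"{BASE}/{spec_id}/{language}/{library}")
    body = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            + body + "\n</urlset>")
```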
The sitemap includes:

- Home page (/)
- Plots page (/plots)
- Specs page (/specs)
- Legal page (/legal)
- Spec overview pages (/{spec_id}) — only if spec has implementations
- Implementation pages (/{spec_id}/{language}/{library}) — all implementations
```nginx
location = /sitemap.xml {
    proxy_pass https://api.anyplot.ai/sitemap.xml;
}
```

Testing:

```bash
# Simulate Twitter bot
curl -H "User-Agent: Twitterbot/1.0" https://anyplot.ai/scatter-basic
# Should return HTML with og:tags, not the React SPA

# Single implementation
curl -o test.png https://api.anyplot.ai/og/scatter-basic/matplotlib.png

# Collage
curl -o test.png https://api.anyplot.ai/og/scatter-basic.png
```

Preview validators:
- LinkedIn: https://www.linkedin.com/post-inspector/
| File | Purpose |
|---|---|
| app/nginx.conf | Bot detection, SPA routing, sitemap proxy |
| app/public/robots.txt | Frontend robots.txt (blocks /debug) |
| api/routers/seo.py | SEO proxy endpoints, robots.txt, sitemap generation |
| api/routers/og_images.py | Branded og:image endpoints |
| core/images.py | Image processing, branding functions |
Spec URLs are organised so the spec slug is the top-level identifier and the language sits between spec and library. This keeps the spec — the actual SEO entity — at the URL root and lets us add Julia, R, and MATLAB without touching existing Python URLs.
| URL | Purpose | canonical |
|---|---|---|
| / | Landing | self |
| /{spec_id} | Cross-language hub — lists every implementation across all languages | self |
| /{spec_id}?language={language} | Hub filtered to one language (client-side filter) | /{spec_id} (without query) |
| /{spec_id}/{language}/{library} | Implementation detail — preview ↔ interactive toggle | self |
| /{spec_id}/{language}/{library}?view=interactive | Same page, interactive iframe pre-selected | base URL without query |
| /plots, /specs, /libraries, /palette, /about, /legal, /mcp, /stats | Static pages | self |
There is intentionally no canonical /{spec_id}/{language} URL. Language
filtering is served via a ?language= query param on the hub, and the hub's
canonical tag omits the query — so the hub and its language-filtered variants
all consolidate on the same canonical URL. Legacy links to
/{spec_id}/{language} redirect to /{spec_id}?language={language} (SPA
client-side redirect via app/src/router.tsx; bots get a 301 from
/seo-proxy/{spec_id}/{language} to /seo-proxy/{spec_id}).
The interactive view follows the same pattern: ?view=interactive is a
deep-link parameter only; the canonical tag always points at the base URL
without the query string.
Spec IDs are top-level path segments, so they must not collide with reserved
routes. The blocklist is enforced at runtime in app/src/utils/paths.ts
(RESERVED_TOP_LEVEL) and at spec creation time in .github/workflows/spec-create.yml:
plots, specs, libraries, palette, about, legal, mcp, stats, debug,
sitemap.xml, robots.txt
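A Python sketch of the same check (the blocklist comes from the doc; the slug regex is an assumption about what spec-create.yml accepts, here lowercase kebab-case):

```python
import re

# Mirror of the reserved-slug blocklist above.
RESERVED_TOP_LEVEL = frozenset({
    "plots", "specs", "libraries", "palette", "about", "legal",
    "mcp", "stats", "debug", "sitemap.xml", "robots.txt",
})
# Assumed slug shape: lowercase kebab-case like "scatter-basic".
_SLUG_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")

def is_valid_spec_id(spec_id: str) -> bool:
    """Reject reserved routes and malformed slugs at spec creation time."""
    return spec_id not in RESERVED_TOP_LEVEL and bool(_SLUG_RE.match(spec_id))
```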
There is no legacy redirect layer. Old /python/{spec_id}[/{library}] and
/python/interactive/{spec_id}/{library} URLs return the SPA's NotFoundPage
(catch-all * route) and emit a 404 on bot requests via /seo-proxy. The
sitemap stops listing those URLs, and Google removes them on next crawl.
python.anyplot.ai is served by a dedicated nginx server block
(app/nginx.conf) that proxies bot requests to the main-domain hub / detail
proxies:
| Subdomain URL | Internal rewrite | Canonical (in HTML) |
|---|---|---|
| python.anyplot.ai/{spec_id} | /seo-proxy/{spec_id} | https://anyplot.ai/{spec_id} |
| python.anyplot.ai/{spec_id}/{library} | /seo-proxy/{spec_id}/python/{library} | https://anyplot.ai/{spec_id}/python/{library} |
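The rewrite rule amounts to inserting the implicit `python` language segment. A Python sketch of the mapping (illustrative; the real logic lives in the nginx server block):

```python
def rewrite_subdomain_path(path: str) -> str:
    """Sketch of the python.anyplot.ai bot rewrite: insert the implicit
    'python' language segment before proxying to /seo-proxy."""
    parts = [p for p in path.split("/") if p]
    if len(parts) == 1:          # /{spec_id} -> hub
        return f"/seo-proxy/{parts[0]}"
    if len(parts) == 2:          # /{spec_id}/{library} -> implementation
        spec_id, library = parts
        return f"/seo-proxy/{spec_id}/python/{library}"
    return "/seo-proxy" + path   # anything else: pass through
```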
The user keeps the marketing-friendly hostname; Google sees a canonical on the
main-domain hub so authority and ranking signals stay consolidated on a single
URL. Human visitors: the SPA may detect
window.location.hostname === 'python.anyplot.ai' and append
?language=python on spec routes so the grid renders filtered without
changing the canonical.
Frontend URL generation is centralized in app/src/utils/paths.ts:
- specPath(specId, language?, library?) — builds the three-tier URL based on which arguments are provided.
- langFromPath(pathname) — extracts the language segment from a path.
- RESERVED_TOP_LEVEL — Set of slugs that cannot be used as spec IDs.
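The helpers live in TypeScript; a Python sketch of the equivalent URL-building rules (illustrative only, not the paths.ts source):

```python
from typing import Optional

def spec_path(spec_id: str, language: Optional[str] = None,
              library: Optional[str] = None) -> str:
    """Build the three-tier URL from whichever parts are provided."""
    if language is not None and library is not None:
        return f"/{spec_id}/{language}/{library}"
    if language is not None:
        # No canonical /{spec_id}/{language} tier: filter via query param.
        return f"/{spec_id}?language={language}"
    return f"/{spec_id}"

def lang_from_path(pathname: str) -> Optional[str]:
    """Extract the language segment (second path component), if any.

    Naive sketch: real code would also exclude reserved top-level routes.
    """
    parts = [p for p in pathname.split("/") if p]
    return parts[1] if len(parts) >= 2 else None
```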
When adding Julia, R, or MATLAB:
- Set Library.language = "julia" (etc.) on each library row.
- Implementations automatically appear under /{spec_id}/julia/{library_id}; sitemap and OG image routes pick them up.
- The cross-language hub /{spec_id} lists the new language's implementations alongside Python's — no per-spec migration needed.
- Users can filter the hub to a single language via /{spec_id}?language=julia (no new canonical URL is created; the filter is UX-only).
- Optionally add a julia.anyplot.ai server block mirroring the Python one.
- All user input (spec_id, library) is HTML-escaped before rendering
- XSS prevention via html.escape() for all dynamic content
- og:image URLs use html.escape(url, quote=True) to prevent attribute injection
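A minimal sketch of that escaping (the helper name is illustrative; the escaping calls are the ones named above):

```python
from html import escape

def og_meta_tags(title: str, image_url: str) -> str:
    """Escape user-influenced values before interpolating into attributes.

    html.escape replaces &, <, > and, with quote=True (the default),
    also " and ', so values cannot break out of content="..." attributes.
    """
    return (
        f'<meta property="og:title" content="{escape(title)}" />\n'
        f'<meta property="og:image" content="{escape(image_url, quote=True)}" />'
    )
```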