refactor(seo): consolidate spec language overview into hub with ?language= filter #5297

Merged
MarkusNeusinger merged 2 commits into main from claude/consolidate-spec-pages-OEGyx
Apr 21, 2026

Conversation

@MarkusNeusinger
Owner

The /{spec_id} and /{spec_id}/{language} pages rendered virtually identical
content with separate canonical URLs, causing duplicate-content issues for
search engines. Consolidate them into a single canonical hub page
(/{spec_id}), and serve language filtering as /{spec_id}?language={language}
— a client-side filter whose canonical tag still points at the unfiltered
hub. Same pattern already used for ?view=interactive on detail pages.

  • Router: /:specId/:language redirects to /:specId?language=:language
  • SpecPage: Mode reduced to hub|detail; hub grid filters from ?language=
    when present, canonical always /{spec_id}
  • Sitemap: stop emitting /{spec}/{language} URLs
  • SEO proxy: /seo-proxy/{spec}/{language} returns 301 to /seo-proxy/{spec}
  • nginx python.anyplot.ai: hub rewrites to /seo-proxy/{spec} (detail URLs
    keep /python segment since those are content-unique)
  • Docs + unit tests updated
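
The URL mapping described above can be sketched in a few lines. This is an illustration only, not the project's router code (the real redirect lives in app/src/router.tsx); the function names `legacy_to_filtered` and `canonical_for` are hypothetical, and the sketch assumes main-domain paths where a two-segment path is always a legacy language overview.

```python
from urllib.parse import urlencode, urlsplit


def legacy_to_filtered(path: str) -> str:
    """Map a legacy /{spec_id}/{language} path to /{spec_id}?language={language}."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != 2:
        # Not a two-segment legacy URL (hub or detail page): leave it unchanged.
        return path
    spec_id, language = parts
    return f"/{spec_id}?{urlencode({'language': language})}"


def canonical_for(url: str) -> str:
    """Canonical of a hub URL: the path with the query string dropped."""
    return urlsplit(url).path


print(legacy_to_filtered("/scatter-basic/python"))
print(canonical_for("/scatter-basic?language=python"))
```

The second function is the point of the change: the filtered and unfiltered hub share one canonical, so search engines see a single page per spec.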

https://claude.ai/code/session_01Hiwzn5mc979FDGCHkW4os1

Copilot AI review requested due to automatic review settings April 21, 2026 07:40
@codecov

codecov Bot commented Apr 21, 2026

Codecov Report

❌ Patch coverage is 71.42857% with 8 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| app/src/pages/SpecPage.tsx | 61.90% | 8 Missing ⚠️ |


Contributor

Copilot AI left a comment


Pull request overview

This PR consolidates spec language overview pages into the cross-language hub to eliminate duplicate-content URLs for search engines, moving language selection to a ?language= filter while keeping a single canonical URL per spec.

Changes:

  • Redirect SPA route /:specId/:language → /:specId?language=:language, and simplify SpecPage to hub/detail modes with hub filtering driven by ?language=.
  • Update sitemap generation/tests to stop emitting /{spec_id}/{language} URLs and keep only hub + implementation detail URLs.
  • Change SEO proxy GET /seo-proxy/{spec_id}/{language} to a permanent 301 redirect to /seo-proxy/{spec_id}, and adjust python.anyplot.ai nginx proxying accordingly; update SEO docs.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| tests/unit/api/test_seo_helpers.py | Updates sitemap helper expectations to exclude the per-language overview tier. |
| tests/unit/api/test_routers.py | Updates sitemap router test expectations to exclude /{spec_id}/{language} URLs. |
| docs/reference/seo.md | Documents the consolidated hub + ?language= filtering and updated sitemap behavior. |
| app/src/router.tsx | Adds a redirect component for legacy /:specId/:language SPA routes. |
| app/src/pages/SpecPage.tsx | Removes language mode, adds hub filtering from ?language=, updates canonical/title/links. |
| app/nginx.conf | Updates python.anyplot.ai bot proxying to use the hub proxy without injecting /python for hub routes. |
| api/routers/seo.py | Removes language-overview URLs from sitemap and converts /seo-proxy/{spec}/{language} to a 301 redirect. |
Comments suppressed due to low confidence (1)

app/nginx.conf:201

  • On the python.anyplot.ai server block, a request like /scatter-basic/python will match this /:specId/:library location and proxy to /seo-proxy/scatter-basic/python/python (treating python as a library). That will likely 404 and prevents bots from following the intended /seo-proxy/{spec_id}/{language} -> 301 /seo-proxy/{spec_id} consolidation path. Add a higher-priority location for ^/{spec_id}/python/?$ (or a general /{spec_id}/{language} handler) that proxies to /seo-proxy/$spec_id/python (or directly to /seo-proxy/$spec_id).
    # /:specId/:library -> detail on main domain (language stays in path)
    location ~ "^/(?<spec_id>[A-Za-z0-9][A-Za-z0-9-]*)/(?<library>[A-Za-z0-9][A-Za-z0-9-]*)/?$" {
        set $python_seo_uri /seo-proxy/$spec_id/python/$library;
        error_page 418 = @seo_proxy_python;
        if ($is_bot) { return 418; }
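
To make the collision concrete, here is a small sketch that transcribes the location regex above into Python and builds the same proxy URI that the `set $python_seo_uri` line would. It only illustrates the matching behaviour and is not part of the nginx config; the helper name `python_seo_uri` and the example library name `matplotlib` are assumptions for illustration.

```python
import re
from typing import Optional

# Python transcription of the nginx location regex (named groups use ?P<...> here).
LOCATION = re.compile(
    r"^/(?P<spec_id>[A-Za-z0-9][A-Za-z0-9-]*)/(?P<library>[A-Za-z0-9][A-Za-z0-9-]*)/?$"
)


def python_seo_uri(path: str) -> Optional[str]:
    """Return the proxy URI the location block would set, or None if it does not match."""
    m = LOCATION.match(path)
    if m is None:
        return None
    return f"/seo-proxy/{m['spec_id']}/python/{m['library']}"


# A legacy language-overview path is captured as if "python" were a library name.
print(python_seo_uri("/scatter-basic/python"))
# A genuine detail path produces the intended URI.
print(python_seo_uri("/scatter-basic/matplotlib"))
```

Because the regex cannot tell a language segment from a library segment, only a more specific location placed ahead of it can route /{spec_id}/python differently.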

Comment thread api/routers/seo.py
Comment on lines 309 to +320
 @router.get("/seo-proxy/{spec_id}/{language}")
-async def seo_spec_language(spec_id: str, language: str, db: AsyncSession | None = Depends(optional_db)):
-    """Bot-optimized language-specific spec overview."""
-    if db is None:
-        return HTMLResponse(
-            BOT_HTML_TEMPLATE.format(
-                title=f"{html.escape(spec_id)} - {html.escape(language)} | anyplot.ai",
-                description=DEFAULT_DESCRIPTION,
-                image=DEFAULT_HOME_IMAGE,
-                url=f"https://anyplot.ai/{html.escape(spec_id)}/{html.escape(language)}",
-            )
-        )
-
-    key = cache_key("seo", spec_id, language)
-    cached = get_cache(key)
-    if cached:
-        return HTMLResponse(cached)
-
-    repo = SpecRepository(db)
-    spec = await repo.get_by_id(spec_id)
-    if not spec:
-        raise HTTPException(status_code=404, detail="Spec not found")
-
-    lang_impls = [i for i in spec.impls if i.library and i.library.language == language]
-    has_previews = any(i.preview_url for i in lang_impls)
-    image = f"https://api.anyplot.ai/og/{spec_id}.png" if has_previews else DEFAULT_HOME_IMAGE
-
-    result = BOT_HTML_TEMPLATE.format(
-        title=f"{html.escape(spec.title)} - {html.escape(language)} | anyplot.ai",
-        description=html.escape(spec.description or DEFAULT_DESCRIPTION),
-        image=html.escape(image, quote=True),
-        url=f"https://anyplot.ai/{html.escape(spec_id)}/{html.escape(language)}",
-    )
-    set_cache(key, result)
-    return HTMLResponse(result)
+async def seo_spec_language(spec_id: str, language: str):
+    """Permanent redirect: language-overview URLs now live on the hub with ?language=.
+
+    The /{spec_id}/{language} tier was consolidated into /{spec_id} to eliminate
+    duplicate content. Bots following this endpoint get a 301 to the hub proxy;
+    humans get the SPA redirect configured in app/src/router.tsx. The `language`
+    query parameter is dropped because the hub's canonical tag does not include
+    it — Google should consolidate the page, not a filtered variant.
+    """
+    del language  # referenced for route matching only; deliberately not forwarded
+    return RedirectResponse(url=f"/seo-proxy/{spec_id}", status_code=301)

Copilot AI Apr 21, 2026


This endpoint’s behavior changed from serving HTML to returning a permanent 301. There’s currently no unit test asserting the redirect status code and Location header (and that the language segment is intentionally dropped). Adding a test for GET /seo-proxy/{spec}/{language} would prevent regressions and ensure crawlers get the expected consolidation behavior.

Comment on lines +245 to +247
if (mode === 'hub') {
  trackPageview(languageFilter ? `/${specId}?language=${languageFilter}` : `/${specId}`);
} else if (mode === 'detail' && selectedLibrary) {

Copilot AI Apr 21, 2026


In hub mode this calls trackPageview with a ?language= query string. useAnalytics().trackPageview currently validates urlOverride with /^\/([\w\-/])*$/ (no ?, =), so this override will be rejected and the pageview won’t be tracked when languageFilter is set. Consider either (a) encoding the filter into a path-only override that matches the allowed charset, or (b) not overriding and extending the analytics URL builder to incorporate language safely.
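
The rejection is easy to reproduce with the regex quoted above. The sketch below transcribes it to Python (with `re.ASCII` so that `\w` matches the same ASCII-only set as in JavaScript); the helper name `is_valid_override` is hypothetical and stands in for the check inside `trackPageview`, whose actual implementation is not shown in this thread.

```python
import re

# Transcription of the urlOverride validation regex /^\/([\w\-/])*$/.
URL_OVERRIDE = re.compile(r"^/([\w\-/])*$", re.ASCII)


def is_valid_override(url: str) -> bool:
    """True if the whole string consists of '/', word characters and '-' only."""
    return URL_OVERRIDE.fullmatch(url) is not None


print(is_valid_override("/scatter-basic"))                   # plain hub path
print(is_valid_override("/scatter-basic?language=python"))   # contains ? and =
print(is_valid_override("/scatter-basic/language/python"))   # path-only form, option (a)
```

The path-only form in the last line stays inside the allowed character set, which is why encoding the filter as a path segment avoids the silent drop.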

…path segment

Three fixes for PR #5297:

1. seo_spec_language (api/routers/seo.py): validate spec_id against the
   canonical `^[a-z0-9]+(-[a-z0-9]+)*$` pattern before embedding it in the
   Location header. Closes the CodeQL "Untrusted URL redirection" alert.
2. Add unit tests asserting the 301 + Location behaviour and the 404
   response for malformed spec_ids.
3. Fix analytics tracking for the filtered hub: buildPlausibleUrl now
   includes `language` in its orderedKeys list, so ?language=python is
   converted to the /{spec}/language/python path-segment form that
   matches every other filter. SpecPage.tsx hub mode calls trackPageview()
   without an override so the new path-segment URL is picked up. This
   unblocks pageview tracking that was silently dropped by the urlOverride
   validation regex (which rejects ? and =).
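
Fixes 1 and 2 can be illustrated with a framework-free sketch. It is not the code from api/routers/seo.py: the function `redirect_for` and its tuple return value are hypothetical stand-ins for the FastAPI handler's `RedirectResponse` and `HTTPException`, and only the spec_id pattern is taken from the commit message.

```python
import re
from typing import Optional, Tuple

# Canonical spec_id pattern quoted in the commit message.
SPEC_ID_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def redirect_for(spec_id: str) -> Tuple[int, Optional[str]]:
    """Return (status, Location) for a /seo-proxy/{spec_id}/{language} request."""
    if SPEC_ID_PATTERN.fullmatch(spec_id) is None:
        # Malformed ids never reach the Location header.
        return 404, None
    # The language segment is deliberately not forwarded.
    return 301, f"/seo-proxy/{spec_id}"


# The kind of assertions a unit test for the endpoint might make:
assert redirect_for("scatter-basic") == (301, "/seo-proxy/scatter-basic")
assert redirect_for("//evil.example.com") == (404, None)
```

Validating before building the URL means the Location value can only ever contain lowercase letters, digits and single hyphens, which is what removes the open-redirect risk.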

https://claude.ai/code/session_01Hiwzn5mc979FDGCHkW4os1
@MarkusNeusinger MarkusNeusinger merged commit 129e10c into main Apr 21, 2026
8 of 9 checks passed
@MarkusNeusinger MarkusNeusinger deleted the claude/consolidate-spec-pages-OEGyx branch April 21, 2026 14:21