Add WordPress export (WXR/.xml) import support #4046

Open
wojtekn wants to merge 7 commits into trunk from add-xml-wxr-import

wojtekn commented Jul 2, 2026

Contributor

Related issues

How AI was used in this PR

Authored with Claude Code: codebase exploration, the importer/handler/validator implementation, the PHP driver, tests, and this description. All changes were reviewed by the author, and the flows were manually tested in the app (see Testing Instructions).

Proposed Changes

Studio's Import/Export screen previously accepted full-site backups (Jetpack, Local, Playground, .wpress) and raw .sql databases, but not a WordPress export file — the .xml (WXR) produced by Tools → Export. That importer is a first-class option in the WordPress dashboard, and users reasonably expect it in Studio too.

This adds .xml as a supported import format in both the Import/Export tab (existing site) and the Add-Site → Import from a backup flow (new site). The content — posts, pages, terms, authors, and media — is imported the same way the dashboard does it: via the official wordpress-importer plugin. The plugin is installed into the target site before the import runs, so the import works offline with no wordpress.org fetch at runtime.
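As a rough illustration of how a .xml upload could be recognized as WXR before it is handed to the importer, here is a minimal sketch. The helper name `looksLikeWxr` and the idea of sniffing the first bytes of the file are assumptions for illustration, not the validator added in this PR. It relies only on the fact that a WXR file is an RSS document containing a `<wp:wxr_version>` element.

```typescript
import * as path from 'node:path';

// Hypothetical helper (not the PR's validator): a cheap heuristic that checks
// the extension and looks for WXR markers in the first chunk of the file.
function looksLikeWxr(fileName: string, head: string): boolean {
  const isXml = path.extname(fileName).toLowerCase() === '.xml';
  // WXR is an RSS document...
  const hasRssRoot = /<rss[\s>]/i.test(head);
  // ...that declares its export format version in <wp:wxr_version>.
  const hasWxrVersion = /<wp:wxr_version>\s*[\d.]+\s*<\/wp:wxr_version>/i.test(head);
  return isXml && hasRssRoot && hasWxrVersion;
}

// Example input: a trimmed-down WXR header.
const wxrHead =
  '<?xml version="1.0" encoding="UTF-8"?>' +
  '<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">' +
  '<channel><wp:wxr_version>1.2</wp:wxr_version></channel></rss>';

console.log(looksLikeWxr('export.xml', wxrHead));
console.log(looksLikeWxr('sitemap.xml', '<urlset></urlset>'));
```

A check like this would let an arbitrary XML file (a sitemap, for example) be rejected early instead of failing inside the importer.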

WXR import differs from the other formats in that it merges content into an existing WordPress install rather than replacing files/database. A few things the reviewer should know about:

  • Add-Site ordering: for .xml specifically, the new site is now created and started once (installing WordPress) before the import runs, so wp-config.php and the database exist when the importer needs them. Other backup types keep their existing skip-start behavior.
  • Media caveat (intentional): attachments are downloaded from the URLs baked into the WXR, exactly like the dashboard importer. Exports from a publicly reachable site import media correctly. Exports from another local Studio site do not — their .wp.local URLs resolve only via the OS hosts file + Studio's proxy and serve a self-signed cert, none of which the import runtime has, so posts import but images keep pointing at the source site. This is documented in the PHP driver and left as a possible follow-up.
  • Plugin bundling: the wordpress-importer plugin is not vendored in the repo. It's downloaded at install time (pinned version) via the existing FILES_TO_DOWNLOAD registry in scripts/download-wp-server-files.ts, alongside WP core, SQLite, WP-CLI, etc., and ships in the CLI bundle under wp-files/. The runtime install into the site is a local copy — no network needed at import time. CI must run postinstall (or the download step) before packaging so the plugin is present in the bundle; the sibling downloads already rely on this.
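To make the media caveat concrete, the sketch below splits attachment URLs into those the import runtime can plausibly fetch and those that point at another local Studio site. `splitAttachmentUrls` is a hypothetical helper, not code from this PR, and the only rule it applies is the `.wp.local` suffix mentioned above. A possible follow-up could use something like it to warn the user before the import starts.

```typescript
// Hypothetical helper, not part of this PR: buckets attachment URLs by
// whether their host is a Studio-local *.wp.local domain.
interface AttachmentSplit {
  fetchable: string[]; // hosts that do not look Studio-local
  localOnly: string[]; // *.wp.local hosts, unreachable from the import runtime
  invalid: string[];   // strings that are not parseable URLs
}

function splitAttachmentUrls(urls: string[]): AttachmentSplit {
  const result: AttachmentSplit = { fetchable: [], localOnly: [], invalid: [] };
  for (const raw of urls) {
    let host: string;
    try {
      host = new URL(raw).hostname.toLowerCase();
    } catch {
      // new URL() throws a TypeError on unparseable input.
      result.invalid.push(raw);
      continue;
    }
    if (host === 'wp.local' || host.endsWith('.wp.local')) {
      result.localOnly.push(raw);
    } else {
      result.fetchable.push(raw);
    }
  }
  return result;
}

// Example URLs are made up for illustration.
const split = splitAttachmentUrls([
  'https://example.com/wp-content/uploads/a.jpg',
  'https://my-site.wp.local/wp-content/uploads/b.jpg',
  'not a url',
]);
console.log(split);
```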

Testing Instructions

  1. In any WordPress site, go to Tools → Export → All content to produce a .xml (WXR) file. Prefer a public source site if you want to verify media import.
  2. Import/Export tab: open an existing Studio site → Import/Export → drop the .xml. Confirm the progress bar completes, then open the site and verify posts/pages imported. With a public source, media appears in the Media Library.
  3. Add-Site flow: Add site → Import from a backup → select the .xml. Confirm the new site is created and the content imported.
  4. Offline: disable networking and confirm the import still runs (the wordpress-importer plugin installs from the bundled copy). Media that must be fetched from origin will be missing offline, which is expected.
  5. Automated: npm run cli:build && npm test -- apps/cli/lib/import-export (adds validator, backup-handler-factory, and importer-selection tests).

Pre-merge Checklist

  • Have you checked for TypeScript, React or other console errors?

Support importing a WordPress export file directly in Studio's import/export
screen and the Add-Site "Import from a backup" flow, matching the WordPress
dashboard's Tools → Import → WordPress. The content is imported via the
bundled wordpress-importer plugin so it works offline.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

wojtekn and others added 2 commits July 2, 2026 11:52

Fetch the wordpress-importer plugin via a postinstall download script into
the gitignored wp-files/ bundle, matching how PHP, WP server files, and agent
skills are handled. Keeps the ~2.2 MB third-party plugin out of the repo while
the runtime import stays fully offline (the plugin ships in the built bundle).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Register the wordpress-importer plugin as an entry in the existing
FILES_TO_DOWNLOAD registry instead of a standalone script, reusing the shared
fetch/extract/retry plumbing. No behavior change; the plugin still downloads
into wp-files/ at install time and ships in the CLI bundle.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

wpmobilebot commented Jul 2, 2026

Collaborator

📊 Performance Test Results

Comparing f1dbbf0 vs trunk

app-size

| Metric | trunk | f1dbbf0 | Diff | Change |
| --- | --- | --- | --- | --- |
| App Size (Mac) | 1316.92 MB | 1318.31 MB | +1.38 MB | 🔴 0.1% |

site-editor

| Metric | trunk | f1dbbf0 | Diff | Change |
| --- | --- | --- | --- | --- |
| load | 743 ms | 1126 ms | +383 ms | 🔴 51.5% |

site-startup

| Metric | trunk | f1dbbf0 | Diff | Change |
| --- | --- | --- | --- | --- |
| siteCreation | 6515 ms | 6479 ms | -36 ms | ⚪ 0.0% |
| siteStartup | 1859 ms | 1861 ms | +2 ms | ⚪ 0.0% |

Results are median values from multiple test runs.

Legend: 🟢 Improvement (faster) | 🔴 Regression (slower) | ⚪ No change (<50ms diff)

wojtekn and others added 4 commits July 2, 2026 13:52

Build the expected path with path.join instead of a hardcoded forward-slash
string, so the assertion matches on Windows CI where path.join uses backslash.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
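
The Windows mismatch behind the path.join fix can be reproduced on any OS with Node's `path.posix` and `path.win32` variants. This is a small sketch, not the test from this PR, and the `wp-files/wordpress-importer` segments are assumed for illustration only.

```typescript
import * as path from 'node:path';

// Assumed segments for illustration; the real test's path is not shown here.
const segments = ['wp-files', 'wordpress-importer'];

// What a hardcoded expectation looks like.
const hardcoded = 'wp-files/wordpress-importer';

// path.posix and path.win32 expose each platform's behavior regardless of
// the OS the code runs on.
const posixJoined = path.posix.join(...segments); // forward slashes
const win32Joined = path.win32.join(...segments); // backslashes

console.log(posixJoined === hardcoded); // true
console.log(win32Joined === hardcoded); // false

// Building the expectation with path.join matches on the current platform.
const expected = path.join(...segments);
console.log(expected === segments.join(path.sep)); // true
```

Building both the actual and the expected value with `path.join` keeps the assertion platform-neutral without normalizing separators in the test.
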
The comment referenced a standalone download-wordpress-importer.ts script that
was folded into download-wp-server-files.ts and never committed. Point it at
the FILES_TO_DOWNLOAD registry instead.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The plugin is downloaded at build time and shipped in the CLI bundle, not
vendored in the repo. Align the comment with that.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

updateSiteUrl only rewrites the target site's own URL and returns early when it
is unchanged — which is always the case for a WXR merge (the DB keeps the
target URL, and the importer has no knowledge of the source URL). The call did
nothing but implied a capability it lacks. Replace it with a comment explaining
why internal links keep pointing at the source, matching the dashboard importer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
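
The reasoning in the last commit can be sketched as follows. `updateSiteUrlSketch` is a hypothetical stand-in that only mirrors the behavior described in the commit message (rewrite the site's own URL, return early when it is unchanged). It is not the real updateSiteUrl, and the URLs are made-up examples.

```typescript
// Hypothetical stand-in for the described behavior, not the real updateSiteUrl.
interface SiteUrlUpdate {
  changed: boolean;
  url: string;
}

function updateSiteUrlSketch(currentUrl: string, targetUrl: string): SiteUrlUpdate {
  // Early return: nothing to rewrite when the stored URL already matches.
  if (currentUrl === targetUrl) {
    return { changed: false, url: currentUrl };
  }
  return { changed: true, url: targetUrl };
}

// In a WXR merge the database keeps the target site's URL, so the stored URL
// and the target URL are always the same value.
const targetUrl = 'http://localhost:8881'; // made-up example
const update = updateSiteUrlSketch(targetUrl, targetUrl);
console.log(update.changed); // false

// Imported post content is never passed through this function, so links to
// the source site stay as they were in the export.
const importedContent = '<a href="https://source.example.com/about/">About</a>';
console.log(importedContent.includes('source.example.com')); // true
```
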
wojtekn requested a review from a team July 2, 2026 14:23
2 participants