feat(migrations): add overwrite and skip options to CSV/JSON/Appwrite imports #11910

Draft
premtsd-code wants to merge 2 commits into 1.9.x from feat/skip-duplicates

Summary

Exposes two new optional boolean params on the three migration creation endpoints so CSV / JSON / Appwrite-to-Appwrite imports can choose how to handle rows whose IDs already exist at the destination.

POST /v1/migrations/appwrite
POST /v1/migrations/csv/imports
POST /v1/migrations/json/imports

Each now accepts overwrite and skip booleans (both optional, both default false).

Parameter semantics

| overwrite | skip | Behavior |
| --- | --- | --- |
| false | false | Default. createDocuments(), fails fast on DuplicateException. Original behavior, unchanged. |
| false | true | createDocuments() wrapped in skipDuplicates() scope guard. Duplicate-id rows silently no-op at the adapter layer (INSERT IGNORE equivalent). Existing rows are preserved. |
| true | false | upsertDocuments(). Existing rows are replaced with imported values. |
| true | true | overwrite wins (upsert subsumes skip). |
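
The four combinations above can be sketched with an in-memory store. This is only an illustration of the intended semantics, not the adapter code: `importRows()` and `DuplicateIdException` are hypothetical names, plain arrays stand in for the destination table, and the real batch and transaction behavior of createDocuments() / upsertDocuments() is not modeled.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the library's DuplicateException.
final class DuplicateIdException extends RuntimeException {}

/**
 * Import $rows into $store (both keyed by row id) under the two flags.
 * Returns the resulting store; the caller's array is not modified.
 */
function importRows(array $store, array $rows, bool $overwrite = false, bool $skip = false): array
{
    foreach ($rows as $id => $row) {
        $exists = array_key_exists($id, $store);

        if ($overwrite) {
            // overwrite wins over skip: upsert replaces any existing row
            $store[$id] = $row;
        } elseif ($exists && $skip) {
            // duplicate id is a silent no-op; the existing row is preserved
            continue;
        } elseif ($exists) {
            // default: fail fast on the first duplicate id
            throw new DuplicateIdException("Duplicate id: {$id}");
        } else {
            $store[$id] = $row;
        }
    }

    return $store;
}

$existing = ['a' => ['age' => 22]];
$incoming = ['a' => ['age' => 56], 'b' => ['age' => 30]];

$skipped     = importRows($existing, $incoming, false, true); // 'a' keeps age 22, 'b' is added
$overwritten = importRows($existing, $incoming, true, false); // 'a' becomes age 56
$both        = importRows($existing, $incoming, true, true);  // same result as overwrite alone
```

Calling `importRows($existing, $incoming)` with neither flag throws on row `a`, which mirrors the default row of the table.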

Params are stored in the migration Document's options array alongside existing fields (path, size, etc.), matching the existing pattern for destination behavior config. The worker's processDestination() reads them back and passes them to DestinationAppwrite's new constructor params.
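
A minimal sketch of the read-back step, assuming the options are available as a plain PHP array. `readDuplicateFlags()` is a hypothetical helper, the `(bool)` cast is an addition for illustration, and the path and size values are made up; only the `?? false` fallback comes from this PR.

```php
<?php
declare(strict_types=1);

/**
 * Read the duplicate-handling flags from a migration's options array.
 * Missing keys fall back to false, which selects the default fail-fast behavior.
 */
function readDuplicateFlags(array $options): array
{
    return [
        'overwrite' => (bool) ($options['overwrite'] ?? false),
        'skip'      => (bool) ($options['skip'] ?? false),
    ];
}

// Options as they might look for a CSV import created with skip=true.
$stored = ['path' => '/tmp/documents.csv', 'size' => 1024, 'skip' => true];

// Options that carry neither key, for example from an older migration.
$legacy = ['path' => '/tmp/documents.csv', 'size' => 1024];

$flags       = readDuplicateFlags($stored);
$legacyFlags = readDuplicateFlags($legacy);
```

Because absent keys resolve to false, option arrays without the new keys keep the original behavior without any data backfill.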

Changes

  • app/controllers/api/migrations.php: adds ->param('overwrite', ...) and ->param('skip', ...) to the Appwrite, CSV, and JSON migration create endpoints; stores both in options; threads them into the action function signatures
  • src/Appwrite/Platform/Workers/Migrations.php: processDestination() passes $options['overwrite'] ?? false and $options['skip'] ?? false to the DestinationAppwrite constructor
  • composer.json: temporarily pinned to dev branches (see Blocker below)
  • tests/e2e/Services/Migrations/MigrationsBase.php: 3 new E2E test methods + 1 shared helper

E2E test coverage

Three new test methods in MigrationsBase, all passing locally against a full appwrite stack (6 seconds total, 65 assertions):

  1. testCreateCSVImportSkipDuplicates — imports documents.csv (100 rows), mutates one row's age from 56 → 22, re-imports with skip=true. Asserts the mutated row keeps age=22 (skip did not overwrite) and row count stays at 100.

  2. testCreateCSVImportOverwrite — same setup, mutates age 56 → 22, re-imports with overwrite=true. Asserts the mutated row is restored to age=56 (overwrite replaced the value from the CSV) and row count stays at 100.

  3. testCreateCSVImportDefaultFailsOnDuplicate — regression guard. Re-imports with neither flag set. Asserts migration status is failed with errors populated. Prevents accidental behavior change to the default.

Reuses the existing documents.csv fixture (100 rows with $id as first column). No new fixtures needed.
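
The regression guard in test 3 relies on a duplicate surfacing as a failed migration with errors populated. A rough sketch of that mapping follows, under loud assumptions: `runMigration()`, `DuplicateRowException`, the `completed` status value, and the shape of the result array are all hypothetical and are not taken from the worker code.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the library's DuplicateException.
final class DuplicateRowException extends RuntimeException {}

/**
 * Run an import callable and return a status record similar to what a test
 * would observe: 'failed' with the error message when a duplicate is thrown.
 */
function runMigration(callable $import): array
{
    try {
        $import();
        return ['status' => 'completed', 'errors' => []];
    } catch (DuplicateRowException $e) {
        return ['status' => 'failed', 'errors' => [$e->getMessage()]];
    }
}

$existingIds = ['row-1' => true];

// Default import: neither flag is set, so an existing id throws.
$result = runMigration(function () use ($existingIds): void {
    $incomingId = 'row-1';
    if (isset($existingIds[$incomingId])) {
        throw new DuplicateRowException("Duplicate id: {$incomingId}");
    }
});
```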

Blocker

Depends on two upstream PRs that are not yet merged:

  1. utopia-php/database#852 — adds the skipDuplicates() scope guard
  2. utopia-php/migration#169 — adds $overwrite and $skip params to DestinationAppwrite

composer.json is temporarily pinned to dev-csv-import-upsert-v2 (database) and dev-feat/skip-duplicates (migration). Both pins must be reset to proper release versions (^5.X.Y and ^1.9.X respectively) once the upstream PRs merge and the upstream libraries ship releases.

This PR should stay in draft until the upstream chain is resolved.

Test plan

  • PHP lint clean (php -l)
  • Pint / PSR-12 format clean
  • 3 new E2E tests pass locally: testCreateCSVImportSkipDuplicates, testCreateCSVImportOverwrite, testCreateCSVImportDefaultFailsOnDuplicate (65 assertions, 6 seconds)
  • Upstream PRs merged + composer pins restored to proper release versions
  • Console frontend changes surfaced via regenerated SDK (out of scope for this PR; the console UI already calls the endpoints over raw HTTP, bypassing the SDK until it is regenerated)

Exposes two new optional boolean params on the three migration
creation endpoints so CSV / JSON / appwrite-to-appwrite imports can
choose how to handle rows whose IDs already exist at the destination.

Endpoints updated (app/controllers/api/migrations.php):
- POST /v1/migrations/appwrite
- POST /v1/migrations/csv/imports
- POST /v1/migrations/json/imports

Parameter semantics:
- overwrite=true  -> destination uses upsertDocuments instead of
                     createDocuments; existing rows are replaced
                     with imported values
- skip=true       -> destination wraps createDocuments in
                     skipDuplicates; existing rows are preserved
                     unchanged, duplicate-id rows silently no-op
- both false      -> default; fails fast on DuplicateException
                     (original behavior, unchanged)
- both true       -> overwrite wins (upsert subsumes skip)

Both params are stored in the migration Document's options array
(matches the existing pattern for destination behavior config like
path, size, delimiter, bucketId, etc.) and read back in the worker's
processDestination() to instantiate DestinationAppwrite with the
new constructor params.

Feature-branch note: depends on utopia-php/migration#feat/skip-duplicates
(DestinationAppwrite constructor params) which in turn depends on
utopia-php/database#852 (skipDuplicates scope guard). composer.json is
temporarily pinned to dev-feat/skip-duplicates and
dev-csv-import-upsert-v2 respectively; both must be reset to proper
release versions once the upstream PRs merge.

Three new test methods in MigrationsBase, following the existing
testCreateCSVImport setup pattern:

- testCreateCSVImportSkipDuplicates
  Seeds documents.csv, mutates one row, re-imports with skip=true.
  Asserts the mutated row keeps its mutated value (not overwritten
  by the CSV's original value) and the row count stays at 100.

- testCreateCSVImportOverwrite
  Seeds documents.csv, mutates one row, re-imports with overwrite=true.
  Asserts the mutated row is restored to the CSV's original value
  (proving upsertDocuments actually replaced the row) and the row
  count stays at 100.

- testCreateCSVImportDefaultFailsOnDuplicate
  Regression guard: re-imports documents.csv with no flags. Asserts
  the migration goes to status=failed with errors populated, proving
  the default duplicate-throws behavior is preserved.

All three share a prepareCsvImportFixture() helper that sets up
database + table (name, age columns) + bucket + documents.csv
upload. Returns the known first-row id + original name/age so tests
can mutate and assert on a predictable row.

Reuses the existing documents.csv fixture (100 rows with $id as the
first column). No new fixture files needed.

github-actions bot commented Apr 15, 2026

🔄 PHP-Retry Summary

Flaky tests detected across commits:

Commit c5fe716 - 10 flaky tests
| Test | Retries | Total Time | Details |
| --- | --- | --- | --- |
| LegacyCustomClientTest::testCreateIndexes | 1 | 243.15s | Logs |
| LegacyCustomServerTest::testCreateAttributes | 1 | 242.13s | Logs |
| LegacyCustomServerTest::testOneToOneRelationship | 1 | 241.40s | Logs |
| LegacyTransactionsConsoleClientTest::testDeleteDocument | 1 | 240.28s | Logs |
| LegacyTransactionsCustomClientTest::testTransactionExpiration | 1 | 240.56s | Logs |
| VectorsDBConsoleClientTest::testGetCollectionLogs | 1 | 6ms | Logs |
| DatabasesConsoleClientTest::testGetCollectionLogs | 1 | 25ms | Logs |
| UsageTest::testFunctionsStats | 1 | 10.24s | Logs |
| UsageTest::testPrepareSitesStats | 1 | 6ms | Logs |
| UsageTest::testEmbeddingsTextUsageDoesNotBreakProjectUsage | 1 | 5ms | Logs |

@github-actions

✨ Benchmark results

  • Requests per second: 1,908
  • Requests with 200 status code: 343,586
  • P99 latency: 0.095115825

⚡ Benchmark Comparison

| Metric | This PR | Latest version |
| --- | --- | --- |
| RPS | 1,908 | 1,297 |
| 200 | 343,586 | 233,480 |
| P99 | 0.095115825 | 0.158781968 |
