feat(migrations): add overwrite and skip options to CSV/JSON/Appwrite imports#11910
Draft
premtsd-code wants to merge 2 commits into1.9.xfrom
Draft
feat(migrations): add overwrite and skip options to CSV/JSON/Appwrite imports#11910premtsd-code wants to merge 2 commits into1.9.xfrom
premtsd-code wants to merge 2 commits into1.9.xfrom
Conversation
Exposes two new optional boolean params on the three migration
creation endpoints so CSV / JSON / appwrite-to-appwrite imports can
choose how to handle rows whose IDs already exist at the destination.
Endpoints updated (app/controllers/api/migrations.php):
- POST /v1/migrations/appwrite
- POST /v1/migrations/csv/imports
- POST /v1/migrations/json/imports
Parameter semantics:
- overwrite=true -> destination uses upsertDocuments instead of
createDocuments; existing rows are replaced
with imported values
- skip=true -> destination wraps createDocuments in
skipDuplicates; existing rows are preserved
unchanged, duplicate-id rows silently no-op
- both false -> default; fails fast on DuplicateException
(original behavior, unchanged)
- both true -> overwrite wins (upsert subsumes skip)
Both params are stored in the migration Document's options array
(matches the existing pattern for destination behavior config like
path, size, delimiter, bucketId, etc.) and read back in the worker's
processDestination() to instantiate DestinationAppwrite with the
new constructor params.
Feature-branch note: depends on utopia-php/migration#feat/skip-duplicates
(DestinationAppwrite constructor params) which in turn depends on
utopia-php/database#852 (skipDuplicates scope guard). composer.json is
temporarily pinned to dev-feat/skip-duplicates and
dev-csv-import-upsert-v2 respectively; both must be reset to proper
release versions once the upstream PRs merge.
Three new test methods in MigrationsBase, following the existing testCreateCSVImport setup pattern: - testCreateCSVImportSkipDuplicates Seeds documents.csv, mutates one row, re-imports with skip=true. Asserts the mutated row keeps its mutated value (not overwritten by the CSV's original value) and the row count stays at 100. - testCreateCSVImportOverwrite Seeds documents.csv, mutates one row, re-imports with overwrite=true. Asserts the mutated row is restored to the CSV's original value (proving upsertDocuments actually replaced the row) and the row count stays at 100. - testCreateCSVImportDefaultFailsOnDuplicate Regression guard: re-imports documents.csv with no flags. Asserts the migration goes to status=failed with errors populated, proving the default duplicate-throws behavior is preserved. All three share a prepareCsvImportFixture() helper that sets up database + table (name, age columns) + bucket + documents.csv upload. Returns the known first-row id + original name/age so tests can mutate and assert on a predictable row. Reuses the existing documents.csv fixture (100 rows with \$id as the first column). No new fixture files needed.
🔄 PHP-Retry SummaryFlaky tests detected across commits: Commit
|
| Test | Retries | Total Time | Details |
|---|---|---|---|
LegacyCustomClientTest::testCreateIndexes |
1 | 243.15s | Logs |
LegacyCustomServerTest::testCreateAttributes |
1 | 242.13s | Logs |
LegacyCustomServerTest::testOneToOneRelationship |
1 | 241.40s | Logs |
LegacyTransactionsConsoleClientTest::testDeleteDocument |
1 | 240.28s | Logs |
LegacyTransactionsCustomClientTest::testTransactionExpiration |
1 | 240.56s | Logs |
VectorsDBConsoleClientTest::testGetCollectionLogs |
1 | 6ms | Logs |
DatabasesConsoleClientTest::testGetCollectionLogs |
1 | 25ms | Logs |
UsageTest::testFunctionsStats |
1 | 10.24s | Logs |
UsageTest::testPrepareSitesStats |
1 | 6ms | Logs |
UsageTest::testEmbeddingsTextUsageDoesNotBreakProjectUsage |
1 | 5ms | Logs |
✨ Benchmark results
⚡ Benchmark Comparison
|
This comment has been minimized.
This comment has been minimized.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Exposes two new optional boolean params on the three migration creation endpoints so CSV / JSON / appwrite-to-appwrite imports can choose how to handle rows whose IDs already exist at the destination.
Each now accepts
overwriteandskipbooleans (both optional, both defaultfalse).Parameter semantics
overwriteskipfalsefalsecreateDocuments(), fails fast onDuplicateException. Original behavior, unchanged.falsetruecreateDocuments()wrapped inskipDuplicates()scope guard. Duplicate-id rows silently no-op at the adapter layer (INSERT IGNOREequivalent). Existing rows are preserved.truefalseupsertDocuments(). Existing rows are replaced with imported values.truetrueoverwritewins (upsert subsumes skip).Params are stored in the migration Document's
optionsarray alongside existing fields (path,size, etc.), matching the existing pattern for destination behavior config. The worker'sprocessDestination()reads them back and passes them toDestinationAppwrite's new constructor params.Changes
app/controllers/api/migrations.php— adds->param('overwrite', ...)and->param('skip', ...)to the Appwrite, CSV, and JSON migration create endpoints; stores both inoptions; threads into action function signaturessrc/Appwrite/Platform/Workers/Migrations.php—processDestination()passes$options['overwrite'] ?? falseand$options['skip'] ?? falsetoDestinationAppwriteconstructorcomposer.json— temporarily pinned to dev branches (see blocker below)tests/e2e/Services/Migrations/MigrationsBase.php— 3 new E2E test methods + 1 shared helperE2E test coverage
Three new test methods in
MigrationsBase, all passing locally against a full appwrite stack (6 seconds total, 65 assertions):testCreateCSVImportSkipDuplicates— importsdocuments.csv(100 rows), mutates one row's age from 56 → 22, re-imports withskip=true. Asserts the mutated row keepsage=22(skip did not overwrite) and row count stays at 100.testCreateCSVImportOverwrite— same setup, mutates age 56 → 22, re-imports withoverwrite=true. Asserts the mutated row is restored toage=56(overwrite replaced the value from the CSV) and row count stays at 100.testCreateCSVImportDefaultFailsOnDuplicate— regression guard. Re-imports with neither flag set. Asserts migration status isfailedwith errors populated. Prevents accidental behavior change to the default.Reuses the existing
documents.csvfixture (100 rows with$idas first column). No new fixtures needed.Blocker
Depends on two upstream PRs that are not yet merged:
skipDuplicates()scope guard$overwriteand$skipparams toDestinationAppwritecomposer.jsonis temporarily pinned todev-csv-import-upsert-v2(database) anddev-feat/skip-duplicates(migration). Both pins must be reset to proper release versions (^5.X.Yand^1.9.Xrespectively) once the upstream PRs merge and the upstream libraries ship releases.This PR should stay in draft until the upstream chain is resolved.
Test plan
php -l)testCreateCSVImportSkipDuplicates,testCreateCSVImportOverwrite,testCreateCSVImportDefaultFailsOnDuplicate(65 assertions, 6 seconds)