Publish v1.1.0: continuous localization (deepl sync) + hardening by sjsyrek · Pull Request #44 · DeepLcom/deepl-cli

sjsyrek · 2026-04-23T18:19:32Z

Summary

Publishes v1.1.0 from the primary development remote. This PR lands the continuous localization engine (deepl sync), Translation Memory support, the TMS push/pull integration, and the full 2026-04-23 hardening pass (TTY discipline, exit-code contract fix, SQLite cache versioning, HTTP retry jitter, XLIFF decoder corrections, symlink guard, proxy warning, log-redaction coverage).

The v1.1.0 tag is already on the remote (pushed separately); this PR lands the commits that the tag points to.

Changes Made

New: deepl sync continuous localization engine

Scan → diff → translate → write pipeline for 11 i18n file formats (JSON, YAML, TOML, Gettext PO, Android XML, iOS Strings, Xcode String Catalog, ARB, XLIFF, Java Properties, Laravel PHP arrays)
Auto-detect for i18next / Rails / Symfony / Flutter / go-i18n / Django / Laravel / Xcode layouts
.deepl-sync.lock for incremental translation with content hashing
--frozen mode (CI drift detection, exit 10)
--watch mode with debounced auto-sync and optional auto-commit
sync push / sync pull integration with a documented TMS REST contract
sync export to XLIFF 1.2 for CAT tool handoff
sync audit (renamed from prototype glossary-report) for terminology-inconsistency detection
sync resolve for git-merge-conflict lockfile auto-resolution with length-heuristic fallback
sync.limits config block (per-file and per-bucket caps) with max_source_files guarding runaway globs
sync.max_characters preflight cost cap with --force override requiring interactive confirmation or --yes in CI

Translation Memory

deepl translate --translation-memory <name-or-uuid> with name resolution cached per run
translation.translation_memory in .deepl-sync.yaml with per-locale overrides
Pair-pinned model enforcement (quality_optimized required when TM is set)
deepl tm list subcommand

CLI hardening (2026-04-23 pass)

Exit codes: PartialFailure = 12 (distinct from GeneralError = 1), SyncPartialFailureError typed class mirroring SyncDriftError / SyncConflictError. CI can now branch on $? -eq 12 to retry failed locales without confusing it with a CLI crash. SyncDrift uses soft-exit to drain --watch event loops cleanly.
TTY discipline: Spinner gated on stderr.isTTY; write --interactive fails fast on non-TTY stdin (was hanging CI indefinitely); explicit NO_COLOR handling; translate --format table falls back to plain text in non-TTY.
write --to alias for muscle-memory consistency with translate --to (long-flag only; -t intentionally not bound).
Platform reliability: SQLite cache carries PRAGMA user_version; corrupt DBs renamed aside (.corrupt-<ts>) rather than unlinked; HTTP retry uses full jitter with verbose per-retry logs; sync.limits.max_source_files caps glob cardinality.
Security: safeReadFileSync migration on 3 sync config read sites (symlink guard); XLIFF entity decoder single-pass (retires a pre-existing double-decode bug, adds numeric char ref support, rejects CDATA in translatable elements); startup warning on HTTP_PROXY/HTTPS_PROXY http-proxy-fronts-https-target; Logger sanitizer now covers X-Api-Key / X-Auth-Token headers and ?api_key= / ?apikey= query params.

Docs & examples

docs/SYNC.md — full reference for the sync engine, including the stable TMS REST contract
docs/API.md — Exit Codes appendix + complete flag reference
11 new example scripts in examples/ covering sync basics, CI, watch-mode, push/pull, TM, glossary workflows

See CHANGELOG.md for the full 1.1.0 entry.

Test Coverage

Full suite green across unit + integration + E2E. Coverage gates in jest.config.js enforced at 86% branches / 93% lines / 94% functions. Test hygiene tightened: tests/setup.ts afterEach now asserts every nock interceptor registered during a test actually fired — catches silent test gaps.

Backward Compatibility

v1.0.0 → v1.1.0. No breaking changes to any 1.0.0-shipped surface. New commands (sync, tm) are additive. Behavioral corrections documented in CHANGELOG:

CI scripts branching on $? -eq 1 for partial sync failure should switch to $? -eq 12. Generic $? -ne 0 continues to work.
sync.limits.max_source_files default 10k may skip buckets with larger globs — escape hatch is raising the cap in .deepl-sync.yaml.
write --interactive now fails fast on non-TTY stdin (previously hung).
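
A minimal sketch of how a CI wrapper might branch on the exit codes named in this PR. Only the numeric values (1, 10, 11, 12) come from the description above; the `classifySyncExit` helper and the outcome labels are hypothetical names introduced for illustration.

```typescript
// Exit codes as described in this PR; Success = 0 is the usual convention.
const ExitCode = {
  Success: 0,
  GeneralError: 1,
  SyncDrift: 10,
  SyncConflict: 11,
  PartialFailure: 12,
} as const;

// Hypothetical outcome labels for a CI wrapper script.
type SyncOutcome = "ok" | "drift" | "conflict" | "retry-failed-locales" | "failed";

// Map a child-process exit status to the action a CI job might take.
function classifySyncExit(status: number | null): SyncOutcome {
  switch (status) {
    case ExitCode.Success:
      return "ok";
    case ExitCode.SyncDrift:
      return "drift";
    case ExitCode.SyncConflict:
      return "conflict";
    case ExitCode.PartialFailure:
      // Some locales failed, others succeeded: safe to retry only the failures.
      return "retry-failed-locales";
    default:
      // Includes 1 (unclassified failure) and null (killed by a signal).
      return "failed";
  }
}
```

A wrapper could feed this the `status` field returned by `spawnSync` from `node:child_process`. Scripts that only check for a non-zero status need no change.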

Size: Large

Full 1.1.0 release. ~25k LOC across src/ + tests. 300+ commits of work consolidated into this release cycle.

🤖 Opened via gh pr create after force-pushing to internal primary remote with scrub-clean history. The tag v1.1.0 is already on this remote.

…atterns

test(sync): measure bucket-source readFile call count with template patterns See merge request hack-projects/deepl-cli!2

- Add --format FORMAT to docs/API.md and docs/SYNC.md sync export option tables (the flag was registered in register-sync-export.ts but undocumented). Clarify that on sync export the format affects only the error envelope on stderr; success output is always XLIFF 1.2.
- Correct the audit subcommand rename note in docs/API.md: the previous wording implied 1.0.0 users had access to 'glossary-report', but the CHANGELOG [1.1.0] entry explicitly says it never shipped in any tagged release. Rewording aligns with the CHANGELOG.

`tests/setup.ts` `afterEach` now throws if nock has any pending interceptors when a test finishes. An unasserted mock (registered scope with no matching request) is a silent test gap — the test passes but isn't proving the SUT actually hit the mocked network call. Catching this at the shared hook means every new integration test picks up the discipline automatically. Zero fallout: 49 integration files, 766 integration tests, plus the full unit + E2E suite (4501 tests total) all pass with the assertion active. The codebase already had clean nock hygiene; this commit locks it in so regressions can't reintroduce silent gaps. Negative-path tests that intentionally register non-firing interceptors (e.g., to prove a cache-hit path skips the network) can opt out by calling nock.cleanAll() from their own afterEach before the shared hook runs.
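
The check described above could look roughly like the following. This is a sketch, not the contents of `tests/setup.ts`: `assertNoPendingMocks` is a hypothetical helper, written as a pure function so it runs without nock installed.

```typescript
// Hypothetical helper: throws when interceptors were registered but never hit.
// `pending` is expected to be the list returned by nock.pendingMocks().
function assertNoPendingMocks(pending: readonly string[], testName: string): void {
  if (pending.length > 0) {
    throw new Error(
      `${testName}: ${pending.length} nock interceptor(s) never fired:\n  ` +
        pending.join("\n  "),
    );
  }
}

// Assumed wiring in a shared Jest setup file (not executed here):
//
//   afterEach(() => {
//     const pending = nock.pendingMocks();
//     nock.cleanAll(); // reset state so one failure does not cascade
//     assertNoPendingMocks(pending, expect.getState().currentTestName ?? "unknown test");
//   });
```

Because the list is read before cleanup, a test that calls `nock.cleanAll()` in its own earlier `afterEach` hands the shared hook an empty list, which matches the opt-out route the commit message describes.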

Four related fixes for non-TTY / CI / screen-reader contexts, all gated at the nearest single chokepoint so every callsite gets the behavior for free:

1. Logger.shouldShowSpinner() now gates on !quiet && !!process.stderr.isTTY. ora writes to stderr by default, so a non-TTY stderr (CI, piped logs) should not spawn spinners. All current ora() callsites route through this chokepoint so they pick it up automatically.

2. `write --interactive` fails fast with ValidationError when stdin is not a TTY. Previously the process hung indefinitely on an @inquirer/prompts select call that a non-TTY stream cannot answer, which was a silent CI hang when users passed --interactive without --no-input. The new message names both escape routes.

3. `translate --format table` falls back to `[lang] text` lines with a WARN on stderr when stdout is not a TTY. Screen readers and log scrapers no longer have to parse cli-table3 Unicode box-drawing; pipe `--format table > out.txt` now produces parseable plain text.

4. CLI bootstrap explicitly sets chalk.level = 0 when NO_COLOR is set. chalk auto-detects NO_COLOR, but the codebase also has an independent isColorEnabled() check in utils/formatters.ts. Making the hook explicit keeps both detection paths unambiguously in sync if chalk's auto-detection ever changes or is mocked in tests.

Tests updated to mock process.stdout.isTTY / process.stdin.isTTY where the existing assertions assumed the old defaults. New tests added for the non-TTY fallback paths in each area.
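
The first three gates can be sketched as small pure functions. The function names other than `shouldShowSpinner` are hypothetical, the streams are passed in as plain objects so the logic is testable, and a plain `Error` stands in for the CLI's ValidationError.

```typescript
// Minimal shape of the stream properties the gates look at.
interface StreamLike {
  isTTY?: boolean;
}

// Spinner gate: only when not quiet and stderr is a terminal.
function shouldShowSpinner(quiet: boolean, stderr: StreamLike): boolean {
  return !quiet && !!stderr.isTTY;
}

// Fail fast instead of waiting on a prompt that can never be answered.
function assertInteractiveStdin(stdin: StreamLike): void {
  if (!stdin.isTTY) {
    throw new Error("--interactive requires an interactive terminal");
  }
}

// Plain-text fallback for table output: one `[lang] text` line per row.
function formatPlainRows(rows: ReadonlyArray<{ lang: string; text: string }>): string {
  return rows.map((row) => `[${row.lang}] ${row.text}`).join("\n");
}
```

In a real CLI these would presumably receive `process.stderr`, `process.stdin`, and the translation results; those call sites are assumptions here.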

…rift

Four related exit-code fixes that land as one cohesive PR because they share a CHANGELOG story and exercise the same CI-contract surface.

1. ExitCode.PartialFailure is now 12 (distinct from GeneralError = 1). A prior version aliased both to 1, which prevented CI scripts from branching on "one or more locales failed, others succeeded" vs "CLI crashed". Exit 12 is now the only path for the mixed-locale outcome; exit 1 is strictly unclassified failure. The paired typed error class SyncPartialFailureError (exit 12, envelope code "SyncPartialFailure") is added to src/utils/errors.ts to mirror the SyncDriftError (10) / SyncConflictError (11) pattern.

2. `deepl sync --frozen` drift exit is now soft. register-sync-root.ts used to call process.exit(ExitCode.SyncDrift) directly, killing any in-flight writes, auto-commit steps, or --watch event loop mid-cycle. It now sets process.exitCode and returns from the action handler so the event loop drains. Observable exit code is unchanged at 10; docs/API.md has promised this shape since 1.1.0 and the implementation now matches.

3. register-sync-init.ts had a bare process.exit(7) literal in the JSON-envelope ConfigError branch. Now routes through ExitCode.ConfigError so exit-code renumbering can't silently desync.

4. docs/API.md and docs/SYNC.md exit-code tables document the new PartialFailure = 12 row, replace the "1 = GeneralError / PartialFailure" conflation with separate #### 1 and #### 12 sections, and align the SyncDrift section with the soft-exit implementation.

The existing tests/unit/exit-codes.test.ts already pins every classifyByMessage substring branch, satisfying the joint proposal's "classifier drift guard" intent at the unit level. No separate E2E suite added: it would just fork the CLI to exercise the same logic ~30x slower.

Migration: CI scripts that branched on `$? -eq 1` to detect partial sync failure should switch to `$? -eq 12`. Generic `$? -ne 0` checks continue to work unchanged.
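
The difference between a hard and a soft exit in item 2 can be illustrated with a short sketch. `reportDrift` and `ExitTarget` are hypothetical names; the target is injected so the example does not alter the exit status of the process running it.

```typescript
// Anything with a writable exitCode; in a real CLI this would be `process`.
interface ExitTarget {
  exitCode?: number | string | null;
}

const SYNC_DRIFT_EXIT = 10; // drift exit code named in this PR

// Soft exit: record the status and return, so pending writes, auto-commit
// steps, and watcher callbacks can finish before the process ends on its own.
// A hard exit (process.exit(10)) would stop all of that immediately.
function reportDrift(target: ExitTarget, driftDetected: boolean): void {
  if (driftDetected) {
    target.exitCode = SYNC_DRIFT_EXIT;
  }
}
```

Node uses `process.exitCode` as the exit status once the event loop is empty, so the observable code stays 10 either way.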

`deepl write --to <language>` is now accepted as a long-only alias of `--lang`. Users can reach for `--to` uniformly across `deepl translate` and `deepl write`; this was the single most common vocabulary split flagged in cross-command usage.

Scope decisions worth naming:

- `--from` is NOT added. The Write API auto-detects source language (`detected_source_language` on the response); there is no `source_lang` request parameter. Adding `--from` would introduce a flag with no semantic meaning on this command.
- The short form `-t` is intentionally NOT bound on `write`. It belongs to `deepl translate -t, --to`, and silently reusing it here would make `deepl translate -t de "hello"` and `deepl write -t de "hello"` look the same while carrying two different semantics (one translates, one rephrases). A unit test guards against accidental rebinding.
- `--lang` is not deprecated. `--to` is additive for v1.x. A v2 housekeeping pass can decide whether to formally deprecate `--lang` if `--to` is the vocabulary that sticks.

Validation: passing both `--to` and `--lang` with *different* values exits with a ValidationError; passing the same value works fine (the redundancy is accepted so scripts can set both defensively without tripping).

docs/API.md also gains a paragraph distinguishing `deepl sync --locale` (filter over configured targets) from `deepl translate --to` (invocation-time specifier). The split is semantic, not an oversight (sync owns its locale mapping via `.deepl-sync.yaml`, translate does not), and documenting the distinction is the right fix rather than forcing one to compromise for surface symmetry.
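
The validation rule for the two flags is small enough to sketch directly. `resolveWriteLang` is a hypothetical name, a plain `Error` stands in for ValidationError, and exact string comparison is an assumption (the commit message does not say whether values are normalized first).

```typescript
// Resolve the target language from the two equivalent flags.
// Same value in both is accepted; different values are rejected.
function resolveWriteLang(opts: { to?: string; lang?: string }): string | undefined {
  if (opts.to !== undefined && opts.lang !== undefined && opts.to !== opts.lang) {
    throw new Error(
      `--to (${opts.to}) and --lang (${opts.lang}) disagree; pass only one of them`,
    );
  }
  return opts.to ?? opts.lang;
}
```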

Three cross-cutting platform reliability improvements bundled into one PR. They share CI-touching test surface and one CHANGELOG story, and all three are additive (no on-disk / wire / config breakage).

1. SQLite cache schema versioning + backup-on-corrupt

   PRAGMA user_version stamps fresh DBs at version 1; pre-versioned DBs (user_version=0, before this field existed) are upgrade-stamped in place with no data migration. Opening a higher-versioned DB now fails with ConfigError rather than risking silent data loss.

   Corrupted DBs are renamed to cache.db.corrupt-<timestamp> (plus any -wal/-shm sidecars) instead of being unlinked. Users keep 30 days of cache history and a forensic artifact; a fresh DB is created alongside and sync continues. Matches the backup pattern already used by sync-lock.ts:125-147.

2. HTTP retry backoff uses full jitter

   `computeBackoffWithJitter(attempt)` returns uniform random in `[0, min(INIT * 2^n, MAX)]`. Concurrent sync buckets that all 429 simultaneously no longer form a thundering herd on the retry (AWS-recommended pattern). Retry-After header is respected verbatim as before.

   Each retry decision now emits a Logger.verbose line naming attempt / delay / reason. Previously retries were silent and a user seeing elevated latency had no visibility.

3. sync.limits.max_source_files (default 10k, hard max 1M)

   New optional config field. Caps how many files a bucket's `include` glob may match. A bucket exceeding the cap is skipped entirely with a warning, because processing the first N of an oversized glob would silently drop the rest, which is worse than abort. Guards against a misconfigured **/*.json accidentally picking up a vendored tree.

   Sits alongside the existing per-file caps (max_entries_per_file / max_file_bytes / max_depth) and picks up the same validator machinery (typed positive-integer check, hard ceiling, did-you-mean hint on typos).

Tests: 3 new CacheService tests (fresh-stamp, upgrade-stamp, backup-on-corrupt), 2 new walker tests (skip-bucket on excess, pass-bucket at boundary), existing http-client retry tests updated to assert jittered delay ranges rather than fixed values. Full suite 5272 tests green.
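
The full-jitter formula from item 2 can be sketched as follows. The function name comes from the commit message; the two delay constants and the injectable `random` parameter are assumptions made for the example, not values from the codebase.

```typescript
// Assumed constants; the commit message does not state the real values.
const INITIAL_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;

// Full jitter: pick a uniform random delay between 0 and the capped
// exponential backoff for this attempt (attempt 0 is the first retry).
function computeBackoffWithJitter(
  attempt: number,
  random: () => number = Math.random, // injectable so tests are deterministic
): number {
  const cap = Math.min(INITIAL_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
  return random() * cap;
}
```

Because each client draws independently from the whole interval, clients that received a 429 at the same moment spread their retries out instead of retrying in lockstep. Injecting `random` also makes it straightforward to assert delay ranges rather than fixed values, as the updated tests are said to do.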

Two security findings from the council review, bundled because they share a CHANGELOG story (both land under Security/Fixed on the same release) and touch adjacent defensive surfaces.

F1: symlink guard on sync config + auto-detect reads
====================================================

Migrates 3 readFileSync call sites in sync/ to safeReadFileSync:

- src/sync/sync-config.ts:566: .deepl-sync.yaml read
- src/sync/sync-init.ts:239: package.json read during auto-detect
- src/sync/sync-init.ts:271: first-match i18n file for key counting

A hostile repo could previously ship a .deepl-sync.yaml symlinked to ~/.ssh/id_rsa (or another dotfile outside the project root) and surface the target's contents in YAML parser errors on stderr. The helper (src/utils/safe-read-file.ts) was already in use elsewhere; these three sync sites were the remaining gaps.

Runtime reads during sync itself are unchanged: the bucket walker already refuses symlinks via fast-glob's followSymbolicLinks: false.

F3: XLIFF single-pass decode + CDATA reject
===========================================

The chained-.replace() XML entity decoder had a pre-existing double-decode bug: &amp;lt; (literal "&lt;") collapsed to "<" because the decoder ran &amp; → & on pass 1 then &lt; → < on pass 2. Replaced with a single-pass regex that also adds decimal (&#NN;) and hex (&#xNN;) numeric character reference support, retiring the double-decode bug and the missing-numeric-ref gap in one stroke.

CDATA sections inside <source> / <target> previously round-tripped asymmetrically: < / > inside a CDATA body came out encoded by the escape pass even though they entered as raw bytes. The parser now throws ValidationError at extract time if CDATA appears inside a translatable element. Matches the allowlist posture of the Laravel PHP parser's heredoc / interpolation rejection. CDATA elsewhere (e.g. <note>) is still accepted.

Not shipping the v1.2 hybrid parser swap in this PR; that's tracked as a v1.2 follow-up. This PR is in-place hardening only.

Tests: 7 new xliff unit tests (decimal/hex refs, no-double-decode guard, unknown-entity passthrough, CDATA-in-source reject, CDATA-in-target reject, CDATA-in-note accept). Full suite 5274 tests green.
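
A single-pass decoder of the kind F3 describes might look like the sketch below. It is an illustration under assumptions, not the code in the XLIFF parser: `decodeXmlEntities` is a hypothetical name, and the choice to leave out-of-range numeric references untouched is mine.

```typescript
// The five predefined XML entities. A Map avoids accidental hits on
// inherited object properties such as "constructor".
const NAMED_ENTITIES = new Map<string, string>([
  ["amp", "&"],
  ["lt", "<"],
  ["gt", ">"],
  ["quot", '"'],
  ["apos", "'"],
]);

// Convert a numeric reference to a character, or null if it is out of range.
function fromCodePointOrNull(codePoint: number): string | null {
  const valid = Number.isInteger(codePoint) && codePoint >= 0 && codePoint <= 0x10ffff;
  return valid ? String.fromCodePoint(codePoint) : null;
}

// One regex pass over the input: text produced by a replacement is never
// rescanned, so "&amp;lt;" decodes to the literal "&lt;" and stops there.
function decodeXmlEntities(input: string): string {
  return input.replace(
    /&(#x[0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g,
    (match: string, body: string): string => {
      if (body.startsWith("#x")) {
        return fromCodePointOrNull(parseInt(body.slice(2), 16)) ?? match;
      }
      if (body.startsWith("#")) {
        return fromCodePointOrNull(parseInt(body.slice(1), 10)) ?? match;
      }
      // Unknown named entities pass through unchanged.
      return NAMED_ENTITIES.get(body) ?? match;
    },
  );
}
```

A chained `.replace()` decoder fails on the first case in the tests listed above because its second pass sees the output of the first; a single callback-driven pass has no second look at its own output.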

Two council F2/F4 security findings bundled: both touch auth-credential flow and land under Security in the CHANGELOG.

F2: HTTP proxy + HTTPS endpoint warning
=======================================

When HTTP_PROXY / HTTPS_PROXY is an http:// URL and the target API is https://, the CLI now emits a startup warning via Logger.warn. TLS is still tunneled end-to-end via CONNECT, so this is a visibility fix rather than a behavior change. However, a compromised or misconfigured http:// proxy that terminates TLS with a trusted-root cert would see the Authorization header, and the user should be aware before routing production traffic through it.

The warning does NOT abort the connection. Users with legitimate corporate http-only proxies see it once at startup and proceed; the CLI can't tell corporate infra apart from attacker infra, so forcing an abort would break a non-trivial population.

F4: log redaction hardening
===========================

Adds three patterns to Logger.sanitize():

- `X-Api-Key: <value>`: common in REST APIs, including TMS backends (Phrase, Lokalise, custom endpoints). axios error dumps frequently include config.headers, which previously leaked this key when verbose logging was on.
- `X-Auth-Token: <value>`: same shape, same failure mode.
- `?api_key=<value>` / `?apikey=<value>` / `&api_key=<value>`: query-string variants. The existing `?token=` pattern didn't match these because some third-party APIs use the longer name.

Existing redaction paths (DeepL-Auth-Key, Authorization: Bearer, ?token=, env-var exact values) are unchanged.

Tests: 5 new logger cases (X-Api-Key, X-Auth-Token, case-insensitive X-Api-Key, ?api_key= query, ?apikey= no-separator query) plus 2 new deepl-client cases (http-proxy warn fires, https-proxy warn does NOT fire). Full suite 5274 tests green.
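
Both findings reduce to small string checks, sketched below. The regular expressions are my own approximation of the three patterns listed, not the ones in Logger.sanitize(); `redactSecrets` and `proxyDowngradesScheme` are hypothetical names, and this version only handles the `Header: value` text form, not JSON-quoted header dumps.

```typescript
// Redact header-style and query-string credentials in a log line.
function redactSecrets(message: string): string {
  return message
    // "X-Api-Key: value" and "X-Auth-Token: value", case-insensitive.
    .replace(/\b(X-Api-Key|X-Auth-Token)(\s*:\s*)[^\s,;"']+/gi, "$1$2[REDACTED]")
    // "?api_key=value", "&api_key=value", "?apikey=value", "&apikey=value".
    .replace(/([?&](?:api_key|apikey)=)[^&\s"']+/gi, "$1[REDACTED]");
}

// True when a plain-http proxy would front an https target, which is the
// condition the startup warning is described as reporting.
function proxyDowngradesScheme(proxyUrl: string | undefined, targetUrl: string): boolean {
  if (!proxyUrl) return false;
  try {
    return (
      new URL(proxyUrl).protocol === "http:" && new URL(targetUrl).protocol === "https:"
    );
  } catch {
    // Unparseable URLs are left for other validation to report.
    return false;
  }
}
```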

Two small cleanups found during a full CHANGELOG audit.

1. [Unreleased] section ordering

   The consolidation of the 8-PR bundle into [Unreleased] ended up with sections in the wrong K-a-C order (Security appeared first). Reordered to Added → Changed → Fixed → Security, matching both the K-a-C convention and the existing [1.1.0] block's ordering. No content changes, purely structural.

2. Three pre-existing duplicates in [1.1.0]

   Three behavioral-correction entries appeared in BOTH ### Changed and ### Fixed sections of [1.1.0]:

   - "Voice API no longer hardcodes the Pro endpoint"
   - "`auth set-key` and `init` now validate entered keys..."
   - "Standard DeepL URLs ... no longer override key-based auto-detection"

   These are bug fixes, not design changes. Keeping the ### Fixed copies (which use the prefixed style consistent with the rest of the Fixed block) and removing the ### Changed duplicates.

Not addressed in this commit (flagged for release-prep):

- [1.1.0] line ~120 currently documents `ExitCode.PartialFailure = 1`, which [Unreleased] supersedes with PartialFailure = 12. At v1.1.0 tag time, the folding of [Unreleased] into [1.1.0] should DELETE that line rather than merge it, since a single release can't claim both states. Flagged for the chore(release) commit that prepares the tag.
- [1.0.0] section ordering (Added, Security, Changed) is non-K-a-C but already tagged; leaving as historical.

Lint + type-check green. No test impact (docs-only change).

Commit be8188b ("fix: harden TTY and output-discipline across CLI surface") rewrote the --interactive-in-non-TTY error message from "not supported in non-interactive mode" to the more actionable "--interactive requires an interactive terminal. Run without --interactive in CI or non-TTY environments, or pipe input via stdin." The unit test in tests/unit/register-write.test.ts was updated in that commit, but the e2e and integration tests were missed. Both assert on the write --interactive --no-input failure path. Match on the stable fragment "requires an interactive terminal" rather than the full sentence, for drift resilience against future wording tweaks. The adjacent init assertions keep their "not supported in non-interactive mode" phrasing — that message still applies to register-init.ts:25 and is unchanged.

1.1.0 hardening: 8 cross-cutting improvements + CHANGELOG audit See merge request hack-projects/deepl-cli!3

Final pre-tag CHANGELOG consolidation. No source or version changes (package.json and VERSION are already at 1.1.0).

- Merged the 22 [Unreleased] bullets (3 Added, 7 Changed, 9 Fixed, 3 Security) from the 1.1.0-hardening MR (merged as 85685dc) into the existing [1.1.0] - 2026-04-23 section under matching K-a-C subsections.
- Deleted the superseded "Exit code 1 for partial sync failure" bullet from [1.1.0] Changed. The Unreleased bullet that folds in now correctly states PartialFailure = 12 (exit 1 is unclassified CLI failure only). Keeping both would have contradicted itself within a single release.
- Moved the translation-memory **Note** to the end of [1.1.0] Added so the folded-in bullets don't appear after it (the Note is a global caveat on the Added list, not a mid-section divider).

Ready to tag v1.1.0.

GitHub Actions and GitLab CI set CI=true, which the sync --force guard in register-sync-root.ts checks before the TTY branch. Two tests were inheriting CI=true via {...process.env, ...} and hitting the CI-guard exit (6) instead of the branch they were written to cover:

- cli-sync.e2e.test.ts "should retranslate with --force"
  runSyncAll('--force') -> execSync threw Command failed. Fix in the shared buildEnv() so every CI-neutral invocation in this file is unaffected by the runner's CI flag.
- cli-sync-force-guard.e2e.test.ts "proceeds without prompting when stdin is not a TTY"
  Explicitly asserts status !== 6 for the piped-stdin / no-yes branch; CI-guard fired first. Strip CI only in that one case; the sibling "CI=true --force without --yes" test still sets CI=true explicitly.
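
A sketch of an environment builder that drops the inherited CI flag while still letting a test set it on purpose. Only the name `buildEnv` appears in the commit message; the signature and the override behavior shown here are assumptions.

```typescript
type Env = Record<string, string | undefined>;

// Build a child-process environment that ignores the runner's CI flag.
// A test that wants CI=true must ask for it through `overrides`.
function buildEnv(base: Env, overrides: Record<string, string> = {}): Env {
  const env: Env = { ...base, ...overrides };
  if (!("CI" in overrides)) {
    delete env["CI"]; // strip only the inherited value
  }
  return env;
}
```

The result could then be passed as the `env` option of `execSync` or `spawnSync`, with `process.env` as `base`.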

"reloads sync config when SIGHUP is received" flaked on the Node 22 CI job with TypeError: handlers[0] is not a function, because the fixed-50 rounds of flushWatchSetup() occasionally return before chokidar's dynamic-import + FS-sweep chain reaches watcher.on('change', ...). Add a bounded wait-until-registered loop (max 10 extra flushes) and a length assertion so the test either succeeds or fails loudly with a clear diagnostic, instead of the misleading TypeError. The sibling "three ticks" test at line 1027 already has an implicit barrier (expect(handlers.length).toBeGreaterThan(0)) which is why it has not flaked.

…e test Same flake class as the SIGHUP fix — "returns SIGINT/SIGTERM listener counts to their baseline after shutdown" (line 943) surfaced on Node 22 CI once the earlier SIGHUP flake was fixed. Under worker contention the fixed-50 flushWatchSetup() rounds occasionally return before watchAndSync reaches its process.on('SIGINT', ...) call, producing "Expected: 1, Received: 0" at the listenerCount assertion. Apply the same bounded wait-until-registered loop in both halves of the test (firstRun and secondRun) so failures remain loud but timing-robust.

The per-test bounded wait loops in the previous two commits only patched the two flaky call sites we had already seen. Another sibling test in the same describe block ("reloads sync config when .deepl-sync.yaml is one of the changed files") surfaced with the identical TypeError on Node 22 CI immediately after. Raise the helper's iteration budget at the source instead of adding a third bounded loop. Each round is two zero-work awaits (sub-microsecond), so 250 has negligible cost but absorbs the Node 22 + CI worker-contention tail that was pushing the setup chain past 50. Tests still fail loudly if setup never completes — the safety ceiling is just larger. The earlier bounded loops stay in place as belt-and-suspenders and as diagnostic breadcrumbs (they surface a clear length assertion rather than a misleading TypeError).
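
The bounded wait-until-registered pattern from these three commits can be sketched generically. `flushUntil` and its parameters are hypothetical; in the real tests the flush step would be the existing flushWatchSetup() helper and the readiness check would look at the registered handlers.

```typescript
// Keep flushing pending async work until `ready()` is true or the round
// budget runs out. Returns whether the condition was eventually met, so the
// caller can assert on it and fail with a clear message.
async function flushUntil(
  ready: () => boolean,
  flushOnce: () => Promise<void>,
  maxRounds: number,
): Promise<boolean> {
  for (let round = 0; round < maxRounds && !ready(); round++) {
    await flushOnce();
  }
  return ready();
}
```

Asserting on the returned boolean (or on the handler count afterwards) is what turns a late registration into an explicit failure rather than a TypeError from calling an undefined handler.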

sjsyrek added 18 commits April 23, 2026 13:30

test(sync): measure bucket-source readFile call count with template p…

3cc27fd

…atterns

Merge branch 'feat/sync' into 'main'

6b9ec2a

test(sync): measure bucket-source readFile call count with template patterns See merge request hack-projects/deepl-cli!2

Merge branch 'release/1.1.0-consolidated' into 'main'

8f7d182

1.1.0 hardening: 8 cross-cutting improvements + CHANGELOG audit See merge request hack-projects/deepl-cli!3

sjsyrek wants to merge 18 commits into main from release/v1.1.0-publish
