feat: retry, semaphore, and download concurrency/retry config (#13 3/5) #18
Merged
roziscoding merged 3 commits into feat/harden-peer-downloads/2-client-resume on Jun 6, 2026
Add a generic retry() helper (bounded attempts, exponential backoff with full jitter, optional Retry-After override, injectable sleep/random) and a download retry classifier (transient: network/timeout/5xx/429/incomplete stream; permanent: non-429 4xx and others). Add a FIFO async Semaphore. Extend DownloadsConfig with maxConcurrentDownloads and retry knobs (all defaulted so existing configs keep parsing). Primitives are wired into DownloadsService in a later change. Refs #13.
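A minimal sketch of the retry loop described above, assuming hypothetical option names (`RetryOptions`, `isRetryable`, `retryAfterMs`) and a hypothetical `backoffDelayMs` helper; the actual signature in `src/lib/retry.ts` is not shown in this PR text. It illustrates full jitter (a uniform draw below the capped exponential ceiling) and a `Retry-After` hint that replaces the computed delay but is still capped at `maxDelayMs`.

```typescript
// Hypothetical option shape; only the behaviors come from the PR description.
interface RetryOptions {
  maxAttempts: number; // total tries, including the first one
  baseDelayMs: number;
  maxDelayMs: number;
  isRetryable?: (err: unknown) => boolean;
  retryAfterMs?: (err: unknown) => number | undefined;
  sleep?: (ms: number) => Promise<void>; // injectable for deterministic tests
  random?: () => number; // must return a value in [0, 1)
}

// Full jitter: uniform in [0, min(maxDelayMs, baseDelayMs * 2^(attempt - 1))).
function backoffDelayMs(
  attempt: number,
  baseDelayMs: number,
  maxDelayMs: number,
  random: () => number,
): number {
  const ceiling = Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt - 1));
  return Math.floor(random() * ceiling);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retry<T>(
  fn: (attempt: number) => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  // Validate up front so a bad config fails before the first attempt runs.
  if (!Number.isInteger(opts.maxAttempts) || opts.maxAttempts < 1) {
    throw new RangeError("maxAttempts must be an integer >= 1");
  }
  const sleep = opts.sleep ?? defaultSleep;
  const random = opts.random ?? Math.random;

  for (let attempt = 1; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      const retryable = opts.isRetryable ? opts.isRetryable(err) : true;
      // Permanent errors and the final attempt rethrow without sleeping.
      if (!retryable || attempt >= opts.maxAttempts) throw err;

      // A Retry-After hint overrides the jittered delay but stays capped.
      const hinted = opts.retryAfterMs ? opts.retryAfterMs(err) : undefined;
      const delay =
        hinted !== undefined
          ? Math.min(hinted, opts.maxDelayMs)
          : backoffDelayMs(attempt, opts.baseDelayMs, opts.maxDelayMs, random);
      await sleep(delay);
    }
  }
}
```

Passing `sleep` and `random` as options means a test can record the requested delays and pin the jitter draw without waiting on real timers.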
This was referenced Jun 6, 2026
…4/5) (#19)

* feat: track download attempts and expose stale rows for re-drive

  Add an attempts column to the downloads table (additive migration) and repository methods: incrementAttempts, markResumeReset (reset downloadedBytes and record the resume-from-zero transition), and listStaleDownloads (returns stale downloading rows without mutating them, for active startup re-enqueue). reconcileStaleDownloads is kept as the fallback for when downloads is unconfigured. Refs #13.

* feat: bound, retry, and resume peer downloads end-to-end (#13 5/5) (#20)

* feat: bound, retry, and resume peer downloads end-to-end

  Rewire DownloadsService around a shared Semaphore (maxConcurrentDownloads), a retry loop (bounded backoff+jitter, attempts tracked, transient vs permanent classification, Retry-After honored), and resume: the restart progress event persists via markResumeReset, and an active/reenqueued dedupe prevents duplicate rows. On startup, index.ts re-drives stale downloading rows with resumeStaleDownloads() before the watcher scans, falling back to reconcileStaleDownloads() when downloads is unconfigured. Closes #13.

* fix: harden startup re-enqueue dedupe (review feedback)

  - Dedupe stale downloading rows by destPath before re-driving: only one row per destination is resumable (they share the same .part), so mark the superseded duplicates failed instead of letting the second silently early-return in runDownload and stay stuck in downloading.
  - Release the reenqueued claim on successful resume (stub already unlinked, so no scan race) so a later legitimate re-drop of the same torrent filename is not silently skipped for the rest of the process.

  Refs #13.
ce79179 into feat/harden-peer-downloads/2-client-resume
6 checks passed
roziscoding added a commit that referenced this pull request on Jun 6, 2026
* feat: resume interrupted peer downloads via HTTP Range

  downloadFile now detects an existing .part file and sends Range: bytes=<size>-, validating the peer's 206 + Content-Range against the persisted expected size before appending. On 200 (range ignored), a Content-Range mismatch, or 416 it discards the stale .part and restarts from byte 0, emitting a restart progress event. The write path uses a node:fs FileHandle (append/write) with datasync at checkpoints, and the .part is preserved on error so the next attempt can resume. A truncated stream throws a retryable IncompleteDownloadError. Refs #13.

* feat: retry, semaphore, and download concurrency/retry config (#13 3/5) (#18)

* feat: add retry, semaphore, and download concurrency/retry config

  Add a generic retry() helper (bounded attempts, exponential backoff with full jitter, optional Retry-After override, injectable sleep/random) and a download retry classifier (transient: network/timeout/5xx/429/incomplete stream; permanent: non-429 4xx and others). Add a FIFO async Semaphore. Extend DownloadsConfig with maxConcurrentDownloads and retry knobs (all defaulted so existing configs keep parsing). Primitives are wired into DownloadsService in a later change. Refs #13.

* feat: track download attempts and expose stale rows for re-drive (#13 4/5) (#19)

* feat: track download attempts and expose stale rows for re-drive

  Add an attempts column to the downloads table (additive migration) and repository methods: incrementAttempts, markResumeReset (reset downloadedBytes and record the resume-from-zero transition), and listStaleDownloads (returns stale downloading rows without mutating them, for active startup re-enqueue). reconcileStaleDownloads is kept as the fallback for when downloads is unconfigured. Refs #13.

* feat: bound, retry, and resume peer downloads end-to-end (#13 5/5) (#20)

* feat: bound, retry, and resume peer downloads end-to-end

  Rewire DownloadsService around a shared Semaphore (maxConcurrentDownloads), a retry loop (bounded backoff+jitter, attempts tracked, transient vs permanent classification, Retry-After honored), and resume: the restart progress event persists via markResumeReset, and an active/reenqueued dedupe prevents duplicate rows. On startup, index.ts re-drives stale downloading rows with resumeStaleDownloads() before the watcher scans, falling back to reconcileStaleDownloads() when downloads is unconfigured. Closes #13.

* fix: harden startup re-enqueue dedupe (review feedback)

  - Dedupe stale downloading rows by destPath before re-driving: only one row per destination is resumable (they share the same .part), so mark the superseded duplicates failed instead of letting the second silently early-return in runDownload and stay stuck in downloading.
  - Release the reenqueued claim on successful resume (stub already unlinked, so no scan race) so a later legitimate re-drop of the same torrent filename is not silently skipped for the rest of the process.

  Refs #13.

* fix: address retry review feedback
* fix: guard non-ok resume responses
* fix: avoid leaked peer download reader lock
* fix: close peer download handle on reader failure
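The Range-resume rules in the first commit message above can be sketched as a pure decision function. This is an illustration only: `rangeHeader`, `parseContentRange`, `decideResume`, and the `ResumeDecision` type are hypothetical names, not the code in `downloadFile`, and treating any other status as an error is an assumption drawn from the "guard non-ok resume responses" fix title.

```typescript
// Hypothetical result type: either append to the existing .part or start over.
type ResumeDecision =
  | { action: "append"; from: number }
  | { action: "restart" };

// Builds the request header value for resuming at the current .part size.
function rangeHeader(partSize: number): string {
  return `bytes=${partSize}-`;
}

// Parses "bytes <start>-<end>/<total>"; anything else (including an unknown
// total written as "*") yields undefined and is treated as a mismatch.
function parseContentRange(
  value: string | null,
): { start: number; end: number; total: number } | undefined {
  const match = /^bytes (\d+)-(\d+)\/(\d+)$/.exec(value ?? "");
  if (!match) return undefined;
  return {
    start: Number(match[1]),
    end: Number(match[2]),
    total: Number(match[3]),
  };
}

function decideResume(
  status: number,
  contentRange: string | null,
  partSize: number,
  expectedSize: number,
): ResumeDecision {
  if (status === 206) {
    const range = parseContentRange(contentRange);
    // Append only when the peer resumes exactly where the .part ends and
    // agrees with the persisted expected size.
    if (range && range.start === partSize && range.total === expectedSize) {
      return { action: "append", from: partSize };
    }
    return { action: "restart" };
  }
  // 200: the peer ignored Range. 416: the requested offset is unsatisfiable.
  if (status === 200 || status === 416) return { action: "restart" };
  // Assumption: every other status is surfaced as an error for the retry layer.
  throw new Error(`unexpected status ${status} for a ranged request`);
}
```

Keeping the decision separate from the I/O makes each branch (206 match, 206 mismatch, 200, 416) checkable without a live peer.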
Stack 3/5 for #13. Base: `feat/harden-peer-downloads/2-client-resume` (#17).

What this PR does
Adds the two cross-cutting primitives the hardened download flow needs, plus their config knobs. These are standalone and unit-tested here; they're wired into `DownloadsService` in PR 5.

- `retry()` (`src/lib/retry.ts`): bounded attempts, exponential backoff with full jitter, optional `Retry-After` override (capped at `maxDelayMs`), and injectable `sleep`/`random` for deterministic tests.
- `isTransientDownloadError` / `downloadRetryAfterMs` (`src/modules/downloads/retry-policy.ts`): transient (retry): network failures, timeouts, HTTP 5xx, 429, and `IncompleteDownloadError`; permanent (no retry): non-429 4xx and everything else. Parses `Retry-After` (seconds or HTTP-date).
- `Semaphore` (`src/lib/semaphore.ts`): FIFO async counting semaphore with direct permit handoff; throws on a non-positive permit count.
- `DownloadsConfig` gains `maxConcurrentDownloads` (default 3), `maxDownloadAttempts` (5), `retryBaseDelayMs` (1000), `retryMaxDelayMs` (60000), all defaulted so existing configs keep parsing. `examples/config.jsonc` documents them.

Files
- `apps/backend/src/lib/retry.ts`, `apps/backend/src/lib/semaphore.ts` (new)
- `apps/backend/src/modules/downloads/retry-policy.ts` (new)
- `apps/backend/src/lib/config.ts`, `examples/config.jsonc`
- `apps/backend/src/__tests__/retry.test.ts`, `semaphore.test.ts` (new); `config.test.ts` (+2)
- `apps/backend/src/__tests__/{downloads-api,integration}.test.ts`: config literals switched to `AppConfig.parse({...})` because the new defaulted fields are required on the output type.

Testing
12 retry/classifier tests, 4 semaphore tests, 2 config tests. Full suite green.
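For reviewers who want the semaphore semantics in front of them, here is a rough sketch of a FIFO counting semaphore with direct permit handoff. The class and method names mirror what the description and flowchart mention (`Semaphore`, `run`), but the body is an assumption, not the contents of `src/lib/semaphore.ts`.

```typescript
class Semaphore {
  private available: number;
  // Resolvers for callers blocked in acquire(), oldest first.
  private readonly waiters: Array<() => void> = [];

  constructor(permits: number) {
    if (!Number.isInteger(permits) || permits < 1) {
      throw new RangeError("permits must be a positive integer");
    }
    this.available = permits;
  }

  acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) {
      // Direct handoff: the permit passes straight to the oldest waiter, so
      // the counter never rises and a late caller cannot jump the queue.
      next();
    } else {
      this.available++;
    }
  }

  // Runs the task while holding a permit; the permit is released even if
  // the task throws.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}
```

With one permit and three queued tasks, the second and third start in submission order only after the first finishes, which is the ordering a FIFO test would pin down.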
Review focus
`Retry-After` cap; the transient-vs-permanent classification boundaries.

Greptile Summary
This PR introduces two cross-cutting primitives (a bounded-retry helper with exponential backoff and full jitter, and a FIFO counting semaphore) plus a download retry classification policy and four new `DownloadsConfig` knobs. `DownloadsService` is refactored to wire these together: concurrent downloads are now capped by a semaphore, transient failures are retried with `Retry-After` support, and stale rows from prior runs are actively re-driven instead of bulk-failed.

- `retry.ts` / `semaphore.ts`: New standalone primitives with injected `sleep`/`random` for deterministic testing; `retry` validates `maxAttempts ≥ 1` eagerly and uses AWS full-jitter backoff; `Semaphore` hands permits directly to FIFO waiters and releases correctly on throw.
- `retry-policy.ts`: `isTransientDownloadError` correctly classifies 5xx/429/network `TypeError`/`TimeoutError`/`IncompleteDownloadError` as transient and manual `AbortError` as permanent; `downloadRetryAfterMs` parses both seconds and HTTP-date `Retry-After` headers.
- `downloads.service.ts`: Adds `resumeStaleDownloads()` with destPath deduplication, a `reenqueued` guard to block the watcher from duplicating resuming stubs, and retry-wrapped `downloadWithRetry`; the `active` Set check and add happen with no `await` in between, so there is no TOCTOU race in the duplicate-destPath guard.

Confidence Score: 5/5
Safe to merge. The new primitives are standalone, well-tested, and the service wiring is correct.
The retry helper, semaphore, and retry policy are all independently unit-tested with injected dependencies. AbortError is correctly classified as permanent. The FIFO semaphore releases permits correctly on throw. The active/reenqueued guards in DownloadsService are set synchronously before any await, preventing TOCTOU races. All issues raised in prior review rounds have been addressed.
No files require special attention.
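The classification boundaries summarized above could look roughly like the sketch below. `HttpStatusError` and `parseRetryAfterMs` are hypothetical stand-ins (the PR's `downloadRetryAfterMs` signature and its HTTP error type are not shown), and the shape of `IncompleteDownloadError` is assumed; only the transient/permanent split and the two `Retry-After` formats come from the PR text.

```typescript
// Hypothetical error carrying the HTTP status of a failed peer response.
class HttpStatusError extends Error {
  readonly status: number;
  constructor(status: number) {
    super(`HTTP ${status}`);
    this.name = "HttpStatusError";
    this.status = status;
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof reliable
  }
}

// Name taken from the PR; the shape here is assumed.
class IncompleteDownloadError extends Error {
  constructor(message = "stream ended before the expected size") {
    super(message);
    this.name = "IncompleteDownloadError";
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

function isTransientDownloadError(err: unknown): boolean {
  if (err instanceof IncompleteDownloadError) return true;
  if (err instanceof HttpStatusError) {
    // 5xx and 429 are worth retrying; every other 4xx is permanent.
    return err.status === 429 || err.status >= 500;
  }
  if (err instanceof Error) {
    if (err.name === "AbortError") return false; // manual cancel: do not retry
    if (err.name === "TimeoutError") return true;
    if (err instanceof TypeError) return true; // fetch reports network failures as TypeError
  }
  return false; // unknown errors default to permanent
}

// Retry-After is either delay-seconds or an HTTP-date; returns milliseconds
// from nowMs, or undefined when the header is absent or unparseable.
function parseRetryAfterMs(value: string | null, nowMs: number): number | undefined {
  if (value === null) return undefined;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  return Math.max(0, dateMs - nowMs); // a date in the past means retry now
}
```

Defaulting unknown errors to permanent keeps a bug in the download path from being retried `maxDownloadAttempts` times before it surfaces.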
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[processTorrentFile] --> B{reenqueued?}
    B -- yes --> Z1[skip]
    B -- no --> C{file exists?}
    C -- no --> Z2[skip]
    C -- yes --> D[parse stub]
    D --> E{active.has destPath?}
    E -- yes --> Z3[skip duplicate]
    E -- no --> F[create DB row]
    F --> G[runDownload]
    R[resumeStaleDownloads] --> S[listStaleDownloads]
    S --> T[dedupe by destPath]
    T --> U[claim reenqueued set]
    U --> G
    G --> H{active.has destPath?}
    H -- yes --> Z4[return]
    H -- no --> I[active.add]
    I --> J[semaphore.run downloadWithRetry]
    J --> K{peer found?}
    K -- no --> Z5[markFailed]
    K -- yes --> L[retry: incrementAttempts + downloadFile]
    L --> M{success?}
    M -- yes --> N[unlink + triggerImport + markImportQueued + reenqueued.delete]
    M -- transient --> O[sleep jitter or Retry-After]
    O --> L
    M -- permanent --> P[markFailed]
    J --> Q[finally: active.delete]

Reviews (3): Last reviewed commit: "fix: address retry review feedback"
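Two steps in the flowchart, "dedupe by destPath" and the `active.has` / `active.add` guard, are sketched below under stated assumptions. `StaleDownload`, `dedupeByDestPath`, and `runExclusive` are hypothetical names, and keeping the first-listed row per destination is an assumption; the PR text only says that one row per destination is resumable and the superseded duplicates are marked failed.

```typescript
// Hypothetical row shape; the real downloads row has more columns.
interface StaleDownload {
  id: number;
  destPath: string;
}

// Splits stale rows into one resumable row per destPath and the duplicates
// that would be marked failed. Assumption: the first row listed wins.
function dedupeByDestPath(rows: readonly StaleDownload[]): {
  resumable: StaleDownload[];
  superseded: StaleDownload[];
} {
  const seen = new Set<string>();
  const resumable: StaleDownload[] = [];
  const superseded: StaleDownload[] = [];
  for (const row of rows) {
    if (seen.has(row.destPath)) {
      superseded.push(row);
    } else {
      seen.add(row.destPath);
      resumable.push(row);
    }
  }
  return { resumable, superseded };
}

const active = new Set<string>();

// Returns false without running `work` when destPath is already in flight.
// The has() check and add() run back to back with no await between them, so
// two callers cannot both pass the check for the same destination.
async function runExclusive(
  destPath: string,
  work: () => Promise<void>,
): Promise<boolean> {
  if (active.has(destPath)) return false;
  active.add(destPath);
  try {
    await work();
    return true;
  } finally {
    active.delete(destPath); // always release the claim, even on failure
  }
}
```

Because JavaScript runs each synchronous stretch to completion, the guard needs no lock; it only has to avoid yielding between the check and the claim.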