feat: resume interrupted peer downloads via HTTP Range (#13 2/5) #17
Merged
roziscoding merged 5 commits into feat/harden-peer-downloads/1-range-serving on Jun 6, 2026
downloadFile now detects an existing .part file and sends Range: bytes=<size>-, validating the peer's 206 + Content-Range against the persisted expected size before appending. On 200 (range ignored), a Content-Range mismatch, or 416 it discards the stale .part and restarts from byte 0, emitting a restart progress event. The write path uses a node:fs FileHandle (append/write) with datasync at checkpoints, and the .part is preserved on error so the next attempt can resume. A truncated stream throws a retryable IncompleteDownloadError. Refs #13.
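The branching described above can be sketched as a pure function. This is only an illustration under stated assumptions, not the code in peer.ts: `parseContentRange`, `decideResume`, and the `ResumeDecision` shape are hypothetical names, and the sketch assumes any `Content-Range` mismatch (start or total) leads to a restart.

```typescript
// Hypothetical helper names; a sketch of the resume branching, not the project's implementation.
type ResumeDecision =
  | { action: "append"; offset: number }
  | { action: "restart"; reason: "range_ignored" | "content_range_mismatch" | "range_not_satisfiable" }
  | { action: "fail"; status: number };

interface ContentRange {
  start: number;
  end: number;
  total: number;
}

// Parses "bytes <start>-<end>/<total>"; returns null for anything else (including "*" forms).
function parseContentRange(header: string | null): ContentRange | null {
  if (header === null) return null;
  const m = /^bytes (\d+)-(\d+)\/(\d+)$/.exec(header.trim());
  if (!m) return null;
  return { start: Number(m[1]), end: Number(m[2]), total: Number(m[3]) };
}

// Decides what to do with the response to a `Range: bytes=<partSize>-` request.
function decideResume(
  status: number,
  contentRange: string | null,
  partSize: number,
  expectedBytes: number,
): ResumeDecision {
  if (status === 416) return { action: "restart", reason: "range_not_satisfiable" };
  if (status === 200) return { action: "restart", reason: "range_ignored" };
  if (status === 206) {
    const range = parseContentRange(contentRange);
    // Assumption: a missing or unparsable header is treated like a mismatch.
    if (range !== null && range.start === partSize && range.total === expectedBytes) {
      return { action: "append", offset: partSize };
    }
    return { action: "restart", reason: "content_range_mismatch" };
  }
  // Any other status is surfaced to the caller as an error.
  return { action: "fail", status };
}
```

Keeping the decision free of I/O makes each response branch (206, 200, 416, mismatch) easy to cover with a plain unit test.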
This was referenced Jun 6, 2026
…5) (#18)

* feat: add retry, semaphore, and download concurrency/retry config

  Add a generic retry() helper (bounded attempts, exponential backoff with full jitter, optional Retry-After override, injectable sleep/random) and a download retry classifier (transient: network/timeout/5xx/429/incomplete stream; permanent: non-429 4xx and others). Add a FIFO async Semaphore. Extend DownloadsConfig with maxConcurrentDownloads and retry knobs (all defaulted so existing configs keep parsing). Primitives are wired into DownloadsService in a later change. Refs #13.

* feat: track download attempts and expose stale rows for re-drive (#13 4/5) (#19)

* feat: track download attempts and expose stale rows for re-drive

  Add an attempts column to the downloads table (additive migration) and repository methods: incrementAttempts, markResumeReset (reset downloadedBytes and record the resume-from-zero transition), and listStaleDownloads (returns stale downloading rows without mutating them, for active startup re-enqueue). reconcileStaleDownloads is kept as the fallback for when downloads is unconfigured. Refs #13.

* feat: bound, retry, and resume peer downloads end-to-end (#13 5/5) (#20)

* feat: bound, retry, and resume peer downloads end-to-end

  Rewire DownloadsService around a shared Semaphore (maxConcurrentDownloads), a retry loop (bounded backoff+jitter, attempts tracked, transient vs permanent classification, Retry-After honored), and resume: the restart progress event persists via markResumeReset, and an active/reenqueued dedupe prevents duplicate rows. On startup, index.ts re-drives stale downloading rows with resumeStaleDownloads() before the watcher scans, falling back to reconcileStaleDownloads() when downloads is unconfigured. Closes #13.

* fix: harden startup re-enqueue dedupe (review feedback)

  - Dedupe stale downloading rows by destPath before re-driving: only one row per destination is resumable (they share the same .part), so mark the superseded duplicates failed instead of letting the second silently early-return in runDownload and stay stuck in downloading.
  - Release the reenqueued claim on successful resume (stub already unlinked, so no scan race) so a later legitimate re-drop of the same torrent filename is not silently skipped for the rest of the process.

  Refs #13.

* fix: address retry review feedback
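The retry primitives mentioned in the commits above (bounded attempts, exponential backoff with full jitter, injectable sleep/random, transient vs permanent classification) could look roughly like the following. This is a sketch under assumptions, not the project's retry() helper: all names, the option shape, and the status-only classifier are hypothetical, and the Retry-After override is left out for brevity.

```typescript
// Hypothetical names and option shape; illustrates the technique only.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  sleep?: (ms: number) => Promise<void>; // injectable for tests
  random?: () => number; // injectable for tests, must return a value in [0, 1)
  isTransient?: (err: unknown) => boolean;
}

// Full jitter: a uniform delay in [0, min(maxDelayMs, baseDelayMs * 2^(attempt - 1))).
function fullJitterDelayMs(
  attempt: number,
  baseDelayMs: number,
  maxDelayMs: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** (attempt - 1));
  return Math.floor(random() * ceiling);
}

// Status-only part of the classification: 5xx and 429 are transient, other 4xx are permanent.
function isTransientStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Runs fn until it succeeds, the error is permanent, or maxAttempts is reached.
async function retry<T>(fn: (attempt: number) => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const random = opts.random ?? Math.random;
  const isTransient = opts.isTransient ?? (() => true);
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      if (attempt >= opts.maxAttempts || !isTransient(err)) throw err;
      await sleep(fullJitterDelayMs(attempt, opts.baseDelayMs, opts.maxDelayMs, random));
    }
  }
}
```

Injecting `sleep` and `random` keeps tests deterministic and instant, which is presumably why the commit message calls them out.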
4a3cb0f into feat/harden-peer-downloads/1-range-serving
6 checks passed
Stack 2/5 for #13. Base: `feat/harden-peer-downloads/1-range-serving` (#16). Consumes the range-serving from PR 1.

What this PR does
Rewrites `PeerConnector.downloadFile()` so an interrupted download resumes from its existing `.part` file instead of restarting from byte 0.

- Detects an existing `.part`, sends `Range: bytes=<existingSize>-`, and validates the peer's `206` + `Content-Range` (start matches the offset; total matches the persisted expected size) before appending.
- Restarts from byte 0, discarding the stale `.part` and emitting a `restart` progress event, when the peer ignores the range (`200`), returns a mismatched `Content-Range` total, or returns `416`.
- Switches the write path from `Bun.file().writer()` (which truncates) to a `node:fs` `FileHandle` opened in append/write mode, with `datasync()` at progress checkpoints for durable resume bytes.
- Preserves the `.part` on error (no longer unlinks it) so the next attempt can resume; the atomic `rename` to the final path is retained, so a failed download never lands in the completed folder.
- A truncated stream throws `IncompleteDownloadError` (classified retryable in PR 3). A `206` to a non-range (fresh) request is rejected so a partial body is never renamed as the whole file.
- On resume, `downloadedBytes` is seeded from the local `.part` size.

Files
- `apps/backend/src/lib/servers/peer.ts`
- `apps/backend/src/lib/errors/IncompleteDownloadError.ts` (new)
- `apps/backend/src/modules/downloads/downloads.service.ts` (temporary `restart` no-op branch to keep typecheck green; superseded in PR 5)
- `apps/backend/src/__tests__/peer-download.test.ts`

Testing
6 new resume tests (append on 206, restart on 200/mismatch/416, `.part` preserved on incomplete stream, 206-for-fresh rejected) plus the 5 pre-existing download tests still pass.

Review focus
- 206/200/416/mismatch control flow and the `resuming` decision.
- `FileHandle` append-vs-write selection, `datasync` cadence (checkpoint-only, not per-chunk), and reader/handle cleanup on all paths.
- `response.body?.cancel()` is fired without awaiting on the restart path to avoid an already-closed-stream hang; correctness comes from the subsequent `unlink` + fresh fetch.

Greptile Summary
This PR rewrites `PeerConnector.downloadFile()` to resume interrupted downloads from an existing `.part` file via `Range` requests, rather than always restarting from byte 0. It also adds a typed `IncompleteDownloadError`, retry integration, stale-download re-drive, and a suite of 6 new resume tests covering all the key HTTP response branches.

- Checks the existing `.part` size → sends `Range: bytes=N-` → validates 206/`Content-Range` start and total before appending; falls back to a fresh download on 200 (range ignored), `Content-Range` total mismatch, or 416; rejects non-ok non-416 responses immediately with `FetchError`.
- Switches from `Bun.file().writer()` (truncates) to a `node:fs` `FileHandle` opened in `'a'` (append) or `'w'` (fresh), with `datasync()` at progress checkpoints; `open()` is acquired before `getReader()` so a failed `open()` never leaks a reader lock.
- `.part` is preserved on error (enables resume on retry); `rename()` to `destPath` is only reached after a complete byte-count match, so a partial stream never lands in the completed folder.

Confidence Score: 5/5
The PR is safe to merge. The resume control flow is well-guarded, the two issues flagged in the previous review round are both fixed, and every key branch is covered by a corresponding test.
The 206/200/416/non-ok branching is correct and exhaustive: non-ok resumes throw FetchError before any file I/O, the fresh-200 restart discards and re-downloads cleanly, and the 206-for-non-range request is explicitly rejected. FileHandle acquisition now precedes getReader() so a failed open cannot strand a locked reader. The .part is preserved on all error paths, and the atomic rename is only reached after a full byte-count match.
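The ordering and cleanup points in this summary can be illustrated with a small write-path sketch. It is not the code in peer.ts: `writeBodyToPart`, its parameters, the 1 MiB checkpoint cadence, and the local `IncompleteDownloadError` class are assumptions made for the example. It opens the `FileHandle` before taking the reader, syncs only at checkpoints, leaves the `.part` in place on error, and renames only after a full byte-count match.

```typescript
import { open, rename } from "node:fs/promises";

// Hypothetical stand-in for the PR's typed error class.
class IncompleteDownloadError extends Error {}

async function writeBodyToPart(
  body: ReadableStream<Uint8Array>,
  partPath: string,
  destPath: string,
  expectedBytes: number,
  resumeOffset: number, // 0 means a fresh download
  checkpointBytes = 1024 * 1024, // assumed checkpoint cadence
): Promise<void> {
  // Open first: if this throws, no reader lock has been taken on the body yet.
  const handle = await open(partPath, resumeOffset > 0 ? "a" : "w");
  let written = resumeOffset;
  try {
    const reader = body.getReader();
    try {
      let sinceSync = 0;
      for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        // FileHandle.write may write fewer bytes than requested, so loop until the chunk is flushed.
        let chunkOffset = 0;
        while (chunkOffset < value.byteLength) {
          const { bytesWritten } = await handle.write(value, chunkOffset);
          chunkOffset += bytesWritten;
        }
        written += value.byteLength;
        sinceSync += value.byteLength;
        if (sinceSync >= checkpointBytes) {
          await handle.datasync(); // durable checkpoint, not per chunk
          sinceSync = 0;
        }
      }
      await handle.datasync();
    } finally {
      reader.releaseLock();
    }
  } finally {
    await handle.close();
  }
  // The .part is deliberately left on disk here so a later attempt can resume from it.
  if (written !== expectedBytes) {
    throw new IncompleteDownloadError(`received ${written} of ${expectedBytes} bytes`);
  }
  await rename(partPath, destPath);
}
```

Because the rename is the last step, a truncated stream leaves only the `.part` behind and never produces a file at the destination path.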
apps/backend/src/lib/servers/peer.ts — the expectedBytesSource label on the resume path and the schema check constraint (snapshot 0001) that would need updating if that label is ever corrected.
Comments Outside Diff (1)
apps/backend/src/lib/servers/peer.ts, lines 258-273: `expectedBytesSource` mislabelled on resume

Congratulations, you've invented `content_length` as a synonym for `content_range`. This is the kind of creative redefinition that makes observability dashboards completely useless.

On the resume path, `expectedBytes` is parsed from `Content-Range`, not `Content-Length`. Yet `expectedBytesSource` is hardcoded to `'content_length'` in both the span attribute (line 261) and the `onProgress` `headers` event (line 272). A caller that later adds a `'content_range'` source variant will silently get the wrong value today.

Consider either extending the source type to include `'content_range'` and branching on `resuming`, or at minimum documenting that the field is intentionally coarse for now.
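One way to read the first suggestion is sketched below. The type and function names are hypothetical; only the `'content_length'` value and the `resuming` flag come from the review comment, and the `'content_range'` variant is the proposed addition.

```typescript
// Hypothetical type and helper names; 'content_range' is the variant the review proposes.
type ExpectedBytesSource = "content_length" | "content_range";

// On resume the expected size is parsed from Content-Range; on a fresh download, from Content-Length.
function expectedBytesSourceFor(resuming: boolean): ExpectedBytesSource {
  return resuming ? "content_range" : "content_length";
}
```

Both the span attribute and the `headers` progress event could then use the same helper. As noted earlier in the review, a schema check constraint on this label would also need the new value.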
Reviews (4): Last reviewed commit: "fix: close peer download handle on reade..."