feat: track download attempts and expose stale rows for re-drive (#13 4/5) #19
Merged
roziscoding merged 2 commits into feat/harden-peer-downloads/3-retry-concurrency on Jun 6, 2026
Add an attempts column to the downloads table (additive migration) and repository methods: incrementAttempts, markResumeReset (reset downloadedBytes and record the resume-from-zero transition), and listStaleDownloads (returns stale downloading rows without mutating them, for active startup re-enqueue). reconcileStaleDownloads is kept as the fallback for when downloads is unconfigured. Refs #13.
This was referenced Jun 6, 2026
* feat: bound, retry, and resume peer downloads end-to-end

  Rewire DownloadsService around a shared Semaphore (maxConcurrentDownloads), a retry loop (bounded backoff+jitter, attempts tracked, transient vs permanent classification, Retry-After honored), and resume: the restart progress event persists via markResumeReset, and an active/reenqueued dedupe prevents duplicate rows. On startup, index.ts re-drives stale downloading rows with resumeStaleDownloads() before the watcher scans, falling back to reconcileStaleDownloads() when downloads is unconfigured. Closes #13.

* fix: harden startup re-enqueue dedupe (review feedback)

  - Dedupe stale downloading rows by destPath before re-driving: only one row per destination is resumable (they share the same .part), so mark the superseded duplicates failed instead of letting the second silently early-return in runDownload and stay stuck in downloading.
  - Release the reenqueued claim on successful resume (stub already unlinked, so no scan race) so a later legitimate re-drop of the same torrent filename is not silently skipped for the rest of the process.

  Refs #13.
Merged commit 7c8844e into feat/harden-peer-downloads/3-retry-concurrency; 6 checks passed.
roziscoding added a commit that referenced this pull request on Jun 6, 2026

…5) (#18)

* feat: add retry, semaphore, and download concurrency/retry config

  Add a generic retry() helper (bounded attempts, exponential backoff with full jitter, optional Retry-After override, injectable sleep/random) and a download retry classifier (transient: network/timeout/5xx/429/incomplete stream; permanent: non-429 4xx and others). Add a FIFO async Semaphore. Extend DownloadsConfig with maxConcurrentDownloads and retry knobs (all defaulted so existing configs keep parsing). Primitives are wired into DownloadsService in a later change. Refs #13.

* feat: track download attempts and expose stale rows for re-drive (#13 4/5) (#19)
* feat: track download attempts and expose stale rows for re-drive
* feat: bound, retry, and resume peer downloads end-to-end (#13 5/5) (#20)
* feat: bound, retry, and resume peer downloads end-to-end
* fix: harden startup re-enqueue dedupe (review feedback)
* fix: address retry review feedback
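The retry() helper named in the first commit above combines bounded attempts, exponential backoff with full jitter, and an optional Retry-After override. A minimal sketch of that shape follows. The option names (`maxAttempts`, `baseDelayMs`, `maxDelayMs`, `isTransient`, `retryAfterMs`) and the `backoffDelayMs` helper are assumptions made for illustration, not the PR's actual signature.

```typescript
// Hypothetical option names; the PR's real retry() signature is not shown on this page.
interface RetryOptions {
  maxAttempts: number; // total tries, including the first
  baseDelayMs: number; // backoff ceiling before the first retry
  maxDelayMs: number; // upper bound on any computed ceiling
  isTransient: (err: unknown) => boolean; // permanent errors are rethrown at once
  retryAfterMs?: (err: unknown) => number | undefined; // server hint, if any
  sleep?: (ms: number) => Promise<void>; // injectable for tests
  random?: () => number; // injectable for tests, expected in [0, 1)
}

// Full jitter: choose uniformly below min(maxDelayMs, baseDelayMs * 2^(attempt - 1)).
function backoffDelayMs(
  attempt: number,
  baseDelayMs: number,
  maxDelayMs: number,
  random: () => number,
): number {
  const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** (attempt - 1));
  return Math.floor(random() * ceiling);
}

async function retry<T>(
  fn: (attempt: number) => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const random = opts.random ?? Math.random;
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      // Stop when attempts are exhausted or the error is classified as permanent.
      if (attempt >= opts.maxAttempts || !opts.isTransient(err)) throw err;
      // In this sketch a Retry-After hint replaces the computed delay as-is.
      const hinted = opts.retryAfterMs?.(err);
      await sleep(
        hinted ?? backoffDelayMs(attempt, opts.baseDelayMs, opts.maxDelayMs, random),
      );
    }
  }
}
```

Injecting `sleep` and `random` keeps tests deterministic and instant, which is presumably why the commit lists them as injectable.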
roziscoding added a commit that referenced this pull request on Jun 6, 2026

* feat: resume interrupted peer downloads via HTTP Range

  downloadFile now detects an existing .part file and sends Range: bytes=<size>-, validating the peer's 206 + Content-Range against the persisted expected size before appending. On 200 (range ignored), a Content-Range mismatch, or 416 it discards the stale .part and restarts from byte 0, emitting a restart progress event. The write path uses a node:fs FileHandle (append/write) with datasync at checkpoints, and the .part is preserved on error so the next attempt can resume. A truncated stream throws a retryable IncompleteDownloadError. Refs #13.

* feat: retry, semaphore, and download concurrency/retry config (#13 3/5) (#18)
* feat: add retry, semaphore, and download concurrency/retry config
* feat: track download attempts and expose stale rows for re-drive (#13 4/5) (#19)
* feat: track download attempts and expose stale rows for re-drive
* feat: bound, retry, and resume peer downloads end-to-end (#13 5/5) (#20)
* feat: bound, retry, and resume peer downloads end-to-end
* fix: harden startup re-enqueue dedupe (review feedback)
* fix: address retry review feedback
* fix: guard non-ok resume responses
* fix: avoid leaked peer download reader lock
* fix: close peer download handle on reader failure
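The resume rules in the first commit above (append on a matching 206, restart on 200, 416, or a Content-Range mismatch) reduce to a small decision function. The sketch below is illustrative only: `decideResume`, `parseContentRange`, and the `"error"` outcome for other statuses are hypothetical names and choices, not taken from the PR's downloadFile.

```typescript
// Hypothetical helper names; this models the decision, not the PR's actual I/O code.
type ResumeDecision = "append" | "restart" | "error";

interface ContentRange {
  start: number;
  end: number;
  total: number;
}

// Parses "bytes <start>-<end>/<total>"; returns null for anything else.
function parseContentRange(header: string | null): ContentRange | null {
  const m = /^bytes (\d+)-(\d+)\/(\d+)$/.exec(header ?? "");
  return m ? { start: Number(m[1]), end: Number(m[2]), total: Number(m[3]) } : null;
}

function decideResume(
  status: number,
  contentRange: string | null,
  partSize: number, // bytes already present in the .part file
  expectedSize: number, // persisted expected total size
): ResumeDecision {
  if (status === 206) {
    const range = parseContentRange(contentRange);
    const matches =
      range !== null && range.start === partSize && range.total === expectedSize;
    // A mismatch means the .part cannot be trusted: discard it and start over.
    return matches ? "append" : "restart";
  }
  // 200: the peer ignored the Range header. 416: the requested offset is not satisfiable.
  if (status === 200 || status === 416) return "restart";
  // Assumption: any other status is surfaced as an error for the retry classifier.
  return "error";
}
```

For example, with 100 bytes on disk and an expected size of 1000, a `206` carrying `bytes 100-999/1000` appends, while a `206` carrying `bytes 0-999/1000` restarts.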
roziscoding added a commit that referenced this pull request on Jun 6, 2026

* feat: serve HTTP byte ranges on peer file endpoint

  Parse the Range header in the peer file route and serve 206 Partial Content with Content-Range, 416 Range Not Satisfiable for unsatisfiable ranges, and Accept-Ranges: bytes on full responses. streamFile now returns a discriminated result (full/partial/unsatisfiable) resolved against the file size, streaming only the requested slice. Foundation for resumable peer downloads (#13).

* feat: resume interrupted peer downloads via HTTP Range (#13 2/5) (#17)
* feat: resume interrupted peer downloads via HTTP Range
* feat: retry, semaphore, and download concurrency/retry config (#13 3/5) (#18)
* feat: add retry, semaphore, and download concurrency/retry config
* feat: track download attempts and expose stale rows for re-drive (#13 4/5) (#19)
* feat: track download attempts and expose stale rows for re-drive
* feat: bound, retry, and resume peer downloads end-to-end (#13 5/5) (#20)
* feat: bound, retry, and resume peer downloads end-to-end
* fix: harden startup re-enqueue dedupe (review feedback)
* fix: address retry review feedback
* fix: guard non-ok resume responses
* fix: avoid leaked peer download reader lock
* fix: close peer download handle on reader failure
* test: use Bun.write instead of node:fs writeFile in range test
* chore: sync bun.lock with hono bump from main
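The first commit above describes resolving a Range header against the file size into a full, partial, or unsatisfiable result. A rough sketch of such a resolver is below. `resolveRange` and the `RangeResult` field names are hypothetical, and the sketch assumes single-range requests only, with malformed or multi-range headers treated as a request for the full file.

```typescript
// Hypothetical names; models only single "bytes=" ranges.
type RangeResult =
  | { kind: "full" }
  | { kind: "partial"; start: number; end: number } // inclusive byte offsets
  | { kind: "unsatisfiable" };

function resolveRange(header: string | undefined, size: number): RangeResult {
  if (!header) return { kind: "full" };
  const m = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!m) return { kind: "full" }; // malformed or multi-range: ignore the header
  const first = m[1] ?? "";
  const last = m[2] ?? "";
  if (first === "" && last === "") return { kind: "full" };

  let start: number;
  let end: number;
  if (first === "") {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const suffix = Number(last);
    if (suffix === 0) return { kind: "unsatisfiable" };
    start = Math.max(0, size - suffix);
    end = size - 1;
  } else {
    start = Number(first);
    // An end before the start is an invalid spec, so the header is ignored.
    if (last !== "" && Number(last) < start) return { kind: "full" };
    end = last === "" ? size - 1 : Math.min(Number(last), size - 1);
  }
  if (start >= size) return { kind: "unsatisfiable" };
  return { kind: "partial", start, end };
}
```

Per the commit message, a route would then map `full` to a normal response with `Accept-Ranges: bytes`, `partial` to 206 with `Content-Range`, and `unsatisfiable` to 416.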
Stack 4/5 for #13. Base: feat/harden-peer-downloads/3-retry-concurrency (#18).

What this PR does
Adds the persistence the startup re-enqueue and retry accounting need (consumed in PR 5). No call sites change here, so it ships independently with green tests.
- `attempts` column to the `downloads` table (additive Drizzle migration `0001_tearful_the_fallen.sql`: `ALTER TABLE downloads ADD attempts integer DEFAULT 0 NOT NULL`). Migrations auto-apply at DB open, so the test suite exercises it on fresh DBs.
- `DownloadsRepository` gains:
  - `incrementAttempts(id)`: atomic `attempts = attempts + 1`, returns the new count.
  - `markResumeReset(id)`: resets `downloadedBytes` to 0 and records the resume-from-zero transition (status stays `downloading`).
  - `listStaleDownloads()`: returns stale `downloading` rows without mutating them, for active re-drive.
- `reconcileStaleDownloads()` is intentionally retained as the fallback for when `downloads` is unconfigured.

Files

- `apps/backend/src/database/schema.ts`
- `apps/backend/drizzle/0001_tearful_the_fallen.sql` + `drizzle/meta/*` (generated)
- `apps/backend/src/modules/downloads/downloads.repository.ts`
- `apps/backend/src/__tests__/database.test.ts` (+2)

Testing
2 new repository tests (attempts increment + resume reset; stale listing without mutation). Full suite green.
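To make the semantics of the three repository methods concrete, here is an in-memory model of them. It is a sketch under stated assumptions: `InMemoryDownloadsRepository`, the row shape, and the error text are hypothetical, and the real repository issues Drizzle queries against SQLite rather than touching a `Map`.

```typescript
// In-memory model only. Class name, row shape, and error text are hypothetical.
type DownloadStatus = "downloading" | "completed" | "failed";

interface DownloadRow {
  id: number;
  status: DownloadStatus;
  downloadedBytes: number;
  attempts: number;
  error: string | null;
}

class InMemoryDownloadsRepository {
  private rows = new Map<number, DownloadRow>();

  insert(row: DownloadRow): void {
    this.rows.set(row.id, { ...row });
  }

  get(id: number): DownloadRow | undefined {
    const row = this.rows.get(id);
    return row ? { ...row } : undefined;
  }

  // Models `attempts = attempts + 1` and returns the new count.
  // Returns 0 for an unknown id, the behavior flagged in review.
  incrementAttempts(id: number): number {
    const row = this.rows.get(id);
    if (!row) return 0;
    row.attempts += 1;
    return row.attempts;
  }

  // Resets progress and records the transition; status is left untouched.
  markResumeReset(id: number): void {
    const row = this.rows.get(id);
    if (!row) return;
    row.downloadedBytes = 0;
    row.error = "resume restarted from byte 0"; // placeholder wording
  }

  // Read-only: returns copies so callers cannot mutate stored rows.
  listStaleDownloads(): DownloadRow[] {
    return Array.from(this.rows.values())
      .filter((row) => row.status === "downloading")
      .map((row) => ({ ...row }));
  }
}
```

The two tests described above map onto this model directly: one increments and resets a row and reads it back, the other lists stale rows and checks that the stored rows are unchanged afterwards.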
Review focus
`incrementAttempts` uses a parameterized Drizzle `` sql`${downloads.attempts} + 1` `` column reference (no injection); `listStaleDownloads` is read-only.

Greptile Summary
[Linus Torvalds Mode]
Congratulations, you wrote a stack-PR that is actually coherent — I almost had to check if someone else wrote it. Fine, it moves, it doesn't explode, let's talk about what it actually does.
This PR (4/5 of the harden-peer-downloads stack) adds the persistence layer that the upcoming retry-and-re-enqueue machinery in PR 5 will consume. The core deliverables are:
- `ALTER TABLE downloads ADD attempts INTEGER DEFAULT 0 NOT NULL`: safe for existing rows, no drama.
- `incrementAttempts(id)`: Atomic `attempts = attempts + 1` via a parameterized Drizzle `sql` expression; returns the new count. Called inside the `retry` loop so each attempt is tracked.
- `markResumeReset(id)`: Resets `downloadedBytes` to 0 and records the resume-from-zero transition in `error`; `status` stays `downloading`, cleared on eventual `markCompleted`.
- `listStaleDownloads()`: Read-only `WHERE status = 'downloading'`, consumed by the new `resumeStaleDownloads()` in the service.
- `resumeStaleDownloads()` in `DownloadsService`: Deduplicates by `destPath` (marking superseded rows failed), claims all resumable stubs in `reenqueued` synchronously (before the watcher starts), then fires re-drives in the background behind the semaphore. `reenqueued` is only cleared on success so a crashed re-drive keeps its claim for the next restart.
- `index.ts`: `reconcileStaleDownloads` (mark-all-failed fallback) moved to the `else` branch when `config.downloads` is absent; `resumeStaleDownloads` runs when it is present.

The two previously-flagged concerns on `listStaleDownloads` (no time-based predicate) and `incrementAttempts` (silent 0 on missing ID) remain open. Go read those threads again if you haven't.

Confidence Score: 5/5
[Linus Torvalds Mode] Two open prior-thread concerns exist and haven't been resolved, which is annoying, but they don't block this additive PR from landing safely — no new P0/P1 issues found, migration is safe, tests pass.
I'm not handing out 5s like candy, but this one earns it. The prior thread issues (silent-zero return from `incrementAttempts` on a missing ID, and `listStaleDownloads` having no time-based staleness predicate) are already tracked and the author clearly knows about them. Within the scope of this PR: the migration is additive with a default, the `reenqueued` guard correctly prevents watcher duplication, the `id: -1` fallback is unreachable when `repo` is defined, and the fire-and-forget re-drive pattern is both intentional and properly handled. No new correctness or data-integrity issues discovered.

`apps/backend/src/modules/downloads/downloads.repository.ts`: the two already-flagged issues live here; if you're feeling brave, fix them before PR 5 makes callers depend on the lying return value.

Important Files Changed
- `attempts INTEGER DEFAULT 0 NOT NULL`: safe for existing rows, no breakpoints needed.
- `attempts` integer column with `notNull().default(0)`: matches the migration exactly.
- `incrementAttempts`, `markResumeReset`, and `listStaleDownloads`; `incrementAttempts` returns 0 silently on nonexistent ID (previously flagged); `listStaleDownloads` has no time-based staleness predicate (previously flagged).
- `Semaphore`, retry loop, `resumeStaleDownloads`, `runDownload`, `downloadWithRetry`; fire-and-forget re-drive pattern is intentional and guarded by the `reenqueued` set; `id: -1` fallback is safe because `repo` is undefined whenever `created` is undefined.
- `reconcileStaleDownloads` into the `else` branch (no downloads config) and adds `resumeStaleDownloads` when config is present: correct conditional logic.
- `incrementAttempts`/`markResumeReset` round-trip and `listStaleDownloads` non-mutation: both thorough and correct.

Sequence Diagram
sequenceDiagram
    participant idx as index.ts
    participant svc as DownloadsService
    participant repo as DownloadsRepository
    participant db as SQLite DB
    participant peer as PeerConnector

    idx->>svc: resumeStaleDownloads()
    svc->>repo: listStaleDownloads()
    repo->>db: "SELECT WHERE status='downloading'"
    db-->>repo: stale rows
    repo-->>svc: DownloadRecord[]
    svc->>repo: markFailed(id, 'superseded') [duplicates]
    svc->>svc: reenqueued.add(torrentFilename)
    svc-->>idx: resumable.length (fire-and-forget started)
    idx->>idx: BlackholeWatcher.start()

    par background re-drives
        svc->>svc: runDownload(record)
        svc->>svc: semaphore.run(...)
        svc->>repo: incrementAttempts(id)
        repo->>db: "UPDATE attempts = attempts + 1"
        svc->>peer: downloadFile(...)
        peer-->>svc: onProgress(restart)
        svc->>repo: markResumeReset(id)
        repo->>db: "UPDATE downloadedBytes=0, error=..."
        peer-->>svc: onProgress(completed)
        svc->>repo: markCompleted(id, bytes)
        svc->>repo: markImportQueued(id)
        svc->>svc: reenqueued.delete(torrentFilename)
    end

    idx->>svc: processTorrentFile(filePath, filename)
    alt filename in reenqueued
        svc-->>idx: skip (stub owned by re-enqueue)
    else normal processing
        svc->>repo: create(...)
        svc->>svc: runDownload(record)
    end

Reviews (2): Last reviewed commit: "feat: bound, retry, and resume peer down..."
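The startup flow in the diagram (list stale rows, mark duplicates per `destPath` as superseded, claim the rest in `reenqueued` before the watcher starts) can be sketched as pure functions. Everything named here (`StaleRow`, `planResume`, `claimAll`, `shouldSkip`) is hypothetical, and the rule that the first row seen for a destination wins is an assumption: the summary does not say which duplicate the service keeps.

```typescript
// Hypothetical row shape and helper names; only the fields the dedupe needs.
interface StaleRow {
  id: number;
  destPath: string;
  torrentFilename: string;
}

interface ResumePlan {
  resumable: StaleRow[]; // one row per destPath, to be re-driven
  superseded: StaleRow[]; // duplicates, to be marked failed
}

// Assumption: the first row seen for a destPath wins.
function planResume(rows: StaleRow[]): ResumePlan {
  const byDest = new Map<string, StaleRow>();
  const superseded: StaleRow[] = [];
  for (const row of rows) {
    if (byDest.has(row.destPath)) superseded.push(row);
    else byDest.set(row.destPath, row);
  }
  return { resumable: Array.from(byDest.values()), superseded };
}

// Claims are taken synchronously, before any watcher scan could run.
function claimAll(plan: ResumePlan, reenqueued: Set<string>): void {
  for (const row of plan.resumable) reenqueued.add(row.torrentFilename);
}

// Watcher-side check: skip stubs that a re-drive already owns.
function shouldSkip(filename: string, reenqueued: Set<string>): boolean {
  return reenqueued.has(filename);
}
```

Releasing a claim after a successful re-drive is then a plain `reenqueued.delete(torrentFilename)`, which lets a later re-drop of the same filename go through normal processing.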