Add long-running tasks and machine lifecycle blueprint #2416
Merged
New guide covering the interaction between auto_stop_machines and long-running in-process work: how the proxy decides to stop a machine, why background tasks are invisible to that decision, and two patterns (disable autostop with an in-app drain; split web and worker into separate process groups) to keep work from getting killed. Also covers kill_signal/kill_timeout semantics under autostop and other stop pathways. Adds the entry to the Background Jobs & Automation section of the blueprints index (with a NEW!! tag) and to the corresponding sidebar nav group.
Replace em dash separators with colons in the Picking a pattern table, and replace the em dash placeholder in the kill_signal Max column with n/a.
Replace prose references to 'blueprint(s)' with 'guide(s)' throughout the doc. Link paths under /docs/blueprints/ are unchanged.
Summary
New blueprint covering the interaction between `auto_stop_machines` and long-running in-process work. Documents how the Fly proxy decides to stop a machine, why background tasks are invisible to that decision, and two patterns to keep work from getting killed:
- Disable autostop and drain in-app, using `SIGTERM`/`kill_timeout`.
- Split web and worker into separate process groups.

Also covers `kill_signal`/`kill_timeout` semantics under autostop and other stop pathways (deploys, `fly machine stop`, host migrations), and a Common Problems section that addresses things like self-pings (they don't work) and workers not stopping on deploy.

Placement
- `blueprints/long-running-tasks.html.md`
- `blueprints/index.html.md` with a `NEW!!` tag, next to the work-queues, task-scheduling, and supercronic blueprints.
- `partials/_guides_nav.html.erb`.

Empirical backing
Every technical claim in the draft is backed by a live deployment test on Fly. Specifically:
- `auto_stop_machines = "off"` keeps machines up; the `SIGTERM`/`kill_timeout` graceful drain fires on manual stop, with the full drain window observed.
- With the process-group split, autostop stops the `web` tier while the `worker` tier stays untouched.
- Self-pings to the `<app>.fly.dev` hostname do not keep the machine alive. The proxy stops the machine within 5 to 10 minutes regardless. This is captured in the Common Problems section as "Why doesn't a self-ping keep my machine alive?"
- `kill_timeout` is confirmed to be honored under the autostop pathway (full drain window before `SIGKILL`).
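For reviewers who want a concrete picture of the in-app drain described above, here is a minimal Python sketch. It is not taken from the guide: the `Worker` class and `install_sigterm_drain` helper are hypothetical names, and it assumes the machine's `kill_signal` is `SIGTERM` and that any job in flight finishes within `kill_timeout`.

```python
import signal
import threading
import time


class Worker:
    """Hypothetical stand-in for an app's background work loop."""

    def __init__(self):
        # Set once a stop signal arrives; checked between jobs.
        self.stop_requested = threading.Event()
        self.completed = []

    def run(self, jobs, job_seconds=0.01):
        for job in jobs:
            if self.stop_requested.is_set():
                # Drain: take no new work once a stop has been requested.
                break
            time.sleep(job_seconds)  # placeholder for real work
            self.completed.append(job)


def install_sigterm_drain(worker):
    """Register a SIGTERM handler that asks the worker to stop.

    Must be called from the main thread (a requirement of signal.signal).
    """

    def handle(signum, frame):
        worker.stop_requested.set()

    signal.signal(signal.SIGTERM, handle)
```

The handler only sets a flag, so the job that is already running completes and the loop exits before picking up the next one. Whatever remains has to finish inside the `kill_timeout` window, since `SIGKILL` follows when that window ends.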