Skip to content

fix(daemon): run async kubo cleanup before process.exit on startup failure (issue #98)#99

Merged
Rinse12 merged 2 commits into
masterfrom
fix/daemon-graceful-exit-cleanup-98
Jun 13, 2026
Merged

fix(daemon): run async kubo cleanup before process.exit on startup failure (issue #98)#99
Rinse12 merged 2 commits into
masterfrom
fix/daemon-graceful-exit-cleanup-98

Conversation

@Rinse12

@Rinse12 Rinse12 commented Jun 13, 2026

Copy link
Copy Markdown
Member

Closes #98

Problem

The daemon registers an asyncExitHook (exit-hook) to kill kubo and destroy the RPC server on shutdown. exit-hook only runs async hooks for SIGINT/SIGTERM/beforeExit; an explicit process.exit() runs only synchronous hooks and prints the SYNCHRONOUS TERMINATION NOTICE.

The notice's real trigger is not the force-quit guard at daemon.ts:655 (by the second signal exit-hook's internal isCalled is already true, so its synchronous handler returns before the warning). It's the startup-failure path:

  1. The hook is registered (daemon.ts:600)
  2. keepKuboUp() spawns kubo (:688)
  3. createOrConnectRpc() (:689) throws — e.g. the Test flake: hardcoded kubo API ports in macOS ephemeral range cause intermittent 'address already in use' #87/Flaky daemon tests: bind-race retry predicate misses the daemon's 'PKC RPC port became occupied' fail-fast (lineage #87) #97 TOCTOU RPC-port race
  4. The throw unwinds to oclif's error handler (@oclif/core errors/handle.js), which calls process.exit(1)
  5. That skips the async cleanup → the notice prints, kubo is orphaned, and the daemon's state file is left behind

Fix (src/cli/commands/daemon.ts)

  • Extract the async exit-hook body into a named shutdownDaemon() and capture both the hook's unsubscribe handle and the fn (hoisted above the try).
  • Startup-failure catch: await shutdownDaemon() (kills kubo, destroys the RPC server, deletes the state file) and drop the hook before re-throwing, so oclif's process.exit() runs clean. No-ops if we failed before the hook was registered.
  • Force-quit guard: drop the hook before its deliberate process.exit() (belt-and-suspenders); kubo is still SIGKILLed by the emergency process.on("exit") handler.

Test

  • New guarded test hook PKC_CLI_TEST_FAIL_AFTER_KUBO_START throws at the top of createOrConnectRpc, after kubo is up — mirroring the real port race.
  • New daemon.test.ts case: "kills kubo (and prints no exit-hook termination notice) when startup fails after kubo is up" — asserts non-zero exit, no SYNCHRONOUS TERMINATION NOTICE, and kubo stopped (not orphaned).
  • Verified red before the fix (daemon printed the notice and orphaned kubo), green after.

Verification

  • npm run build && npm run build:test pass
  • Full daemon.test.ts (23 tests), update-install-restart-race, and daemon-kubo-restart-race suites pass

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved daemon startup reliability with proper cleanup of background processes if initialization fails
    • Enhanced shutdown signal handling to prevent process orphaning during termination
  • Tests

    • Added test coverage for daemon recovery during startup failures

…ilure (issue #98)

The daemon registers an asyncExitHook (exit-hook) to kill kubo and destroy the
RPC server on shutdown. exit-hook only runs async hooks for SIGINT/SIGTERM/
beforeExit; an explicit process.exit() runs only synchronous hooks and prints
"SYNCHRONOUS TERMINATION NOTICE".

The notice's real trigger is the startup-failure path, not the force-quit guard
at daemon.ts:655 (by the second signal exit-hook's internal isCalled is already
true, so its synchronous handler returns before the warning). When the hook is
registered (daemon.ts:600), kubo is spawned by keepKuboUp() (:688), and then
createOrConnectRpc() (:689) throws — e.g. the #87/#97 TOCTOU RPC-port race — the
throw unwinds to oclif's error handler (@oclif/core errors/handle.js), which
calls process.exit(1). That skips the async cleanup: the notice prints, kubo is
orphaned, and this daemon's state file is left behind.

Fix:
- Extract the async exit-hook body into a named shutdownDaemon() and capture
  both the hook's unsubscribe handle and the fn (hoisted above the try).
- In the startup-failure catch, await shutdownDaemon() (kills kubo, destroys the
  RPC server, deletes the state file) and drop the hook before re-throwing, so
  oclif's process.exit() runs clean. No-ops if we failed before the hook was
  registered.
- In the force-quit guard, drop the hook before its deliberate process.exit() as
  belt-and-suspenders; kubo is still SIGKILLed by the emergency "exit" handler.

Test:
- Add a guarded test hook PKC_CLI_TEST_FAIL_AFTER_KUBO_START that throws at the
  top of createOrConnectRpc, after kubo is up, mirroring the real port race.
- New daemon.test.ts case "kills kubo (and prints no exit-hook termination
  notice) when startup fails after kubo is up": asserts non-zero exit, no
  SYNCHRONOUS TERMINATION NOTICE, and kubo stopped (not orphaned). Verified red
  before the fix, green after.

Full daemon.test.ts (23), update-install-restart-race, and
daemon-kubo-restart-race suites pass.
@coderabbitai

coderabbitai Bot commented Jun 13, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@Rinse12, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 32 minutes and 32 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 821961f0-a2a7-4ee0-81e5-61d74980d6b7

📥 Commits

Reviewing files that changed from the base of the PR and between 7d94cdb and bf4778e.

📒 Files selected for processing (2)
  • src/cli/commands/daemon.ts
  • test/cli/daemon.test.ts
📝 Walkthrough

Walkthrough

The PR fixes issue #98 by refactoring the daemon's startup and shutdown flow to properly handle async kubo cleanup. The code captures both the async shutdown function and exit-hook unsubscribe handle, then uses these to run cleanup and drop the hook before process.exit() on startup errors or hard termination signals, preventing kubo orphaning and the "SYNCHRONOUS TERMINATION NOTICE" warning.

Changes

Daemon Async Shutdown & Kubo Orphan Prevention

Layer / File(s) Summary
Shutdown mechanism and state capture
src/cli/commands/daemon.ts
Variables introduced to store async shutdown function and exit-hook unsubscribe handler; shutdownDaemon async function refactored and registered via exit-hook, capturing both the unsubscribe handle and shutdown function for later invocation and removal in other paths.
Startup error recovery with async cleanup
src/cli/commands/daemon.ts
In run() startup error catch block, captured async shutdown cleanup is conditionally invoked and exit hook is removed before rethrowing error, ensuring kubo/daemon teardown is not skipped on oclif's process.exit() unwind path.
Force-quit with explicit hook removal
src/cli/commands/daemon.ts
When second SIGINT/SIGTERM is received during shutdown, captured removeAsyncExitHook is called before forcing immediate process.exit(), avoiding async exit-hook "synchronous termination notice" behavior and orphaned kubo.
Regression test with test-only failure hook
src/cli/commands/daemon.ts, test/cli/daemon.test.ts
Test-only environment-variable guard (PKC_CLI_TEST_FAIL_AFTER_KUBO_START) added to createOrConnectRpc to simulate startup failure after kubo starts; new test verifies daemon exits non-zero without "SYNCHRONOUS TERMINATION NOTICE" and confirms kubo is properly torn down.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

  • #59: Both changes modify daemon startup/shutdown sequencing to ensure async shutdown cleanup/exit-hook registration happens before kubo/RPC readiness, preventing orphaned kubo on early signal termination.

Possibly related PRs

  • bitsocialnet/bitsocial-cli#74: Both PRs modify src/cli/commands/daemon.ts's shutdown/asyncExitHook behavior with timeout/cleanup around exit, using env-driven shutdown timing to prevent kubo teardown/restart races.
  • bitsocialnet/bitsocial-cli#47: Both PRs modify src/cli/commands/daemon.ts startup logic, specifically the createOrConnectRpc flow; main adds the PKC_CLI_TEST_FAIL_AFTER_KUBO_START failure hook.
  • bitsocialnet/bitsocial-cli#71: Both PRs change src/cli/commands/daemon.ts's kubo process lifecycle/termination handling with shutdown cleanup and exit-hook/process exit behavior to prevent kubo orphaning.

Poem

🐰 Hops to fix the exit hooks,
No more orphaned kubo—clean teardowns!
Second signals drop the chain,
Async cleanup runs, shutdown's plain.
One more race won, the daemon's sound!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly addresses the main change: ensuring async kubo cleanup runs before process.exit on startup failure, which is precisely what the code changes implement.
Linked Issues check ✅ Passed All coding requirements from issue #98 are met: async exit-hook body extracted into shutdownDaemon(), hook unsubscribe handle hoisted, startup-failure catch awaits shutdownDaemon() before rethrowing, force-quit guard drops hook before process.exit(), and test-failure simulation added.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing the async kubo cleanup issue (#98); no unrelated modifications to other functionality or files are present.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/daemon-graceful-exit-cleanup-98

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cli/commands/daemon.ts`:
- Around line 748-755: The startup-failure shutdown should be capped with
DAEMON_SHUTDOWN_TIMEOUT_MS just like the signal path: replace the direct await
of runDaemonShutdown() with a timed await (e.g., Promise.race between
runDaemonShutdown() and a timeout promise that resolves/rejects after
DAEMON_SHUTDOWN_TIMEOUT_MS) so shutdownDaemon()/daemonServer.destroy() cannot
hang forever; preserve the existing .catch(...) behavior to swallow shutdown
errors and ensure removeAsyncExitHook?.() still runs afterward.

In `@test/cli/daemon.test.ts`:
- Around line 884-895: Add an assertion that the injected failure reason is
present by checking the combined output for the exact failure message; after
constructing combinedOutput (from result.stdout and result.stderr) add
expect(combinedOutput).toContain("Simulated startup failure after kubo start")
so the test ensures the PKC_CLI_TEST_FAIL_AFTER_KUBO_START branch was hit when
calling runPkcDaemonExpectFailure.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a26718a1-77d0-4425-8b0c-b4bfddb2db18

📥 Commits

Reviewing files that changed from the base of the PR and between df1a29d and 7d94cdb.

📒 Files selected for processing (2)
  • src/cli/commands/daemon.ts
  • test/cli/daemon.test.ts

Comment thread src/cli/commands/daemon.ts
Comment thread test/cli/daemon.test.ts Outdated
…ted path (issue #98)

Address CodeRabbit review on #99:

- Cap the manual startup-failure shutdown with DAEMON_SHUTDOWN_TIMEOUT_MS
  (Promise.race with an unref'd timer), matching the timeout contract the signal
  path gets from exit-hook's `wait`, so a hung daemonServer.destroy() can't
  swallow the original startup error forever. Low live risk (daemonServer is
  ~never set on the catch path and killKuboProcess is already bounded), but it
  keeps both exit paths consistent.

- Assert the test's combined output contains the injected failure message so the
  case stays pinned to the PKC_CLI_TEST_FAIL_AFTER_KUBO_START path instead of
  passing on any non-zero failure. Wrap the spawn in withKuboBindRetry so a lost
  kubo bind race (issue #87) retries with fresh ports rather than flaking the new
  message assertion.

daemon.test.ts (23), update-install-restart-race, and daemon-kubo-restart-race
all pass.
@Rinse12 Rinse12 merged commit cc4e3f2 into master Jun 13, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Daemon uses process.exit() while async exit hooks are pending (exit-hook 'SYNCHRONOUS TERMINATION NOTICE')

1 participant