User-killed sessions can be restored on daemon restart from stale worktree markers #429

Description

@neversettle17-101

Bug

User-killed sessions can reappear as active after the daemon is started again.

Source: local report from Chauhan in Codex chat
Analyzed against: 8241868398217f85cf771b5a97ad15ae2ad349e0
AO version: /tmp/ao version -> dev
Daemon under test: verified ReverbCode binary /tmp/ao; /tmp/ao status reported port: 3001, run file /Users/chauhan/.ao/running.json, data dir /Users/chauhan/.ao/data
Priority: high — explicit session deletion/kill intent is not durable across daemon restart
Confidence: High — traced exact lifecycle code path and confirmed stale restore markers in SQLite

Reproduction

  1. Start the daemon and create one or more sessions.
  2. Let the daemon save/reconcile at least one session so it has a session_worktrees row.
  3. Kill/delete all sessions in the project through the UI or CLI.
  4. Restart the daemon (stop it, then start it again).
  5. Run /tmp/ao session ls --all.

Observed

Sessions that were killed/deleted show up in the normal active session list.

On the reporter's machine, /tmp/ao session ls --all showed these non-terminated sessions in project agent-orchestrator after the user had deleted/killed all sessions:

agent-orchestrator:
  agent-orchestrator-14  (20m)  [mergeable]  worker
  agent-orchestrator-19  (18m)  [mergeable]  worker
  agent-orchestrator-2  (1d)  [needs_input]  orchestrator
  agent-orchestrator-20  (2d)  [idle]  orchestrator
  agent-orchestrator-4  (20m)  [review_pending]  worker
20 terminated sessions hidden. Use --include-terminated to show.

SQLite state confirmed stale restore markers on four of the five resurrected active sessions:

agent-orchestrator-14|0|idle|1|
agent-orchestrator-19|0|idle|1|
agent-orchestrator-2|0|waiting_input|1|
agent-orchestrator-20|0|idle|0|<empty>
agent-orchestrator-4|0|idle|1|

The columns are session id | is_terminated | activity_state | session_worktrees marker_count | preserved refs. A marker_count of 1 means the session still has a session_worktrees row.

Expected

Sessions explicitly killed/deleted by the user should stay terminated across daemon restarts and should not appear in /tmp/ao session ls or /tmp/ao session ls --all unless restored or recreated explicitly.

Root Cause

RestoreAll treats the presence of any session_worktrees row as the "shutdown-saved" marker and relaunches the session:

  • backend/internal/session_manager/manager.go:756 documents that session_worktrees row presence means shutdown-saved.
  • backend/internal/session_manager/manager.go:777 checks ListSessionWorktrees.
  • backend/internal/session_manager/manager.go:783 only skips restore when there are no rows.

Kill records terminal intent and tears down runtime/workspace, but it does not clear existing session_worktrees restore markers:

  • backend/internal/session_manager/manager.go:430 starts Kill.
  • backend/internal/session_manager/manager.go:443 calls MarkTerminated.
  • The function then destroys runtime/workspace, but has no call to clear worktree restore markers.

There is already a store method for clearing those markers:

  • backend/internal/storage/sqlite/store/session_worktree_store.go:63 defines DeleteSessionWorktrees.

So a session that was previously shutdown-saved can later be user-killed while retaining its session_worktrees marker. On next daemon boot, Reconcile -> RestoreAll interprets the stale marker as restore intent and relaunches it.
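
The sequence above can be sketched with a small in-memory model. This is only an illustration, not the code in manager.go: the store type, its fields, and the lowercase kill and restoreAll helpers are hypothetical stand-ins, and the step that flips a relaunched session back to non-terminated is an assumption based on the is_terminated = 0 values in the SQLite snapshot.

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for the SQLite-backed store.
type store struct {
	terminated map[string]bool     // session id -> terminal intent recorded
	worktrees  map[string][]string // session id -> session_worktrees rows
}

func newStore() *store {
	return &store{
		terminated: map[string]bool{},
		worktrees:  map[string][]string{},
	}
}

// kill models the reported Kill behaviour: terminal intent is recorded,
// but existing session_worktrees rows are left in place.
func (s *store) kill(id string) {
	s.terminated[id] = true
}

// restoreAll models the reported restore rule: a session is skipped only
// when it has no session_worktrees rows; the terminated flag is not consulted.
func (s *store) restoreAll(ids []string) []string {
	var relaunched []string
	for _, id := range ids {
		if len(s.worktrees[id]) == 0 {
			continue // no marker, so no restore
		}
		// Assumption: relaunching makes the session active again.
		s.terminated[id] = false
		relaunched = append(relaunched, id)
	}
	return relaunched
}

func main() {
	s := newStore()
	// A session that was shutdown-saved at some earlier point.
	s.worktrees["agent-orchestrator-14"] = []string{"/hypothetical/worktree"}
	// The user kills it; the marker survives.
	s.kill("agent-orchestrator-14")
	// Next daemon boot: the stale marker is read as restore intent.
	fmt.Println(s.restoreAll([]string{"agent-orchestrator-14"}))
	// prints "[agent-orchestrator-14]"
}
```

In this model a session with no rows is never relaunched, which matches the restore rule described for RestoreAll; it does not explain agent-orchestrator-20.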

agent-orchestrator-20 had no marker in the live DB snapshot, so its reappearance is not explained by this mechanism. It may come from a separate orchestrator creation path, or it may be a newly spawned/recreated orchestrator. The marker-backed resurrection mechanism is confirmed for agent-orchestrator-2, -4, -14, and -19.

Fix

No fix is implemented as part of this issue filing.

Suggested approach:

  • Extend the session manager store contract to include marker deletion, or otherwise expose a command-side method to clear session worktree rows.
  • In Manager.Kill, clear session_worktrees rows for the session as part of explicit user kill intent, before or along with MarkTerminated.
  • Add a regression test where a live session has an existing session_worktrees row, Kill is called, then RestoreAll must not relaunch it.
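
A minimal sketch of the suggested approach, under stated assumptions: the sessionStore interface and memStore type are hypothetical, and the method signatures are guesses. Only the names MarkTerminated, ListSessionWorktrees, and DeleteSessionWorktrees come from the code cited in this issue. The main function plays the role of the regression check: one session is killed, one is left shutdown-saved, and only the latter should be relaunched.

```go
package main

import "fmt"

// sessionStore is a hypothetical, simplified store contract.
// Method names mirror those cited in the issue; signatures are assumptions.
type sessionStore interface {
	MarkTerminated(id string) error
	ListSessionWorktrees(id string) ([]string, error)
	DeleteSessionWorktrees(id string) error
}

// memStore is an in-memory implementation used only for this sketch.
type memStore struct {
	terminated map[string]bool
	worktrees  map[string][]string
}

func (m *memStore) MarkTerminated(id string) error {
	m.terminated[id] = true
	return nil
}

func (m *memStore) ListSessionWorktrees(id string) ([]string, error) {
	return m.worktrees[id], nil
}

func (m *memStore) DeleteSessionWorktrees(id string) error {
	delete(m.worktrees, id)
	return nil
}

// kill clears restore markers as part of explicit kill intent,
// then records termination.
func kill(s sessionStore, id string) error {
	if err := s.DeleteSessionWorktrees(id); err != nil {
		return fmt.Errorf("clear restore markers for %s: %w", id, err)
	}
	return s.MarkTerminated(id)
}

// restoreAll keeps the rule described in the issue: skip when there are no rows.
func restoreAll(s sessionStore, ids []string) ([]string, error) {
	var relaunched []string
	for _, id := range ids {
		rows, err := s.ListSessionWorktrees(id)
		if err != nil {
			return nil, err
		}
		if len(rows) == 0 {
			continue
		}
		relaunched = append(relaunched, id)
	}
	return relaunched, nil
}

func main() {
	s := &memStore{terminated: map[string]bool{}, worktrees: map[string][]string{}}
	s.worktrees["killed"] = []string{"/hypothetical/worktree-a"}
	s.worktrees["saved"] = []string{"/hypothetical/worktree-b"}

	if err := kill(s, "killed"); err != nil {
		panic(err)
	}
	got, err := restoreAll(s, []string{"killed", "saved"})
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // prints "[saved]"
}
```

Clearing the markers before MarkTerminated is one possible ordering. Whether the two writes should instead share a transaction depends on the real store, which this sketch does not model.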

Impact

  • Users cannot reliably clear a project/session list.
  • Killed sessions can resume and continue consuming runtime/agent resources.
  • The UI/CLI violates user intent by showing sessions as active after explicit deletion/kill.
  • This affects lifecycle/session-manager and storage restore-marker boundaries.

Related

No duplicates found via:

  • session_worktrees RestoreAll kill resurrect session
  • sessions restored after kill
  • RestoreAll session_worktrees

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working)
    lcm-sm (Lifecycle + Session Manager lane)
    needs-triage (Maintainer needs to evaluate this issue)
    priority: high (Fix soon)
    storage (Persistence lane)
