feat(cli): add lk agent session for headless text-mode agent runs#857

Open: toubatbrian wants to merge 9 commits into main from feat/agent-session-daemon

Conversation

@toubatbrian

@toubatbrian toubatbrian commented Jun 3, 2026

Contributor

Summary

Adds lk agent session start|say|end — a headless, text-mode way to drive a LiveKit agent (Python or JS) straight from the terminal, with no audio/CGO dependency (it lives under the default tag-free build, not the console audio build).

It uses a three-process model that mirrors the existing lk agent console plumbing:

  1. Ephemeral CLI command (start/say/end) — short-lived, talks to the daemon and exits.
  2. Detached singleton daemon — the lk binary re-exec'd into a hidden daemon mode (gated by an env var, never exposed as a subcommand). It binds a fixed loopback TCP port to enforce a single active session, spawns the agent, and applies text mode.
  3. Agent subprocess — the user's agent, connected over the lk.agent.session protobuf protocol.

The CLI↔daemon control protocol reuses pkg/ipc length-prefixed framing on the same TCP port, disambiguated from agent connections by a 4-byte magic preamble. The headless renderer (session_render.go) prints user turns, agent replies, tool calls/outputs, and handoffs.

Command running / IO example

$ lk agent session start examples/voice_agents/basic_agent.py
Detected Python agent (basic_agent.py in .../examples/voice_agents)
Session started. Use `lk agent session say "..."` to talk, `lk agent session end` to stop.

$ lk agent session say "what's the weather in San Francisco?"

  ● You
    what's the weather in San Francisco?

  ● function_tool: lookup_weather
    ✓ sunny with a temperature of 70 degrees.

  ● Agent
    The weather in San Francisco is sunny with a temperature of 70 degrees. Want to know the forecast for any other city?

$ lk agent session say "thanks, that's all"

  ● You
    thanks, that's all

  ● function_tool: end_call
    ✓ say goodbye to the user

  ● Agent
    Goodbye! Have a great day!

$ lk agent session end
Session ended.

Notes

  • Singleton enforcement: a second start while a session is live is rejected (a session is already running on 127.0.0.1:<port>).
  • No CGO/audio: builds in the default tag-free binary; the audio pipeline stays behind the console tag. This drops the temporary //lint:file-ignore U1000 directives that were added while the shared spawn/detect helpers were unused.
  • TODO(node) / TODO(audio) placeholders mark the follow-up surfaces (JS agent detection, audio mode).

Test plan

  • go build ./... (default) and CGO_ENABLED=1 go build -tags console ./...
  • go vet -tags console ./cmd/lk/, gofmt clean
  • End-to-end start → say (tool call) → say (handoff/end_call) → end against basic_agent.py (see IO example above)
  • JS agent run (pending TODO(node))

Introduces a three-process model (ephemeral CLI command, detached
singleton daemon, agent subprocess) that drives a Python/JS agent over
TCP using the lk.agent.session protobuf protocol, with no audio/CGO
dependency:

- `lk agent session start <file>`: re-execs the lk binary as a detached
  daemon bound to a fixed loopback port (singleton), which spawns the
  agent and applies text mode; rejects start if a session already runs.
- `lk agent session say "..."`: streams a user turn and renders the
  agent reply, tool calls/outputs, and handoffs to the terminal.
- `lk agent session end`: tears down the daemon and agent.

The CLI<->daemon control protocol reuses pkg/ipc length-prefixed framing
over the same TCP port, disambiguated from agent connections by a magic
preamble. The headless renderer covers all ChatItem variants plus the
FunctionToolsExecuted event. Drops the now-unnecessary U1000 file-ignore
directives added while the helpers were unused.

Co-authored-by: Cursor <cursoragent@cursor.com>

Tools that return no string (e.g. handoff tools returning an Agent)
produced a bare "✓ " line. Suppress the output line when the summarized
output is empty for successful calls; error outputs still render.
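The rule in this commit can be sketched as a small pure function. The function name, its signature, and the `✗` error glyph are assumptions for illustration; only the `✓` prefix appears in the PR's example output.

```go
package main

import (
	"fmt"
	"strings"
)

// toolOutputLine returns the line to print under a tool call, and false when
// nothing should be printed. Hypothetical helper, not the PR's renderer.
func toolOutputLine(output string, isError bool) (string, bool) {
	summary := strings.TrimSpace(output)
	if isError {
		// Errors always render, even when the message is empty.
		return "✗ " + summary, true // "✗" is an assumed glyph
	}
	if summary == "" {
		// Successful call with no string output (e.g. a handoff): skip the line.
		return "", false
	}
	return "✓ " + summary, true
}

func main() {
	cases := []struct {
		out   string
		isErr bool
	}{
		{"sunny with a temperature of 70 degrees.", false},
		{"", false},
		{"lookup failed", true},
	}
	for _, c := range cases {
		if line, ok := toolOutputLine(c.out, c.isErr); ok {
			fmt.Println("    " + line)
		}
	}
}
```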

Co-authored-by: Cursor <cursoragent@cursor.com>
theomonnom
theomonnom approved these changes Jun 3, 2026

@theomonnom theomonnom left a comment

Member

uh stamped the wrong PR

Comment thread cmd/lk/main.go Outdated
toubatbrian and others added 2 commits June 3, 2026 15:46
Replace the env-gated branch at the top of main() with a dedicated,
hidden `lk agent session daemon` subcommand (mirroring the existing
hidden `generate-fish-completion` command). `start` now re-execs the
binary into that subcommand instead of setting LK_SESSION_DAEMON=1, so
the daemon has its own entrypoint dispatched by the CLI framework rather
than special-casing main(). Re-exec of the same binary is retained
(a separate binary can't be located reliably after `go install`);
runtime params still flow through the LK_SESSION_* env vars.
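As a hedged illustration of the re-exec described here, the sketch below builds (but does not start) the child command. The environment variable names `LK_SESSION_PORT` and `LK_SESSION_PROJECT_DIR` are hypothetical members of the `LK_SESSION_*` family, and process detachment is omitted because it is platform-specific.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// daemonCommand builds the re-exec of the currently running binary into the
// hidden daemon subcommand. It does not start the process.
func daemonCommand(port int, projectDir string) (*exec.Cmd, error) {
	// os.Executable resolves the running binary, which is why a separately
	// installed helper binary does not need to be located.
	self, err := os.Executable()
	if err != nil {
		return nil, err
	}
	cmd := exec.Command(self, "agent", "session", "daemon")
	cmd.Env = append(os.Environ(),
		fmt.Sprintf("LK_SESSION_PORT=%d", port), // hypothetical variable name
		"LK_SESSION_PROJECT_DIR="+projectDir,    // hypothetical variable name
	)
	return cmd, nil
}

func main() {
	cmd, err := daemonCommand(7890, ".")
	if err != nil {
		panic(err)
	}
	fmt.Println("would run:", cmd.Args[1:])
}
```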

Co-authored-by: Cursor <cursoragent@cursor.com>

A registered subcommand is always invokable (Hidden only drops it from
help), so a stray `lk agent session daemon` previously spawned a
half-configured daemon (random port, empty project dir) that exited
silently. Guard the entrypoint on the inherited readiness pipe that
`start` always provides: without it, return a clear error directing the
user to `lk agent session start`.

Co-authored-by: Cursor <cursoragent@cursor.com>
@toubatbrian toubatbrian requested a review from theomonnom June 3, 2026 22:52
@rektdeckard

Member

Sorry for the late question, but now that the console feature is compiled by default on all platforms, how is this different from just using console in text mode?

@toubatbrian

Contributor Author

From my understanding, lk agent session gives users/agents a way to send multiple independent commands to a session.

With console, you don’t really get that separation, since it starts an interactive terminal UI. There isn’t a command-line-friendly way to distinguish between starting a session and sending input to that session, which is especially important for AI agents using a bash tool.

cc @theomonnom in case you have more thoughts on the goal/scope of this feature.

@rektdeckard

Member

@toubatbrian got it, makes sense

@u9g

u9g commented Jun 17, 2026

Contributor

Is this supposed to be used in a way where the agent starts the room (if so, what's the command for that? is it just starting the agent via npm start?) and then the agent uses a different terminal to send text messages to the agent via lk agent session?

@toubatbrian

Contributor Author

> Is this supposed to be used in a way where the agent starts the room (if so, what's the command for that? is it just starting the agent via npm start?) and then the agent uses a different terminal to send text messages to the agent via lk agent session?

It reuses the console mode: lk agent session starts the actual agent in a subprocess and communicates with it over TCP.

u9g and others added 4 commits June 22, 2026 16:06
* test(cli): add e2e test for the agent session lifecycle

Adds an opt-in end-to-end test that drives the real `lk agent session`
start/say/end flow against a minimal one-file echo agent (testdata/echo-agent),
asserting the model echoes a token back and that the detached daemon exits
afterward. The fixture is a uv project so the daemon's `uv run python`
auto-syncs deps; its __main__ dispatches console mode to the TCP console
directly since cli.run_app() doesn't expose --connect-addr on released agents.

Includes a GitHub Actions workflow that runs the test on Linux and Windows,
triggered by workflow_dispatch and pushes to any repo branch. Gated behind
LIVEKIT_API_KEY so it skips without credentials.

* fix(cli): use a readiness file instead of an inherited pipe fd

The session daemon spawn passed the readiness pipe to the detached child via
cmd.ExtraFiles (fd 3), but os/exec's ExtraFiles is unsupported on Windows, so
daemon.Start() failed with "fork/exec ...: not supported by windows" and the
session never started there.

Replace the inherited fd with a temp readiness file: the daemon writes its
status atomically (write + rename) and `start` polls it until it sees a status,
the daemon process exits, or a timeout slightly past the daemon's own
agent-connect deadline. Works identically on Linux and Windows.
Round out the Session E2E workflow: add the macOS arm, wire up the
portaudio submodule + ALSA headers, and build Windows with zig to match
.goreleaser.yaml's toolchain. Native Windows can't link the
webrtc/portaudio cgo objects (the ~560-object link overflows the
command-line limit), so cross-compile lk.exe and the e2e test binary on
Linux and run them natively on Windows. buildLK honors LK_SESSION_E2E_BIN
so the Windows runner drives the prebuilt binary instead of rebuilding.
u9g force-pushed the feat/agent-session-daemon branch from 649c95e to 264857e on June 22, 2026 22:50
4 participants