feat(cli): add lk agent session for headless text-mode agent runs #857
toubatbrian wants to merge 9 commits into
Introduces a three-process model (ephemeral CLI command, detached singleton daemon, agent subprocess) that drives a Python/JS agent over TCP using the lk.agent.session protobuf protocol, with no audio/CGO dependency:

- `lk agent session start <file>`: re-execs the lk binary as a detached daemon bound to a fixed loopback port (singleton), which spawns the agent and applies text mode; rejects start if a session already runs.
- `lk agent session say "..."`: streams a user turn and renders the agent reply, tool calls/outputs, and handoffs to the terminal.
- `lk agent session end`: tears down the daemon and agent.

The CLI<->daemon control protocol reuses pkg/ipc length-prefixed framing over the same TCP port, disambiguated from agent connections by a magic preamble. The headless renderer covers all ChatItem variants plus the FunctionToolsExecuted event.

Drops the now-unnecessary U1000 file-ignore directives added while the helpers were unused.

Co-authored-by: Cursor <cursoragent@cursor.com>
Tools that return no string (e.g. handoff tools returning an Agent) produced a bare "✓ " line. Suppress the output line when the summarized output is empty for successful calls; error outputs still render. Co-authored-by: Cursor <cursoragent@cursor.com>
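The rule in this commit can be sketched as a small pure function. The function name, its signature, and the `✗` error marker are hypothetical; only the `✓ ` prefix and the suppress-when-empty behaviour come from the commit message.

```go
package main

import "strings"

// toolOutputLine returns the line to print for a finished tool call and
// whether anything should be printed at all.
func toolOutputLine(output string, isError bool) (string, bool) {
	summary := strings.TrimSpace(output)
	if isError {
		// Error outputs always render, even when the summary is empty.
		return "✗ " + summary, true
	}
	if summary == "" {
		// Successful call with nothing to show (e.g. a handoff tool
		// returning an Agent): skip the line instead of printing a bare "✓ ".
		return "", false
	}
	return "✓ " + summary, true
}
```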
Replace the env-gated branch at the top of main() with a dedicated, hidden `lk agent session daemon` subcommand (mirroring the existing hidden `generate-fish-completion` command). `start` now re-execs the binary into that subcommand instead of setting LK_SESSION_DAEMON=1, so the daemon has its own entrypoint dispatched by the CLI framework rather than special-casing main(). Re-exec of the same binary is retained (a separate binary can't be located reliably after `go install`); runtime params still flow through the LK_SESSION_* env vars. Co-authored-by: Cursor <cursoragent@cursor.com>
A registered subcommand is always invokable (Hidden only drops it from help), so a stray `lk agent session daemon` previously spawned a half-configured daemon (random port, empty project dir) that exited silently. Guard the entrypoint on the inherited readiness pipe that `start` always provides: without it, return a clear error directing the user to `lk agent session start`. Co-authored-by: Cursor <cursoragent@cursor.com>
Sorry for late question, but now that
From my understanding, With console, you don’t really get that separation, since it starts an interactive terminal UI. There isn’t a command-line-friendly way to distinguish between starting a session and sending input to that session, which is especially important for AI agents using a bash tool. cc @theomonnom in case you have more thoughts on the goal/scope of this feature.
@toubatbrian got it, makes sense
Is this supposed to be used in a way where the agent starts the room (if so, what's the command for that? is it just starting the agent via
It reuses the console mode: `lk agent session` starts the actual agent in a subprocess and communicates with it over TCP.
* test(cli): add e2e test for the agent session lifecycle

  Adds an opt-in end-to-end test that drives the real `lk agent session` start/say/end flow against a minimal one-file echo agent (testdata/echo-agent), asserting the model echoes a token back and that the detached daemon exits afterward. The fixture is a uv project so the daemon's `uv run python` auto-syncs deps; its __main__ dispatches console mode to the TCP console directly since cli.run_app() doesn't expose --connect-addr on released agents.

  Includes a GitHub Actions workflow that runs the test on Linux and Windows, triggered by workflow_dispatch and pushes to any repo branch. Gated behind LIVEKIT_API_KEY so it skips without credentials.

* fix(cli): use a readiness file instead of an inherited pipe fd

  The session daemon spawn passed the readiness pipe to the detached child via cmd.ExtraFiles (fd 3), but os/exec's ExtraFiles is unsupported on Windows, so daemon.Start() failed with "fork/exec ...: not supported by windows" and the session never started there.

  Replace the inherited fd with a temp readiness file: the daemon writes its status atomically (write + rename) and `start` polls it until it sees a status, the daemon process exits, or a timeout slightly past the daemon's own agent-connect deadline. Works identically on Linux and Windows.
Round out the Session E2E workflow: add the macOS arm, wire up the portaudio submodule + ALSA headers, and build Windows with zig to match .goreleaser.yaml's toolchain. Native Windows can't link the webrtc/portaudio cgo objects (the ~560-object link overflows the command-line limit), so cross-compile lk.exe and the e2e test binary on Linux and run them natively on Windows. buildLK honors LK_SESSION_E2E_BIN so the Windows runner drives the prebuilt binary instead of rebuilding.
649c95e to 264857e
Summary

Adds `lk agent session start|say|end`, a headless, text-mode way to drive a LiveKit agent (Python or JS) straight from the terminal, with no audio/CGO dependency (it lives under the default tag-free build, not the `console` audio build).

It uses a three-process model that mirrors the existing `lk agent console` plumbing:

- CLI command (`start`/`say`/`end`): short-lived, talks to the daemon and exits.
- Daemon: the `lk` binary re-exec'd into a hidden daemon mode (gated by an env var, never exposed as a subcommand). It binds a fixed loopback TCP port to enforce a single active session, spawns the agent, and applies text mode.
- Agent subprocess: the Python/JS agent, driven over TCP using the `lk.agent.session` protobuf protocol.

The CLI↔daemon control protocol reuses `pkg/ipc` length-prefixed framing on the same TCP port, disambiguated from agent connections by a 4-byte magic preamble. The headless renderer (`session_render.go`) prints user turns, agent replies, tool calls/outputs, and handoffs.

Command running / IO example

Notes

- `start` while a session is live is rejected (`a session is already running on 127.0.0.1:<port>`).
- `console` tag. This drops the temporary `//lint:file-ignore U1000` directives that were added while the shared spawn/detect helpers were unused.
- `TODO(node)` / `TODO(audio)` placeholders mark the follow-up surfaces (JS agent detection, audio mode).

Test plan

- `go build ./...` (default) and `CGO_ENABLED=1 go build -tags console ./...`
- `go vet -tags console ./cmd/lk/`, `gofmt` clean
- `start → say (tool call) → say (handoff/end_call) → end` against `basic_agent.py` (see IO example above)
- `TODO(node)`)
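As a reference for the single-session guarantee mentioned in the summary, here is a minimal sketch of using a fixed loopback port as a singleton lock. The function name and error wording are assumptions; the PR's actual port number and error handling are not reproduced here.

```go
package main

import (
	"fmt"
	"net"
)

// acquireSession tries to bind the session port. While one daemon holds the
// listener, a second bind on the same address fails, which is what turns the
// port into a single-session lock.
func acquireSession(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		// Any bind failure is reported as a likely running session; a real
		// implementation might inspect the error more precisely.
		return nil, fmt.Errorf("a session may already be running on %s: %w", addr, err)
	}
	return ln, nil
}
```

The lock is released when the listener is closed or the daemon process exits, so no stale lock file has to be cleaned up after a crash.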