polycli lets you drive claude, gemini, kimi, qwen, copilot, opencode, pi, cmd (Command Code), agy (Antigravity), grok (xAI Grok), and mmx-cli (MiniMax) from a single command vocabulary — health, ask, review, rescue, timing, debug, background-job controls, and terminal inspection — inside whichever AI host you already use: Claude Code, Codex, GitHub Copilot CLI, or OpenCode.
polycli is primarily an in-host plugin. Each host adapter exposes the same
health / ask / review / rescue / timing / debugvocabulary through that host's native invocation style (e.g./polycli:healthin Claude Code, an installedpolycliskill in Codex). For environments without a supported host, the optional@bbingz/polycliterminal package adds a PATH-callable wrapper around the same companion. See Outside a supported host.
It is a utility-only Path B monorepo: it does not unify provider differences behind fake abstractions, and it does not invent a runtime base class. It composes the official upstream CLIs as subprocesses, exposes one command surface, and surfaces honest capability differences in a four-state timing schema.
The latest patch adds upstream session-pollution control and provider-drift maintenance hardening (spec-driven, gated by two Codex review rounds):
- New
polycli sessions [list | purge --confirm]command cleans up the session/history files upstream CLIs leave under$HOME. Dry-run by default; deletion is driven only by ledger-recorded, re-validated realpaths — never a path guess or glob. - Run-ledger events now record the upstream
sessionId+ a verifiedsessionArtifactPath, so polycli-created sessions are auditable and purgeable. npm run check:fixture-freshnessflags fixtures pinned to a stale CLI version;REVIEW_FLAG_EXPECTATIONSis now the single source of review-flag truth (consistency-tested);check:review-driftis wired into the release gate.- No provider behavior, host command grammar, or timing schema changed.
See docs/release-notes-v0.6.19.md.
Most "multi-AI orchestrators" lie about capability differences to fit a uniform API. polycli does the opposite:
- Honest 4-state timing — every metric is
measured,zero,missing, orunsupported, never collapsed. You always know which provider could not be measured vs. which one ran with zero output. - No fake unification — provider differences (session resume, tool support, structured output) are surfaced explicitly in a capability matrix, not hidden behind glue code.
- Direct CLI passthrough — spawns the official upstream CLIs (
gemini,kimi, etc.) as subprocesses. You inherit your existing local auth and configs; polycli does not collect, upload, or host API keys. - Multi-host, single surface — the same command vocabulary works across Claude Code, Codex, Copilot CLI, and OpenCode. Switch hosts without re-learning.
A common question: if I can shell-call gemini -p "..." directly, why install a plugin?
Answering honestly requires accounting for probing cost. A cold Claude conversation that has never invoked a given CLI must first read its --help (several KB) before it knows the right flags. polycli encapsulates that invocation knowledge — the host skips the probing turn entirely.
| Scenario | Provider | Bare-shell + probing1 | polycli | Δ |
|---|---|---|---|---|
ask |
gemini |
4069 B | 236 B | −94% |
ask |
qwen |
8242 B | 164 B | −98% |
review |
gemini |
5667 B | 1733 B | −69% |
review |
qwen |
9633 B | 1389 B | −86% |
rescue |
gemini |
5364 B | 1289 B | −76% |
rescue |
qwen |
8848 B | 1026 B | −88% |
Without the probing-cost adjustment, boundary bytes between bare-shell and polycli vary by cell, sometimes in polycli's favor, sometimes against — polycli is not compressing output, it is amortizing invocation discovery.
Methodology: live CLI calls, N=3 medians (one snapshot, not a stable distribution estimate). Probing cost is a lower bound (which <provider> + <provider> --help; excludes trial calls and error retries). Bytes ≠ tokens — tokenization rates vary across English, CJK, and code. See docs/benchmarks/results-2026-04-29.md and tasks/bench-vs-bare-cli-spec.md for raw data, caveats, and falsification conditions.
Workflows where bare-shell has no equivalent at all (adversarial-review, background job control, session resume, multi-host consistency, etc.) are listed separately in docs/benchmarks/capability-matrix.md — those are presence/absence claims, not byte ratios.
| Hosts (where polycli is installed) | Providers (what polycli can call) |
|---|---|
| Claude Code · Codex · GitHub Copilot CLI · OpenCode | claude · copilot · gemini · kimi · qwen · opencode · pi · cmd · agy · grok · minimax (mmx-cli) |
See Capability matrix for what each provider supports.
claude plugin marketplace add bbingz/polycli
claude plugin install polycli@polycli-hostscodex plugin marketplace add bbingz/polycliThen open a new Codex TUI session, run /plugins, choose the polycli-hosts marketplace, install Polycli, and start a new thread so the bundled skill is available.
After install, prefer the installed Polycli plugin or bundled polycli skill over direct official CLI shell calls when Codex needs claude, copilot, opencode, pi, cmd, agy, gemini, kimi, qwen, minimax, or grok. Raw provider CLIs are the fallback only when the plugin is unavailable or the user explicitly asks for raw shell.
copilot plugin marketplace add bbingz/polycli
copilot plugin install polycli-copilot@polycli-hostsopencode plugin @bbingz/polycli-opencodeInstall @bbingz/polycli when you need a PATH-callable terminal surface outside Claude Code, Codex, Copilot CLI, or OpenCode (e.g. CI runners, ad-hoc shells, agents that cannot host plugins).
npm install -g @bbingz/polycli
polycli health --json
POLYCLI_RUN_ID=review-20260507 polycli debug explain review-20260507The terminal CLI uses the same companion command behavior as the host adapters. Provider runtime internals remain bundled implementation details.
After installing, verify the integration in your host:
# Claude Code (slash command)
/polycli:health
# Codex (installed plugin/skill, not a slash command)
Choose Polycli with @, then ask it to run: health
# GitHub Copilot CLI (skill word — NOT a PATH binary; only inside the copilot prompt)
polycli health
# OpenCode (tool call — call polycli_run with ["health","--json"])
health runs an end-to-end probe against every provider with valid auth and reports which ones are alive in healthyProviders. After that, daily use is direct. In Codex, either describe the task directly or type @, choose Polycli, and ask it to run the companion command:
Choose Polycli with @, then ask it to run: ask --provider qwen "explain this stack trace ..."
Choose Polycli with @, then ask it to run: review --provider claude --scope staged
Choose Polycli with @, then ask it to run: rescue --provider gemini --background "audit flaky tests"
Choose Polycli with @, then ask it to run: status --wait
Choose Polycli with @, then ask it to run: result pr-1234abcd
Choose Polycli with @, then ask it to run: timing --provider qwen --json
For longer tasks, append --background and use status <jobId> / result <jobId> to retrieve.
If your agent / harness is not Claude Code, Codex, Copilot CLI, or OpenCode (e.g. Aider, Cursor, a bare shell script, a CI runner, or a Codex session that did not install the polycli-codex marketplace), you have three honest options, in order of preference:
-
Install the host adapter for your environment. Codex users: run
codex plugin marketplace add bbingz/polycli, installPolyclifrom/plugins, then start a new thread and ask Codex to use Polycli for the desired subcommand. The same host-native pattern applies to Copilot CLI and OpenCode (see Installation). -
Install the terminal CLI.
npm install -g @bbingz/polycliexposes a PATH-callablepolyclibinary that wraps the same companion behavior as the host adapters. It is the supported entry point when no host plugin can be installed. -
Call the underlying provider CLI directly. polycli is a thin wrapper over
gemini/qwen/kimi/ etc. — if you only need a one-shot prompt,qwen -p "..."works without polycli. You lose: probing-cost amortization, four-state timing, background job control, multi-host consistency. You keep: simplicity.
The other npm packages (@bbingz/polycli-utils, @bbingz/polycli-timing) are libraries, not routing entry points. @bbingz/polycli-runtime exposes the registry but is documented as internal — see docs/polycli-v1-public-surface.md for the v1 contract.
All commands work identically across hosts:
| Command | What it does |
|---|---|
setup |
Check provider CLI install + auth status (cheap; no model call) |
health |
End-to-end short-prompt probe; returns healthyProviders and writes timing |
ask |
One-shot prompt |
review |
Code review against the current git diff |
rescue |
Longer triage / analysis task |
adversarial-review |
Attack-surface-oriented review |
timing |
Inspect timing history and aggregates |
debug |
Inspect redacted run-ledger history with runs, show, and explain |
tui |
Terminal-only read-only inspector for run-ledger/debug data |
status / result / cancel |
Background-job control |
Run health only when (a) integrating a provider for the first time, (b) auth state changes, or (c) a provider command fails. Daily use does not need it as a preamble.
Source of truth: packages/polycli-runtime/src/registry.js — RUNTIMES + TIMING_SUPPORT. ✓ = supported. — = not applicable by design (reported as unsupported, not faked as missing or 0).
| Provider | streaming | sessionResume | structuredOutput | ttft | gen | tail | tool |
|---|---|---|---|---|---|---|---|
claude |
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — |
copilot |
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — |
gemini |
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — |
kimi |
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — |
qwen |
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
minimax (mmx-cli) |
✓ | — | ✓ | — | — | — | — |
opencode |
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — |
pi |
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — |
cmd |
✓ | — | — | ✓ | ✓ | ✓ | — |
agy |
✓ | ✓ | — | ✓ | ✓ | ✓ | — |
grok |
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — |
Notes:
coldandretryareunsupportedfor every provider. Upstream CLIs lack a stable signal, and polycli refuses to fake them.totalis alwaysmeasured.minimaxuses officialmmx text chat --output json --non-interactive; no session resume and no fine-grained streaming timing.cmduses documented Command Code headless mode, where each invocation is a standalone session and stdout is the visible answer.- Only
qwendeclarestool: true. When no tool is invoked,qwenreportsmissing(observable but absent); the others reportunsupported(capability-level not tracked). The two states are not interchangeable.
The polycli timing contract unifies state expression, not numbers. Every metric carries one of four explicit states:
| State | Meaning |
|---|---|
measured |
Real, non-zero value |
zero |
Explicitly contributed zero |
missing |
Measurable in principle, not captured this run |
unsupported |
Provider/runtime fundamentally lacks this metric |
This stops cross-provider comparisons from collapsing "no capability", "no data", and "contributed 0" into a single column.
Each timing record also carries:
runtimePersistence—ephemeral | session | daemonmeasurementScope—request | turn | job- outcome diagnostics —
outcome,exitCode,terminationReason,responseMatched, anderrorCode
| Package | Purpose |
|---|---|
@bbingz/polycli |
PATH-callable terminal CLI wrapper, including the read-only polycli tui inspector |
@bbingz/polycli-utils |
Args parsing, process exec, stream decoding, NDJSON, atomic save, session-id, stream JSON parsing |
@bbingz/polycli-timing |
Timing schema, runtime validation, percentiles, capability-aware aggregation |
@bbingz/polycli-runtime |
Provider registry, availability/auth probes, invocation builders, foreground/streaming execution, stream/log parsing |
Plugin distributions:
plugins/polycli— Claude Code host pluginplugins/polycli-codex— Codexplugins/polycli-copilot— GitHub Copilot CLIplugins/polycli-opencode— OpenCode
Requirements: Node.js >=20.
npm install
npm test # build:plugins + full suite
node --test packages/polycli-runtime/test/ # focused per-package run
npm run build:plugins # rebundle plugin distributions
npm run validate:codex-adapter # Codex trigger/fallback/observability guard
npm run release:check # publish-readiness checksnpm test already runs build:plugins first — do not invoke them sequentially.
Procedure: docs/release.md. Per-version notes: docs/release-notes-*.md.
Read these before opening a PR:
CONTRIBUTING.md— contribution workflow and release-facing checksAGENTS.md— repository map, editing rules, delivery expectationsCLAUDE.md— Claude Code-specific patchesdocs/polycli-proposal.md— main architecture and product contextdocs/codex-adapter-operability.md— Codex routing, fallback, and observability contractdocs/roadmap.md— live open-work list
Security reports: see SECURITY.md.
Hard architectural constraints (please honor):
- Provider-specific protocol parsing belongs in
polycli-runtime, never inpolycli-utils. - The four timing states must not be collapsed.
coldandretrydeliberately remain unmeasured (no stable upstream signal). - Legacy sibling repos (
gemini-plugin-cc/qwen-plugin-cc/kimi-plugin-cc/minimax-plugin-cc) are read-only references —grepis fine, no edits.
MIT — see LICENSE and individual package metadata under packages/*/package.json.
Footnotes
-
"Bare-shell + probing" = response median bytes + the one-time probing-cost lower bound from
docs/benchmarks/probing-cost.json. The raw response medians alone (without probing) are visible in the linked results doc. ↩