feat: add Feishu workspace operations and runtime controls#9
Open
xluos wants to merge 70 commits into
One bot, N groups. Each group binds to a workspace directory (multi-repo container) with an active (repo, branch) pointer. Threads inside a group become isolated sessions that follow the live binding at every dispatch.

- group_workspaces table + GroupWorkspaceStore (CRUD + cwd/env resolver)
- Gateway slash commands that bypass the LLM: /bind /unbind /status /ls /clone /checkout /help
- Feishu channel carries chat_id + thread_id on UserMessage; deterministic session_id = feishu:<chat>:<thread> when both known
- AgentRunOptions.envExtras threaded through Claude and Codex runners so DEV_ASSETS_PRIMARY_REPO / DEV_ASSETS_PRIMARY_BRANCH reach the CLI
- Boot-loader ensures workspaces/ and workspaces/_default/ exist
- Default workspace fallback when a group is unbound

Design: docs/designs/group-workspace-binding.md

Requires the branch-context-skill-suite workspace mode for dev-assets memory integration across multiple repos.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two issues in the gateway slash-command path:
1. Feishu substitutes @mentions in the text body as `@_user_N`
placeholders. `@bot /bind foo main` arrived as
`@_user_1 /bind foo main`, which failed the `startsWith("/")`
check and fell through to the LLM. Strip the placeholder before
command routing.
2. All user-facing reply copy was English; Feishu group audience is
Chinese-speaking. Translate command descriptions and all success /
error / usage messages in handlers.ts, plus the /stop replies and
the command-handler error wrapper in kernel.ts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
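The placeholder fix in item 1 can be sketched roughly as below. This is a minimal illustration, not the PR's code: the helper names and the choice to strip only leading placeholders are assumptions.

```typescript
// Hypothetical helpers (names are assumptions, not taken from the PR).
// Matches one or more leading "@_user_N" placeholders plus trailing whitespace.
const LEADING_MENTION = /^(?:\s*@_user_\d+)+\s*/;

// Remove leading Feishu mention placeholders so "/command" is at position 0.
function stripLeadingMentions(text: string): string {
  return text.replace(LEADING_MENTION, "");
}

// Command routing check applied after stripping.
function isSlashCommand(text: string): boolean {
  return stripLeadingMentions(text).startsWith("/");
}
```

With this shape, `@_user_1 /bind foo main` routes as `/bind foo main`, while text that merely contains a placeholder later in the message is left untouched.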
Gateway-level slash commands (/help, /stop, /bind, ...) were failing in two ways:

1. `MultiChannelMessageGateway.replyMessage` resolves the target channel by looking up `channel_id` on the session row, but gateway commands reply before any Session is persisted, so the lookup returned null and threw "Cannot resolve channel for session".
2. `FeishuMessageChannel.replyMessage` defaults `streaming: true`, and the card renderer intentionally skips text rendering in streaming mode (only shows "生成中..." dots). Command replies are ready in one shot, so they never displayed.

Fix both: add an optional `channelId` to the gateway's post/reply/update options that bypasses the session→channel DB lookup, and pass `channelId` + `streaming: false` from `_tryHandleCommand` and `_handleStopCommand`.
Restrict which inbound messages the bot acts on, so one bot can live
in multiple group chats without reacting to every member.
Two independent filters, combined with AND:
- `require_mention: true` — in group chats, require the bot to be
@mentioned; P2P DMs bypass (already directed). Bot's own open_id is
resolved at startup via `/bot/v3/info`.
- `allowed_user_ids` / `allowed_user_emails` — sender's open_id must
appear in the union of both sets. Emails are resolved to open_ids
at startup via `/contact/v3/users/batch_get_id` (batched 50/req).
Unresolved emails are logged, not fatal.
Both filters fail loud at startup when the required token or lookup
fails, so we never silently accept everything when the operator asked
for restrictions.
`ChannelParams` schema is widened to accept bool/number/array literals
(normalized to strings via transform), so config.yaml stays natural:
    params:
      require_mention: true
      allowed_user_emails: [alice@x.com, bob@x.com]
Also logs every inbound at info level (sender, chat_type, mentioned,
passed) plus a dedicated line on each drop, since the channel was
previously silent and operators could not tell why nothing fired.
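As a rough illustration of how the two filters combine with AND, here is a minimal sketch. The type names, field names, and the use of `null` for "no allowlist configured" are assumptions made for the example, not the channel's real shapes.

```typescript
// Hypothetical shapes; field names are assumptions for illustration only.
interface InboundInfo {
  chatType: "group" | "p2p";
  senderOpenId: string;
  mentionedBot: boolean;
}

interface FilterConfig {
  requireMention: boolean;
  // Union of allowed_user_ids and resolved allowed_user_emails.
  // null means no allowlist was configured (assumed representation).
  allowedOpenIds: ReadonlySet<string> | null;
}

// Both filters must pass; P2P messages bypass the mention requirement.
function shouldHandle(msg: InboundInfo, cfg: FilterConfig): boolean {
  const mentionOk =
    !cfg.requireMention || msg.chatType === "p2p" || msg.mentionedBot;
  const senderOk =
    cfg.allowedOpenIds === null || cfg.allowedOpenIds.has(msg.senderOpenId);
  return mentionOk && senderOk;
}
```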
Operational logs were stdout-only, so debugging past sessions required re-running the bug or scraping the terminal scrollback. Add a second pino transport that writes to `$AGENTARA_HOME/runtime-logs/YYYY-MM-DD.log` (plain-text, uncolored, append-only). Stdout keeps the pretty-printed output at whatever level AGENTARA_LOG_LEVEL selects; the file always captures `debug` so disk has strictly more detail than the terminal. Path is resolved at startup — a process running across midnight keeps writing to the start-of-day file until restart. Good enough for now.
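The "resolved at startup" behaviour follows from computing the file name once. A minimal sketch, assuming a hypothetical `runtimeLogPath` helper and local-time dates (the commit does not say whether local time or UTC is used):

```typescript
// Hypothetical helper; the name and the local-time choice are assumptions.
function runtimeLogPath(home: string, now: Date = new Date()): string {
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  const day = `${now.getFullYear()}-${pad(now.getMonth() + 1)}-${pad(now.getDate())}`;
  return `${home}/runtime-logs/${day}.log`;
}
```

Because the path is a plain string fixed when the transport is created, nothing rolls it over at midnight, which matches the limitation noted above.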
Introduce the `/init` command: when typed in a Feishu group, the bot
renders an interactive card listing every entry in the new
`predefined_repos` config catalog. Each row is a checker + branch
input; a primary-repo selector and a submit button live at the bottom.
Submit triggers a card.action.trigger callback that clones the checked
repos into the group's workspace, checks out the requested branches,
and persists the binding — all while updating the original card in
place (pending → final).
Specialized, not generic — no reusable "interactive card framework".
Key pieces:
- `PredefinedRepo` schema + optional top-level `predefined_repos`
(empty catalog makes /init respond with a config hint)
- New interactive element types for Feishu Card 2.0 form primitives:
form, checker, input, button (form_submit), column_set, select_static
- `FeishuMessageChannel` gains `sendRawCard` / `updateRawCard` escape
hatches and a `card.action.trigger` subscription that normalizes the
SDK's provider shape into a `CardActionPayload` and emits `card:action`
- The emission is forwarded through `MultiChannelMessageGateway`;
kernel routes by `action_name` (only `init_submit` recognized)
- `InitFlow` owns the flow end-to-end: pending state keyed by card
message_id (in-memory, lost on restart → expired-card toast),
initiator-only interaction, sequential clone, `active_branch` falls
back to the repo's current HEAD when the requested branch is missing
- `/init` is special-cased in `_handleInboundMessage` (like `/stop`)
instead of being a regular `CommandHandler`, since a card reply is
not a plain string
Two non-obvious Feishu constraints hit during development, documented
at the sites where they bite:
- form_submit buttons must not carry `behaviors: [{ type: callback }]`
— combining the two makes Feishu reject the card with "there is no
submit button in the form container"
- `checker.text` only accepts `plain_text`; `markdown` yields
"type of element is not supported tag: markdown"
Card callbacks require long-polling subscription to be enabled in the
Feishu developer console ("卡片回调走长连接") — otherwise Feishu only
delivers card.action.trigger to an HTTP webhook and our handler never
fires.
… session start Two small UX fixes to the slash-command flow that would otherwise pile noise into Feishu topic threads: 1. `reply_in_thread` becomes opt-in on MessageGateway/MessageChannel `replyMessage` (new `replyInThread` option, default `true` to keep session flow untouched). All slash-command reply paths — `_tryHandleCommand`, `_handleStopCommand`, `InitFlow._replyText` — pass `false` so they show up inline in the conversation list instead of spawning a new topic per command. `sendRawCard` flips its default to inline since the only caller (/init) is a command-originated card. `_sendRemainingChunks` carries the flag through so chunk follow-ups match the parent reply's mode. 2. `@_user_N` stripping in `_handleInboundMessage` now only runs on the first message of a session (detected via `sessionManager.existsSession`). The bot needs the strip when a user @-summons it to start a thread (`@bot /bind foo`), but once inside the thread, real @-mentions of other people should survive intact.
Adds `agents.codex.isolate_host_env` (default `false`). When enabled, agentara redirects spawned Codex at `$AGENTARA_HOME/.codex/` via the `CODEX_HOME` env, keeping config / sessions / state / skills off the host's `~/.codex/`. Boot-loader symlinks `auth.json` from the host so the OAuth login is shared bi-directionally. Host `~/.codex/hooks.json` is intentionally not intercepted — Codex climbs cwd ancestors for `.codex/hooks.json` no matter what HOME or CODEX_HOME are set to. The right fix is to relocate the global hooks file out of `~/.codex/`, which is outside agentara's scope.
- `dev:server` now runs with `bun --watch` for auto-reload during development
- new `start:server` (no watch) used by `make up` for stable background runs
- `make restart` chains down + up to avoid manual two-step
- fix `scripts/down.sh` to work on macOS bash 3.2 (drop `declare -A`)
- rename command, files (init-* → setup-*), design doc, card/form field names, and doc cross-references - move repo catalog out of `config.yaml` into `$AGENTARA_HOME/REPOS.md` so the same file doubles as agent-readable context via CLAUDE.md's native `@REPOS.md` import - add `loadPredefinedRepos()` Markdown parser (H2 section per repo, `- git_url: …` bullet, first prose line = description) - boot-loader seeds a starter `REPOS.md` on first run and idempotently appends `@REPOS.md` to CLAUDE.md when missing - catalog is re-read on every `/setup` invocation; edits take effect without kernel restart
The `/setup` card needs a one-line tagline next to each repo name, but the previous parser grabbed the entire first prose line — which in practice was a long paragraph. Let authors supply a dedicated `- description:` bullet for the card copy, and keep the surrounding prose as free-form context the agent reads via `@REPOS.md`. Falls back to the first prose line when the bullet is missing so older files still render. Template updated to document the new field.
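The parsing rule (H2 per repo, `- git_url:` bullet, `- description:` bullet with a first-prose-line fallback) could look roughly like this. It is a simplified sketch: `parseRepos` and the interfaces are illustrative names, and it does not handle H2 headings inside HTML comments.

```typescript
interface PredefinedRepo {
  name: string;
  gitUrl: string;
  description: string;
}

// Working state for one "## name" section (illustrative, not the PR's type).
interface Section {
  name: string;
  gitUrl?: string;
  bullet?: string; // value of "- description:" when present
  prose?: string;  // first free-form prose line, used as fallback
}

function parseRepos(markdown: string): PredefinedRepo[] {
  const sections: Section[] = [];
  for (const raw of markdown.split("\n")) {
    const line = raw.trim();
    const heading = /^##\s+(.+)$/.exec(line);
    if (heading) {
      sections.push({ name: heading[1].trim() });
      continue;
    }
    const current = sections[sections.length - 1];
    if (!current || line === "") continue;
    const bullet = /^-\s*(\w+):\s*(.*)$/.exec(line);
    if (bullet) {
      if (bullet[1] === "git_url") current.gitUrl = bullet[2];
      else if (bullet[1] === "description") current.bullet = bullet[2];
    } else if (
      current.prose === undefined &&
      !line.startsWith("#") &&
      !line.startsWith("-")
    ) {
      current.prose = line;
    }
  }
  // Sections without a git_url are skipped (an assumption of this sketch).
  return sections
    .filter((s) => s.gitUrl !== undefined)
    .map((s) => ({
      name: s.name,
      gitUrl: s.gitUrl as string,
      description: s.bullet ?? s.prose ?? "",
    }));
}
```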
`/bind` previously did double duty — ensure a binding row *and* set the active repo/branch — which overlapped with `/setup`. Narrow it to just the binding side: no args, ensures the group has a workspace (creating it on disk if missing), and leaves active_repo/active_branch alone. Active repo/branch selection now lives entirely in `/setup`'s interactive card, which is a better fit for its multi-repo picker and branch inputs.
`/setup` previously refused to run a second time on an already-bound group, forcing an `/unbind` to add a repo or switch a branch. Make it re-runnable: the card is the single place to configure the workspace. On re-run we probe the workspace for catalog repos that are already cloned and render them with `checked: true` + `disabled: true` so they cannot be dropped by accident, pre-fill the branch input with each repo's current HEAD (editing it = branch switch on submit), and default the primary selector to the current active_repo. Since some Feishu clients don't echo disabled checker state back on submit, the flow tracks locked repos in its pending state and force-includes them when parsing form values.
Rework the workspace model so multiple Feishu groups can share a single on-disk workspace, with a stable id that survives renames and unbinds.

Schema (`drizzle/0011`, `0012`):
- Split `group_workspaces` into a binding table + a new `workspaces` registry keyed by `ws_xxx`. Bindings are thin (chat_id -> workspace_id).
- Move `active_repo`/`active_branch` up to the workspace: only one `.git/HEAD` exists per clone, so the active state is workspace-scoped, not binding-scoped. Groups sharing a workspace see the same state.
- Backfill migrations mint ids for every pre-existing binding, then fold the active state up from the old binding rows.

On-disk layout:
- Workspace directories are now named after the stable id (`workspaces/ws_xxx/`) so `name` is a pure display label and can be edited freely without moving files. A boot-time normalizer renames legacy chat-id-keyed or slug-keyed dirs into place.
- Each workspace root carries an `AGENTARA.md` meta file with id, name, bound chats, and active repo/branch, auto-refreshed on every bind, unbind, rename, or active state change.

Commands:
- `/bind` stays as "no-args → ensure a workspace exists"; after unbind it now mints a fresh workspace instead of silently re-adopting the orphan by path.
- `/bind <workspace-id>` is new: attach the current chat to an existing workspace (errors if the id doesn't exist). Inherits active repo/branch from the workspace and no longer resets to null.
- `/status` surfaces workspace id and name so operators can share the id across groups.

Setup card:
- Adds a workspace-name input that defaults to `<group-name>-workspace` (fetched via `im.chat.get`; falls back to a chat-id slug). The field stays editable on re-runs; renaming just updates `workspaces.name` and never touches the directory.
- Stacks label/input/id-hint vertically so narrow cards don't squeeze the layout; submit button renamed from "初始化" to "提交" to match the re-runnable semantics.
Operators should be able to run quick commands like `/setup` and `/bind` in a group chat without having to @-mention the bot first — typing a slash is already an explicit-enough intent signal. The sender whitelist still applies, so random users in the room can't invoke commands. Detection is a narrow peek at raw text-type messages: `/^\/[a-zA-Z]/` on the trimmed content. Post/image/file messages never qualify, and parse failures silently fall back to the old mention-enforced path.
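A minimal sketch of the detection described above. It assumes the raw Feishu content of a text message is a JSON string of the form `{"text": "..."}`; the function name is hypothetical.

```typescript
// Hypothetical helper; assumes text messages carry JSON like {"text":"..."}.
function looksLikeSlashCommand(messageType: string, rawContent: string): boolean {
  // Post/image/file messages never qualify.
  if (messageType !== "text") return false;
  try {
    const parsed: unknown = JSON.parse(rawContent);
    const text = (parsed as { text?: unknown }).text;
    return typeof text === "string" && /^\/[a-zA-Z]/.test(text.trim());
  } catch {
    // Parse failures fall back to the mention-enforced path.
    return false;
  }
}
```

A `false` result here only means "do not bypass the mention requirement"; the sender whitelist is checked separately either way.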
Three related threads of workspace-lifecycle work that all land together because they touch the same command and flow files.

## Git sync

- New `src/kernel/workspaces/git-sync.ts` with `syncWorkspace`, `listRepoSyncState`, and `formatAheadBehind` helpers. Sync does `git fetch --prune origin` + `git pull --ff-only` per repo, refuses to touch dirty or diverged trees, and returns a per-repo status enum. Timeout-bounded so a flaky network can't hang session start.
- `/sync` command exposes it to operators as a manual refresh.
- Kernel fires `syncWorkspace` fire-and-forget on the first message of a new session (bound chats only) so the agent starts on fresh code.
- Setup flow adds `git fetch --prune` before checkout on already-cloned repos, which is necessary for switching to a branch that was pushed after the initial clone. Also attempts a best-effort ff-only pull after a successful checkout so re-running /setup without changing branch still picks up new commits.

## /status and formatting

- `/status` now lists each cloned repo as `\`name branch\`` with an optional `↑a ↓b` suffix and a `•` dirty marker; the active repo is still flagged with `← 活跃`. Dropped the `name @ branch` shape across /bind, setup-flow, and the workspace meta file.
- `/status` points users at `/sync` when there are repos to refresh.

## /switch flow + P2P gating

- New `/switch` interactive card (switch-card.ts / switch-flow.ts): pick from all known workspaces, or clear the binding. Works in both group chats and P2P.
- `UserMessage` carries `chat_type` so command gates can distinguish group vs P2P. `/bind`, `/unbind`, `/clone`, `/checkout`, and `/setup` reject P2P with copy pointing at `/switch`; `/status`, `/ls`, `/sync` remain usable in P2P.
Refine the /setup and /switch cards into denser workspace control panels with shared card helpers and compatibility-safe payloads. Remove low-signal explanatory copy, keep the current-state summaries focused, and add targeted tests for the card builders and markdown normalization.
Convert builtin slash command responses to compact Feishu cards with text fallback so workspace and command status replies stay dense without losing delivery resilience. Trim low-value workspace copy, paths, and collapsible sections from setup/switch cards and result cards to keep the interaction focused on current state and next actions.
Format workspace card repo and branch labels as repo@branch instead of separating them with a space.

This keeps setup, status, sync and binding cards visually consistent and adds a small shared formatter with test coverage.
Follow-up messages inside a thread the bot has already participated in are implicitly directed at the bot, so requiring users to @-mention on every reply is noise. The `feishu_threads` table already tracks every thread the bot has touched (either by starting one via reply/post or by being mentioned into an existing one), so reuse it as the bypass signal. The whitelist still applies — only allow-listed senders benefit.
Previously the kernel force-`checkout`ed the stored `active_branch` on every inbound dispatch, so a manual `git checkout` in the workspace would silently get reverted the next time a message arrived. On top of that, `/status` read the stored hint for the active-repo line while the per-repo list read HEAD — a single card could contradict itself. Unify on HEAD everywhere a user can see the "current branch": - remove the pre-dispatch checkout in the inbound-message handler - `/status`, `/bind`, and `/switch` cards read HEAD via a new `readRepoHead` helper exported from `git-sync` - `DEV_ASSETS_PRIMARY_BRANCH` env reflects HEAD, so the agent sees what it will actually run on The `active_branch` DB column stays (still written by `/setup` and `/checkout` as a last-known hint), but it no longer drives behavior or display. `/checkout <branch>` remains the explicit branch-switch entry.
Three new commands for self-service group / whitelist management: - `/group <name> @user1 @user2 ...` (P2P only): bot creates a new chat with the sender + mentioned users, transfers ownership to the sender, persists the chat to `feishu_bot_groups`, and posts a /setup card so the group is immediately ready for workspace init. - `/ungroup [name|chat_id]`: in-group with no args dismisses the current chat if the bot created it and the sender is the original `/group` caller; in P2P requires a name/chat_id and only matches groups the sender themselves created. - `/allow @user1 @user2 ...`: adds mentions to the channel's whitelist both in memory and by rewriting config.yaml, so admins don't have to edit the file by hand. Supporting changes: - UserMessage now carries `mentions`, populated from the Feishu event's mentions array — the three commands resolve @-placeholders to open_ids via this field. - `CommandContext` gains `feishuChannels` so regular command handlers can reach into channel-specific APIs without being kernel-special-cased. - New `feishu_bot_groups` table tracks which chats the bot created so `/ungroup` can authorize dismissal and look up by name in P2P. - `FeishuMessageChannel` gains `createChat`, `transferChatOwner`, `dismissChat`, `sendPlainText`, `addToWhitelist`, plus bot-group lookup/delete helpers. - `/group` is kernel-special-cased (like /setup, /switch) because it orchestrates across channel ops, DB writes, and SetupFlow. Requires Feishu app scopes: `im:chat:create`, `im:chat.owner:update`, `im:chat:delete` (operator).
Host zshrc never fires because agentara spawns claude/codex directly
with Bun.spawn — no interactive shell. To let users inject proxy vars,
custom certs, feature flags, etc. at launch, add `agents.env: {..}` to
config.yaml, merged into the spawn env of both runners.
Precedence: Bun.env ← agents.env ← (codex isolationEnv) ← envExtras ←
(claude ANTHROPIC_API_KEY blank). Workspace-level per-dispatch overrides
still win; the host env stays the baseline; agents.env fills the gap
the missing shell preamble used to cover.
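The precedence chain reads left to right as "later wins", which maps directly onto object spread. A minimal sketch with hypothetical names (`buildSpawnEnv`, `SpawnEnvInputs`); blanking `ANTHROPIC_API_KEY` to an empty string is an assumption about what "blank" means here.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical input bundle; field names are assumptions for illustration.
interface SpawnEnvInputs {
  hostEnv: Env;                // Bun.env baseline
  agentsEnv?: Env;             // agents.env from config.yaml
  isolationEnv?: Env;          // codex only (e.g. CODEX_HOME)
  envExtras?: Env;             // per-dispatch workspace overrides
  blankAnthropicKey?: boolean; // claude only
}

// Later spreads override earlier ones, mirroring the documented precedence.
function buildSpawnEnv(i: SpawnEnvInputs): Env {
  return {
    ...i.hostEnv,
    ...i.agentsEnv,
    ...i.isolationEnv,
    ...i.envExtras,
    ...(i.blankAnthropicKey ? { ANTHROPIC_API_KEY: "" } : {}),
  };
}
```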
Previously createAgentRunner was a hardcoded switch over the four built-in types. Extract the switch into a Map-based registry (registerRunner / createRunner / listRunnerTypes) so new runner types can self-register at module load without touching the factory. Plugins now live under src/plugins/ and are wired in by the kernel via a single side-effect import of the plugins barrel. TypeScript enforces the AgentRunner contract at build time — plugins that don't implement it won't compile. Two plugins land with the mechanism: - claude-gated: wraps ClaudeAgentRunner with the same preamble the author's zshrc runs before the real claude — pull the proxy out of agents.env, probe three IP-geolocation endpoints with a 2s timeout, and abort the dispatch when the egress country is not US (or every probe fails). The proxy is forwarded to the inner spawn via envExtras so a misconfigured agents.env doesn't silently disable egress. - codex-gated: symmetric wrapper over CodexAgentRunner, same gate. Enable by setting `agents.default.type: "claude-gated"` or `"codex-gated"` in config.yaml. The original `claude` / `codex` types stay registered; switching is instant.
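A Map-based registry of this kind can be sketched as follows. The three function names come from the commit message; the `AgentRunner` shape, the factory signature, and the duplicate-registration error are assumptions.

```typescript
// Assumed minimal contract; the real AgentRunner interface is likely richer.
interface AgentRunner {
  run(prompt: string): Promise<string>;
}

type RunnerFactory = () => AgentRunner;

const runnerRegistry = new Map<string, RunnerFactory>();

// Called by each runner module at load time to self-register.
function registerRunner(type: string, factory: RunnerFactory): void {
  if (runnerRegistry.has(type)) {
    throw new Error(`runner type already registered: ${type}`);
  }
  runnerRegistry.set(type, factory);
}

// Replaces the hardcoded switch: look up the factory and build a runner.
function createRunner(type: string): AgentRunner {
  const factory = runnerRegistry.get(type);
  if (!factory) throw new Error(`unknown runner type: ${type}`);
  return factory();
}

function listRunnerTypes(): string[] {
  return Array.from(runnerRegistry.keys());
}
```

A plugin module would then call `registerRunner("my-type", () => new MyRunner())` at top level, and the side-effect import of the plugins barrel is what triggers that registration.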
Having a required `model` with a Claude-flavored default value baked into the schema silently produced a mismatch when the default type was codex (e.g. `--model claude-sonnet-4-6` passed to Codex). Drop the schema default and gate the `--model` flag on the value being set, so omitting `model` from config.yaml leaves model selection to the runner's own CLI (`claude` / `codex`). The bootstrap template is updated too, with the example commented out to make the "optional" stance obvious.
`<b>...</b>` doesn't render reliably in Feishu interactive cards, so steer the agent toward `**text**` for emphasis while keeping `<font color>` scoped to color only. Adds an explicit combo example for the color+emphasis case.
Adds `dangerouslySkipPermissions` to AgentRunOptions and forces it on inside claude-gated so the unattended robot flow stops stalling on per-tool approvals. Base ClaudeAgentRunner keeps the safe default.
Long agent runs were hitting Feishu's card-content caps mid-stream.
Observed failure modes:
- container `elements` > 50 -> error 11310 `element exceeds the limit`
- a single 10 KB+ Bash `description` would push the card JSON toward
the 30 KB body ceiling even below the element cap
Replaces the previous best-effort truncation with a pre-flight splitter
that walks the message content, caps each chunk at 25 steps AND ~20 KB
of estimated rendered bytes, and keeps any single oversized step in its
own chunk rather than dropping data.
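A pre-flight splitter with these two caps can be sketched as below. It is an illustration under assumptions: steps are modelled as plain strings and "estimated rendered bytes" is approximated by UTF-8 length; the real estimator is not shown in the commit.

```typescript
const MAX_STEPS_PER_CHUNK = 25;
const MAX_BYTES_PER_CHUNK = 20 * 1024;

// UTF-8 byte length of a (well-formed) string, used here as the size estimate.
function utf8Bytes(s: string): number {
  let n = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) n += 1;
    else if (c < 0x800) n += 2;
    else if (c >= 0xd800 && c <= 0xdbff) { n += 4; i++; } // surrogate pair
    else n += 3;
  }
  return n;
}

// Greedy split: close the current chunk when adding a step would break a cap.
// A single oversized step still gets its own chunk instead of being dropped.
function splitSteps(
  steps: string[],
  maxSteps: number = MAX_STEPS_PER_CHUNK,
  maxBytes: number = MAX_BYTES_PER_CHUNK,
): string[][] {
  const chunks: string[][] = [];
  let current: string[] = [];
  let bytes = 0;
  for (const step of steps) {
    const size = utf8Bytes(step);
    const full = current.length >= maxSteps || bytes + size > maxBytes;
    if (current.length > 0 && full) {
      chunks.push(current);
      current = [];
      bytes = 0;
    }
    current.push(step);
    bytes += size;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```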
The channel now tracks a CardChain per logical assistant message:
- cards[0] stays the caller-visible anchor id
- filled cards are PATCHed once into a frozen state then never touched
- the trailing card absorbs live updates until it fills, at which
point it freezes and a continuation card is posted as a reply
Non-streaming final state locks the whole chain so late updates become
no-ops. The 400 fallback path is preserved as defense-in-depth.
Two paper cuts that compounded into "/status silently does nothing":

1. /allow lock-out
   Running /allow when no whitelist exists initialized an empty Set and then filtered the sender out of the @-targets, so the operator lost access to their own bot on the very next message. Pass the sender to addToWhitelist and auto-seed it whenever we materialize a fresh whitelist from the implicit "everyone allowed" state.
2. Silent drops
   Rejected messages just logged and returned, so to the user it looked like the bot was broken. Reply with a short reason when the dropped message was clearly directed at the bot (slash command, @mention, bot-owned thread, or p2p). Casual group chatter from non-whitelisted members stays silently dropped.
Reverses the earlier card-chain approach (519f7fc) in favor of what the user actually wanted: tool steps never cause new cards, only the final model output spills when it's too large.

Step panel (one card, always):
- Clip per-step display text to 200 chars / first line, which keeps long Bash descriptions and multi-KB thinking traces from bloating the card JSON on their own.
- Drop oldest rows when step count exceeds 25. Pin a single `… N earlier steps` summary row at the top so the user sees both the most recent activity and how much history was hidden. Header still shows the TRUE step count.

Final markdown (may spill):
- Split into byte-bounded chunks (20 KB each) at paragraph / line boundaries. First chunk rides on the primary card; rest become text-only reply cards.
- Combined with the pre-existing 5-tables-per-card split.

Fix a latent bug uncovered while sizing the markdown budget: the non-streaming renderer wrote the full markdown into both `config.summary.content` AND `body.elements`, doubling the card's byte cost. Summary now carries a short preview (first line, up to 180 chars) instead; full content stays in the body.
Replace the removed codex-gated plugin with codex-yolo, which injects the fixed local proxy and enables Codex web search while preserving existing Codex exec behavior. Surface Codex resume rollout loss on the Feishu card with an explicit restart button instead of silently starting a new runner session. Legacy codex-gated sessions are mapped to codex-yolo, and restart confirmation clears stale runner ids before re-dispatching.
Aggregate scattered config commands into one interactive Feishu card: global defaults (agent / model / codex isolation / retries) edit in place and persist back to config.yaml, while a workspace list surfaces repo+branch, bindings, and last-active time with cascading delete. - add workspaces.last_active_at column (drizzle 0015, backfilled) - GroupWorkspaceStore: touchLastActive on resolve/bind, deleteWorkspace cascades tasks/sessions/bindings/dir, protects _default - SettingFlow orchestrates main/detail/confirm/result transitions via updateRawCard; actions share the setting_ prefix for routing - help text adds /setting and /workspaces entries
… each workspace Codex resolves CLAUDE.md and @memory imports from cwd (the workspace root). Without these links Codex ran off a stale global AGENTS.md with `<!-- file not found: memory/SOUL.md -->` and wrote its own auto-memory into `<workspace>/memory/USER.md`, producing per-workspace memory islands that never reached the global SOUL/USER context. Only `.claude/skills/` is shared from `.claude/`; runtime state (local settings, todos, ide) stays per-workspace. `_ensureSymlink` is idempotent and backs up any real file/dir found at a link target as `.bak.<timestamp>` instead of overwriting.
- unique button name per workspace so Feishu accepts the card (11310 "name duplicate" otherwise blocks the whole render) - drop unsupported <font color='blue'>; limit to grey/green/red - inline a 删除 button next to 详情 on each row, so listing-level deletion doesn't require drilling into the detail view - re-layout each row into title / active repo / meta lines with id and a colour-coded last-active badge (green <3d, red 超过 N 个月 未活跃 once 30d+) so dormant workspaces surface at a glance - append 返回设置面板 on every result card so the user isn't stranded after save/delete completes - read the active repo branch from on-disk HEAD (matching /status) instead of the stored active_branch hint, which drifts after any user-initiated checkout - surface the real Feishu error code/msg on render failure instead of axios's generic "Request failed with status code 400"
The previous /help dumped all 15+ commands into one bullet list, with a redundant preface and every command wrapped in inline code (which Feishu renders as a separate bordered pill). Hard to scan and visually noisy. - bucket commands into intent groups (常用 / Workspace / 仓库操作 / Agent / 群权限 / 话题响应), each with a bold heading + plain bullet list — no intro, no collapsible panel, no per-command code border - drive the list from a declared HELP_GROUPS table so help copy no longer duplicates handler descriptions and is easy to reorder - separate fallback_text builder so non-card channels still get a readable plain-text dump
The previous detection walked the probe list serially with a 2s per-probe timeout. When ipapi.co was rate-limited or slow (common from shared egress IPs), we'd block the full 2s before even trying the next endpoint, making worst-case detection ~6s and frequently pushing the whole claude-gated dispatch past its tolerance.

- race every probe in parallel via a small _firstNonNull helper; the first successful ISO code wins, null only if every probe fails or times out
- drop ipapi.co, which is the endpoint that most often forces us to pay the full 2s timeout
- worst-case detection is now ~2s regardless of how many endpoints are slow or offline
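The race can be sketched as below. This is a reconstruction of the idea, not the plugin's `_firstNonNull`: the signatures are assumptions, and `withTimeout` is a hypothetical companion that turns a slow or failing probe into `null`.

```typescript
// Resolve with the first non-null result; resolve null only once every
// probe has settled with null or rejected.
function firstNonNull<T>(probes: Array<Promise<T | null>>): Promise<T | null> {
  return new Promise((resolve) => {
    if (probes.length === 0) {
      resolve(null);
      return;
    }
    let pending = probes.length;
    const settle = (value: T | null): void => {
      if (value !== null) resolve(value);
      else if (--pending === 0) resolve(null);
    };
    for (const p of probes) {
      p.then(settle, () => settle(null)); // a rejected probe counts as null
    }
  });
}

// Hypothetical helper: cap a probe at `ms`, mapping timeout/failure to null.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T | null> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve(null), ms);
    p.then(
      (v) => { clearTimeout(timer); resolve(v); },
      () => { clearTimeout(timer); resolve(null); },
    );
  });
}
```

With every probe wrapped in `withTimeout(probe, ms)`, the worst case is bounded by a single timeout rather than the sum of all of them.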
Result-card summaries often carry an <at id=...></at> mention so the body renders a proper @用户 link. But the subtitle built by buildCardIntro wraps its text in <font color='grey'>…</font>, and Feishu doesn't re-parse the <at> tag inside <font>, so the raw `at id=ou_...` attribute text leaks into the card header (visible on permission-decide results). - summarizeForSubtitle now drops <at …></at> tags before truncating - collapses the resulting double spaces so nothing reads awkwardly
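A minimal sketch of the subtitle cleanup, assuming a simple length-based truncation; the default `maxLen` and the ellipsis are illustrative choices, not values from the PR.

```typescript
// Sketch of the described cleanup; maxLen default is an assumption.
function summarizeForSubtitle(text: string, maxLen: number = 60): string {
  const cleaned = text
    .replace(/<at\b[^>]*>.*?<\/at>/g, "") // drop <at ...></at> mention tags
    .replace(/\s{2,}/g, " ")               // collapse the gaps left behind
    .trim();
  return cleaned.length > maxLen
    ? cleaned.slice(0, maxLen - 1) + "…"
    : cleaned;
}
```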
Frequently-used tools (Bash, Edit) trigger a permission card on every call, which gets noisy once the user has already decided the agent can be trusted with that tool in the current session. - add a third button "🔓 批准并在本次会话内不再询问该工具" on the permission card, placed on its own row so the longer copy isn't squashed and users don't mistake it for a single-shot approve - PermissionFlow now keeps an in-memory session_id -> Set<tool_name> allowlist; request() short-circuits to `allow` when a match exists, without sending a card or touching Feishu - scope is strictly per-session, per-tool; the map is dropped on kernel restart so trust never silently carries across boots - expose clearSession(sessionId) for eager cleanup when a session tears down - new result-card outcome `allowed_session` documents the decision on the updated card so the chat history stays auditable - tests cover the auto-allow short-circuit and clearSession behavior
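The per-session, per-tool allowlist can be pictured as a `Map` of `Set`s. The sketch below isolates only that bookkeeping; the class and method names other than `clearSession` are hypothetical, and card sending is left out.

```typescript
type Decision = "allow" | "ask";

// In-memory only: state is lost on restart, so trust never carries across boots.
class SessionToolAllowlist {
  private readonly allowed = new Map<string, Set<string>>();

  // Record "approve and stop asking for this tool in this session".
  allowForSession(sessionId: string, toolName: string): void {
    let tools = this.allowed.get(sessionId);
    if (!tools) {
      tools = new Set<string>();
      this.allowed.set(sessionId, tools);
    }
    tools.add(toolName);
  }

  // Short-circuit check performed before any permission card is sent.
  check(sessionId: string, toolName: string): Decision {
    const tools = this.allowed.get(sessionId);
    return tools !== undefined && tools.has(toolName) ? "allow" : "ask";
  }

  // Eager cleanup when a session tears down.
  clearSession(sessionId: string): void {
    this.allowed.delete(sessionId);
  }
}
```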
`dummy` and `mock` are registered at boot only so tests can construct sessions without a real provider. They shouldn't appear in the /setting agent dropdown or /agent list. Add `filterUserFacingAgentTypes` and apply it in both surfaces. Selecting by explicit name still works.
Merge the "默认 Agent" select and "Agent Model" input into a single column_set with 2:1 weighting. Model is rarely tweaked once set, so it gets the narrower column. Saves a card row and keeps the primary agent picker visually dominant.
Increase the claude-gated country detection timeout to 5 seconds so slower proxy handshakes do not fail the gate prematurely.
Feishu delivers @ mentions as opaque `@_user_N` placeholders plus a `mentions` array carrying the real open_id and name. Until now the agent (and session logs / first-message preview) only saw the placeholders, so the LLM could not tell who was being addressed. Add `inlineMentions(text, mentions)` and call it in: - Claude and Codex agent runners, before serializing the prompt - SessionLogWriter user branch + formatFileLine pure-text writer - SessionManager firstMessage preview written to the session list Kernel command matching deliberately keeps the raw placeholders, since /group, /allow, etc. resolve open_ids from `message.mentions` directly.
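A rough sketch of what `inlineMentions` might do. The `Mention` shape and the `@Name(open_id)` output format are assumptions; the commit does not specify how the resolved mention is rendered.

```typescript
// Assumed, simplified mention shape (the Feishu event nests ids differently).
interface Mention {
  key: string;    // placeholder as it appears in text, e.g. "@_user_1"
  name: string;
  openId: string;
}

// Replace each placeholder with a readable form; unknown ones are left as-is.
function inlineMentions(text: string, mentions: Mention[]): string {
  const byKey = new Map<string, Mention>();
  for (const m of mentions) byKey.set(m.key, m);
  return text.replace(/@_user_\d+/g, (placeholder) => {
    const m = byKey.get(placeholder);
    return m ? `@${m.name}(${m.openId})` : placeholder;
  });
}
```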
Adds an explicit admin allowlist for the `/setting` interactive panel. `setting.admin_open_ids` in config.yaml defaults to an empty list, which keeps the existing channel-level whitelist as the sole gate (backward compatible). When non-empty, both `SettingFlow.start` and every `setting_*` card action are rejected for users not in the list — the text rejection for the slash command and a no-back result card for stale card clicks. Why: the panel exposes destructive actions (workspace delete, global config rewrite) that the channel whitelist doesn't differentiate from ordinary chat messages, so deployments with multiple whitelisted users need a finer-grained gate.
Adds an interactive `/repos` panel that lists every entry in `$AGENTARA_HOME/REPOS.md` and lets the user add, edit, or delete sections without hand-editing the file. - `repos-writer` mutates sections in-place: name-matched lookup skips H2 headings inside `<!-- ... -->` blocks (mirroring the reader), edits preserve non-bullet prose inside the section, deletes remove the section and collapse blank-line runs. - `repos-card` renders a main listing with inline edit/delete buttons, separate add/edit form cards, and a delete-confirm card. Every card carries the same "返回 + 关闭" affordance as the setup/setting cards. - `repos-flow` is stateless: every card action carries the `repo_name` it targets, so stale cards from before a restart still behave correctly. - Edit mode locks the name (rename = delete + add) so prose context inside the section can't be silently dropped. Wired into the kernel like `/setting`: kernel routes `/repos` to `ReposFlow.start` and any `repos_*` action to `ReposFlow.handleAction`. `/help` gains an entry under "仓库操作".
- Adds a "关闭" callback button on every setup/setting card. Clicking it swaps the card for a no-button dismissed state via `updateRawCard` so the form elements stop being interactable, matching the visual contract of the post-submit result card.
- `buildDismissedCard` in `setup/card-ui.ts` renders just title + subtitle. Earlier prototypes routed through `buildResultCard`, which duplicated the same line as both subtitle and body; the dismissed variant ships clean.
- `SetupFlow.handleSubmit` no longer rejects a submit with zero repos selected. First-time `/setup` then creates an empty workspace bound to the chat; on a re-run it just renames the workspace and leaves `active_repo` / `active_branch` untouched so popping the card open to fix the display name doesn't clobber state.
- `_tryUpdateCard` widened to `Card` (was `ReturnType<...>`) so the same path can push dismissed cards in addition to result cards.
Surface chat_id / thread_id / session_id / agent_type / runner_session_id and the thread's auto-respond flag for the current message. Adds read-only accessors SessionManager.getSession and FeishuMessageChannel.getThreadInfo, and threads sessionManager through CommandContext.
The per-session serial gate chained on the previous task's promise via .then(onFulfilled). A failing task left a rejected promise in _sessionLocks, so every later task for that session skipped its handler and stayed "pending" until restart — the bot went silent to all further messages in that thread. Publish a swallowed view (current.catch) as the lock tail and move cleanup into finally. The task still throws so bunqueue records the failure and the failure reply reaches the user, but successors run regardless. Adds a regression test.
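The fix can be illustrated with a small stand-alone gate. This is a sketch under assumptions, not the project's code: the class name `SessionGate` and its `run` method are hypothetical, and only the idea (publish a swallowed tail, clean up in `finally`, still reject the task's own promise) is taken from the commit.

```typescript
class SessionGate {
  // Tail promise per session; it never rejects, so successors always run.
  private readonly locks = new Map<string, Promise<void>>();

  run<T>(sessionId: string, task: () => Promise<T>): Promise<T> {
    const prev = this.locks.get(sessionId) ?? Promise.resolve();
    // `prev` is always a swallowed tail, so `task` runs even after a failure.
    const current = prev.then(task);
    const tail: Promise<void> = current
      .then(() => undefined, () => undefined) // swallowed view of the outcome
      .finally(() => {
        // Drop the entry only if no later task has replaced the tail.
        if (this.locks.get(sessionId) === tail) this.locks.delete(sessionId);
      });
    this.locks.set(sessionId, tail);
    // The caller still sees the real rejection and can record the failure.
    return current;
  }
}
```

The key point is that the map stores the swallowed tail while the caller receives the unswallowed promise, so failure reporting and queue progress are decoupled.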
AskUserQuestion previously rode the approve/deny permission card, but its
semantics need answers, not approval. Allowing it echoed back the raw
input with no `answers`, so Claude Code (headless) resolved the call with
empty answers — surfacing as the question being auto-rejected.
Route AskUserQuestion to a dedicated form: single-select questions render
as a dropdown, multi-select as checkers. On submit, form_value is mapped
back to option labels and returned as `updated_input: { questions, answers }`
(the route already maps this to the `updatedInput` shape Claude expects).
Incomplete submissions re-render the form with a warning instead of
resolving; a malformed payload denies with a hint rather than hanging.
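A rough sketch of the submit-time mapping described above. Several details are assumptions made for illustration: the form field names (`q0`, `q1`, ...), the use of option indices as form values, the answers being keyed by question text, and multi-select labels being joined with `", "`. The function name `mapFormToAnswers` is hypothetical.

```typescript
interface Question {
  question: string;
  multiSelect?: boolean;
  options: { label: string }[];
}

// Assumed form payload: one field per question, holding option indices as strings.
type FormValue = Record<string, string | string[] | undefined>;

type Outcome =
  | { kind: "resolved"; updated_input: { questions: Question[]; answers: Record<string, string> } }
  | { kind: "incomplete"; missing: string[] };

function mapFormToAnswers(questions: Question[], form: FormValue): Outcome {
  const answers: Record<string, string> = {};
  const missing: string[] = [];
  questions.forEach((q, i) => {
    const raw = form[`q${i}`];
    const values = Array.isArray(raw) ? raw : raw === undefined || raw === "" ? [] : [raw];
    // Map each selected index back to its option label; drop anything unknown.
    const labels = values
      .map((v) => q.options[Number(v)]?.label)
      .filter((label): label is string => typeof label === "string");
    if (labels.length === 0) missing.push(q.question);
    else answers[q.question] = labels.join(", ");
  });
  // Incomplete submissions are reported so the caller can re-render with a warning.
  return missing.length > 0
    ? { kind: "incomplete", missing }
    : { kind: "resolved", updated_input: { questions, answers } };
}
```

Returning an explicit `incomplete` outcome, instead of resolving with partial answers, matches the behaviour of re-rendering the form rather than completing the tool call.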
Agent-generated relative markdown links (e.g. `workspace/outputs/x.html`) were resolved against the global `$AGENTARA_HOME` instead of the chat's bound sub-workspace, where the agent actually runs and writes files. The files were never found, so attachments were silently dropped — and a same-named file in the global pool could be uploaded by mistake. Add `_resolveWorkspaceBaseDir()` (chat workspace cwd via the injected resolver, falling back to home) and use it in uploadImage, uploadFile, and _extractLocalFilePaths, matching the inbound download base dir.
Permission, clarifying-question, and codex-resume cards held their pending state only in memory, and the agent subprocesses that long-poll for a decision die with the kernel. After a restart an old card looked live but could never resolve. Add a kernel stop() wired to SIGINT/SIGTERM that, while the Feishu channels are still up, marks every outstanding card expired in place and resolves any awaiting permission/question promise with a deny. No persistence — a hard kill still relies on the click-time fallback.
…paces
The CLAUDE.md folder-structure doc described a nested `workspace/` wrapper (workspace/uploads, workspace/outputs, workspace/projects) from the old single-global-workspace model. Each chat now runs in its own flat sub-workspace whose root IS the agent cwd: inbound uploads land in `<cwd>/uploads/` and repos check out directly at the root. The agent, following the stale doc, wrote outputs to the global `~/.agentara/workspace/outputs/` and emitted `workspace/...` links the messaging channel could not resolve, so file attachments were dropped. Rewrite the layout and messaging examples to use root-relative paths (uploads/, outputs/, repos at root via REPOS.md) and warn against the `workspace/` prefix.
The previous fix resolved agent-generated relative file/image links against `config.chatId`'s workspace. But a single Feishu channel sends replies for many chats (replies route by message_id, not the channel's configured chat), so this picked the wrong workspace — usually the default — and the file was never found, dropping the attachment with no log. Resolve the base directory from the cwd recorded for the message's owning session instead, which is exactly where the agent ran and wrote the file. Inject a `resolveSessionCwd` resolver from the kernel (SessionManager.getSession().cwd) and thread `session_id` through the file-attachment and inline-image (card + continuation) paths. Falls back to the chat workspace, then home, when the session is unknown.
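The fallback order (session cwd, then chat workspace, then home) could be sketched like this. `resolveSessionCwd` is named in the commit; `resolveChatWorkspaceCwd`, `BaseDirDeps`, and `resolveBaseDir` are hypothetical names used only for this illustration.

```typescript
type Resolver = (id: string) => string | undefined;

interface BaseDirDeps {
  resolveSessionCwd: Resolver; // injected from the kernel, per the commit
  resolveChatWorkspaceCwd: Resolver; // hypothetical name for the chat-workspace resolver
  homeDir: string; // stands in for $AGENTARA_HOME
}

function resolveBaseDir(
  deps: BaseDirDeps,
  ids: { session_id?: string; chat_id?: string },
): string {
  // 1. Prefer the cwd recorded for the session that produced the message.
  const fromSession = ids.session_id ? deps.resolveSessionCwd(ids.session_id) : undefined;
  if (fromSession) return fromSession;
  // 2. Fall back to the workspace bound to the chat.
  const fromChat = ids.chat_id ? deps.resolveChatWorkspaceCwd(ids.chat_id) : undefined;
  // 3. Last resort: the global home directory.
  return fromChat || deps.homeDir;
}
```

Resolving from the session rather than from the channel's configured chat is what keeps one channel that serves many chats from reading files out of the wrong workspace.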
`make restart` only killed the tracked wrapper PID in .run/*.pid. But `bun run <script>` is a wrapper that spawns the real worker as a child, and killing the wrapper orphaned that child (reparented to init), so the old backend kept its Feishu connection alive on stale code while up.sh started a second one — code changes silently never took effect. down.sh now walks and kills the wrapper's entire descendant tree, and adds a project-scoped orphan sweep (matched by command pattern, filtered to processes whose cwd/args are under the project dir) to reap leftovers from previous buggy stops without touching unrelated `bun run` processes.
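The real change lives in a shell script (`down.sh`); the TypeScript below is only an illustration of the tree walk it describes. The `Proc` shape and the `descendantsOf` helper are hypothetical, and the process table is passed in as plain data rather than read from the system.

```typescript
interface Proc {
  pid: number;
  ppid: number;
}

// Returns every descendant of `root`, deepest processes first, so that
// children can be signalled before their parents.
function descendantsOf(root: number, procs: Proc[]): number[] {
  const children = new Map<number, number[]>();
  for (const p of procs) {
    const list = children.get(p.ppid) ?? [];
    list.push(p.pid);
    children.set(p.ppid, list);
  }
  const found: number[] = [];
  const seen = new Set<number>([root]);
  const stack: number[] = [root];
  while (stack.length > 0) {
    const pid = stack.pop() as number;
    for (const child of children.get(pid) ?? []) {
      if (seen.has(child)) continue; // guards against malformed, cyclic input
      seen.add(child);
      found.push(child);
      stack.push(child);
    }
  }
  // Discovery order lists parents before descendants; reverse it.
  return found.reverse();
}
```

Killing only the wrapper PID corresponds to signalling `root` alone, which is how the worker child ended up orphaned.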
Unregistered `/`-prefixed messages used to always get an "unknown command" reply. Add a passthrough whitelist (currently just `/compact`) so those commands are forwarded verbatim to the underlying CLI (Claude Code), which owns them in non-interactive mode. Everything outside the whitelist still returns the unknown-command reply to avoid wasting an agent turn on typos.
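As a sketch of the routing rule, assuming a hypothetical `routeSlash` function and `Route` type (the whitelist content, `/compact`, is from the commit):

```typescript
type Route = "chat" | "registered" | "passthrough" | "unknown";

// Commands forwarded verbatim to the underlying CLI.
const PASSTHROUGH_COMMANDS: ReadonlySet<string> = new Set(["/compact"]);

function routeSlash(text: string, registered: ReadonlySet<string>): Route {
  const trimmed = text.trim();
  if (!trimmed.startsWith("/")) return "chat";
  // The command name is the first whitespace-delimited token.
  const name = trimmed.split(/\s+/, 1)[0] ?? "";
  if (registered.has(name)) return "registered";
  if (PASSTHROUGH_COMMANDS.has(name)) return "passthrough";
  // Typos and anything else get the unknown-command reply instead of an agent turn.
  return "unknown";
}
```

Checking registered commands before the whitelist means a gateway-level handler always takes precedence if a name ever appears in both sets.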
… cards
Render a small grey footer on the final card: the model that served the turn, context-window occupancy, and the account's rolling 5-hour / 7-day usage limits, each as a unicode progress bar.
- Capture per-turn token usage AND the resolved model id from the Claude stream (skipping the `<synthetic>` placeholder); store both on AssistantMessage.
- Default the context window to 1M (LONG_CONTEXT_TOKENS): configured models run on the long-context window and the id can't be reliably mapped, so a fixed denominator keeps occupancy stable across a session.
- Extract queryClaudeUsage into community/anthropic/claude-usage.ts with a cached getter; the server usage route reuses it.
- Thread CardFooterStats through the messaging gateway/channel; the kernel builds it from the latest turn that reported usage.
Best-effort: scoped to runs that report usage (Claude); any lookup failure degrades quietly.
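A unicode progress bar of the kind mentioned above could be rendered as follows. The glyphs, the bar width, and the helper names `renderBar` and `contextLine` are assumptions; `LONG_CONTEXT_TOKENS` and its 1M value are from the commit.

```typescript
const LONG_CONTEXT_TOKENS = 1_000_000;

// Renders a fixed-width bar; out-of-range or non-finite ratios are clamped.
function renderBar(ratio: number, width = 10): string {
  const safe = Number.isFinite(ratio) ? ratio : 0;
  const clamped = Math.min(1, Math.max(0, safe));
  const filled = Math.round(clamped * width);
  return "█".repeat(filled) + "░".repeat(width - filled);
}

// Context occupancy against the fixed long-context denominator.
function contextLine(usedTokens: number): string {
  const ratio = usedTokens / LONG_CONTEXT_TOKENS;
  return `${renderBar(ratio)} ${Math.round(ratio * 100)}%`;
}
```

Clamping before rendering keeps the footer well-formed even if a turn reports more tokens than the assumed window.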
A non-zero exit dumped the entire stream-json stdout (can be megabytes) into the error message and failure card. Keep only a short tail of stdout and a larger tail of stderr (where the real reason lands), with a marker noting how much was dropped.
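A minimal sketch of tail-only error formatting. The limits (500 and 4000 characters), the marker wording, and the function names are assumptions chosen for illustration; the commit only states that stdout gets a short tail and stderr a larger one.

```typescript
// Keeps the last `maxChars` characters and notes how much was cut.
function tailWithMarker(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;
  const dropped = text.length - maxChars;
  return `[... ${dropped} chars dropped ...]\n` + text.slice(text.length - maxChars);
}

// Assumed limits: stderr usually carries the real reason, so it gets more room.
const STDOUT_TAIL_CHARS = 500;
const STDERR_TAIL_CHARS = 4000;

function formatExitError(code: number, stdout: string, stderr: string): string {
  return [
    `CLI exited with code ${code}`,
    `stderr:\n${tailWithMarker(stderr, STDERR_TAIL_CHARS)}`,
    `stdout:\n${tailWithMarker(stdout, STDOUT_TAIL_CHARS)}`,
  ].join("\n");
}
```

With these limits the message stays bounded at a few kilobytes no matter how large the stream-json output was.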
Replace the country-code gate in claude-gated with a Clash readiness check. The runner now allows dispatch when Mihomo TUN is enabled or the configured local HTTP proxy can reach a lightweight connectivity endpoint.
emit() ran listeners synchronously, so a blocking handler (workspace deletion's recursive rmSync) stalled the event loop before the WS handler could return its toast ack — the card button hit Feishu's callback timeout while work was still running, then updated late via updateRawCard. Dispatch the card:action emit on a later macrotask so the ack returns first.
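The ordering fix can be shown with a toy event bus. `CardActionBus`, `handleCardAction`, and the toast text are hypothetical; the sketch only demonstrates deferring a synchronous `emit` to a later macrotask with `setTimeout(..., 0)` so the acknowledgement is returned first.

```typescript
type Listener = (payload: unknown) => void;

class CardActionBus {
  private readonly listeners: Listener[] = [];

  on(listener: Listener): void {
    this.listeners.push(listener);
  }

  // Synchronous emit: a slow listener blocks the caller until it finishes.
  emit(payload: unknown): void {
    for (const listener of this.listeners) listener(payload);
  }
}

// Returns the toast ack immediately; listeners run on a later macrotask.
function handleCardAction(bus: CardActionBus, payload: unknown): { toast: string } {
  setTimeout(() => bus.emit(payload), 0);
  return { toast: "processing" };
}
```

A blocking listener still occupies the event loop once it starts, but by then the acknowledgement has already been handed back to the transport.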
…ke /clone async
A slash-command card reply hitting a Feishu 400 propagated out of the fire-and-forget message:inbound listener as an unhandled rejection and took down the whole backend. Harden the reply path and defer slow clones.
- boot-loader: add unhandledRejection/uncaughtException guards (log, stay alive)
- kernel: wrap the message:inbound listener with .catch
- kernel: _replyTextOrCard now falls back card -> card reply -> real plain text (sendPlainText) and never throws out of the handler
- commands: add deferred_card result; /clone shows a "cloning..." card at once and patches it to done/failed when git clone finishes
- commands: normalize ssh/http/ported git URLs to one key so predefined repos hit the shared mirror cache regardless of protocol
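For the URL normalization in the last bullet, one possible sketch is shown below. The key format (`host/path` with the host lowercased, no port, no user, no `.git` suffix) and the function name `normalizeGitUrl` are assumptions; the commit only says that ssh, http, and ported URLs map to one key.

```typescript
function normalizeGitUrl(raw: string): string {
  const url = raw.trim();
  // scheme://[user@]host[:port]/path  (https, http, ssh, git, ...)
  const withScheme = /^[a-z][a-z0-9+.-]*:\/\/(?:[^@\/]+@)?([^:\/]+)(?::\d+)?\/(.+)$/i.exec(url);
  // scp-like form: [user@]host:path
  const scpLike = /^(?:[^@\/]+@)?([^:\/]+):(.+)$/.exec(url);
  const match = withScheme ?? scpLike;
  if (!match) return url; // unknown shape: leave untouched
  const host = (match[1] ?? "").toLowerCase();
  const path = (match[2] ?? "")
    .replace(/^\/+/, "")
    .replace(/\/+$/, "")
    .replace(/\.git$/i, "");
  return `${host}/${path}`;
}
```

The path is left case-sensitive on purpose, since some hosts distinguish repository names by case; only the host is lowercased.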
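Summary
This PR merges the Feishu workspace and runtime enhancements from the dev branch, as a whole, onto the current upstream `main`. Main changes:
- `/setup`, `/switch`, `/setting`, workspace deletion, repo selection, and per-workspace task dispatch.
- `/new`, `/topic`, `/group`, `/ungroup`, `/allow`, `/repos`, `/agents`, and runtime agent switching.
Migration notes
Upstream `main` already has `0010_remarkable_malice`, which adds the handoff session fields. The merge keeps the upstream migration and shifts all of dev's original migrations after it:
- `0011_lean_enchantress`
- `0012_workspace_ids`
- `0013_workspace_active_state`
- `0014_steady_korvac`
- `0015_closed_loki`
- `0016_fuzzy_gressill`
The corresponding snapshots keep the upstream `sessions.handoff` field and layer dev's workspace schema changes on top.
Testing
- `bun run check`
- `bun test`: 220 tests pass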