diff --git a/docs/mkdocs/en/tool.md b/docs/mkdocs/en/tool.md index ed2823cb..b14b235b 100644 --- a/docs/mkdocs/en/tool.md +++ b/docs/mkdocs/en/tool.md @@ -31,6 +31,11 @@ Agents dynamically use tools through the following steps: | [File Tools](#file-tools) | File operations and text processing | Use FileToolSet or individual tools | Read/write files, search, command execution | | [LangChain Tools](#langchain-tools) | Reuse tools from the LangChain ecosystem | Wrap as async functions and package as FunctionTool | Web search (Tavily), etc. | | [Streaming Tools](#streaming-tools) | Real-time preview of long text generation | Use StreamingFunctionTool | Code generation, document writing | +| [WebFetchTool](#webfetchtool) | Fetch and textify a single public URL | Instantiate WebFetchTool and add to tools | Documentation pages, RFCs, changelogs, news | +| [WebSearchTool](#websearchtool) | Public web search engine retrieval | Instantiate WebSearchTool and add to tools | Real-time news, releases, fact/definition lookups | +| [TodoWriteTool](#todowritetool-task-checklist-tool) | Multi-step task planning and progress tracking (full-list replace) | Mount `TodoWriteTool` | Short checklists, no dependency graph, token-insensitive | +| [Task Tool Family](#task-tool-family-structured-task-board) | Structured task board (incremental updates by id + dependencies) | Mount `TaskToolSet` | Long boards, cross-turn tracking, `blockedBy` dependencies | +| [Goal Tool Family](#goal-tool-family-persistent-session-goal) | Single persistent session goal + enforced completion | `setup_goal(agent)` | Cross-turn objectives, host-set goals, no premature finals | | [Agent Code Executor](./code_executor.md) | Automatic code generation and execution scenarios, data processing scenarios | Configure CodeExecutor | Automatic API invocation, tabular data processing | --- @@ -2841,3 +2846,356 @@ The example builds four independent Agents and covers the following scenarios: - **DuckDuckGo raw-hit mode** (`ddg_raw_agent`, `dedup_urls=False`): preserves the provider's raw recall list for downstream processing - **Google baseline** (`google_agent`, `safe=active`): real public web search + server-side single-domain `siteSearch` + client-side multi-domain filtering + blocklist + per-call `lang` override - **Google freshness Agent** (`google_raw_agent`, `dateRestrict=m6` + `dedup_urls=False`): keeps only results indexed within the last 6 months, suitable for "latest / what's new" queries + +--- + +## TodoWriteTool (Task Checklist Tool) + +`TodoWriteTool` is the framework's built-in **structured task checklist tool**, aligned with Claude Code / DeepAgents `TodoWrite` semantics: the model sends the **complete, updated list** in a single `todo_write` call; the tool validates it, fully replaces the previous list, and persists it to session-level state so plans and progress survive across `Runner.run_async` invocations. + +Best for **fewer steps, no explicit dependency edges, and simple implementation**. If you need server-assigned ids, incremental `taskId` patches, or `blockedBy` / `blocks` dependency orchestration, use the [Task Tool Family](#task-tool-family-structured-task-board) below instead. + +### Features + +- **Full-list replace**: each call passes the complete `todos` array; the new list **fully overwrites** the old one (no smart merge). The only valid way to clear is an explicit `todos: []` +- **Session-level persistence**: the checklist is serialized to JSON in `tool_context.state["todos[:]"]` (default prefix `todos`; **do not** use `temp:` — that prefix is stripped by `BaseSessionService` and is not persisted) +- **Sub-agent isolation**: the state key appends `:` so parent / child agents maintain separate lists +- **Hard contract validation (code-enforced)**: non-empty `content` / `activeForm`, at most one `in_progress`, globally unique `content`; violations return `INVALID_ARGS` / `INVALID_TODOS` +- **Layered prompt guidance**: `DEFAULT_TODO_PROMPT` is auto-injected into the system instruction via `process_request`, separate from hard contracts +- **Structured diff in responses**: on success returns `{message, todos, oldTodos}` for front-end / CLI rendering +- **Optional policy hooks**: read-only `nudge_hooks` can append strategy hints to `message` (must not modify the list) +- **Auto-clear when all done**: with `clear_on_all_done=True` (default), an all-`completed` list is persisted as empty to avoid stale accumulation + +### TodoWriteTool Parameters + +| Parameter | Type | Default | Description | +|------|------|--------|------| +| `state_key_prefix` | `str` | `"todos"` | State key prefix; do not use `temp:` | +| `clear_on_all_done` | `bool` | `True` | Clear persisted list when all items are `completed` | +| `default_nudge` | `str` | built-in text | Base hint appended on every successful response | +| `nudge_hooks` | `Optional[List[NudgeHook]]` | `None` | Read-only policy hook list | +| `filters_name` / `filters` | — | `None` | Filters forwarded to `BaseTool` | + +**LLM call parameters** (`todo_write`): + +| Parameter | Type | Required | Description | +|------|------|------|------| +| `todos` | `array` | Yes | Full list; each item has `content` (imperative), `activeForm` (in-progress label), `status` (`pending` / `in_progress` / `completed`) | + +**Successful response fields**: + +| Field | Type | Description | +|------|------|------| +| `message` | `str` | Base nudge + hook-appended text | +| `todos` | `array` | Persisted current list | +| `oldTodos` | `array \| null` | Previous list (`null` on first write) | + +### Usage + +```python +from trpc_agent_sdk.agents import LlmAgent +from trpc_agent_sdk.models import OpenAIModel +from trpc_agent_sdk.tools import TodoWriteTool + +agent = LlmAgent( + name="todo_planner", + model=OpenAIModel(model_name="...", api_key="...", base_url="..."), + instruction="You are a planning assistant; use todo_write for multi-step tasks.", + tools=[TodoWriteTool()], +) +``` + +Read back the persisted checklist (REST / audit): + +```python +from trpc_agent_sdk.tools import get_todos, render_todos + +todos = get_todos(session, branch=agent.name) +print(render_todos(todos)) # ✅ / 🔄 / ⬜ plain-text checklist +``` + +### TodoWriteTool vs Task Tool Family + +| Dimension | `TodoWriteTool` | `TaskToolSet` | +| --- | --- | --- | +| Tool count | 1 (`todo_write`) | 4 (`task_create` / `task_update` / `task_get` / `task_list`) | +| Update style | Full-list replace | Incremental patch by `taskId` | +| Item identity | `content` (unique key) | `id` (server-assigned) | +| Dependencies | None | `blockedBy` / `blocks`; upstream `completed` auto-unblocks | +| State key | `todos[:branch]` | `tasks[:branch]` | +| Parallel tool calls | Full-list overwrite, natural last-write-wins | `task_store_lock` serializes RMW | + +> **Mount one or the other**; mounting both tends to confuse the model. + +### TodoWriteTool Complete Example + +See [examples/todo_tool/run_agent.py](../../../examples/todo_tool/run_agent.py): multiple turns in one session — plan → complete items step by step — with `get_todos` reading back the persisted list after each turn. + +--- + +## Task Tool Family (Structured Task Board) + +`TaskToolSet` exposes four tools — `task_create`, `task_update`, `task_get`, `task_list` — aligned with Claude Code v2.1.142+ structured Task capabilities. Unlike `TodoWriteTool`'s full-list replace, the Task family uses **incremental updates by server-assigned `id`**: creation returns an id; later `task_update` patches status, fields, or dependency edges locally. + +The entire board is serialized as a **single JSON blob** in `tool_context.state["tasks[:]"]`, surviving across turns. `highwatermark` records the highest id ever assigned; soft-deleted tasks (`status: deleted`) **never reuse ids**. + +### Features + +- **Incremental updates**: `task_create` assigns ids; `task_update` patches by `taskId` without resending the whole board +- **Dependency orchestration**: `addBlockedBy` / `removeBlockedBy` (and `addBlocks` / `removeBlocks`) maintain bidirectional edges; upstream `completed` removes ids from downstream `blockedBy` and returns `unblocked` +- **Token optimization**: `task_list` returns summaries only (omits `description`); use `task_get` for full detail +- **Hard contract validation**: non-empty `subject`, valid status, existing dependency refs, **acyclic** graph (`detect_cycle`), default **at most one `in_progress`** (`enforce_single_in_progress`, can disable) +- **Concurrency safety**: `_TaskToolBase` wraps load → mutate → save in `task_store_lock` (per session + branch), compatible with `parallel_tool_calls=True` +- **Auto prompt injection**: `DEFAULT_TASK_PROMPT` injected once when multiple tools are mounted + +### TaskToolSet Constructor Parameters + +| Parameter | Type | Default | Description | +|------|------|--------|------| +| `state_key_prefix` | `str` | `"tasks"` | State key prefix; do not use `temp:` | +| `enforce_single_in_progress` | `bool` | `True` | Reject a second `in_progress` when one already exists | +| `inject_prompt` | `bool` | `True` | Inject `DEFAULT_TASK_PROMPT` into system instruction | + +### LLM Parameters for the Four Tools + +**`task_create`** + +| Parameter | Required | Description | +|------|------|------| +| `subject` | Yes | Short imperative title | +| `description` | No | Free-text detail | +| `activeForm` | No | In-progress label | +| `metadata` | No | Extension key-value map | + +Returns `{task: {id, subject}, message}`. + +**`task_update`** + +| Parameter | Required | Description | +|------|------|------| +| `taskId` | Yes | Task id to update | +| `status` | No | `pending` / `in_progress` / `completed` / `deleted` | +| `subject` / `description` / `activeForm` / `owner` / `metadata` | No | Scalar field patches | +| `addBlockedBy` / `removeBlockedBy` | No | Upstream dependency id lists | +| `addBlocks` / `removeBlocks` | No | Downstream blocked-id lists | + +Returns `{task, unblocked, message}`; `unblocked` lists pending task ids unblocked by this completion. + +**`task_get`**: `taskId` (required) → full record including `description`. + +**`task_list`**: optional `includeDeleted`; returns `{tasks, stats}` with summaries (no `description`). + +**Common error codes**: `INVALID_ARGS`, `INVALID_DEPENDENCY`, `INVALID_STATUS`, `NOT_FOUND`. + +### Usage + +```python +from trpc_agent_sdk.agents import LlmAgent +from trpc_agent_sdk.models import OpenAIModel +from trpc_agent_sdk.tools import TaskToolSet + +agent = LlmAgent( + name="task_planner", + model=OpenAIModel(model_name="...", api_key="...", base_url="..."), + instruction="Use task_create / task_update to maintain the board for multi-step projects.", + tools=[TaskToolSet()], + # With parallel_tool_calls=True, concurrent task tools on the same board are serialized by task_store_lock +) +``` + +Read back the persisted board (REST / audit / demo wrap-up): + +```python +from trpc_agent_sdk.tools import get_task_store, render_task_list + +store = get_task_store(session, branch=agent.name) +print(render_task_list(store)) +# ✅ #1 completed +# 🔄 #2 in progress +# ⬜ #3 pending (blocked by: 2) +``` + +### Dependency and Unblock Example + +```text +#1 Design schema + ├──→ #2 Implement API ──→ #3 Unit tests + └──→ #4 Write docs + +#1 completed → unblocked: ['2', '4'] +#2 completed → unblocked: ['3'] +``` + +### Task Tool Family Best Practices + +- **Separate planning from execution**: `task_create` + `addBlockedBy` first, then `in_progress` → `completed` item by item +- **Do not invent ids**: use only ids returned by `task_create` +- **Parallel calls**: with `parallel_tool_calls=True`, concurrent `task_create` / `task_update` on the same board are serialized by the lock; different `branch` values still run in parallel +- **Pick TodoWrite or Task**: long boards + dependencies → Task; short checklists → TodoWrite + +### Task Tool Family Complete Examples + +| Example | Description | +| --- | --- | +| [examples/task_tools](../../../examples/task_tools/) | Multi-turn dialog: dependency graph, step-by-step completion, `get_task_store` across turns | +| [examples/task_tools_parallel](../../../examples/task_tools_parallel/) | Validates `parallel_tool_calls` + `task_store_lock` (Phase 1–2 needs no API key) | + +--- + +## Goal Tool Family (Persistent Session Goal) + +`GoalToolSet` exposes three tools — `create_goal`, `get_goal`, `update_goal` — aligned with Claude Code **Session Goal** capabilities. Unlike `TodoWriteTool` (multi-item checklist) and `TaskToolSet` (multi-task board), Goal maintains **at most one** persistent objective per session branch: while status is `active`, a response that **looks like a final answer** does **not** mean the work is done — the model must keep working or explicitly call `update_goal('complete' | 'blocked')`. + +The goal is serialized as a **single JSON blob** (`GoalRecord`) in `tool_context.state["goal[:]"]`, surviving across `Runner.run_async` calls. Beyond the three model tools, the full capability requires `setup_goal()` to mount **enforcement callbacks** (`before_model` / `after_model`) that intercept premature final responses and re-run within the **same invocation**. + +### Features + +- **Single-goal contract**: one `GoalRecord` per branch (`objective` + three states `active` / `complete` / `blocked`); `complete` / `blocked` are **irreversible** terminal states +- **Cross-turn persistence**: persisted via function-response state deltas; **do not** use the `temp:` prefix +- **Sub-agent isolation**: state key appends `:` +- **Enforced completion**: while `active`, `after_model` detects premature finals (no tool call, visible text, non-partial), suppresses them, and re-runs in the same invocation; `before_model` injects a user-role nudge +- **Fail-open budget**: after `max_retries` (default 3) interceptions, the final response is allowed so the loop cannot spin forever; counters live in invocation-scoped `agent_context.metadata` and are not persisted +- **Two creation paths**: + - **Model side**: `create_goal(objective=...)` — LLM creates after judging a multi-step task + - **Host side**: `start_goal(session_service, ...)` — application writes the goal before the first turn; the model does not call `create_goal` +- **Layered prompt guidance**: `DEFAULT_GUIDANCE` injected into system instruction via `before_model` when `inject_guidance=True`; hard rules enforced by store validation + callbacks +- **Concurrency safety**: `_GoalToolBase` wraps load → mutate → save in `goal_store_lock` (per session + branch), compatible with `parallel_tool_calls=True` + +### Relationship to Todo / Task + +| Dimension | `TodoWriteTool` | `TaskToolSet` | Goal Tool Family | +| --- | --- | --- | --- | +| Granularity | Multi-item checklist | Multi-task board + deps | **Single** session objective | +| Update style | Full-list replace | Incremental by `taskId` | `create_goal` / `update_goal` | +| Can finish while incomplete? | Prompt guidance | Prompt guidance | **Callback enforcement** | +| State key | `todos[:branch]` | `tasks[:branch]` | `goal[:branch]` | +| Typical use | Step visibility, short lists | Long boards, dependencies | Whether the whole job is truly done | + +> Todo / Task handle **step decomposition**; Goal handles the **overall completion contract**. They can be combined, but avoid mounting too many planning tools at once. + +### GoalOptions Constructor Parameters + +Configure via `setup_goal(agent, GoalOptions(...))`: + +| Parameter | Type | Default | Description | +|------|------|--------|------| +| `state_key_prefix` | `str` | `"goal"` | State key prefix; do not use `temp:` | +| `inject_guidance` | `bool` | `True` | Inject `DEFAULT_GUIDANCE` into system instruction in `before_model` | +| `guidance` | `str` | `DEFAULT_GUIDANCE` | Long guidance text (serial goal-tool calls, etc.) | +| `max_retries` | `int` | `3` | Same-invocation budget for intercepting premature finals; fail-open when exhausted | +| `nudge_template` | `str` | `DEFAULT_NUDGE` | User-role reminder after interception; supports `{attempt}` / `{max_retries}` / `{objective}` | +| `on_retry` | `Callable[[RetryEvent], None]` | `None` | Observability callback on each interception or budget exhaustion | + +Mounting only `GoalToolSet()` without enforcement gives model-facing tools but **not** "no final while active". + +### LLM Parameters for the Three Tools + +**`create_goal`** + +| Parameter | Required | Description | +|------|------|------| +| `objective` | Yes | Completion criteria — what "done" concretely means | + +Success: `{message, goal}`; if an `active` goal already exists: `{error: "INVALID_STATE: ..."}`. + +**`get_goal`** + +No parameters. With a goal: `{message, goal}`; without: `{message: "No session goal is set."}`. + +**`update_goal`** + +| Parameter | Required | Description | +|------|------|------| +| `status` | Yes | `complete` (objective met) or `blocked` (same blocker repeats; cannot proceed without user input) | + +Success: `{message, goal}`; no active goal or already terminal: `{error: "INVALID_STATE: ..."}`. + +**`GoalRecord` fields** (persisted with camelCase JSON aliases): + +| Field | Description | +|------|------| +| `id` | Server-assigned uuid | +| `objective` | Completion criteria text | +| `status` | `active` / `complete` / `blocked` | +| `createdAtUnix` / `updatedAtUnix` | Created / last-updated time (unix seconds) | +| `terminalAtUnix` | Time entered a terminal state (optional) | + +### Enforcement Workflow + +```text +Model outputs final text (no tool call, goal still active) + ↓ +after_model classifies as premature final + ↓ +Suppress final (not committed as answer), retry_count += 1 +before_model injects nudge, same invocation continues agent loop + ↓ +retry_count >= max_retries → fail-open, on_retry(reason="exhausted") +``` + +Interception condition (`_is_premature_final`): non-partial, no error, visible text in content, and **no** `function_call` / `function_response`. + +### Usage + +Recommended one-line mount of tools + callbacks: + +```python +from trpc_agent_sdk.agents import LlmAgent +from trpc_agent_sdk.models import OpenAIModel +from trpc_agent_sdk.tools.goal_tools import GoalOptions, RetryEvent, setup_goal + +def on_retry(event: RetryEvent) -> None: + if event.reason == "blocked": + print(f"Premature final intercepted (attempt {event.attempt_number}/{event.max_retries})") + +agent = LlmAgent( + name="goal_agent", + model=OpenAIModel(model_name="...", api_key="...", base_url="..."), + instruction="Use goal tools to track completion for multi-step engineering tasks.", + tools=[...], # Business tools, e.g. BashTool / WriteTool +) +setup_goal(agent, GoalOptions(max_retries=3, on_retry=on_retry)) +``` + +Host pre-injects a goal before the first turn (model does not call `create_goal`): + +```python +from trpc_agent_sdk.tools.goal_tools import start_goal + +goal = await start_goal( + session_service, + app_name="my_app", + user_id="user_1", + session_id=session_id, + objective="Create notes/ with summary.txt and example.py in the current directory", + agent_name=agent.name, # Match LlmAgent.name for branch isolation +) +``` + +Read back the persisted goal (REST / audit / demo wrap-up): + +```python +from trpc_agent_sdk.tools.goal_tools import get_goal_record, render_goal + +goal = get_goal_record(session, branch=agent.name) +print(render_goal(goal)) +# ✅ Goal [complete] +# objective: ... +# created: 1782893110 +# terminal: 1782893116 +``` + +### Goal Tool Family Best Practices + +- **Use `setup_goal`, not `GoalToolSet` alone**: only callbacks enforce "no final while active" +- **Model vs host**: slash commands, `/goal`, config-driven tasks → `start_goal()`; let the model judge multi-step work → `create_goal` +- **One goal tool per response**: `DEFAULT_GUIDANCE` requires serial semantics; do not call `create_goal` and `update_goal` in the same turn +- **Use `blocked` sparingly**: only when the same blocker repeats across attempts and user input or external state change is required; do not mark blocked because work is hard, slow, or incomplete +- **Observability**: use `on_retry` to log premature-final interceptions and budget exhaustion when tuning prompt or `max_retries` +- **Division of labor with Todo / Task**: Todo / Task show steps and dependencies; Goal constrains whether the whole job is truly finished + +### Goal Tool Family Complete Example + +| Example | Description | +| --- | --- | +| [examples/goal_tools](../../../examples/goal_tools/) | Case 1: model `create_goal`; Case 2: host `start_goal` pre-injection; demonstrates enforcement interception and `update_goal(complete)` | diff --git a/docs/mkdocs/zh/tool.md b/docs/mkdocs/zh/tool.md index ca295afa..54cb3211 100644 --- a/docs/mkdocs/zh/tool.md +++ b/docs/mkdocs/zh/tool.md @@ -35,6 +35,7 @@ Agent 通过以下步骤动态使用工具: | [WebSearchTool](#websearchtool) | 公网搜索引擎检索 | 实例化 WebSearchTool 并加入 tools | 实时资讯、版本发布、事实/定义查询 | | [TodoWriteTool](#todowritetool-任务清单工具) | 多步任务规划与进度跟踪(整表替换) | 挂载 `TodoWriteTool` | 短清单、无依赖编排、token 不敏感 | | [Task 工具族](#task-工具族结构化任务看板) | 结构化任务看板(按 id 增量更新 + 依赖) | 挂载 `TaskToolSet` | 长任务板、跨轮跟踪、blockedBy 依赖 | +| [Goal 工具族](#goal-工具族持久会话目标) | 单会话持久目标 + 强制收尾 | `setup_goal(agent)` | 跨轮大目标、宿主设目标、未完成不许收尾 | | [Agent Code Executor](./code_executor.md) | 自动生成并执行代码场景、数据处理场景 | 配置 CodeExecutor | API 自动调用、表格数据处理 | --- @@ -3040,3 +3041,163 @@ print(render_task_list(store)) | --- | --- | | [examples/task_tools](../../../examples/task_tools/) | 多轮对话:依赖编排、逐项完成、跨轮 `get_task_store` 读回看板 | | [examples/task_tools_parallel](../../../examples/task_tools_parallel/) | 验证 `parallel_tool_calls` 与 `task_store_lock`(Phase 1–2 无需 API Key) | + +--- + +## Goal 工具族(持久会话目标) + +`GoalToolSet` 暴露三个工具——`create_goal`、`get_goal`、`update_goal`——对齐 Claude Code 的 **Session Goal** 能力。与 `TodoWriteTool`(多行待办)和 `TaskToolSet`(多任务看板)不同,Goal 在每个 session branch 上**至多只有一个**持久目标:在目标为 `active` 期间,模型给出「看起来像最终答案」的文本**不算完成**——必须继续执行,或显式调用 `update_goal('complete' | 'blocked')` 收尾。 + +目标序列化为**单个 JSON blob**(`GoalRecord`)写入 `tool_context.state["goal[:]"]`,跨 `Runner.run_async` 调用存活。除三个模型工具外,完整能力还需通过 `setup_goal()` 挂载一对 **enforcement callbacks**(`before_model` / `after_model`),在**同一次 invocation 内**拦截过早的最终回复并自动重试。 + +### 功能特性 + +- **单目标契约**:每个 branch 一个 `GoalRecord`(`objective` + 三态 `active` / `complete` / `blocked`);`complete` / `blocked` 为**不可逆**终态 +- **跨轮持久化**:随 function-response 的 state delta 落库;**勿用** `temp:` 前缀 +- **子 Agent 隔离**:state key 追加 `:`,父 / 子 Agent 各自维护独立目标 +- **强制收尾(enforcement)**:目标 `active` 时,`after_model` 检测「无 tool call、有可见文本、非 partial」的过早 final,抑制该回复并在同 invocation 内 re-run;`before_model` 注入 user-role nudge +- **fail-open 预算**:`max_retries`(默认 3)次拦截后放行最终回复,避免无限循环;计数器存在 invocation 级 `agent_context.metadata`,不持久化 +- **双入口创建**: + - **模型侧**:`create_goal(objective=...)` —— LLM 判断多步任务后自主创建 + - **宿主侧**:`start_goal(session_service, ...)` —— 应用层在首轮前写入 session,模型无需调用 `create_goal` +- **Prompt 引导分层**:`DEFAULT_GUIDANCE` 在目标 active 时经 `before_model` 注入 system instruction(`inject_guidance=True`);硬约束由 store 校验 + callback 共同保证 +- **并发安全**:`_GoalToolBase` 在 load → mutate → save 外包 `goal_store_lock`(按 session + branch),兼容 `parallel_tool_calls=True` + +### 与 Todo / Task 的关系 + +| 维度 | `TodoWriteTool` | `TaskToolSet` | Goal 工具族 | +| --- | --- | --- | --- | +| 粒度 | 多行待办清单 | 多任务看板 + 依赖 | **单个**会话目标 | +| 更新方式 | 整表替换 | 按 `taskId` 增量 | `create_goal` / `update_goal` | +| 未完成能否收尾 | Prompt 引导 | Prompt 引导 | **callback 强制拦截** | +| state key | `todos[:branch]` | `tasks[:branch]` | `goal[:branch]` | +| 典型用途 | 步骤可视、短清单 | 长看板、依赖编排 | 整件事是否算做完 | + +> Todo / Task 管「步骤分解」,Goal 管「整体完成契约」。可组合使用,但避免让模型同时混用过多规划工具。 + +### GoalOptions 构造参数 + +通过 `setup_goal(agent, GoalOptions(...))` 配置: + +| 参数 | 类型 | 默认值 | 说明 | +|------|------|--------|------| +| `state_key_prefix` | `str` | `"goal"` | state key 前缀;勿使用 `temp:` | +| `inject_guidance` | `bool` | `True` | 是否在 `before_model` 向 system instruction 注入 `DEFAULT_GUIDANCE` | +| `guidance` | `str` | `DEFAULT_GUIDANCE` | 注入的长文案(含串行调用 goal 工具等约定) | +| `max_retries` | `int` | `3` | 同 invocation 内拦截过早 final 的预算;耗尽后 fail-open | +| `nudge_template` | `str` | `DEFAULT_NUDGE` | 拦截后以 user-role 追加的提醒模板(支持 `{attempt}` / `{max_retries}` / `{objective}`) | +| `on_retry` | `Callable[[RetryEvent], None]` | `None` | 每次拦截或预算耗尽时的可观测回调 | + +仅挂载模型工具、不要 enforcement 时,可直接 `tools=[GoalToolSet()]`,但不具备「未完成不许收尾」能力。 + +### 三个工具的 LLM 参数概要 + +**`create_goal`** + +| 参数 | 必填 | 说明 | +|------|------|------| +| `objective` | 是 | 完成标准——「done」具体指什么 | + +成功返回 `{message, goal}`;若已有 `active` 目标返回 `{error: "INVALID_STATE: ..."}`。 + +**`get_goal`** + +无参数。有目标时返回 `{message, goal}`;无目标时返回 `{message: "No session goal is set."}`。 + +**`update_goal`** + +| 参数 | 必填 | 说明 | +|------|------|------| +| `status` | 是 | `complete`(目标已达成)或 `blocked`(同一阻塞条件反复出现、无用户输入无法继续) | + +成功返回 `{message, goal}`;无 active 目标或已是终态时返回 `{error: "INVALID_STATE: ..."}`。 + +**`GoalRecord` 字段**(JSON 使用 camelCase 别名持久化): + +| 字段 | 说明 | +|------|------| +| `id` | 服务端分配的 uuid | +| `objective` | 完成标准文本 | +| `status` | `active` / `complete` / `blocked` | +| `createdAtUnix` / `updatedAtUnix` | 创建 / 最后更新时间(unix 秒) | +| `terminalAtUnix` | 进入终态的时间(可选) | + +### enforcement 工作流程 + +```text +模型输出 final 文本(无 tool call,goal 仍 active) + ↓ +after_model 判定为 premature final + ↓ +抑制该 final(不提交为答案),retry_count += 1 +before_model 注入 nudge,同 invocation 继续 agent loop + ↓ +retry_count >= max_retries → fail-open,on_retry(reason="exhausted") +``` + +拦截条件(`_is_premature_final`):非 partial、无 error、content 含可见文本,且**不含** `function_call` / `function_response`。 + +### 使用方式 + +推荐一行挂载工具 + callbacks: + +```python +from trpc_agent_sdk.agents import LlmAgent +from trpc_agent_sdk.models import OpenAIModel +from trpc_agent_sdk.tools.goal_tools import GoalOptions, RetryEvent, setup_goal + +def on_retry(event: RetryEvent) -> None: + if event.reason == "blocked": + print(f"拦截过早收尾 (attempt {event.attempt_number}/{event.max_retries})") + +agent = LlmAgent( + name="goal_agent", + model=OpenAIModel(model_name="...", api_key="...", base_url="..."), + instruction="多步工程任务请用 goal 工具跟踪完成状态。", + tools=[...], # 业务工具,如 BashTool / WriteTool +) +setup_goal(agent, GoalOptions(max_retries=3, on_retry=on_retry)) +``` + +宿主在首轮前预注入目标(模型不调用 `create_goal`): + +```python +from trpc_agent_sdk.tools.goal_tools import start_goal + +goal = await start_goal( + session_service, + app_name="my_app", + user_id="user_1", + session_id=session_id, + objective="在当前目录创建 notes/ 并写入 summary.txt 与 example.py", + agent_name=agent.name, # 与 LlmAgent.name 一致,用于 branch 隔离 +) +``` + +读回持久化目标(REST / 审计 / demo 收尾): + +```python +from trpc_agent_sdk.tools.goal_tools import get_goal_record, render_goal + +goal = get_goal_record(session, branch=agent.name) +print(render_goal(goal)) +# ✅ Goal [complete] +# objective: ... +# created: 1782893110 +# terminal: 1782893116 +``` + +### Goal 工具族最佳实践 + +- **用 `setup_goal` 而非只挂 `GoalToolSet`**:只有 callbacks 才能实现「active 期间不许 final」 +- **模型侧 vs 宿主侧**:Slash command、`/goal`、配置驱动任务用 `start_goal()`;让模型自主判断多步任务时用 `create_goal` +- **一次响应只调一个 goal 工具**:`DEFAULT_GUIDANCE` 要求串行语义;不要同轮 `create_goal` + `update_goal` +- **`blocked` 慎用**:仅当同一阻塞条件跨多次尝试仍无法推进、且需要用户输入或外部状态变化时使用;不要因为任务难、慢或不完整就标记 blocked +- **可观测性**:通过 `on_retry` 记录 `⚡ Premature final intercepted` 与预算耗尽,便于调优 prompt 或 `max_retries` +- **与 Todo / Task 分工**:Todo / Task 展示步骤与依赖;Goal 约束「整件事是否已真正完成」 + +### Goal 工具族完整示例 + +| 示例 | 说明 | +| --- | --- | +| [examples/goal_tools](../../../examples/goal_tools/) | Case 1:模型 `create_goal`;Case 2:宿主 `start_goal` 预注入;演示 enforcement 拦截与 `update_goal(complete)` | diff --git a/examples/goal_tools/.env b/examples/goal_tools/.env new file mode 100644 index 00000000..dc791393 --- /dev/null +++ b/examples/goal_tools/.env @@ -0,0 +1,4 @@ +# Set TRPC_AGENT_API_KEY、TRPC_AGENT_BASE_URL、TRPC_AGENT_MODEL_NAME +TRPC_AGENT_API_KEY=your-api-key +TRPC_AGENT_BASE_URL=your-base-url +TRPC_AGENT_MODEL_NAME=your-model-name diff --git a/examples/goal_tools/README.md b/examples/goal_tools/README.md new file mode 100644 index 00000000..ffd65524 --- /dev/null +++ b/examples/goal_tools/README.md @@ -0,0 +1,100 @@ +# Goal 工具示例 + +演示 **Goal 工具族**(`create_goal` / `get_goal` / `update_goal`):为会话设置一个持久目标,目标未完成前 Agent 应继续执行,而不是过早给出最终回复。示例同时挂载 `Bash` / `Write` / `Read` 完成真实的多步文件任务。 + +## 快速开始 + +```bash +# 在项目根目录安装 +cd trpc-agent-python +python3 -m venv .venv && source .venv/bin/activate +pip3 install -e . + +# 配置模型(examples/goal_tools/.env) +TRPC_AGENT_API_KEY=your-api-key +TRPC_AGENT_BASE_URL=your-base-url +TRPC_AGENT_MODEL_NAME=your-model-name + +# 运行 +cd examples/goal_tools +python3 run_agent.py +``` + +## 两个演示场景 + +脚本会**依次**跑两个独立 session: + +| | Case 1 | Case 2 | +| --- | --- | --- | +| 谁设目标 | 模型调用 `create_goal` | 宿主调用 `start_goal()` 预注入 | +| 用户消息 | 普通多步任务(可提及 goal 能力) | 只描述要做的事,不提 goal 工具 | +| 典型流程 | `create_goal` → 写文件 → 验证 → `update_goal(complete)` | 写文件 → (可能被拦截)→ 补全 → `update_goal(complete)` | + +只跑其中一个时,在 `run_agent.py` 的 `main()` 里注释掉对应 case。 + +## 日志说明 + +| 输出 | 含义 | +| --- | --- | +| `🔧 [Tool call]` | 模型发起的工具调用 | +| `📊 [Tool result]` | 工具返回摘要 | +| `💬 ...` | 工具返回里的 `message`(给模型看的提示) | +| `⚡ [Goal retry]` | 目标仍 active 时模型想提前收尾,被拦截并继续执行 | +| `🤖 Assistant:` | 本轮最终回复(正常应在 `update_goal(complete)` 之后) | +| `🎯 Persisted goal` | 从 session 读回的目标状态 | + +终端较窄时,长参数的工具调用行可能折行,看起来像重复打印,以 `📊` 条数为准。 + +## 运行结果(实测摘录) + +### Case 1:模型自己设 Goal + +```text +🔧 [Tool call] create_goal({...}) +📊 [Tool result] created id=... status=active objective='...' +💬 Goal created and is now active. Keep working until it is genuinely met, then call update_goal('complete'). + +🔧 [Tool call] Write({'path': 'mypkg/__init__.py', ...}) +🔧 [Tool call] Write({'path': 'mypkg/utils.py', ...}) +🔧 [Tool call] Write({'path': 'mypkg/README.md', ...}) +... +🔧 [Tool call] Bash({'command': 'ls -la mypkg/ && ... python -c "import mypkg; ..."'}) +🔧 [Tool call] update_goal({'status': 'complete'}) + +🤖 Assistant: 工具包 `mypkg/` 已搭建完成并验证通过 ✅ + +🎯 Persisted goal (read from session): +✅ Goal [complete] +``` + +本例未触发 `⚡ [Goal retry]`,说明模型在标记完成前没有过早收尾。 + +### Case 2:宿主预注入 Goal + +```text +🎯 Goal pre-injected by host: + objective: '在当前目录创建 notes/ 目录,其中包含两个文件:...' + status: active + +🔧 [Tool call] Write({'path': 'notes/summary.txt', ...}) + ⚡ [Goal retry] Premature final intercepted (attempt 1/3). Objective: '...' + +🔧 [Tool call] Write({'path': 'notes/example.py', ...}) + ⚡ [Goal retry] Premature final intercepted (attempt 2/3). Objective: '...' + +🔧 [Tool call] get_goal({}) +🔧 [Tool call] Bash({'command': 'cd notes && python example.py', ...}) +🔧 [Tool call] update_goal({'status': 'complete'}) + +🤖 Assistant: 已完成。创建了 `notes/` 目录,其中包含两个文件:... + +🎯 Persisted goal (read from session): +✅ Goal [complete] +``` + +Case 2 里写完第一个文件后模型就想总结,**Goal enforcement 拦截了 2 次**,随后补写 `example.py`、运行验证,再 `update_goal(complete)`——这正是 Goal 能力的预期表现。 + +## 关键文件 + +- [`run_agent.py`](./run_agent.py) — 入口,驱动两个 case 并打印事件 +- [`agent/agent.py`](./agent/agent.py) — 组装 Agent,调用 `setup_goal()` 挂载 Goal 能力 diff --git a/examples/goal_tools/agent/__init__.py b/examples/goal_tools/agent/__init__.py new file mode 100644 index 00000000..bc6e483f --- /dev/null +++ b/examples/goal_tools/agent/__init__.py @@ -0,0 +1,5 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. diff --git a/examples/goal_tools/agent/agent.py b/examples/goal_tools/agent/agent.py new file mode 100644 index 00000000..f36184aa --- /dev/null +++ b/examples/goal_tools/agent/agent.py @@ -0,0 +1,76 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Agent module for the Goal tools example.""" + +import os + +from trpc_agent_sdk.agents import LlmAgent +from trpc_agent_sdk.models import LLMModel +from trpc_agent_sdk.models import OpenAIModel +from trpc_agent_sdk.tools.file_tools import BashTool +from trpc_agent_sdk.tools.file_tools import ReadTool +from trpc_agent_sdk.tools.file_tools import WriteTool +from trpc_agent_sdk.tools.goal_tools import GoalOptions +from trpc_agent_sdk.tools.goal_tools import RetryEvent +from trpc_agent_sdk.tools.goal_tools import setup_goal + +from .config import get_model_config +from .prompts import INSTRUCTION + + +def _create_model() -> LLMModel: + api_key, url, model_name = get_model_config() + return OpenAIModel(model_name=model_name, api_key=api_key, base_url=url) + + +def on_retry(event: RetryEvent) -> None: + """Observability callback: called every time the retry intercepts a premature final.""" + if event.reason == "blocked": + print( + f" ⚡ [Goal retry] Premature final intercepted " + f"(attempt {event.attempt_number}/{event.max_retries}). " + f"Objective: {event.goal.objective!r}" + ) + else: + print( + f" ⚠️ [Goal retry] Budget exhausted ({event.max_retries} retries). " + f"Letting final response through." + ) + + +def create_goal_agent(work_dir: str | None = None) -> LlmAgent: + """Build an agent with the Goal capability mounted via ``setup_goal``. + + The agent exposes ``create_goal`` / ``get_goal`` / ``update_goal`` tools + plus file-system tools for actually executing multi-step work. The + retry callbacks (guidance injection + premature-final interception) + are installed automatically by :func:`setup_goal`. + + Args: + work_dir: Working directory for Bash / Write / Read tools. + """ + cwd = work_dir or os.getcwd() + agent = LlmAgent( + name="goal_agent", + description="Engineering assistant that pursues a persistent session goal step by step.", + model=_create_model(), + instruction=INSTRUCTION, + tools=[ + BashTool(cwd=cwd), + WriteTool(cwd=cwd), + ReadTool(cwd=cwd), + ], + ) + opts = GoalOptions( + max_retries=3, + on_retry=on_retry, + ) + setup_goal(agent, opts) + return agent + + +goal_agent = create_goal_agent() +root_agent = goal_agent diff --git a/examples/goal_tools/agent/config.py b/examples/goal_tools/agent/config.py new file mode 100644 index 00000000..7bb17e9c --- /dev/null +++ b/examples/goal_tools/agent/config.py @@ -0,0 +1,21 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Agent config module.""" + +import os + + +def get_model_config() -> tuple[str, str, str]: + """Get LLM model config from environment variables.""" + api_key = os.getenv("TRPC_AGENT_API_KEY", "") + url = os.getenv("TRPC_AGENT_BASE_URL", "") + model_name = os.getenv("TRPC_AGENT_MODEL_NAME", "") + if not api_key or not url or not model_name: + raise ValueError( + "TRPC_AGENT_API_KEY, TRPC_AGENT_BASE_URL, and TRPC_AGENT_MODEL_NAME " + "must be set in environment variables" + ) + return api_key, url, model_name diff --git a/examples/goal_tools/agent/prompts.py b/examples/goal_tools/agent/prompts.py new file mode 100644 index 00000000..70e06f8c --- /dev/null +++ b/examples/goal_tools/agent/prompts.py @@ -0,0 +1,9 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Prompts for the Goal tools demo agent.""" + +INSTRUCTION = """You are a rigorous engineering assistant that can work toward session goals. """ + diff --git a/examples/goal_tools/run_agent.py b/examples/goal_tools/run_agent.py new file mode 100644 index 00000000..a40842b0 --- /dev/null +++ b/examples/goal_tools/run_agent.py @@ -0,0 +1,295 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Demo entry point for the Goal tools example. + +Runs two independent test cases back-to-back, each using a fresh session: + + Case 1 – "模型自己设置 Goal": + The user sends a plain request; the agent decides to call ``create_goal`` + itself, executes the work step by step, and calls ``update_goal('complete')`` + when done. The enforcement callbacks guard every turn: any premature + final response while the goal is still active is intercepted, a nudge is + injected, and the agent loop retries within the same invocation. + + Case 2 – "用户(宿主)设置 Goal": + The host application calls ``start_goal()`` before the first turn — the + goal is written directly into ``session.state``. The agent never calls + ``create_goal``; the enforcement callbacks pick up the pre-existing active + goal from the very first token. The agent only has to do the work and + call ``update_goal('complete')`` when finished. + +Set TRPC_AGENT_API_KEY / TRPC_AGENT_BASE_URL / TRPC_AGENT_MODEL_NAME (see .env) +before running. +""" + +from __future__ import annotations + +import asyncio +import os +import shutil +import time +import uuid + +from dotenv import load_dotenv +from trpc_agent_sdk.runners import Runner +from trpc_agent_sdk.sessions import InMemorySessionService +from trpc_agent_sdk.tools.goal_tools import get_goal_record +from trpc_agent_sdk.tools.goal_tools import render_goal +from trpc_agent_sdk.tools.goal_tools import start_goal +from trpc_agent_sdk.types import Content +from trpc_agent_sdk.types import Part + +load_dotenv() + +APP_NAME = "goal_agent_demo" +USER_ID = "demo_user" + + +# --------------------------------------------------------------------------- +# Shared helpers +# --------------------------------------------------------------------------- + +def _summarise_tool_response(name: str, resp: object) -> str: + if not isinstance(resp, dict): + return str(resp) + if "error" in resp: + return f"error={resp['error']!r}" + match name: + case "Bash": + stdout = (resp.get("stdout") or "").strip() + stdout = stdout[:120] + ("..." if len(stdout) > 120 else "") + return f"success={resp.get('success')} rc={resp.get('return_code')} stdout={stdout!r}" + case "Write": + return f"path={resp.get('path')!r} success={resp.get('success')}" + case "Read": + return f"path={resp.get('path')!r} lines={resp.get('total_lines')}" + case "create_goal": + g = resp.get("goal") or {} + return f"created id={g.get('id')} status={g.get('status')} objective={g.get('objective')!r}" + case "update_goal": + g = resp.get("goal") or {} + return f"updated status={g.get('status')}" + case "get_goal": + g = resp.get("goal") or {} + return f"status={g.get('status')} objective={g.get('objective')!r}" if g else "(no goal)" + case _: + return str(resp) + + +async def _run_turn( + runner: Runner, + *, + session_id: str, + label: str, + query: str, + agent_name: str, +) -> None: + """Drive a single user turn, pretty-print events, then show the persisted goal.""" + print(f"\n{'=' * 56}") + print(f" Turn: {label}") + print(f"{'=' * 56}") + print(f"📝 User: {query}\n") + + # Collect text separately from partial and non-partial events. + # Tool calls/responses arrive as non-partial events and are printed immediately. + # Model text may arrive as streaming partial chunks; we fall back to partial + # collection so pure-text responses are always visible. + streaming_text: list[str] = [] # accumulated from partial text chunks + final_text: list[str] = [] # text from the last non-partial event + + user_content = Content(parts=[Part.from_text(text=query)]) + async for event in runner.run_async( + user_id=USER_ID, + session_id=session_id, + new_message=user_content, + ): + if not event.content or not event.content.parts: + continue + for part in event.content.parts: + if part.thought: + continue + if part.function_call: + if not event.partial: + print(f"\n🔧 [Tool call] {part.function_call.name}({part.function_call.args})") + elif part.function_response: + if not event.partial: + name = part.function_response.name + resp = part.function_response.response + summary = _summarise_tool_response(name, resp) + print(f"\n📊 [Tool result] {summary}") + if isinstance(resp, dict) and resp.get("message"): + print(f"\n💬 {resp['message']}") + elif part.text: + if event.partial: + streaming_text.append(part.text) + else: + final_text.append(part.text) + + # Prefer consolidated non-partial text; fall back to streaming chunks. + text_to_show = "".join(final_text) or "".join(streaming_text) + if text_to_show: + print(f"\n🤖 Assistant: {text_to_show}") + + session = await runner.session_service.get_session( + app_name=APP_NAME, + user_id=USER_ID, + session_id=session_id, + ) + goal = get_goal_record(session, branch=agent_name) + print(f"\n{'-' * 40}") + print("🎯 Persisted goal (read from session):") + print(render_goal(goal)) + print("-" * 40) + + +def _clean(work_dir: str, *names: str) -> None: + for name in names: + path = os.path.join(work_dir, name) + if os.path.isdir(path): + shutil.rmtree(path) + print(f"🧹 Cleaned {path}") + elif os.path.isfile(path): + os.remove(path) + print(f"🧹 Cleaned {path}") + + +# --------------------------------------------------------------------------- +# Case 1: 模型自己设置 Goal +# --------------------------------------------------------------------------- + +CASE1_TURNS = [ + ( + "搭建 Python 工具包(模型设置目标)", + # 多步骤文件操作任务:需要创建目录 + 多个文件 + 验证, + # 足够复杂,观测模型是否自主调用 create_goal 再逐步执行。 + "基于goal能力,请在当前目录搭建一个最小 Python 工具包 mypkg/,需要完成:\n" + "1) 创建 mypkg/__init__.py,内容:__version__ = '0.1.0'\n" + "2) 创建 mypkg/utils.py,包含函数 greet(name: str) -> str\n" + "3) 创建 mypkg/README.md,标题「mypkg」,一句话描述该工具包的功能\n" + "请逐步执行并在完成后验证三个文件都已存在。", + ), +] + + +async def case1_model_sets_goal(work_dir: str) -> None: + """Case 1: the LLM calls create_goal autonomously.""" + from agent.agent import create_goal_agent + + print("\n" + "-" * 56) + print(" Case 1: 模型自己设置 Goal(agent 调用 create_goal)") + print("-" * 56) + + _clean(work_dir, "mypkg", "README.md") + + agent = create_goal_agent(work_dir=work_dir) + runner = Runner( + app_name=APP_NAME, + agent=agent, + session_service=InMemorySessionService(), + ) + + session_id = str(uuid.uuid4()) + await runner.session_service.create_session( + app_name=APP_NAME, + user_id=USER_ID, + session_id=session_id, + ) + print(f"🆔 Session: {session_id[:8]}...") + + for label, query in CASE1_TURNS: + await _run_turn(runner, session_id=session_id, label=label, query=query, agent_name=agent.name) + + await runner.close() + + +# --------------------------------------------------------------------------- +# Case 2: 用户(宿主)设置 Goal +# --------------------------------------------------------------------------- + +CASE2_OBJECTIVE = ( + # 目标与 prompt 差异化,观察模型是否能够正确理解目标并执行任务 + "在当前目录创建 notes/ 目录,其中包含两个文件:\n" + " - summary.txt:用三句话描述 Python 异步编程的核心概念\n" + " - example.py:一个可运行的 asyncio 示例(包含 main 协程和 asyncio.run 调用)" +) + +CASE2_TURNS = [ + ( + "执行任务(用户设置目标)", + # 宿主已通过 start_goal() 设置了目标,消息里不提及任何 goal 工具。 + "请在当前目录创建 notes/ 目录,在其中写文件:\n" + "summary.txt:用三句话描述 Python 异步编程的核心概念\n" + ) +] + + +async def case2_user_sets_goal(work_dir: str) -> None: + """Case 2: the host calls start_goal() before the first turn. + + The goal is written into session.state directly; the agent never calls + create_goal, but the enforcement callbacks are fully active. + """ + from agent.agent import create_goal_agent + + print("\n" + "-" * 56) + print(" Case 2: 用户(宿主)设置 Goal(调用 start_goal())") + print("-" * 56) + + _clean(work_dir, "notes") + + _clean(work_dir, "poem") + + agent = create_goal_agent(work_dir=work_dir) + session_service = InMemorySessionService() + runner = Runner( + app_name=APP_NAME, + agent=agent, + session_service=session_service, + ) + + session_id = str(uuid.uuid4()) + await session_service.create_session( + app_name=APP_NAME, + user_id=USER_ID, + session_id=session_id, + ) + + # ── 应用层直接设置目标,模型无需调用 create_goal ──────────────────────── # + goal = await start_goal( + session_service, + app_name=APP_NAME, + user_id=USER_ID, + session_id=session_id, + objective=CASE2_OBJECTIVE, + agent_name=agent.name, + ) + print(f"🆔 Session: {session_id[:8]}...") + print(f"🎯 Goal pre-injected by host:") + print(f" objective: {goal.objective!r}") + print(f" status: {goal.status.value}") + print( + "\n📌 Note: goal is active from the first token.\n" + " The agent does NOT call create_goal.\n" + ) + + for label, query in CASE2_TURNS: + await _run_turn(runner, session_id=session_id, label=label, query=query, agent_name=agent.name) + + await runner.close() + + +# --------------------------------------------------------------------------- +# Entry point +# --------------------------------------------------------------------------- + +async def main() -> None: + work_dir = os.getcwd() + await case1_model_sets_goal(work_dir) + await case2_user_sets_goal(work_dir) + + +if __name__ == "__main__": + asyncio.run(main()) diff --git a/tests/tools/goal_tools/test_goal_tools.py b/tests/tools/goal_tools/test_goal_tools.py new file mode 100644 index 00000000..a372c7c9 --- /dev/null +++ b/tests/tools/goal_tools/test_goal_tools.py @@ -0,0 +1,627 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Tests for the Goal capability (:mod:`trpc_agent_sdk.tools.goal_tools`).""" + +from __future__ import annotations + +import asyncio + +import pytest + +from trpc_agent_sdk.agents._base_agent import BaseAgent +from trpc_agent_sdk.agents._constants import TRPC_AGENT_RUNNING_KEY +from trpc_agent_sdk.context import InvocationContext, create_agent_context +from trpc_agent_sdk.events import Event +from trpc_agent_sdk.models import LlmRequest, LlmResponse +from trpc_agent_sdk.sessions import InMemorySessionService +from trpc_agent_sdk.tools.goal_tools import ( + DEFAULT_STATE_KEY_PREFIX, + GoalCreateTool, + GoalGetTool, + GoalOptions, + GoalRecord, + GoalStatus, + GoalToolSet, + GoalUpdateTool, + decode_goal, + encode_goal, + get_goal_record, + render_goal, + setup_goal, + start_goal, + state_key, +) +from trpc_agent_sdk.tools.goal_tools._setup import ( + _REMINDER_PENDING_KEY, + _RETRY_COUNT_KEY, + _GoalCallbacks, +) +from trpc_agent_sdk.tools.goal_tools._prompt import _GUIDANCE_MARKER +from trpc_agent_sdk.types import Content, EventActions, Part + +AGENT_NAME = "goal_agent" + + +class _StubAgent(BaseAgent): + async def _run_async_impl(self, ctx): + yield + + +@pytest.fixture +def bundle(): + service = InMemorySessionService() + session = asyncio.run(service.create_session(app_name="test", user_id="u1", session_id="s1")) + agent = _StubAgent(name=AGENT_NAME) + ctx = InvocationContext( + session_service=service, + invocation_id="inv-1", + agent=agent, + agent_context=create_agent_context(), + session=session, + branch="", + ) + return service, session, agent, ctx + + +async def _create(ctx, objective): + return await GoalCreateTool()._run_async_impl(tool_context=ctx, args={"objective": objective}) + + +async def _update(ctx, status): + return await GoalUpdateTool()._run_async_impl(tool_context=ctx, args={"status": status}) + + +async def _refresh_ctx(service, ctx, *, app_name="test", user_id="u1", session_id="s1"): + """Reload session state into an invocation context after host-side writes.""" + ctx.session = await service.get_session(app_name=app_name, user_id=user_id, session_id=session_id) + ctx.callback_state = None + + +async def _seed_goal(service, *, app_name, user_id, session_id, objective, branch=""): + """Application-layer goal write used by tests (appends a ``state_delta`` event).""" + import time + import uuid + + session = await service.get_session(app_name=app_name, user_id=user_id, session_id=session_id) + now = int(time.time()) + record = GoalRecord( + id=uuid.uuid4().hex, + objective=objective, + status=GoalStatus.ACTIVE, + createdAtUnix=now, + updatedAtUnix=now, + ) + key = state_key(DEFAULT_STATE_KEY_PREFIX, branch) + event = Event( + invocation_id="goal-" + uuid.uuid4().hex, + author="goal", + branch=branch or None, + actions=EventActions(state_delta={key: record.model_dump_json(by_alias=True)}), + ) + await service.append_event(session, event) + return record + + +def _final_text_response(text: str = "All done!") -> LlmResponse: + return LlmResponse(content=Content(role="model", parts=[Part.from_text(text=text)]), partial=False) + + +# --------------------------------------------------------------------------- # +# Tools # +# --------------------------------------------------------------------------- # +class TestGoalCreate: + @pytest.mark.asyncio + async def test_creates_active_goal(self, bundle): + _, _, agent, ctx = bundle + res = await _create(ctx, "Refactor the whole billing service") + assert res["goal"]["status"] == "active" + assert res["goal"]["objective"] == "Refactor the whole billing service" + assert state_key(DEFAULT_STATE_KEY_PREFIX, agent.name) in ctx.state._delta + + @pytest.mark.asyncio + async def test_rejects_empty_objective(self, bundle): + _, _, _, ctx = bundle + res = await GoalCreateTool()._run_async_impl(tool_context=ctx, args={"objective": " "}) + assert "error" in res + + @pytest.mark.asyncio + async def test_rejects_duplicate_active(self, bundle): + _, _, _, ctx = bundle + await _create(ctx, "first") + res = await _create(ctx, "second") + assert "error" in res + assert "active goal already exists" in res["error"] + + @pytest.mark.asyncio + async def test_can_recreate_after_terminal(self, bundle): + _, _, _, ctx = bundle + await _create(ctx, "first") + await _update(ctx, "complete") + res = await _create(ctx, "second") + assert "error" not in res + assert res["goal"]["objective"] == "second" + + +class TestGoalUpdate: + @pytest.mark.asyncio + async def test_complete_sets_terminal_time(self, bundle): + _, _, _, ctx = bundle + await _create(ctx, "obj") + res = await _update(ctx, "complete") + assert res["goal"]["status"] == "complete" + assert res["goal"]["terminalAtUnix"] is not None + + @pytest.mark.asyncio + async def test_blocked_is_terminal(self, bundle): + _, _, _, ctx = bundle + await _create(ctx, "obj") + res = await _update(ctx, "blocked") + assert res["goal"]["status"] == "blocked" + + @pytest.mark.asyncio + async def test_rejects_active_status(self, bundle): + _, _, _, ctx = bundle + await _create(ctx, "obj") + res = await GoalUpdateTool()._run_async_impl(tool_context=ctx, args={"status": "active"}) + assert "error" in res + + @pytest.mark.asyncio + async def test_rejects_when_no_goal(self, bundle): + _, _, _, ctx = bundle + res = await _update(ctx, "complete") + assert "error" in res + + @pytest.mark.asyncio + async def test_terminal_cannot_be_changed(self, bundle): + _, _, _, ctx = bundle + await _create(ctx, "obj") + await _update(ctx, "complete") + res = await _update(ctx, "blocked") + assert "error" in res + assert "terminal" in res["error"] + + +class TestStartGoal: + @pytest.mark.asyncio + async def test_writes_active_goal_to_existing_session(self): + service = InMemorySessionService() + await service.create_session(app_name="test", user_id="u1", session_id="s1") + + goal = await start_goal( + service, + app_name="test", + user_id="u1", + session_id="s1", + objective="Ship the feature", + agent_name=AGENT_NAME, + ) + + assert goal.status == GoalStatus.ACTIVE + assert goal.objective == "Ship the feature" + assert goal.id + + session = await service.get_session(app_name="test", user_id="u1", session_id="s1") + stored = get_goal_record(session, branch=AGENT_NAME) + assert stored is not None + assert stored.id == goal.id + assert stored.objective == "Ship the feature" + assert session.state[state_key(DEFAULT_STATE_KEY_PREFIX, AGENT_NAME)] == encode_goal(goal) + + @pytest.mark.asyncio + async def test_creates_session_when_missing(self): + service = InMemorySessionService() + + goal = await start_goal( + service, + app_name="test", + user_id="u1", + session_id="brand-new-session", + objective="Bootstrap task", + agent_name=AGENT_NAME, + ) + + assert goal.status == GoalStatus.ACTIVE + session = await service.get_session(app_name="test", user_id="u1", session_id="brand-new-session") + assert session is not None + stored = get_goal_record(session, branch=AGENT_NAME) + assert stored is not None + assert stored.objective == "Bootstrap task" + + @pytest.mark.asyncio + async def test_rejects_empty_objective(self): + service = InMemorySessionService() + with pytest.raises(ValueError, match="non-empty"): + await start_goal( + service, + app_name="test", + user_id="u1", + session_id="s1", + objective=" ", + ) + + @pytest.mark.asyncio + async def test_strips_objective_whitespace(self): + service = InMemorySessionService() + + goal = await start_goal( + service, + app_name="test", + user_id="u1", + session_id="trim-session", + objective=" trimmed objective ", + ) + + assert goal.objective == "trimmed objective" + + @pytest.mark.asyncio + async def test_replaces_existing_goal(self): + service = InMemorySessionService() + await service.create_session(app_name="test", user_id="u1", session_id="s1") + + first = await start_goal( + service, + app_name="test", + user_id="u1", + session_id="s1", + objective="first", + agent_name=AGENT_NAME, + ) + second = await start_goal( + service, + app_name="test", + user_id="u1", + session_id="s1", + objective="second", + agent_name=AGENT_NAME, + ) + + assert second.id != first.id + assert second.objective == "second" + session = await service.get_session(app_name="test", user_id="u1", session_id="s1") + stored = get_goal_record(session, branch=AGENT_NAME) + assert stored is not None + assert stored.objective == "second" + + @pytest.mark.asyncio + async def test_honours_custom_state_key_prefix(self): + service = InMemorySessionService() + prefix = "custom_goal" + + await start_goal( + service, + app_name="test", + user_id="u1", + session_id="custom-prefix", + objective="scoped objective", + state_key_prefix=prefix, + agent_name="worker", + ) + + session = await service.get_session(app_name="test", user_id="u1", session_id="custom-prefix") + assert get_goal_record(session, branch="worker", prefix=prefix) is not None + assert get_goal_record(session, branch="worker", prefix=DEFAULT_STATE_KEY_PREFIX) is None + + @pytest.mark.asyncio + async def test_create_goal_rejects_after_start_goal(self, bundle): + service, _, agent, ctx = bundle + await start_goal( + service, + app_name="test", + user_id="u1", + session_id="s1", + objective="host goal", + agent_name=agent.name, + ) + await _refresh_ctx(service, ctx) + + res = await _create(ctx, "model goal") + assert "error" in res + assert "active goal already exists" in res["error"] + + @pytest.mark.asyncio + async def test_get_goal_reads_host_injected_goal(self, bundle): + service, _, agent, ctx = bundle + await start_goal( + service, + app_name="test", + user_id="u1", + session_id="s1", + objective="host objective", + agent_name=agent.name, + ) + await _refresh_ctx(service, ctx) + + res = await GoalGetTool()._run_async_impl(tool_context=ctx, args={}) + assert res["goal"]["objective"] == "host objective" + assert res["goal"]["status"] == "active" + + @pytest.mark.asyncio + async def test_enforcement_intercepts_premature_final(self, bundle): + service, _, agent, ctx = bundle + await start_goal( + service, + app_name="test", + user_id="u1", + session_id="s1", + objective="finish the job", + agent_name=agent.name, + ) + await _refresh_ctx(service, ctx) + + cb = _GoalCallbacks(GoalOptions()) + replaced = await cb.after_model(ctx, _final_text_response()) + + assert replaced is not None + assert replaced.partial is True + assert ctx.agent_context.metadata[TRPC_AGENT_RUNNING_KEY] is True + + +class TestGoalGet: + @pytest.mark.asyncio + async def test_no_goal(self, bundle): + _, _, _, ctx = bundle + res = await GoalGetTool()._run_async_impl(tool_context=ctx, args={}) + assert "goal" not in res + assert "No session goal" in res["message"] + + @pytest.mark.asyncio + async def test_returns_current(self, bundle): + _, _, _, ctx = bundle + await _create(ctx, "obj") + res = await GoalGetTool()._run_async_impl(tool_context=ctx, args={}) + assert res["goal"]["objective"] == "obj" + + +# --------------------------------------------------------------------------- # +# Enforcement callbacks # +# --------------------------------------------------------------------------- # +class TestEnforcement: + @pytest.mark.asyncio + async def test_before_model_injects_guidance_once(self, bundle): + _, _, _, ctx = bundle + cb = _GoalCallbacks(GoalOptions()) + request = LlmRequest() + await cb.before_model(ctx, request) + await cb.before_model(ctx, request) + text = str(request.config.system_instruction) + assert text.count(_GUIDANCE_MARKER) == 1 + + @pytest.mark.asyncio + async def test_premature_final_triggers_rerun(self, bundle): + _, _, _, ctx = bundle + events = [] + cb = _GoalCallbacks(GoalOptions(on_retry=events.append)) + await _create(ctx, "obj") + + replaced = await cb.after_model(ctx, _final_text_response()) + + assert replaced is not None + assert replaced.partial is True + meta = ctx.agent_context.metadata + assert meta[TRPC_AGENT_RUNNING_KEY] is True + assert meta[_RETRY_COUNT_KEY] == 1 + assert meta[_REMINDER_PENDING_KEY] is True + assert len(events) == 1 and events[0].reason == "blocked" + + @pytest.mark.asyncio + async def test_nudge_appended_on_next_turn(self, bundle): + _, _, _, ctx = bundle + cb = _GoalCallbacks(GoalOptions()) + await _create(ctx, "ship the feature") + await cb.after_model(ctx, _final_text_response()) + + request = LlmRequest() + await cb.before_model(ctx, request) + assert len(request.contents) == 1 + nudge_text = request.contents[0].parts[0].text + assert "ship the feature" in nudge_text + assert "attempt 1" in nudge_text + # reminder cleared after being consumed + assert ctx.agent_context.metadata[_REMINDER_PENDING_KEY] is False + + @pytest.mark.asyncio + async def test_fail_open_after_max_retries(self, bundle): + _, _, _, ctx = bundle + events = [] + cb = _GoalCallbacks(GoalOptions(max_retries=2, on_retry=events.append)) + await _create(ctx, "obj") + + assert await cb.after_model(ctx, _final_text_response()) is not None # attempt 1 + assert await cb.after_model(ctx, _final_text_response()) is not None # attempt 2 + # budget exhausted -> let the final response through (fail-open) + passthrough = await cb.after_model(ctx, _final_text_response()) + assert passthrough is None + assert ctx.agent_context.metadata[_RETRY_COUNT_KEY] == 0 + assert events[-1].reason == "exhausted" + + @pytest.mark.asyncio + async def test_partial_chunk_passes_through(self, bundle): + _, _, _, ctx = bundle + cb = _GoalCallbacks(GoalOptions()) + await _create(ctx, "obj") + partial = LlmResponse(content=Content(role="model", parts=[Part.from_text(text="thinking")]), partial=True) + assert await cb.after_model(ctx, partial) is None + assert TRPC_AGENT_RUNNING_KEY not in ctx.agent_context.metadata + + @pytest.mark.asyncio + async def test_tool_call_response_passes_through(self, bundle): + _, _, _, ctx = bundle + cb = _GoalCallbacks(GoalOptions()) + await _create(ctx, "obj") + tool_call = LlmResponse( + content=Content(role="model", parts=[Part.from_function_call(name="update_goal", args={"status": "complete"})]), + partial=False, + ) + assert await cb.after_model(ctx, tool_call) is None + + @pytest.mark.asyncio + async def test_no_interception_when_goal_terminal(self, bundle): + _, _, _, ctx = bundle + cb = _GoalCallbacks(GoalOptions()) + await _create(ctx, "obj") + await _update(ctx, "complete") + # terminal goal: final response is a legitimate wrap-up + assert await cb.after_model(ctx, _final_text_response()) is None + + @pytest.mark.asyncio + async def test_no_interception_when_no_goal(self, bundle): + _, _, _, ctx = bundle + cb = _GoalCallbacks(GoalOptions()) + assert await cb.after_model(ctx, _final_text_response()) is None + + +# --------------------------------------------------------------------------- # +# setup_goal / helpers # +# --------------------------------------------------------------------------- # +class TestSetupGoal: + @pytest.mark.asyncio + async def test_appends_toolset_and_chains_callbacks(self): + from types import SimpleNamespace + + def prior_before(ctx, req): + return None + + # setup_goal only touches ``tools`` and the two model-callback fields. + agent = SimpleNamespace(tools=[], before_model_callback=prior_before, after_model_callback=None) + setup_goal(agent) + + # toolset appended + assert any(isinstance(t, GoalToolSet) for t in agent.tools) + # callbacks chained (prior preserved + new appended) + assert isinstance(agent.before_model_callback, list) + assert prior_before in agent.before_model_callback + assert len(agent.before_model_callback) == 2 + assert isinstance(agent.after_model_callback, list) + tools = await GoalToolSet().get_tools() + assert {t.name for t in tools} == {"get_goal", "create_goal", "update_goal"} + + +class TestPersistence: + @pytest.mark.asyncio + async def test_goal_survives_append_event_and_get_session(self, bundle): + service, session, agent, ctx = bundle + await _create(ctx, "obj") + event = Event( + invocation_id="inv-1", + author=agent.name, + content=Content(parts=[Part.from_text(text="tool result")]), + actions=EventActions(state_delta=dict(ctx.event_actions.state_delta)), + ) + await service.append_event(session, event) + stored = await service.get_session(app_name="test", user_id="u1", session_id="s1") + goal = get_goal_record(stored, branch=agent.name) + assert goal is not None and goal.status == GoalStatus.ACTIVE + + +class TestHelpers: + def test_state_key(self): + assert state_key(DEFAULT_STATE_KEY_PREFIX, "") == "goal" + assert state_key(DEFAULT_STATE_KEY_PREFIX, "agent") == "goal:agent" + + def test_decode_dirty_data_degrades_to_none(self): + assert decode_goal("not json") is None + assert decode_goal(None) is None + + def test_render_goal(self): + assert render_goal(None) == "(no goal)" + rec = GoalRecord(id="x", objective="do it", status=GoalStatus.ACTIVE, createdAtUnix=1, updatedAtUnix=1) + rendered = render_goal(rec) + assert "active" in rendered + assert "do it" in rendered + + +class TestGoalToolSet: + @pytest.mark.asyncio + async def test_returns_three_tools(self): + tools = await GoalToolSet().get_tools() + assert {t.name for t in tools} == {"get_goal", "create_goal", "update_goal"} + + +# --------------------------------------------------------------------------- # +# End-to-end: validates the same-invocation re-run lever (B2) via the Runner. # +# --------------------------------------------------------------------------- # +from typing import List # noqa: E402 + +from trpc_agent_sdk.models import LLMModel, ModelRegistry # noqa: E402 +from trpc_agent_sdk.runners import Runner # noqa: E402 + + +class _ScriptedModel(LLMModel): + """Premature-final on turn 1, ``update_goal(complete)`` on turn 2, final on turn 3.""" + + calls: int = 0 + saw_nudge: List[bool] = [] + + @classmethod + def supported_models(cls) -> List[str]: + return [r"scripted-.*"] + + def validate_request(self, request): + pass + + async def _generate_async_impl(self, request, stream=False, ctx=None): + type(self).calls += 1 + n = type(self).calls + nudged = any( + (c.role == "user") and c.parts and any("goal reminder" in (p.text or "") for p in c.parts) + for c in request.contents + ) + type(self).saw_nudge.append(nudged) + if n == 1: + yield LlmResponse( + content=Content(role="model", parts=[Part.from_text(text="I refactored one function. Done!")]), + partial=False, + ) + elif n == 2: + part = Part.from_function_call(name="update_goal", args={"status": "complete"}) + part.function_call.id = "call-1" + yield LlmResponse(content=Content(role="model", parts=[part]), partial=False) + else: + yield LlmResponse( + content=Content(role="model", parts=[Part.from_text(text="The whole task is finished.")]), + partial=False, + ) + + +class TestEndToEndRerun: + @pytest.mark.asyncio + async def test_premature_final_reruns_until_model_self_reports(self): + from trpc_agent_sdk.agents import LlmAgent + + original = ModelRegistry._registry.copy() + ModelRegistry.register(_ScriptedModel) + _ScriptedModel.calls = 0 + _ScriptedModel.saw_nudge = [] + try: + agent = LlmAgent(name="goal_e2e", model="scripted-1") + setup_goal(agent, GoalOptions(max_retries=3)) + + service = InMemorySessionService() + runner = Runner(app_name="goal_app", agent=agent, session_service=service) + await service.create_session(app_name="goal_app", user_id="u", session_id="sid") + await _seed_goal( + service, app_name="goal_app", user_id="u", session_id="sid", + objective="Refactor the entire service", branch=agent.name, + ) + + async for _ in runner.run_async( + user_id="u", + session_id="sid", + new_message=Content(role="user", parts=[Part.from_text(text="go")]), + ): + pass + await runner.close() + + # The loop re-ran within ONE invocation: model called 3 times. + assert _ScriptedModel.calls == 3 + # Turn 2 saw the nudge injected by before_model. + assert _ScriptedModel.saw_nudge[1] is True + # Goal ended up complete (model self-reported via update_goal). + stored = await service.get_session(app_name="goal_app", user_id="u", session_id="sid") + goal = get_goal_record(stored, branch=agent.name) + assert goal is not None and goal.status == GoalStatus.COMPLETE + finally: + ModelRegistry._registry = original diff --git a/trpc_agent_sdk/tools/__init__.py b/trpc_agent_sdk/tools/__init__.py index 3efb355b..52512406 100644 --- a/trpc_agent_sdk/tools/__init__.py +++ b/trpc_agent_sdk/tools/__init__.py @@ -60,6 +60,15 @@ from .task_tools import TaskUpdateTool from .task_tools import get_task_store from .task_tools import render_task_list +from .goal_tools import GoalOptions +from .goal_tools import GoalRecord +from .goal_tools import GoalStatus +from .goal_tools import GoalToolSet +from .goal_tools import OnRetry +from .goal_tools import RetryEvent +from .goal_tools import get_goal_record +from .goal_tools import setup_goal +from .goal_tools import render_goal from ._transfer_to_agent_tool import transfer_to_agent from ._webfetch_tool import FetchResult from ._webfetch_tool import WebFetchTool @@ -143,6 +152,15 @@ "get_task_store", "render_task_list", "DEFAULT_TASK_PROMPT", + "GoalStatus", + "GoalRecord", + "GoalToolSet", + "GoalOptions", + "RetryEvent", + "OnRetry", + "setup_goal", + "get_goal_record", + "render_goal", "FetchResult", "WebFetchTool", "SearchHit", diff --git a/trpc_agent_sdk/tools/goal_tools/__init__.py b/trpc_agent_sdk/tools/goal_tools/__init__.py new file mode 100644 index 00000000..376560bc --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/__init__.py @@ -0,0 +1,58 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Goal tool family — persistent, cross-turn session objective. + +A goal lets a user set a single objective that survives across LLM calls: +while it is ``active`` a "looks final" response does not end the task — the +model must keep working or explicitly mark the goal ``complete`` / ``blocked``. + +The capability is a lightweight bundle on a single :class:`LlmAgent`: +``GoalToolSet`` (three model tools) + a pair of model callbacks that inject +guidance / nudges and intercept premature final responses to re-run within the +same invocation. Mount everything in one call with :func:`setup_goal`. +""" + +from ._setup import GoalOptions +from ._setup import OnRetry +from ._setup import RetryEvent +from ._setup import setup_goal +from ._goal_create_tool import GoalCreateTool +from ._goal_get_tool import GoalGetTool +from ._goal_toolset import GoalToolSet +from ._goal_update_tool import GoalUpdateTool +from ._helpers import DEFAULT_STATE_KEY_PREFIX +from ._helpers import decode_goal +from ._helpers import encode_goal +from ._helpers import get_goal_record +from ._helpers import render_goal +from ._helpers import start_goal +from ._helpers import state_key +from ._models import GoalRecord +from ._models import GoalStatus +from ._prompt import DEFAULT_GUIDANCE +from ._prompt import DEFAULT_NUDGE + +__all__ = [ + "GoalStatus", + "GoalRecord", + "GoalToolSet", + "GoalGetTool", + "GoalCreateTool", + "GoalUpdateTool", + "GoalOptions", + "RetryEvent", + "OnRetry", + "setup_goal", + "get_goal_record", + "decode_goal", + "encode_goal", + "start_goal", + "state_key", + "render_goal", + "DEFAULT_STATE_KEY_PREFIX", + "DEFAULT_GUIDANCE", + "DEFAULT_NUDGE", +] diff --git a/trpc_agent_sdk/tools/goal_tools/_base.py b/trpc_agent_sdk/tools/goal_tools/_base.py new file mode 100644 index 00000000..3dc05292 --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_base.py @@ -0,0 +1,73 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Shared base for the Goal tools: branch resolution and goal load / save.""" + +from __future__ import annotations + +from abc import abstractmethod +from typing import Any +from typing import Optional +from typing_extensions import override + +from trpc_agent_sdk.context import InvocationContext +from trpc_agent_sdk.filter import BaseFilter + +from .._base_tool import BaseTool +from ._helpers import DEFAULT_STATE_KEY_PREFIX +from ._helpers import decode_goal +from ._helpers import encode_goal +from ._helpers import state_key +from ._lock import goal_store_lock +from ._models import GoalRecord + + +class _GoalToolBase(BaseTool): + """Common plumbing shared by the three Goal tools. + + Handles branch-scoped state-key resolution and goal load / save. The + behavioural guidance is injected by the enforcement callbacks (see + :mod:`._setup`), not here, so the tools stay lightweight and can be + mounted independently. + """ + + def __init__( + self, + *, + name: str, + description: str, + state_key_prefix: str = DEFAULT_STATE_KEY_PREFIX, + filters_name: Optional[list[str]] = None, + filters: Optional[list[BaseFilter]] = None, + ) -> None: + super().__init__( + name=name, + description=description, + filters_name=filters_name, + filters=filters, + ) + self._prefix = state_key_prefix or DEFAULT_STATE_KEY_PREFIX + + def _resolve_branch(self, tool_context: InvocationContext) -> str: + return tool_context.branch or tool_context.agent_name or "" + + def _state_key(self, tool_context: InvocationContext) -> str: + return state_key(self._prefix, self._resolve_branch(tool_context)) + + def _load_goal(self, tool_context: InvocationContext) -> Optional[GoalRecord]: + return decode_goal(tool_context.state.get(self._state_key(tool_context))) + + def _save_goal(self, tool_context: InvocationContext, goal: GoalRecord) -> None: + tool_context.state[self._state_key(tool_context)] = encode_goal(goal) + + @override + async def _run_async_impl(self, *, tool_context: InvocationContext, args: dict[str, Any]) -> Any: + branch = self._resolve_branch(tool_context) + async with goal_store_lock(tool_context, prefix=self._prefix, branch=branch): + return await self._run_goal(tool_context=tool_context, args=args) + + @abstractmethod + async def _run_goal(self, *, tool_context: InvocationContext, args: dict[str, Any]) -> Any: + """Tool logic; called under :func:`goal_store_lock` for the branch goal.""" diff --git a/trpc_agent_sdk/tools/goal_tools/_goal_create_tool.py b/trpc_agent_sdk/tools/goal_tools/_goal_create_tool.py new file mode 100644 index 00000000..32b4bacf --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_goal_create_tool.py @@ -0,0 +1,66 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""``create_goal`` — create a persistent, cross-turn session goal.""" + +from __future__ import annotations + +import time +from typing import Any +from typing_extensions import override + +from trpc_agent_sdk.context import InvocationContext +from trpc_agent_sdk.types import FunctionDeclaration +from trpc_agent_sdk.types import Schema +from trpc_agent_sdk.types import Type + +from ._base import _GoalToolBase +from ._prompt import DEFAULT_GOAL_CREATE_DESCRIPTION +from ._store import apply_create + +_TOOL_NAME = "create_goal" + + +class GoalCreateTool(_GoalToolBase): + """Create a new ``active`` session goal (rejects a duplicate active goal).""" + + def __init__(self, *, name: str = _TOOL_NAME, **kwargs: Any) -> None: + super().__init__(name=name, description=DEFAULT_GOAL_CREATE_DESCRIPTION, **kwargs) + + @override + def _get_declaration(self) -> FunctionDeclaration: + return FunctionDeclaration( + name=self.name, + description=DEFAULT_GOAL_CREATE_DESCRIPTION, + parameters=Schema( + type=Type.OBJECT, + properties={ + "objective": + Schema( + type=Type.STRING, + description="The completion criterion: what 'done' concretely means.", + ), + }, + required=["objective"], + ), + ) + + @override + async def _run_goal(self, *, tool_context: InvocationContext, args: dict[str, Any]) -> Any: + objective = args.get("objective") + if not isinstance(objective, str) or not objective.strip(): + return {"error": "INVALID_ARGS: `objective` is required and must be a non-empty string"} + + existing = self._load_goal(tool_context) + record, error = apply_create(existing, objective=objective.strip(), now_unix=int(time.time())) + if error is not None: + return {"error": f"INVALID_STATE: {error}"} + + self._save_goal(tool_context, record) + return { + "message": "Goal created and is now active. Keep working until it is genuinely met, " + "then call update_goal('complete').", + "goal": record.model_dump(mode="json", by_alias=True), + } diff --git a/trpc_agent_sdk/tools/goal_tools/_goal_get_tool.py b/trpc_agent_sdk/tools/goal_tools/_goal_get_tool.py new file mode 100644 index 00000000..936216f2 --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_goal_get_tool.py @@ -0,0 +1,46 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""``get_goal`` — read the current persistent session goal.""" + +from __future__ import annotations + +from typing import Any +from typing_extensions import override + +from trpc_agent_sdk.context import InvocationContext +from trpc_agent_sdk.types import FunctionDeclaration +from trpc_agent_sdk.types import Schema +from trpc_agent_sdk.types import Type + +from ._base import _GoalToolBase +from ._prompt import DEFAULT_GOAL_GET_DESCRIPTION + +_TOOL_NAME = "get_goal" + + +class GoalGetTool(_GoalToolBase): + """Return the current session goal, or report that none exists.""" + + def __init__(self, *, name: str = _TOOL_NAME, **kwargs: Any) -> None: + super().__init__(name=name, description=DEFAULT_GOAL_GET_DESCRIPTION, **kwargs) + + @override + def _get_declaration(self) -> FunctionDeclaration: + return FunctionDeclaration( + name=self.name, + description=DEFAULT_GOAL_GET_DESCRIPTION, + parameters=Schema(type=Type.OBJECT, properties={}), + ) + + @override + async def _run_goal(self, *, tool_context: InvocationContext, args: dict[str, Any]) -> Any: + goal = self._load_goal(tool_context) + if goal is None: + return {"message": "No session goal is set."} + return { + "message": f"Current goal is {goal.status.value}.", + "goal": goal.model_dump(mode="json", by_alias=True), + } diff --git a/trpc_agent_sdk/tools/goal_tools/_goal_toolset.py b/trpc_agent_sdk/tools/goal_tools/_goal_toolset.py new file mode 100644 index 00000000..ece19684 --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_goal_toolset.py @@ -0,0 +1,53 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""``GoalToolSet`` — bundles ``get_goal`` / ``create_goal`` / ``update_goal``.""" + +from __future__ import annotations + +from typing import List +from typing import Optional +from typing_extensions import override + +from trpc_agent_sdk.abc import ToolSetABC +from trpc_agent_sdk.context import InvocationContext + +from .._base_tool import BaseTool +from ._goal_create_tool import GoalCreateTool +from ._goal_get_tool import GoalGetTool +from ._goal_update_tool import GoalUpdateTool +from ._helpers import DEFAULT_STATE_KEY_PREFIX + + +class GoalToolSet(ToolSetABC): + """Toolset exposing ``get_goal`` / ``create_goal`` / ``update_goal``. + + The goal is a single, session-scoped contract persisted to branch-scoped + session state and surviving across Runner invocations. Mount it together + with the enforcement callbacks via + :func:`trpc_agent_sdk.tools.goal_tools.setup_goal`, or on its own when you + only want the model-facing tools. + + Args: + state_key_prefix: State-key prefix; ``goal`` by default. Avoid + ``temp:`` — that prefix is invocation-only and is not stored. + """ + + def __init__( + self, + *, + state_key_prefix: str = DEFAULT_STATE_KEY_PREFIX, + name: str = "goal_toolset", + ) -> None: + super().__init__(name=name) + self._prefix = state_key_prefix or DEFAULT_STATE_KEY_PREFIX + + @override + async def get_tools(self, invocation_context: Optional[InvocationContext] = None) -> List[BaseTool]: + return [ + GoalGetTool(name="get_goal", state_key_prefix=self._prefix), + GoalCreateTool(name="create_goal", state_key_prefix=self._prefix), + GoalUpdateTool(name="update_goal", state_key_prefix=self._prefix), + ] diff --git a/trpc_agent_sdk/tools/goal_tools/_goal_update_tool.py b/trpc_agent_sdk/tools/goal_tools/_goal_update_tool.py new file mode 100644 index 00000000..c3a78eda --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_goal_update_tool.py @@ -0,0 +1,74 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""``update_goal`` — move the active goal into a terminal state.""" + +from __future__ import annotations + +import time +from typing import Any +from typing_extensions import override + +from trpc_agent_sdk.context import InvocationContext +from trpc_agent_sdk.types import FunctionDeclaration +from trpc_agent_sdk.types import Schema +from trpc_agent_sdk.types import Type + +from ._base import _GoalToolBase +from ._models import GoalStatus +from ._prompt import DEFAULT_GOAL_UPDATE_DESCRIPTION +from ._store import apply_transition + +_TOOL_NAME = "update_goal" + +# Terminal states the model is allowed to set; ``active`` is intentionally rejected. +_ALLOWED = {GoalStatus.COMPLETE.value, GoalStatus.BLOCKED.value} + + +class GoalUpdateTool(_GoalToolBase): + """Transition the active goal to ``complete`` or ``blocked``.""" + + def __init__(self, *, name: str = _TOOL_NAME, **kwargs: Any) -> None: + super().__init__(name=name, description=DEFAULT_GOAL_UPDATE_DESCRIPTION, **kwargs) + + @override + def _get_declaration(self) -> FunctionDeclaration: + return FunctionDeclaration( + name=self.name, + description=DEFAULT_GOAL_UPDATE_DESCRIPTION, + parameters=Schema( + type=Type.OBJECT, + properties={ + "status": + Schema( + type=Type.STRING, + enum=["complete", "blocked"], + description="'complete' if the objective is genuinely met, else 'blocked'.", + ), + }, + required=["status"], + ), + ) + + @override + async def _run_goal(self, *, tool_context: InvocationContext, args: dict[str, Any]) -> Any: + status_raw = args.get("status") + if not isinstance(status_raw, str) or status_raw not in _ALLOWED: + return {"error": "INVALID_ARGS: `status` must be 'complete' or 'blocked'"} + + existing = self._load_goal(tool_context) + record, error = apply_transition( + existing, + status=GoalStatus(status_raw), + now_unix=int(time.time()), + ) + if error is not None: + return {"error": f"INVALID_STATE: {error}"} + + self._save_goal(tool_context, record) + return { + "message": f"Goal marked {record.status.value}.", + "goal": record.model_dump(mode="json", by_alias=True), + } diff --git a/trpc_agent_sdk/tools/goal_tools/_helpers.py b/trpc_agent_sdk/tools/goal_tools/_helpers.py new file mode 100644 index 00000000..d87b8528 --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_helpers.py @@ -0,0 +1,168 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""State-key handling, (de)serialisation and rendering for the Goal capability.""" + +from __future__ import annotations + +import time +import uuid +from typing import TYPE_CHECKING +from typing import Any +from typing import Optional + +from trpc_agent_sdk.log import logger + +from ._models import GoalRecord +from ._models import GoalStatus + +if TYPE_CHECKING: + from trpc_agent_sdk.abc import SessionServiceABC + +# Session-scoped (no ``temp:`` prefix, which BaseSessionService strips and +# never persists). Survives across Runner invocations and can be resumed. +DEFAULT_STATE_KEY_PREFIX = "goal" + + +def state_key(prefix: str, branch: str) -> str: + """Build the state key, appending ``:`` for sub-agent isolation.""" + prefix = prefix or DEFAULT_STATE_KEY_PREFIX + return prefix if not branch else f"{prefix}:{branch}" + + +def decode_goal(raw: Any) -> Optional[GoalRecord]: + """Decode a persisted value (JSON string / dict) into a :class:`GoalRecord`. + + Tolerates dirty / legacy data: anything that fails to parse degrades to + ``None`` (i.e. "no goal") rather than raising. + """ + if not raw: + return None + try: + if isinstance(raw, str): + return GoalRecord.model_validate_json(raw) + if isinstance(raw, dict): + return GoalRecord.model_validate(raw) + except (ValueError, TypeError) as e: + logger.warning("Goal tools failed to decode persisted goal: %s", e) + return None + + +def encode_goal(goal: GoalRecord) -> str: + """Serialise a goal to a JSON string (camelCase aliases, non-ASCII kept).""" + return goal.model_dump_json(by_alias=True) + + +def get_goal_record( + session: Any, + branch: str = "", + prefix: str = DEFAULT_STATE_KEY_PREFIX, +) -> Optional[GoalRecord]: + """Read the current goal for ``branch`` from a session. + + Intended for server-side / REST / audit reads. ``session`` only needs a + ``state`` mapping attribute. Malformed data degrades to ``None``. + """ + state = getattr(session, "state", None) or {} + return decode_goal(state.get(state_key(prefix, branch))) + + +async def start_goal( + session_service: "SessionServiceABC", + *, + app_name: str, + user_id: str, + session_id: str, + objective: str, + state_key_prefix: str = DEFAULT_STATE_KEY_PREFIX, + agent_name: str = "", +) -> GoalRecord: + """Create or replace the active goal for a session from application code. + + This is the application-layer counterpart to the model-callable + ``create_goal`` tool. Call it when *your code* (not the LLM) owns the + objective definition — for example after parsing a ``/goal `` + slash command, reading it from a config file, or setting it + programmatically before the first turn. + + The goal is written directly into ``session.state`` and immediately + visible to the enforcement callbacks on the next ``Runner.run_async`` + call; the model does **not** need to call ``create_goal``. + + Mirrors Go's ``goal.Start(ctx, service, key, objective, ...)``. + + Args: + session_service: The session service holding the target session. + app_name: Application name (must match the runner's ``app_name``). + user_id: User id (must match the runner's ``user_id``). + session_id: Existing session id to inject the goal into. If the + session does not yet exist it is created automatically. + objective: Non-empty completion criterion — what "done" concretely + means. + state_key_prefix: State-key prefix (``goal`` by default). Must match + the prefix configured on :class:`GoalOptions`; avoid ``temp:``. + agent_name: Agent name used to scope the state key (matches the + ``LlmAgent.name`` the goal extension is mounted on). + + Returns: + The newly created :class:`GoalRecord` with ``status=active``. + + Raises: + ValueError: If ``objective`` is empty. + """ + objective = objective.strip() + if not objective: + raise ValueError("start_goal: objective must be a non-empty string") + + now = int(time.time()) + goal = GoalRecord( + id=uuid.uuid4().hex, + objective=objective, + status=GoalStatus.ACTIVE, + created_at_unix=now, + updated_at_unix=now, + ) + skey = state_key(state_key_prefix, agent_name) + encoded = encode_goal(goal) + + session = await session_service.get_session( + app_name=app_name, + user_id=user_id, + session_id=session_id, + ) + if session is None: + await session_service.create_session( + app_name=app_name, + user_id=user_id, + session_id=session_id, + state={skey: encoded}, + ) + else: + session.state[skey] = encoded + await session_service.update_session(session) + + return goal + + +def render_goal(goal: Optional[GoalRecord]) -> str: + """Render a compact ASCII status card for CLIs / logs. + + The tools themselves never call this. + """ + if goal is None: + return "(no goal)" + glyph = { + GoalStatus.ACTIVE: "🎯", + GoalStatus.COMPLETE: "✅", + GoalStatus.BLOCKED: "⛔", + }.get(goal.status, "🎯") + lines = [ + f"{glyph} Goal [{goal.status.value}]", + f" objective: {goal.objective}", + f" created: {goal.created_at_unix}", + ] + if goal.terminal_at_unix is not None: + lines.append(f" terminal: {goal.terminal_at_unix}") + return "\n".join(lines) diff --git a/trpc_agent_sdk/tools/goal_tools/_lock.py b/trpc_agent_sdk/tools/goal_tools/_lock.py new file mode 100644 index 00000000..3675e549 --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_lock.py @@ -0,0 +1,65 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Per-session / per-branch locks for :class:`GoalRecord` read-modify-write.""" + +from __future__ import annotations + +import asyncio +from contextlib import asynccontextmanager +from typing import AsyncIterator +from typing import Dict +from typing import Tuple + +from trpc_agent_sdk.context import InvocationContext + +from ._helpers import state_key + +_LockKey = Tuple[str, str, str, str] +_locks: Dict[_LockKey, asyncio.Lock] = {} +_registry_lock = asyncio.Lock() + + +def store_lock_key( + tool_context: InvocationContext, + *, + prefix: str, + branch: str, +) -> _LockKey: + """Build a stable key for the branch-scoped goal.""" + session = tool_context.session + return ( + getattr(session, "app_name", "") or "", + getattr(session, "user_id", "") or "", + getattr(session, "id", "") or "", + state_key(prefix, branch), + ) + + +async def _get_lock(key: _LockKey) -> asyncio.Lock: + async with _registry_lock: + lock = _locks.get(key) + if lock is None: + lock = asyncio.Lock() + _locks[key] = lock + return lock + + +@asynccontextmanager +async def goal_store_lock( + tool_context: InvocationContext, + *, + prefix: str, + branch: str, +) -> AsyncIterator[None]: + """Serialize load / mutate / save for one session goal.""" + lock = await _get_lock(store_lock_key(tool_context, prefix=prefix, branch=branch)) + async with lock: + yield + + +def reset_locks_for_tests() -> None: + """Clear the lock registry (tests only).""" + _locks.clear() diff --git a/trpc_agent_sdk/tools/goal_tools/_models.py b/trpc_agent_sdk/tools/goal_tools/_models.py new file mode 100644 index 00000000..a3c11e00 --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_models.py @@ -0,0 +1,61 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Data model for the Goal (persistent session objective) capability. + +A goal is a single, session-scoped contract that survives across +``Runner.run_async`` invocations: while it is ``active`` a "looks final" +model response does **not** mean the task is done — the model must either keep +working or explicitly mark the goal ``complete`` / ``blocked`` via +``update_goal``. + +Unlike :mod:`trpc_agent_sdk.tools.task_tools` (a multi-item board) there is at +most **one** goal per session branch, serialised as a single JSON blob into +session-level state. Serialisation mirrors the task tools: camelCase aliases +plus ``model_dump_json(by_alias=True)``. +""" + +from __future__ import annotations + +from enum import Enum +from typing import Optional + +from pydantic import BaseModel +from pydantic import Field + + +class GoalStatus(str, Enum): + """Lifecycle state of a session goal (3-state, terminal states irreversible).""" + + ACTIVE = "active" + """Still needs to be pursued.""" + + BLOCKED = "blocked" + """Progress depends on external input / state change (terminal).""" + + COMPLETE = "complete" + """The objective has genuinely been met (terminal).""" + + +class GoalRecord(BaseModel): + """A single session goal. + + ``created_at_unix`` / ``updated_at_unix`` / ``terminal_at_unix`` are + persisted under the camelCase aliases ``createdAtUnix`` / ``updatedAtUnix`` + / ``terminalAtUnix``. + """ + + id: str = Field(description="Server-assigned unique id (uuid).") + objective: str = Field(description="Completion criteria — the contract text.") + status: GoalStatus = Field(default=GoalStatus.ACTIVE, description="Lifecycle state.") + created_at_unix: int = Field(alias="createdAtUnix", description="Creation time (unix seconds).") + updated_at_unix: int = Field(alias="updatedAtUnix", description="Last update time (unix seconds).") + terminal_at_unix: Optional[int] = Field( + default=None, + alias="terminalAtUnix", + description="Time the goal entered a terminal state (unix seconds).", + ) + + model_config = {"populate_by_name": True} diff --git a/trpc_agent_sdk/tools/goal_tools/_prompt.py b/trpc_agent_sdk/tools/goal_tools/_prompt.py new file mode 100644 index 00000000..30387bdb --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_prompt.py @@ -0,0 +1,64 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Tool descriptions and behavioural guidance for the Goal capability.""" + +from __future__ import annotations + +# Per-tool short descriptions fed to the model as part of each function schema. +DEFAULT_GOAL_GET_DESCRIPTION = ( + "Read the current session goal, if any. Use this before deciding whether a persistent goal is active.") + +DEFAULT_GOAL_CREATE_DESCRIPTION = ( + "Create a session goal for a user-requested multi-step objective that should remain active until it is" + " completed or blocked. Do not call this for ordinary one-turn requests.") + +DEFAULT_GOAL_UPDATE_DESCRIPTION = ( + "Mark the active session goal complete or blocked. Use complete only when the objective has actually been" + " achieved. Use blocked only after the same blocking condition has repeated across goal attempts and" + " progress cannot continue without user input or an external-state change.") + +# Long-form guidance, injected once into the system instruction while a goal is +# being enforced (kept idempotent per turn via ``_GUIDANCE_MARKER``). +DEFAULT_GUIDANCE = """\ +You have access to session goal tools. A goal is a durable objective for this conversation, not a \ +todo list and not a generic memory entry. + +Goal tools require serial semantics. In one model response, call at most one goal tool. Do not call \ +create_goal and update_goal in the same response; create the goal first, then continue in a later \ +model turn before marking it complete or blocked. + +Use create_goal only when the user explicitly asks you to keep working toward a multi-step objective \ +across model-loop boundaries, or when their request clearly requires a persistent session objective. \ +Do not create goals for ordinary one-turn questions. + +Use get_goal when you need to inspect the current session goal. + +Use update_goal to mark the active goal complete only after the objective has actually been achieved. \ +Mark it blocked only when the same blocking condition has repeated across goal attempts and you cannot \ +make meaningful progress without user input or an external-state change. Do not mark a goal blocked \ +merely because the work is hard, slow, uncertain, incomplete, or would benefit from clarification. + +While a goal is active, a final answer is not enough. Either continue working, or call update_goal \ +with complete or blocked.\ +""" + +# Sentinel substring used to avoid injecting the long guidance more than once. +_GUIDANCE_MARKER = "You have access to session goal tools." + +# Nudge appended (as a user-role message) when re-running after a premature +# final response. ``attempt`` / ``max_retries`` / ``objective`` are filled in. +DEFAULT_NUDGE = """\ +[goal reminder] You marked your response as final, but the session goal is still active \ +(attempt {attempt} of {max_retries}). + +Active goal: +{objective} + +You must either continue working toward the goal, or call update_goal with status complete or \ +blocked. Use blocked only when the same blocking condition has repeated across goal attempts and \ +you cannot make meaningful progress without user input or an external-state change. Do not produce \ +a final answer while the goal remains active.\ +""" diff --git a/trpc_agent_sdk/tools/goal_tools/_setup.py b/trpc_agent_sdk/tools/goal_tools/_setup.py new file mode 100644 index 00000000..5c1abe79 --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_setup.py @@ -0,0 +1,258 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""Enforcement for the Goal capability: guidance / nudge injection, premature +final-response interception + same-invocation re-run, and the ``setup_goal`` +assembly helper. + +Phase-1 core. While a goal is ``active``: + - ``before_model`` injects the guidance (once per request) and, if a re-run + was requested, appends a user-role nudge to the request. + - ``after_model`` watches each stream chunk; on a *premature* final response + (looks final, no tool call, no error) it suppresses the finalisation, + schedules a nudge and flips the agent loop's ``running`` flag back to + ``True`` so :class:`LlmAgent` re-runs within the same invocation. A + ``max_retries`` budget guarantees fail-open (the loop never spins forever). + +Counters live in invocation-scoped ``agent_context`` metadata, so they reset +naturally when the next ``Runner.run_async`` starts and are never persisted. +""" + +from __future__ import annotations + +from dataclasses import dataclass +from typing import TYPE_CHECKING +from typing import Any +from typing import Callable +from typing import List +from typing import Literal +from typing import Optional + +from pydantic import BaseModel + +from trpc_agent_sdk.context import InvocationContext +from trpc_agent_sdk.log import logger +from trpc_agent_sdk.models import LlmRequest +from trpc_agent_sdk.models import LlmResponse +from trpc_agent_sdk.types import Content +from trpc_agent_sdk.types import Part + +from ._goal_toolset import GoalToolSet +from ._helpers import DEFAULT_STATE_KEY_PREFIX +from ._helpers import decode_goal +from ._helpers import state_key +from ._models import GoalRecord +from ._models import GoalStatus +from ._prompt import DEFAULT_GUIDANCE +from ._prompt import DEFAULT_NUDGE +from ._prompt import _GUIDANCE_MARKER + +if TYPE_CHECKING: + from trpc_agent_sdk.agents import LlmAgent + +# Invocation-scoped metadata keys (stored in ``agent_context.metadata``). +_RETRY_COUNT_KEY = "__goal_enforce_retry_count" +_REMINDER_PENDING_KEY = "__goal_enforce_reminder_pending" +# Tracks whether *we* flipped the loop's running flag, so we only ever reset +# what we set (never clobbering another feature's re-run request). +_RERUN_ARMED_KEY = "__goal_enforce_rerun_armed" + +OnRetry = Callable[["RetryEvent"], None] + + +class RetryEvent(BaseModel): + """Observability payload emitted on every interception / budget exhaustion.""" + + reason: Literal["blocked", "exhausted"] + agent_name: str + goal: GoalRecord + attempt_number: int + max_retries: int + + +@dataclass +class GoalOptions: + """Configuration for the goal capability. + + Attributes: + state_key_prefix: Session state-key prefix (``goal`` by default). Avoid + ``temp:`` — that prefix is invocation-only and is never persisted. + inject_guidance: Inject :data:`DEFAULT_GUIDANCE` into the system + instruction once per request while a goal is active. + guidance: The guidance text to inject. + max_retries: Interception budget. Once this many premature finals have + been intercepted in one invocation, the next one is let through + (fail-open) so the loop never spins forever. + nudge_template: User-role reminder template; receives ``attempt``, + ``max_retries`` and ``objective``. + on_retry: Optional observability callback. Invoked with try/except so + a faulty callback never breaks the main flow. + """ + + state_key_prefix: str = DEFAULT_STATE_KEY_PREFIX + inject_guidance: bool = True + guidance: str = DEFAULT_GUIDANCE + max_retries: int = 3 + nudge_template: str = DEFAULT_NUDGE + on_retry: Optional[OnRetry] = None + + def toolset(self) -> GoalToolSet: + """Build a :class:`GoalToolSet` matching these options.""" + return GoalToolSet(state_key_prefix=self.state_key_prefix) + + +class _GoalCallbacks: + """A pair of model callbacks implementing goal enforcement.""" + + def __init__(self, opts: GoalOptions) -> None: + self._opts = opts + + # -- helpers ------------------------------------------------------------- + def _resolve_branch(self, ctx: InvocationContext) -> str: + return ctx.branch or ctx.agent_name or "" + + def _state_key(self, ctx: InvocationContext) -> str: + return state_key(self._opts.state_key_prefix, self._resolve_branch(ctx)) + + def _load_goal(self, ctx: InvocationContext) -> Optional[GoalRecord]: + return decode_goal(ctx.state.get(self._state_key(ctx))) + + def _emit(self, ctx: InvocationContext, goal: GoalRecord, reason: str, attempt: int) -> None: + callback = self._opts.on_retry + if callback is None: + return + try: + callback( + RetryEvent( + reason=reason, # type: ignore[arg-type] + agent_name=ctx.agent_name, + goal=goal, + attempt_number=attempt, + max_retries=self._opts.max_retries, + )) + except Exception as ex: # pylint: disable=broad-except + logger.warning("goal on_retry callback raised, ignoring: %s", ex) + + @staticmethod + def _is_premature_final(response: Optional[LlmResponse]) -> bool: + """Whether ``response`` is a *final* chunk that prematurely ends the turn. + + A final chunk = not partial, no error, carries visible (non-thought) + text, and is NOT a tool-call / tool-response. Intermediate partial + chunks and tool calls always pass through untouched. + """ + if response is None or response.partial or response.error_code: + return False + content = response.content + if content is None or not content.parts: + return False + has_text = False + for part in content.parts: + if part.function_call or part.function_response: + return False + if getattr(part, "code_execution_result", None) or getattr(part, "executable_code", None): + return False + if part.text and not getattr(part, "thought", False): + has_text = True + return has_text + + # -- callbacks ----------------------------------------------------------- + async def before_model(self, ctx: InvocationContext, request: LlmRequest) -> Optional[LlmResponse]: + """Inject guidance (once) and a pending nudge; never short-circuits.""" + meta = ctx.agent_context.metadata + # Consume any re-run we armed last turn: reset the loop's running flag + # to its default (False) so the loop only continues if THIS turn arms it + # again (or produces a real tool call). We only touch the flag we set. + if meta.get(_RERUN_ARMED_KEY): + from trpc_agent_sdk.agents._constants import TRPC_AGENT_RUNNING_KEY + ctx.agent_context.with_metadata(TRPC_AGENT_RUNNING_KEY, False) + meta[_RERUN_ARMED_KEY] = False + + if self._opts.inject_guidance: + existing = "" + if request.config and request.config.system_instruction: + existing = str(request.config.system_instruction) + if _GUIDANCE_MARKER not in existing: + request.append_instructions([self._opts.guidance]) + + goal = self._load_goal(ctx) + if goal is None or goal.status != GoalStatus.ACTIVE: + return None + + if meta.get(_REMINDER_PENDING_KEY): + attempt = int(meta.get(_RETRY_COUNT_KEY, 0)) + nudge = self._opts.nudge_template.format( + attempt=attempt, + max_retries=self._opts.max_retries, + objective=goal.objective, + ) + request.contents.append(Content(role="user", parts=[Part.from_text(text=nudge)])) + meta[_REMINDER_PENDING_KEY] = False + return None + + async def after_model(self, ctx: InvocationContext, response: Any) -> Optional[LlmResponse]: + """Intercept a premature final response and request a same-invocation re-run.""" + if not isinstance(response, LlmResponse): + return None + if not self._is_premature_final(response): + return None + + goal = self._load_goal(ctx) + if goal is None or goal.status != GoalStatus.ACTIVE: + return None + + meta = ctx.agent_context.metadata + retry = int(meta.get(_RETRY_COUNT_KEY, 0)) + + if retry >= self._opts.max_retries: + # Budget exhausted: fail-open. Let the final response through and + # reset counters so the loop ends naturally. + self._emit(ctx, goal, "exhausted", retry) + meta[_RETRY_COUNT_KEY] = 0 + meta[_REMINDER_PENDING_KEY] = False + return None + + retry += 1 + meta[_RETRY_COUNT_KEY] = retry + meta[_REMINDER_PENDING_KEY] = True + meta[_RERUN_ARMED_KEY] = True + self._emit(ctx, goal, "blocked", retry) + + # The only legal lever for a same-invocation re-run: flip the loop's + # running flag back to True (read at the tail of LlmAgent's while loop). + # ``before_model`` resets it next turn (see _RERUN_ARMED_KEY) so the loop + # ends naturally once the model stops giving premature finals. + from trpc_agent_sdk.agents._constants import TRPC_AGENT_RUNNING_KEY + ctx.agent_context.with_metadata(TRPC_AGENT_RUNNING_KEY, True) + + # Replace the premature final with a content-less, partial control + # response so the finalisation text is not committed as the answer. + return LlmResponse(content=None, partial=True, custom_metadata={"goal_enforced": True}) + + +def _chain_callbacks(existing: Any, new: Callable) -> List[Callable]: + """Append ``new`` after any existing callback(s), preserving order.""" + if existing is None: + return [new] + if isinstance(existing, list): + return [*existing, new] + return [existing, new] + + +def setup_goal(agent: "LlmAgent", opts: Optional[GoalOptions] = None) -> "LlmAgent": + """Mount the goal capability on ``agent`` in one call. + + Appends the :class:`GoalToolSet` to ``agent.tools`` and chains the + enforcement callbacks onto ``before_model_callback`` / ``after_model_callback`` + (B1: tools and callbacks are registered separately). + + Returns the same ``agent`` for chaining. + """ + opts = opts or GoalOptions() + callbacks = _GoalCallbacks(opts) + agent.tools.append(opts.toolset()) + agent.before_model_callback = _chain_callbacks(agent.before_model_callback, callbacks.before_model) + agent.after_model_callback = _chain_callbacks(agent.after_model_callback, callbacks.after_model) + return agent diff --git a/trpc_agent_sdk/tools/goal_tools/_store.py b/trpc_agent_sdk/tools/goal_tools/_store.py new file mode 100644 index 00000000..ad9ee427 --- /dev/null +++ b/trpc_agent_sdk/tools/goal_tools/_store.py @@ -0,0 +1,70 @@ +# Tencent is pleased to support the open source community by making tRPC-Agent-Python available. +# +# Copyright (C) 2026 Tencent. All rights reserved. +# +# tRPC-Agent-Python is licensed under Apache-2.0. +"""In-session create / transition logic for a single :class:`GoalRecord`. + +All functions operate on an in-memory ``Optional[GoalRecord]``; persistence is +the caller's responsibility (tools write the serialised goal back through +``tool_context.state``). They enforce the 3-state contract: + + - at most **one** ``active`` goal per branch (``create`` rejects a duplicate); + - terminal states (``complete`` / ``blocked``) are irreversible + (``transition`` rejects when there is no active goal). +""" + +from __future__ import annotations + +import uuid +from typing import Optional +from typing import Tuple + +from ._models import GoalRecord +from ._models import GoalStatus + + +def apply_create( + existing: Optional[GoalRecord], + *, + objective: str, + now_unix: int, +) -> Tuple[Optional[GoalRecord], Optional[str]]: + """Create a new ``active`` goal. + + Returns ``(record, None)`` on success or ``(None, error)`` when an + ``active`` goal already exists. + """ + if existing is not None and existing.status == GoalStatus.ACTIVE: + return None, "an active goal already exists; complete or block it before creating a new one" + record = GoalRecord( + id=uuid.uuid4().hex, + objective=objective, + status=GoalStatus.ACTIVE, + created_at_unix=now_unix, + updated_at_unix=now_unix, + ) + return record, None + + +def apply_transition( + existing: Optional[GoalRecord], + *, + status: GoalStatus, + now_unix: int, +) -> Tuple[Optional[GoalRecord], Optional[str]]: + """Move the active goal into a terminal state (``complete`` / ``blocked``). + + Returns ``(record, None)`` on success or ``(None, error)`` when there is no + active goal, the goal is already terminal, or ``status`` is not terminal. + """ + if status not in (GoalStatus.COMPLETE, GoalStatus.BLOCKED): + return None, "status must be 'complete' or 'blocked'" + if existing is None: + return None, "no goal exists to update" + if existing.status != GoalStatus.ACTIVE: + return None, f"goal is already terminal (status={existing.status.value}) and cannot be changed" + existing.status = status + existing.updated_at_unix = now_unix + existing.terminal_at_unix = now_unix + return existing, None