13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,19 @@ All notable changes to Agents.KT are documented here. The format follows [Keep a

## [Unreleased]

## [0.8.2] - 2026-07-01

_Standards & trust hardening._ Hardens the experimental x402 buyer (mandatory guardrails, cross-payment
session limits, a signer seam, CAIP-2 network ids) and adds machine-readable release-truth metadata.

### Added — release-truth: a single source of truth for version/provider/protocol claims (#4735)

An external audit found the advertised version + provider + protocol claims drifting across the README,
roadmap, comparison page, and POM (0.8.1 shipped while several surfaces still said 0.8.0/0.7.2). New
`release-metadata.yaml` is now the one place those claims live, and `ReleaseMetadataConsistencyTest` pins
the in-repo surfaces to it — a release that edits only the metadata fails the build until the prose moves.
Kills the drift class the de-slop epic (#3083) flagged.

### Changed — x402 buyer trust hardening: guardrails are now mandatory and bind more (#4528)

An external audit flagged that the "guardrails-first" buyer had **optional** guardrails (an empty
6 changes: 3 additions & 3 deletions README.md
@@ -37,7 +37,7 @@ The 0.6–0.7 line turns those boundaries into reviewable evidence: deterministi
```kotlin
// build.gradle.kts
dependencies {
    implementation("ai.deep-code:agents-kt:0.8.1")
    implementation("ai.deep-code:agents-kt:0.8.2")
}
```

@@ -328,7 +328,7 @@ Topical guides:

## Current Release

The latest published release is `0.8.0` — **interoperable, multimodal agents, with capability grants** (the largest minor since 0.5.0). (Between releases, `main` carries unreleased work under a `-SNAPSHOT` version; see the [CHANGELOG](CHANGELOG.md) for what's landed since.) Highlights: **A2A v1** (agents are A2A servers + typed clients), full **multimodal** (vision across Claude/OpenAI/Ollama, audio STT/TTS tools with self-hosted Whisper/Qwen, image generation), an **eighth model provider — Google Gemini** (`model { gemini("gemini-2.5-flash") }`, a full from-scratch adapter with native SSE, function calling, `responseJsonSchema` decoding, and thought-summary reasoning), **capability grants** (`grants { allow(...); confirm(...) }` — `confirm` tools need the granting agent's authorization, fail-closed), **agent.json** serialization, the **RAG** seam, and richer **composition** (`handoff` / `firstOf` / `.speculative` / `loopUntil` / aggregators / forum captains) plus HITL gates and an eval harness. The Layer-2 **sandbox backends** (Docker / egress proxy / read confinement) move to **0.9.0**. Additive only — drop-in on the 0.x line. Dependency line: Kotlin 2.4.0.
The latest published release is `0.8.2` — **standards & trust hardening** (on the 0.8.1 agentic-web base, itself on the 0.8.0 base of interoperable, multimodal agents with capability grants). (Between releases, `main` carries unreleased work under a `-SNAPSHOT` version; see the [CHANGELOG](CHANGELOG.md) for what's landed since.) **0.8.2 hardened** the *experimental* **x402** buyer: guardrails are now **mandatory** (`X402SpendPolicy` is no longer defaultable), bind tighter (`allowedAssets` / `allowedResourceOrigins` / clamped authorization lifetime), select offers deterministically (cheapest permitted, not the seller's first), enforce **cross-payment session limits** (aggregate spend caps via `X402SpendStore`), and sign through an **`X402Signer`** seam (KMS / HSM / scoped session key — permanent keys never touch the app heap); it also accepts **CAIP-2 network ids** (`eip155:84532`) as a v2-interop step on the v1 wire. Plus **release-truth** — a machine-readable `release-metadata.yaml` as the single source of truth for version/provider/protocol claims, guarded by a consistency test so the surfaces can't drift again. **0.8.1 added** the agentic-web surfaces — **AGNTCY** (OASF export/import + Identity verify + DIR client), **AG-UI** serving (`AgUiServer.from(agent)`, incl. live REASONING + `TOOL_CALL_RESULT`), **NLWeb** serve + search, **x402** seller gate + the *experimental* guardrails-first buyer, audit-ledger misbehaviour folding, and default transient-network retry. On the **0.8.0** base: **A2A v1** (agents are A2A servers + typed clients), full **multimodal** (vision across Claude/OpenAI/Ollama, audio STT/TTS tools with self-hosted Whisper/Qwen, image generation), an **eighth model provider — Google Gemini** (`model { gemini("gemini-2.5-flash") }`, a full from-scratch adapter with native SSE, function calling, `responseJsonSchema` decoding, and thought-summary reasoning), **capability grants** (`grants { allow(...); confirm(...) }` — `confirm` tools need the granting agent's authorization, fail-closed), **agent.json** serialization, the **RAG** seam, and richer **composition** (`handoff` / `firstOf` / `.speculative` / `loopUntil` / aggregators / forum captains) plus HITL gates and an eval harness. The Layer-2 **sandbox backends** (Docker / egress proxy / read confinement) move to **0.9.0**. Additive only — drop-in on the 0.x line. Dependency line: Kotlin 2.4.0.

**0.7.23 — maintainability + a model-error policy.** A behavior-preserving, drop-in release that closes the bulk of the code-smell remediation epic (#2790) and finishes the **AgenticLoop decomposition** begun in 0.7.21: the `Agent` god class splits into `InterceptorChain` + `ListenerRegistry` (#2793); `McpServer` into HTTP intake + a transport-agnostic `McpDispatcher` (#2795); the five composition operators' duplicated streaming-session scaffold collapses into one `agentSessionScope` (#2797); `LiveShow`'s banner + thread-shadowing spinner become their own units (#2798); `Forum`/`Branch` lose their dual-path duplication (#2802); a typed `GenerableCodec` seam collapses the `@Generable` casts to one boundary (#2803); and `executeAgentic`'s last setup block extracts as `resolveAllowedTools` (#3423). The one **new public API** is **`onLLMError`** (#3508): when a model is configured, a failed model call in the agentic loop fails fast and loud by default, with `onLLMError { e -> RespondWith(fallback) | Rethrow }` as the opt-in recovery hook. The detekt-baseline ratchet fell 423 → 415 and `@Suppress("UNCHECKED_CAST")` 42 → 30 across the release. Only #2791 (the turn-loop core of `executeAgentic`) remains open in the epic.

@@ -338,7 +338,7 @@ The latest published release is `0.8.0` — **interoperable, multimodal agents,

**0.7.1 — verify-gate hardening.** The manifest `verify` gate compares policy **sets** (not coarse scores), so it catches widenings it previously missed (a host added within `hosts` mode, or a write glob broadened without changing the count), keyed per `agentName.toolName`; plus docs/KDoc drift fixes.

**0.7.0 — Boundaries you can enforce externally.** The 0.6 line made tool policies declarable and auditable; 0.7.0 makes them **enforced**. A tool's declared `ToolPolicy` now constrains it at runtime: Layer 1 (in-JVM filesystem-argument gate, #2890) plus Layer 2 OS sandboxing (#1916) — macOS Seatbelt, Linux bubblewrap, a firejail setuid fallback, and a plain-`ProcessBuilder` + loud `UNCONFINED` warning where no tool is present — confine subprocess-shaped tools to their declared write roots, an environment allow-list, a working directory, and a default-deny network. And the deterministic permission manifest is reachable **outside Gradle** via the standalone [`agents-kt` CLI](docs/cli.md) (`generate` / `inspect` / `verify`, #1923) — a drop-in CI gate that fails when a change widens a capability boundary. *Deferred to 0.8: `WasmSandbox` (#2894), `DockerSandbox` (#2895), the network hostname-allowlist proxy (#2893), and the `grants { }` structure DSL.*
**0.7.0 — Boundaries you can enforce externally.** The 0.6 line made tool policies declarable and auditable; 0.7.0 makes them **enforced**. A tool's declared `ToolPolicy` now constrains it at runtime: Layer 1 (in-JVM filesystem-argument gate, #2890) plus Layer 2 OS sandboxing (#1916) — macOS Seatbelt, Linux bubblewrap, a firejail setuid fallback, and a plain-`ProcessBuilder` + loud `UNCONFINED` warning where no tool is present — confine subprocess-shaped tools to their declared write roots, an environment allow-list, a working directory, and a default-deny network. And the deterministic permission manifest is reachable **outside Gradle** via the standalone [`agents-kt` CLI](docs/cli.md) (`generate` / `inspect` / `verify`, #1923) — a drop-in CI gate that fails when a change widens a capability boundary. *(Of the Layer-2 backends pencilled for later: the `grants { }` DSL shipped in 0.8.0 (#4545); `WasmSandbox` (#2894) was closed won't-do — embedded-WASM-for-tools isn't rational; `DockerSandbox` (#2895) and the network hostname-allowlist proxy (#2893) are now planned for 0.9.0.)*

**0.6.6 — Maintainability + cancellation (#2863 + epic #2790).** Session catch now distinguishes `CancellationException` (propagate per structured concurrency — no synthetic `Failed` event) from `TimeoutCancellationException` (real failure — keeps surfacing as `Failed`), plus 10 internal refactors (detekt baseline, `JsonEscape`/`JsonRpc` consolidation, `HttpModelClientSupport.sendBounded`, …). Additive only — every 0.6.5 caller compiles and runs unchanged.

9 changes: 7 additions & 2 deletions build.gradle.kts
@@ -16,7 +16,7 @@ plugins {
}

group = "ai.deep-code"
version = "0.8.2-SNAPSHOT"
version = "0.8.2"

repositories {
    mavenCentral()
@@ -809,7 +809,12 @@ publishing {

pom {
    name.set("Agents.KT")
    description.set("Typed Kotlin DSL framework for AI agent systems")
    description.set(
        "The typed agent runtime for the JVM — bounded agent systems where authority is " +
            "explicit before execution, enforced during execution, and evidenced afterward. " +
            "Typed Agent<IN,OUT> contracts, least-privilege tools, permission manifests, and " +
            "audit evidence by construction. MCP / A2A / AG-UI / NLWeb / x402 native.",
    )
    url.set("https://github.com/Deep-CodeAI/Agents.KT")

    licenses {
2 changes: 1 addition & 1 deletion docs/comparison.md
@@ -139,7 +139,7 @@ A few shortcuts that point at one framework over the others:

## Status notes (2026-06)

- **Agents.KT 0.8.0 (latest release)** — interoperable, multimodal agents with capability grants: **A2A v1** (server + typed client), full **multimodal** (audio STT/TTS, vision, image generation), the **RAG** seam, richer **composition** (`handoff` / `firstOf` / `.speculative` / `loopUntil` / aggregators / forum captains), **HITL** gates + an **eval** harness, an eighth model provider (**Google Gemini**), **agent.json** serialization, and the **capability-grants** DSL (`grants { allow / confirm }`) — on top of 0.7.x's runtime ToolPolicy enforcement (in-JVM gate + OS sandbox), the standalone `agents-kt` CLI, tamper-evident audit ledger, fail-loud skill routing, the `onLLMError` policy, the Perplexity connector + `perplexitySearch` grounded tool, and 0.6.0's permission manifests / JSONL audit export / OTel-LangSmith-Langfuse bridges / constrained decoding. *Deferred to 0.9.0:* the remaining Layer-2 sandbox backends (Docker / egress proxy / read confinement).
- **Agents.KT 0.8.2 (latest release)** — *standards & trust hardening.* Hardens the *experimental* **x402** buyer — guardrails are now **mandatory** (`X402SpendPolicy` no longer defaultable), bind tighter (`allowedAssets` / `allowedResourceOrigins` / clamped authorization lifetime), select the cheapest permitted offer, enforce **cross-payment session limits**, and sign through an **`X402Signer`** seam (KMS / HSM / scoped session key) — and accepts **CAIP-2 network ids** on the v1 wire; plus machine-readable **release-truth** metadata guarding version/provider/protocol claims from drift. On the **0.8.1** agentic-web base — the interop surfaces — **AGNTCY** (OASF export/import + Identity-badge verify + DIR Store/Search/Routing client), **AG-UI** serving (`AgUiServer.from(agent)` → typed SSE incl. live REASONING + `TOOL_CALL_RESULT`), **NLWeb** serve + search (an `/ask` compatibility slice), and **x402** payments (seller gate + *experimental* guardrails-first buyer) — plus audit-ledger misbehaviour folding and default transient-network retry. On the **0.8.0** base: interoperable, multimodal agents with capability grants — **A2A v1** (server + typed client), full **multimodal** (audio STT/TTS, vision, image generation), the **RAG** seam, richer **composition** (`handoff` / `firstOf` / `.speculative` / `loopUntil` / aggregators / forum captains), **HITL** gates + an **eval** harness, an eighth model provider (**Google Gemini**), **agent.json** serialization, and the **capability-grants** DSL (`grants { allow / confirm }`) — itself on 0.7.x's runtime ToolPolicy enforcement (in-JVM gate + OS sandbox), the standalone `agents-kt` CLI, tamper-evident audit ledger, fail-loud skill routing, the `onLLMError` policy, the Perplexity connector + `perplexitySearch` grounded tool, and 0.6.0's permission manifests / JSONL audit export / OTel-LangSmith-Langfuse bridges / constrained decoding. 
*Next (0.9.0):* typed multi-agent choreography, plus the remaining Layer-2 sandbox backends (Docker / egress proxy / read confinement).
- **LangChain 0.3.x** — stable, ecosystem mature. LCEL is the recommended composition surface.
- **Semantic Kernel 1.x** — stable, MCP integration in preview.
- **AutoGen 0.4.x** — major architectural rewrite landed; the new core/agentchat split is recent.
2 changes: 2 additions & 0 deletions docs/roadmap.md
@@ -9,6 +9,8 @@
0.6.0 Boundaries you can audit — shipped (epic [#1911](../../issues/1911))
0.7.0 Boundaries you can enforce externally — shipped (epic [#2879](../../issues/2879))
0.8.0 Interoperable, multimodal agents (+ grants) — shipped (A2A v1, multimodal, RAG, composition, Gemini, capability grants)
0.8.1 The agentic web: discover, serve, get paid — shipped (AGNTCY, AG-UI, NLWeb, x402 seller + experimental buyer)
0.8.2 Standards & trust hardening — shipped (x402 buyer: mandatory guardrails + session limits + signer seam + CAIP-2 ids; release-truth metadata)
0.9.0 Layer-2 sandbox backends — next (Docker/proxy/read-confinement)
```

27 changes: 27 additions & 0 deletions release-metadata.yaml
@@ -0,0 +1,27 @@
# Single source of truth for release / version / provider / protocol claims.
#
# Why this file exists: version + provider + protocol claims were duplicated across the README, roadmap,
# comparison page, POM, and website, and drifted (0.8.1 shipped but several surfaces still said 0.8.0/0.7.2).
# `ReleaseMetadataConsistencyTest` validates the in-repo surfaces against this file so a single edit here
# keeps them honest. Update this file as part of every release (RELEASE_RUNBOOK step 6).
#
# Schema is intentionally flat + simple-scalar so a dependency-free test can parse it.

currentRelease: "0.8.2" # latest version resolvable on Maven Central
developmentVersion: "0.8.2-SNAPSHOT"
providers: 8 # Ollama, Claude, OpenAI, DeepSeek, Kimi, OpenRouter, Perplexity, Gemini

nextRelease:
  version: "0.9.0"
  theme: "choreography you can't deadlock (typed multi-agent coordination)"

# Protocol support is a SLICE, not necessarily a full implementation — state it honestly.
# status: shipped | partial | experimental | spiked ; support = the supported subset.
protocols:
  mcp: { status: shipped, support: "client + server (McpServer.from(agent), McpRunner)" }
  a2a: { status: shipped, version: "0.2", support: "server + typed client; message/send; capabilities.extensions" }
  agui: { status: partial, support: "streaming subset (lifecycle/text/reasoning/tool incl. TOOL_CALL_RESULT/step); STATE + client-tool round-trips deferred" }
  nlweb: { status: partial, targetVersion: "0.55", support: "/ask compatibility slice (single non-streaming response)" }
  x402: { status: experimental, version: "1", support: "seller gate + buyer (mandatory guardrails, cross-payment session limits, X402Signer seam); accepts CAIP-2 network ids (v1 wire; v2 transport deferred)" }
  agntcy: { status: shipped, support: "OASF export/import + Identity badge verify + DIR Store/Search/Routing client" }
  ap2: { status: spiked, support: "feasibility spike (:agents-kt-ap2): mandate verify + x402 settle + AgentCard advertise" }
@@ -0,0 +1,82 @@
package agents_engine.core

import agents_engine.model.ModelProvider
import java.nio.file.Files
import java.nio.file.Path
import kotlin.test.Test
import kotlin.test.assertEquals
import kotlin.test.assertTrue

// Single-source-of-truth guard (#0.8.2 release-truth). An external audit found the release version + provider
// claims drifting across README / roadmap / comparison / site (0.8.1 shipped but several surfaces still said
// 0.8.0/0.7.2). `release-metadata.yaml` is now the one place those claims live; this test pins the in-repo
// surfaces to it, so the next release that edits only the metadata file fails the build until the prose moves.
class ReleaseMetadataConsistencyTest {

    private fun read(relative: String): String {
        val path = Path.of(relative)
        assertTrue(Files.exists(path), "expected $relative to exist (tests run from the repo root)")
        return Files.readString(path)
    }

    private val metadata = read("release-metadata.yaml")

    // Top-level scalar: the line `key: value [# comment]` at column 0; strip inline comment + quotes.
    private fun scalar(key: String): String {
        val line = metadata.lineSequence().firstOrNull { it.startsWith("$key:") }
            ?: error("release-metadata.yaml is missing a top-level scalar '$key'")
        return line.substringAfter(':').substringBefore('#').trim().trim('"')
    }

    private val currentRelease = scalar("currentRelease")
    private val developmentVersion = scalar("developmentVersion")
    private val providers = scalar("providers").toInt()

    @Test
    fun `provider count in metadata tracks ModelProvider entries`() {
        assertEquals(
            ModelProvider.entries.size, providers,
            "release-metadata.yaml providers=$providers but ModelProvider has ${ModelProvider.entries.size} entries",
        )
    }

    @Test
    fun `gradle development version matches metadata`() {
        val gradleVersion = Regex("""^version\s*=\s*"([^"]+)"""", RegexOption.MULTILINE)
            .find(read("build.gradle.kts"))?.groupValues?.get(1) ?: error("no version in build.gradle.kts")
        // On normal dev `main` the build is the -SNAPSHOT dev version; during a release commit it is exactly the
        // currentRelease (checkReadmeVersion / checkSnapshotPolicy own that transition). Accept either.
        assertTrue(
            gradleVersion == developmentVersion || gradleVersion == currentRelease,
            "build.gradle.kts version '$gradleVersion' is neither the metadata developmentVersion " +
                "'$developmentVersion' nor currentRelease '$currentRelease'",
        )
    }

    @Test
    fun `README dependency snippet and Current Release both name the current release`() {
        val readme = read("README.md")
        assertTrue(
            readme.contains("ai.deep-code:agents-kt:$currentRelease"),
            "README dependency snippet must advertise the current release $currentRelease",
        )
        val currentReleaseSection = readme.substringAfter("## Current Release").substringBefore("\n## ")
        assertTrue(
            currentReleaseSection.contains(currentRelease),
            "README 'Current Release' section must name $currentRelease (found stale content)",
        )
    }

    @Test
    fun `roadmap and comparison name the current release`() {
        assertTrue(
            read("docs/roadmap.md").contains(currentRelease),
            "docs/roadmap.md must mention the current release $currentRelease",
        )
        val comparison = read("docs/comparison.md")
        assertTrue(
            comparison.contains("$currentRelease (latest release)"),
            "docs/comparison.md must mark $currentRelease as the latest release",
        )
    }
}
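
Reviewer note: the `scalar` helper in this test matches keys only at column 0 and cuts the value at the first `#`. The sketch below is a rough standalone illustration of that behaviour, not code from this PR: it copies the helper's logic into a free function, and the two-argument signature and the sample text are assumptions made purely for the example.

```kotlin
// Hypothetical standalone copy of the test's top-level scalar lookup (illustration only).
fun scalar(metadata: String, key: String): String {
    // Only lines that begin with `key:` at column 0 count as top-level scalars.
    val line = metadata.lineSequence().firstOrNull { it.startsWith("$key:") }
        ?: error("missing top-level scalar '$key'")
    // Drop the key, drop any inline comment, then strip whitespace and surrounding quotes.
    return line.substringAfter(':').substringBefore('#').trim().trim('"')
}

fun main() {
    // Sample text shaped like release-metadata.yaml; the values are illustrative.
    val sample = """
        currentRelease: "0.8.2"   # latest version resolvable on Maven Central
        developmentVersion: "0.8.2-SNAPSHOT"
        providers: 8              # Ollama, Claude, ...
        nextRelease:
          version: "0.9.0"
    """.trimIndent()

    println(scalar(sample, "currentRelease"))      // 0.8.2
    println(scalar(sample, "developmentVersion"))  // 0.8.2-SNAPSHOT
    println(scalar(sample, "providers").toInt())   // 8

    // Indented keys never match at column 0, so the nested `version` is not a top-level scalar.
    println(runCatching { scalar(sample, "version") }.isFailure)  // true
}
```

One consequence of this approach: a top-level value that itself contains `#` would be truncated at that character, which is presumably why the metadata file's header asks for a flat, simple-scalar schema.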