
Wire native model serving into agent inference path #11

@EightRice


Goal

Let an agent execute a turn against the locally served native model (the LLM backbone with its trained projection head, optionally VL-JEPA) instead of always routing through the Claude / GPT / Gemini bridges.

Current state

  • The LLM-backbone path (nodes/common/llm_backbone.py) downloads a frozen HF model and trains a small projection head plus JEPA predictor on top; today it is used for training only (a minimal sketch of this shape follows the list).
  • The agent framework (atn/providers/) supports Claude Max, Codex Max, Anthropic, OpenAI, Gemini, and DeepSeek, plus custom OpenAI-compatible endpoints. There is no provider that runs the local backbone.
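
For orientation, a hedged sketch of the frozen-backbone shape described above. The class and field names (`FrozenBackbone`, `proj_head`) are assumptions, the JEPA predictor is omitted for brevity, and nodes/common/llm_backbone.py may be structured differently:

```python
# Hedged sketch, not the real module: names are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class FrozenBackbone(nn.Module):
    """Frozen HF encoder with a small trainable projection head on top."""

    def __init__(self, model_name: str, proj_dim: int = 512):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.encoder = AutoModel.from_pretrained(model_name)
        for p in self.encoder.parameters():
            p.requires_grad = False          # backbone stays frozen
        hidden = self.encoder.config.hidden_size
        self.proj_head = nn.Linear(hidden, proj_dim)  # the only trained part here

    def forward(self, texts: list[str]) -> torch.Tensor:
        batch = self.tokenizer(texts, return_tensors="pt",
                               padding=True, truncation=True)
        with torch.no_grad():                # no gradients through the encoder
            hidden = self.encoder(**batch).last_hidden_state
        pooled = hidden.mean(dim=1)          # simple mean pooling
        return self.proj_head(pooled)        # gradients flow to the head only
```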

Scope

  • Add atn/providers/local_backbone.py implementing the standard Provider interface, using the loaded backbone for forward passes (see the provider sketch after this list).
  • Decide how generation works: the backbone is encoder-shaped today, so generation requires either the LLM's native generate path or an ATN-side decoder loop (a decoder-loop sketch also follows the list).
  • Register it as a provider option (e.g. local) discoverable via get_available_models.
  • Document the resource expectations (VRAM/RAM/disk) for each backbone choice.
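
A hedged sketch of the proposed provider using the LLM's native generate path. The Provider base class, the method name `complete`, and `LOCAL_CATALOG` are assumptions, since the actual atn/providers/ interface is not shown in this issue; match them to the real interface when implementing:

```python
# Hedged sketch: class shape and method names are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

LOCAL_CATALOG = ["local-backbone-default"]   # hypothetical catalog entry

class LocalBackboneProvider:
    """Serves agent turns from the locally loaded backbone."""

    name = "local"

    def __init__(self, model_name: str):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModelForCausalLM.from_pretrained(model_name)
        self.model.eval()

    @staticmethod
    def get_available_models() -> list[str]:
        return LOCAL_CATALOG

    @torch.no_grad()
    def complete(self, prompt: str, max_new_tokens: int = 512) -> str:
        inputs = self.tokenizer(prompt, return_tensors="pt")
        output = self.model.generate(**inputs,
                                     max_new_tokens=max_new_tokens,
                                     do_sample=False)
        # Return only the newly generated tokens, not the echoed prompt.
        new_tokens = output[0][inputs["input_ids"].shape[1]:]
        return self.tokenizer.decode(new_tokens, skip_special_tokens=True)
```

And a hedged sketch of the second option, an ATN-side greedy decoder loop, for the case where the backbone is loaded without a usable generate path. It assumes the underlying HF model is causal and exposes .logits:

```python
# Hedged sketch of a manual greedy decode loop on the ATN side.
import torch

@torch.no_grad()
def greedy_decode(model, tokenizer, prompt: str, max_new_tokens: int = 128) -> str:
    ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    for _ in range(max_new_tokens):
        logits = model(input_ids=ids).logits             # [1, seq_len, vocab]
        next_id = logits[0, -1].argmax().view(1, 1)      # most likely next token
        ids = torch.cat([ids, next_id], dim=1)
        if (tokenizer.eos_token_id is not None
                and next_id.item() == tokenizer.eos_token_id):
            break                                        # stop at end-of-sequence
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

Either way, the choice mostly trades HF's built-in decoding features (sampling, beam search, stopping criteria) against full ATN-side control over the token loop.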

Acceptance

  • A non-orchestrator cognitive agent can be created with provider=local and model=<one of the catalog entries>, and can complete a turn end-to-end.
  • The LLM-backbone path remains usable for training at the same time (shared model load, no re-download; a minimal caching sketch follows this list).
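
One possible way to satisfy the shared-load requirement is a process-wide cache so the training path and the provider reuse the same weights; `get_backbone` below is a hypothetical helper, not an existing API:

```python
# Hedged sketch: get_backbone is hypothetical, not an existing API.
from functools import lru_cache
from transformers import AutoModelForCausalLM

@lru_cache(maxsize=None)
def get_backbone(model_name: str):
    """Load each backbone at most once per process; all callers share it."""
    return AutoModelForCausalLM.from_pretrained(model_name)
```

Keying the cache on the model name means both call sites get the same in-memory instance within a process, so nothing is re-downloaded or duplicated in memory.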


Labels

  • track:agent (ATN runtime, providers, orchestrator, bridges)
  • track:training (JEPA, LLM backbone, FedAvg, native model)
  • type:feature (New capability)
