
Wire native model serving into agent inference path #11

@EightRice


Goal

Let an agent execute a turn against the locally served native model (the LLM backbone with its trained projection head, optionally VL-JEPA) instead of always routing through the Claude / GPT / Gemini bridges.

Current state

  • The LLM-backbone path (nodes/common/llm_backbone.py) downloads a frozen HF model and trains a small projection head plus JEPA predictor on top; today it is used for training only (a minimal sketch of this shape follows the list).
  • The agent framework (atn/providers/) supports Claude Max, Codex Max, Anthropic, OpenAI, Gemini, and DeepSeek, plus custom OpenAI-compatible endpoints. There is no provider that runs the local backbone.
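
For orientation, a hedged sketch of the frozen-backbone shape described above. The class and field names (`FrozenBackbone`, `proj_head`) are assumptions, the JEPA predictor is omitted for brevity, and nodes/common/llm_backbone.py may be structured differently:

```python
# Hedged sketch, not the real module: names are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class FrozenBackbone(nn.Module):
    """Frozen HF encoder with a small trainable projection head on top."""

    def __init__(self, model_name: str, proj_dim: int = 512):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.encoder = AutoModel.from_pretrained(model_name)
        for p in self.encoder.parameters():
            p.requires_grad = False          # backbone stays frozen
        hidden = self.encoder.config.hidden_size
        self.proj_head = nn.Linear(hidden, proj_dim)  # the only trained part here

    def forward(self, texts: list[str]) -> torch.Tensor:
        batch = self.tokenizer(texts, return_tensors="pt",
                               padding=True, truncation=True)
        with torch.no_grad():                # no gradients through the encoder
            hidden = self.encoder(**batch).last_hidden_state
        pooled = hidden.mean(dim=1)          # simple mean pooling
        return self.proj_head(pooled)        # gradients flow to the head only
```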

Scope

  • Add atn/providers/local_backbone.py implementing the standard Provider interface, using the loaded backbone for forward passes (see the provider sketch after this list).
  • Decide how generation works: the backbone is encoder-shaped today, so generation requires either the LLM's native generate path or an ATN-side decoder loop (a decoder-loop sketch also follows the list).
  • Register it as a provider option (e.g. local) discoverable via get_available_models.
  • Document the resource expectations (VRAM/RAM/disk) for each backbone choice.
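
A hedged sketch of the proposed provider using the LLM's native generate path. The Provider base class, the method name `complete`, and `LOCAL_CATALOG` are assumptions, since the actual atn/providers/ interface is not shown in this issue; match them to the real interface when implementing:

```python
# Hedged sketch: class shape and method names are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

LOCAL_CATALOG = ["local-backbone-default"]   # hypothetical catalog entry

class LocalBackboneProvider:
    """Serves agent turns from the locally loaded backbone."""

    name = "local"

    def __init__(self, model_name: str):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModelForCausalLM.from_pretrained(model_name)
        self.model.eval()

    @staticmethod
    def get_available_models() -> list[str]:
        return LOCAL_CATALOG

    @torch.no_grad()
    def complete(self, prompt: str, max_new_tokens: int = 512) -> str:
        inputs = self.tokenizer(prompt, return_tensors="pt")
        output = self.model.generate(**inputs,
                                     max_new_tokens=max_new_tokens,
                                     do_sample=False)
        # Return only the newly generated tokens, not the echoed prompt.
        new_tokens = output[0][inputs["input_ids"].shape[1]:]
        return self.tokenizer.decode(new_tokens, skip_special_tokens=True)
```

And a hedged sketch of the second option, an ATN-side greedy decoder loop, for the case where the backbone is loaded without a usable generate path. It assumes the underlying HF model is causal and exposes .logits:

```python
# Hedged sketch of a manual greedy decode loop on the ATN side.
import torch

@torch.no_grad()
def greedy_decode(model, tokenizer, prompt: str, max_new_tokens: int = 128) -> str:
    ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    for _ in range(max_new_tokens):
        logits = model(input_ids=ids).logits             # [1, seq_len, vocab]
        next_id = logits[0, -1].argmax().view(1, 1)      # most likely next token
        ids = torch.cat([ids, next_id], dim=1)
        if (tokenizer.eos_token_id is not None
                and next_id.item() == tokenizer.eos_token_id):
            break                                        # stop at end-of-sequence
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

Either way, the choice mostly trades HF's built-in decoding features (sampling, beam search, stopping criteria) against full ATN-side control over the token loop.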

Acceptance

  • A non-orchestrator cognitive agent can be created with provider=local and model=<one of the catalog entries>, and can complete a turn end-to-end.
  • The LLM-backbone path remains usable for training at the same time (shared model load, no re-download; a minimal caching sketch follows this list).
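
One possible way to satisfy the shared-load requirement is a process-wide cache so the training path and the provider reuse the same weights; `get_backbone` below is a hypothetical helper, not an existing API:

```python
# Hedged sketch: get_backbone is hypothetical, not an existing API.
from functools import lru_cache
from transformers import AutoModelForCausalLM

@lru_cache(maxsize=None)
def get_backbone(model_name: str):
    """Load each backbone at most once per process; all callers share it."""
    return AutoModelForCausalLM.from_pretrained(model_name)
```

Keying the cache on the model name means both call sites get the same in-memory instance within a process, so nothing is re-downloaded or duplicated in memory.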


Labels

  • track:agent (ATN runtime, providers, orchestrator, bridges)
  • track:training (JEPA, LLM backbone, FedAvg, native model)
  • type:feature (New capability)
