plexusone/omniagent


OmniAgent


Your AI representative across communication channels.

OmniAgent is a personal AI assistant that routes messages across multiple communication platforms, processes them via an AI agent, and responds on your behalf.

Features

  • 💬 Multi-Channel Support - Telegram, Discord, Slack, WhatsApp, and more
  • 🤖 AI-Powered Responses - Backed by omnillm (Claude, GPT, Gemini, etc.)
  • 🎤 Voice Notes - Transcribe incoming voice, respond with synthesized speech via OmniVoice
  • 🧩 Skills System - Markdown skills (OpenClaw compatible) and compiled Go skills
  • 💾 Persistent Sessions - Conversation history with SQLite storage via omnistorage-core
  • Scheduled Jobs - Cron expressions, intervals, and one-time job scheduling
  • 🔒 Secure Sandboxing - WASM and Docker isolation with GPU passthrough
  • 🌐 Browser Automation - Built-in browser control with dialog handling via Rod
  • 🔌 WebSocket Gateway - Real-time control plane with tools RPC endpoint
  • 📊 Observability - Integrated tracing via omniobserve
  • 🎭 Agent Profiles - Bootstrap profiles and lean mode for resource optimization
  • 🛡️ Access Policies - Per-sender tool access control and channel conformance
  • 🔐 Vault Credentials - Secure credential storage via 1Password, Bitwarden, Keeper

Installation

go install github.com/plexusone/omniagent/cmd/omniagent@latest

Quick Start

WhatsApp + OpenAI

The fastest way to get started is with WhatsApp and OpenAI:

# Set your OpenAI API key
export OPENAI_API_KEY="sk-..."

# Run with WhatsApp enabled
OMNIAGENT_AGENT_PROVIDER=openai \
OMNIAGENT_AGENT_MODEL=gpt-4o \
WHATSAPP_ENABLED=true \
omniagent gateway run

A QR code will appear in your terminal. Scan it with WhatsApp (Settings -> Linked Devices -> Link a Device) to connect.

Configuration File

For more control, create a configuration file:

# omniagent.yaml
gateway:
  address: "127.0.0.1:18789"

agent:
  provider: openai          # or: anthropic, gemini
  model: gpt-4o             # or: claude-sonnet-4-20250514, gemini-2.0-flash
  api_key: ${OPENAI_API_KEY}
  system_prompt: "You are OmniAgent, responding on behalf of the user."

channels:
  whatsapp:
    enabled: true
    db_path: "whatsapp.db"  # Session storage

  telegram:
    enabled: false
    token: ${TELEGRAM_BOT_TOKEN}

  discord:
    enabled: false
    token: ${DISCORD_BOT_TOKEN}

  twilio_sms:
    enabled: false
    account_sid: ${TWILIO_ACCOUNT_SID}
    auth_token: ${TWILIO_AUTH_TOKEN}
    phone_number: ${TWILIO_PHONE_NUMBER}
    webhook_path: /webhook/twilio/sms

voice:
  enabled: true
  response_mode: auto        # auto, always, never
  stt:
    provider: deepgram
    model: nova-2
  tts:
    provider: deepgram
    model: aura-asteria-en
    voice_id: aura-asteria-en

skills:
  enabled: true
  paths:                     # Additional skill directories
    - ~/.omniagent/skills
  max_injected: 20           # Max skills to inject into prompt

Run with the config file:

omniagent gateway run --config omniagent.yaml
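
The `${VAR}` placeholders in the configuration file read as environment-variable references. How OmniAgent expands them is not shown here; the sketch below only illustrates the idea with the standard library's `os.Expand`. The function name `expandPlaceholders` is hypothetical and not part of OmniAgent.

```go
package main

import (
	"fmt"
	"os"
)

// expandPlaceholders replaces ${VAR} references in a config value with
// whatever lookup returns for VAR. This is an illustrative helper, not
// OmniAgent's own loader.
func expandPlaceholders(value string, lookup func(string) string) string {
	return os.Expand(value, lookup)
}

func main() {
	os.Setenv("OPENAI_API_KEY", "sk-example")
	fmt.Println(expandPlaceholders("api_key: ${OPENAI_API_KEY}", os.Getenv))
}
```

Note that `os.Expand` also expands the bare `$VAR` form, and an unset variable becomes an empty string, so a real loader would probably want to report missing variables instead of silently continuing.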

Skills

OmniAgent supports skills compatible with the OpenClaw SKILL.md format. Skills extend the agent's capabilities by injecting domain-specific instructions into the system prompt.

Managing Skills

# List all discovered skills
omniagent skills list

# Show details for a specific skill
omniagent skills info sonoscli

# Check requirements for all skills
omniagent skills check

Skill Format

Skills are defined in SKILL.md files with YAML frontmatter:

---
name: weather
description: Get weather forecasts
metadata:
  emoji: "🌤️"
  requires:
    bins: ["curl"]
  install:
    - name: curl
      brew: curl
      apt: curl
---

# Weather Skill

You can check the weather using the `curl` command...
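
To make the file layout concrete, here is a minimal sketch of separating the YAML frontmatter from the Markdown body. It assumes the file starts with a `---` line and that the frontmatter is closed by another `---` line; `splitFrontmatter` is a hypothetical helper, and parsing the YAML itself is left to a YAML library.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontmatter separates a SKILL.md document into its YAML frontmatter
// and Markdown body. ok is false when no complete frontmatter is found.
func splitFrontmatter(doc string) (front, body string, ok bool) {
	rest, found := strings.CutPrefix(doc, "---\n")
	if !found {
		return "", doc, false
	}
	front, body, found = strings.Cut(rest, "\n---\n")
	if !found {
		return "", doc, false
	}
	// Drop the blank lines that usually follow the closing delimiter.
	return front, strings.TrimLeft(body, "\n"), true
}

func main() {
	doc := "---\nname: weather\ndescription: Get weather forecasts\n---\n\n# Weather Skill\n"
	front, body, ok := splitFrontmatter(doc)
	fmt.Printf("ok=%v\nfrontmatter:\n%s\nbody:\n%s", ok, front, body)
}
```

`strings.CutPrefix` requires Go 1.20 or later.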

Skill Discovery

Skills are discovered from:

  1. Built-in skills directory
  2. ~/.omniagent/skills/
  3. Custom paths via skills.paths config

Skills with missing requirements (binaries, env vars) are automatically skipped.
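
A requirement check of this kind could look roughly like the following. This is a sketch under assumptions, not OmniAgent's implementation: `missingRequirements` and the `WEATHER_API_KEY` variable are hypothetical, and only `exec.LookPath` and `os.LookupEnv` from the standard library are used.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// missingRequirements returns the required binaries and environment
// variables that are not available on this machine.
func missingRequirements(bins, envs []string) []string {
	var missing []string
	for _, bin := range bins {
		if _, err := exec.LookPath(bin); err != nil {
			missing = append(missing, "bin:"+bin)
		}
	}
	for _, name := range envs {
		if _, ok := os.LookupEnv(name); !ok {
			missing = append(missing, "env:"+name)
		}
	}
	return missing
}

func main() {
	// WEATHER_API_KEY is a made-up requirement for illustration.
	missing := missingRequirements([]string{"curl"}, []string{"WEATHER_API_KEY"})
	if len(missing) > 0 {
		fmt.Println("skipping skill, missing:", missing)
		return
	}
	fmt.Println("skill is eligible")
}
```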

Compiled Skills

For better performance and type safety, register Go functions as LLM tools:

import (
    "context"
    "fmt"

    "github.com/plexusone/omniagent/agent"
    "github.com/plexusone/omniagent/skills/compiled"
)

// WeatherSkill is a compiled skill that exposes a single tool.
type WeatherSkill struct{}

func (s *WeatherSkill) Name() string        { return "weather" }
func (s *WeatherSkill) Description() string { return "Weather forecasts" }
func (s *WeatherSkill) Tools() []compiled.Tool {
    return []compiled.Tool{{
        Name:        "get_weather",
        Description: "Get weather for a location",
        Parameters: map[string]compiled.Parameter{
            "location": {Type: "string", Required: true},
        },
        Handler: func(ctx context.Context, params map[string]any) (any, error) {
            // Check the type assertion so a malformed call cannot panic.
            location, ok := params["location"].(string)
            if !ok {
                return nil, fmt.Errorf("location must be a string")
            }
            // fetchWeather is your own lookup function (not shown).
            return fetchWeather(location)
        },
    }}
}
func (s *WeatherSkill) Init(ctx context.Context) error { return nil }
func (s *WeatherSkill) Close() error                   { return nil }

// Register with agent
a, err := agent.New(config, agent.WithCompiledSkill(&WeatherSkill{}))

Remote Skills

Remote skills connect to external services and expose their capabilities as agent tools.

MCP Skills - Spawn external MCP servers and expose their tools:

import (
    "os"

    "github.com/plexusone/omniagent/agent"
    "github.com/plexusone/omniagent/skills/remote/mcp"
)

// The result is named "a" so it does not shadow the agent package.
a, err := agent.New(config,
    agent.WithMCPSkill(mcp.Config{
        Name:    "github",
        Command: []string{"npx", "-y", "@modelcontextprotocol/server-github"},
        Env: map[string]string{
            "GITHUB_TOKEN": os.Getenv("GITHUB_TOKEN"),
        },
    }),
)

OpenAPI Skills - Parse OpenAPI 3.x specs and expose operations as tools:

import (
    "os"

    "github.com/plexusone/omniagent/agent"
    "github.com/plexusone/omniagent/skills/remote/openapi"
)

// The result is named "a" so it does not shadow the agent package.
a, err := agent.New(config,
    agent.WithOpenAPISkill(openapi.Config{
        Name:    "petstore",
        SpecURL: "https://petstore3.swagger.io/api/v3/openapi.json",
        Auth: openapi.AuthConfig{
            Type:  openapi.AuthBearer,
            Token: os.Getenv("API_TOKEN"),
        },
    }),
)

See the Skills Guide for configuration options.

Sessions

OmniAgent supports persistent conversation sessions:

import (
    "github.com/plexusone/omniagent/agent"
    "github.com/plexusone/omnistorage-core/kvs/backend/sqlite"
)

// Create storage backend
backend, _ := sqlite.New(sqlite.Config{Path: "omniagent.db"})

// Create agent with sessions
a, _ := agent.New(config,
    agent.WithSessionsFromStorage(backend),
)

// Process with conversation history
response1, _ := a.ProcessWithSession(ctx, "user-123", "My name is Alice")
response2, _ := a.ProcessWithSession(ctx, "user-123", "What's my name?")
// Agent remembers: "Your name is Alice"
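
Conceptually, a session is a message history keyed by a session ID such as "user-123". The sketch below shows that idea with an in-memory map; it is not the SQLite-backed store OmniAgent uses, and `sessionStore`, `message`, and `appendMessage` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// message is one turn in a conversation.
type message struct {
	Role    string
	Content string
}

// sessionStore keeps conversation history per session ID in memory.
type sessionStore struct {
	mu      sync.Mutex
	history map[string][]message
}

func newSessionStore() *sessionStore {
	return &sessionStore{history: make(map[string][]message)}
}

// appendMessage records a message and returns a copy of the session's
// history so far, which would be sent to the LLM as context.
func (s *sessionStore) appendMessage(sessionID string, m message) []message {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.history[sessionID] = append(s.history[sessionID], m)
	return append([]message(nil), s.history[sessionID]...)
}

func main() {
	store := newSessionStore()
	store.appendMessage("user-123", message{Role: "user", Content: "My name is Alice"})
	history := store.appendMessage("user-123", message{Role: "user", Content: "What's my name?"})
	fmt.Println(len(history), "messages available as context")
}
```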

See the Sessions Guide for details.

Scheduled Jobs

OmniAgent supports scheduled job execution via the cron package:

import (
    "github.com/plexusone/omniagent/agent"
    "github.com/plexusone/omnistorage-core/kvs/backend/sqlite"
)

// Create agent with cron support
backend, _ := sqlite.New(sqlite.Config{Path: "omniagent.db"})
a, _ := agent.New(config,
    agent.WithSessionsFromStorage(backend),
    agent.WithCronScheduler(),
)

The LLM can then create scheduled jobs via tool calls:

| Tool | Description |
| --- | --- |
| cron_create | Create a new scheduled job |
| cron_list | List all jobs (filterable by status) |
| cron_get | Get job details |
| cron_delete | Delete a job |
| cron_enable | Enable a disabled job |
| cron_disable | Disable without deleting |
| cron_trigger | Run job immediately |

Schedule types:

  • Cron expressions: 0 0 9 * * * (six fields with a leading seconds field; this one runs daily at 09:00:00)
  • Intervals: 1h, 30m, 24h
  • One-time: RFC3339 timestamp for single execution
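
The three schedule types can be told apart from the string alone. The following sketch shows one possible order of checks using only the standard library; `classifySchedule` is a hypothetical helper, and it only counts cron fields rather than validating them.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// classifySchedule guesses which schedule type a spec string represents.
func classifySchedule(spec string) string {
	if _, err := time.Parse(time.RFC3339, spec); err == nil {
		return "one-time"
	}
	if d, err := time.ParseDuration(spec); err == nil && d > 0 {
		return "interval"
	}
	// Six whitespace-separated fields: seconds, minutes, hours,
	// day of month, month, day of week.
	if len(strings.Fields(spec)) == 6 {
		return "cron"
	}
	return "invalid"
}

func main() {
	for _, spec := range []string{"0 0 9 * * *", "30m", "2030-01-02T15:04:05Z", "tomorrow"} {
		fmt.Printf("%-22q %s\n", spec, classifySchedule(spec))
	}
}
```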

Action types:

  • send_message - Send a message to a session
  • call_webhook - Make an HTTP request
  • call_tool - Invoke a registered tool

See the Cron Guide for details.

Sandboxing

OmniAgent provides layered security for tool execution:

App-Level Permissions

Capability-based permissions control what tools can do:

  • fs_read - Read files from allowed paths
  • fs_write - Write files to allowed paths
  • net_http - Make HTTP requests to allowed hosts
  • exec_run - Execute allowed commands
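
As an illustration of what "allowed paths" means for fs_read and fs_write, the sketch below checks whether a requested path stays inside one of the allowed roots. It is an assumption about the general technique, not OmniAgent's code; `pathAllowed` is a hypothetical name, and the check does not resolve symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// pathAllowed reports whether path lies inside one of the allowed roots
// after lexical cleaning (so "/tmp/data/../secret" is rejected).
func pathAllowed(path string, allowedRoots []string) bool {
	clean := filepath.Clean(path)
	for _, root := range allowedRoots {
		rel, err := filepath.Rel(filepath.Clean(root), clean)
		if err != nil {
			continue
		}
		escapes := rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
		if !escapes {
			return true
		}
	}
	return false
}

func main() {
	roots := []string{"/tmp/data"}
	for _, p := range []string{"/tmp/data/file.txt", "/tmp/data/../secret", "/tmp/database"} {
		fmt.Printf("%-24s allowed=%v\n", p, pathAllowed(p, roots))
	}
}
```

Comparing with `filepath.Rel` instead of a plain string prefix avoids accepting sibling directories such as /tmp/database for the root /tmp/data.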

Docker Isolation

For OS-level isolation, tools can run inside Docker containers:

// The result is named "sb" so it does not shadow the sandbox package.
sb, _ := sandbox.NewDockerSandbox(ctx, sandbox.DockerConfig{
    Image:       "alpine:latest",
    NetworkMode: "none",           // No network access
    CapDrop:     []string{"ALL"},  // Drop all capabilities
    Mounts: []sandbox.DockerMount{
        {HostPath: "/tmp/data", ContainerPath: "/data", ReadOnly: true},
    },
}, &appConfig)

result, _ := sb.Run(ctx, "cat", []string{"/data/file.txt"})

GPU Passthrough

For GPU-accelerated workloads, enable NVIDIA GPU passthrough:

// The result is named "gpuSandbox" so it does not shadow the sandbox package.
gpuSandbox, _ := sandbox.NewDockerSandbox(ctx, sandbox.DockerConfig{
    Image: "nvidia/cuda:12.0-base",
    GPU: &sandbox.GPUConfig{
        Enabled:      true,
        DeviceIDs:    []string{"0"},
        Capabilities: []string{"compute", "utility"},
    },
}, &appConfig)

WASM Runtime

For lightweight isolation, tools can run in a WASM sandbox (wazero):

runtime, _ := sandbox.NewRuntime(ctx, sandbox.Config{
    Capabilities:  []sandbox.Capability{sandbox.CapFSRead},
    MemoryLimitMB: 16,
    Timeout:       30 * time.Second,
    AllowedPaths:  []string{"/tmp/data"},
})

Agent Profiles

Profiles customize agent behavior for different use cases:

import "github.com/plexusone/omniagent/agent/profiles"

profile := &profiles.BootstrapProfile{
    Name:               "customer-support",
    SystemPromptPrefix: "You are a customer support agent.\n",
    AllowedTools:       []string{"search_kb", "create_ticket"},
    DeniedTools:        []string{"shell", "browser"},
}

a, _ := agent.New(config, agent.WithProfile(profile))

Lean Mode

Optimize for constrained environments:

leanMode := profiles.NewLeanMode(profiles.LeanLevelModerate)
a, _ := agent.New(config, agent.WithLeanMode(leanMode))

| Level | Memory Reduction | Use Case |
| --- | --- | --- |
| Off | None | Default operation |
| Light | ~15% | Slightly constrained |
| Moderate | ~35% | Mobile/embedded |
| Aggressive | ~60% | Severely constrained |

See the Agent Profiles Guide for details.

Access Policies

Tool Policies

Control which tools are available per sender:

import "github.com/plexusone/omniagent/tools/policy"

manager := policy.NewManager()
manager.SetPolicy("guest", &policy.Policy{
    AllowedTools: []string{"search", "weather"},
    DeniedTools:  []string{"shell", "browser"},
    RateLimit: &policy.RateLimit{
        MaxCalls: 10,
        Window:   time.Minute,
    },
})
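
To show how such a policy might be evaluated, here is a standalone sketch of an allow/deny check combined with a fixed-window rate limit. It does not use the `policy` package and makes two assumptions that the text above does not confirm: a denied tool always loses to an allowed one, and the rate limit is a simple fixed window. All names are hypothetical, and the type is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// toolPolicy is a simplified stand-in for a per-sender tool policy.
type toolPolicy struct {
	allowed     map[string]bool
	denied      map[string]bool
	maxCalls    int
	window      time.Duration
	windowStart time.Time
	calls       int
}

func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return set
}

// guestPolicy mirrors the values of the "guest" example above.
func guestPolicy() *toolPolicy {
	return &toolPolicy{
		allowed:  toSet([]string{"search", "weather"}),
		denied:   toSet([]string{"shell", "browser"}),
		maxCalls: 10,
		window:   time.Minute,
	}
}

// permit reports whether a call to tool at time now passes the policy.
func (p *toolPolicy) permit(tool string, now time.Time) bool {
	if p.denied[tool] || !p.allowed[tool] {
		return false
	}
	if now.Sub(p.windowStart) >= p.window {
		p.windowStart, p.calls = now, 0 // start a new window
	}
	if p.calls >= p.maxCalls {
		return false
	}
	p.calls++
	return true
}

// countPermitted issues n calls spaced step apart and counts how many pass.
func countPermitted(p *toolPolicy, tool string, n int, step time.Duration) int {
	now := time.Unix(0, 0)
	permitted := 0
	for i := 0; i < n; i++ {
		if p.permit(tool, now) {
			permitted++
		}
		now = now.Add(step)
	}
	return permitted
}

func main() {
	fmt.Println(countPermitted(guestPolicy(), "search", 12, time.Second)) // 10: two calls hit the limit
	fmt.Println(countPermitted(guestPolicy(), "shell", 1, 0))             // 0: tool is denied
}
```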

Channel Policies

Validate messages against content rules:

import "github.com/plexusone/omniagent/channels/policy"

checker := policy.NewConformanceChecker(config)
checker.AddRule(policy.ConformanceRule{
    Name:      "rate-limit",
    Action:    policy.ActionRateLimit,
    RateLimit: &policy.RateLimit{MaxMessages: 60, Window: time.Minute},
})

See the Access Policies Guide for details.

Environment Variables

| Variable | Description |
| --- | --- |
| OPENAI_API_KEY | OpenAI API key |
| ANTHROPIC_API_KEY | Anthropic API key |
| GEMINI_API_KEY | Google Gemini API key |
| OMNIAGENT_AGENT_PROVIDER | LLM provider: openai, anthropic, gemini |
| OMNIAGENT_AGENT_MODEL | Model name (e.g., gpt-4o, claude-sonnet-4-20250514) |
| WHATSAPP_ENABLED | Set to true to enable WhatsApp |
| WHATSAPP_DB_PATH | WhatsApp session storage path |
| TELEGRAM_BOT_TOKEN | Telegram bot token (auto-enables Telegram) |
| DISCORD_BOT_TOKEN | Discord bot token (auto-enables Discord) |
| TWILIO_ACCOUNT_SID | Twilio Account SID (auto-enables SMS) |
| TWILIO_AUTH_TOKEN | Twilio Auth Token |
| TWILIO_PHONE_NUMBER | Twilio phone number in E.164 format |
| TWILIO_WEBHOOK_PATH | SMS webhook path (default: /webhook/twilio/sms) |
| SERPER_API_KEY | Serper API key for web search |
| SERPAPI_API_KEY | SerpAPI key for web search (alternative) |
| DEEPGRAM_API_KEY | Deepgram API key for voice STT/TTS |
| OMNIAGENT_VOICE_ENABLED | Set to true to enable voice processing |
| OMNIAGENT_VOICE_RESPONSE_MODE | Voice response mode: auto, always, never |
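
The "auto-enables" notes mean a channel becomes active as soon as its credential variable is set, while WhatsApp needs an explicit flag. A rough sketch of that rule follows; `enabledChannels` is a hypothetical helper, and treating only the literal value "true" as enabling WhatsApp is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// enabledChannels derives the active channels from environment lookups,
// following the "auto-enables" notes in the variable list.
func enabledChannels(lookup func(string) (string, bool)) []string {
	var channels []string
	if v, _ := lookup("WHATSAPP_ENABLED"); v == "true" {
		channels = append(channels, "whatsapp")
	}
	autoEnabled := []struct{ env, channel string }{
		{"TELEGRAM_BOT_TOKEN", "telegram"},
		{"DISCORD_BOT_TOKEN", "discord"},
		{"TWILIO_ACCOUNT_SID", "twilio_sms"},
	}
	for _, c := range autoEnabled {
		if v, ok := lookup(c.env); ok && v != "" {
			channels = append(channels, c.channel)
		}
	}
	return channels
}

func main() {
	fmt.Println("enabled channels:", enabledChannels(os.LookupEnv))
}
```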

Vault-Backed Credentials

OmniAgent supports storing credentials in password managers via omnivault and omnitoken.

Supported Vault Providers

| Provider | URI Scheme | Environment Variable |
| --- | --- | --- |
| 1Password | op:// | OP_SERVICE_ACCOUNT_TOKEN |
| Bitwarden | bw:// | BW_ACCESS_TOKEN, BW_ORGANIZATION_ID |
| Keeper | keeper:// | KSM_TOKEN or KSM_CONFIG |
| File | file:// | - |
| Environment | env:// | - |

Static Credentials

API keys and tokens can be stored in vaults instead of config files:

# omniagent.yaml
agent:
  provider: anthropic
  model: claude-sonnet-4-20250514
  api_key: "op://MyVault/anthropic/api-key"  # Resolved from 1Password

channels:
  telegram:
    enabled: true
    token: "bw://org-id/telegram-bot-token"  # Resolved from Bitwarden

  discord:
    enabled: true
    token: "keeper://Discord Bot/token"      # Resolved from Keeper

voice:
  enabled: true
  stt:
    provider: deepgram
    api_key: "op://MyVault/deepgram/api-key"
  tts:
    provider: deepgram
    api_key: "op://MyVault/deepgram/api-key"

Credentials are resolved once at startup. Plain string values still work for development.
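
The choice between a vault lookup and a plain string can be made from the URI scheme alone. The sketch below maps a configured value to the provider that would resolve it; it is an illustration under assumptions, not omnivault's resolver, and `vaultProvider` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// vaultProvider returns the provider that would resolve a configured
// credential value, or "plain" when the value is used as-is.
func vaultProvider(value string) string {
	scheme, _, found := strings.Cut(value, "://")
	if !found {
		return "plain"
	}
	switch scheme {
	case "op":
		return "1Password"
	case "bw":
		return "Bitwarden"
	case "keeper":
		return "Keeper"
	case "file":
		return "File"
	case "env":
		return "Environment"
	}
	return "plain"
}

func main() {
	for _, v := range []string{
		"op://MyVault/anthropic/api-key",
		"keeper://Discord Bot/token",
		"sk-plain-development-key",
	} {
		fmt.Printf("%-34q -> %s\n", v, vaultProvider(v))
	}
}
```

The sketch splits on "://" with `strings.Cut` rather than using `net/url`, because references such as "keeper://Discord Bot/token" contain spaces that `url.Parse` rejects in the host part.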

OAuth Token Management

For services requiring OAuth token refresh (Google, Zoom, RingCentral), use the tokens configuration:

# omniagent.yaml
tokens:
  vault_uri: "op://MyVault"
  services:
    google:
      credentials_name: "google-service-account"
      scopes:
        - "https://www.googleapis.com/auth/calendar"
    zoom:
      credentials_name: "zoom-oauth"
    ringcentral:
      credentials_name: "ringcentral-oauth"

The token manager handles:

  1. In-memory token caching
  2. Automatic refresh when tokens expire
  3. Vault coordination for multi-process deployments
  4. Refresh token persistence
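
The first two points (in-memory caching and refresh on expiry) follow a common pattern that can be sketched with the standard library. This is not omnitoken's API: `tokenCache`, `fakeClock`, and `newDemoCache` are hypothetical, the 60-second lifetime is arbitrary, and vault coordination and refresh-token persistence are left out.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// token is an access token with its expiry time.
type token struct {
	Value  string
	Expiry time.Time
}

// tokenCache returns the cached token until it expires, then refreshes it.
type tokenCache struct {
	mu      sync.Mutex
	current token
	refresh func() (token, error)
	now     func() time.Time
}

func (c *tokenCache) get() (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.current.Value != "" && c.now().Before(c.current.Expiry) {
		return c.current.Value, nil // still valid, serve from memory
	}
	t, err := c.refresh()
	if err != nil {
		return "", err
	}
	c.current = t
	return t.Value, nil
}

// fakeClock lets the example advance time without sleeping.
type fakeClock struct{ t time.Time }

func (f *fakeClock) now() time.Time { return f.t }
func (f *fakeClock) advance(seconds int) {
	f.t = f.t.Add(time.Duration(seconds) * time.Second)
}

// newDemoCache builds a cache whose refresh function issues numbered
// tokens valid for 60 seconds and counts how often it is called.
func newDemoCache(clock *fakeClock, calls *int) *tokenCache {
	return &tokenCache{
		now: clock.now,
		refresh: func() (token, error) {
			*calls++
			return token{
				Value:  fmt.Sprintf("token-%d", *calls),
				Expiry: clock.now().Add(60 * time.Second),
			}, nil
		},
	}
}

func main() {
	clock, calls := &fakeClock{}, 0
	cache := newDemoCache(clock, &calls)
	first, _ := cache.get()
	second, _ := cache.get() // served from the cache
	clock.advance(61)
	third, _ := cache.get() // expired, triggers a refresh
	fmt.Println(first, second, third, "refreshes:", calls)
}
```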

Vault Environment Variables

| Variable | Provider | Description |
| --- | --- | --- |
| OP_SERVICE_ACCOUNT_TOKEN | 1Password | Service account token (starts with ops_) |
| BW_ACCESS_TOKEN | Bitwarden | Access token |
| BW_ORGANIZATION_ID | Bitwarden | Organization ID |
| BW_API_URL | Bitwarden | Custom API URL (self-hosted) |
| BW_IDENTITY_URL | Bitwarden | Custom Identity URL (self-hosted) |
| KSM_TOKEN | Keeper | One-time token (format: REGION:TOKEN) |
| KSM_CONFIG | Keeper | Base64-encoded config JSON |
| KSM_CONFIG_FILE | Keeper | Path to config file |

CLI Commands

# Gateway
omniagent gateway run      # Start the gateway server

# Skills
omniagent skills list      # List all discovered skills
omniagent skills info NAME # Show skill details
omniagent skills check     # Validate skill requirements

# Channels
omniagent channels list    # List registered channels
omniagent channels status  # Show channel connection status

# Config
omniagent config show      # Display current configuration

# Version
omniagent version          # Show version information

Architecture

+-------------------------------------------------------------+
|                     Messaging Channels                      |
|     Telegram  |  Discord  |  Slack  |  WhatsApp  |  ...     |
+---------------------------+---------------------------------+
                            |
+---------------------------v---------------------------------+
|              Gateway (WebSocket Control Plane)              |
|              ws://127.0.0.1:18789                           |
+---------------------------+---------------------------------+
                            |
+---------------------------v---------------------------------+
|                      Agent Runtime                          |
|  +------------------+  +------------------+                 |
|  |    Skills        |  |    Sandbox       |                 |
|  |  (SKILL.md)      |  |  (WASM/Docker)   |                 |
|  +------------------+  +------------------+                 |
|  - omnillm (LLM providers)                                  |
|  - omnivoice (STT/TTS)                                      |
|  - omniobserve (tracing)                                    |
|  - Tools (browser, shell, http)                             |
+-------------------------------------------------------------+

Configuration Reference

Gateway

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| gateway.address | string | 127.0.0.1:18789 | WebSocket server address |
| gateway.read_timeout | duration | 30s | Read timeout |
| gateway.write_timeout | duration | 30s | Write timeout |
| gateway.ping_interval | duration | 30s | WebSocket ping interval |

Agent

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| agent.provider | string | anthropic | LLM provider |
| agent.model | string | claude-sonnet-4-20250514 | Model name |
| agent.api_key | string | - | API key (or use env var) |
| agent.temperature | float | 0.7 | Sampling temperature |
| agent.max_tokens | int | 4096 | Max response tokens |
| agent.system_prompt | string | - | Custom system prompt |

Skills

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| skills.enabled | bool | true | Enable skill loading |
| skills.paths | []string | [] | Additional skill directories |
| skills.disabled | []string | [] | Skills to skip |
| skills.max_injected | int | 20 | Max skills in prompt |

Voice

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| voice.enabled | bool | false | Enable voice processing |
| voice.response_mode | string | auto | auto, always, never |
| voice.stt.provider | string | - | STT provider (e.g., deepgram) |
| voice.tts.provider | string | - | TTS provider (e.g., deepgram) |

Omni* Library Ecosystem

OmniAgent is built on a modular ecosystem of omni* libraries:

                              OmniAgent
                          (Agent Runtime)
    ┌────────┬────────┬────────┬────────┬────────┬────────┐
    ▼        ▼        ▼        ▼        ▼        ▼        ▼
omnichat  omnillm  omnivoice omniobserve omniserp omnistorage ...
    │         │         │                           │
    │    ┌────┴────┐ ┌──┴──┐              ┌─────────┴────────┐
    │    │         │ │     │              │                  │
    ▼    ▼         ▼ ▼     ▼              ▼                  ▼
      omnillm-core   omnivoice-core    omnistorage-core
                                       ├── /object (files)
                                       └── /kvs (sessions)
    │         │         │                           │
    └─────────┴─────────┴───────────────────────────┘
                        │
              Provider Modules
    ┌───────────────────┼───────────────────┐
    ▼                   ▼                   ▼
omni-aws           omni-google         omni-github
├── /omnillm       ├── /omnillm        └── /omnistorage
├── /omnistorage   └── /omnistorage
└── /omnivoice

See the Architecture Overview for detailed documentation.

Dependencies

Omni* Libraries

| Package | Purpose |
| --- | --- |
| omnichat | Unified messaging (WhatsApp, Telegram, Discord) |
| omnillm | Multi-provider LLM abstraction |
| omnivoice | Voice STT/TTS interfaces |
| omniobserve | LLM observability |
| omniserp | Web search via Serper/SerpAPI |
| omnistorage-core | Object and key-value storage |
| omnivault | Secure credential storage |
| omnitoken | OAuth token management |

Infrastructure

| Package | Purpose |
| --- | --- |
| wazero | WASM runtime for sandboxing |
| moby | Docker SDK for container isolation |
| Rod | Browser automation |
| gorilla/websocket | WebSocket server |

License

MIT License - see LICENSE for details.
