diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index 25b8091fa..baa4ad12d 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -44,7 +44,7 @@ To enhance LLM responses, LCore leverages **RAG (Retrieval-Augmented Generation)
 ```
 ┌─────────────────────────────────────────────────────────┐
 │ Client Applications │
-│ (Web UI, CLI, VS Code Extension, Mobile Apps, etc.) │
+│ (Web UI, CLI, VS Code Extension, Mobile Apps, etc.) │
 └────────────────────┬────────────────────────────────────┘
 │ REST/A2A/JSON-RPC
 ▼
@@ -150,14 +150,14 @@ LCore requires two main configuration files:
 **Authentication Providers:**
-| Provider | Use Case | Token Handling |
-|----------|----------|----------------|
-| **No Auth** | Development, testing | No token (empty string) |
-| **No Auth + Token** | Testing with token passthrough | Bearer token passed through |
-| **Kubernetes** | K8s service accounts | K8s service account token validated and forwarded |
-| **Red Hat SSO** | Red Hat environments | X-RH-Identity header (no separate token) |
-| **API Key** | API key authentication | API key from Authorization header |
-| **JWK/JWT** | JWT tokens | JWT validated and forwarded |
+| Provider | Use Case | Token Handling |
+|---------------------|--------------------------------|---------------------------------------------------|
+| **No Auth** | Development, testing | No token (empty string) |
+| **No Auth + Token** | Testing with token passthrough | Bearer token passed through |
+| **Kubernetes** | K8s service accounts | K8s service account token validated and forwarded |
+| **Red Hat SSO** | Red Hat environments | X-RH-Identity header (no separate token) |
+| **API Key** | API key authentication | API key from Authorization header |
+| **JWK/JWT** | JWT tokens | JWT validated and forwarded |
 **Authentication Result (AuthTuple):**
@@ -427,12 +427,12 @@ LCore uses a multi-database strategy to optimize for different data access patte
 The system uses **4 separate databases** for different purposes:
-| Database | Purpose | Technology | Size |
-|----------|---------|------------|------|
-| **User DB** | Conversation metadata | SQLAlchemy ORM | Small |
-| **Cache DB** | Full conversation transcripts | psycopg2/sqlite3 | Large |
-| **Quota DB** | Token usage and limits | psycopg2/sqlite3 | Medium |
-| **A2A DB** | Agent-to-agent protocol state | SQLAlchemy async | Small |
+| Database | Purpose | Technology | Size |
+|--------------|-------------------------------|------------------|--------|
+| **User DB** | Conversation metadata | SQLAlchemy ORM | Small |
+| **Cache DB** | Full conversation transcripts | psycopg2/sqlite3 | Large |
+| **Quota DB** | Token usage and limits | psycopg2/sqlite3 | Medium |
+| **A2A DB** | Agent-to-agent protocol state | SQLAlchemy async | Small |
 ### 4.2 Why Separate Databases?
diff --git a/docs/a2a_protocol.md b/docs/a2a_protocol.md
index 3cbd8e027..af6648d8f 100644
--- a/docs/a2a_protocol.md
+++ b/docs/a2a_protocol.md
@@ -354,14 +354,14 @@ A2A Request
 ### Task States
-| State | Description |
-|-------|-------------|
-| `submitted` | Task has been received and queued |
-| `working` | Task is being processed |
-| `completed` | Task finished successfully |
-| `failed` | Task failed with an error |
+| State | Description |
+|------------------|--------------------------------------------|
+| `submitted` | Task has been received and queued |
+| `working` | Task is being processed |
+| `completed` | Task finished successfully |
+| `failed` | Task failed with an error |
 | `input_required` | Agent needs additional input from the user |
-| `auth_required` | Authentication is required to continue |
+| `auth_required` | Authentication is required to continue |
 ### Multi-Turn Conversations
@@ -527,10 +527,10 @@ curl -X POST http://localhost:8090/a2a \
 A2A messages support an optional `metadata` field that can be used to pass additional parameters to control request routing and behavior. The following metadata fields are supported:
-| Field | Type | Description |
-|-------|------|-------------|
-| `model` | `string` | Specify the LLM model to use for this request (e.g., `"gpt-4"`, `"llama3.1"`) |
-| `provider` | `string` | Specify the LLM provider to use (e.g., `"openai"`, `"watsonx"`) |
+| Field | Type | Description |
+|--------------------|----------------|--------------------------------------------------------------------------------------------------------|
+| `model` | `string` | Specify the LLM model to use for this request (e.g., `"gpt-4"`, `"llama3.1"`) |
+| `provider` | `string` | Specify the LLM provider to use (e.g., `"openai"`, `"watsonx"`) |
 | `vector_store_ids` | `list[string]` | Specify which vector stores to query for RAG. If not provided, all available vector stores are queried |
 #### Example: Using Metadata
diff --git a/docs/byok_guide.md b/docs/byok_guide.md
index 86acc1c80..34a628fb1 100644
--- a/docs/byok_guide.md
+++ b/docs/byok_guide.md
@@ -191,15 +191,15 @@ byok_rag:
 **`byok_rag` field reference:**
-| Field | Required | Default | Description |
-|-------|----------|---------|-------------|
-| `rag_id` | Yes | — | Unique identifier for the knowledge source |
-| `rag_type` | No | `inline::faiss` | Vector store provider type |
-| `embedding_model` | No | `sentence-transformers/all-mpnet-base-v2` | Embedding model identifier or path |
-| `embedding_dimension` | No | `768` | Embedding vector dimensionality |
-| `vector_db_id` | Yes | — | Vector store ID generated by rag-content (e.g. `vs_8c94967b-81cc-4028-a294-9cfac6fd9ae2`) |
-| `db_path` | Yes | — | Path to the vector database file |
-| `score_multiplier` | No | `1.0` | Weight for Inline RAG ranking (values > 1.0 boost; < 1.0 reduce) |
+| Field | Required | Default | Description |
+|-----------------------|----------|-------------------------------------------|-------------------------------------------------------------------------------------------|
+| `rag_id` | Yes | — | Unique identifier for the knowledge source |
+| `rag_type` | No | `inline::faiss` | Vector store provider type |
+| `embedding_model` | No | `sentence-transformers/all-mpnet-base-v2` | Embedding model identifier or path |
+| `embedding_dimension` | No | `768` | Embedding vector dimensionality |
+| `vector_db_id` | Yes | — | Vector store ID generated by rag-content (e.g. `vs_8c94967b-81cc-4028-a294-9cfac6fd9ae2`) |
+| `db_path` | Yes | — | Path to the vector database file |
+| `score_multiplier` | No | `1.0` | Weight for Inline RAG ranking (values > 1.0 boost; < 1.0 reduce) |
 **Multiple knowledge sources:**
@@ -251,10 +251,10 @@ okp:
 Both modes can be enabled simultaneously. Choose based on your latency and control preferences:
-| Mode | When context is fetched | Tool call needed | score_multiplier |
-|------|------------------------|------------------|-----------------|
-| Inline RAG | With every query | No | Yes (BYOK only) |
-| Tool RAG | On LLM demand | Yes | No |
+| Mode | When context is fetched | Tool call needed | score_multiplier |
+|------------|-------------------------|------------------|------------------|
+| Inline RAG | With every query | No | Yes (BYOK only) |
+| Tool RAG | On LLM demand | Yes | No |
 > [!TIP]
 > A ready-to-use example combining BYOK and OKP is available at
diff --git a/docs/config.md b/docs/config.md
index 6a5001b9a..ca34b9cc2 100644
--- a/docs/config.md
+++ b/docs/config.md
@@ -23,10 +23,10 @@ Attributes:
 postgres: PostgreSQL database configuration for A2A state storage.
-| Field | Type | Description |
-|-------|------|-------------|
-| sqlite | | SQLite database configuration for A2A state storage. |
-| postgres | | PostgreSQL database configuration for A2A state storage. |
+| Field | Type | Description |
+|----------|------|----------------------------------------------------------|
+| sqlite | | SQLite database configuration for A2A state storage. |
+| postgres | | PostgreSQL database configuration for A2A state storage. |
 ## APIKeyTokenConfiguration
@@ -35,9 +35,9 @@ Attributes:
 API Key Token configuration.
-| Field | Type | Description |
-|-------|------|-------------|
-| api_key | string | |
+| Field | Type | Description |
+|---------|--------|-------------|
+| api_key | string | |
 ## AccessRule
@@ -46,10 +46,10 @@ API Key Token configuration.
 Rule defining what actions a role can perform.
-| Field | Type | Description |
-|-------|------|-------------|
-| role | string | Name of the role |
-| actions | array | Allowed actions for this role |
+| Field | Type | Description |
+|---------|--------|-------------------------------|
+| role | string | Name of the role |
+| actions | array | Allowed actions for this role |
 ## Action
@@ -72,10 +72,10 @@ Attributes:
 never: Tool names that never require approval (pre-approved).
-| Field | Type | Description |
-|-------|------|-------------|
+| Field | Type | Description |
+|--------|-------|-------------------------------------------------------|
 | always | array | List of tool names that always require human approval |
-| never | array | List of tool names that never require approval |
+| never | array | List of tool names that never require approval |
 ## ApprovalsConfiguration
@@ -90,10 +90,10 @@ Attributes:
 purposes before cleanup.
-| Field | Type | Description |
-|-------|------|-------------|
+| Field | Type | Description |
+|--------------------------|---------|-------------------------------------------------|
 | approval_timeout_seconds | integer | Seconds before pending approval requests expire |
-| approval_retention_days | integer | Days to retain decided approvals before cleanup |
+| approval_retention_days | integer | Days to retain decided approvals before cleanup |
 ## AuthenticationConfiguration
@@ -102,17 +102,17 @@ Attributes:
 Authentication configuration.
-| Field | Type | Description |
-|-------|------|-------------|
-| module | string | |
-| skip_tls_verification | boolean | |
+| Field | Type | Description |
+|------------------------|---------|------------------------------------------------------|
+| module | string | |
+| skip_tls_verification | boolean | |
 | skip_for_health_probes | boolean | Skip authorization for readiness and liveness probes |
-| skip_for_metrics | boolean | Skip authorization for the /metrics endpoint |
-| k8s_cluster_api | string | |
-| k8s_ca_cert_path | string | |
-| jwk_config | | |
-| api_key_config | | |
-| rh_identity_config | | |
+| skip_for_metrics | boolean | Skip authorization for the /metrics endpoint |
+| k8s_cluster_api | string | |
+| k8s_ca_cert_path | string | |
+| jwk_config | | |
+| api_key_config | | |
+| rh_identity_config | | |
 ## AuthorizationConfiguration
@@ -121,8 +121,8 @@ Authentication configuration.
 Authorization configuration.
-| Field | Type | Description |
-|-------|------|-------------|
+| Field | Type | Description |
+|--------------|-------|-------------------------------------|
 | access_rules | array | Rules for role-based access control |
@@ -132,12 +132,12 @@ Authorization configuration.
 Microsoft Entra ID authentication attributes for Azure.
-| Field | Type | Description |
-|-------|------|-------------|
-| tenant_id | string | |
-| client_id | string | |
-| client_secret | string | |
-| scope | string | Azure Cognitive Services scope for token requests. Override only if using a different Azure service. |
+| Field | Type | Description |
+|---------------|--------|------------------------------------------------------------------------------------------------------|
+| tenant_id | string | |
+| client_id | string | |
+| client_secret | string | |
+| scope | string | Azure Cognitive Services scope for token requests. Override only if using a different Azure service. |
 ## ByokRag
@@ -146,15 +146,15 @@ Microsoft Entra ID authentication attributes for Azure.
 BYOK (Bring Your Own Knowledge) RAG configuration.
-| Field | Type | Description |
-|-------|------|-------------|
-| rag_id | string | Unique RAG ID |
-| rag_type | string | Type of RAG database. |
-| embedding_model | string | Embedding model identification |
-| embedding_dimension | integer | Dimensionality of embedding vectors. |
-| vector_db_id | string | Vector database identification. |
-| db_path | string | Path to RAG database. |
-| score_multiplier | number | Multiplier applied to relevance scores from this vector store. Used to weight results when querying multiple knowledge sources. Values > 1 boost this store's results; values < 1 reduce them. |
+| Field | Type | Description |
+|---------------------|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| rag_id | string | Unique RAG ID |
+| rag_type | string | Type of RAG database. |
+| embedding_model | string | Embedding model identification |
+| embedding_dimension | integer | Dimensionality of embedding vectors. |
+| vector_db_id | string | Vector database identification. |
+| db_path | string | Path to RAG database. |
+| score_multiplier | number | Multiplier applied to relevance scores from this vector store. Used to weight results when querying multiple knowledge sources. Values > 1 boost this store's results; values < 1 reduce them. |
 ## CORSConfiguration
@@ -173,12 +173,12 @@ Useful resources:
 - [What is CORS?](https://dev.to/akshay_chauhan/what-is-cors-explained-8f1)
-| Field | Type | Description |
-|-------|------|-------------|
-| allow_origins | array | A list of origins allowed for cross-origin requests. An origin is the combination of protocol (http, https), domain (myapp.com, localhost, localhost.tiangolo.com), and port (80, 443, 8080). Use ['*'] to allow all origins. |
-| allow_credentials | boolean | Indicate that cookies should be supported for cross-origin requests |
-| allow_methods | array | A list of HTTP methods that should be allowed for cross-origin requests. You can use ['*'] to allow all standard methods. |
-| allow_headers | array | A list of HTTP request headers that should be supported for cross-origin requests. You can use ['*'] to allow all headers. The Accept, Accept-Language, Content-Language and Content-Type headers are always allowed for simple CORS requests. |
+| Field | Type | Description |
+|-------------------|---------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| allow_origins | array | A list of origins allowed for cross-origin requests. An origin is the combination of protocol (http, https), domain (myapp.com, localhost, localhost.tiangolo.com), and port (80, 443, 8080). Use ['*'] to allow all origins. |
+| allow_credentials | boolean | Indicate that cookies should be supported for cross-origin requests |
+| allow_methods | array | A list of HTTP methods that should be allowed for cross-origin requests. You can use ['*'] to allow all standard methods. |
+| allow_headers | array | A list of HTTP request headers that should be supported for cross-origin requests. You can use ['*'] to allow all headers. The Accept, Accept-Language, Content-Language and Content-Type headers are always allowed for simple CORS requests. |
 ## CompactionConfiguration
@@ -210,13 +210,13 @@ Attributes:
 buffer_turns.
-| Field | Type | Description |
-|-------|------|-------------|
-| enabled | boolean | When true, older conversation turns are summarized when estimated tokens approach the context window limit. |
-| threshold_ratio | number | Trigger compaction when estimated tokens exceed this fraction of the model's context window (0.0-1.0). |
-| token_floor | integer | Minimum token count before compaction can trigger. Prevents triggering on very small context windows. |
-| buffer_turns | integer | Number of recent turns to keep verbatim. |
-| buffer_max_ratio | number | Maximum fraction of context window the buffer zone can occupy, regardless of buffer_turns. |
+| Field | Type | Description |
+|------------------|---------|-------------------------------------------------------------------------------------------------------------|
+| enabled | boolean | When true, older conversation turns are summarized when estimated tokens approach the context window limit. |
+| threshold_ratio | number | Trigger compaction when estimated tokens exceed this fraction of the model's context window (0.0-1.0). |
+| token_floor | integer | Minimum token count before compaction can trigger. Prevents triggering on very small context windows. |
+| buffer_turns | integer | Number of recent turns to keep verbatim. |
+| buffer_max_ratio | number | Maximum fraction of context window the buffer zone can occupy, regardless of buffer_turns. |
 ## Configuration
@@ -225,32 +225,32 @@ Attributes:
 Global service configuration.
-| Field | Type | Description | -|-------|------|-------------| -| name | string | Name of the service. That value will be used in REST API endpoints. | -| service | | This section contains Lightspeed Core Stack service configuration. | -| llama_stack | | This section contains Llama Stack configuration. Lightspeed Core Stack service can call Llama Stack in library mode or in server mode. | -| user_data_collection | | This section contains configuration for subsystem that collects user data(transcription history and feedbacks). | -| database | | Configuration for database to store conversation IDs and other runtime data | -| mcp_servers | array | MCP (Model Context Protocol) servers provide tools and capabilities to the AI agents. These are configured in this section. Only MCP servers defined in the lightspeed-stack.yaml configuration are available to the agents. Tools configured in the llama-stack run.yaml are not accessible to lightspeed-core agents. | -| authentication | | Authentication configuration | -| authorization | | Lightspeed Core Stack implements a modular authentication and authorization system with multiple authentication methods. Authorization is configurable through role-based access control. Authentication is handled through selectable modules configured via the module field in the authentication configuration. | -| customization | | It is possible to customize Lightspeed Core Stack via this section. System prompt can be customized and also different parts of the service can be replaced by custom Python modules. | -| inference | | One LLM provider and one its model might be selected as default ones. When no provider+model pair is specified in REST API calls (query endpoints), the default provider and model are used. | -| conversation_cache | | | -| compaction | | Controls when conversation history is summarized to keep the model's input below the context window limit. 
Disabled by default — when disabled, requests that exceed the window continue to surface as HTTP 413. | -| approvals | | Settings for human-in-the-loop approval of MCP tool invocations | -| byok_rag | array | BYOK RAG configuration. This configuration can be used to reconfigure Llama Stack through its run.yaml configuration file | -| a2a_state | | Configuration for A2A protocol persistent state storage. | -| quota_handlers | | Quota handlers configuration | -| azure_entra_id | | | -| rlsapi_v1 | | Configuration for the rlsapi v1 /infer endpoint used by the RHEL Lightspeed Command Line Assistant (CLA). | -| splunk | | Splunk HEC configuration for sending telemetry events. | -| deployment_environment | string | Deployment environment name (e.g., 'development', 'staging', 'production'). Used in telemetry events. | -| rag | | Configuration for all RAG strategies (inline and tool-based). | -| okp | | OKP provider settings. Only used when 'okp' is listed in rag.inline or rag.tool. | -| reranker | | Configuration for neural reranking of RAG chunks using cross-encoder. | -| skills | | Agent skills configuration. Specifies paths to skill directories. | +| Field | Type | Description | +|------------------------|--------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| name | string | Name of the service. That value will be used in REST API endpoints. | +| service | | This section contains Lightspeed Core Stack service configuration. | +| llama_stack | | This section contains Llama Stack configuration. Lightspeed Core Stack service can call Llama Stack in library mode or in server mode. 
| +| user_data_collection | | This section contains configuration for subsystem that collects user data(transcription history and feedbacks). | +| database | | Configuration for database to store conversation IDs and other runtime data | +| mcp_servers | array | MCP (Model Context Protocol) servers provide tools and capabilities to the AI agents. These are configured in this section. Only MCP servers defined in the lightspeed-stack.yaml configuration are available to the agents. Tools configured in the llama-stack run.yaml are not accessible to lightspeed-core agents. | +| authentication | | Authentication configuration | +| authorization | | Lightspeed Core Stack implements a modular authentication and authorization system with multiple authentication methods. Authorization is configurable through role-based access control. Authentication is handled through selectable modules configured via the module field in the authentication configuration. | +| customization | | It is possible to customize Lightspeed Core Stack via this section. System prompt can be customized and also different parts of the service can be replaced by custom Python modules. | +| inference | | One LLM provider and one its model might be selected as default ones. When no provider+model pair is specified in REST API calls (query endpoints), the default provider and model are used. | +| conversation_cache | | | +| compaction | | Controls when conversation history is summarized to keep the model's input below the context window limit. Disabled by default — when disabled, requests that exceed the window continue to surface as HTTP 413. | +| approvals | | Settings for human-in-the-loop approval of MCP tool invocations | +| byok_rag | array | BYOK RAG configuration. This configuration can be used to reconfigure Llama Stack through its run.yaml configuration file | +| a2a_state | | Configuration for A2A protocol persistent state storage. 
| +| quota_handlers | | Quota handlers configuration | +| azure_entra_id | | | +| rlsapi_v1 | | Configuration for the rlsapi v1 /infer endpoint used by the RHEL Lightspeed Command Line Assistant (CLA). | +| splunk | | Splunk HEC configuration for sending telemetry events. | +| deployment_environment | string | Deployment environment name (e.g., 'development', 'staging', 'production'). Used in telemetry events. | +| rag | | Configuration for all RAG strategies (inline and tool-based). | +| okp | | OKP provider settings. Only used when 'okp' is listed in rag.inline or rag.tool. | +| reranker | | Configuration for neural reranking of RAG chunks using cross-encoder. | +| skills | | Agent skills configuration. Specifies paths to skill directories. | ## ConversationHistoryConfiguration @@ -259,12 +259,12 @@ Global service configuration. Conversation history configuration. -| Field | Type | Description | -|-------|------|-------------| -| type | string | Type of database where the conversation history is to be stored. | -| memory | | In-memory cache configuration | -| sqlite | | SQLite database configuration | -| postgres | | PostgreSQL database configuration | +| Field | Type | Description | +|----------|--------|------------------------------------------------------------------| +| type | string | Type of database where the conversation history is to be stored. | +| memory | | In-memory cache configuration | +| sqlite | | SQLite database configuration | +| postgres | | PostgreSQL database configuration | ## CustomProfile @@ -273,10 +273,10 @@ Conversation history configuration. Custom profile customization for prompts and validation. -| Field | Type | Description | -|-------|------|-------------| -| path | string | Path to Python modules containing custom profile. 
| -| prompts | object | Dictionary containing map of system prompts | +| Field | Type | Description | +|---------|--------|---------------------------------------------------| +| path | string | Path to Python modules containing custom profile. | +| prompts | object | Dictionary containing map of system prompts | ## Customization @@ -285,16 +285,16 @@ Custom profile customization for prompts and validation. Service customization. -| Field | Type | Description | -|-------|------|-------------| -| profile_path | string | | -| disable_query_system_prompt | boolean | | -| disable_shield_ids_override | boolean | | -| system_prompt_path | string | | -| system_prompt | string | | -| agent_card_path | string | | -| agent_card_config | object | | -| custom_profile | | | +| Field | Type | Description | +|-----------------------------|---------|-------------| +| profile_path | string | | +| disable_query_system_prompt | boolean | | +| disable_shield_ids_override | boolean | | +| system_prompt_path | string | | +| system_prompt | string | | +| agent_card_path | string | | +| agent_card_config | object | | +| custom_profile | | | ## DatabaseConfiguration @@ -303,10 +303,10 @@ Service customization. Database configuration. -| Field | Type | Description | -|-------|------|-------------| -| sqlite | | SQLite database configuration | -| postgres | | PostgreSQL database configuration | +| Field | Type | Description | +|----------|------|-----------------------------------| +| sqlite | | SQLite database configuration | +| postgres | | PostgreSQL database configuration | ## InMemoryCacheConfig @@ -315,8 +315,8 @@ Database configuration. In-memory cache configuration. -| Field | Type | Description | -|-------|------|-------------| +| Field | Type | Description | +|-------------|---------|---------------------------------------------------------| | max_entries | integer | Maximum number of entries stored in the in-memory cache | @@ -326,11 +326,11 @@ In-memory cache configuration. 
Inference configuration. -| Field | Type | Description | -|-------|------|-------------| -| default_model | string | Identification of default model used when no other model is specified. | -| default_provider | string | Identification of default provider used when no other model is specified. | -| context_windows | object | Map of fully-qualified model identifier (e.g., "openai/gpt-4o-mini") to context window size in tokens. Used by the conversation compaction trigger to decide when older turns must be summarized before the input exceeds the window. Models absent from this map have no registered window — callers fall back to their own default or skip the token-based trigger. | +| Field | Type | Description | +|------------------|--------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| default_model | string | Identification of default model used when no other model is specified. | +| default_provider | string | Identification of default provider used when no other model is specified. | +| context_windows | object | Map of fully-qualified model identifier (e.g., "openai/gpt-4o-mini") to context window size in tokens. Used by the conversation compaction trigger to decide when older turns must be summarized before the input exceeds the window. Models absent from this map have no registered window — callers fall back to their own default or skip the token-based trigger. | ## JsonPathOperator @@ -357,10 +357,10 @@ Useful resources: - [RFC 7517](https://www.rfc-editor.org/rfc/rfc7517) -| Field | Type | Description | -|-------|------|-------------| -| url | string | HTTPS URL of the JWK (JSON Web Key) set used to validate JWTs. 
| -| jwt_configuration | | JWT (JSON Web Token) configuration | +| Field | Type | Description | +|-------------------|--------|----------------------------------------------------------------| +| url | string | HTTPS URL of the JWK (JSON Web Key) set used to validate JWTs. | +| jwt_configuration | | JWT (JSON Web Token) configuration | ## JwtConfiguration @@ -383,11 +383,11 @@ Useful resources: - [JSON Web Tokens](https://auth0.com/docs/secure/tokens/json-web-tokens) -| Field | Type | Description | -|-------|------|-------------| -| user_id_claim | string | JWT claim name that uniquely identifies the user (subject ID). | -| username_claim | string | JWT claim name that provides the human-readable username. | -| role_rules | array | Rules for extracting roles from JWT claims | +| Field | Type | Description | +|----------------|--------|----------------------------------------------------------------| +| user_id_claim | string | JWT claim name that uniquely identifies the user (subject ID). | +| username_claim | string | JWT claim name that provides the human-readable username. | +| role_rules | array | Rules for extracting roles from JWT claims | ## JwtRoleRule @@ -396,13 +396,13 @@ Useful resources: Rule for extracting roles from JWT claims. 
-| Field | Type | Description | -|-------|------|-------------| -| jsonpath | string | JSONPath expression to evaluate against the JWT payload | -| operator | | JSON path comparison operator | -| negate | boolean | If set to true, the meaning of the rule is negated | -| value | | Value to compare against | -| roles | array | Roles to be assigned if the rule matches | +| Field | Type | Description | +|----------|---------|---------------------------------------------------------| +| jsonpath | string | JSONPath expression to evaluate against the JWT payload | +| operator | | JSON path comparison operator | +| negate | boolean | If set to true, the meaning of the rule is negated | +| value | | Value to compare against | +| roles | array | Roles to be assigned if the rule matches | ## LlamaStackConfiguration @@ -422,16 +422,16 @@ Useful resources: - [Build AI Applications with Llama Stack](https://llamastack.github.io/) -| Field | Type | Description | -|-------|------|-------------| -| url | string | URL to Llama Stack service; used when library mode is disabled. Must be a valid HTTP or HTTPS URL. | -| api_key | string | API key to access Llama Stack service | -| use_as_library_client | boolean | When set to true Llama Stack will be used in library mode, not in server mode (default) | -| library_client_config_path | string | Path to configuration file used when Llama Stack is run in library mode | -| timeout | integer | Timeout in seconds for requests to Llama Stack service. Default is 180 seconds (3 minutes) to accommodate long-running RAG queries. | -| max_retries | integer | Maximum number of connection attempts before giving up. Used on startup to connect to Llama Stack and retrieve its version. Connection attempts are retried with a fixed delay to handle the case where Llama Stack is still starting up (e.g., when running as a sidecar in the same pod). | -| retry_delay | integer | Delay in seconds between retry attempts. 
Used on startup to connect to Llama Stack and retrieve its version. Connection attempts are retried with a fixed delay to handle the case where Llama Stack is still starting up (e.g., when running as a sidecar in the same pod). | -| allow_degraded_mode | boolean | If enabled, Lightspeed Core can be started even when Llama Stack is not accessible (valid for server mode only) | +| Field | Type | Description | +|----------------------------|---------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| url | string | URL to Llama Stack service; used when library mode is disabled. Must be a valid HTTP or HTTPS URL. | +| api_key | string | API key to access Llama Stack service | +| use_as_library_client | boolean | When set to true Llama Stack will be used in library mode, not in server mode (default) | +| library_client_config_path | string | Path to configuration file used when Llama Stack is run in library mode | +| timeout | integer | Timeout in seconds for requests to Llama Stack service. Default is 180 seconds (3 minutes) to accommodate long-running RAG queries. | +| max_retries | integer | Maximum number of connection attempts before giving up. Used on startup to connect to Llama Stack and retrieve its version. Connection attempts are retried with a fixed delay to handle the case where Llama Stack is still starting up (e.g., when running as a sidecar in the same pod). | +| retry_delay | integer | Delay in seconds between retry attempts. Used on startup to connect to Llama Stack and retrieve its version. Connection attempts are retried with a fixed delay to handle the case where Llama Stack is still starting up (e.g., when running as a sidecar in the same pod). 
| +| allow_degraded_mode | boolean | If enabled, Lightspeed Core can be started even when Llama Stack is not accessible (valid for server mode only) | ## ModelContextProtocolServer @@ -452,15 +452,15 @@ Useful resources: - [Wikipedia article](https://en.wikipedia.org/wiki/Model_Context_Protocol) -| Field | Type | Description | -|-------|------|-------------| -| name | string | MCP server name that must be unique | -| provider_id | string | MCP provider identification | -| url | string | URL of the MCP server | -| authorization_headers | object | Headers to send to the MCP server. The map contains the header name and the path to a file containing the header value (secret). There are 3 special cases: 1. Usage of the kubernetes token in the header. To specify this use a string 'kubernetes' instead of the file path. 2. Usage of the client-provided token in the header. To specify this use a string 'client' instead of the file path. 3. Usage of the oauth token in the header. To specify this use a string 'oauth' instead of the file path. | -| headers | array | List of HTTP header names to automatically forward from the incoming request to this MCP server. Headers listed here are extracted from the original client request and included when calling the MCP server. This is useful when infrastructure components (e.g. API gateways) inject headers that MCP servers need, such as x-rh-identity in HCC. Header matching is case-insensitive. These headers are additive with authorization_headers and MCP-HEADERS. | -| require_approval | | When to require human approval for tool invocations. 'always' requires approval for all tools, 'never' auto-approves, or use ApprovalFilter for granular control. | -| timeout | integer | Timeout in seconds for requests to the MCP server. If not specified, the default timeout from Llama Stack will be used. Note: This field is reserved for future use when Llama Stack adds timeout support. 
| +| Field | Type | Description | +|-----------------------|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| name | string | MCP server name that must be unique | +| provider_id | string | MCP provider identification | +| url | string | URL of the MCP server | +| authorization_headers | object | Headers to send to the MCP server. The map contains the header name and the path to a file containing the header value (secret). There are 3 special cases: 1. Usage of the kubernetes token in the header. To specify this use a string 'kubernetes' instead of the file path. 2. Usage of the client-provided token in the header. To specify this use a string 'client' instead of the file path. 3. Usage of the oauth token in the header. To specify this use a string 'oauth' instead of the file path. | +| headers | array | List of HTTP header names to automatically forward from the incoming request to this MCP server. Headers listed here are extracted from the original client request and included when calling the MCP server. This is useful when infrastructure components (e.g. API gateways) inject headers that MCP servers need, such as x-rh-identity in HCC. Header matching is case-insensitive. These headers are additive with authorization_headers and MCP-HEADERS. | +| require_approval | | When to require human approval for tool invocations. 'always' requires approval for all tools, 'never' auto-approves, or use ApprovalFilter for granular control. | +| timeout | integer | Timeout in seconds for requests to the MCP server. 
If not specified, the default timeout from Llama Stack will be used. Note: This field is reserved for future use when Llama Stack adds timeout support. | ## OkpConfiguration @@ -472,11 +472,11 @@ Controls provider-specific behaviour for the OKP vector store. Only relevant when ``"okp"`` is listed in ``rag.inline`` or ``rag.tool``. -| Field | Type | Description | -|-------|------|-------------| -| rhokp_url | string | Base URL for the OKP server (http or https). Set to `${env.RH_SERVER_OKP}` in YAML to use the environment variable. When unset, the default from constants is used. | -| offline | boolean | When True, use parent_id for OKP chunk source URLs. When False, use reference_url for chunk source URLs. | -| chunk_filter_query | string | Additional OKP filter query applied to every OKP search request. Use Solr boolean syntax, e.g. 'product:ansible AND product:*openshift*'. | +| Field | Type | Description | +|--------------------|---------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| rhokp_url | string | Base URL for the OKP server (http or https). Set to `${env.RH_SERVER_OKP}` in YAML to use the environment variable. When unset, the default from constants is used. | +| offline | boolean | When True, use parent_id for OKP chunk source URLs. When False, use reference_url for chunk source URLs. | +| chunk_filter_query | string | Additional OKP filter query applied to every OKP search request. Use Solr boolean syntax, e.g. 'product:ansible AND product:*openshift*'. 
| ## PostgreSQLDatabaseConfiguration @@ -495,17 +495,17 @@ Useful resources: - [How to Use PostgreSQL in Python](https://www.freecodecamp.org/news/postgresql-in-python/) -| Field | Type | Description | -|-------|------|-------------| -| host | string | Database server host or socket directory | -| port | integer | Database server port | -| db | string | Database name to connect to | -| user | string | Database user name used to authenticate | -| password | string | Password used to authenticate | -| namespace | string | Database namespace | -| ssl_mode | string | SSL mode | -| gss_encmode | string | This option determines whether or with what priority a secure GSS TCP/IP connection will be negotiated with the server. | -| ca_cert_path | string | Path to CA certificate | +| Field | Type | Description | +|--------------|---------|-------------------------------------------------------------------------------------------------------------------------| +| host | string | Database server host or socket directory | +| port | integer | Database server port | +| db | string | Database name to connect to | +| user | string | Database user name used to authenticate | +| password | string | Password used to authenticate | +| namespace | string | Database namespace | +| ssl_mode | string | SSL mode | +| gss_encmode | string | This option determines whether or with what priority a secure GSS TCP/IP connection will be negotiated with the server. | +| ca_cert_path | string | Path to CA certificate | ## QuotaHandlersConfiguration @@ -519,12 +519,12 @@ _quota limiter_. It can be of type `user_limiter` or `cluster_limiter` (which is name that makes sense in OpenShift deployment). 
-| Field | Type | Description |
-|-------|------|-------------|
-| sqlite | | SQLite database configuration |
-| postgres | | PostgreSQL database configuration |
-| limiters | array | Quota limiters configuration |
-| scheduler | | Quota scheduler configuration |
+| Field | Type | Description |
+|----------------------|---------|-------------------------------------------------------|
+| sqlite | | SQLite database configuration |
+| postgres | | PostgreSQL database configuration |
+| limiters | array | Quota limiters configuration |
+| scheduler | | Quota scheduler configuration |
 | enable_token_history | boolean | Enables storing information about token usage history |


@@ -552,13 +552,13 @@ There are two basic use cases:
 (for example on a daily basis), set ``quota_increase``.


-| Field | Type | Description |
-|-------|------|-------------|
-| type | string | Quota limiter type, either user_limiter or cluster_limiter |
-| name | string | Human readable quota limiter name |
-| initial_quota | integer | Quota set at beginning of the period |
-| quota_increase | integer | Delta value used to increase quota when period is reached |
-| period | string | Period specified in human readable form |
+| Field | Type | Description |
+|----------------|---------|------------------------------------------------------------|
+| type | string | Quota limiter type, either user_limiter or cluster_limiter |
+| name | string | Human readable quota limiter name |
+| initial_quota | integer | Quota set at beginning of the period |
+| quota_increase | integer | Delta value used to increase quota when period is reached |
+| period | string | Period specified in human readable form |


 ## QuotaSchedulerConfiguration
@@ -567,10 +567,10 @@ There are two basic use cases:
 Quota scheduler configuration.


-| Field | Type | Description |
-|-------|------|-------------|
-| period | integer | Quota scheduler period specified in seconds |
-| database_reconnection_count | integer | Database reconnection count on startup. When database for quota is not available on startup, the service tries to reconnect N times with specified delay. |
+| Field | Type | Description |
+|-----------------------------|---------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| period | integer | Quota scheduler period specified in seconds |
+| database_reconnection_count | integer | Database reconnection count on startup. When database for quota is not available on startup, the service tries to reconnect N times with specified delay. |
 | database_reconnection_delay | integer | Database reconnection delay specified in seconds. When database for quota is not available on startup, the service tries to reconnect N times with specified delay. |


@@ -580,10 +580,10 @@ Quota scheduler configuration.
 Red Hat Identity authentication configuration.


-| Field | Type | Description |
-|-------|------|-------------|
-| required_entitlements | array | List of all required entitlements. |
-| max_header_size | integer | Maximum allowed size in bytes for the base64-encoded x-rh-identity header. Headers exceeding this size are rejected before decoding. |
+| Field | Type | Description |
+|-----------------------|---------|--------------------------------------------------------------------------------------------------------------------------------------|
+| required_entitlements | array | List of all required entitlements. |
+| max_header_size | integer | Maximum allowed size in bytes for the base64-encoded x-rh-identity header. Headers exceeding this size are rejected before decoding. |


 ## RagConfiguration
@@ -604,10 +604,10 @@ If no RAG strategy is defined (inline and tool are empty),
 the RAG tool will register all stores available to llama-stack.


-| Field | Type | Description |
-|-------|------|-------------|
-| inline | array | RAG IDs whose sources are injected as context before the LLM call. Use 'okp' to enable OKP inline RAG. Empty by default (no inline RAG). |
-| tool | array | RAG IDs made available to the LLM as a file_search tool. Use 'okp' to include the OKP vector store. When omitted, all registered BYOK vector stores are used (backward compatibility). |
+| Field | Type | Description |
+|--------|-------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| inline | array | RAG IDs whose sources are injected as context before the LLM call. Use 'okp' to enable OKP inline RAG. Empty by default (no inline RAG). |
+| tool | array | RAG IDs made available to the LLM as a file_search tool. Use 'okp' to include the OKP vector store. When omitted, all registered BYOK vector stores are used (backward compatibility). |


 ## RerankerConfiguration
@@ -616,10 +616,10 @@ the RAG tool will register all stores available to llama-stack.
 Reranker configuration for RAG chunk reranking.


-| Field | Type | Description |
-|-------|------|-------------|
-| enabled | boolean | When True, reranking applied to RAG chunks. When False, reranking is disabled and original scoring used. |
-| model | string | Cross-encoder model name for reranking RAG chunks. Defaults to 'cross-encoder/ms-marco-MiniLM-L6-v2' from sentence-transformers. |
+| Field | Type | Description |
+|---------|---------|----------------------------------------------------------------------------------------------------------------------------------|
+| enabled | boolean | When True, reranking applied to RAG chunks. When False, reranking is disabled and original scoring used. |
+| model | string | Cross-encoder model name for reranking RAG chunks. Defaults to 'cross-encoder/ms-marco-MiniLM-L6-v2' from sentence-transformers. |


 ## RlsapiV1Configuration
@@ -632,10 +632,10 @@ stateless inference endpoint. Kept separate from shared configuration
 sections so that CLA-specific options do not affect other endpoints.


-| Field | Type | Description |
-|-------|------|-------------|
-| allow_verbose_infer | boolean | Allow /v1/infer to return extended metadata (tool_calls, rag_chunks, token_usage) when the client sends "include_metadata": true. Should NOT be enabled in production. If production use is needed, consider RBAC-based access control via an Action.RLSAPI_V1_INFER authorization rule. |
-| quota_subject | string | Identity field used as the quota subject for /v1/infer. When set, token quota enforcement is enabled for this endpoint. Requires quota_handlers to be configured. "org_id" and "system_id" require rh-identity authentication; falls back to user_id when rh-identity data is unavailable. |
+| Field | Type | Description |
+|---------------------|---------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| allow_verbose_infer | boolean | Allow /v1/infer to return extended metadata (tool_calls, rag_chunks, token_usage) when the client sends "include_metadata": true. Should NOT be enabled in production. If production use is needed, consider RBAC-based access control via an Action.RLSAPI_V1_INFER authorization rule. |
+| quota_subject | string | Identity field used as the quota subject for /v1/infer. When set, token quota enforcement is enabled for this endpoint. Requires quota_handlers to be configured. "org_id" and "system_id" require rh-identity authentication; falls back to user_id when rh-identity data is unavailable. |


 ## SQLiteDatabaseConfiguration
@@ -644,8 +644,8 @@ sections so that CLA-specific options do not affect other endpoints.
 SQLite database configuration.


-| Field | Type | Description |
-|-------|------|-------------|
+| Field | Type | Description |
+|---------|--------|----------------------------------------------|
 | db_path | string | Path to file where SQLite database is stored |


@@ -660,18 +660,18 @@ and specify the number of Uvicorn workers. When more workers are specified,
 the service can handle requests concurrently.


-| Field | Type | Description |
-|-------|------|-------------|
-| host | string | Service hostname |
-| port | integer | Service port |
-| base_url | string | Externally reachable base URL for the service; needed for A2A support. |
-| auth_enabled | boolean | Enables the authentication subsystem |
-| workers | integer | Number of Uvicorn worker processes to start |
-| color_log | boolean | Enables colorized logging |
-| access_log | boolean | Enables logging of all access information |
-| tls_config | | Transport Layer Security configuration for HTTPS support |
-| root_path | string | ASGI root path for serving behind a reverse proxy on a subpath |
-| cors | | Cross-Origin Resource Sharing configuration for cross-domain requests |
+| Field | Type | Description |
+|--------------|---------|------------------------------------------------------------------------|
+| host | string | Service hostname |
+| port | integer | Service port |
+| base_url | string | Externally reachable base URL for the service; needed for A2A support. |
+| auth_enabled | boolean | Enables the authentication subsystem |
+| workers | integer | Number of Uvicorn worker processes to start |
+| color_log | boolean | Enables colorized logging |
+| access_log | boolean | Enables logging of all access information |
+| tls_config | | Transport Layer Security configuration for HTTPS support |
+| root_path | string | ASGI root path for serving behind a reverse proxy on a subpath |
+| cors | | Cross-Origin Resource Sharing configuration for cross-domain requests |


 ## SkillsConfiguration
@@ -689,8 +689,8 @@ Each path can point to either:
 Paths are validated at startup to ensure they exist and contain valid SKILL.md files.


-| Field | Type | Description |
-|-------|------|-------------|
+| Field | Type | Description |
+|-------|-------|----------------------------------------------------------------------------|
 | paths | array | Paths to skill directories or directories containing skill subdirectories. |


@@ -709,15 +709,15 @@ Useful resources:
 - [About HEC](https://docs.splunk.com/Documentation/Splunk/latest/Data)


-| Field | Type | Description |
-|-------|------|-------------|
-| enabled | boolean | Enable or disable Splunk HEC integration. |
-| url | string | Splunk HEC endpoint URL. |
-| token_path | string | Path to file containing the Splunk HEC authentication token. |
-| index | string | Target Splunk index for events. |
-| source | string | Event source identifier. |
-| timeout | integer | HTTP timeout in seconds for HEC requests. |
-| verify_ssl | boolean | Whether to verify SSL certificates for HEC endpoint. |
+| Field | Type | Description |
+|------------|---------|--------------------------------------------------------------|
+| enabled | boolean | Enable or disable Splunk HEC integration. |
+| url | string | Splunk HEC endpoint URL. |
+| token_path | string | Path to file containing the Splunk HEC authentication token. |
+| index | string | Target Splunk index for events. |
+| source | string | Event source identifier. |
+| timeout | integer | HTTP timeout in seconds for HEC requests. |
+| verify_ssl | boolean | Whether to verify SSL certificates for HEC endpoint. |


 ## TLSConfiguration
@@ -738,11 +738,11 @@ Useful resources:
 - [What is TLS](https://www.ssltrust.eu/learning/ssl/transport-layer-security-tls)


-| Field | Type | Description |
-|-------|------|-------------|
-| tls_certificate_path | string | SSL/TLS certificate file path for HTTPS support. |
-| tls_key_path | string | SSL/TLS private key file path for HTTPS support. |
-| tls_key_password | string | Path to file containing the password to decrypt the SSL/TLS private key. |
+| Field | Type | Description |
+|----------------------|--------|--------------------------------------------------------------------------|
+| tls_certificate_path | string | SSL/TLS certificate file path for HTTPS support. |
+| tls_key_path | string | SSL/TLS private key file path for HTTPS support. |
+| tls_key_password | string | Path to file containing the password to decrypt the SSL/TLS private key. |


 ## UserDataCollection
@@ -751,9 +751,9 @@ Useful resources:
 User data collection configuration.


-| Field | Type | Description |
-|-------|------|-------------|
-| feedback_enabled | boolean | When set to true the user feedback is stored and later sent for analysis. |
-| feedback_storage | string | Path to directory where feedback will be saved for further processing. |
-| transcripts_enabled | boolean | When set to true the conversation history is stored and later sent for analysis. |
-| transcripts_storage | string | Path to directory where conversation history will be saved for further processing. |
+| Field | Type | Description |
+|---------------------|---------|------------------------------------------------------------------------------------|
+| feedback_enabled | boolean | When set to true the user feedback is stored and later sent for analysis. |
+| feedback_storage | string | Path to directory where feedback will be saved for further processing. |
+| transcripts_enabled | boolean | When set to true the conversation history is stored and later sent for analysis. |
+| transcripts_storage | string | Path to directory where conversation history will be saved for further processing. |
diff --git a/docs/providers.md b/docs/providers.md
index a742ca130..8d5d52625 100644
--- a/docs/providers.md
+++ b/docs/providers.md
@@ -31,39 +31,39 @@ The tables below summarize each provider category, containing the following atri

 ## Inference Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| meta-reference | inline | `accelerate`, `fairscale`, `torch`, `torchvision`, `transformers`, `zmq`, `lm-format-enforcer`, `sentence-transformers`, `torchao==0.8.0`, `fbgemm-gpu-genai==1.1.2` | ❌ |
-| sentence-transformers | inline | `torch torchvision torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu`, `sentence-transformers --no-deps` | ❌ |
-| anthropic | remote | `litellm` | ❌ |
-| azure | remote | — | ✅ |
-| bedrock | remote | - | ✅ |
-| cerebras | remote | `cerebras_cloud_sdk` | ❌ |
-| databricks | remote | — | ❌ |
-| fireworks | remote | `fireworks-ai<=0.17.16` | ❌ |
-| gemini | remote | `litellm` | ❌ |
-| groq | remote | `litellm` | ❌ |
-| hf::endpoint | remote | `huggingface_hub`, `aiohttp` | ❌ |
-| hf::serverless | remote | `huggingface_hub`, `aiohttp` | ❌ |
-| llama-openai-compat | remote | `litellm` | ❌ |
-| nvidia | remote | — | ❌ |
-| ollama | remote | `ollama`, `aiohttp`, `h11>=0.16.0` | ❌ |
-| openai | remote | `litellm` | ✅ |
-| passthrough | remote | — | ❌ |
-| runpod | remote | — | ❌ |
-| sambanova | remote | `litellm` | ❌ |
-| tgi | remote | `huggingface_hub`, `aiohttp` | ❌ |
-| together | remote | `together` | ❌ |
-| vertexai | remote | `google-auth` | ✅ |
-| watsonx | remote | `litellm` | ✅ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|-----------------------|--------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------:|
+| meta-reference | inline | `accelerate`, `fairscale`, `torch`, `torchvision`, `transformers`, `zmq`, `lm-format-enforcer`, `sentence-transformers`, `torchao==0.8.0`, `fbgemm-gpu-genai==1.1.2` | ❌ |
+| sentence-transformers | inline | `torch torchvision torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu`, `sentence-transformers --no-deps` | ❌ |
+| anthropic | remote | `litellm` | ❌ |
+| azure | remote | — | ✅ |
+| bedrock | remote | - | ✅ |
+| cerebras | remote | `cerebras_cloud_sdk` | ❌ |
+| databricks | remote | — | ❌ |
+| fireworks | remote | `fireworks-ai<=0.17.16` | ❌ |
+| gemini | remote | `litellm` | ❌ |
+| groq | remote | `litellm` | ❌ |
+| hf::endpoint | remote | `huggingface_hub`, `aiohttp` | ❌ |
+| hf::serverless | remote | `huggingface_hub`, `aiohttp` | ❌ |
+| llama-openai-compat | remote | `litellm` | ❌ |
+| nvidia | remote | — | ❌ |
+| ollama | remote | `ollama`, `aiohttp`, `h11>=0.16.0` | ❌ |
+| openai | remote | `litellm` | ✅ |
+| passthrough | remote | — | ❌ |
+| runpod | remote | — | ❌ |
+| sambanova | remote | `litellm` | ❌ |
+| tgi | remote | `huggingface_hub`, `aiohttp` | ❌ |
+| together | remote | `together` | ❌ |
+| vertexai | remote | `google-auth` | ✅ |
+| watsonx | remote | `litellm` | ✅ |

 Red Hat providers:

-| Name | Version Tested | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|---|:---:|
-| RHOAI (vllm) | latest operator | remote | `openai` | ✅ |
-| RHAIIS (vllm) | 3.2.3 (on RHEL 9.20250429.0.4) | remote | `openai` | ✅ |
-| RHEL AI (vllm) | 1.5.2 | remote | `openai` | ✅ |
+| Name | Version Tested | Type | Pip Dependencies | Supported in LCS |
+|----------------|--------------------------------|--------|------------------|:----------------:|
+| RHOAI (vllm) | latest operator | remote | `openai` | ✅ |
+| RHAIIS (vllm) | 3.2.3 (on RHEL 9.20250429.0.4) | remote | `openai` | ✅ |
+| RHEL AI (vllm) | 1.5.2 | remote | `openai` | ✅ |

 ### Azure Provider - Entra ID Authentication Guide

@@ -73,12 +73,12 @@ Lightspeed Core supports secure authentication using Microsoft Entra ID (formerl

 To enable Entra ID authentication, the `azure_entra_id` block must be included in your LCS configuration. The `tenant_id`, `client_id`, and `client_secret` attributes are required:

-| Attribute | Required | Description |
-|-----------|----------|-------------|
-| `tenant_id` | Yes | Azure AD tenant ID |
-| `client_id` | Yes | Application (client) ID |
-| `client_secret` | Yes | Client secret value |
-| `scope` | No | Token scope (default: `https://cognitiveservices.azure.com/.default`) |
+| Attribute | Required | Description |
+|-----------------|----------|-----------------------------------------------------------------------|
+| `tenant_id` | Yes | Azure AD tenant ID |
+| `client_id` | Yes | Application (client) ID |
+| `client_secret` | Yes | Client secret value |
+| `scope` | No | Token scope (default: `https://cognitiveservices.azure.com/.default`) |

 Example of LCS config section:

@@ -172,116 +172,116 @@ make run CONFIG=examples/lightspeed-stack-azure-entraid-service.yaml

 ## Agent Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| meta-reference | inline | `matplotlib`, `pillow`, `pandas`, `scikit-learn`, `mcp>=1.8.1` `aiosqlite`, `psycopg2-binary`, `redis`, `pymongo` | ✅ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|----------------|--------|-------------------------------------------------------------------------------------------------------------------|:----------------:|
+| meta-reference | inline | `matplotlib`, `pillow`, `pandas`, `scikit-learn`, `mcp>=1.8.1` `aiosqlite`, `psycopg2-binary`, `redis`, `pymongo` | ✅ |

 ---

 ## Evaluation Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| meta-reference | inline | `tree_sitter`, `pythainlp`, `langdetect`, `emoji`, `nltk` | ✅ |
-| meta-reference | remote | `requests` | ❌ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|----------------|--------|-----------------------------------------------------------|:----------------:|
+| meta-reference | inline | `tree_sitter`, `pythainlp`, `langdetect`, `emoji`, `nltk` | ✅ |
+| meta-reference | remote | `requests` | ❌ |

 ---

 ## Datasetio Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| localfs | inline | `pandas` | ✅ |
-| huggingface | remote | `datasets>=4.0.0` | ✅ |
-| nvidia | remote | `datasets>=4.0.0` | ❌ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|-------------|--------|-------------------|:----------------:|
+| localfs | inline | `pandas` | ✅ |
+| huggingface | remote | `datasets>=4.0.0` | ✅ |
+| nvidia | remote | `datasets>=4.0.0` | ❌ |

 ---

 ## Safety Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| code-scanner | inline | `codeshield` | ❌ |
-| llama-guard | inline | — | ❌ |
-| prompt-guard | inline | `transformers[accelerate]`, `torch --index-url https://download.pytorch.org/whl/cpu` | ❌ |
-| bedrock | remote | `boto3` | ❌ |
-| nvidia | remote | `requests` | ❌ |
-| sambanova | remote | `litellm`, `requests` | ❌ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|--------------|--------|--------------------------------------------------------------------------------------|:----------------:|
+| code-scanner | inline | `codeshield` | ❌ |
+| llama-guard | inline | — | ❌ |
+| prompt-guard | inline | `transformers[accelerate]`, `torch --index-url https://download.pytorch.org/whl/cpu` | ❌ |
+| bedrock | remote | `boto3` | ❌ |
+| nvidia | remote | `requests` | ❌ |
+| sambanova | remote | `litellm`, `requests` | ❌ |

 ---

 ## Scoring Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| basic | inline | `requests` | ✅ |
-| llm-as-judge | inline | — | ✅ |
-| braintrust | inline | `autoevals` | ✅ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|--------------|--------|------------------|:----------------:|
+| basic | inline | `requests` | ✅ |
+| llm-as-judge | inline | — | ✅ |
+| braintrust | inline | `autoevals` | ✅ |

 ---

 ## Telemetry Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| meta-reference | inline | `opentelemetry-sdk`, `opentelemetry-exporter-otlp-proto-http` | ✅ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|----------------|--------|---------------------------------------------------------------|:----------------:|
+| meta-reference | inline | `opentelemetry-sdk`, `opentelemetry-exporter-otlp-proto-http` | ✅ |

 ---

 ## Post Training Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| torchtune-cpu | inline | `numpy`, `torch torchtune>=0.5.0`, `torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu`| ❌ |
-| torchtune-gpu | inline | `numpy`,`torch torchtune>=0.5.0`, `torchao>=0.12.0` | ❌ |
-| huggingface-gpu | inline | `trl`, `transformers`, `peft`, `datasets>=4.0.0`, `torch` | ✅ |
-| nvidia | remote | `requests`, `aiohttp` | ❌ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|-----------------|--------|-------------------------------------------------------------------------------------------------------------|:----------------:|
+| torchtune-cpu | inline | `numpy`, `torch torchtune>=0.5.0`, `torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu` | ❌ |
+| torchtune-gpu | inline | `numpy`,`torch torchtune>=0.5.0`, `torchao>=0.12.0` | ❌ |
+| huggingface-gpu | inline | `trl`, `transformers`, `peft`, `datasets>=4.0.0`, `torch` | ✅ |
+| nvidia | remote | `requests`, `aiohttp` | ❌ |

 ---

 ## VectorIO Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| meta-reference | inline | `faiss-cpu` | ❌ |
-| chromadb | inline | `chromadb` | ❌ |
-| faiss | inline | `faiss-cpu` | ✅ |
-| milvus | inline | `pymilvus>=2.4.10` | ❌ |
-| qdrant | inline | `qdrant-client` | ❌ |
-| sqlite-vec | inline | `sqlite-vec` | ❌ |
-| chromadb | remote | `chromadb-client` | ❌ |
-| milvus | remote | `pymilvus>=2.4.10` | ❌ |
-| pgvector | remote | `psycopg2-binary` | ❌ |
-| qdrant | remote | `qdrant-client` | ❌ |
-| weaviate | remote | `weaviate-client` | ❌ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|----------------|--------|--------------------|:----------------:|
+| meta-reference | inline | `faiss-cpu` | ❌ |
+| chromadb | inline | `chromadb` | ❌ |
+| faiss | inline | `faiss-cpu` | ✅ |
+| milvus | inline | `pymilvus>=2.4.10` | ❌ |
+| qdrant | inline | `qdrant-client` | ❌ |
+| sqlite-vec | inline | `sqlite-vec` | ❌ |
+| chromadb | remote | `chromadb-client` | ❌ |
+| milvus | remote | `pymilvus>=2.4.10` | ❌ |
+| pgvector | remote | `psycopg2-binary` | ❌ |
+| qdrant | remote | `qdrant-client` | ❌ |
+| weaviate | remote | `weaviate-client` | ❌ |

 ---

 ## Tool Runtime Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| rag-runtime | inline | `chardet`,`pypdf`, `tqdm`, `numpy`, `scikit-learn`, `scipy`, `nltk`, `sentencepiece`, `transformers` | ✅ |
-| bing-search | remote | `requests` | ❌ |
-| brave-search | remote | `requests` | ❌ |
-| model-context-protocol | remote | `mcp>=1.8.1` | ✅ |
-| tavily-search | remote | `requests` | ❌ |
-| wolfram-alpha | remote | `requests` | ❌ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|------------------------|--------|------------------------------------------------------------------------------------------------------|:----------------:|
+| rag-runtime | inline | `chardet`,`pypdf`, `tqdm`, `numpy`, `scikit-learn`, `scipy`, `nltk`, `sentencepiece`, `transformers` | ✅ |
+| bing-search | remote | `requests` | ❌ |
+| brave-search | remote | `requests` | ❌ |
+| model-context-protocol | remote | `mcp>=1.8.1` | ✅ |
+| tavily-search | remote | `requests` | ❌ |
+| wolfram-alpha | remote | `requests` | ❌ |

 ---

 ## Files Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| localfs | inline | `sqlalchemy[asyncio]`, `aiosqlite`, `asyncpg` | ❌ |
-| s3 | remote | `sqlalchemy[asyncio]`, `aiosqlite`, `asyncpg`, `boto3` | ❌ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|---------|--------|--------------------------------------------------------|:----------------:|
+| localfs | inline | `sqlalchemy[asyncio]`, `aiosqlite`, `asyncpg` | ❌ |
+| s3 | remote | `sqlalchemy[asyncio]`, `aiosqlite`, `asyncpg`, `boto3` | ❌ |

 ---

 ## Batches Providers

-| Name | Type | Pip Dependencies | Supported in LCS |
-|---|---|---|:---:|
-| reference | inline | `openai` | ❌ |
+| Name | Type | Pip Dependencies | Supported in LCS |
+|-----------|--------|------------------|:----------------:|
+| reference | inline | `openai` | ❌ |

 ---