
docs: add RAG service documentation and deployment guide #359

Merged

tsivaprasad merged 12 commits into main from
PLAT-495-rag-service-database-preparation-guide
Apr 23, 2026

Conversation

@tsivaprasad
Contributor

@tsivaprasad tsivaprasad commented Apr 21, 2026

Summary

This PR adds complete documentation for the pgEdge RAG Server service,
including a configuration reference and a step-by-step deployment guide
covering database creation, document loading, and pipeline querying.

Changes

  • Add docs/services/rag.md with full configuration reference for
    pipelines, embedding_llm, rag_llm, tables, search, and
    defaults fields
  • Add Deployment Guide section: create database → check service status →
    load documents → query pipeline → update config
  • Update docs/services/index.md: link rag.md (remove "coming soon")

Testing

Verification:

  1. Created a cluster
  2. Created a database with the following config:
curl -X POST http://localhost:3000/v1/databases \
  -H 'Content-Type: application/json' \
  --data '{
    "id": "rag-test",
    "spec": {
      "database_name": "rag_test",
      "database_users": [
        {
          "username": "admin",
          "password": "admin_password",
          "db_owner": true,
          "attributes": ["SUPERUSER", "LOGIN"]
        },
        {
          "username": "app_read_only",
          "password": "readonly_password",
          "attributes": ["LOGIN"]
        }
      ],
      "port": 5432,
      "nodes": [{ "name": "n1", "host_ids": ["host-1"] }],
      "scripts": {
        "post_database_create": [
          "CREATE EXTENSION IF NOT EXISTS vector",
          "CREATE TABLE IF NOT EXISTS documents_content_chunks (id BIGSERIAL PRIMARY KEY, content TEXT NOT NULL, embedding vector(1536), title TEXT, source TEXT)",
          "CREATE INDEX ON documents_content_chunks USING hnsw (embedding vector_cosine_ops)",
          "CREATE INDEX ON documents_content_chunks USING gin (to_tsvector('"'"'english'"'"', content))",
          "GRANT SELECT ON documents_content_chunks TO app_read_only"
        ]
      },
      "services": [{
        "service_id": "rag",
        "service_type": "rag",
        "version": "latest",
        "host_ids": ["host-1"],
        "port": 9200,
        "connect_as": "app_read_only",
        "config": {
          "pipelines": [{
            "name": "default",
            "description": "Main RAG pipeline",
            "tables": [{
              "table": "documents_content_chunks",
              "text_column": "content",
              "vector_column": "embedding"
            }],
            "embedding_llm": {
              "provider": "openai",
              "model": "text-embedding-3-small",
              "api_key": "sk-proj-1"
            },
            "rag_llm": {
              "provider": "anthropic",
              "model": "claude-sonnet-4-5",
              "api_key": "sk-ant-api03-"
            },
            "token_budget": 4000,
            "top_n": 10
          }]
        }
      }]
    }
  }'

  3. Read database and services info:
restish control-plane-local-1 get-database rag-test
HTTP/1.1 200 OK
Content-Type: application/json
Date: Tue, 21 Apr 2026 18:45:43 GMT

{
  created_at: "2026-04-21T18:43:10Z"
  id: "rag-test"
  instances: [
    {
      connection_info: {
        addresses: ["127.0.0.1"]
        port: 5432
      }
      created_at: "2026-04-21T18:43:13Z"
      host_id: "host-1"
      id: "rag-test-n1-689qacsi"
      node_name: "n1"
      postgres: {
        patroni_state: "running"
        role: "primary"
        version: "18.3"
      }
      spock: {
        read_only: "off"
        version: "5.0.6"
      }
      state: "available"
      status_updated_at: "2026-04-21T18:45:42Z"
      updated_at: "2026-04-21T18:43:46Z"
    }
  ]
  service_instances: [
    {
      created_at: "2026-04-21T18:43:49Z"
      database_id: "rag-test"
      host_id: "host-1"
      service_id: "rag"
      service_instance_id: "rag-test-rag-host-1"
      state: "running"
      status: {
        addresses: ["127.0.0.1"]
        container_id: "56b2d2657c095697a6baf428d8919678b109937abe31d730b0aec66d53405d67"
        image_version: "ghcr.io/pgedge/rag-server:latest"
        last_health_at: "2026-04-21T18:45:40Z"
        ports: [
          {
            container_port: 8080
            host_port: 9200
            name: "tcp"
          }
        ]
        service_ready: true
      }
      updated_at: "2026-04-21T18:44:09Z"
    }
  ]
  spec: {
    database_name: "rag_test"
    database_users: [
      {
        attributes: ["SUPERUSER", "LOGIN"]
        db_owner: true
        username: "admin"
      }
      {
        attributes: ["LOGIN"]
        db_owner: false
        username: "app_read_only"
      }
    ]
    nodes: [
      {
        host_ids: ["host-1"]
        name: "n1"
      }
    ]
    port: 5432
    postgres_version: "18.3"
    scripts: {
      post_database_create: [
        "CREATE EXTENSION IF NOT EXISTS vector"
        "CREATE TABLE IF NOT EXISTS documents_content_chunks (id BIGSERIAL PRIMARY KEY, content TEXT NOT NULL, embedding vector(1536), title TEXT, source TEXT)"
        "CREATE INDEX ON documents_content_chunks USING hnsw (embedding vector_cosine_ops)"
        "CREATE INDEX ON documents_content_chunks USING gin (to_tsvector('english', content))"
        "GRANT SELECT ON documents_content_chunks TO app_read_only"
      ]
    }
    services: [
      {
        config: {
          pipelines: [
            {
              description: "Main RAG pipeline"
              embedding_llm: {
                model: "text-embedding-3-small"
                provider: "openai"
              }
              name: "default"
              rag_llm: {
                model: "claude-sonnet-4-5"
                provider: "anthropic"
              }
              tables: [
                {
                  table: "documents_content_chunks"
                  text_column: "content"
                  vector_column: "embedding"
                }
              ]
              token_budget: 4000
              top_n: 10
            }
          ]
        }
        connect_as: "app_read_only"
        host_ids: ["host-1"]
        port: 9200
        service_id: "rag"
        service_type: "rag"
        version: "latest"
      }
    ]
    spock_version: "5"
  }
  state: "available"
  updated_at: "2026-04-21T18:43:10Z"
}
  4. Set env values and run the script to load documents:
OPENAI_API_KEY="sk-proj-" \
DB_HOST="::1" \
DB_PORT="5432" \
DB_USER="admin" \
DB_PASSWORD="admin_password" \
DB_NAME="rag_test" \
  python load.py
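The loading script itself is not included in the PR. As a rough sketch of what a loader like `load.py` might do (chunking documents, embedding each chunk, and inserting into `documents_content_chunks`; the chunk sizes, helper names, and client libraries here are illustrative assumptions, not the actual script):

```python
import os

def chunk_text(text, max_chars=400, overlap=50):
    """Split text into overlapping chunks small enough to embed."""
    chunks, start = [], 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap
    return chunks

def load_documents(docs):
    """Embed each (title, source, text) doc and insert its chunks.

    Expects the OPENAI_API_KEY and DB_* environment variables set above.
    """
    import psycopg                 # hypothetical client choices; the
    from openai import OpenAI      # real load.py may use different ones
    client = OpenAI()              # reads OPENAI_API_KEY from the env
    with psycopg.connect(
        host=os.environ["DB_HOST"], port=os.environ["DB_PORT"],
        user=os.environ["DB_USER"], password=os.environ["DB_PASSWORD"],
        dbname=os.environ["DB_NAME"],
    ) as conn, conn.cursor() as cur:
        for title, source, text in docs:
            for chunk in chunk_text(text):
                emb = client.embeddings.create(
                    model="text-embedding-3-small", input=chunk
                ).data[0].embedding
                # pgvector accepts the bracketed text form '[x, y, ...]'
                cur.execute(
                    "INSERT INTO documents_content_chunks"
                    " (content, embedding, title, source)"
                    " VALUES (%s, %s, %s, %s)",
                    (chunk, str(emb), title, source),
                )
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both neighbouring chunks.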

  5. Confirm the DB entries:

PGPASSWORD=admin_password psql "postgresql://admin@[::1]:5432/rag_test" \
  -c "SELECT COUNT(*), COUNT(embedding) FROM documents_content_chunks;"
 count | count 
-------+-------
    16 |    16
(1 row)

  6. Query the pipeline:
curl -X POST http://localhost:9200/v1/pipelines/default \
  -H "Content-Type: application/json" \
  -d '{"query": "How does pgEdge multi-active replication work?", "include_sources": true}'

Response from pipeline:

{
  "answer": "Based on the provided context, pgEdge multi-active replication works as follows:\n\n**Core Mechanism:**\n- pgEdge uses the **Spock extension** for logical replication to synchronize data between nodes\n- Multiple PostgreSQL nodes can accept **read and write operations simultaneously**\n- Each node is a **fully autonomous PostgreSQL instance** without requiring a single primary\n\n**Replication Method:**\n- Spock uses **logical decoding** to replicate individual rows (unlike physical/streaming replication)\n- Each node can have its own timeline and accept writes independently\n- Replication is configured **per-database and per-table**\n\n**Publish/Subscribe Model:**\n- Each node **publishes its changes** to a replication slot\n- Each node **subscribes to changes** from other nodes\n\n**Conflict Resolution:**\n- Conflicts occur when the same row is modified on two nodes simultaneously\n- Conflicts are resolved automatically using configurable resolution strategies (e.g., last-update-wins or error)\n\n**Important Limitation:**\n- **DDL changes** (such as CREATE TABLE and ALTER TABLE) are NOT automatically replicated by Spock\n- DDL changes must be run on each node separately\n\nThis design makes pgEdge ideal for applications requiring low-latency writes across geographically distributed regions.",
  "sources": [
    {
      "content": "pgEdge is a distributed PostgreSQL platform designed for global, multi-active deployments. It enables multiple PostgreSQL nodes to accept read and write operations simultaneously, making it ideal for applications that require low-latency writes across geographically distributed regions. pgEdge uses the Spock extension for logical replication to synchronize data between nodes without requiring a single primary. Each node is a fully autonomous PostgreSQL instance, and conflicts are resolved automa",
      "score": 0.00819672131147541
    },
    {
      "id": "5",
      "content": "Spock is a logical replication extension for PostgreSQL that powers pgEdge's multi-active replication. Unlike physical replication (streaming replication), Spock replicates individual rows using logical decoding, allowing each node to have its own timeline and accept writes. Spock replication is configured per-database and per-table. DDL changes such as CREATE TABLE and ALTER TABLE are NOT automatically replicated by Spock — they must be run on each node separately. Spock uses a publish/subscrib",
      "score": 0.00819672131147541
    },
    {
      "content": "Spock is a logical replication extension for PostgreSQL that powers pgEdge's multi-active replication. Unlike physical replication (streaming replication), Spock replicates individual rows using logical decoding, allowing each node to have its own timeline and accept writes. Spock replication is configured per-database and per-table. DDL changes such as CREATE TABLE and ALTER TABLE are NOT automatically replicated by Spock — they must be run on each node separately. Spock uses a publish/subscrib",
      "score": 0.008064516129032258
    },
    {
      "id": "1",
      "content": "pgEdge is a distributed PostgreSQL platform designed for global, multi-active deployments. It enables multiple PostgreSQL nodes to accept read and write operations simultaneously, making it ideal for applications that require low-latency writes across geographically distributed regions. pgEdge uses the Spock extension for logical replication to synchronize data between nodes without requiring a single primary. Each node is a fully autonomous PostgreSQL instance, and conflicts are resolved automa",
      "score": 0.008064516129032258
    },
    {
      "id": "12",
      "content": "restore workflow, stopping the database instance, running pgBackRest restore, and restarting Patroni. In a multi-node setup, pgBackRest is configured on a designated backup host and all nodes register with it for coordinated backup scheduling.",
      "score": 0.007936507936507936
    },
    {
      "content": "greSQL instance, and conflicts are resolved automatically using configurable conflict resolution policies. pgEdge supports standard PostgreSQL features including extensions, stored procedures, and full SQL compatibility.",
      "score": 0.007936507936507936
    },
    {
      "id": "6",
      "content": "ach node separately. Spock uses a publish/subscribe model where each node publishes its changes to a replication slot and subscribes to changes from other nodes. Conflicts occur when the same row is modified on two nodes simultaneously and are resolved by the configured resolution strategy (e.g., last-update-wins or error).",
      "score": 0.0078125
    },
    {
      "content": "pgEdge integrates with pgBackRest for backup and restore operations. pgBackRest supports full, differential, and incremental backups, and can store backups in local storage or cloud object storage (S3, GCS, Azure Blob). Backups are triggered via the Control Plane API using POST /v1/databases/{id}/backup. Restores can recover to a specific point in time using the restore_to_time parameter in the restore request. The Control Plane orchestrates the restore workflow, stopping the database instance, ",
      "score": 0.0078125
    },
    {
      "content": "The pgEdge Control Plane is a management server that orchestrates distributed PostgreSQL databases. It uses Docker Swarm to deploy database containers across multiple hosts and provides a declarative REST API for database lifecycle management. Users define the desired state of a database (nodes, users, services, scripts) in a JSON specification and submit it via POST /v1/databases. The Control Plane handles provisioning, configuration, and reconciliation automatically. It uses embedded etcd for ",
      "score": 0.007692307692307693
    },
    {
      "id": "2",
      "content": "greSQL instance, and conflicts are resolved automatically using configurable conflict resolution policies. pgEdge supports standard PostgreSQL features including extensions, stored procedures, and full SQL compatibility.",
      "score": 0.007692307692307693
    }
  ],
  "tokens_used": 1307
}
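The small `score` values (~0.008) come from rank fusion rather than cosine similarity: with reciprocal-rank fusion, each result list contributes 1/(k + rank), with k conventionally 60, so fused scores land in the hundredths or below. A minimal sketch of the idea (the exact k and any weighting or averaging the service applies are assumptions):

```python
def rrf_scores(ranked_lists, k=60):
    """Reciprocal-rank fusion: score(d) = sum over lists of 1/(k + rank_d)."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))

# Fuse a vector ranking with a BM25 keyword ranking:
fused = rrf_scores([
    ["doc-1", "doc-5", "doc-2"],   # vector search order
    ["doc-1", "doc-2", "doc-5"],   # keyword search order
])
# doc-1 is first in both lists: 1/61 + 1/61, roughly 0.033 -- the same
# order of magnitude as the scores in the response above.
```

Because the fused score depends only on ranks, not raw similarity values, it should not be read as a cosine distance.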

Checklist

  • Documentation updated

PLAT-495

@coderabbitai

coderabbitai Bot commented Apr 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Adds an unreleased changelog entry and expands documentation: updates the services index and introduces a comprehensive pgEdge RAG Server guide covering provisioning, configuration, hybrid vector+keyword retrieval, LLM answer synthesis, request/response contracts, deployment, and troubleshooting.

Changes

  • Changelog Entry (changes/unreleased/Added-20260422-004204.yaml):
    Adds a new unreleased changelog item (kind: Added) describing the RAG
    service addition with timestamp.
  • Services Index (docs/services/index.md): Edits the pgEdge RAG Server
    description to state it returns LLM-synthesized answers grounded in
    your data and links to the new RAG doc.
  • RAG Documentation (docs/services/rag.md): Adds a full RAG service
    page: control plane provisioning, DB prerequisites and example SQL,
    hybrid vector+BM25 retrieval flow, config/pipelines reference, query
    request/response contract, deployment/update steps, and
    troubleshooting guidance.

Poem

🐰 I tunneled through docs at dawn's first light,
Vectors hum and token prompts take flight,
I stitched the pipeline, rows and prompts aligned,
Hybrid search and answers neatly combined,
A rabbit's tiny clap — the RAG just shined!

🚥 Pre-merge checks: ✅ 5 passed

  • Title check: ✅ Passed. The PR title accurately describes the main
    change: adding comprehensive RAG service documentation and deployment
    guide.
  • Description check: ✅ Passed. The PR description follows the template
    with all required sections: Summary, Changes, Testing, and Checklist.
    Testing includes detailed verification steps with curl examples,
    database setup, and query results.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files
    to evaluate docstring coverage. Skipping docstring coverage check.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues
    were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked
    issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch PLAT-495-rag-service-database-preparation-guide

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codacy-production

codacy-production Bot commented Apr 21, 2026

Up to standards ✅

🟢 Issues: 0 new issues

View in Codacy



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/services/rag.md`:
- Around line 333-337: The examples after the first one are missing the required
DB schema provisioning described in the intro; update examples 2–5 to include
the scripts.post_database_create entry (same semantics as the Minimal example)
so the vector extension, documents_content_chunks table, and related indexes are
created, or alternatively amend the intro to state only the Minimal example
includes the schema setup; reference the scripts.post_database_create field and
ensure it provisions CREATE EXTENSION IF NOT EXISTS vector, the
documents_content_chunks table (with embedding vector(1536)), and the HNSW and
tsvector indexes so runtime queries against documents_content_chunks succeed.
- Line 772: Replace the invalid Anthropic model identifier "claude-sonnet-4-5"
with the correct ID "claude-sonnet-4-20250514" in the RAG service configuration
entries that currently reference that string; ensure both occurrences (the one
shown in the diff and the other matching instance) are updated so they match the
working model ID used elsewhere ("claude-sonnet-4-20250514").
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5b216a36-bf55-4fd3-9651-f30fa0fff6c7

📥 Commits

Reviewing files that changed from the base of the PR and between a682f3e and 95698a3.

📒 Files selected for processing (3)
  • changes/unreleased/Added-20260422-004204.yaml
  • docs/services/index.md
  • docs/services/rag.md

@mmols mmols requested a review from moizpgedge April 22, 2026 03:50

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
docs/services/rag.md (1)

124-124: ⚠️ Potential issue | 🟠 Major

Use explicit, currently supported Anthropic model IDs consistently.

claude-sonnet-4-5 is ambiguous in docs/examples and can break as aliases change. Please standardize on an official current ID format across the reference and all curl examples (either a stable alias or a dated snapshot), and add a short note to check Anthropic deprecation timelines.

As of April 2026, what are the valid Anthropic Claude API model identifiers for Sonnet 4.5 and Sonnet 4.6, and which are aliases versus snapshot IDs? Please use Anthropic official documentation links only.

Also applies to: 291-291, 420-420, 556-557, 664-664, 891-891

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/services/rag.md` at line 124, The docs use an ambiguous Anthropic model
ID ("claude-sonnet-4-5") — update the `model` examples in docs/services/rag.md
and every matching occurrence (the `model` table entry and all curl/example
usages) to a specific, currently supported Anthropic model identifier (choose
either the official stable alias or the dated snapshot ID) and make the ID
consistent across all references; add a one-line note below the `model`
description reminding readers to confirm Anthropic deprecation timelines and
link to Anthropic's official model docs for verification so examples remain
valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@docs/services/rag.md`:
- Line 124: The docs use an ambiguous Anthropic model ID ("claude-sonnet-4-5") —
update the `model` examples in docs/services/rag.md and every matching
occurrence (the `model` table entry and all curl/example usages) to a specific,
currently supported Anthropic model identifier (choose either the official
stable alias or the dated snapshot ID) and make the ID consistent across all
references; add a one-line note below the `model` description reminding readers
to confirm Anthropic deprecation timelines and link to Anthropic's official
model docs for verification so examples remain valid.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 14730477-48bd-485a-af23-a457e0717e4e

📥 Commits

Reviewing files that changed from the base of the PR and between 95698a3 and 453f5a7.

📒 Files selected for processing (1)
  • docs/services/rag.md

tsivaprasad and others added 3 commits April 22, 2026 20:04
- Resolve index.md merge conflict: keep RAG Server link, adopt
  main's connect_as-based Database Credentials section and
  updated Next Steps
- Apply pgEdge stylesheet to rag.md: 79-char wrap, hyphens for
  em-dashes, table intro sentences, bullet periods, no bold
  headings, Next Steps as doc links
- Remove redundant sections: Automation & Responsibilities,
  Loading Documents (duplicated in Step 3), User-Managed
  Responsibilities, Search Configuration (duplicated in
  Config Reference)

PLAT-495

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Keep fuller service descriptions for MCP and RAG entries;
adopt main's restructured intro and connect_as wording.

PLAT-495

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Restore main's trimmed MCP description; our PR only adds
the RAG Server entry.

PLAT-495

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Score values like 0.82/0.87 implied cosine similarity but the
RAG service returns RRF scores which are much smaller (~0.008).
Update both example responses to use realistic RRF score values.

PLAT-495

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/services/rag.md (1)

124-124: ⚠️ Potential issue | 🟠 Major

Verify and normalize Anthropic model IDs in examples.

claude-sonnet-4-5 appears repeatedly, but Anthropic examples/docs typically use dated IDs (for example claude-sonnet-4-20250514) or the official alias pattern. Please verify what the RAG server accepts and update all occurrences to a valid, consistent identifier to avoid copy/paste failures.

Does Anthropic's Messages API accept "claude-sonnet-4-5" as a valid model ID, or should docs use "claude-sonnet-4-20250514" / official alias names? Please cite Anthropic official model list docs.

Also applies to: 291-291, 420-420, 556-556, 577-577, 664-664, 891-891

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/services/rag.md` at line 124, Update all occurrences of the example
Anthropic model ID `claude-sonnet-4-5` used in the `model` examples to a
verified, valid Anthropic model identifier (e.g., the dated ID
`claude-sonnet-4-20250514` or the official alias) and ensure consistency across
the document; verify the exact accepted IDs against Anthropic's official model
list and replace every instance (including the other noted occurrences of
`claude-sonnet-4-5`) with the canonical ID or alias the RAG server accepts, and
add a short parenthetical note next to the `model` field explaining that the
value must match Anthropic's official model name/alias.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/services/rag.md`:
- Around line 1025-1031: The docs currently suggest checking the wrong DB user;
update the verification commands to check the service user used by this
deployment (connect_as) instead of admin: replace the example command that runs
"\du+ admin" with "\du+ app_read_only" so the guide validates the actual service
user (app_read_only) and keep the table check "\dt documents_content_chunks"
as-is.

---

Duplicate comments:
In `@docs/services/rag.md`:
- Line 124: Update all occurrences of the example Anthropic model ID
`claude-sonnet-4-5` used in the `model` examples to a verified, valid Anthropic
model identifier (e.g., the dated ID `claude-sonnet-4-20250514` or the official
alias) and ensure consistency across the document; verify the exact accepted IDs
against Anthropic's official model list and replace every instance (including
the other noted occurrences of `claude-sonnet-4-5`) with the canonical ID or
alias the RAG server accepts, and add a short parenthetical note next to the
`model` field explaining that the value must match Anthropic's official model
name/alias.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2a66fb37-373b-4721-ad85-ccd867e38523

📥 Commits

Reviewing files that changed from the base of the PR and between 453f5a7 and 7f29558.

📒 Files selected for processing (2)
  • docs/services/index.md
  • docs/services/rag.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/services/index.md


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/services/rag.md`:
- Around line 1023-1025: Replace the hardcoded "-h localhost" in the psql
connectivity check command with the same host placeholder used throughout the
guide (e.g., "host-1" or the shared DB host variable) so the example reflects
distributed deployments; update the command shown (the psql -h ... -U admin -d
knowledge_base -c "SELECT 1") to use that placeholder instead of localhost.
- Around line 923-924: The example uses two different host
conventions—"localhost:9200" in the curl example and "host-1:9200"
earlier—causing copy/paste errors; pick one convention (preferably the earlier
"host-1:9200") and update all occurrences of "localhost:9200" (including the
curl example "curl http://localhost:9200/v1/pipelines" and the other instance
around the later snippet) to use the same host string "host-1:9200" so all
service URL examples are consistent.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 61ee05e5-bbb1-47d1-84fe-1585474c65b5

📥 Commits

Reviewing files that changed from the base of the PR and between 7f29558 and 2d4c6c0.

📒 Files selected for processing (1)
  • docs/services/rag.md

@tsivaprasad tsivaprasad requested a review from moizpgedge April 23, 2026 06:21
Contributor

@moizpgedge moizpgedge left a comment


LGTM!

Comment thread docs/services/rag.md Outdated
### OpenAI End-to-End

In the following example, OpenAI is used for both embeddings and
answer generation:
Member


This sentence would read more clearly as:

In the following example, OpenAI is used for both embeddings and to generate answers:

Clarified wording in the documentation regarding shared default values for pipelines.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
docs/services/rag.md (2)

438-500: Add dimension reminder to Ollama example.

Similar to the Voyage AI example, users of the Ollama example need to know that nomic-embed-text produces 768-dimensional vectors, not 1536. Without this reminder, document insertions will fail with dimension mismatch errors.

📝 Suggested inline reminder

Add a note after the example heading:

### Ollama (Self-Hosted)

In the following example, the RAG service uses a self-hosted Ollama
server for both embeddings and answer generation. No API key is
required; the Ollama server URL is provided via `base_url`:

!!! note
    When using `nomic-embed-text`, adjust the database schema to use `vector(768)` 
    instead of `vector(1536)`. See [Vector Dimensions](`#vector-dimensions`).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/services/rag.md` around lines 438 - 500, Add a short note under the
"Ollama (Self-Hosted)" example explaining that the embedding model
"nomic-embed-text" produces 768-dimensional vectors (not 1536) and that users
must update their schema/vector column type (e.g., use vector(768)) before
inserting documents; place the reminder near the embedding_llm block in the
example so it’s visible when configuring the Ollama provider.

371-436: Add dimension reminder to Voyage AI example.

While lines 224-231 explain that users should adjust vector(N) dimensions, the Voyage AI example doesn't include an inline reminder that voyage-3 requires vector(1024) instead of the vector(1536) shown in the first example's schema. Users who jump directly to this example might miss the dimension mismatch, resulting in insertion failures.

📝 Suggested inline reminder

Add a note after the example heading:

### Voyage AI with Vector-Only Search

In the following example, Voyage AI is used for embeddings and the
service is configured for vector-only search (disabling BM25 keyword
matching):

!!! note
    When using `voyage-3`, adjust the database schema to use `vector(1024)` 
    instead of `vector(1536)`. See [Vector Dimensions](`#vector-dimensions`).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/services/rag.md` around lines 371 - 436, The Voyage AI example is
missing the schema-dimension reminder; update the "Voyage AI with Vector-Only
Search" section to add a short note after the heading that tells users that the
voyage-3 model requires vector(1024) (not vector(1536)) and to adjust the
database schema accordingly (reference the Vector Dimensions anchor), so readers
who jump straight to this example won't get failed inserts due to dimension
mismatch.
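
For readers who jump straight here, the corresponding schema can be sketched in SQL. The names below mirror the test setup in this PR's description (illustrative only), with the dimension set for voyage-3:

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS documents_content_chunks (
    id        BIGSERIAL PRIMARY KEY,
    content   TEXT NOT NULL,
    embedding vector(1024),  -- voyage-3 emits 1024-dimensional vectors
    title     TEXT,
    source    TEXT
);

CREATE INDEX ON documents_content_chunks
    USING hnsw (embedding vector_cosine_ops);
```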
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 25822863-81d1-467e-8745-bca1003c7225

📥 Commits

Reviewing files that changed from the base of the PR and between 2d4c6c0 and c8668ef.

📒 Files selected for processing (1)
  • docs/services/rag.md

Comment thread docs/services/rag.md
| Field | Type | Description |
|---|---|---|
| `provider` | string | Required. The LLM provider. One of: `anthropic`, `openai`, `ollama`. |
| `model` | string | Required. The model name (e.g., `claude-sonnet-4-5`, `gpt-4o`, `llama3.2`). |

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

What are the current valid Claude Sonnet model identifiers for the Anthropic API in 2026?

💡 Result:

The current valid Claude Sonnet model identifiers for the Anthropic API as of April 23, 2026, are claude-sonnet-4-6 (primary current model, alias claude-sonnet-4-6) and claude-sonnet-4-5-20250929 (still available but recommended to migrate). Older versions like claude-sonnet-4-20250514 are deprecated and scheduled for retirement on June 15, 2026. Previous Claude 3.x Sonnet models (e.g., claude-3-7-sonnet-20250219, claude-3-5-sonnet-*) have been retired.

Update all model identifier examples from claude-sonnet-4-5 to claude-sonnet-4-6.

The model identifier claude-sonnet-4-5 is not a valid Anthropic Claude model. Current valid identifiers are claude-sonnet-4-6 (primary) and claude-sonnet-4-5-20250929. The incorrect identifier appears in 7 locations throughout the documentation (lines 124, 294, 423, 558, 579, 666, 893) and will cause all examples to fail at runtime. Update these references to use claude-sonnet-4-6.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/services/rag.md` at line 124, Replace every example model identifier
"claude-sonnet-4-5" with the valid identifier "claude-sonnet-4-6" in the docs
for the `model` field; specifically update all seven occurrences referenced in
the review (examples shown alongside the `model` property in
docs/services/rag.md) so examples use "claude-sonnet-4-6" (do not change other
models like `gpt-4o` or `llama3.2`).
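
The requested rename is a single substitution across the file. The sketch below demonstrates it on a sample line so it is self-contained; against the real repo the target would be docs/services/rag.md:

```shell
# Sample line standing in for one of the seven occurrences in the docs.
printf '| `model` | string | e.g., `claude-sonnet-4-5`, `gpt-4o`, `llama3.2` |\n' > sample.md

# Replace only the stale Claude identifier; gpt-4o and llama3.2 are untouched.
sed -i 's/claude-sonnet-4-5/claude-sonnet-4-6/g' sample.md

cat sample.md  # now contains claude-sonnet-4-6
```

Note that a plain substitution would also rewrite dated identifiers such as claude-sonnet-4-5-20250929 if they appeared; a quick `grep -n claude-sonnet-4-5 docs/services/rag.md` dry run first confirms only the intended occurrences match.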

Member

@susan-pgedge susan-pgedge left a comment

I committed the change that I noted below... this all looks good to me!

Comment thread docs/services/rag.md Outdated
### Minimal (OpenAI + Anthropic)

In the following example, a `curl` command provisions a RAG service
with OpenAI for embeddings and Anthropic Claude for answer generation:

Maybe:

In the following example, a curl command provisions a RAG service using OpenAI for embeddings and Anthropic Claude to generate answers:

@tsivaprasad tsivaprasad merged commit d8aaf53 into main Apr 23, 2026
3 checks passed
@tsivaprasad tsivaprasad deleted the PLAT-495-rag-service-database-preparation-guide branch April 23, 2026 17:01
