docs: improve llama stack pgvector docs #199

Open

davidwtf wants to merge 2 commits into master from imp/llamastack-pgvector

Conversation

@davidwtf (Contributor)

@davidwtf davidwtf commented Apr 24, 2026

Summary by CodeRabbit

  • Documentation
    • Added end-to-end PGVector vector store setup and validation guidance, including server enablement and connection configuration.
    • Updated quickstart and notebook with an optional PGVector workflow showing vector store creation, file upload, and hybrid search examples.
    • Clarified embedding model provisioning and offline modes, including how the default model is obtained and how to pre-populate caches for offline use.


coderabbitai Bot commented Apr 24, 2026

Walkthrough

Documentation adds optional PGVector-backed vector store support and detailed embedding model provisioning options (Hugging Face online/offline modes). Changes include env var examples for PGVector and HF, Kubernetes containerSpec.env snippets, validation/workflow steps, and an updated quickstart notebook demonstrating PGVector creation and hybrid search.

Changes

Installation & Features & Quickstart Docs (docs/en/llama_stack/install.mdx, docs/en/llama_stack/overview/features.mdx, docs/en/llama_stack/quickstart.mdx):
Added PGVector enablement docs (env vars ENABLE_PGVECTOR, PGVECTOR_*), Kubernetes containerSpec.env examples, a PGVector validation workflow, and detailed Hugging Face embedding-model access modes (online, cached, fully offline) with the related env vars (HF_ENDPOINT, cache/offline notes). The quickstart now pins llama-stack-client==0.6.0.

Quickstart Notebook (docs/public/llama-stack/llama-stack_quickstart.ipynb):
Inserted a new Section 4 demonstrating embedding-model discovery, deriving the embedding dimensions, uploading a sample file, creating a provider_id="pgvector" vector store with the resolved embedding settings, running a hybrid search, and printing the payloads; downstream sections were renumbered.
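The notebook flow summarized above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the notebook's actual code: the helper name, the sample model id, and the dimension value are assumptions; only provider_id="pgvector" is taken from the PR itself.

```python
def pgvector_store_params(embedding_model_id: str,
                          embedding_dimension: int,
                          name: str = "quickstart-store") -> dict:
    """Assemble the creation parameters for a PGVector-backed vector store.

    A client (e.g. llama-stack-client) would pass these through to the
    server; provider_id="pgvector" is what routes the store to the
    PGVector provider, as the notebook's Section 4 does.
    """
    if embedding_dimension <= 0:
        raise ValueError("embedding_dimension must be a positive integer")
    return {
        "name": name,
        "provider_id": "pgvector",
        "embedding_model": embedding_model_id,
        "embedding_dimension": embedding_dimension,
    }

# Hypothetical usage with a 384-dimension sentence-transformer model:
params = pgvector_store_params("all-MiniLM-L6-v2", 384)
```

The dimension must match the embedding model exactly, since pgvector columns are fixed-width; that constraint is what the review comment below is about.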

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Client
    participant Server
    participant PGVectorDB as PostgreSQL (pgvector)
    participant HF as Hugging Face

    rect rgba(200,230,201,0.5)
    User->>Client: Run quickstart / notebook (create/search)
    Client->>Server: API request (create vector store / search)
    end

    rect rgba(187,222,251,0.5)
    Server->>HF: Request embedding model / embeddings (HF_ENDPOINT or local cache)
    HF-->>Server: Return embeddings
    end

    rect rgba(255,224,178,0.5)
    Server->>PGVectorDB: Store vectors / Run hybrid search (pgvector)
    PGVectorDB-->>Server: Return vector search results
    end

    Server-->>Client: Return search results
    Client-->>User: Display results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • zhaomingkun1030
  • typhoonzero

Poem

🐰 I hopped through docs both near and far,
PGVector nestled like a star.
Embeddings fetched or cached with care,
Hybrid searches dancing in the air. ✨

🚥 Pre-merge checks: ✅ 5 passed

  • Description Check: ✅ Passed (check skipped; CodeRabbit's high-level summary is enabled)
  • Title Check: ✅ Passed (the title directly describes the main change: improvements to the PGVector documentation in the Llama Stack installation and usage guides)
  • Docstring Coverage: ✅ Passed (no functions found in the changed files; docstring coverage check skipped)
  • Linked Issues Check: ✅ Passed (check skipped; no linked issues were found for this pull request)
  • Out of Scope Changes Check: ✅ Passed (check skipped; no linked issues were found for this pull request)


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Ruff (0.15.11) failed on docs/public/llama-stack/llama-stack_quickstart.ipynb: "Unexpected end of JSON input"
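"Unexpected end of JSON input" means Ruff could not parse the notebook as JSON (an .ipynb file is a JSON document). A quick local sanity check, independent of Ruff, is simply to try loading the file; the helper name below is ours, not a Ruff feature.

```python
import json

def notebook_is_valid_json(path: str) -> bool:
    """Return True if the .ipynb file at `path` parses as JSON."""
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)
        return True
    except (json.JSONDecodeError, OSError):
        return False
```

Running this against docs/public/llama-stack/llama-stack_quickstart.ipynb before pushing would catch a truncated or malformed notebook early.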



coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/public/llama-stack/llama-stack_quickstart.ipynb`:
- Around line 512-518: The code silently falls back to a hardcoded 768 when
determining embedding_dimension; instead, change the logic around
embedding_dimension (the block using
embedding_metadata.get("embedding_dimension") / get("dimensions") and
getattr(embedding_model, "embedding_dimension"/"dimensions")) to fail fast: if
none of those values exist, raise a clear error (e.g., ValueError) or log an
explicit warning and require the user to supply an explicit embedding_dimension
parameter; ensure the error message references the embedding_model and
embedding_metadata so users know to verify their model (e.g., mention common
dims like 384/1024/1536) and avoid creating the pgvector store with an incorrect
dimension.
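The fail-fast behaviour the comment asks for could look roughly like this. The function name and error wording are illustrative; only the metadata keys, attribute names, and the common dimension values come from the comment itself.

```python
def resolve_embedding_dimension(embedding_model, embedding_metadata):
    """Resolve the embedding dimension from metadata or model attributes,
    raising instead of silently falling back to a hardcoded 768."""
    candidates = (
        embedding_metadata.get("embedding_dimension"),
        embedding_metadata.get("dimensions"),
        getattr(embedding_model, "embedding_dimension", None),
        getattr(embedding_model, "dimensions", None),
    )
    for value in candidates:
        if value:  # skip None and 0
            return int(value)
    raise ValueError(
        f"Cannot determine embedding dimension for model {embedding_model!r} "
        f"(metadata: {embedding_metadata!r}). Pass embedding_dimension "
        "explicitly; common values are 384, 768, 1024 or 1536."
    )
```

Failing here is cheap; creating the pgvector store with the wrong width means every subsequent insert or search fails, or silently matches against truncated vectors.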
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 648bff00-c121-4c69-b176-58221b71c508

📥 Commits

Reviewing files that changed from the base of the PR and between d4c4a48 and ed34ea3.

📒 Files selected for processing (4)
  • docs/en/llama_stack/install.mdx
  • docs/en/llama_stack/overview/features.mdx
  • docs/en/llama_stack/quickstart.mdx
  • docs/public/llama-stack/llama-stack_quickstart.ipynb

Comment thread on docs/public/llama-stack/llama-stack_quickstart.ipynb (marked outdated)
cloudflare-workers-and-pages Bot commented Apr 24, 2026

Deploying alauda-ai with Cloudflare Pages

Latest commit: c0d9a1b
Status: ✅  Deploy successful!
Preview URL: https://f27b5259.alauda-ai.pages.dev
Branch Preview URL: https://imp-llamastack-pgvector.alauda-ai.pages.dev


coderabbitai Bot left a comment

🧹 Nitpick comments (1)
docs/en/llama_stack/install.mdx (1)

148-182: Optional: consolidate HF configuration guidance.

The Hugging Face setup is described in two places — the commented block inside the YAML (lines 90–104) and this standalone section — with partly overlapping content. Consider either (a) trimming the YAML comments to just cross-reference this section, or (b) removing the standalone section and keeping the YAML comments as the single source of truth. Not a blocker; purely a readability / maintenance nit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/llama_stack/install.mdx` around lines 148 - 182, This file duplicates
Hugging Face configuration guidance across the YAML comment block and the
standalone "Hugging Face Access For Embedding Models" section; pick one
canonical location and remove the duplicate: either trim the YAML comment block
to a brief cross-reference that points to this standalone section, or delete
this standalone section and expand the YAML comments to be the single source of
truth. While editing, ensure the documented environment variables (HF_ENDPOINT,
HF_HUB_CACHE, HF_HUB_OFFLINE, TRANSFORMERS_OFFLINE, HF_HUB_DISABLE_XET) and the
recommended cache path (/home/lls/.lls/huggingface/hub) are preserved in the
chosen location and update any internal links or comment markers so readers can
find the full configuration in one place.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f29828f7-972d-4599-89ed-5a3ecdc0cfc2

📥 Commits

Reviewing files that changed from the base of the PR and between ed34ea3 and c0d9a1b.

📒 Files selected for processing (3)
  • docs/en/llama_stack/install.mdx
  • docs/en/llama_stack/quickstart.mdx
  • docs/public/llama-stack/llama-stack_quickstart.ipynb
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/en/llama_stack/quickstart.mdx

