docs: improve llama stack pgvector docs #199

Open

davidwtf wants to merge 2 commits into master from imp/llamastack-pgvector

Conversation

@davidwtf (Contributor)

@davidwtf davidwtf commented Apr 24, 2026

Summary by CodeRabbit

  • Documentation
    • Added end-to-end PGVector vector store setup and validation guidance, including server enablement and connection configuration.
    • Updated quickstart and notebook with an optional PGVector workflow showing vector store creation, file upload, and hybrid search examples.
    • Clarified embedding model provisioning and offline modes, including how the default model is obtained and how to pre-populate caches for offline use.


coderabbitai Bot commented Apr 24, 2026

Walkthrough

Documentation adds optional PGVector-backed vector store support and detailed embedding model provisioning options (Hugging Face online/offline modes). Changes include env var examples for PGVector and HF, Kubernetes containerSpec.env snippets, validation/workflow steps, and an updated quickstart notebook demonstrating PGVector creation and hybrid search.

Changes

Installation & Features & Quickstart Docs (docs/en/llama_stack/install.mdx, docs/en/llama_stack/overview/features.mdx, docs/en/llama_stack/quickstart.mdx):
Added PGVector enablement docs (env vars ENABLE_PGVECTOR, PGVECTOR_*), Kubernetes containerSpec.env examples, a PGVector validation workflow, and detailed Hugging Face embedding-model access modes (online, cached, fully offline) with the related env vars (HF_ENDPOINT, cache/offline notes). The quickstart now pins llama-stack-client==0.6.0.

Quickstart Notebook (docs/public/llama-stack/llama-stack_quickstart.ipynb):
Inserted a new Section 4 demonstrating embedding-model discovery, deriving the embedding dimensions, uploading a sample file, creating a provider_id="pgvector" vector store with the resolved embedding settings, running a hybrid search, and printing the payloads; downstream sections were renumbered.
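The notebook flow summarized above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the notebook's actual code: the helper name, the sample model id, and the dimension value are assumptions; only provider_id="pgvector" is taken from the PR itself.

```python
def pgvector_store_params(embedding_model_id: str,
                          embedding_dimension: int,
                          name: str = "quickstart-store") -> dict:
    """Assemble the creation parameters for a PGVector-backed vector store.

    A client (e.g. llama-stack-client) would pass these through to the
    server; provider_id="pgvector" is what routes the store to the
    PGVector provider, as the notebook's Section 4 does.
    """
    if embedding_dimension <= 0:
        raise ValueError("embedding_dimension must be a positive integer")
    return {
        "name": name,
        "provider_id": "pgvector",
        "embedding_model": embedding_model_id,
        "embedding_dimension": embedding_dimension,
    }

# Hypothetical usage with a 384-dimension sentence-transformer model:
params = pgvector_store_params("all-MiniLM-L6-v2", 384)
```

The dimension must match the embedding model exactly, since pgvector columns are fixed-width; that constraint is what the review comment below is about.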

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Client
    participant Server
    participant PGVectorDB as PostgreSQL (pgvector)
    participant HF as Hugging Face

    rect rgba(200,230,201,0.5)
    User->>Client: Run quickstart / notebook (create/search)
    Client->>Server: API request (create vector store / search)
    end

    rect rgba(187,222,251,0.5)
    Server->>HF: Request embedding model / embeddings (HF_ENDPOINT or local cache)
    HF-->>Server: Return embeddings
    end

    rect rgba(255,224,178,0.5)
    Server->>PGVectorDB: Store vectors / Run hybrid search (pgvector)
    PGVectorDB-->>Server: Return vector search results
    end

    Server-->>Client: Return search results
    Client-->>User: Display results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • zhaomingkun1030
  • typhoonzero

Poem

🐰 I hopped through docs both near and far,
PGVector nestled like a star.
Embeddings fetched or cached with care,
Hybrid searches dancing in the air. ✨

🚥 Pre-merge checks: ✅ 5 passed

  • Description Check: ✅ Passed (check skipped; CodeRabbit's high-level summary is enabled)
  • Title Check: ✅ Passed (the title directly describes the main change: improvements to the PGVector documentation in the Llama Stack installation and usage guides)
  • Docstring Coverage: ✅ Passed (no functions found in the changed files; docstring coverage check skipped)
  • Linked Issues Check: ✅ Passed (check skipped; no linked issues were found for this pull request)
  • Out of Scope Changes Check: ✅ Passed (check skipped; no linked issues were found for this pull request)


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Ruff (0.15.11) failed on docs/public/llama-stack/llama-stack_quickstart.ipynb: "Unexpected end of JSON input"
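"Unexpected end of JSON input" means Ruff could not parse the notebook as JSON (an .ipynb file is a JSON document). A quick local sanity check, independent of Ruff, is simply to try loading the file; the helper name below is ours, not a Ruff feature.

```python
import json

def notebook_is_valid_json(path: str) -> bool:
    """Return True if the .ipynb file at `path` parses as JSON."""
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)
        return True
    except (json.JSONDecodeError, OSError):
        return False
```

Running this against docs/public/llama-stack/llama-stack_quickstart.ipynb before pushing would catch a truncated or malformed notebook early.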



coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/public/llama-stack/llama-stack_quickstart.ipynb`:
- Around line 512-518: The code silently falls back to a hardcoded 768 when
determining embedding_dimension; instead, change the logic around
embedding_dimension (the block using
embedding_metadata.get("embedding_dimension") / get("dimensions") and
getattr(embedding_model, "embedding_dimension"/"dimensions")) to fail fast: if
none of those values exist, raise a clear error (e.g., ValueError) or log an
explicit warning and require the user to supply an explicit embedding_dimension
parameter; ensure the error message references the embedding_model and
embedding_metadata so users know to verify their model (e.g., mention common
dims like 384/1024/1536) and avoid creating the pgvector store with an incorrect
dimension.
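The fail-fast behaviour the comment asks for could look roughly like this. The function name and error wording are illustrative; only the metadata keys, attribute names, and the common dimension values come from the comment itself.

```python
def resolve_embedding_dimension(embedding_model, embedding_metadata):
    """Resolve the embedding dimension from metadata or model attributes,
    raising instead of silently falling back to a hardcoded 768."""
    candidates = (
        embedding_metadata.get("embedding_dimension"),
        embedding_metadata.get("dimensions"),
        getattr(embedding_model, "embedding_dimension", None),
        getattr(embedding_model, "dimensions", None),
    )
    for value in candidates:
        if value:  # skip None and 0
            return int(value)
    raise ValueError(
        f"Cannot determine embedding dimension for model {embedding_model!r} "
        f"(metadata: {embedding_metadata!r}). Pass embedding_dimension "
        "explicitly; common values are 384, 768, 1024 or 1536."
    )
```

Failing here is cheap; creating the pgvector store with the wrong width means every subsequent insert or search fails, or silently matches against truncated vectors.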
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 648bff00-c121-4c69-b176-58221b71c508

📥 Commits

Reviewing files that changed from the base of the PR and between d4c4a48 and ed34ea3.

📒 Files selected for processing (4)
  • docs/en/llama_stack/install.mdx
  • docs/en/llama_stack/overview/features.mdx
  • docs/en/llama_stack/quickstart.mdx
  • docs/public/llama-stack/llama-stack_quickstart.ipynb

Comment thread on docs/public/llama-stack/llama-stack_quickstart.ipynb (marked outdated)
cloudflare-workers-and-pages Bot commented Apr 24, 2026

Deploying alauda-ai with Cloudflare Pages

Latest commit: c0d9a1b
Status: ✅  Deploy successful!
Preview URL: https://f27b5259.alauda-ai.pages.dev
Branch Preview URL: https://imp-llamastack-pgvector.alauda-ai.pages.dev


coderabbitai Bot left a comment

🧹 Nitpick comments (1)
docs/en/llama_stack/install.mdx (1)

148-182: Optional: consolidate HF configuration guidance.

The Hugging Face setup is described in two places — the commented block inside the YAML (lines 90–104) and this standalone section — with partly overlapping content. Consider either (a) trimming the YAML comments to just cross-reference this section, or (b) removing the standalone section and keeping the YAML comments as the single source of truth. Not a blocker; purely a readability / maintenance nit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/en/llama_stack/install.mdx` around lines 148 - 182, This file duplicates
Hugging Face configuration guidance across the YAML comment block and the
standalone "Hugging Face Access For Embedding Models" section; pick one
canonical location and remove the duplicate: either trim the YAML comment block
to a brief cross-reference that points to this standalone section, or delete
this standalone section and expand the YAML comments to be the single source of
truth. While editing, ensure the documented environment variables (HF_ENDPOINT,
HF_HUB_CACHE, HF_HUB_OFFLINE, TRANSFORMERS_OFFLINE, HF_HUB_DISABLE_XET) and the
recommended cache path (/home/lls/.lls/huggingface/hub) are preserved in the
chosen location and update any internal links or comment markers so readers can
find the full configuration in one place.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f29828f7-972d-4599-89ed-5a3ecdc0cfc2

📥 Commits

Reviewing files that changed from the base of the PR and between ed34ea3 and c0d9a1b.

📒 Files selected for processing (3)
  • docs/en/llama_stack/install.mdx
  • docs/en/llama_stack/quickstart.mdx
  • docs/public/llama-stack/llama-stack_quickstart.ipynb
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/en/llama_stack/quickstart.mdx

