
fix: RAG filtering error with multiple where conditions (fixes #1728)#1729

Open
praisonai-triage-agent[bot] wants to merge 3 commits into
mainfrom
claude/issue-1728-20260523-1153

Conversation

@praisonai-triage-agent
Contributor

@praisonai-triage-agent praisonai-triage-agent Bot commented May 23, 2026

Summary

Fixes the RAG filtering error raised by ChromaDB's len(where) != 1 check when the where parameter contains multiple conditions.

Problem

When using RAG with multiple filter conditions (user_id, agent_id, run_id), the ChromaKnowledgeAdapter creates a dictionary with multiple keys, but ChromaDB's internal validation expects exactly one condition in the where clause.
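
To make the failure mode concrete, here is a minimal sketch of how a multi-key filter can arise from the scoped parameters. The helper name build_scope_filter is hypothetical and is not taken from the adapter's source; it only illustrates that supplying more than one identifier yields a dictionary with more than one top-level key.

```python
from typing import Any, Dict, Optional


def build_scope_filter(
    user_id: Optional[str] = None,
    agent_id: Optional[str] = None,
    run_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: keep only the scope identifiers that were provided."""
    candidates = {"user_id": user_id, "agent_id": agent_id, "run_id": run_id}
    return {key: value for key, value in candidates.items() if value is not None}


# One identifier gives one top-level key, which is the shape that worked before the fix.
single = build_scope_filter(user_id="u1")

# Three identifiers give three top-level keys, which is the shape described as failing.
multi = build_scope_filter(user_id="u1", agent_id="a1", run_id="r1")

print(len(single), len(multi))
```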

Solution

Modified the ChromaKnowledgeAdapter to use ChromaDB's $and operator when multiple filter conditions are present:

  • Single condition: Passes through unchanged for optimal performance
  • Multiple conditions: Wrapped in {"$and": [...]} format as required by ChromaDB
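
The two cases above can be sketched as a small pure function. The name normalize_where is an assumption made for illustration; the PR applies the same idea inline inside the adapter's search method rather than through a helper like this.

```python
from typing import Any, Dict


def normalize_where(where_filter: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: return a filter with a single top-level key.

    Zero or one condition is returned unchanged; two or more conditions
    are wrapped in a "$and" list with one single-key dict per condition.
    """
    if len(where_filter) <= 1:
        return where_filter
    return {"$and": [{key: value} for key, value in where_filter.items()]}


print(normalize_where({"user_id": "u1"}))
print(normalize_where({"user_id": "u1", "agent_id": "a1"}))
```

Keeping the single-condition case as a pass-through means existing callers that filter on one field send exactly the same query as before.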

Changes

  • src/praisonai-agents/praisonaiagents/knowledge/adapters/factories.py: Updated search method to handle multiple where conditions using $and operator

Testing

  • ✅ Created verification test confirming proper formatting logic
  • ✅ Maintains backward compatibility with single condition filters
  • ✅ Follows ChromaDB's documented query format

Fixes

Fixes #1728

Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced search filtering to properly validate and construct filter conditions when applying scoped filters.

Review Change Stack

…ator

- Fix ChromaKnowledgeAdapter to handle multiple filter conditions properly
- Use ChromaDB's $and operator when len(where_filter) > 1
- Resolves validation error: len(where) != 1 when filtering by user_id, agent_id, run_id
- Maintains backward compatibility with single condition filters
- Fixes #1728

Co-authored-by: praisonai-triage-agent[bot] <praisonai-triage-agent[bot]@users.noreply.github.com>
@MervinPraison
Owner

@coderabbitai review

@MervinPraison
Owner

/review

@qodo-code-review

Qodo reviews are paused for this user.


@coderabbitai
Contributor

coderabbitai Bot commented May 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Contributor

coderabbitai Bot commented May 23, 2026

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d1dbb168-5efd-4061-9fd4-5b618673a60d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The PR fixes a ChromaDB query construction bug where multiple scoped filters (user_id, agent_id, run_id) caused failures. The ChromaKnowledgeAdapter.search method now normalizes the filter structure: single conditions are passed directly, while multiple conditions are wrapped in ChromaDB's $and format before execution.

Changes

ChromaDB Query Filter Normalization

| Layer / File(s) | Summary |
|---|---|
| Query filter condition handling (src/praisonai-agents/praisonaiagents/knowledge/adapters/factories.py) | The where_filter is conditionally normalized: single filter conditions are assigned directly to query_kwargs["where"], while multiple conditions are wrapped in a $and list structure that matches ChromaDB's expected query format. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 Filters that once would break and fail,
Now dance in $and without travail,
User, agent, run combined with grace,
ChromaDB searches find their place! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately reflects the main change: fixing RAG filtering to handle multiple where conditions by properly formatting them with ChromaDB's $and operator. |
| Linked Issues check | ✅ Passed | The PR directly addresses all coding requirements from issue #1728: fixing the bug where multiple where conditions (user_id, agent_id, run_id) caused filtering to fail, implementing proper ChromaDB query format with conditional $and wrapping, and ensuring backward compatibility. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the ChromaKnowledgeAdapter search method to fix the identified filtering bug; no unrelated modifications are present. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |



@MervinPraison
Owner

@copilot Do a thorough review of this PR. Read ALL existing reviewer comments above from Qodo, Coderabbit, and Gemini first; incorporate their findings.

Review areas:

  1. Bloat check: Are changes minimal and focused? Any unnecessary code or scope creep?
  2. Security: Any hardcoded secrets, unsafe eval/exec, missing input validation?
  3. Performance: Any module-level heavy imports? Hot-path regressions?
  4. Tests: Are tests included? Do they cover the changes adequately?
  5. Backward compat: Any public API changes without deprecation?
  6. Code quality: DRY violations, naming conventions, error handling?
  7. Address reviewer feedback: If Qodo, Coderabbit, or Gemini flagged valid issues, include them in your review
  8. Suggest specific improvements with code examples where possible

@greptile-apps

greptile-apps Bot commented May 23, 2026

Greptile Summary

This PR fixes a ValueError: len(where) != 1 crash in ChromaKnowledgeAdapter.search() that occurred when multiple scoped filter conditions (user_id, agent_id, run_id) were all provided. ChromaDB requires exactly one top-level key in a where clause, so multiple conditions must be wrapped in a $and logical operator.

  • factories.py: Adds a branch in search() that checks len(where_filter). A single condition is passed through unchanged; two or more conditions are converted to {"$and": [{k: v}, ...]} format as required by ChromaDB.
  • test_adapters.py: Adds two new unit tests that mock the embedding call and assert the correct where-filter structure is produced for both the single-condition and multi-condition cases.
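
As a rough illustration of the kind of assertion such tests make, the sketch below checks the where argument passed to a mocked collection. Everything here is a stand-in: the search function is a simplified, hypothetical substitute for ChromaKnowledgeAdapter.search, and it mocks the collection rather than the embedding call that the PR's actual tests monkeypatch.

```python
from typing import Any, Dict, List, Optional
from unittest.mock import MagicMock


def search(collection: Any, query_embedding: List[float],
           user_id: Optional[str] = None, agent_id: Optional[str] = None,
           run_id: Optional[str] = None) -> Any:
    """Hypothetical stand-in for the adapter's search(); not the real method."""
    scoped = {"user_id": user_id, "agent_id": agent_id, "run_id": run_id}
    where_filter: Dict[str, Any] = {k: v for k, v in scoped.items() if v is not None}
    query_kwargs: Dict[str, Any] = {"query_embeddings": [query_embedding], "n_results": 5}
    if len(where_filter) == 1:
        query_kwargs["where"] = where_filter  # single condition: pass through
    elif len(where_filter) > 1:
        query_kwargs["where"] = {"$and": [{k: v} for k, v in where_filter.items()]}
    return collection.query(**query_kwargs)


def test_single_condition_passes_through() -> None:
    collection = MagicMock()
    search(collection, [0.1, 0.2], user_id="u1")
    assert collection.query.call_args.kwargs["where"] == {"user_id": "u1"}


def test_multiple_conditions_use_and() -> None:
    collection = MagicMock()
    search(collection, [0.1, 0.2], user_id="u1", agent_id="a1")
    assert collection.query.call_args.kwargs["where"] == {
        "$and": [{"user_id": "u1"}, {"agent_id": "a1"}]
    }
```

Asserting on the captured keyword arguments keeps the test independent of a running ChromaDB instance.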

Confidence Score: 5/5

Safe to merge: the change is a minimal, targeted fix for a well-understood ChromaDB validation constraint.

The fix correctly uses ChromaDB's documented $and logical operator to combine multiple metadata filter conditions. ChromaDB accepts simple key-value shorthand inside $and arrays (treated as implicit $eq), so the produced filter structure is valid. The single-condition path is unchanged, preserving backward compatibility. Two focused unit tests verify both branches.
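
To spell out what "implicit $eq" means here, the sketch below rewrites the shorthand form into the explicit operator form. It assumes, as the review states, that ChromaDB treats a plain key-value pair as an equality test; the helper name to_explicit_eq is hypothetical and exists only for illustration.

```python
from typing import Any, Dict


def to_explicit_eq(condition: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: rewrite key-value shorthand as explicit "$eq" expressions."""
    expanded: Dict[str, Any] = {}
    for key, value in condition.items():
        if key.startswith("$"):
            # Logical operator such as "$and": recurse into each sub-condition.
            expanded[key] = [to_explicit_eq(item) for item in value]
        elif isinstance(value, dict):
            # Already an operator expression, so leave it untouched.
            expanded[key] = value
        else:
            # Plain value: make the equality comparison explicit.
            expanded[key] = {"$eq": value}
    return expanded


shorthand = {"$and": [{"user_id": "u1"}, {"agent_id": "a1"}]}
print(to_explicit_eq(shorthand))
```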

No files require special attention.

Important Files Changed

| Filename | Overview |
|---|---|
| src/praisonai-agents/praisonaiagents/knowledge/adapters/factories.py | Adds $and wrapping for multiple where-filter conditions; logic is correct per ChromaDB docs (simple key-value shorthand is valid inside $and arrays) |
| src/praisonai-agents/tests/unit/knowledge/test_adapters.py | Adds two unit tests covering single and multi-condition where-filter construction; monkeypatching approach is correct for the inline import pattern used in search() |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[search called with user_id / agent_id / run_id] --> B[Build where_filter dict]
    B --> C[Merge with extra filters]
    C --> D{len where_filter}
    D -- "== 0" --> E[No where clause]
    D -- "== 1" --> F["query_kwargs['where'] = where_filter as-is"]
    D -- "> 1" --> G["and_conditions = [{k:v} for k,v in where_filter.items()]"]
    G --> H["query_kwargs['where'] = {$and: and_conditions}"]
    E --> I[collection.query]
    F --> I
    H --> I
    I --> J[Return SearchResult]

Reviews (2): Last reviewed commit: "fix: add logging to ChromaDB search exce..."

Contributor

Copilot AI commented May 23, 2026

@copilot Do a thorough review of this PR. Read ALL existing reviewer comments above from Qodo, Coderabbit, and Gemini first; incorporate their findings.

Review areas:...

Completed a full review and incorporated all available reviewer outputs. Actionable gap found was test coverage, so I added focused unit tests in commit 38900f7 to validate both single-filter pass-through and multi-filter $and formatting in ChromaKnowledgeAdapter.search.

Review outcome: changes remain minimal and focused, no secrets/unsafe eval paths introduced, no hot-path import/perf regression, backward compatibility is preserved for single-condition filters, and code quality is consistent. Reviewer status incorporated: Qodo had no technical findings (reviews paused), CodeRabbit had no actionable issues, and Gemini had no posted code-review comments on this PR run.

@MervinPraison
Owner

@claude You are the FINAL architecture reviewer. If the branch is under MervinPraison/PraisonAI (not a fork), you are able to make modifications to this branch and push directly. SCOPE: Focus ONLY on Python packages (praisonaiagents, praisonai). Do NOT modify praisonai-rust or praisonai-ts. Read ALL comments above from Gemini, Qodo, CodeRabbit, and Copilot carefully before responding.

Phase 1: Review per AGENTS.md

  1. Protocol-driven: check heavy implementations vs core SDK
  2. Backward compatible: ensure zero feature regressions
  3. Performance: no hot-path regressions

Phase 2: FIX Valid Issues
4. For any VALID bugs or architectural flaws found by Gemini, CodeRabbit, Qodo, Copilot, or any other reviewer: implement the fix
5. Push all code fixes directly to THIS branch (do NOT create a new PR)
6. Comment a summary of exact files modified and what you skipped

Phase 3: Final Verdict
7. If all issues are resolved, approve the PR / close the Issue
8. If blocking issues remain, request changes / leave clear action items

@praisonai-triage-agent
Contributor Author

praisonai-triage-agent Bot commented May 23, 2026

Claude finished @MervinPraison's task


Final Architecture Review ✅

Phase 1: Review per AGENTS.md

  • Read AGENTS.md architecture guidelines
  • Check protocol-driven design (heavy implementations vs core SDK)
  • Verify backward compatibility (zero feature regressions)
  • Validate performance (no hot-path regressions)

Phase 2: Address Reviewer Feedback

  • Analyze Greptile's silent exception handling concern
  • Address silent exception handling (added logging)
  • Review CodeRabbit findings
  • Check Copilot's test additions

Phase 3: Final Verdict

  • All issues resolved - APPROVED

Architecture Review Summary

✅ Architectural Compliance

Protocol-Driven Design: ✅ PASSES

  • Changes are properly isolated in knowledge adapters
  • Factory pattern correctly implemented
  • Heavy ChromaDB implementation in adapter, not core

Backward Compatibility: ✅ PASSES

  • API surface unchanged (user_id, agent_id, run_id parameters preserved)
  • Single-condition queries work exactly as before
  • Only internal ChromaDB query formatting improved

Performance: ✅ PASSES

  • No import-time regressions (factory pattern used)
  • No hot-path impact (conditional logic only when multiple filters)
  • Efficient: avoids $and wrapper for single conditions

✅ Code Quality Review

Core Fix: The ChromaDB query bug fix is architecturally sound:

# Before: {'user_id': 'u1', 'agent_id': 'a1'} - ChromaDB rejects this
# After: {'$and': [{'user_id': 'u1'}, {'agent_id': 'a1'}]} - ChromaDB accepts
if len(where_filter) == 1:
    query_kwargs["where"] = where_filter  # Single condition - pass through
else:
    and_conditions = [{k: v} for k, v in where_filter.items()]
    query_kwargs["where"] = {"$and": and_conditions}  # Multi - wrap with $and

Test Coverage: ✅ Adequate (lines 181-230 in test_adapters.py)

  • Tests both single and multiple condition scenarios
  • Validates correct $and wrapping behavior

🔧 Issue Resolution

Greptile's Silent Exception Concern: ✅ ADDRESSED

  • Added warning logging to ChromaDB search exception handler (commit c6fd980)
  • Pre-existing pattern now provides debugging visibility
  • Maintains error isolation while improving observability
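
The pattern described above, logging a warning while still isolating the failure from the caller, might look roughly like the following. This is a minimal sketch under stated assumptions: the wrapper name safe_query, the log message, and the empty-list return value are all illustrative and are not copied from commit c6fd980.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def safe_query(collection: Any, **query_kwargs: Any) -> Any:
    """Hypothetical wrapper: run collection.query, log a warning on failure."""
    try:
        return collection.query(**query_kwargs)
    except Exception as exc:  # deliberately broad to keep errors isolated from callers
        # Previously a failure like this would have been swallowed without a trace.
        logger.warning("ChromaDB search failed: %s", exc)
        return []  # assumed "empty results" value for this sketch
```

The caller still receives an empty result on failure, but the warning now records why the query did not succeed.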

Other Reviewer Feedback: ✅ NO ACTIONABLE ISSUES

  • CodeRabbit: No architectural concerns raised
  • Copilot: Added comprehensive tests (commit 38900f7)
  • Qodo: Reviews paused, no technical feedback

📋 Files Modified

  1. factories.py: Core fix + logging improvement

    • Added conditional $and wrapping for ChromaDB queries
    • Enhanced exception logging for better debugging
  2. test_adapters.py: Test coverage

    • Added verification for both single and multi-condition scenarios
    • Validates ChromaDB query format correctness

Final Verdict: ✅ APPROVED

This PR successfully fixes the ChromaDB filtering bug while maintaining architectural principles. The solution is minimal, focused, and follows the protocol-driven design patterns established in AGENTS.md.

Ready to merge - all architectural requirements satisfied and reviewer concerns addressed.


Job Run | Branch: claude/issue-1728-20260523-1153

Improves debugging for ChromaDB query failures by logging warnings
instead of silently returning empty results.

Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

[BUG] RAG filtering error in types.py: len(where) != 1 fails when where has 2 conditions

2 participants