🧠 BubbleBrain

AI-powered chatbot backend for e-commerce: RAG pipeline, price comparison, lead generation, and Flowise widget integration in a single production-ready FastAPI service.


📌 Why BubbleBrain Exists

Modern e-commerce stores lose customers to slow or absent support. BubbleBrain automates first-line support:

  • Answers product questions instantly using your store's own data (RAG grounds each answer in retrieved catalog content, which limits hallucinations)
  • Compares prices between your store and suppliers in real-time
  • Captures leads and routes hot prospects directly to Telegram
  • Embeds into any frontend via Flowise Chat Widget, with no custom UI required
  • Syncs with WooCommerce via webhooks to stay up-to-date on orders and inventory

🚀 Features

  • RAG Engine: retrieves accurate answers from your product catalog using OpenAI Embeddings + Pinecone vector search
  • Price Comparator: scrapes supplier sites and compares against WooCommerce prices on demand
  • Lead Pipeline: classifies intent, captures contact info, and routes hot leads to dedicated Telegram topics
  • Document Ingestion: uploads and indexes PDF/DOCX files into the vector store via /api/v1/ingest
  • WooCommerce Webhook: receives real-time order/product events and updates internal state
  • Telegram Integration: broadcasts lead alerts, price updates, bot stats, and errors across topic-organized groups
  • API Key Auth: static secret key validation on all /api/v1/* endpoints
  • Rate Limiting: 20 requests/min per IP via slowapi
  • Structured Logging: structlog + Sentry error tracking
  • Prometheus Metrics: built-in /metrics endpoint, Prometheus container included in Compose
  • Conversation Memory: per-session chat history stored in SQLite via aiosqlite
  • 88% Test Coverage: pytest suite with async support and remote integration tests
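
To make the "20 requests/min per IP" rule concrete, here is a minimal sliding-window limiter written with the standard library only. It is a sketch of the general technique, not the slowapi configuration the service uses; the class name `SlidingWindowLimiter` and its parameters are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(self, limit: int = 20, window: float = 60.0) -> None:
        self.limit = limit
        self.window = window
        # One queue of request timestamps per client IP.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        # `now` is injectable so the behaviour can be tested without sleeping.
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the caller would typically answer with HTTP 429
        hits.append(now)
        return True
```

Each IP gets its own queue, so one noisy client does not consume another client's budget.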

🛠 Tech Stack

| Layer | Technology |
|---|---|
| Runtime | Python 3.13 |
| Web Framework | FastAPI + Uvicorn |
| AI / LLM | OpenAI GPT-3.5/4, text-embedding-3-small |
| Vector DB | Pinecone |
| Chat Widget | Flowise Embed |
| WooCommerce | REST API + Webhooks |
| Scheduling | APScheduler |
| HTTP Client | httpx |
| Scraping | BeautifulSoup4, requests |
| Data Validation | Pydantic v2, pydantic-settings |
| Database | SQLite (aiosqlite) + SQLAlchemy |
| Monitoring | Prometheus, Sentry SDK |
| Logging | structlog |
| Rate Limiting | slowapi |
| Containerization | Docker, Docker Compose |
| Dependency Manager | Poetry |
| Linter / Formatter | Ruff |
| Type Checker | mypy (strict), pyright |
| Testing | pytest, pytest-asyncio, pytest-cov |
| Docs | MkDocs Material |

📦 Quick Start

Prerequisites

  • Python 3.13+
  • Poetry
  • Docker + Docker Compose
  • OpenAI API key
  • Pinecone API key (free tier works)

1. Clone the repository

git clone https://github.com/PyDevDeep/BubbleBrain.git
cd BubbleBrain

2. Configure environment variables

cp .env.example .env

Open .env and fill in the required values:

# Required
OPENAI_API_KEY=sk-...
PINECONE_API_KEY=pc-...
PINECONE_INDEX_NAME=chatbot-index
API_KEY_SECRET=your_static_secret_key

# WooCommerce (if using webhook integration)
WOO_CK=your_consumer_key
WOO_CS=your_consumer_secret
WOO_URL=https://your-shop-domain.com

# Telegram (for lead alerts)
TELEGRAM_CONTACT_URL=https://t.me/your_bot

See .env.example for the full list of available variables.

3. Start infrastructure

docker-compose up -d

This launches:

  • bubblebrain-app on port 8200 (maps to internal 8000)
  • bubblebrain-prometheus on port 9290

4. Install dependencies and start the dev server

poetry install
poetry run uvicorn app.main:app --reload

The dev server is now available at http://localhost:8000 (the Docker container from step 3 is exposed separately on port 8200).

  • Interactive Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc


🔌 API Overview

All /api/v1/* endpoints require Bearer token authentication; /metrics is served at the root path, outside that prefix.

Header:

Authorization: Bearer YOUR_API_KEY
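
As an illustration of static-key validation, the check below parses the header and compares the token against the configured secret in constant time. This is a sketch under assumptions: the function name `is_authorized` is hypothetical, and the README does not show how the service's own security module is written.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_secret: str) -> bool:
    """Return True when the header is `Bearer <token>` and the token equals the secret."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking the mismatch position through timing differences.
    return hmac.compare_digest(token.strip().encode(), expected_secret.encode())
```

Here `expected_secret` would be the value of API_KEY_SECRET from .env.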

Core Endpoints

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/chat | Send a message and receive an AI response |
| POST | /api/v1/ingest | Upload PDF/DOCX for RAG indexing |
| POST | /api/v1/leads | Submit a lead capture form |
| POST | /api/v1/telegram | Telegram webhook receiver |
| POST | /api/v1/woo-webhook | WooCommerce event receiver |
| GET | /api/v1/health | Health check |
| GET | /metrics | Prometheus metrics |
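
WooCommerce signs each webhook delivery with a base64-encoded HMAC-SHA256 of the raw request body and sends it in the X-WC-Webhook-Signature header. The README does not say how /api/v1/woo-webhook validates deliveries, so the sketch below only illustrates the general check; the function names and the `webhook_secret` parameter are hypothetical.

```python
import base64
import hashlib
import hmac


def sign_payload(raw_body: bytes, webhook_secret: str) -> str:
    """Compute the signature WooCommerce would attach to this body."""
    digest = hmac.new(webhook_secret.encode(), raw_body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode()


def is_valid_woo_signature(raw_body: bytes, signature_header: str, webhook_secret: str) -> bool:
    """Compare the received header against a locally computed signature."""
    expected = sign_payload(raw_body, webhook_secret)
    # Compare as bytes in constant time; the raw body must be used, not re-serialized JSON.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```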

Example: Chat Request

curl -X POST "http://localhost:8000/api/v1/chat" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -d '{"question": "What is the price of product X?"}'

Response:

{
  "answer": "Product X costs $49.99. Our supplier price is $42.00, giving you a margin of 16%.",
  "sources": ["catalog/product-x.pdf"],
  "session_id": "abc123"
}
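
The same request can be built from Python. This sketch uses only the standard library so it runs without extra dependencies (the project itself lists httpx as its HTTP client); the helper names `build_chat_request` and `ask` are hypothetical, and the payload shape is taken from the curl example above.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, question: str) -> urllib.request.Request:
    """Build the POST /api/v1/chat request without sending it."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/chat",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def ask(base_url: str, api_key: str, question: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body (requires a running server)."""
    request = build_chat_request(base_url, api_key, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the dev server running, `ask("http://localhost:8000", "YOUR_API_KEY", "What is the price of product X?")` should return a dictionary with the `answer`, `sources`, and `session_id` keys shown in the response example.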

Full endpoint reference with schemas and error codes: docs/reference/api.md. After starting the app, the interactive reference is also available at http://localhost:8000/docs.


🧠 RAG Architecture

BubbleBrain uses the Retrieval-Augmented Generation pattern to ground answers in your own data and reduce AI hallucinations:

User Question
     │
     ▼
[Embedding Model]  ←── text-embedding-3-small
     │
     ▼
[Pinecone Search]  ←── cosine similarity, top-k retrieval
     │
     ▼
[Context Assembly] ←── retrieved chunks + chat history
     │
     ▼
[OpenAI LLM]       ←── GPT-4
     │
     ▼
Grounded Answer

  1. Ingestion: documents are chunked, embedded, and stored in Pinecone
  2. Retrieval: the query is embedded and the nearest vectors are fetched
  3. Generation: the LLM generates an answer strictly from retrieved context
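
The retrieval and context-assembly steps can be sketched in a few lines of plain Python. This is a toy illustration under assumptions: an in-memory list stands in for the Pinecone index, hand-written two-dimensional vectors stand in for real embeddings, and the names `retrieve` and `build_prompt` are hypothetical rather than taken from the codebase.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Vector, index: List[Tuple[str, Vector]], top_k: int = 3) -> List[str]:
    """Return the texts of the top_k chunks most similar to the query vector."""
    ranked = sorted(index, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question: str, chunks: List[str], history: List[str]) -> str:
    """Assemble retrieved chunks and chat history into a single grounded prompt."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    past = "\n".join(history)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"History:\n{past}\n"
        f"Question: {question}"
    )
```

In the real pipeline the query vector would come from the embedding model and the ranking from Pinecone; only the prompt assembly happens in application code.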

See docs/explanation/rag-architecture.md for full details.


⚙️ Configuration Reference

| Variable | Required | Description |
|---|---|---|
| OPENAI_API_KEY | ✅ | OpenAI API key |
| OPENAI_MODEL | ✅ | LLM model (e.g. gpt-3.5-turbo) |
| EMBEDDING_MODEL | ✅ | Embedding model (e.g. text-embedding-3-small) |
| PINECONE_API_KEY | ✅ | Pinecone API key |
| PINECONE_INDEX_NAME | ✅ | Name of your Pinecone index |
| PINECONE_ENVIRONMENT | ✅ | e.g. gcp-starter |
| API_KEY_SECRET | ✅ | Static secret for client authentication |
| WOO_CK / WOO_CS | ⚠️ | WooCommerce consumer key/secret |
| WOO_URL | ⚠️ | WooCommerce store URL |
| SUPPLIER_URL | ⚠️ | Supplier site URL for price comparison |
| SENTRY_DSN | ❌ | Sentry error tracking DSN |
| PROMETHEUS_EXTERNAL_URL | ❌ | External URL for Prometheus |
| ALLOWED_ORIGINS | ❌ | CORS origins (default: *) |
| TELEGRAM_CONTACT_URL | ❌ | Telegram bot deep link |
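
A quick pre-flight check can catch missing settings before the server starts. The sketch below treats the variables marked required in the table above as mandatory and reports any that are unset or blank. It uses only the standard library as a stand-in (the project lists pydantic-settings for validation), and the names `REQUIRED` and `missing_required` are hypothetical.

```python
from typing import List, Mapping

# Variables marked as required in the configuration table.
REQUIRED = (
    "OPENAI_API_KEY",
    "OPENAI_MODEL",
    "EMBEDDING_MODEL",
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
    "PINECONE_ENVIRONMENT",
    "API_KEY_SECRET",
)


def missing_required(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are absent or blank in `env`."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]
```

Calling `missing_required(os.environ)` after loading .env returns an empty list when every required value is present.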

🧪 Testing

# Run all tests with coverage report
poetry run pytest --cov=app --cov-report=term-missing

# Run only remote integration tests (requires running server)
poetry run pytest -m remote

Current coverage: 88% across 2,553 statements.

Key modules with full coverage: main.py, health, security, metrics, woo_service, telegram_service, statistics_service.


📊 Monitoring

BubbleBrain exposes Prometheus metrics at /metrics and includes a pre-configured Prometheus container.

| Service | Port | URL |
|---|---|---|
| BubbleBrain API | 8200 | http://localhost:8200 |
| Prometheus | 9290 | http://localhost:9290 |
| Swagger UI | 8200 | http://localhost:8200/docs |

Sentry integration is enabled when SENTRY_DSN is set in .env.
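
For a quick look at what /metrics returns without opening Prometheus, the text exposition format can be read with a small parser. This is a simplified sketch: it assumes each sample line has no trailing timestamp, and the metric names in the usage note are hypothetical, since the README does not list the metrics the service exports.

```python
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each series (metric name plus labels) to its sample value."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Split from the right: label values may contain spaces, the value does not.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples
```

Feeding it a line such as `http_requests_total{path="/api/v1/chat"} 12.0` yields one entry keyed by the full series string, including its labels.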


📁 Project Structure

BubbleBrain/
├── app/
│   ├── api/v1/endpoints/     # chat, ingest, leads, telegram, woo_webhook
│   ├── core/                 # config, db, security, logging, metrics
│   ├── middleware/           # rate limiter, request logging
│   ├── models/               # SQLAlchemy models
│   ├── schemas/              # Pydantic schemas
│   ├── services/             # RAG engine, OpenAI, vector, scraper, price comparator...
│   └── utils/                # helpers, prompts, URL utils
├── tests/
├── docs/                     # MkDocs documentation
├── prometheus/
├── docker-compose.yml
├── pyproject.toml
└── .env.example

📚 Documentation

Full documentation is available via MkDocs:

poetry run mkdocs serve

Then open http://localhost:8001.

| Section | Description |
|---|---|
| Getting Started | Run the stack locally in 10 minutes |
| Configure Pinecone | Set up vector index for RAG |
| API Reference | Endpoint schemas and auth details |
| RAG Architecture | How the retrieval pipeline works |

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feat/your-feature
  3. Commit using Conventional Commits: git commit -m "feat: add X"
  4. Push and open a Pull Request

Code quality is enforced via pre-commit hooks (Ruff, mypy, pyright):

pre-commit install

📄 License

MIT © PyDevDeep
