🧠 BubbleBrain

AI-powered chatbot backend for e-commerce — RAG pipeline, price comparison, lead generation, and Flowise widget integration in a single production-ready FastAPI service.

📌 Why BubbleBrain Exists

E-commerce stores lose customers when support is slow or absent. BubbleBrain automates first-line support:

Answers product questions instantly using your store's own data (RAG, so answers are grounded in retrieved content)
Compares prices between your store and suppliers in real time
Captures leads and routes hot prospects directly to Telegram
Embeds into any frontend via Flowise Chat Widget — no custom UI required
Syncs with WooCommerce via webhooks to stay up-to-date on orders and inventory

🚀 Features

RAG Engine — retrieves accurate answers from your product catalog using OpenAI Embeddings + Pinecone vector search
Price Comparator — scrapes supplier sites and compares against WooCommerce prices on demand
Lead Pipeline — classifies intent, captures contact info, and routes hot leads to dedicated Telegram topics
Document Ingestion — uploads and indexes PDF/DOCX files into the vector store via /api/v1/ingest
WooCommerce Webhook — receives real-time order/product events and updates internal state
Telegram Integration — broadcasts lead alerts, price updates, bot stats, and errors across topic-organized groups
API Key Auth — static secret key validation on all /api/v1/* endpoints
Rate Limiting — 20 requests/min per IP via slowapi
Structured Logging — structlog + Sentry error tracking
Prometheus Metrics — built-in /metrics endpoint, Prometheus container included in Compose
Conversation Memory — per-session chat history stored in SQLite via aiosqlite
88% Test Coverage — pytest suite with async support and remote integration tests
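
The per-IP limit listed above (20 requests/min) can be pictured as a sliding-window counter. The sketch below is a stdlib-only illustration of that idea and is not slowapi's implementation; the class name `SlidingWindowLimiter` and its parameters are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Illustrative per-key sliding-window limiter (hypothetical, not slowapi's code)."""

    def __init__(self, max_requests: int = 20, window_seconds: float = 60.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key (e.g. client IP) -> request timestamps

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Return True and record the request if the key is within its limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
# 21 requests from one IP, one second apart: the 21st is rejected.
results = [limiter.allow("203.0.113.7", now=float(i)) for i in range(21)]
```

Passing `now` explicitly keeps the example deterministic; in a real service the clock default would be used and the rejected request would typically receive HTTP 429.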

🛠 Tech Stack

Layer	Technology
Runtime	Python 3.13
Web Framework	FastAPI + Uvicorn
AI / LLM	OpenAI GPT-3.5/4, `text-embedding-3-small`
Vector DB	Pinecone
Chat Widget	Flowise Embed
WooCommerce	REST API + Webhooks
Scheduling	APScheduler
HTTP Client	httpx
Scraping	BeautifulSoup4, requests
Data Validation	Pydantic v2, pydantic-settings
Database	SQLite (aiosqlite) + SQLAlchemy
Monitoring	Prometheus, Sentry SDK
Logging	structlog
Rate Limiting	slowapi
Containerization	Docker, Docker Compose
Dependency Manager	Poetry
Linter / Formatter	Ruff
Type Checker	mypy (strict), pyright
Testing	pytest, pytest-asyncio, pytest-cov
Docs	MkDocs Material

📦 Quick Start

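Prerequisites

Git, Python 3.13, Poetry, and Docker with Docker Compose (the tools used in the steps that follow).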

1. Clone the repository

git clone https://github.com/PyDevDeep/BubbleBrain.git
cd BubbleBrain

2. Configure environment variables

cp .env.example .env

Open .env and fill in the required values:

# Required
OPENAI_API_KEY=sk-...
PINECONE_API_KEY=pc-...
PINECONE_INDEX_NAME=chatbot-index
API_KEY_SECRET=your_static_secret_key

# WooCommerce (if using webhook integration)
WOO_CK=your_consumer_key
WOO_CS=your_consumer_secret
WOO_URL=https://your-shop-domain.com

# Telegram (for lead alerts)
TELEGRAM_CONTACT_URL=https://t.me/your_bot

See .env.example for the full list of available variables.
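
A quick pre-flight check can catch a missing required variable before the service starts. The sketch below is a stdlib-only illustration: the helper names `parse_env` and `missing_required` are hypothetical, the parser handles only plain `KEY=VALUE` lines (no quoting or `export`), and the project itself lists pydantic-settings in its stack for configuration.

```python
REQUIRED_KEYS = (
    "OPENAI_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
    "API_KEY_SECRET",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_required(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Sample content with API_KEY_SECRET deliberately left out.
sample = """
# Required
OPENAI_API_KEY=sk-...
PINECONE_API_KEY=pc-...
PINECONE_INDEX_NAME=chatbot-index
"""
print(missing_required(parse_env(sample)))  # prints ['API_KEY_SECRET']
```

To check a real file, read it with `pathlib.Path(".env").read_text()` and pass the result to `parse_env`.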

3. Start infrastructure

docker-compose up -d

This launches:

bubblebrain-app on port 8200 (maps to internal 8000)
bubblebrain-prometheus on port 9290

4. Install dependencies and start the dev server

poetry install
poetry run uvicorn app.main:app --reload

The API is now available at http://localhost:8000.

Interactive Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

🔌 API Overview

All endpoints under /api/v1/ require Bearer token authentication. The Prometheus endpoint /metrics sits outside that prefix.

Header:

Authorization: Bearer YOUR_API_KEY
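
On the server side, checking this header against a static secret comes down to parsing the scheme and comparing tokens in constant time. The following is a minimal sketch under that assumption, not the project's actual security module; `extract_bearer_token` and `is_authorized` are hypothetical names.

```python
import hmac
from typing import Optional


def extract_bearer_token(header_value: Optional[str]) -> Optional[str]:
    """Return the token from a 'Bearer <token>' header value, or None."""
    if not header_value:
        return None
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def is_authorized(header_value: Optional[str], expected_secret: str) -> bool:
    """Compare the presented token with the static secret in constant time."""
    token = extract_bearer_token(header_value)
    if token is None:
        return False
    return hmac.compare_digest(token.encode("utf-8"), expected_secret.encode("utf-8"))
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading characters of a guessed key were correct.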

Core Endpoints

Method	Endpoint	Description
`POST`	`/api/v1/chat`	Send a message and receive an AI response
`POST`	`/api/v1/ingest`	Upload PDF/DOCX for RAG indexing
`POST`	`/api/v1/leads`	Submit a lead capture form
`POST`	`/api/v1/telegram`	Telegram webhook receiver
`POST`	`/api/v1/woo-webhook`	WooCommerce event receiver
`GET`	`/api/v1/health`	Health check
`GET`	`/metrics`	Prometheus metrics

Example: Chat Request

curl -X POST "http://localhost:8000/api/v1/chat" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -d '{"question": "What is the price of product X?"}'

Response:

{
  "answer": "Product X costs $49.99. Our supplier price is $42.00, giving you a margin of 16%.",
  "sources": ["catalog/product-x.pdf"],
  "session_id": "abc123"
}
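
The 16% figure in the sample answer is consistent with gross margin measured against the store price. The README does not state how the service computes it, so the function below is an assumption offered only to make the arithmetic explicit; `gross_margin_percent` is a hypothetical name.

```python
def gross_margin_percent(store_price: float, supplier_price: float) -> float:
    """Gross margin as a percentage of the store (selling) price."""
    if store_price <= 0:
        raise ValueError("store_price must be positive")
    return (store_price - supplier_price) / store_price * 100


margin = gross_margin_percent(49.99, 42.00)
print(f"{margin:.0f}%")  # prints 16%
```

Measured against the supplier price instead (markup), the same numbers would give roughly 19%, so the two definitions should not be mixed when reading price comparisons.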

Full endpoint reference with schemas and error codes: docs/reference/api.md

After starting the app, also see: http://localhost:8000/docs
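
The same chat request can be issued from Python. This is a stdlib-only sketch that mirrors the curl example; the helper names `build_chat_request` and `ask` are hypothetical, and the response is assumed to be JSON as in the sample above.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/chat."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/chat",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def ask(base_url: str, api_key: str, question: str, timeout: float = 30.0) -> dict:
    """Send the question and return the decoded JSON response."""
    request = build_chat_request(base_url, api_key, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the dev server running, `ask("http://localhost:8000", "YOUR_API_KEY", "What is the price of product X?")["answer"]` would return the answer text. The stack lists httpx as the project's HTTP client; urllib is used here only to keep the example dependency-free.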

🧠 RAG Architecture

BubbleBrain uses the Retrieval-Augmented Generation pattern to ground answers in your store's data and reduce AI hallucinations:

User Question
     │
     ▼
[Embedding Model]  ←── text-embedding-3-small
     │
     ▼
[Pinecone Search]  ←── cosine similarity, top-k retrieval
     │
     ▼
[Context Assembly] ←── retrieved chunks + chat history
     │
     ▼
[OpenAI LLM]       ←── GPT-3.5/4 (set via OPENAI_MODEL)
     │
     ▼
Grounded Answer

Ingestion — documents are chunked, embedded, and stored in Pinecone
Retrieval — query is embedded; nearest vectors are fetched
Generation — LLM generates an answer strictly from retrieved context

See docs/explanation/rag-architecture.md for full details.
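
The retrieval and context-assembly steps can be illustrated with a toy in-memory index. This is a sketch only: hand-written 3-dimensional vectors stand in for `text-embedding-3-small` embeddings, a plain dict stands in for Pinecone, and the names `cosine_similarity`, `top_k`, and `build_prompt` as well as the prompt wording are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[tuple[str, float]]:
    """Return the k chunk ids most similar to the query, best first."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in index.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble retrieved chunks and the question into one prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


# Toy vectors and texts; a real index would hold embedded document chunks.
index = {
    "product-x": [0.9, 0.1, 0.0],
    "shipping": [0.0, 1.0, 0.0],
    "returns": [0.1, 0.2, 0.9],
}
texts = {
    "product-x": "Product X costs $49.99.",
    "shipping": "Orders ship within two business days.",
    "returns": "Returns are accepted within 30 days.",
}

query_vector = [1.0, 0.0, 0.0]  # pretend embedding of the user's question
hits = top_k(query_vector, index, k=2)
prompt = build_prompt("What is the price of product X?", [texts[cid] for cid, _ in hits])
```

Here `hits` ranks `product-x` first and `returns` second, and only those two chunks reach the prompt. Limiting the context to retrieved chunks is what keeps the generated answer tied to the catalog.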

⚙️ Configuration Reference

Variable	Required	Description
`OPENAI_API_KEY`	✅	OpenAI API key
`OPENAI_MODEL`	✅	LLM model (e.g. `gpt-3.5-turbo`)
`EMBEDDING_MODEL`	✅	Embedding model (e.g. `text-embedding-3-small`)
`PINECONE_API_KEY`	✅	Pinecone API key
`PINECONE_INDEX_NAME`	✅	Name of your Pinecone index
`PINECONE_ENVIRONMENT`	✅	e.g. `gcp-starter`
`API_KEY_SECRET`	✅	Static secret for client authentication
`WOO_CK` / `WOO_CS`	⚠️	WooCommerce consumer key/secret
`WOO_URL`	⚠️	WooCommerce store URL
`SUPPLIER_URL`	⚠️	Supplier site URL for price comparison
`SENTRY_DSN`	❌	Sentry error tracking DSN
`PROMETHEUS_EXTERNAL_URL`	❌	External URL for Prometheus
`ALLOWED_ORIGINS`	❌	CORS origins (default: `*`)
`TELEGRAM_CONTACT_URL`	❌	Telegram bot deep link

🧪 Testing

# Run all tests with coverage report
poetry run pytest --cov=app --cov-report=term-missing

# Run only remote integration tests (requires running server)
poetry run pytest -m remote

Current coverage: 88% across 2,553 statements.

Key modules with full coverage: main.py, health, security, metrics, woo_service, telegram_service, statistics_service.

📊 Monitoring

BubbleBrain exposes Prometheus metrics at /metrics and includes a pre-configured Prometheus container.

Service	Port	URL
BubbleBrain API	8200	`http://localhost:8200`
Prometheus	9290	`http://localhost:9290`
Swagger UI	8200	`http://localhost:8200/docs`

Sentry integration is enabled when SENTRY_DSN is set in .env.
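
For a quick look at /metrics without opening Prometheus, the text exposition format can be parsed with a few lines of Python. The sketch below assumes samples carry no trailing timestamp; `parse_metrics` is a hypothetical helper, and the metric name `http_requests_total` is an illustrative placeholder rather than a name confirmed for this service.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Map each sample's 'name{labels}' string to its numeric value.

    Assumes no trailing timestamps, so the value is always the last token.
    """
    samples: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        name, value = line.rsplit(None, 1)
        samples[name] = float(value)
    return samples


# Hypothetical scrape output; real metric names depend on the app's instrumentation.
sample = '''\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="POST",path="/api/v1/chat"} 42
http_requests_total{method="GET",path="/api/v1/health"} 7
'''
metrics = parse_metrics(sample)
```

Against a running container, the input text would come from fetching `http://localhost:8200/metrics`.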

📁 Project Structure

BubbleBrain/
├── app/
│   ├── api/v1/endpoints/     # chat, ingest, leads, telegram, woo_webhook
│   ├── core/                 # config, db, security, logging, metrics
│   ├── middleware/           # rate limiter, request logging
│   ├── models/               # SQLAlchemy models
│   ├── schemas/              # Pydantic schemas
│   ├── services/             # RAG engine, OpenAI, vector, scraper, price comparator...
│   └── utils/                # helpers, prompts, URL utils
├── tests/
├── docs/                     # MkDocs documentation
├── prometheus/
├── docker-compose.yml
├── pyproject.toml
└── .env.example

📚 Documentation

Full documentation is available via MkDocs:

poetry run mkdocs serve

Then open http://localhost:8001.

Section	Description
Getting Started	Run the stack locally in 10 minutes
Configure Pinecone	Set up vector index for RAG
API Reference	Endpoint schemas and auth details
RAG Architecture	How the retrieval pipeline works

🤝 Contributing

Fork the repository
Create a feature branch: git checkout -b feat/your-feature
Commit using Conventional Commits: git commit -m "feat: add X"
Push and open a Pull Request

Code quality is enforced via pre-commit hooks (Ruff, mypy, pyright):

pre-commit install
