AI-powered, self-hosted document organizer with OCR, receipt scanning, and analytics. Upload a scan from your desktop or phone and — a few seconds later — it is renamed, dated, filed into the right category, and browsable in a clean, mobile-friendly interface. Receipts are auto-detected and broken down into structured line items so you can see where your money actually goes.
Built for a Synology NAS in Docker, but runs anywhere Docker (or just plain Python) runs. Pick your AI: Anthropic Claude, OpenAI GPT, Google Gemini, or run a local model via Ollama — your documents never have to leave your network.
- Web UI on a port you pick (default 8080, configurable in the wizard and in `/settings`) — dashboard, library browser with full-text search, per-document detail + PDF preview, mobile upload with camera capture
- First-run setup wizard at `/setup` — pick language, AI provider, paste token, optionally configure backup. Always-reachable settings page at `/settings`
- Multi-provider AI: Anthropic Claude · OpenAI GPT · Google Gemini · any OpenAI-compatible endpoint (Ollama, Groq, xAI, Mistral, OpenRouter …) · or the Local AI Bridge (run inference on a Mac / Linux / Windows box of your choice — see below)
- Subcategories + free-form tags — files land at `Library/YYYY/Category/Subcategory/` and carry up to 8 lowercase labels
- Receipt scanner + analytics — receipts (Kassenzettel) are recognised automatically. A second-pass LLM extracts shop name + type, payment method, total, and per-line items with prices and item categories. Browse aggregated spend per month, by shop type and item category, search line items, and see your most-bought items at `/analytics`.
- OCR for scanned PDFs and images (Tesseract `deu+eng`)
- Backup to a local folder (rsync, no setup) or to the cloud via rclone (Drive · Dropbox · OneDrive · S3 · WebDAV · SFTP). Headless-friendly: no browser needed on the host.
- Cost tracking per document + aggregated (tokens in/out, USD and EUR preview), with prompt-caching factored in for Anthropic and OpenAI
- Low-confidence review folder instead of wrong guesses, full metadata editing on the document detail page
- Trash + restore + permanent purge, ZIP export of any filtered selection
- Safety copy of every original kept in `_Processed/`
- i18n: German · English · French · Italian · Spanish
Every filed document follows the same pattern:
```
YYYY-MM-DD_Category_Sender_Subject.pdf
```
Examples:
```
2026-02-14_Rechnungen_Vodafone_Mobilfunk-Februar.pdf
2026-01-03_Gesundheit_Hausarzt-Dr-Mueller_Blutbild.pdf
2026-03-20_Steuer_Finanzamt-Dresden_Bescheid-2024.pdf
```
The template is configurable in config/config.yaml.
```
/data/
├── inbox/            ← drop scans here
└── library/
    ├── 2026/
    │   ├── Rechnungen/
    │   ├── Vertraege/
    │   ├── Behoerde/
    │   ├── Gesundheit/
    │   ├── Gehalt/
    │   ├── Steuer/
    │   ├── Haus/
    │   ├── Versicherung/
    │   ├── Bank/
    │   └── Sonstiges/
    ├── _Review/      ← uncertain docs land here for manual sorting
    └── _Processed/   ← copy of every original file
```
- Docker and docker-compose (Synology: install "Container Manager" from Package Center, DSM 7.2+) — or run directly on Linux/macOS via the launchers below
- A folder on your NAS where scans arrive (e.g. `/volume1/Scan`)
- A folder that will become your library (e.g. `/volume1/Dokumente`)
- An API key for one of:
  - Anthropic Claude (`ANTHROPIC_API_KEY`, recommended — cheapest with prompt caching)
  - OpenAI GPT (`OPENAI_API_KEY`)
  - Google Gemini (`GEMINI_API_KEY`)
  - …or no key at all if you run a local model via Ollama. Hardware suggestion: 8 GB RAM for an 8B model, 16 GB for a 13B, GPU recommended.
- Copy the project to your NAS, e.g. to `/volume1/docker/docusort/`. Via File Station, SFTP, or:

  ```bash
  scp -r docusort admin@synology:/volume1/docker/
  ```

- Adjust `docker-compose.yml` if your paths differ (a complete compose sketch follows these steps). Defaults:

  ```yaml
  volumes:
    - /volume1/Scan:/data/inbox
    - /volume1/Dokumente:/data/library
    - /volume1/docker/docusort/config:/app/config
    - /volume1/docker/docusort/logs:/app/logs
  ```

- Build and start:

  ```bash
  sudo docker compose up -d --build
  ```

- Check the logs:

  ```bash
  sudo docker logs -f docusort
  ```

- Open the UI at `http://<nas-ip>:<port>` (default `8080`; you can change it during setup or later in `/settings`). On first start the setup wizard at `/setup` walks you through language, AI provider + token, an optional port + host, and an optional backup target. The wizard writes `config/secrets.yaml` (mode 0600, gitignored) and updates `config/config.yaml`. After the final step the service restarts itself and lands you on the dashboard. Dropping a PDF into `/volume1/Scan` then works — it appears correctly named under `/volume1/Dokumente/2026/…/`. You can revisit any of those choices later under Einstellungen (the cog in the header) — provider, model, API keys, paths, sync target.
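For reference, a complete `docker-compose.yml` along these lines is a reasonable starting point. The build context, restart policy, and `env_file` line are assumptions; the volume paths, container name, and default port come from the steps above:

```yaml
services:
  docusort:
    build: .                    # assumption: Dockerfile at the project root
    container_name: docusort
    restart: unless-stopped     # assumption
    ports:
      - "8080:8080"             # default web port; adjust if you picked another
    env_file: .env              # optional, for the legacy API-key setup described below
    volumes:
      - /volume1/Scan:/data/inbox
      - /volume1/Dokumente:/data/library
      - /volume1/docker/docusort/config:/app/config
      - /volume1/docker/docusort/logs:/app/logs
```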
Legacy env-var setup also still works — if you set
ANTHROPIC_API_KEY(orOPENAI_API_KEY/GEMINI_API_KEY) in your.env, DocuSort picks it up and skips the wizard's token step.
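On the Docker route that can be as small as the sketch below; it assumes the project's compose file actually forwards the variable into the container (for example via an `env_file:` or `environment:` entry, as in the compose sketch above):

```bash
# in /volume1/docker/docusort/ (next to docker-compose.yml)
echo 'ANTHROPIC_API_KEY=sk-ant-...' >> .env
sudo docker compose up -d    # recreate the container so it picks up the new variable
```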
Three launcher scripts live in the project root — pick the one that matches your OS:
- macOS: double-click `start.command` (or `./start.sh` from a Terminal)
- Linux: `./start.sh`
- Windows: double-click `start.bat`
Each launcher creates a .venv on first run, keeps Python deps in sync,
warns if tesseract / ocrmypdf are missing, and then boots the app on the
configured port (default http://localhost:8080). Open the URL — the
setup wizard at /setup collects everything else, including the
port if you want a different one.
If you'd rather pre-seed the API key as an env var instead of typing it
in the wizard, drop a .env next to the launcher with one of:
```
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIza...
```
OCR needs system-level Tesseract and ocrmypdf installed
(brew install tesseract tesseract-lang ocrmypdf on macOS,
sudo apt install tesseract-ocr tesseract-ocr-deu ocrmypdf on Debian/Ubuntu).
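A quick sanity check that the toolchain is actually on the PATH before the first run:

```bash
tesseract --version       # Tesseract itself
tesseract --list-langs    # should list deu and eng
ocrmypdf --version
```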
Browsers only run service workers in a secure context — over plain HTTP, uploads work but run in the foreground (keep the tab open). Flipping to HTTPS buys you true background uploads that survive a tab close.
On a Tailscale-attached host, one script does everything:
```bash
./scripts/setup-tailscale-https.sh
```
It grabs a Let's Encrypt cert via `tailscale cert`, installs a weekly systemd timer that renews it, and updates `config/config.yaml` with the cert paths. After `sudo systemctl restart docusort` the UI lives at `https://<host>.<tailnet>.ts.net:9876`.
To do it by hand, set these under `web:` in `config/config.yaml`:

```yaml
web:
  ssl_cert: "/etc/docusort/certs/yourhost.ts.net.crt"
  ssl_key: "/etc/docusort/certs/yourhost.ts.net.key"
```

Any PEM cert/key pair works (Caddy, certbot, self-signed). Uvicorn picks them up on next start and serves TLS on the configured port.
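For a quick self-signed pair to test the HTTPS path (browsers will warn about the certificate, but TLS itself works), something like this openssl one-liner is enough; the paths and hostname below just mirror the example above:

```bash
sudo mkdir -p /etc/docusort/certs
sudo openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=yourhost.ts.net" \
  -keyout /etc/docusort/certs/yourhost.ts.net.key \
  -out /etc/docusort/certs/yourhost.ts.net.crt
```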
DocuSort ships with a built-in updater that pulls the newest release straight from GitHub:
- Web UI: a banner appears on every page when a newer version is available — one click installs it.
- CLI: `python -m docusort --check-update` and `python -m docusort --update`.
On systemd hosts, enable the one-click restart by installing the scoped sudoers rule once:
```bash
./scripts/install-sudoers-rule.sh
```
The rule grants NOPASSWD only for `systemctl restart docusort`.
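If you would rather write the rule by hand, it boils down to one scoped line like the sketch below. The account name and the systemctl path are assumptions; use the user your DocuSort process runs as and check `which systemctl`:

```bash
# create with: sudo visudo -f /etc/sudoers.d/docusort-restart
# "docusort" here is a placeholder for the service user
docusort ALL=(root) NOPASSWD: /usr/bin/systemctl restart docusort
```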
DocuSort can do every classification and bank-statement extraction locally — no token, no cloud round-trip, no per-document cost. Three ways to set it up depending on where you want the model to actually run.
The simplest path. If you start DocuSort directly with
./start.command / start.sh / start.bat, install Ollama on the
same machine and DocuSort can talk to it directly:
```bash
# macOS
brew install ollama && brew services start ollama
ollama pull qwen2.5:7b-instruct

# Linux
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:7b-instruct

# Windows
winget install Ollama.Ollama
ollama pull qwen2.5:7b-instruct
```
Then open `/settings` in DocuSort. The page detects the local Ollama and shows a "Local Ollama on this machine" card with a model dropdown — pick a model, click "Use this on this machine", restart, done. Provider is set to `openai_compat`, base URL is `http://127.0.0.1:11434/v1`.
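Under the hood this maps onto the `ai.*` settings documented in the configuration table further down; assuming the dotted names nest one-to-one in `config/config.yaml`, the result looks roughly like:

```yaml
ai:
  provider: openai_compat
  base_url: http://127.0.0.1:11434/v1
  model: qwen2.5:7b-instruct   # whichever model you pulled with `ollama pull`
```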
Use the Local AI Bridge when DocuSort itself runs somewhere that cannot host a 7B+ model (Synology DS218 with 2 GB RAM, a tiny VM, a Raspberry Pi). The bridge is a Python script that runs on the machine you do want to do inference on (Mac / Linux box / Windows desktop) and connects outbound to DocuSort over a WebSocket. No port forwarding, no firewall changes — anything that can open the DocuSort URL in a browser can run the bridge.
- In DocuSort, switch the AI provider to Local AI Bridge in `/settings`.
- Scroll to the Local AI Bridge card and download the launcher for your OS — there are three buttons: macOS (`.command`), Windows (`.bat`), Linux (`.sh`). The launcher already contains the server URL and a shared-secret token.
- Double-click the file you just downloaded. macOS: first launch may show "from an unidentified developer" — right-click → Open. Windows: SmartScreen may say "Windows protected your PC" — click More info → Run anyway.
- The launcher auto-installs Ollama (Homebrew on macOS, the official installer on Linux, winget on Windows), starts `ollama serve`, pulls the requested model the first time, and stays connected until you press Ctrl-C.
- The Settings card flips to a green connected badge with the bridge host's name, OS, and model. Hit Test to round-trip a prompt through the bridge and confirm it answers.
The bridge tolerates network blips: a 120-second reconnect grace window holds in-flight requests open across a brief WebSocket drop, and the bridge client buffers any computed response that could not be delivered before the disconnect. Long bank-statement extractions that take 10+ minutes on a small model survive Tailscale or Wi-Fi hiccups without losing work.
The Test button also exposes a per-statement progress bar in `/finance`: the Alle auswerten banner runs through every unprocessed bank statement (Kontoauszug) in the background, one by one, and reports done / failed counts as it goes. That lets you start a bulk extraction on a quiet evening and check the result the next morning.
For users who run their own inference cluster, or who want to use
Groq / Together / xAI / OpenRouter / Mistral, pick OpenAI-compatible
in /settings, paste the base URL (https://api.groq.com/openai/v1,
http://192.168.1.50:8080/v1, …) and an API key if needed.
DocuSort can ping you out of band when a document needs your attention. Configure under Settings → Notifications:
- Telegram — create a bot via @BotFather, send any message to your bot, then visit `https://api.telegram.org/bot<TOKEN>/getUpdates` to find your numeric `chat_id` (a copy-pasteable curl sketch follows this list). Paste both into the form.
- Email — standard SMTP. Works with Gmail (use an app password), Fastmail, or your own server.
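The chat-id lookup from the Telegram bullet as a copy-pasteable command (substitute your bot token; `python3 -m json.tool` only pretty-prints the response):

```bash
curl -s "https://api.telegram.org/bot<TOKEN>/getUpdates" | python3 -m json.tool
# look for "chat": { "id": 123456789, ... } in the output
```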
Per-event toggles control the noise:
- Document landed in review — the classifier was unsure or the doc has incomplete metadata.
- Classification failed — the LLM call raised an exception.
- Document filed — every successful filing (off by default — too chatty for normal use).
- Bulk job finished — `analyze-all`, `retry-review`, and friends emit a summary message with the success / failure tally.
Each notification carries a clickable URL back to the document
detail page. Channel credentials live in secrets.yaml (mode 0600)
and are never logged.
A new Duplicates page (/duplicates) groups every byte-identical
pair in the library by SHA-256 hash and offers a one-click bulk
trash action. Pick which copy to keep per group (default: oldest)
or sweep them all at once. The dashboard shows an amber banner
when groups exist, so you do not have to remember to look.
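If you ever want to cross-check from a shell, the underlying idea (hash every file, group by identical SHA-256) is a one-liner; this is only an illustration, not how DocuSort computes its groups:

```bash
# prints groups of byte-identical files, separated by blank lines
find /volume1/Dokumente -type f -exec sha256sum {} + \
  | sort | uniq -w64 --all-repeated=separate
```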
All behaviour is controlled by three files in config/:
- `config.yaml` – paths, OCR settings, AI provider/model, sync target, thresholds
- `categories.yaml` – the list of categories and their subcategories
- `secrets.yaml` – API keys (mode 0600, gitignored). Written by the wizard.
Most users never edit these directly — the setup wizard and the
/settings page cover everything. The knobs that matter:
| Setting | Default | What it does |
|---|---|---|
| `ai.provider` | `anthropic` | `anthropic` · `openai` · `gemini` · `openai_compat` |
| `ai.model` | `claude-haiku-4-5-20251001` | Provider-specific model id |
| `ai.base_url` | `""` | Only for `openai_compat` (e.g. `http://localhost:11434/v1` for Ollama) |
| `ai.min_confidence` | `0.65` | Documents below this go to `_Review` |
| `ocr.languages` | `deu+eng` | Tesseract language packs |
| `ocr.max_parallel` | `2` | Cap on concurrent OCR + AI jobs (memory bound) |
| `sync.target_type` | `local` | `local` (rsync to a folder) or `rclone` (cloud) |
| `sync.local_path` | `""` | Target folder for local-mode backup |
| `sync.remote` | `""` | rclone remote, format `<name>:<path>` |
| `keep_original` | `true` | Keep an untouched copy of each original in `_Processed` |
| `dry_run` | `false` | Classify and log but don't move anything |
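Put together, a minimal `config.yaml` built from just these knobs might look like the sketch below, assuming the dotted names nest one-to-one as YAML; the wizard-written file on your host is the authoritative reference:

```yaml
ai:
  provider: anthropic
  model: claude-haiku-4-5-20251001
  base_url: ""              # only used with openai_compat
  min_confidence: 0.65
ocr:
  languages: deu+eng
  max_parallel: 2
sync:
  enabled: true
  target_type: local        # or rclone
  local_path: /mnt/backup/docusort
keep_original: true
dry_run: false
```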
After changing config from the CLI, restart the service. From the UI the wizard handles the restart for you.
```bash
python -m docusort            # watcher + web UI on the configured port (default 8080)
python -m docusort --once     # process existing files and exit
python -m docusort --no-web   # watcher only, no UI
python -m docusort --dry-run  # classify + log, no moves
python -m docusort --version
```

What happens when a file lands in the inbox:

- File appears in `inbox/`.
- Watcher waits until the file size stops changing (default 5 s).
- If the PDF has no text layer, `ocrmypdf` adds one.
- The first ~12 k characters go to the configured AI provider, together with the category list and the prompt that forces JSON output.
- The model replies with strict JSON: `category`, `subcategory`, `tags`, `date`, `sender`, `subject`, `confidence`, `reasoning`.
- Confidence ≥ 0.65 → move to `library/YYYY/Category/Subcategory/` (subcategory dir is omitted when empty). Lower → move to `_Review/` for a human look.
- The original is copied to `_Processed/` before being removed from `inbox/`.
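For illustration, a reply for the Vodafone invoice from the filename examples above could look like this; the field names come from the list above, while every value (including the subcategory and tags) is invented for the example:

```json
{
  "category": "Rechnungen",
  "subcategory": "Telekommunikation",
  "tags": ["vodafone", "mobilfunk", "rechnung"],
  "date": "2026-02-14",
  "sender": "Vodafone",
  "subject": "Mobilfunk-Februar",
  "confidence": 0.93,
  "reasoning": "Monthly mobile-phone invoice from Vodafone dated 14 February 2026."
}
```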
Per provider (typical one-page letter, ~3 k input + 200 output tokens):
| Provider | Model | Roughly per doc |
|---|---|---|
| Anthropic | Haiku 4.5 (with prompt cache) | ~$0.0005 |
| Anthropic | Sonnet 4.6 (with prompt cache) | ~$0.005 |
| OpenAI | gpt-4o-mini | ~$0.0008 |
| OpenAI | gpt-4o | ~$0.015 |
| Google | Gemini 2.5 Flash | ~$0.0005 |
| Google | Gemini 2.5 Pro | ~$0.008 |
| Local | Ollama (any model) | $0 — only your electricity |
A batch of 1 000 documents per month with Haiku 4.5 stays well under EUR 1 in API fees. The dashboard shows actual cost across all providers in real time, with cache savings (Anthropic) and cached-prompt savings (OpenAI) credited.
Every document detail page has a move-to-trash button. Trashed documents
move into a _Trash/ tree that mirrors the category layout on disk and become
hidden from the dashboard, tree and stats — but stay in the DB so they're
recoverable. The library's tree sidebar gets a "Papierkorb" entry whenever
the trash is non-empty. From there you can restore or permanently purge
individual items, or empty the whole trash.
- Dashboard → "ZIP laden" → downloads the whole library as a single ZIP.
- Library filtered → export a single year, a single category, or both.
- `_Trash/` is excluded by default.
- The download is streamed, so multi-GB exports don't spike memory.
Two backup paths, picked from the wizard or /settings → "Backup":
Mirror the library to any path on the host with rsync — a mounted USB stick, NAS share, NFS/SMB mount, second disk. No tokens, no OAuth.
In the UI: pick the "Lokaler Ordner / NAS-Mount" tile, browse to the folder with the built-in folder picker (or paste a path), enable. Equivalent config:
```yaml
sync:
  enabled: true
  target_type: local
  local_path: /mnt/backup/docusort
```
Backed by `rsync -a --delete --delete-excluded --exclude=_Trash/`. If rsync isn't installed, DocuSort falls back to a slower pure-Python copy.
DocuSort uses rclone for cloud sync — whatever rclone supports, DocuSort can sync to. Headless-friendly: no browser needed on the host. On the machine running DocuSort:
```bash
sudo apt install rclone   # Debian/Ubuntu
brew install rclone       # macOS
```
Then in `/settings` → Backup → Cloud-Speicher (rclone):
- WebDAV / Nextcloud · SFTP · S3 / R2 / MinIO: simple form — URL, credentials, done. No OAuth, works on any headless machine.
- Google Drive · Dropbox · OneDrive (folded behind "Show OAuth providers"): the only flow that needs OAuth. On a separate machine with a browser, run e.g. `rclone authorize "drive"` — it spawns a one-shot OAuth dance, prints a JSON token. Paste that token into the textarea in the UI; DocuSort writes the remote into `rclone.conf` for you. No `rclone config` interaction on the host.
A "Test" button next to each remote runs rclone lsd <remote>: so you
catch broken auth before flipping enabled: true. Broken OAuth remotes
(empty token field in rclone.conf) get a red "defekt" badge with a
one-click Reconnect button that re-opens the token-paste form.
For scheduled sync, point a systemd timer at
curl -XPOST http://localhost:<port>/api/sync/run (port defaults to 8080;
the value lives under web.port in config.yaml).
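A minimal service/timer pair for that; the unit names and the 03:00 schedule are examples, and the port should match your `web.port`:

```ini
# /etc/systemd/system/docusort-sync.service
[Unit]
Description=Trigger DocuSort backup sync

[Service]
Type=oneshot
ExecStart=/usr/bin/curl -X POST http://localhost:8080/api/sync/run

# /etc/systemd/system/docusort-sync.timer
[Unit]
Description=Nightly DocuSort backup sync

[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with `sudo systemctl enable --now docusort-sync.timer`.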
- Etappe 2: Web UI, cost tracking, SQLite + FTS5 search — shipped in v0.2.0
- Local-AI bridge so a small NAS can offload inference to a beefier desktop on the same network — shipped in v0.19.0 + robustness pass in v0.21.0
- Etappe 3: Telegram / email notification on new file or `_Review` entry — shipped in v0.22.0
- Etappe 4: Duplicate detection across the whole library — shipped in v0.22.0
- Etappe 6: Prompt caching for bulk imports (reuse system prompt across calls) — Anthropic ephemeral cache, already shipped earlier
- Etappe 5: Automatic reminders for contract termination dates
Proprietary — see LICENSE.
DocuSort is source-available but not open source. You may download, install and run it for personal, non-commercial use, and read the source code for inspection and security review. Modification, redistribution, derivative works, and commercial use require prior written permission from the copyright holder.
Versions up to and including v0.12.3 are still available under the MIT License for anyone who obtained a copy of those releases — that does not change retroactively. The proprietary terms apply to v0.12.4 and later.




