diff --git a/README.md b/README.md
index 966ee5f..138e727 100644
--- a/README.md
+++ b/README.md
@@ -1,68 +1,53 @@
-
+
- A privacy-first desktop PDF utility for Windows.
+ Privacy-first PDF tools for humans and AI agents — entirely local.
- Read, search, copy, merge, split, extract, and compress PDFs — all locally, no uploads required.
+ Read, search, compare, merge, split, extract, and compress PDFs without uploading documents anywhere.
- Download
- ·
- Features
- ·
- Screenshots
- ·
- Build
- ·
- Privacy
- ·
- AI Agent
- ·
- Architecture
- ·
- Versioning
-
+---
 
 ## Overview
 
-OpenReader is a **local-first desktop PDF utility** built with Python, PySide6, and PyMuPDF. It is designed for people who want common PDF tasks in a simple native app without sending private documents to a cloud service.
+OpenReader is a **local-first PDF utility** that works with AI agents.
 
-The app is intentionally local-first: PDFs are opened, rendered, searched, merged, split, annotated, and compressed on your computer — no uploads, no accounts, no telemetry.
+Every operation — reading, searching, annotating, compressing, comparing, merging, splitting — runs on your machine. No accounts. No subscriptions. No telemetry. No cloud uploads.
 
-**Current version:** v1.2.2 (June 2026)
+Use OpenReader directly, from scripts, or through AI agents.
+
+---
 
 ## Download
 
-### Recommended: Microsoft Store
+### Microsoft Store (Recommended)
 
-The Microsoft Store submission is in certification. Once approved, install OpenReader with one click — automatic updates included.
+The Store submission is in certification. Once approved, install with one click — automatic updates included.
 
 *Store link will appear here after certification.*
 
 ### GitHub Releases (Advanced Users)
 
-MSIX packages are available from the [Releases page](https://github.com/sparshsam/pdfreader-by-sparsh/releases).
+MSIX packages are available on the [Releases page](https://github.com/sparshsam/pdfreader-by-sparsh/releases).
 
 | Platform | Package | Notes |
 |---|---|---|
-| Windows 10/11 | `OpenReader.msix` | MSIX package. May be unsigned — requires [Windows Developer Mode](https://learn.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development) for sideloading. |
-| Windows 10/11 | `OpenReader-Setup.exe` | Legacy Inno Setup installer for manual recovery. Requires administrator rights. |
-| Windows 10/11 | `OpenReader-Windows.zip` | Portable ZIP for manual recovery. |
-| macOS | `OpenReader-macOS-*.zip` | **Experimental.** Community-tested. See [macOS notes](docs/macos.md). |
+| Windows 10/11 | `OpenReader.msix` | MSIX package. May be unsigned — requires [Developer Mode](https://learn.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development) for sideloading. |
+| Windows 10/11 | `OpenReader-Setup.exe` | Legacy installer for manual recovery. Requires administrator rights. |
+| Windows 10/11 | `OpenReader-Windows.zip` | Portable ZIP. |
+| macOS | `OpenReader-macOS-*.zip` | **Experimental.** See [macOS notes](docs/macos.md). |
 | Linux | — | Unsupported. |
 
 ### Platform Support
@@ -81,8 +66,77 @@ MSIX packages are available from the [Releases page](https://github.com/sparshsa
 OpenReader does not install updates itself.
 
 - **Microsoft Store** installations update automatically through the Store.
-- **GitHub MSIX** installations can check for new versions (Help → Check for Updates) but updates must be downloaded and installed manually.
-- **Source builds** should be updated with `git pull` and rebuilt locally.
+- **GitHub MSIX** installations: Help → Check for Updates opens the releases page. Download and install manually.
+- **Source builds**: `git pull` and rebuild.
+
+---
+
+## AI Agent Integration (MCP Server)
+
+OpenReader ships with a built-in [MCP (Model Context Protocol)](https://modelcontextprotocol.io) server. Any MCP-compatible agent — Claude Code, Claude Desktop, Hermes, or others — can interact with PDFs directly on your machine.
+
+No cloud, no API keys, no document uploads.
+
+### What you can do with AI agents
+
+| Workflow | What happens |
+|---|---|
+| **Ask questions about a PDF** | Agent extracts text from any page and answers. |
+| **Search entire PDF libraries** | Agent indexes a folder and searches across all documents (keyword or semantic). |
+| **Compare document versions** | Agent runs a side-by-side diff and gives you a structured summary. |
+| **Summarize research collections** | Agent reads multiple PDFs and synthesizes findings. |
+| **Build automated PDF pipelines** | Write scripts that merge, split, compress, and extract — all local. |
+
+### Architecture
+
+```
+┌─────────────────────────────────────────────────────┐
+│  Claude / Hermes / any MCP-compatible agent         │
+│  (asks questions, runs searches, compares docs)     │
+└──────────┬──────────────────────────────────────────┘
+           │ MCP protocol (stdio or SSE)
+           ▼
+┌─────────────────────────────────────────────────────┐
+│  OpenReader MCP Server                              │
+│  pdfreader_lib/mcp_server.py                        │
+│  14 tools: extract, search, compare, merge, split…  │
+└──────────┬──────────────────────────────────────────┘
+           │ local file access only
+           ▼
+┌─────────────────────────────────────────────────────┐
+│  Your PDFs (stored on your machine)                 │
+└─────────────────────────────────────────────────────┘
+```
+
+### Quick setup
+
+```bash
+# Install MCP dependencies
+pip install -r requirements-mcp.txt
+```
+
+Add to your MCP-compatible agent's configuration:
+
+```json
+{
+  "mcpServers": {
+    "openreader": {
+      "command": "python",
+      "args": ["-m", "pdfreader_lib.mcp_server"]
+    }
+  }
+}
+```
+
+The server runs over stdio by default. For HTTP/SSE transport:
+
+```bash
+python -m pdfreader_lib.mcp_server --transport sse --port 8312
+```
+
+All operations are local. No data is uploaded anywhere.
+
+---
 
 ## Features
 
@@ -105,19 +159,23 @@ OpenReader does not install updates itself.
 | Recent files | Quick access to the last 10 opened PDFs via File → Open Recent |
 | Update detection | Help → Check for Updates queries GitHub API and opens the releases page. |
 
+---
+
 ## Screenshots
 
 | Reader | Sample PDF |
 |---|---|
-|  |  |
+|  |  |
 
-| Sample PDF 2 | PDF Tools |
+| Dark Mode | PDF Tools |
 |---|---|
-|  |  |
+|  |  |
 
-| Dark Mode | About |
+| Sample PDF 2 | |
 |---|---|
-|  |  |
+|  | |
+
+---
 
 ## Privacy and Security
 
@@ -136,18 +194,15 @@ The app includes lightweight safety checks before opening and rendering document
 
 These checks reduce risk from malformed or oversized PDFs, but PDF parsing still depends on PyMuPDF/MuPDF. Avoid opening PDFs from untrusted sources unless you use OS-level sandboxing, a VM, or another isolation layer.
 
+---
+
 ## License
 
 OpenReader is free software under the [GNU AGPLv3](LICENSE).
 
 Copyright © 2026 Sparsh Sam.
 
-## Requirements
-
-| Use case | Requirements |
-|---|---|
-| Run Windows package | Windows 10 or newer. Python is not required. |
-| Develop or build from source | Python 3.11 or newer. Windows recommended. macOS source builds may work but are not tested. |
+---
 
 ## Build From Source
 
@@ -160,7 +215,7 @@ python -m pip install -r requirements.txt
 python main.py
 ```
 
-Build the Windows executable:
+Build the executable:
 
 ```powershell
 .\scripts\build_windows.ps1
@@ -179,9 +234,7 @@ dist\OpenReader\
 
 ### macOS
 
-The Windows `.exe` cannot run on macOS. PyInstaller bundles native binaries for the operating system it runs on.
-
-**macOS packaged builds are experimental.** The app is developed and tested primarily on Windows. To run on macOS, build from source:
+macOS packaged builds are **experimental**. To run from source:
 
 ```bash
 git clone https://github.com/sparshsam/pdfreader-by-sparsh.git
@@ -192,53 +245,19 @@ pip install -r requirements.txt
 python main.py
 ```
 
-See [docs/macos.md](docs/macos.md) for macOS setup, Finder "Open With" notes, icon generation, and OCR notes.
+See [docs/macos.md](docs/macos.md) for macOS setup and OCR notes.
 
-## OCR Setup
+### OCR Setup
 
 Text selection works natively on PDFs with embedded text. For scanned/image-only PDFs, the app falls back to OCR via PyMuPDF's Tesseract integration.
 
-No OCR setup is needed to read regular PDFs — the app only requires Tesseract when you drag-select text on a scanned page.
-
-### Installing Tesseract
+**Windows:** Download Tesseract from [UB-Mannheim/tesseract](https://github.com/UB-Mannheim/tesseract/releases), run the installer, check "Add to PATH", restart the app.
 
-**Windows**
-1. Download the installer from [GitHub UB-Mannheim/tesseract](https://github.com/UB-Mannheim/tesseract/releases)
-2. Run the installer — check "Add to PATH" during setup
-3. Restart the app; OCR will work automatically
-
-**macOS**
-```bash
-brew install tesseract
-```
-No restart needed — PyMuPDF finds it automatically.
-
-**Linux (source builds)**
-```bash
-# Debian / Ubuntu
-sudo apt install tesseract-ocr tesseract-ocr-eng
+**macOS:** `brew install tesseract`
 
-# Fedora
-sudo dnf install tesseract tesseract-langpack-eng
+**Linux (source builds):** `sudo apt install tesseract-ocr tesseract-ocr-eng`
 
-# Arch
-sudo pacman -S tesseract tesseract-data-eng
-```
-
-## Roadmap
-
-### Near-Term
-- **Microsoft Store submission** — currently in certification
-- **Local AI summarization** — generate document summaries and extract key points using a local LLM (e.g. Ollama, llama.cpp); no data ever leaves your machine
-- **Stronger sandboxing guidance** — documented approaches for running the app in an OS sandbox when opening documents from untrusted sources
-- **Winget support** — `winget install SparshSam.OpenReader`
-
-### Long-Term Vision
-- **Cross-platform desktop support** — native builds for Linux in addition to Windows and macOS
-- **Secure research workspace** — a sandboxed reading environment with isolated rendering and optional network blocking
-- **PDF timeline and version history** — track changes across document revisions
-- **Plugin system** — a lightweight extension API for community-contributed tools
-- **Collaborative annotations (optional, wallet-based)** — share annotations between trusted peers using cryptographic identity
+---
 
 ## Project Structure
 
@@ -259,86 +278,12 @@ sudo pacman -S tesseract tesseract-data-eng
 └── CHANGELOG.md
 ```
 
+---
+
 ## Contributing
 
 Contributions are welcome. Please read [CONTRIBUTING.md](CONTRIBUTING.md) and [SECURITY.md](SECURITY.md) before opening issues or pull requests.
 
-## AI Agent Integration (MCP Server)
-
-OpenReader ships with a built-in [MCP (Model Context Protocol)](https://modelcontextprotocol.io) server that lets AI agents interact with PDFs programmatically. Agents can read, search, compare, merge, split, compress, and index PDFs — all locally, no cloud involved.
-
-### Available Tools (14)
-
-| Tool | Purpose |
-|---|---|
-| `extract_text` | Extract all text from a PDF, per-page |
-| `get_page_text` | Extract text from a single page |
-| `get_metadata` | Get PDF metadata (title, author, pages, size) |
-| `get_page_count` | Get the number of pages |
-| `search_pdf` | Search for text within a single PDF |
-| `compare_pdfs` | Compare two PDFs page-by-page with diff |
-| `merge_pdfs` | Merge multiple PDFs into one |
-| `split_pdf` | Split into individual page files |
-| `extract_pages` | Extract specific pages by range (e.g. `1-3,5,7-9`) |
-| `compress_pdf` | Create a compressed copy |
-| `index_folder` | Build SQLite FTS5 full-text index for a folder |
-| `search_library` | Search across all indexed PDFs (BM25 ranked) |
-| `search_semantic` | TF-IDF meaning-based search across indexed PDFs |
-| `list_indexed_docs` | List all documents in the search index |
-
-### Setup
-
-```bash
-# Install the MCP SDK
-pip install -r requirements-mcp.txt
-
-# For SSE/HTTP transport (optional):
-# pip install starlette uvicorn
-```
-
-### Agent Configuration
-
-**Claude Code, Hermes Agent, or any MCP-compatible agent:**
-
-Add to your agent's MCP server configuration:
-
-```json
-{
-  "mcpServers": {
-    "pdfreader-by-sparsh": {
-      "command": "python",
-      "args": ["-m", "pdfreader_lib.mcp_server"]
-    }
-  }
-}
-```
-
-### Usage
-
-The server runs over stdio by default (standard for AI agents):
-
-```bash
-python -m pdfreader_lib.mcp_server
-```
-
-For HTTP/SSE transport (gateway mode):
-
-```bash
-python -m pdfreader_lib.mcp_server --transport sse --port 8312
-```
-
-### What Agents Can Do
-
-- **Extract text** from PDFs for analysis or summarization
-- **Search** across a folder of PDFs using full-text or semantic search
-- **Compare** document versions and get structured diffs
-- **Merge** multiple PDFs into one document
-- **Split** PDFs by page or extract specific page ranges
-- **Compress** PDFs to reduce file size
-- **Index** entire folders for cross-document search
-
-All operations are local. No data is uploaded anywhere.
-
 ## Tech Stack
 
 | Layer | Choice |
diff --git a/assets/screenshot-main.png b/assets/screenshot-main.png
deleted file mode 100644
index b6a6f35..0000000
Binary files a/assets/screenshot-main.png and /dev/null differ
diff --git a/assets/screenshot-search.png b/assets/screenshot-search.png
deleted file mode 100644
index 2662961..0000000
Binary files a/assets/screenshot-search.png and /dev/null differ
diff --git a/assets/screenshots/v1.2.2/about.png b/assets/screenshots/v1.2.2/about.png
new file mode 100644
index 0000000..656a5b6
Binary files /dev/null and b/assets/screenshots/v1.2.2/about.png differ
diff --git a/assets/screenshots/v1.2.2/dark-mode.png b/assets/screenshots/v1.2.2/dark-mode.png
new file mode 100644
index 0000000..6a534f7
Binary files /dev/null and b/assets/screenshots/v1.2.2/dark-mode.png differ
diff --git a/assets/screenshots/v1.2.2/merge-split.png b/assets/screenshots/v1.2.2/merge-split.png
new file mode 100644
index 0000000..862ee73
Binary files /dev/null and b/assets/screenshots/v1.2.2/merge-split.png differ
diff --git a/assets/screenshots/v1.2.2/reader-main.png b/assets/screenshots/v1.2.2/reader-main.png
new file mode 100644
index 0000000..f9f893d
Binary files /dev/null and b/assets/screenshots/v1.2.2/reader-main.png differ
diff --git a/assets/screenshots/v1.2.2/sample-pdf-2.png b/assets/screenshots/v1.2.2/sample-pdf-2.png
new file mode 100755
index 0000000..3990ef5
Binary files /dev/null and b/assets/screenshots/v1.2.2/sample-pdf-2.png differ
diff --git a/assets/screenshots/v1.2.2/sample-pdf.png b/assets/screenshots/v1.2.2/sample-pdf.png
new file mode 100644
index 0000000..0e2f3a9
Binary files /dev/null and b/assets/screenshots/v1.2.2/sample-pdf.png differ
diff --git a/tools/capture_screenshots.py b/tools/capture_screenshots.py
new file mode 100644
index 0000000..906d0ab
--- /dev/null
+++ b/tools/capture_screenshots.py
@@ -0,0 +1,111 @@
+#!/usr/bin/env python3
+"""Capture OpenReader v1.2.2 screenshots in offscreen mode (1920x1080).
+
+Usage:
+    source .venv-test-screenshots/bin/activate
+    QT_QPA_PLATFORM=offscreen python tools/capture_screenshots.py
+
+Output: assets/screenshots/v1.2.2/
+"""
+
+import os
+import sys
+from pathlib import Path
+
+ROOT = Path(__file__).resolve().parents[1]
+sys.path.insert(0, str(ROOT))
+
+os.environ.setdefault("QT_QPA_PLATFORM", "offscreen")
+
+from PySide6.QtWidgets import QApplication
+from PySide6.QtCore import QEventLoop, QTimer as QLoopTimer
+
+OUT_DIR = ROOT / "assets" / "screenshots" / "v1.2.2"
+OUT_DIR.mkdir(parents=True, exist_ok=True)
+
+app = QApplication(sys.argv)
+
+
+def shot(widget, filename, delay_ms=600):
+    """Capture a widget after a short delay for rendering."""
+    result = []
+
+    def capture():
+        pixmap = widget.grab()
+        path = OUT_DIR / filename
+        pixmap.save(str(path))
+        result.append(path)
+        print(f" Captured: {path} ({pixmap.width()}x{pixmap.height()})")
+
+    QLoopTimer.singleShot(delay_ms, capture)
+    loop = QEventLoop()
+    QLoopTimer.singleShot(delay_ms + 200, loop.quit)
+    loop.exec()
+    return result[0] if result else None
+
+
+def main():
+    from main import PdfReaderWindow
+
+    print("Creating window...")
+    window = PdfReaderWindow()
+    window.show()
+    window.resize(1920, 1080)
+    app.processEvents()
+
+    # 1. Empty state — fresh launch, no PDFs open
+    print("\n1. Empty state...")
+    shot(window, "empty-state.png", delay_ms=800)
+
+    # 2. Open a test PDF
+    test_pdf = str(ROOT / "screenshots_test.pdf")
+    if os.path.exists(test_pdf):
+        print(f"\n2. Opening test PDF...")
+        window.open_pdf(test_pdf)
+        app.processEvents()
+        shot(window, "reader-main.png", delay_ms=1000)
+
+    # 3. Dark mode
+    print("\n3. Dark mode...")
+    window.set_theme(window.THEME_DARK)
+    app.processEvents()
+    shot(window, "dark-mode.png", delay_ms=800)
+
+    # Reset to light
+    print("\n4. Light mode for tools...")
+    window.set_theme(window.THEME_LIGHT)
+    app.processEvents()
+
+    # 5. Tools view
+    print("\n5. Tools view...")
+    # Try to open the merge/split dialog for a more interesting capture
+    try:
+        window._open_compare_dialog()
+        app.processEvents()
+    except Exception:
+        pass  # nosec B110 — expected failure if compare dialog unavailable
+    shot(window, "merge-split.png", delay_ms=800)
+
+    # 6. About dialog — use the real about method
+    print("\n6. About dialog...")
+    try:
+        window._show_about()
+        app.processEvents()
+        about_widgets = [w for w in app.topLevelWidgets() if w != window and w.isVisible()]
+        if about_widgets:
+            shot(about_widgets[0], "about.png", delay_ms=800)
+            about_widgets[0].close()
+        else:
+            print(" No about dialog found, capturing fallback...")
+            shot(window, "about.png")
+    except Exception as e:
+        print(f" About dialog failed: {e}")
+        shot(window, "about.png")
+
+    print(f"\nAll screenshots saved to: {OUT_DIR}")
+    for p in sorted(OUT_DIR.glob("*.png")):
+        print(f" {p.name}")
+
+
+if __name__ == "__main__":
+    main()
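
Review note: the capture script falls back silently in several places (the bare `except Exception: pass` around the dialog call, the fallback `about.png` capture, and `shot` returning `None` when nothing was saved), and the names it writes (`empty-state.png`, `reader-main.png`, `dark-mode.png`, `merge-split.png`, `about.png`) do not fully match the PNGs added under `assets/screenshots/v1.2.2/` in this diff. A post-capture check could make a gap visible. The following is a minimal sketch only: `missing_screenshots` and `EXPECTED_SCREENSHOTS` are hypothetical names that do not exist in the repository, and the expected list is simply the set of PNG files this diff adds.

```python
from pathlib import Path

# Assumption: the expected set is the list of PNGs added under
# assets/screenshots/v1.2.2/ in this diff; adjust it if the README changes.
EXPECTED_SCREENSHOTS = (
    "about.png",
    "dark-mode.png",
    "merge-split.png",
    "reader-main.png",
    "sample-pdf-2.png",
    "sample-pdf.png",
)


def missing_screenshots(out_dir, expected=EXPECTED_SCREENSHOTS):
    """Return expected filenames that are absent or zero bytes in out_dir.

    Hypothetical helper, not part of the repository.
    """
    out_dir = Path(out_dir)
    missing = []
    for name in expected:
        path = out_dir / name
        # A zero-byte file is treated as missing, since a failed save
        # can leave an empty file behind.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

One possible use would be to call `missing_screenshots(OUT_DIR)` at the end of `main()` and exit with a non-zero status when the returned list is not empty, so a CI run or a manual capture session would notice an incomplete set.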