ocrbase-hq/ocrbase

ocrbase: Model-Agnostic OCR API


📄 ocrbase is a lightweight, model-agnostic API that standardizes document parsing across visual language models (VLMs).

Key Features of ocrbase

🪶 Lightweight: Tiny Bun + Elysia service, single container, minimal footprint.

🔌 Model-Agnostic: Point at any supported VLM (GLM-OCR, PaddleOCR-VL) via env vars.

📊 State of the Art: Backed by models scoring ≥94.5 on OmniDocBench v1.5.

💎 Easy to Deploy: One command away from a working OCR API.

🧩 Core

  • /v1/parse: turn a document into text
  • /v1/parse/async: enqueue a parse job
  • /v1/extract: extract structured JSON from a document
  • /v1/extract/async: enqueue an extract job
  • /v1/job/:jobId: inspect parse or extract job status
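The routes above can be collected into a small typed helper so client code never hand-builds paths. This is a minimal sketch, not part of ocrbase: `ROUTES` and `endpointUrl` are hypothetical names, and the base URL assumes the port mapping from the Quick Start command.

```typescript
// Route templates copied from the endpoint list above.
const ROUTES = {
  parse: "/v1/parse",
  parseAsync: "/v1/parse/async",
  extract: "/v1/extract",
  extractAsync: "/v1/extract/async",
  job: "/v1/job/:jobId",
} as const;

type RouteName = keyof typeof ROUTES;

// Hypothetical helper: joins a base URL with a route and fills in `:jobId`.
function endpointUrl(baseUrl: string, route: RouteName, jobId?: string): string {
  let path: string = ROUTES[route];
  if (route === "job") {
    if (!jobId) throw new Error("jobId is required for the job route");
    path = path.replace(":jobId", encodeURIComponent(jobId));
  }
  // Strip trailing slashes so the result never contains "//v1".
  return baseUrl.replace(/\/+$/, "") + path;
}

console.log(endpointUrl("http://localhost:3000", "parse"));
// → http://localhost:3000/v1/parse
console.log(endpointUrl("http://localhost:3000/", "job", "abc 123"));
// → http://localhost:3000/v1/job/abc%20123
```

Request and response bodies are deliberately left out, because their schemas are not described in this README.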

🧠 Models

Both models are state of the art:

📋 Requirements

Important

ocrbase does not ship the models; point it at a running inference server:

🚀 Quick Start

docker run -d -p 3000:3000 \
  -e PADDLEOCR_URL=http://localhost:8190 \
  -e GLM_OCR_URL=http://localhost:5002 \
  --name ocrbase ghcr.io/ocrbase-hq/ocrbase

🛠️ Develop

bun install
bun dev

☁️ Optional S3 Input Staging

If S3_ACCESS_KEY_ID, S3_SECRET_ACCESS_KEY, S3_BUCKET, and S3_ENDPOINT are set, /v1/parse will:

  • upload incoming File inputs to S3
  • fetch remote document URLs and upload the contents to S3
  • upload base64 or data URL payloads to S3
  • pass a presigned GET URL into the selected document model

If any of those env vars is not set, ocrbase skips staging and sends the original input directly to the model.
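The all-or-nothing rule for S3 staging can be sketched as a check over the four variables. This is an illustration under stated assumptions, not ocrbase's actual code: `isS3StagingEnabled` and `missingS3Vars` are hypothetical helpers, and treating an empty string as "not set" is an assumption.

```typescript
// The four variables named in the text; all must be set for staging to apply.
const S3_VARS = [
  "S3_ACCESS_KEY_ID",
  "S3_SECRET_ACCESS_KEY",
  "S3_BUCKET",
  "S3_ENDPOINT",
] as const;

type Env = Record<string, string | undefined>;

// Assumption: a blank value counts as unset.
const isSet = (value: string | undefined): boolean => (value ?? "").trim() !== "";

// Hypothetical helper: true only when every S3 variable is set.
function isS3StagingEnabled(env: Env): boolean {
  return S3_VARS.every((name) => isSet(env[name]));
}

// Hypothetical helper: names of the variables that still need a value.
function missingS3Vars(env: Env): string[] {
  return S3_VARS.filter((name) => !isSet(env[name]));
}

const partial: Env = { S3_BUCKET: "docs", S3_ENDPOINT: "http://localhost:9000" };
console.log(isS3StagingEnabled(partial)); // → false
console.log(missingS3Vars(partial));      // → [ "S3_ACCESS_KEY_ID", "S3_SECRET_ACCESS_KEY" ]
```

Logging the output of a check like `missingS3Vars` at startup would make a half-configured deployment easier to spot, since a partial configuration otherwise falls back to direct mode without any signal.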

📬 Optional BullMQ Parse Queue

If REDIS_URL and the S3 env vars above are set, queue mode is enabled:

  • POST /v1/parse uploads or normalizes the input to S3, enqueues a parse job, waits for completion, and returns the normal parse response
  • POST /v1/parse/async returns 202 { jobId }
  • GET /v1/job/:jobId returns the job state plus result or error
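A client can combine the async and status endpoints into an enqueue-then-poll loop. The sketch below rests on several assumptions: the README only says the status endpoint returns "the job state plus result or error", so the field names `state`, `result`, and `error` are guesses, the state values `completed` and `failed` are borrowed from BullMQ's job state names, and the request is assumed to be a JSON body. `parseAsyncAndWait` and `FetchLike` are hypothetical names.

```typescript
// Assumed response shape of GET /v1/job/:jobId (field names are not confirmed).
interface JobStatus {
  state: string;
  result?: unknown;
  error?: unknown;
}

// Minimal fetch-like signature so a stub can be injected in tests.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ status: number; json(): Promise<unknown> }>;

async function parseAsyncAndWait(
  baseUrl: string,
  body: unknown, // request schema is not documented here; passed through as-is
  fetchFn: FetchLike,
  intervalMs = 1000,
  maxPolls = 60,
): Promise<unknown> {
  // Step 1: enqueue. The text states this returns 202 with { jobId }.
  const enqueue = await fetchFn(`${baseUrl}/v1/parse/async`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  if (enqueue.status !== 202) {
    throw new Error(`enqueue failed with HTTP ${enqueue.status}`);
  }
  const { jobId } = (await enqueue.json()) as { jobId: string };

  // Step 2: poll the status endpoint until the job settles or we give up.
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const res = await fetchFn(`${baseUrl}/v1/job/${encodeURIComponent(jobId)}`);
    const job = (await res.json()) as JobStatus;
    if (job.state === "completed") return job.result;
    if (job.state === "failed") {
      throw new Error(`job ${jobId} failed: ${JSON.stringify(job.error)}`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job ${jobId} did not finish after ${maxPolls} polls`);
}
```

In a real client the global `fetch` can be passed as `fetchFn`. A 503 from the enqueue call surfaces as the "enqueue failed" error, which is the signal that queue mode is not enabled on the server.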

If Redis is missing, or Redis is present but S3 is not fully configured, POST /v1/parse keeps its direct (non-queued) behavior and the async and status endpoints return 503.
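The rules for when queue mode applies reduce to a small decision function. This is a sketch of one reading of the text, not the project's implementation: `resolveMode` and `Mode` are hypothetical, blank values are assumed to count as unset, and S3 staging is assumed to stay active without Redis because the S3 section does not mention Redis.

```typescript
type Env = Record<string, string | undefined>;

const S3_VARS = [
  "S3_ACCESS_KEY_ID",
  "S3_SECRET_ACCESS_KEY",
  "S3_BUCKET",
  "S3_ENDPOINT",
] as const;

// Assumption: a blank value counts as unset.
const isSet = (value: string | undefined): boolean => (value ?? "").trim() !== "";

interface Mode {
  queue: boolean;         // POST /v1/parse is routed through the BullMQ queue
  s3Staging: boolean;     // inputs are staged in S3 before reaching the model
  asyncStatus: 202 | 503; // status returned by POST /v1/parse/async
}

// Queue mode needs both Redis and a complete S3 configuration.
function resolveMode(env: Env): Mode {
  const s3Staging = S3_VARS.every((name) => isSet(env[name]));
  const queue = s3Staging && isSet(env.REDIS_URL);
  return { queue, s3Staging, asyncStatus: queue ? 202 : 503 };
}

// Redis alone is not enough: the async endpoint stays unavailable.
console.log(resolveMode({ REDIS_URL: "redis://localhost:6379" }));
// → { queue: false, s3Staging: false, asyncStatus: 503 }
```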

When queue mode is enabled, Bull Board is also available at /v1/admin/queues.

About

📄 PDF/IMG -> MD/JSON document OCR API for PaddleOCR and GLM-OCR. Self-hostable.
