OCR Extractor Web App

OCR Extractor is a Flask-based web application that accepts multiple image uploads, runs OCR on each image with Tesseract, and presents the extracted text in a clean, copyable interface. Each image result is displayed in its own section so text from different uploads stays clearly separated.

Project Overview

Upload multiple image files in a single request
Extract text from each image using the tesseract CLI
Normalize OCR output for cleaner spacing and readability
Copy extracted text from per-image result panels
Download a single .txt file containing text from all processed images
Use a lightweight Tailwind CSS interface for a simple user experience
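The extraction and normalization steps listed above can be sketched as follows. This is a minimal illustration, not the code in app/services/ocr.py or app/utils/text.py: the function names `normalize_text`, `build_command`, and `extract_text` are hypothetical, and the exact normalization rules are an assumption. The only Tesseract behavior relied on is that passing `stdout` as the output base makes the CLI print the recognized text to standard output.

```python
import re
import subprocess


def normalize_text(raw):
    """Collapse runs of spaces/tabs and squeeze 3+ newlines into one blank line.

    Hypothetical helper; the project's real rules may differ.
    """
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.split("\n")]
    text = "\n".join(lines)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def build_command(image_path, tesseract_command="tesseract"):
    """Build the CLI call; 'stdout' tells tesseract to print the text."""
    return [tesseract_command, image_path, "stdout"]


def extract_text(image_path, tesseract_command="tesseract"):
    """Run tesseract on one image and return normalized text.

    Raises subprocess.CalledProcessError if tesseract exits with an error.
    """
    result = subprocess.run(
        build_command(image_path, tesseract_command),
        capture_output=True,
        text=True,
        check=True,
    )
    return normalize_text(result.stdout)
```

Keeping normalization separate from the subprocess call means the text cleanup can be tested without Tesseract installed.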

Project Structure

ocr_extract/
├── app/
│   ├── services/
│   │   └── ocr.py
│   ├── templates/
│   │   └── index.html
│   ├── utils/
│   │   └── text.py
│   ├── __init__.py
│   └── routes.py
├── tests/
│   └── test_app.py
├── config.py
├── implementation.md
├── requirements.txt
├── run.py
└── todo.md

Setup Instructions

Create and activate a virtual environment.

python3 -m venv .venv
source .venv/bin/activate

Install Python dependencies.

pip install -r requirements.txt

Install Tesseract OCR if it is not already available on your system.

Ubuntu or Debian:

sudo apt-get update
sudo apt-get install -y tesseract-ocr

Optionally configure environment variables.

export SECRET_KEY="replace-this-in-production"
export TESSERACT_COMMAND="tesseract"
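One way config.py might read these two variables is sketched below. The function name `load_config` and the fallback values are assumptions for illustration; the actual defaults in this project are not shown in this README.

```python
import os


def load_config(environ=None):
    """Return settings read from the environment, with fallbacks.

    Hypothetical helper: the fallback strings below are illustrative
    assumptions, not the project's confirmed defaults.
    """
    env = os.environ if environ is None else environ
    return {
        # Used by Flask for session signing; override it in production.
        "SECRET_KEY": env.get("SECRET_KEY", "dev-only-change-me"),
        # Name or full path of the tesseract executable.
        "TESSERACT_COMMAND": env.get("TESSERACT_COMMAND", "tesseract"),
    }
```

If the returned dictionary is passed to Flask, `app.config.update(load_config())` would apply both values at startup.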

Usage Guide

Start the Flask development server.

python3 run.py

Open http://127.0.0.1:5000 in your browser.
Upload one or more image files.
Click Extract Text.
Review the combined output or the per-image result cards.
Use Copy All Text, Download TXT, or the per-image copy buttons as needed.
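The combined download could be assembled along these lines. This is only a sketch: the function name `build_combined_text` and the `=====` header format are assumptions, and the real .txt layout produced by the app may differ.

```python
def build_combined_text(results):
    """Join per-image OCR text into one string for a single .txt download.

    `results` is a list of (filename, text) pairs. The header format is a
    hypothetical choice that keeps each image's text clearly separated.
    """
    sections = [
        f"===== {filename} =====\n{text.strip()}"
        for filename, text in results
    ]
    if not sections:
        return ""
    return "\n\n".join(sections) + "\n"
```

A per-file header mirrors the per-image result cards in the interface, so the source of each passage stays identifiable in the downloaded file.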

Running Tests

python3 -m unittest discover -s tests -v

Notes

Supported image types: PNG, JPG, JPEG, BMP, TIFF, GIF, WEBP
Maximum upload size is 16 MB per request
Tailwind CSS is loaded through its CDN for a simple setup
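The file-type and size limits above might be enforced with a check like the one below. The names `ALLOWED_EXTENSIONS` and `is_allowed_image` are hypothetical, and treating 16 MB as 16 * 1024 * 1024 bytes is an assumption. `MAX_CONTENT_LENGTH` is the standard Flask configuration key for capping request body size.

```python
from pathlib import Path

# Extensions taken from the supported types listed above. ".tif" is not
# included because the list names only TIFF.
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "bmp", "tiff", "gif", "webp"}

# Assumed byte value for the 16 MB per-request limit.
MAX_CONTENT_LENGTH = 16 * 1024 * 1024


def is_allowed_image(filename):
    """Return True if the filename has a supported image extension."""
    suffix = Path(filename).suffix.lower().lstrip(".")
    return suffix in ALLOWED_EXTENSIONS
```

With `app.config["MAX_CONTENT_LENGTH"]` set to this value, Flask rejects larger requests with a 413 response before the upload reaches the route.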
