LocalDub: Real-Time AI YouTube Dubbing

A low-latency AI dubbing extension for YouTube that runs against a backend on your own machine, with no paid cloud APIs.

What is LocalDub?

LocalDub pairs a Chrome extension with a Python backend to translate and dub YouTube videos into your language (English, Turkish, German, Spanish, or French) in real time. You do not pay for any cloud APIs. Audio extraction, speech-to-text, and LLM translation run on your local machine. Speech synthesis uses Edge-TTS, which sends the translated text to Microsoft's online voice service and therefore needs an internet connection.

Important

LocalDub features a Zero-Drift HTML5 Time-Stretching Algorithm. When the AI generates a translation that is longer or shorter than the original video scene, the browser adjusts the playback rate of the dubbed audio while preserving the pitch of the voice. Each clip is fitted to its own scene, so the dubbed audio stays aligned with the video frames and delay does not accumulate from one scene to the next.
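The arithmetic behind this can be sketched in a few lines. The snippet below is a minimal illustration in Python, not the extension's actual code: the function names and the clamp bounds of 0.5 and 2.0 are assumptions. In the browser, the resulting value would be assigned to the media element's `playbackRate`, with `preservesPitch` left enabled.

```python
def playback_rate(dub_seconds: float, slot_seconds: float,
                  min_rate: float = 0.5, max_rate: float = 2.0) -> float:
    """Rate at which a dubbed clip must play to fill its slot.

    The clamp bounds are hypothetical; they keep speech intelligible.
    """
    if dub_seconds <= 0 or slot_seconds <= 0:
        raise ValueError("durations must be positive")
    rate = dub_seconds / slot_seconds
    return max(min_rate, min(max_rate, rate))


def drift_after(clips: list[tuple[float, float]]) -> float:
    """Accumulated drift in seconds after playing (dub, slot) pairs."""
    drift = 0.0
    for dub, slot in clips:
        played = dub / playback_rate(dub, slot)  # real time the clip takes
        drift += played - slot
    return drift
```

A 4.5-second dub in a 3-second slot plays at 1.5x and leaves no drift. A clip that hits the clamp (for example 9 seconds in a 3-second slot) still overruns, which is why a real implementation would need a policy for such cases.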

Architecture & How it Works

The system operates using a seamless pipeline between the browser and your local GPU/CPU:

```mermaid
sequenceDiagram
    participant YT as YouTube (Chrome Extension)
    participant FF as FFmpeg (Downloader)
    participant ASR as Faster-Whisper (Speech-to-Text)
    participant LLM as Ollama Gemma2 (Translator)
    participant TTS as Edge-TTS (Voice Synthesizer)

    YT->>FF: "Video is at 0:03. Give me the audio."
    FF->>ASR: Extracts 3-second raw audio chunk
    ASR->>LLM: Transcribes: "Hello guys"
    LLM->>TTS: Translates: "Merhaba arkadaşlar"
    TTS->>YT: Sends timed MP3 via WebSockets
    Note over YT,TTS: The extension receives the audio, calculates playbackRate, and plays it in sync with the video.
```
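The flow in the diagram can be sketched as a chain of four stages. This is a structural illustration only, assuming a hypothetical `DubPipeline` class whose stages are injected as plain callables. The real backend's function names and message format are not shown in this README.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DubPipeline:
    # Each stage is injected so real engines can replace these placeholders.
    extract_audio: Callable[[float, float], bytes]  # (start_s, length_s) -> raw audio
    transcribe: Callable[[bytes], str]              # raw audio -> source text
    translate: Callable[[str, str], str]            # (text, target_lang) -> translation
    synthesize: Callable[[str, str], bytes]         # (text, target_lang) -> encoded audio

    def dub_chunk(self, start_s: float, length_s: float, target_lang: str) -> dict:
        audio = self.extract_audio(start_s, length_s)
        text = self.transcribe(audio)
        if not text.strip():
            # Nothing was spoken in this chunk, so there is nothing to dub.
            return {"start": start_s, "length": length_s, "text": "", "audio": b""}
        translated = self.translate(text, target_lang)
        return {
            "start": start_s,
            "length": length_s,
            "text": translated,
            "audio": self.synthesize(translated, target_lang),
        }
```

Returning the chunk's start time and length alongside the audio gives the extension what it needs to compute a playback rate for that chunk.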

Core Technologies

- Faster-Whisper: A fast offline speech-to-text (ASR) engine. Its built-in Voice Activity Detection (VAD) filter strips silent frames before transcription.
- Ollama (Gemma2 / Llama3): A local LLM pipeline constrained to act as a pure machine translator. It uses few-shot completion prompting, a temperature of 0.0, and a strict stop token (\n) so that the model returns only the translation, without conversational text, and preserves technical terms (e.g., Firewall, React, Cheatsheet).
- Edge-TTS: Microsoft's natural-sounding neural voice synthesis engine, reached through Microsoft's online speech service.
- HTML5 Time-Stretching: The frontend logic that scales each audio clip to fit its visual scene while preserving pitch, so the voice does not sound distorted.
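The translator settings listed above (few-shot prompt, temperature 0.0, newline stop token) could be expressed as a request body for Ollama's `/api/generate` endpoint. The sketch below only builds the payload. The function name, the prompt wording, and the `Source:` / `Translation:` labels are assumptions, not the project's actual prompt.

```python
import json


def build_translation_request(text: str, target_language: str,
                              examples: list[tuple[str, str]],
                              model: str = "gemma2:9b") -> dict:
    """Build a completion-style request that ends where the translation starts."""
    # Collapse newlines in the input so the "\n" stop token stays unambiguous.
    clean = " ".join(text.split())
    lines = [f"Translate into {target_language}. Keep technical terms unchanged."]
    for source, target in examples:
        lines.append(f"Source: {source}")
        lines.append(f"Translation: {target}")
    lines.append(f"Source: {clean}")
    lines.append("Translation:")
    return {
        "model": model,
        "prompt": "\n".join(lines),
        "stream": False,
        # Deterministic output, cut off at the first newline.
        "options": {"temperature": 0.0, "stop": ["\n"]},
    }


request = build_translation_request(
    "Hello guys", "Turkish", [("Hello guys", "Merhaba arkadaşlar")]
)
body = json.dumps(request, ensure_ascii=False)
```

Because the prompt ends with `Translation:` and generation stops at the first newline, the model has little room to add anything other than the translated line.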

Installation Guide (Step-by-Step)

Follow these instructions carefully to set up your local AI dubbing studio.

Step 1: Install Python & FFmpeg

Download and install Python 3.12 from Python.org.
- Critical: During installation, make sure to check the box that says "Add Python to PATH".
Download FFmpeg. Extract the folder and add its bin directory to your Windows Environment Variables (PATH). Open Command Prompt (CMD) and run ffmpeg -version to verify that it is installed correctly.
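If you prefer to check both prerequisites from Python, a small script along these lines would do it. It is a convenience sketch, not part of the repository, and it treats Python 3.12 as the minimum version, which is an assumption based on the version named above.

```python
import shutil
import sys


def check_prerequisites(min_python: tuple[int, int] = (3, 12)) -> dict:
    """Report whether Python is recent enough and where ffmpeg was found."""
    return {
        "python_ok": sys.version_info[:2] >= min_python,
        # None means ffmpeg is not on PATH.
        "ffmpeg_path": shutil.which("ffmpeg"),
    }


if __name__ == "__main__":
    print(check_prerequisites())
```

A `ffmpeg_path` of `None` usually means the bin directory was not added to PATH or the terminal was opened before the change took effect.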

Step 2: Install Ollama (The AI Brain)

Download and install Ollama from Ollama.com.
Open your terminal (CMD or PowerShell) and pull the translation model by running:
```
ollama pull gemma2:9b
```
(Note: This is a 5GB+ download. Wait for it to finish before continuing.)
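To confirm that the model is available, you can ask the local Ollama server for its model list through the `/api/tags` endpoint. The sketch below assumes Ollama's default address of `http://localhost:11434`. The helper names are hypothetical.

```python
import json
import urllib.request


def installed_models(tags_json: str) -> list[str]:
    """Extract model names from the JSON returned by /api/tags."""
    data = json.loads(tags_json)
    return [model["name"] for model in data.get("models", [])]


def fetch_tags(base_url: str = "http://localhost:11434") -> str:
    """Fetch the raw model list; requires the Ollama server to be running."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return resp.read().decode("utf-8")
```

Calling `installed_models(fetch_tags())` should include `gemma2:9b` once the download has completed.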

Step 3: Install the Backend Server

Clone or download this repository to your computer.
Open a terminal inside the downloaded folder, navigate to the backend directory, and install the required Python libraries:
```
cd backend
pip install -r requirements.txt
```

Step 4: Load the Chrome Extension

Open Google Chrome and go to chrome://extensions/.
Enable Developer mode using the toggle in the top right corner.
Click the Load unpacked button in the top left.
Select the extension folder located inside the LocalDub project directory.
The LocalDub logo will now appear in your browser's extension bar!

How to Use

Start the AI Server: Open a terminal in the backend folder and run:
```
python main.py
```
(Wait until you see "WebSocket connected" and "Uvicorn running" in the logs).
Start Dubbing:
- Open any foreign language video on YouTube.
- Click the LocalDub extension icon in your browser.
- Select your desired target language from the dropdown menu.
- Click Enable Dubbing.

The video will pause for 1-2 seconds to buffer the initial AI generation, and then play continuously with perfectly synced, natural-sounding AI dubbing!
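If the extension does not connect, it helps to confirm that the backend is actually accepting connections. The check below is a generic sketch. The port is an assumption: 8000 is Uvicorn's default, but this README does not state which port `main.py` uses.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Port 8000 is a guess; use the port printed in the server logs.
    print(is_listening("127.0.0.1", 8000))
```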

Industry Standard Roadmap (Coming Soon)

We are actively working on upgrading LocalDub to compete with multi-million dollar AI dubbing startups:

- Zero-Shot Voice Cloning: Instead of using Microsoft's default voices, the backend will clone the original YouTuber's voice using XTTSv2 or CosyVoice.
- Audio Separation (Background Music Preservation): Integrating Meta's Demucs to separate the vocals from the rest of the audio. Background music, explosions, and sound effects will be preserved and mixed beneath the translated AI dub.
- Speaker Diarization: Integrating Pyannote.audio to detect multiple speakers in interviews or podcasts (Speaker A vs. Speaker B) and assign each a distinct AI voice automatically.
- Semantic Chunking: Advanced VAD logic that chunks audio based on breathing and sentence completion rather than strict 3-second blocks, so that each chunk sent to the translator is a complete sentence.
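As a rough idea of what semantic chunking could look like, the sketch below groups timestamped words into chunks that end at sentence punctuation or at a long pause. It is an illustration of the roadmap item, not planned code: the `Word` type, the 0.6-second pause threshold, and the 8-second length cap are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def chunk_by_sentence(words: list[Word], max_pause: float = 0.6,
                      max_length: float = 8.0) -> list[list[Word]]:
    """Split words at sentence ends, long pauses, or an overall length cap."""
    chunks: list[list[Word]] = []
    current: list[Word] = []
    for i, word in enumerate(words):
        current.append(word)
        next_word = words[i + 1] if i + 1 < len(words) else None
        ends_sentence = word.text.rstrip().endswith((".", "?", "!"))
        long_pause = next_word is not None and next_word.start - word.end >= max_pause
        too_long = word.end - current[0].start >= max_length
        if ends_sentence or long_pause or too_long or next_word is None:
            chunks.append(current)
            current = []
    return chunks
```

The length cap matters for real-time use: without it, a speaker who never pauses would delay the dub indefinitely.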

License

This project is licensed under the MIT License. Feel free to use, modify, and distribute it.
