Linguaflow: Multilingual Translation API

Objective

Linguaflow is a multilingual translation API built with FastAPI and Hugging Face's MarianMT models. It lets developers translate English text into several target languages, currently French, Portuguese, and Spanish, through a single HTTP endpoint.

How It Works

Each request passes through the following stages:

| Client Request | --> | FastAPI Endpoint (/translate) | --> | MarianMT Tokenizer | --> 
| MarianMT Model (Generate Translation) | --> | Decode Translations | --> | Response to Client |

Pipeline Workflow

Below is a step-by-step workflow of the Linguaflow system:

1. Client Request:
   |-- A user sends a JSON payload to the `/translate` endpoint.
   |-- Example payload: {"src_text": [">>fra<< This is a test sentence."]}

2. Input Validation:
   |-- FastAPI validates the request payload using Pydantic's `BaseModel`.
   |-- Ensures the input follows the required schema.

3. Tokenization:
   |-- The MarianTokenizer from Hugging Face processes the input text.
   |-- Converts the text into token IDs compatible with the MarianMT model.
   |-- Adds any necessary padding for batch processing.

4. Translation:
   |-- The MarianMT model generates the translation in the target language selected by the `>>lang<<` prefix on each input sentence.
   |-- Uses the Transformer encoder-decoder architecture: the encoder reads the source tokens and the decoder generates the target-language tokens.

5. Decoding:
   |-- The token IDs generated by the model are decoded back into human-readable text.
   |-- Special tokens are removed to produce clean translations.

6. Response:
   |-- The translated text is encapsulated in a JSON response.
   |-- Example response: {"translated_text": ["Ceci est une phrase de test."]}
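
Steps 3 to 5 above can be sketched in a few lines of Python. This is a minimal sketch, not the repository's actual code: the checkpoint name `Helsinki-NLP/opus-mt-en-roa` and the helper names `load_marian` and `translate` are assumptions made for illustration, since this README does not name the model it loads.

```python
from typing import List

# Assumed checkpoint (not stated in this README): an English-to-Romance
# MarianMT model that selects the target language via a >>lang<< prefix.
MODEL_NAME = "Helsinki-NLP/opus-mt-en-roa"


def load_marian(model_name: str = MODEL_NAME):
    """Hypothetical helper: load the tokenizer and model once at startup."""
    # Imported lazily so the module can be imported without transformers.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    return tokenizer, model


def translate(texts: List[str], tokenizer, model) -> List[str]:
    """Hypothetical helper covering tokenization, generation, and decoding."""
    # Step 3: convert text to token IDs, padding so the batch is rectangular.
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    # Step 4: the encoder-decoder model generates target-language token IDs.
    generated = model.generate(**batch)
    # Step 5: decode the IDs back to text and drop special tokens.
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate([">>fra<< This is a test sentence."], *load_marian())` would download the checkpoint on first use and return a list containing one translated sentence. Passing the tokenizer and model in as arguments keeps the function easy to test with stand-in objects.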

Tech Stack

| Component | Technology | Description |
|---|---|---|
| Backend Framework | FastAPI | High-performance framework for building APIs with Python. |
| Translation Model | MarianMT (Hugging Face) | Pre-trained transformer models for machine translation. |
| Data Validation | Pydantic | Ensures request payloads adhere to the expected schema. |
| Server | Uvicorn | Fast ASGI server for running FastAPI apps. |
| Libraries | Transformers, PyTorch | Core libraries for loading and running the transformer models. |

Setup and Installation

Prerequisites

Python Version: 3.8 or higher.
Package Manager: pip (or Conda for virtual environments).

Steps

1. Clone the Repository

git clone <repository-url>
cd linguaflow

2. Create a Virtual Environment

# On Windows
python -m venv venv
.\venv\Scripts\activate

# On macOS/Linux
python3 -m venv venv
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Run the Application

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

5. Test the API

Navigate to http://127.0.0.1:8000 to access the root endpoint. Test the /translate endpoint with a tool such as Postman or cURL, or use the Swagger UI that FastAPI serves by default at http://127.0.0.1:8000/docs.

API Documentation

1. Root Endpoint

URL: /
Method: GET
Description: Confirms that the API is running.

Sample Response:

{
  "message": "Welcome to the MarianMT Translation API!"
}

2. Translation Endpoint

URL: /translate
Method: POST
Description: Translates English text into a target language.
Request Payload:
```
{
  "src_text": [">>fra<< This is a test sentence to translate to French."]
}
```
- Language Prefixes:
  - >>fra<<: French
  - >>por<<: Portuguese
  - >>spa<<: Spanish
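
The prefix convention above can be wrapped in a small helper so that callers pass a language name instead of typing the prefix by hand. This is an illustrative sketch; `LANGUAGE_PREFIXES` and `build_payload` are hypothetical names, not part of the API.

```python
from typing import Dict, List

# Prefixes copied from the list above; the keys are an assumed naming choice.
LANGUAGE_PREFIXES: Dict[str, str] = {
    "french": ">>fra<<",
    "portuguese": ">>por<<",
    "spanish": ">>spa<<",
}


def build_payload(sentences: List[str], target_language: str) -> Dict[str, List[str]]:
    """Return a /translate request body with the prefix added to each sentence."""
    try:
        prefix = LANGUAGE_PREFIXES[target_language.lower()]
    except KeyError:
        raise ValueError(f"Unsupported target language: {target_language!r}") from None
    return {"src_text": [f"{prefix} {sentence.strip()}" for sentence in sentences]}
```

For example, `build_payload(["This is a test sentence."], "French")` produces `{"src_text": [">>fra<< This is a test sentence."]}`.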

Response:

{
  "translated_text": ["Ceci est une phrase de test à traduire en français."]
}
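
A client can call the endpoint using only the Python standard library. The sketch below assumes the server is running locally on the host and port from the `uvicorn` command in the setup steps; `build_request` and `translate_remote` are hypothetical helper names.

```python
import json
import urllib.request
from typing import List

# Assumed local address, taken from the uvicorn command in this README.
API_URL = "http://127.0.0.1:8000/translate"


def build_request(src_text: List[str], url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying the JSON payload described above."""
    body = json.dumps({"src_text": src_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate_remote(src_text: List[str], url: str = API_URL, timeout: float = 30.0) -> List[str]:
    """Send the request and return the "translated_text" list from the response."""
    with urllib.request.urlopen(build_request(src_text, url), timeout=timeout) as response:
        return json.load(response)["translated_text"]
```

With the server running, `translate_remote([">>fra<< This is a test sentence to translate to French."])` returns the list found under `translated_text` in the response.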

Key Features

Multilingual Translation:
- Supports English to French, Portuguese, and Spanish translations.
- Easily extensible to other languages supported by MarianMT.
Asynchronous Processing:
- Served by Uvicorn, an ASGI server, so the API can handle multiple requests concurrently.
User-Friendly Testing:
- Swagger UI and JSON responses simplify API integration and debugging.
Error Handling:
- Built-in mechanisms for handling invalid inputs and unexpected exceptions.

Future Enhancements

Support for Additional Languages:
- Expand the API to include languages like German, Chinese, and Hindi.
Advanced Features:
- Context-based translation to preserve tone and sentiment.
- Batch processing for large-scale translation tasks.
Deployment to Cloud:
- Deploy on AWS or Google Cloud for global scalability.

Contributions

Contributions are welcome! Feel free to fork the repository, make improvements, and submit a pull request.
