edy-cpu/data-pipeline-api

Data Pipeline API

A small FastAPI service that demonstrates a clean backend data-processing pipeline: it receives text through a REST API, normalizes it, extracts simple metrics and keywords, detects sentiment, and returns a structured JSON response.

This is a portfolio demo project focused on readable architecture, API design, tests, and Docker-based local launch.

What this project shows

  • REST API design with FastAPI
  • Pydantic request/response schemas
  • Text cleaning and normalization
  • Keyword extraction from normalized text
  • Rule-based sentiment detection
  • Layered structure: routers, services, models, tests
  • Docker and docker-compose launch flow
  • Unit tests for core text-processing logic

Data Flow

POST /api/analyze
        ↓
Input validation
        ↓
Text cleaning / normalization
        ↓
Metrics extraction
        ↓
Keyword extraction
        ↓
Rule-based sentiment detection
        ↓
Structured JSON response
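
The stages above can be sketched in plain Python. This is a minimal illustration, not the code in app/services/text_service.py: the function names, the stop-word list, and the sentiment word lists are all assumptions made for the example. It also assumes that char_count is measured on the cleaned text, which is consistent with the example response in the API Endpoints section.

```python
import re

# Hypothetical word lists: the repository's real stop words and
# sentiment lexicon are not shown in this README.
STOP_WORDS = {"очень", "но", "немного", "и", "в", "на"}
POSITIVE_WORDS = {"полезный", "хороший", "отличный"}
NEGATIVE_WORDS = {"дорогой", "плохой", "скучный"}


def clean_text(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    without_punctuation = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(without_punctuation.split())


def extract_keywords(cleaned: str) -> list[str]:
    """Keep non-stop-words in order of first appearance, without duplicates."""
    keywords: list[str] = []
    for word in cleaned.split():
        if word not in STOP_WORDS and word not in keywords:
            keywords.append(word)
    return keywords


def detect_sentiment(cleaned: str) -> str:
    """Rule-based label derived from which word lists the text touches."""
    words = set(cleaned.split())
    has_positive = bool(words & POSITIVE_WORDS)
    has_negative = bool(words & NEGATIVE_WORDS)
    if has_positive and has_negative:
        return "mixed"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    return "neutral"


def analyze(text: str) -> dict:
    """Run the full pipeline and return a JSON-serializable dict."""
    cleaned = clean_text(text)
    return {
        "original_text": text,
        "cleaned_text": cleaned,
        "word_count": len(cleaned.split()),
        "char_count": len(cleaned),  # assumption: counted on cleaned text
        "keywords": extract_keywords(cleaned),
        "sentiment": detect_sentiment(cleaned),
    }
```

Keeping each stage a pure function of its input is what makes the service layer easy to unit-test without starting the API.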

Tech Stack

  • Python 3.11
  • FastAPI
  • Pydantic
  • Uvicorn
  • Pytest
  • Docker
  • Docker Compose

Project Structure

data-pipeline-api/
  app/
    main.py
    routers/
      analyze.py
    services/
      text_service.py
    models/
      schemas.py
  tests/
    test_text_service.py
  Dockerfile
  docker-compose.yml
  Makefile
  requirements.txt
  requirements-dev.txt
  PRD.md
  README.md

API Endpoints

Healthcheck

GET /health

Response:

{
  "status": "ok"
}

Analyze Text

POST /api/analyze

Request (the sample text is Russian for "The course is very useful, but a bit expensive"):

{
  "text": "Курс очень полезный, но немного дорогой"
}

Response:

{
  "original_text": "Курс очень полезный, но немного дорогой",
  "cleaned_text": "курс очень полезный но немного дорогой",
  "word_count": 6,
  "char_count": 38,
  "keywords": ["курс", "полезный", "дорогой"],
  "sentiment": "mixed"
}
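
For calling the endpoint from a script, here is a minimal client sketch that uses only the standard library. The base URL assumes the service is running locally on port 8000, as in the Docker command and Swagger UI address in this README; the helper names build_analyze_request and analyze_remote are hypothetical.

```python
import json
import urllib.request

# Assumption: the service is running locally on the port used elsewhere
# in this README.
BASE_URL = "http://127.0.0.1:8000"


def build_analyze_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /api/analyze request with a JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_remote(text: str, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response."""
    request = build_analyze_request(text, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires the service to be running):
#   result = analyze_remote("Курс очень полезный, но немного дорогой")
#   print(result["sentiment"])
```

Building the request separately from sending it lets the payload and headers be checked without a live server.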

Local Run

Makefile

make install
make run

Manual

pip install -r requirements.txt
uvicorn app.main:app --reload

Docker

docker build -t data-pipeline-api .
docker run -p 8000:8000 data-pipeline-api

Docker Compose

docker-compose up --build

Swagger UI:

http://127.0.0.1:8000/docs

Tests

pip install -r requirements-dev.txt
pytest

The tests cover the core service logic: cleaning, keyword extraction, sentiment detection, and complete text analysis.
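
As an illustration of the style of test this implies, here is a self-contained sketch. The clean_text function is a stand-in defined inline so the example runs on its own; in the repository the function under test would be imported from the service module, and its actual name and exact behavior are not shown in this README.

```python
import re


def clean_text(text: str) -> str:
    """Stand-in for the service function under test (hypothetical)."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


# pytest collects functions whose names start with "test_" and treats
# a failing assert as a test failure.
def test_clean_text_lowercases_and_strips_punctuation():
    assert clean_text("Hello, World!") == "hello world"


def test_clean_text_collapses_whitespace():
    assert clean_text("  a \t b\n\nc  ") == "a b c"


def test_clean_text_handles_empty_input():
    assert clean_text("") == ""
```

Each test targets one behavior, so a failure points directly at the rule that broke.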

Example Use Cases

This small service can be extended into:

  • feedback analysis for landing pages or support tickets;
  • review classification for marketplace or app-store reviews;
  • pre-processing layer before LLM enrichment;
  • lightweight analytics API for user-generated text.

Current Limitations

This is intentionally a compact demo. Sentiment detection is rule-based, not ML-based. There is no database, authentication, queue, or production observability layer.

Possible Improvements

  • Batch text analysis
  • PostgreSQL storage for processed results
  • LLM-based enrichment layer
  • Background jobs for larger datasets
  • API rate limiting and structured logging
  • CI pipeline with tests on every push

Purpose

The goal of this repository is to show a practical foundation: clean backend structure, API contract, data transformation logic, Docker packaging, and testable service code.
