A small FastAPI service that demonstrates a clean backend data-processing pipeline: it receives text through a REST API, normalizes it, extracts simple metrics and keywords, detects sentiment, and returns a structured JSON response.
This is a portfolio demo project focused on readable architecture, API design, tests, and a Docker-based local launch. It demonstrates:
- REST API design with FastAPI
- Pydantic request/response schemas
- Text cleaning and normalization
- Keyword extraction from normalized text
- Rule-based sentiment detection
- Layered structure: routers, services, models, tests
- Docker and docker-compose launch flow
- Unit tests for core text-processing logic
POST /api/analyze
↓
Input validation
↓
Text cleaning / normalization
↓
Metrics extraction
↓
Keyword extraction
↓
Rule-based sentiment detection
↓
Structured JSON response
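
The flow above can be sketched in plain Python without the web layer. This is a minimal illustration, not the repository's actual `text_service.py`: the function names, the stopword list, and the sentiment word lists are assumptions chosen so that the sample request in this README produces the documented response. Only the `mixed` label comes from the README; `positive`, `negative`, and `neutral` are assumed labels, and counting characters on the cleaned text is inferred from the sample values.

```python
import re

# Hypothetical word lists, chosen only so the README sample works.
STOPWORDS = {"и", "но", "очень", "немного", "a", "the", "and", "but", "very"}
POSITIVE = {"полезный", "отличный", "good", "useful"}
NEGATIVE = {"дорогой", "плохой", "bad", "expensive"}


def clean_text(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    no_punct = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(no_punct.split())


def extract_keywords(cleaned: str, limit: int = 5) -> list[str]:
    """Keep unique non-stopwords longer than two characters, in order."""
    keywords: list[str] = []
    for word in cleaned.split():
        if word not in STOPWORDS and len(word) > 2 and word not in keywords:
            keywords.append(word)
    return keywords[:limit]


def detect_sentiment(cleaned: str) -> str:
    """Rule-based check: look for any positive and any negative word."""
    words = set(cleaned.split())
    has_positive = bool(words & POSITIVE)
    has_negative = bool(words & NEGATIVE)
    if has_positive and has_negative:
        return "mixed"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    return "neutral"


def analyze(text: str) -> dict:
    """Run validation, cleaning, metrics, keywords, and sentiment in order."""
    if not text or not text.strip():
        raise ValueError("text must not be empty")
    cleaned = clean_text(text)
    return {
        "original_text": text,
        "cleaned_text": cleaned,
        "word_count": len(cleaned.split()),
        "char_count": len(cleaned),  # counted on the cleaned text
        "keywords": extract_keywords(cleaned),
        "sentiment": detect_sentiment(cleaned),
    }
```

Keeping each step as a small pure function is what makes the service layer easy to unit-test independently of FastAPI; a router would only need to call `analyze` and map a `ValueError` to a validation error response.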
Tech stack:
- Python 3.11
- FastAPI
- Pydantic
- Uvicorn
- Pytest
- Docker
- Docker Compose
Project structure:
data-pipeline-api/
  app/
    main.py
    routers/
      analyze.py
    services/
      text_service.py
    models/
      schemas.py
  tests/
    test_text_service.py
  Dockerfile
  docker-compose.yml
  Makefile
  requirements.txt
  requirements-dev.txt
  PRD.md
  README.md
GET /health

Response:
{
  "status": "ok"
}

POST /api/analyze

Request:
{
  "text": "Курс очень полезный, но немного дорогой"
}

Response:
{
  "original_text": "Курс очень полезный, но немного дорогой",
  "cleaned_text": "курс очень полезный но немного дорогой",
  "word_count": 6,
  "char_count": 38,
  "keywords": ["курс", "полезный", "дорогой"],
  "sentiment": "mixed"
}

Run with Make:
make install
make run

Run without Make:
pip install -r requirements.txt
uvicorn app.main:app --reload

Run with Docker:
docker build -t data-pipeline-api .
docker run -p 8000:8000 data-pipeline-api

Run with Docker Compose:
docker-compose up --build

Swagger UI:
http://127.0.0.1:8000/docs
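
Once the service is running locally, the analyze endpoint can also be called from a script. The sketch below uses only the standard library; the helper names `build_analyze_request` and `analyze_remote` are hypothetical, and the base URL assumes the default local address shown above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # local address used in this README


def build_analyze_request(text: str) -> urllib.request.Request:
    """Build a POST request with a JSON body for /api/analyze."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_remote(text: str) -> dict:
    """Send the request and decode the JSON response.

    Requires the service to be running; raises urllib.error.URLError otherwise.
    """
    request = build_analyze_request(text)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)

# Example call (needs a running server):
#   analyze_remote("Курс очень полезный, но немного дорогой")
```

Separating request construction from sending keeps the payload shape checkable without a live server.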
Run tests:
pip install -r requirements-dev.txt
pytest

The tests cover the core service logic: cleaning, keyword extraction, sentiment detection, and complete text analysis.
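
As an illustration of what such tests might look like, here is a small pytest-style sketch for the cleaning step. The `clean_text` function below is a stand-in defined inline so the example is self-contained; in the repository the tests would presumably import the real function from `app/services/text_service.py`, whose exact name and behavior are assumptions here.

```python
import re


def clean_text(text: str) -> str:
    # Stand-in for the service function: lowercase, drop punctuation,
    # collapse runs of whitespace into single spaces.
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def test_clean_text_lowercases_and_strips_punctuation():
    assert clean_text("Hello, World!") == "hello world"


def test_clean_text_collapses_whitespace():
    assert clean_text("  a   b\n c ") == "a b c"


def test_clean_text_handles_cyrillic_sample():
    sample = "Курс очень полезный, но немного дорогой"
    assert clean_text(sample) == "курс очень полезный но немного дорогой"
```

pytest discovers plain functions whose names start with `test_`, so no test class or extra imports are needed for checks like these.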
This small service can be extended into:
- feedback analysis for landing pages or support tickets;
- review classification for marketplace or app-store reviews;
- pre-processing layer before LLM enrichment;
- lightweight analytics API for user-generated text.
This is intentionally a compact demo. Sentiment detection is rule-based, not ML-based. There is no database, authentication, queue, or production observability layer.
Possible next steps:
- Batch text analysis
- PostgreSQL storage for processed results
- LLM-based enrichment layer
- Background jobs for larger datasets
- API rate limiting and structured logging
- CI pipeline with tests on every push
The goal of this repository is to show a practical foundation: clean backend structure, API contract, data transformation logic, Docker packaging, and testable service code.