A production-ready Python project template with a modular architecture, featuring separation of concerns between backend and frontend layers, factory design patterns, and configuration management.

    Project/
    ├── app.py                          # Application entry point
    ├── main.py                         # Alternative entry point
    ├── pyproject.toml                  # Project dependencies and metadata
    ├── uv.lock                         # Dependency lock file
    ├── .env                            # Environment variables (not tracked)
    │
    ├── backend/                        # Backend layer - Data processing & business logic
    │   └── src/
    │       ├── auth/                   # Authentication modules
    │       ├── computer/               # Computation modules
    │       ├── data/                   # Data layer with factory pattern
    │       │   ├── reader/             # Data readers (CSV, Excel, etc.)
    │       │   │   ├── interface.py    # IReader abstract base class
    │       │   │   ├── factory.py      # DataReaderFactory
    │       │   │   ├── csv.py          # CSV reader implementation
    │       │   │   └── excel.py        # Excel reader implementation
    │       │   └── writer/             # Data writers
    │       │       ├── interface.py    # IWriter abstract base class
    │       │       └── factory.py      # DataWriterFactory
    │       ├── ingestor/               # Data ingestion modules
    │       ├── processor/              # Data processing modules
    │       └── validator/              # Data validation modules
    │
    ├── frontend/                       # Frontend layer - Application management & UI
    │   └── src/
    │       ├── auth/                   # Frontend authentication
    │       │   └── manager.py
    │       ├── page/                   # Page/view components
    │       └── manager.py              # AppManager - main application controller
    │
    ├── config/                         # Configuration management
    │   ├── name_space.py               # ConfigFactory - YAML/ENV config loader
    │   └── __init__.py
    │
    ├── data/                           # Data storage
    │   ├── config/                     # Configuration files
    │   │   ├── config.yaml             # Application configuration
    │   │   └── Schema.xlsx             # Data schema definitions
    │   ├── db/                         # Database files
    │   ├── source/                     # Source data files
    │   └── ui/                         # UI-related data
    │
    └── test/                           # Test modules
        └── __init__.py

- Backend Layer: Handles data processing, business logic, and computation
- Frontend Layer: Manages application flow, user interface, and presentation
- Config Layer: Centralized configuration management
- DataReaderFactory: Dynamically creates appropriate data readers (CSV, Excel, etc.)
- DataWriterFactory: Dynamically creates appropriate data writers
- Interface-based Design: Uses abstract base classes for extensibility
- ConfigFactory: Converts YAML and .env files into nested namespaces
- Schema-driven: Excel-based schema definitions for data validation
- Environment Variables: Secure credential management via `.env`
- UV Package Manager: Fast, modern Python package management
- pyproject.toml: Standard Python project configuration
- Core dependencies:
  - pandas - Data manipulation
  - python-dotenv - Environment variable management
  - pyyaml - YAML configuration parsing
- Python 3.14+
- UV package manager (recommended) or pip
- Clone the repository:

      git clone <your-repo-url>
      cd Project

- Create a virtual environment:

      python -m venv .venv
      source .venv/bin/activate  # On Windows: .venv\Scripts\activate

- Install dependencies.

  Using UV (recommended):

      uv sync

  Using pip:

      pip install -e .

- Configure the environment:

      cp .env.example .env  # Create .env from the example
      # Edit .env with your configuration

- Update configuration:
  - Edit `data/config/config.yaml` with your parameters
  - Update `data/config/Schema.xlsx` with your data schemas

Run the application:

    python app.py

Or alternatively:

    python main.py

Read data through the reader factory:

    from backend.src.data.reader.factory import DataReaderFactory
    from config.name_space import ConfigFactory

    # Initialize configuration
    cfg = ConfigFactory(yaml_path='data/config/config.yaml').initialize()

    # Create reader factory
    reader_factory = DataReaderFactory(cfg)

    # Read CSV file
    df = reader_factory.read(source='csv', path='data/source/data.csv')

    # Read Excel file
    df = reader_factory.read(source='excel', path='data/source/data.xlsx', sheet_name='Sheet1')

Access configuration values:

    from config.name_space import ConfigFactory

    # Initialize configuration
    cfg = ConfigFactory(yaml_path='data/config/config.yaml').initialize()

    # Access configuration values
    print(cfg.Param.Name)           # Access YAML parameters
    print(cfg.Secret.API_KEY)       # Access .env secrets
    print(cfg.Path.ENV)             # Access path configurations
    print(cfg.Schema.ACCOUNT_NAME)  # Access schema definitions

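The nested attribute access (`cfg.Param.Name`, `cfg.Secret.API_KEY`) can be pictured with a minimal sketch. This is not the actual `ConfigFactory` implementation: `to_namespace` and `parse_env` are hypothetical helpers, the YAML content is represented by an already-parsed dict, and the `.env` text is parsed by hand so that the example needs only the standard library.

```python
from types import SimpleNamespace


def to_namespace(value):
    """Recursively convert nested dicts into SimpleNamespace objects."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in value.items()})
    if isinstance(value, list):
        return [to_namespace(v) for v in value]
    return value


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        pairs[key.strip()] = val.strip()
    return pairs


# Stand-in for the parsed contents of data/config/config.yaml
yaml_data = {
    "Param": {"Name": "YourAppName", "Version": "1.0.0"},
    "Path": {"ENV": ".env", "DATA": "data/source"},
}
# Stand-in for the contents of .env
env_text = "# API Keys\nAPI_KEY=your_api_key_here\nDB_PORT=5432\n"

# Hypothetical layout: .env values are grouped under a "Secret" section
cfg = to_namespace({**yaml_data, "Secret": parse_env(env_text)})

print(cfg.Param.Name)      # YourAppName
print(cfg.Secret.API_KEY)  # your_api_key_here
print(cfg.Path.DATA)       # data/source
```

Values read from `.env` stay strings in this sketch (`cfg.Secret.DB_PORT` is `"5432"`), so any type conversion would have to happen where the value is used.
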
- Create a new reader:

      import pandas as pd

      from backend.src.data.reader.interface import IReader


      class JSONReader(IReader):
          def read(self, path: str, **kwargs) -> pd.DataFrame:
              return pd.read_json(path, **kwargs)

- Register in factory:

      # In backend/src/data/reader/factory.py
      reader_dict = {
          'csv': CSVReader,
          'excel': ExcelReader,
          'json': JSONReader,  # Add your reader
      }

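To show how an interface, a registry dict, and a factory fit together, here is a self-contained sketch. It is an illustration under assumptions rather than the project's code: the readers return lists of row dicts from the standard library instead of pandas DataFrames, and this `DataReaderFactory` takes no configuration object, unlike the real one.

```python
import csv
import json
import os
import tempfile
from abc import ABC, abstractmethod


class IReader(ABC):
    """Common interface every reader implements."""

    @abstractmethod
    def read(self, path, **kwargs):
        """Return the file contents as a list of row dicts."""


class CSVReader(IReader):
    def read(self, path, **kwargs):
        with open(path, newline="", encoding="utf-8") as fh:
            return list(csv.DictReader(fh, **kwargs))


class JSONReader(IReader):
    def read(self, path, **kwargs):
        with open(path, encoding="utf-8") as fh:
            return json.load(fh, **kwargs)


class DataReaderFactory:
    """Maps a source name to a reader class and delegates the read."""

    reader_dict = {"csv": CSVReader, "json": JSONReader}

    def read(self, source, path, **kwargs):
        try:
            reader_cls = self.reader_dict[source]
        except KeyError:
            raise ValueError(f"Unsupported source: {source!r}") from None
        return reader_cls().read(path, **kwargs)


# Usage: write two small temporary files and read them back.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "data.csv")
    json_path = os.path.join(tmp, "data.json")
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        fh.write("account_id,balance\n1,10.5\n")
    with open(json_path, "w", encoding="utf-8") as fh:
        json.dump([{"account_id": 2}], fh)

    factory = DataReaderFactory()
    csv_rows = factory.read(source="csv", path=csv_path)
    json_rows = factory.read(source="json", path=json_path)

print(csv_rows)   # [{'account_id': '1', 'balance': '10.5'}]
print(json_rows)  # [{'account_id': 2}]
```

Supporting a new format only requires a new `IReader` subclass and one more entry in `reader_dict`; callers keep using `factory.read(...)` unchanged.
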
- Single Responsibility: Each module has one clear purpose
- Open/Closed: Extensible via interfaces without modifying existing code
- Liskov Substitution: Readers/writers are interchangeable via interfaces
- Interface Segregation: Minimal, focused interfaces (IReader, IWriter)
- Dependency Inversion: Depends on abstractions, not concrete implementations
- Factory Pattern: Dynamic object creation based on runtime parameters
- Strategy Pattern: Interchangeable reader/writer implementations
- Namespace Pattern: Hierarchical configuration access
Example `data/config/config.yaml`:

    Param:
      Name: 'YourAppName'
      Version: '1.0.0'
    Path:
      ENV: '.env'
      DATA: 'data/source'
      OUTPUT: 'data/output'

Example `.env`:

    # API Keys
    API_KEY=your_api_key_here

    # Database
    DB_HOST=localhost
    DB_PORT=5432
    DB_NAME=mydb

`data/config/Schema.xlsx` is an Excel file defining the data schemas for your application. Each sheet represents a table/entity schema.
- Sheet names should use natural language with spaces (e.g., "Account Name", "User Profile", "Transaction Data")
- The ConfigFactory automatically converts sheet names to two formats:
  - UPPER_SNAKE_CASE for schema access: `cfg.Schema.ACCOUNT_NAME`
  - PascalCase for column access: `cfg.Col.AccountName`
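The two name conversions can be sketched as follows. The helper names `to_upper_snake` and `to_pascal` are hypothetical, and the handling of punctuation (treated as a word separator here) is an assumption, since only space-separated sheet names are documented.

```python
import re


def split_words(sheet_name):
    """Split a sheet name into alphanumeric words."""
    return re.findall(r"[A-Za-z0-9]+", sheet_name)


def to_upper_snake(sheet_name):
    """'Account Name' -> 'ACCOUNT_NAME' (key under cfg.Schema)."""
    return "_".join(word.upper() for word in split_words(sheet_name))


def to_pascal(sheet_name):
    """'Account Name' -> 'AccountName' (key under cfg.Col)."""
    return "".join(word[:1].upper() + word[1:] for word in split_words(sheet_name))


for name in ("Account Name", "User Profile", "Transaction Data"):
    print(name, "->", to_upper_snake(name), "/", to_pascal(name))
# Account Name -> ACCOUNT_NAME / AccountName
# User Profile -> USER_PROFILE / UserProfile
# Transaction Data -> TRANSACTION_DATA / TransactionData
```
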
Each sheet in Schema.xlsx must have the following columns with Variable as the index column:
| Column Name | Type | Description | Example |
|---|---|---|---|
| Variable | Index | Unique identifier for the row (index column) | user_id, email, created_at |
| Name | String | Display name or actual column name in data | User ID, Email Address, Created At |
| IS Derived? | Boolean | Whether this is a computed/derived field | TRUE, FALSE |
| Is Read? | Boolean | Whether this field is read from source data | TRUE, FALSE |
| Data Type | String | Expected data type | int, str, datetime, float |
| Description | String | Field description and purpose | Unique identifier for user |
| Validation Rule | String | Validation logic or constraints | NOT NULL, UNIQUE, > 0 |
Sheet: "Account Name"
| Variable | Name | IS Derived? | Is Read? | Data Type | Description | Validation Rule |
|---|---|---|---|---|---|---|
| account_id | Account ID | FALSE | TRUE | int | Unique account identifier | NOT NULL, UNIQUE |
| account_name | Account Name | FALSE | TRUE | str | Name of the account | NOT NULL |
| balance | Balance | FALSE | TRUE | float | Current account balance | >= 0 |
| created_at | Created At | FALSE | TRUE | datetime | Account creation timestamp | NOT NULL |
| is_active | Is Active | TRUE | FALSE | bool | Derived: balance > 0 | - |
| display_name | Display Name | TRUE | FALSE | str | Derived: account_name + account_id | - |
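The two derived rows in this sheet describe values computed from source fields. A minimal sketch of that computation on a single record is shown here; `add_derived_fields` is a hypothetical helper, and because the schema does not specify a separator for `display_name`, plain concatenation is assumed.

```python
def add_derived_fields(row):
    """Return a copy of the row with the two derived fields filled in."""
    derived = dict(row)
    # is_active is derived as: balance > 0
    derived["is_active"] = row["balance"] > 0
    # display_name is derived as: account_name + account_id (no separator assumed)
    derived["display_name"] = f"{row['account_name']}{row['account_id']}"
    return derived


source_row = {"account_id": 101, "account_name": "Savings", "balance": 250.0}
result = add_derived_fields(source_row)

print(result["is_active"])     # True
print(result["display_name"])  # Savings101
```
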
Sheet: "User Profile"
| Variable | Name | IS Derived? | Is Read? | Data Type | Description | Validation Rule |
|---|---|---|---|---|---|---|
| user_id | User ID | FALSE | TRUE | int | Unique user identifier | NOT NULL, UNIQUE |
| username | Username | FALSE | TRUE | str | User's login name | NOT NULL, UNIQUE |
| email | Email Address | FALSE | TRUE | str | User's email address | NOT NULL, VALID EMAIL |
| full_name | Full Name | TRUE | FALSE | str | Derived: first_name + last_name | - |
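The Validation Rule column holds plain strings such as `NOT NULL, UNIQUE` or `>= 0`. How the project evaluates them is not described here, so the following is only one possible interpretation: `check_rules` is a hypothetical helper that understands `NOT NULL`, `UNIQUE`, and simple numeric comparisons, and reports anything else (for example `VALID EMAIL`) as unsupported.

```python
import operator
import re

COMPARATORS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}
COMPARISON = re.compile(r"^(>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)$")


def check_rules(values, rule_text):
    """Return a list of human-readable violations for one column."""
    problems = []
    for rule in (part.strip() for part in rule_text.split(",")):
        if rule in ("", "-"):
            continue  # "-" marks a field without rules
        if rule == "NOT NULL":
            if any(v is None for v in values):
                problems.append("NOT NULL violated")
        elif rule == "UNIQUE":
            present = [v for v in values if v is not None]
            if len(present) != len(set(present)):
                problems.append("UNIQUE violated")
        elif (match := COMPARISON.match(rule)):
            compare = COMPARATORS[match.group(1)]
            bound = float(match.group(2))
            if any(v is not None and not compare(v, bound) for v in values):
                problems.append(f"{rule} violated")
        else:
            problems.append(f"unsupported rule: {rule}")
    return problems


print(check_rules([1, 2, 2], "NOT NULL, UNIQUE"))     # ['UNIQUE violated']
print(check_rules([10.0, -5.0], ">= 0"))              # ['>= 0 violated']
print(check_rules(["a@example.com"], "VALID EMAIL"))  # ['unsupported rule: VALID EMAIL']
```
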
The ConfigFactory processes Schema.xlsx as follows:
- Reads all sheets from the Excel file

- Creates a schema dictionary with UPPER_SNAKE_CASE keys:

      cfg.Schema.ACCOUNT_NAME   # Full DataFrame for "Account Name" sheet
      cfg.Schema.USER_PROFILE   # Full DataFrame for "User Profile" sheet

- Creates a column dictionary with PascalCase keys containing only readable/derived columns:

      cfg.Col.AccountName   # Dict of columns where IS Derived? OR Is Read? = TRUE
      cfg.Col.UserProfile   # Dict of columns where IS Derived? OR Is Read? = TRUE

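These processing steps can be approximated with plain Python data. In this sketch, which is not the real `ConfigFactory` code, each sheet is a dict keyed by the `Variable` index instead of a DataFrame, `build_schema_and_cols` is a hypothetical helper, and `legacy_code` is an invented row included only to show the filter dropping a field that is neither read nor derived.

```python
import re


def build_schema_and_cols(sheets):
    """Map {sheet name: {variable: row}} to (schema dict, column dict)."""
    schema, cols = {}, {}
    for sheet_name, rows in sheets.items():
        words = re.findall(r"[A-Za-z0-9]+", sheet_name)
        # UPPER_SNAKE_CASE key holds the full sheet
        schema["_".join(w.upper() for w in words)] = rows
        # PascalCase key holds {variable: column name} for read or derived fields
        cols["".join(w[:1].upper() + w[1:] for w in words)] = {
            variable: row["Name"]
            for variable, row in rows.items()
            if row["IS Derived?"] or row["Is Read?"]
        }
    return schema, cols


sheets = {
    "Account Name": {
        "account_id": {"Name": "Account ID", "IS Derived?": False, "Is Read?": True},
        "is_active": {"Name": "Is Active", "IS Derived?": True, "Is Read?": False},
        "legacy_code": {"Name": "Legacy Code", "IS Derived?": False, "Is Read?": False},
    }
}

schema, cols = build_schema_and_cols(sheets)
print(sorted(schema))       # ['ACCOUNT_NAME']
print(cols["AccountName"])  # {'account_id': 'Account ID', 'is_active': 'Is Active'}
```
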
Access schemas and columns from code:

    from config.name_space import ConfigFactory

    # Initialize configuration
    cfg = ConfigFactory(yaml_path='data/config/config.yaml').initialize()

    # Access full schema for a table
    account_schema = cfg.Schema.ACCOUNT_NAME
    print(account_schema)  # Full DataFrame with all schema information

    # Access specific columns for a table
    account_columns = cfg.Col.AccountName
    print(account_columns)  # Dict: {index: 'column_name'} for readable/derived fields

    # Example: collect the column names of all readable/derived fields
    readable_cols = [
        col for idx, col in account_columns.items()
    ]

- Consistent Naming: Use clear, descriptive sheet names with spaces
- Index Column: Always set `Variable` as the index column in Excel
- Boolean Values: Use `TRUE`/`FALSE` for the IS Derived? and Is Read? columns
- Derived Fields: Mark computed fields as `IS Derived? = TRUE` and `Is Read? = FALSE`
- Source Fields: Mark fields from source data as `Is Read? = TRUE`
- Documentation: Use the Description column to document field purpose and business logic
Run the tests:

    # Run tests
    python -m pytest test/

    # Run with coverage (the --cov options require the pytest-cov plugin)
    python -m pytest --cov=backend --cov=frontend test/

To add new features:

- Backend modules: Add to `backend/src/`
- Frontend components: Add to `frontend/src/`
- Configuration: Update `config/name_space.py`
- Dependencies: Add to `pyproject.toml`
- Follow PEP 8 guidelines
- Use type hints where applicable
- Document classes and functions with docstrings
| Package | Version | Purpose |
|---|---|---|
| pandas | ≥2.3.3 | Data manipulation and analysis |
| python-dotenv | ≥1.2.1 | Environment variable management |
| pyyaml | ≥6.0.3 | YAML configuration parsing |
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Factory pattern implementation inspired by SOLID principles
- Configuration management using modern Python best practices
- Modular architecture for scalability and maintainability
Rishu Raj Gautam - @Rishurajgautam
Project Link: https://github.com/Rishurajgautam/python-project-setup
Built with ❤️ using Python 3.14+