BytecodeBrewer · Jun 25, 2026
diff --git a/docs/research-databases-and-storage.md b/docs/research-databases-and-storage.md
@@ -0,0 +1,383 @@
+# ARGUS Storage Research
+
+## Goal
+
+Research what ARGUS should store and which database/storage approach fits the project.
+
+ARGUS is moving from live API requests and in-memory analytics toward real data workflows.  
+The first storage decision should support local market analytics, SQL practice and future dashboard features without adding unnecessary infrastructure too early.
+
+---
+
+## Storage Use Cases
+
+ARGUS should eventually store different kinds of data, but not all of them need to be implemented at once.
+
+Relevant storage use cases are:
+
+* historical exchange rates
+* cleaned historical market data
+* source information
+* instruments that ARGUS can analyze
+* watchlists (later)
+* generated reports (later)
+* macroeconomic data (later)
+* paper-trading history (later)
+
+The first implementation should focus on historical market data and the basic entities needed to query it.
+
+---
+
+## Storage Candidates
+
+ARGUS should compare storage options based on the current project phase.
+
+The project currently needs local analytical storage, not a full server or cloud database.
+
+### DuckDB
+
+DuckDB is an embedded, column-oriented analytical database that runs inside the host process.
+
+It is a strong fit for ARGUS because it supports SQL-based analytics without requiring a database server.
+
+Useful for:
+
+* historical market data
+* local time-series analysis
+* SQL practice
+* Python-based analytics
+* notebook-based exploration
+* dashboard data preparation
+
+Limitations:
+
+* runs in-process rather than as a server, so only one process can open the database file for writing at a time
+* less suitable for multi-user product features later
+
+---
+
+### SQLite
+
+SQLite is an embedded, row-oriented local database.
+
+It is strong for small app storage and simple persistence.
+
+Useful for:
+
+* settings
+* small app-state data
+* simple local tables
+* watchlists (later)
+* lightweight metadata
+
+Limitations:
+
+* row-oriented storage, so large analytical scans and aggregations are typically slower than in DuckDB
+* not ideal as the main storage layer for historical market data
+* better suited to app state than to analytical time-series queries
+
+---
+
+### PostgreSQL
+
+PostgreSQL is a server-based relational database.
+
+It is a strong long-term option when ARGUS becomes more product-like.
+
+Useful for:
+
+* server-based storage
+* user-facing features
+* report history
+* watchlists
+* paper-trading history
+* richer metadata
+* cloud-ready architecture
+* SQLGate usage later
+
+Limitations:
+
+* more setup than needed right now
+* requires a server or Docker setup
+* adds infrastructure complexity too early
+
+Fit for ARGUS:
+
+PostgreSQL should be introduced later when ARGUS moves toward a server-based or cloud-ready architecture.
+
+---
+
+## Local, Server and Cloud Options
+
+| Option | Meaning | Fit Now | Fit Later |
+|---|---|---:|---:|
+| Local storage | Database runs locally inside or next to the project | High | High |
+| Server database | Database runs as a separate service, for example PostgreSQL | Medium | High |
+| Cloud storage/database | Managed storage or database in the cloud | Low | High |
+
+ARGUS should start with local storage.
+
+Reason:
+
+* simpler setup
+* easier learning curve
+* good fit for a Python analytics project
+* no cloud or server infrastructure required yet
+* enough for historical data, metrics and dashboard development
+
+Server and cloud storage should come later when ARGUS has stronger product features such as reports, user state, paper-trading history or deployment needs.
+
+---
+
+## Recommended First Storage Approach
+
+DuckDB should be the first storage technology for ARGUS.
+
+Reason:
+
+* ARGUS currently needs local analytical storage, not a full server database
+* DuckDB fits historical time-series analysis well
+* it supports SQL-based analytics without requiring a database server
+* it works well with Python and notebook-based exploration
+* it keeps the first storage implementation manageable
+* it can later be replaced or complemented by PostgreSQL if ARGUS becomes more product-like
+
+The first storage implementation should focus on:
+
+* historical market data
+* cleaned OHLCV-ready price data
+* source information
+* instruments that ARGUS can analyze
+
+PostgreSQL and SQLGate become relevant once ARGUS needs server-based storage or user-facing features.
+
+For the first DuckDB phase, the goal is to build a clean local analytics workflow.
+
+---
+
+## Developer Interaction Workflow
+
+ARGUS should use a practical developer workflow for DuckDB.
+
+The goal is to make the database easy to inspect, explore and validate before logic is moved into production code.
+
+### Notebook Exploration
+
+Notebooks should be the main exploration layer.
+
+They are useful for:
+
+* opening the DuckDB database
+* testing SQL queries
+* validating imported data
+* comparing SQL results with pandas calculations
+* exploring metric logic
+* documenting research assumptions
+
+This workflow is especially useful before turning queries into reusable project code.
+
+Notebook exploration should be preferred over a GUI database tool in the first phase.
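+
+A notebook cell for testing a SQL query might look like the sketch below. It assumes the `instruments` and `price_bars` tables proposed in the data model section of this document already exist. The `daily_returns` helper and the query text are illustrative, not existing ARGUS code.
+
+```python
+from typing import Any
+
+# Hypothetical query against the proposed instruments and price_bars tables.
+# "timestamp" is quoted because it is also the name of a SQL type.
+DAILY_RETURNS_SQL = """
+SELECT
+    pb."timestamp",
+    pb.close,
+    pb.close / LAG(pb.close) OVER (ORDER BY pb."timestamp") - 1 AS daily_return
+FROM price_bars AS pb
+JOIN instruments AS i ON i.id = pb.instrument_id
+WHERE i.symbol = ? AND pb.timeframe = '1d'
+ORDER BY pb."timestamp"
+"""
+
+
+def daily_returns(con: Any, symbol: str) -> list[tuple]:
+    """Return (timestamp, close, daily_return) rows for one instrument.
+
+    `con` is any connection exposing execute(sql, params) followed by
+    fetchall(), which is how a connection from duckdb.connect() behaves.
+    The first row has no previous close, so its daily_return is NULL.
+    """
+    return con.execute(DAILY_RETURNS_SQL, [symbol]).fetchall()
+```
+
+With the `duckdb` package installed, the helper could be called as `daily_returns(duckdb.connect("argus.duckdb"), "EUR/USD")`, where the file name is a placeholder. The sketch does not filter by `source_id`, so it assumes each instrument is loaded from a single source; the result can then be compared against a pandas `pct_change()` calculation on the same closes.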
+
+### DuckDB CLI
+
+The DuckDB CLI should be used for quick database inspection.
+
+It is useful for:
+
+* checking available tables
+* running small SQL queries
+* validating stored records
+* debugging the local database file
+
+The CLI is not the main research environment, but it is useful as a fast inspection tool.
+
+A GUI tool such as DBeaver can be tested if needed, but it should stay optional.
+
+---
+
+## First Data Model Direction
+
+The first data model should support FX data now and broader market data later.
+
+ARGUS should not use a narrow `date | value` table as the main market-data model.
+
+That would work for simple exchange rates, but it would become limiting once ARGUS adds stocks, ETFs, indices or broader market APIs.
+
+The first model should focus on three tables:
+
+```text
+data_sources
+instruments
+price_bars
+```
+
+### data_sources
+
+Stores where data came from.
+
+Recommended first fields:
+
+```text
+id
+name
+provider_kind
+requires_api_key
+created_at
+updated_at
+```
+
+Example:
+
+| name | provider_kind | requires_api_key |
+|---|---|---:|
+| Frankfurter | fx_rates | false |
+| yfinance | market_prices | false |
+| FRED | macro_data | true |
+
+### instruments
+
+Stores what ARGUS can analyze.
+
+Examples:
+
+* EUR/USD
+* AAPL
+* SPY
+* S&P 500
+* BTC-USD
+
+Recommended first fields:
+
+```text
+id
+symbol
+name
+asset_class
+currency
+exchange
+base_currency
+quote_currency
+created_at
+updated_at
+```
+
+Example:
+
+| symbol | name | asset_class | currency | exchange | base_currency | quote_currency |
+|---|---|---|---|---|---|---|
+| EUR/USD | Euro / US Dollar | fx | null | null | EUR | USD |
+| AAPL | Apple Inc. | stock | USD | NASDAQ | null | null |
+| SPY | SPDR S&P 500 ETF | etf | USD | NYSE Arca | null | null |
+
+### price_bars
+
+Stores historical market data in an OHLCV-ready structure.
+
+Recommended first fields:
+
+```text
+id
+instrument_id
+source_id
+timestamp
+timeframe
+open
+high
+low
+close
+adjusted_close
+volume
+created_at
+updated_at
+```
+
+For rate-only FX sources such as Frankfurter, the exchange rate can be stored in `close`.
+
+The other OHLCV fields can stay null until ARGUS uses data sources that provide them.
+
+Example:
+
+| symbol | timestamp | timeframe | open | high | low | close | adjusted_close | volume |
+|---|---|---|---:|---:|---:|---:|---:|---:|
+| EUR/USD | 2024-01-02 | 1d | null | null | null | 1.095 | null | null |
+| AAPL | 2024-01-02 | 1d | 187.15 | 188.44 | 183.89 | 185.64 | 184.25 | 50200000 |
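+
+The three field lists above could translate into DDL roughly as follows. This is a minimal sketch: the column types, the `NOT NULL` rules, the foreign keys and the uniqueness constraint on `price_bars` are assumptions for discussion, not a decided ARGUS schema, and `create_schema` is a hypothetical helper name.
+
+```python
+from typing import Any
+
+# Hypothetical DDL derived from the recommended first fields.
+# Referenced tables come first so the foreign keys can resolve.
+SCHEMA_STATEMENTS = [
+    """
+    CREATE TABLE IF NOT EXISTS data_sources (
+        id INTEGER PRIMARY KEY,
+        name VARCHAR NOT NULL UNIQUE,
+        provider_kind VARCHAR NOT NULL,
+        requires_api_key BOOLEAN NOT NULL,
+        created_at TIMESTAMP,
+        updated_at TIMESTAMP
+    )
+    """,
+    """
+    CREATE TABLE IF NOT EXISTS instruments (
+        id INTEGER PRIMARY KEY,
+        symbol VARCHAR NOT NULL UNIQUE,
+        name VARCHAR,
+        asset_class VARCHAR NOT NULL,
+        currency VARCHAR,
+        exchange VARCHAR,
+        base_currency VARCHAR,
+        quote_currency VARCHAR,
+        created_at TIMESTAMP,
+        updated_at TIMESTAMP
+    )
+    """,
+    """
+    CREATE TABLE IF NOT EXISTS price_bars (
+        id BIGINT PRIMARY KEY,
+        instrument_id INTEGER NOT NULL,
+        source_id INTEGER NOT NULL,
+        "timestamp" TIMESTAMP NOT NULL,
+        timeframe VARCHAR NOT NULL,
+        open DOUBLE,
+        high DOUBLE,
+        low DOUBLE,
+        close DOUBLE NOT NULL,
+        adjusted_close DOUBLE,
+        volume BIGINT,
+        created_at TIMESTAMP,
+        updated_at TIMESTAMP,
+        FOREIGN KEY (instrument_id) REFERENCES instruments (id),
+        FOREIGN KEY (source_id) REFERENCES data_sources (id),
+        -- Assumption: one bar per instrument, source, time and timeframe.
+        UNIQUE (instrument_id, source_id, "timestamp", timeframe)
+    )
+    """,
+]
+
+
+def create_schema(con: Any) -> None:
+    """Create the three tables on any connection that exposes execute(sql)."""
+    for statement in SCHEMA_STATEMENTS:
+        con.execute(statement)
+```
+
+With the `duckdb` package installed, `create_schema(duckdb.connect("argus.duckdb"))` would create the tables in a local file (the file name is a placeholder). Only `close` is required here, which keeps rate-only FX rows valid while leaving room for full OHLCV rows.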
+
+---
+
+## Recommended First Implementation Step
+
+The first storage implementation should not be tied to one specific data provider.
+
+ARGUS currently works with an existing ExchangeRate API client and evaluates broader market data through yfinance.  
+Frankfurter may be added later as a stronger FX-oriented historical data source.
+
+The storage layer should therefore focus on a normalized internal market-data format instead of depending on one API response structure.
+
+Recommended first step:
+
+```text
+active data client
+→ normalize into instruments and price_bars
+→ store in DuckDB
+→ query with SQL
+→ use results for analytics and charts
+```
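+
+The normalization step in this flow can be sketched as a small pure function. The example assumes an FX payload with a `base` currency and a `rates` mapping keyed by ISO date, similar to what time-series FX APIs commonly return. The `PriceBar` record and `normalize_fx_rates` are hypothetical names, and the payload shape must be checked against the active data client before reuse.
+
+```python
+from dataclasses import dataclass
+from datetime import date
+from typing import Optional
+
+
+@dataclass(frozen=True)
+class PriceBar:
+    """Provider-independent row shaped like the proposed price_bars table."""
+
+    symbol: str
+    timestamp: date
+    timeframe: str
+    close: float
+    open: Optional[float] = None
+    high: Optional[float] = None
+    low: Optional[float] = None
+    adjusted_close: Optional[float] = None
+    volume: Optional[int] = None
+
+
+def normalize_fx_rates(payload: dict, quote: str) -> list[PriceBar]:
+    """Turn an assumed FX payload into daily bars, oldest first.
+
+    Assumed shape: {"base": "EUR", "rates": {"2024-01-02": {"USD": 1.0956}}}
+    Days without a rate for `quote` are skipped.
+    """
+    base = payload["base"]
+    bars = []
+    # ISO dates sort chronologically as plain strings.
+    for day, rates in sorted(payload["rates"].items()):
+        if quote not in rates:
+            continue
+        bars.append(
+            PriceBar(
+                symbol=f"{base}/{quote}",
+                timestamp=date.fromisoformat(day),
+                timeframe="1d",
+                close=float(rates[quote]),  # the rate goes into close only
+            )
+        )
+    return bars
+```
+
+Keeping this step free of database calls means each provider only needs its own small normalizer, while the code that writes `PriceBar` rows into DuckDB stays the same.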
+
+---
+
+## Future Direction
+
+Later sprints can expand the storage layer step by step.
+
+Possible later additions:
+
+| Future Area | Possible Additions |
+|---|---|
+| Better source mapping | source-specific symbols, provider metadata |
+| Watchlists | user-selected instruments |
+| Reports | generated report metadata and history |
+| Macro data | FRED indicators and observations |
+| Paper trading | simulated orders, positions and portfolio history |
+| Server architecture | PostgreSQL |
+| SQL tooling | SQLGate with PostgreSQL |
+| Cloud direction | managed PostgreSQL or cloud storage |
+
+SQLGate should be kept for a later PostgreSQL phase.
+
+It becomes useful when ARGUS moves toward:
+
+* server-based storage
+* stronger database management
+* richer metadata
+* more stable application state
+* user-facing features
+* report history
+* cloud-ready architecture
+
+Additional metadata such as documentation links, terms links or provider governance fields can also become useful later.
+
+For the first DuckDB phase, these details should stay in research documentation instead of the database schema.
+
+---
+
+## Final Recommendation
+
+ARGUS should start with DuckDB as the first local analytics storage layer.
+
+DuckDB fits the current phase best because ARGUS needs local analytical SQL workflows, not a full server database yet.
+
+The first implementation should store historical market data in an OHLCV-ready structure.
+
+The recommended first data model is:
+
+```text
+data_sources
+instruments
+price_bars
+```
+
+Notebook exploration should be the main developer workflow before SQL logic is moved into application code.
+
+The DuckDB CLI can be used for quick inspection.
+
+PostgreSQL and SQLGate should be introduced later when ARGUS moves toward a more product-like or cloud-based architecture.