QuantSpectre

A Cointegration-Based Statistical Arbitrage System

A statistically rigorous pairs trading system — every signal backstopped by a formal hypothesis test.

Connection to the MSc Statistics Programme

Programme Component	Statistical Work in QuantSpectre	Code Reference
Compulsory: Probability Theory & Mathematical Statistics	OU process stochastic calculus & exact discrete-time MLE; Student-$t$ and Normal return distribution fitting via likelihood; analytical VaR/CVaR from first principles	`stats_models/ou_process.py`, `stats_models/distributions.py`
Specialisation: Quantitative Methods of Financial Markets	Johansen cointegration for stat-arb; GARCH(1,1) volatility model; Kalman filter dynamic hedge ratio; position sizing via volatility targeting	`stats_models/cointegration.py`, `stats_models/garch_model.py`, `stats_models/kalman_filter.py`, `strategy/trading_rules.py`
Specialisation: Statistical Inference	Bootstrap Sharpe ratio hypothesis test ($H_0: SR = 0$) with circular block bootstrap; Johansen trace/max-eig inference; Ljung-Box, Jarque-Bera & ARCH-LM residual diagnostics; likelihood ratio test for distribution selection	`stats_validation.py`, `stats_models/diagnostics.py`
Interdisciplinary: Professional & Applied Work	Complete backtesting engine with transaction costs; reproducible pipeline from data fetch to visualisation; Streamlit dashboard for interactive parameter tuning; trade log CSV output	`backtest/`, `dashboard/`, `run.py`, `visualization/`

Mathematical & Statistical Foundations

Full derivations with all intermediate steps are in MATHS.md. Below is a summary of the core models.

1. Cointegration (Johansen, 1988)

Two non-stationary I(1) series $X_t$, $Y_t$ are cointegrated if a linear combination $Z_t = Y_t - \beta X_t$ is I(0) — stationary. The spread mean-reverts, creating a tradable opportunity.

The Johansen test is based on the VECM:

$$\Delta \mathbf{X}_t = \Pi \mathbf{X}_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta \mathbf{X}_{t-i} + \varepsilon_t$$

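Under cointegration, $\Pi = \alpha\beta'$ has reduced rank $r$, where the columns of $\beta$ are the cointegrating vectors and $\alpha$ holds the adjustment coefficients. The trace statistic for $H_0$: rank $\le r$ is $$\lambda_{\text{trace}}(r) = -T \sum_{i=r+1}^{k} \ln(1 - \hat{\lambda}_i)$$ where $\hat{\lambda}_i$ are the estimated eigenvalues in decreasing order, $k$ is the number of series, and $T$ is the sample size.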

Rejecting $H_0: r = 0$ is evidence of at least one cointegrating relationship. The associated cointegrating vector supplies the hedge ratio $\beta$ that makes the spread stationary.
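As a worked illustration of the trace formula, the sketch below evaluates $\lambda_{\text{trace}}(r)$ from a list of eigenvalues. The function name `trace_statistic`, the eigenvalues, and the sample size are hypothetical values chosen for illustration; they are not taken from the project or its results.

```python
import math

def trace_statistic(eigenvalues, n_obs, r=0):
    """Johansen trace statistic for H0: rank <= r.

    Sums -T * ln(1 - lambda_i) over the k - r smallest eigenvalues.
    """
    lam = sorted(eigenvalues, reverse=True)  # lambda_1 >= ... >= lambda_k
    return -n_obs * sum(math.log(1.0 - l) for l in lam[r:])

# Hypothetical eigenvalues for a two-series system (illustration only)
eigs = [0.015, 0.002]
stat_r0 = trace_statistic(eigs, n_obs=1500, r=0)
stat_r1 = trace_statistic(eigs, n_obs=1500, r=1)
print(f"trace(r=0) = {stat_r0:.2f}, trace(r=1) = {stat_r1:.2f}")
```

Each statistic is then compared with the tabulated critical value for the chosen deterministic-term specification; the test is sequential, starting at $r = 0$ and stopping at the first non-rejection.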

2. Ornstein–Uhlenbeck Process

The spread follows a mean-reverting diffusion: $$dZ_t = \theta(\mu - Z_t)dt + \sigma dW_t$$

where $\theta > 0$ is the speed of reversion, $\mu$ the long-run mean, and $\sigma$ the instantaneous volatility.

Exact discrete-time transition (not an Euler approximation): $$Z_{t+1} \mid Z_t \sim \mathcal{N}\left(\mu + (Z_t-\mu)e^{-\theta},\; \frac{\sigma^2}{2\theta}(1 - e^{-2\theta})\right)$$

MLE: maximise the conditional log-likelihood (equivalently, minimise its negative): $$\ell(\theta,\mu,\sigma) = -\frac{n}{2}\log(2\pi V) - \frac{1}{2V}\sum_{i=1}^n \left(Z_i - \mu - (Z_{i-1}-\mu)e^{-\theta}\right)^2$$

with $V = \frac{\sigma^2}{2\theta}(1-e^{-2\theta})$.

Half-life: $t_{1/2} = \ln(2)/\theta$, the time it takes for the expected deviation from equilibrium to halve. It directly informs trade horizon and stop-loss placement.
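Because the exact transition is a Gaussian AR(1), the conditional MLE has a closed form via least squares of $Z_{t+1}$ on $Z_t$. The sketch below simulates a path from the exact transition and recovers $(\theta, \mu, \sigma)$ and the half-life. It assumes a unit time step and $0 < \hat{b} < 1$; the function names and parameter values are illustrative, and the project's own `ou_process.py` may use a numerical optimiser instead.

```python
import numpy as np

def simulate_ou(theta, mu, sigma, n, z0, seed=0):
    """Simulate an OU path using the exact unit-step transition density."""
    rng = np.random.default_rng(seed)
    b = np.exp(-theta)
    sd = np.sqrt(sigma**2 / (2 * theta) * (1 - b**2))
    z = np.empty(n + 1)
    z[0] = z0
    for t in range(n):
        z[t + 1] = mu + (z[t] - mu) * b + sd * rng.standard_normal()
    return z

def fit_ou_exact(z):
    """Closed-form conditional MLE from the AR(1) regression z[t+1] = a + b z[t] + e."""
    x, y = z[:-1], z[1:]
    b, a = np.polyfit(x, y, 1)           # slope, intercept
    v = np.mean((y - (a + b * x)) ** 2)  # MLE of the transition variance V
    theta = -np.log(b)                   # requires 0 < b < 1
    mu = a / (1 - b)
    sigma = np.sqrt(2 * theta * v / (1 - b**2))
    half_life = np.log(2) / theta
    return theta, mu, sigma, half_life

path = simulate_ou(theta=0.05, mu=1.0, sigma=0.1, n=20000, z0=1.0, seed=42)
theta_hat, mu_hat, sigma_hat, hl = fit_ou_exact(path)
print(f"theta={theta_hat:.4f}, mu={mu_hat:.4f}, sigma={sigma_hat:.4f}, half-life={hl:.1f}")
```

With short samples the estimate of $\theta$ is noisy and biased upward, so the implied half-life is best treated as a rough guide.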

3. Kalman Filter for Dynamic β

State-space model with random-walk hedge ratio: $$\begin{aligned} \beta_t &= \beta_{t-1} + \eta_t, \quad \eta_t \sim \mathcal{N}(0, Q) \\ Y_t &= \beta_t X_t + \varepsilon_t, \quad \varepsilon_t \sim \mathcal{N}(0, R) \end{aligned}$$

The Kalman recursion (predict → gain → update) yields $\hat{\beta}_{t|t}$ at every time step. An RTS smoother provides the retrospective $\hat{\beta}_{t|T}$. This captures time-varying relationships that a static OLS/Johansen hedge ratio misses.
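For a scalar state the recursion reduces to a few lines. The sketch below is a minimal filter for the model above, without an intercept and without the RTS smoothing pass. The function name `kalman_beta` and the default values of `q`, `r`, `beta0`, and `p0` are assumptions for illustration, not the settings used in `kalman_filter.py`.

```python
import numpy as np

def kalman_beta(x, y, q=1e-5, r=1e-3, beta0=0.0, p0=1.0):
    """Filtered hedge ratio beta_{t|t} for y_t = beta_t * x_t + eps_t."""
    beta, p = beta0, p0
    filtered = np.empty(len(x))
    for t in range(len(x)):
        p_pred = p + q                        # predict: random walk keeps the mean
        s = x[t] * p_pred * x[t] + r          # innovation variance
        k = p_pred * x[t] / s                 # Kalman gain
        beta = beta + k * (y[t] - beta * x[t])  # update with the innovation
        p = (1.0 - k * x[t]) * p_pred
        filtered[t] = beta
    return filtered

# Noise-free toy data with a true hedge ratio of 2.0
x = np.linspace(1.0, 2.0, 200)
y = 2.0 * x
betas = kalman_beta(x, y)
print(f"final filtered beta = {betas[-1]:.4f}")
```

The ratio $Q/R$ controls how quickly the filtered $\beta$ adapts: a larger $Q$ tracks changes faster but passes more noise into the spread.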

4. GARCH(1,1) Volatility

$$\sigma_t^2 = \omega + \alpha\varepsilon_{t-1}^2 + \beta\sigma_{t-1}^2$$

Estimated via the arch library. Conditional volatility $\sigma_t$ drives position scaling: trade smaller when the spread is volatile, larger when it is calm. The long-run (unconditional) variance is $\sigma^2_\infty = \omega/(1-\alpha-\beta)$, provided $\alpha+\beta < 1$; the long-run volatility is its square root.
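The variance recursion and the scaling rule can be sketched without the arch library, taking the parameters as given. The names `garch_variance_path` and `vol_target_scale`, the leverage cap, and the parameter values are hypothetical; the project's actual sizing rule lives in `strategy/trading_rules.py` and may differ.

```python
import numpy as np

def garch_variance_path(eps, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion, started at the long-run variance."""
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be < 1 for a finite long-run variance")
    long_run = omega / (1 - alpha - beta)
    sigma2 = np.empty(len(eps))
    sigma2[0] = long_run
    for t in range(1, len(eps)):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2, long_run

def vol_target_scale(sigma_t, target_vol, max_leverage=1.0):
    """Position multiplier: shrink exposure when conditional volatility exceeds the target."""
    return np.minimum(target_vol / np.asarray(sigma_t, dtype=float), max_leverage)

# Illustrative parameters: long-run variance = 1e-6 / (1 - 0.05 - 0.90) = 2e-5
eps = np.full(50, np.sqrt(2e-5))
sigma2, long_run = garch_variance_path(eps, omega=1e-6, alpha=0.05, beta=0.90)
scale = vol_target_scale([0.01, 0.02], target_vol=0.01)
print(long_run, scale)
```

When $\alpha+\beta$ is close to 1, shocks to variance decay very slowly and the long-run variance is poorly determined, so the conditional $\sigma_t$ is the more useful input for sizing.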

5. Return Distribution & Risk Metrics

Both Normal and Student-$t$ distributions are fitted via MLE. A likelihood ratio test ($H_0$: Normal adequate vs $H_1$: Student-$t$) determines whether fat tails matter — empirically they almost always do.

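Risk metrics are computed analytically from both fitted distributions: $$\text{VaR}_\alpha = \mu + \sigma \cdot F^{-1}(\alpha)$$ $$\text{CVaR}_\alpha = \mu + \sigma \cdot \frac{f(F^{-1}(\alpha))}{1-\alpha}$$ where $F$ and $f$ are the CDF and density of the standardised fitted distribution and $\sigma$ is its scale parameter. The CVaR expression as written is exact for the Normal; for the Student-$t$ with $\nu$ degrees of freedom, the second term is additionally multiplied by $\frac{\nu + (F^{-1}(\alpha))^2}{\nu - 1}$.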
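A minimal sketch of the Normal case, using only the standard library. It follows the upper-tail convention implied by the $1-\alpha$ denominator (for example $\alpha = 0.95$ applied to a loss distribution). The function name `normal_var_cvar` is illustrative; a Student-$t$ version would need a quantile function such as the one in scipy.

```python
from statistics import NormalDist

def normal_var_cvar(mu, sigma, alpha=0.95):
    """Analytical VaR and CVaR at level alpha for a Normal(mu, sigma^2) variable."""
    std_normal = NormalDist()
    q = std_normal.inv_cdf(alpha)                         # F^{-1}(alpha)
    var = mu + sigma * q
    cvar = mu + sigma * std_normal.pdf(q) / (1 - alpha)   # mean beyond the quantile
    return var, cvar

var95, cvar95 = normal_var_cvar(mu=0.0, sigma=1.0, alpha=0.95)
print(f"VaR = {var95:.4f}, CVaR = {cvar95:.4f}")
```

CVaR is never smaller than VaR at the same level, since it averages the outcomes beyond the VaR quantile.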

6. Bootstrap Sharpe Ratio Test

The sampling distribution of the Sharpe ratio is complex under autocorrelated, non-normal returns. The circular block bootstrap (Politis & Romano, 1992) constructs the empirical distribution of $\widehat{SR}$:

Partition returns into non-overlapping blocks of length $L = 10$
Resample blocks with replacement (wrapping circularly around the series)
For each of $B = 2{,}000$ replicates, compute $SR^*_b$
Centre under $H_0$ and compute: $p = \frac{1}{B}\sum_b \mathbb{1}(|SR^*_{b,\text{centred}}| \geq |\widehat{SR}|)$

If $p < 0.05$, the strategy's risk-adjusted performance is statistically distinguishable from zero.
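The steps above can be sketched as follows. This version draws block start points uniformly at random and wraps indices around the end of the series, which is the usual circular scheme; the blocking in `stats_validation.py` may be implemented differently. The function names and the toy return series are assumptions for illustration.

```python
import numpy as np

def sharpe(r):
    """Per-period Sharpe ratio (zero risk-free rate assumed)."""
    return r.mean() / r.std(ddof=1)

def circular_block_bootstrap_sharpe(returns, block_len=10, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for H0: SR = 0 using circular blocks."""
    r = np.asarray(returns, dtype=float)
    n = len(r)
    rng = np.random.default_rng(seed)
    sr_hat = sharpe(r)
    n_blocks = int(np.ceil(n / block_len))
    offsets = np.arange(block_len)
    sr_star = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n, size=n_blocks)
        idx = (starts[:, None] + offsets[None, :]) % n  # wrap around the series
        sr_star[b] = sharpe(r[idx.ravel()[:n]])
    centred = sr_star - sr_hat                          # impose H0 by recentring
    p_value = float(np.mean(np.abs(centred) >= abs(sr_hat)))
    return sr_hat, p_value

# Toy returns with a clearly positive mean (illustration only)
toy = 0.01 + 0.01 * np.random.default_rng(1).standard_normal(500)
sr_hat, p_value = circular_block_bootstrap_sharpe(toy, n_boot=500)
print(f"SR = {sr_hat:.3f}, p = {p_value:.4f}")
```

The block length trades bias against variance: blocks that are too short break up the autocorrelation the test is meant to respect, while very long blocks leave few distinct resamples.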

Project Structure

statsProj/
├── README.md, LICENSE, requirements.txt     # Meta
├── run.py                                   # Full pipeline orchestrator
├── stats_validation.py                      # Standalone statistical test battery
├── MATHS.md, ARCHITECTURE.md                # Deep-dive documentation
├── CHANGELOG.md, CONTRIBUTING.md            # Project governance
├── demo_presentation.md, video_script.md    # Presentation materials
├── data/              → data_loader.py      # yfinance + simulated fallback
├── stats_models/      → 7 modules           # ADF, Johansen, OU, Kalman, GARCH, distributions, diagnostics
├── strategy/          → 2 modules           # Pair selector, signal generation, position sizing
├── backtest/          → 2 modules           # Vectorised engine, performance metrics
├── visualization/     → charts, interactive # Static PNG + interactive Plotly HTML
├── dashboard/         → app.py              # Streamlit parameter-tuning UI
└── results/           → Auto-generated      # Equity curve, trade log, z-score chart, etc.

31 tracked files, ~4,100 lines of Python, ~1,200 lines of documentation.

Installation & Setup

Requirements: Python 3.9+, pip.

git clone https://github.com/YOUR_USERNAME/quant-spectre.git
cd quant-spectre

# Create a virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate        # macOS / Linux
# .venv\Scripts\activate         # Windows

# Install dependencies
pip install -r requirements.txt

Dependencies

Package	Purpose
`numpy`, `pandas`	Data wrangling
`scipy`	Optimisation (MLE), distributions
`statsmodels`	ADF, KPSS, Johansen, Ljung-Box, ARCH-LM
`arch`	GARCH estimation
`yfinance`	ETF price download
`matplotlib`, `seaborn`	Static visualisation
`plotly`	Interactive HTML charts
`streamlit`	Dashboard

How to Run

Quick Start (simulated data — no internet required)

python run.py

Uses deterministic simulated cointegrated data. Runs the complete statistical pipeline — cointegration testing, OU MLE, GARCH, VaR/CVaR, backtesting, bootstrap validation, and visualisation. Expected runtime: 30–60 seconds.
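One way such deterministic cointegrated data could be generated is sketched below: a random-walk log price for the first asset plus an OU spread for the second, with a fixed seed. This is an assumption about the general approach, not the code in `data/data_loader.py`; the function name and every default parameter are hypothetical.

```python
import numpy as np

def simulate_cointegrated_pair(n=1500, beta=1.0, theta=0.05, mu=0.0,
                               sigma=0.01, seed=42):
    """Return (log_x, log_y) with log_y - beta * log_x following an OU process."""
    rng = np.random.default_rng(seed)
    # Random walk in log price: integrated of order one
    log_x = np.log(100.0) + np.cumsum(0.01 * rng.standard_normal(n))
    # OU spread simulated with the exact unit-step transition
    b = np.exp(-theta)
    sd = np.sqrt(sigma**2 / (2 * theta) * (1 - b**2))
    shocks = sd * rng.standard_normal(n)
    z = np.empty(n)
    z[0] = mu
    for t in range(1, n):
        z[t] = mu + (z[t - 1] - mu) * b + shocks[t]
    log_y = beta * log_x + z
    return log_x, log_y

x1, y1 = simulate_cointegrated_pair(seed=7)
x2, y2 = simulate_cointegrated_pair(seed=7)
spread = y1 - 1.0 * x1
print(np.array_equal(y1, y2), round(float(spread.std()), 4))
```

Fixing the seed makes every downstream statistic reproducible, which is what allows the sample console output to be compared across runs.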

With real Yahoo Finance data

python run.py --live

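Fetches actual GLD & SLV daily prices (2019–2024). The tests are run as-is on this sample, so the pair may or may not show cointegration. If the Johansen test does not reject $H_0: r = 0$, the pipeline exits with a "no cointegration" message; otherwise it continues through the remaining steps.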

Command-line options

python run.py --live --skip-kalman  # Real data, skip Kalman filter (faster)
python run.py --skip-kalman         # Simulated data, faster (~20 sec)

Statistical validation (standalone)

python stats_validation.py

Runs the bootstrap Sharpe test, extreme value analysis, runs test, and residual diagnostics. Also importable:

from stats_validation import bootstrap_sharpe_test, full_validation_report

Interactive dashboard

streamlit run dashboard/app.py

Opens a browser window with sliders for entry threshold, stop-loss, transaction costs, and target volatility. Adjust parameters and click Run Backtest to see updated equity curves, trade logs, and GARCH diagnostics in real time.

Pipeline Overview

flowchart TD
    A[📥 Fetch Data] --> B[📊 Stationarity Tests]
    B --> C[🔬 Johansen Cointegration]
    C --> D{H₀ rejected?}
    D -->|Yes| E[📈 OU Process MLE]
    D -->|No| Z[🚫 Exit: no cointegration]
    E --> F[🔄 Kalman Dynamic β]
    E --> G[📉 GARCH Volatility]
    G --> H[📐 Distribution Fit + VaR]
    H --> I[📏 Z-Score Signals]
    I --> J[⚖️ Position Sizing]
    J --> K[⏮ Backtest Engine]
    K --> L[📋 Performance Metrics]
    L --> M[🎲 Bootstrap Sharpe Test]
    M --> N[🔍 Residual Diagnostics]
    N --> O[📸 Visualisation]

At each step, the statistical test result is printed to the console with its null hypothesis, p-value, and a plain-language interpretation.

Sample console output:

======================================================================
 JOINT STATIONARITY DIAGNOSIS — SLV
======================================================================
ADF statistic = -2.4345, p-value = 0.1322.
  → VERDICT: STRONG EVIDENCE OF NON-STATIONARITY

======================================================================
 STEP 3: JOHANSEN COINTEGRATION TEST
======================================================================
Johansen test: AT LEAST 1 cointegrating relationship detected.
  Trace statistic r=0: 24.57 > 5% critical 15.49
  Implied hedge ratio (β): 0.9890

======================================================================
 STEP 4: OU PROCESS MAXIMUM LIKELIHOOD
======================================================================
  OU Parameter Estimates (MLE):
    θ = 0.042, μ = -0.654, σ = 0.002
    Half-life: 16.4 trading days | Stationary std: 0.007
    Log-likelihood: 4957.35 | Converged: True

Results Summary

Results from a run with simulated cointegrated data (entry $z = 2.0$, stop-loss $z = 3.5$, 5 bps transaction costs). Real-data results depend on market regime and cointegration strength.

Category	Metric	Value
Cointegration	Johansen trace stat ($r=0$)	24.6 > 15.5 (5% crit) → cointegration confirmed
OU Process	Mean reversion speed $\theta$	0.042 — half-life: 16.4 trading days
	Stationary $\sigma$	0.0073
GARCH	Persistence $\alpha+\beta$	0.998 (near-IGARCH)
Distribution	Preferred model	Student-$t$ (AIC −9897 vs Normal −9889, LR $p = 0.0016$)
Performance	Sharpe ratio (annualised)	1.28
	Sortino ratio	0.96
	Calmar ratio	1.36
	Max drawdown	−1.67%
	Total return (6 years)	+14.2%
	Annualised return	+2.3%
	Annualised volatility	1.8%
Trade Stats	Number of trades	9
	Hit rate	100%
	Profit factor	∞
Inference	Bootstrap Sharpe $p$-value	$p < 0.001$ — reject $H_0: SR = 0$
	Bootstrap 95% CI for SR (appears to be per-period, not annualised)	[0.037, 0.122]
Diagnostics	Ljung-Box	Autocorrelation at higher lags (expected)
	Jarque-Bera	Non-normal ($p < 0.001$, skewness 1.04, kurtosis 13.75)
	ARCH-LM	Heteroskedasticity confirmed → GARCH warranted

Note on the 100% hit rate: The simulated OU spread is perfectly mean-reverting by construction. Real-world pairs trades typically achieve 45–65% hit rates with non-trivial drawdowns. The bootstrap Sharpe test remains the most reliable quality gauge across both data sources.
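For reference, the headline performance metrics can be computed from a series of per-period returns as sketched below. It assumes daily returns, 252 periods per year, and a zero risk-free rate; the function name `performance_summary` and the toy inputs are illustrative, and the definitions in `backtest/` may differ in detail.

```python
import numpy as np

def performance_summary(returns, periods_per_year=252):
    """Annualised return, volatility, Sharpe, max drawdown and Calmar from simple returns."""
    r = np.asarray(returns, dtype=float)
    equity = np.concatenate([[1.0], np.cumprod(1.0 + r)])  # start from unit capital
    running_peak = np.maximum.accumulate(equity)
    max_drawdown = float(np.min(equity / running_peak - 1.0))
    years = len(r) / periods_per_year
    ann_return = float(equity[-1] ** (1.0 / years) - 1.0)
    ann_vol = float(r.std(ddof=1) * np.sqrt(periods_per_year))
    sharpe = float(r.mean() / r.std(ddof=1) * np.sqrt(periods_per_year))
    calmar = ann_return / abs(max_drawdown) if max_drawdown < 0 else float("inf")
    return {"ann_return": ann_return, "ann_vol": ann_vol, "sharpe": sharpe,
            "max_drawdown": max_drawdown, "calmar": calmar}

stats = performance_summary([0.10, -0.10, 0.05])
no_loss = performance_summary([0.01, 0.02, 0.03])
print(stats["max_drawdown"], no_loss["calmar"])
```

A strategy with no losing periods has zero drawdown and therefore an infinite Calmar ratio, for the same reason a run with no losing trades has an infinite profit factor.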

Visual Output

Eight output files are saved to results/ on each run:

#	File	Description
1	`price_and_spread.png`	Log prices of both assets + spread with OU mean and ±2σ band
2	`zscore_signals.png`	Z-score with entry/exit signals overlaid on thresholds
3	`equity_curve.png`	Portfolio value over time with drawdown ribbon
4	`trade_distribution.png`	Bar chart of individual trade returns (green = win, red = loss)
5	`distribution_fit.png`	Histogram of returns with fitted Normal and Student-$t$ densities
6	`dynamic_beta.png`	Kalman-smoothed hedge ratio vs static Johansen estimate
7	`spread_zscore.html`	Interactive Plotly chart — hover, zoom, pan
8	`equity_curve_interactive.html`	Interactive equity curve with drawdown

Statistical Interpretation

Is the strategy's Sharpe ratio statistically significant?

The bootstrap Sharpe test addresses this directly. A positive point estimate alone is not evidence of an edge, since it could arise from a favourable draw of noise. The circular block bootstrap preserves short-range autocorrelation in the returns (within each block) and approximates the sampling distribution of $\widehat{SR}$ under $H_0: SR = 0$. If $p < 0.05$, the data are inconsistent with a zero Sharpe ratio at the 5% level. If $p \ge 0.05$, the strategy cannot be distinguished from chance on this sample.

Are the strategy residuals well-behaved?

Test	Null Hypothesis	What Failure Means
Ljung-Box	No autocorrelation up to lag $m$	Unmodeled structure remains — potential for improvement or sign of misspecification
ARCH-LM	No ARCH effects (homoskedasticity)	Volatility clustering is present — validates the use of GARCH for position scaling
Jarque-Bera	Returns are normally distributed	Fat tails or skewness — using Student-$t$ VaR/CVaR rather than Normal is more realistic
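To make two of these statistics concrete, the sketch below computes the Jarque-Bera statistic with its asymptotic $\chi^2_2$ p-value and the Ljung-Box $Q$ statistic directly from their definitions. The project itself relies on statsmodels for these tests; the function names here are illustrative, and kurtosis is reported in raw (non-excess) form.

```python
import math
import numpy as np

def jarque_bera(x):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with asymptotic chi-square(2) p-value."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    m2 = np.mean(d**2)
    skew = np.mean(d**3) / m2**1.5
    kurt = np.mean(d**4) / m2**2          # raw kurtosis: 3 for a Normal
    jb = n / 6.0 * (skew**2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)         # chi-square(2) survival function
    return jb, p_value

def ljung_box_q(x, max_lag):
    """Q = n(n+2) * sum_k rho_k^2 / (n - k) over lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    denom = np.sum(d**2)
    q = 0.0
    for k in range(1, max_lag + 1):
        rho_k = np.sum(d[k:] * d[:-k]) / denom
        q += rho_k**2 / (n - k)
    return n * (n + 2) * q

jb, jb_p = jarque_bera([-1.0, 1.0, -1.0, 1.0])
q1 = ljung_box_q([1.0, -1.0, 1.0, -1.0], max_lag=1)
print(round(jb, 4), round(jb_p, 4), q1)
```

Under its null hypothesis, $Q$ is compared with a $\chi^2$ distribution with `max_lag` degrees of freedom, reduced by the number of fitted parameters when applied to model residuals.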

Distributional fit

Financial returns famously violate normality. The likelihood ratio test between nested Normal and Student-$t$ models checks whether the extra degree-of-freedom parameter $\nu$ improves the fit. The AIC provides additional model selection evidence. Using the Student-$t$ for VaR/CVaR computation typically yields more conservative (realistic) risk estimates.
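The link between the AIC values and the likelihood ratio test is simple arithmetic: since $\text{AIC} = 2k - 2\ell$, the statistic is $LR = \text{AIC}_{\text{Normal}} - \text{AIC}_{t} + 2(k_t - k_{\text{Normal}})$. The sketch below applies this to the rounded AIC values in the Results Summary, assuming one extra parameter ($\nu$) and a $\chi^2_1$ reference distribution. The function name is hypothetical, and because the inputs are rounded the result is approximate.

```python
import math

def lr_test_from_aic(aic_null, aic_alt, extra_params=1):
    """Likelihood ratio statistic and chi-square(1) p-value recovered from two AIC values."""
    if extra_params != 1:
        raise ValueError("this sketch only implements the one-extra-parameter case")
    lr = max((aic_null - aic_alt) + 2 * extra_params, 0.0)
    p_value = math.erfc(math.sqrt(lr / 2.0))  # P(chi-square(1) > lr)
    return lr, p_value

# Rounded AIC values from the Results Summary: Normal -9889, Student-t -9897
lr, p = lr_test_from_aic(aic_null=-9889.0, aic_alt=-9897.0)
print(f"LR = {lr:.1f}, p = {p:.4f}")
```

The Normal is the limit of the Student-$t$ as $\nu \to \infty$, which sits on the boundary of the parameter space, so the $\chi^2_1$ p-value should be read as an approximation that tends to be conservative.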

Design Principles

#	Principle	Implementation
1	No look-ahead bias	Statistical models estimated on in-sample data (2019–2022). Hedge ratio applied out-of-sample (2023–2024). Z-score uses OU parameters from in-sample fit.
2	Separation of concerns	Estimation (`stats_models/`) is cleanly separated from trading logic (`strategy/`) and backtest execution (`backtest/`).
3	Every test explained	Each statistical test prints $H_0$, $H_1$, the test statistic, $p$-value, and a plain-language interpretation.
4	Reproducibility	Random seeds are set. Simulated data is deterministic. The pipeline runs identically every time (modulo live Yahoo Finance data).
5	Graceful degradation	If Yahoo Finance fails, the system falls back to simulated cointegrated data — the full pipeline still executes.
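Principle 1 can be illustrated with a small sketch: the z-score is standardised with OU parameters fixed from the in-sample fit, and positions follow a simple threshold rule. The entry and stop-loss levels match the Results Summary (2.0 and 3.5); the exit at a zero crossing, the function names, and the input values are assumptions for illustration, not the rules in `strategy/trading_rules.py`.

```python
import math

def ou_zscore(spread, mu, sigma, theta):
    """Standardise the spread with the in-sample OU mean and stationary std sigma / sqrt(2 theta)."""
    stationary_std = sigma / math.sqrt(2.0 * theta)
    return [(s - mu) / stationary_std for s in spread]

def positions_from_zscore(z, entry=2.0, stop=3.5, exit_level=0.0):
    """Position in the spread: -1 short, +1 long, 0 flat."""
    pos, out = 0, []
    for zt in z:
        if pos == 0:
            if entry <= abs(zt) < stop:        # enter against the deviation
                pos = -1 if zt > 0 else 1
        elif abs(zt) >= stop:                  # stop-loss: divergence kept growing
            pos = 0
        elif (pos == -1 and zt <= exit_level) or (pos == 1 and zt >= -exit_level):
            pos = 0                            # take profit at mean reversion
        out.append(pos)
    return out

z_demo = ou_zscore([0.1], mu=0.0, sigma=0.1, theta=0.5)
pos_demo = positions_from_zscore([0.0, 2.5, 1.0, -0.1, -2.2, -3.6, 0.0])
print(z_demo, pos_demo)
```

In a backtest the position decided at time $t$ would be applied to the return from $t$ to $t+1$, so that the signal never uses information from the bar it trades.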

Limitations & Further Work

Known Limitations

Survivorship bias — GLD and SLV were chosen because they survived. A rigorous study screens a universe.
Simplified transaction costs — Fixed 5 bps. Real costs include bid-ask spread, market impact, and short-borrow fees.
Static hedge ratio — The backtest uses the in-sample Johansen $\beta$ out-of-sample. Rolling recalibration (monthly Johansen) would be more realistic.
No regime detection — Cointegrating relationships can break down during structural shifts. Markov-switching or structural break tests would help.
Linear cointegration only — Non-linear cointegration (threshold, smooth-transition) is not explored.

Potential Extensions

Copulas — Replace linear correlation with copula-based dependence (Clayton, Gumbel, Student-$t$) for tail-dependence modelling
Bayesian OU model — Priors on $\theta$, $\mu$, $\sigma$ with MCMC (PyMC) for full posterior inference and parameter uncertainty in trading signals
Multi-asset stat-arb — Johansen on 3–5 assets simultaneously; trade the portfolio spread
High-frequency — Apply to intraday data with microstructure adjustments
Online learning — Particle filter or sequential MCMC to update OU parameters in real time

FAQ / Troubleshooting

The pipeline exits with "No cointegration detected"

A failure to reject is a legitimate test outcome, not a bug. On the GLD/SLV pair (2019–2024), cointegration is often borderline or absent. The default simulated data avoids this. Alternatives: (a) use a longer sample period, (b) try a different pair, or (c) use OLS + ADF (Engle-Granger) instead.

Yahoo Finance download fails

The default simulated data path requires no internet. With --live, a download failure automatically triggers the simulated fallback (console will say Source: simulated (yfinance fallback)).

The arch library won't install

The arch package (for GARCH) may conflict with a system package of the same name. Install in a fresh virtual environment:

python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt

Streamlit says "command not found"

Run with: python -m streamlit run dashboard/app.py

Charts look blank or crash on matplotlib error

The visualisation module uses the Agg backend (file-only, non-interactive), which charts.py selects so that a TclError from a missing display is already handled. If charts are still blank or missing, make sure you are running from the project root.

Permission denied when saving to results/

The results/ directory is auto-created by run.py. If you cloned to a read-only location, either mkdir results manually or change RESULTS_DIR in run.py.

Why does simulated data show a 100% hit rate?

The simulated OU spread is perfectly mean-reverting with no structural breaks — an idealised benchmark. Real-world pairs trades typically achieve 45–65% hit rates. The simulated data demonstrates the statistical machinery; real-data performance reflects market reality.

ModuleNotFoundError: No module named 'stats_models'

The project uses local imports. Run scripts from the project root (statsProj/, the directory that contains run.py), not from inside a subdirectory. Do not install stats_models from PyPI.

References

Engle, R. F. & Granger, C. W. J. (1987). Co-integration and Error Correction: Representation, Estimation, and Testing. Econometrica, 55(2), 251–276.
Johansen, S. (1988). Statistical Analysis of Cointegration Vectors. Journal of Economic Dynamics and Control, 12(2–3), 231–254.
Kwiatkowski, D., Phillips, P. C. B., Schmidt, P., & Shin, Y. (1992). Testing the Null Hypothesis of Stationarity Against the Alternative of a Unit Root. Journal of Econometrics, 54(1–3), 159–178.
Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics, 31(3), 307–327.
Politis, D. N. & Romano, J. P. (1992). A Circular Block-Resampling Procedure for Stationary Data. Exploring the Limits of Bootstrap, 263–270.
Ljung, G. M. & Box, G. E. P. (1978). On a Measure of Lack of Fit in Time Series Models. Biometrika, 65(2), 297–303.
Chan, E. (2013). Algorithmic Trading: Winning Strategies and Their Rationale. Wiley.
Tsay, R. S. (2010). Analysis of Financial Time Series, 3rd ed. Wiley.
Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press.
Embrechts, P., Klüppelberg, C., & Mikosch, T. (1997). Modelling Extremal Events. Springer.

License

MIT — see LICENSE.

Built for demonstration of statistical rigour — not for live trading. Every test is stated, every p-value is interpreted, no black boxes.
