A statistically rigorous pairs trading system: every signal backstopped by a formal hypothesis test.
- Connection to the MSc Statistics Programme
- Mathematical & Statistical Foundations
- Project Structure
- Installation & Setup
- How to Run
- Pipeline Overview
- Results Summary
- Statistical Interpretation
- Design Principles
- Limitations & Further Work
- FAQ / Troubleshooting
- References
- License
| Programme Component | Statistical Work in QuantSpectre | Code Reference |
|---|---|---|
| Compulsory: Probability Theory & Mathematical Statistics | OU process stochastic calculus & exact discrete-time MLE; Student-$t$ and Normal return distribution fitting via likelihood; analytical VaR/CVaR from first principles | stats_models/ou_process.py, stats_models/distributions.py |
| Specialisation: Quantitative Methods of Financial Markets | Johansen cointegration for stat-arb; GARCH(1,1) volatility model; Kalman filter dynamic hedge ratio; position sizing via volatility targeting | stats_models/cointegration.py, stats_models/garch_model.py, stats_models/kalman_filter.py, strategy/trading_rules.py |
| Specialisation: Statistical Inference | Bootstrap Sharpe ratio hypothesis test ( | stats_validation.py, stats_models/diagnostics.py |
| Interdisciplinary: Professional & Applied Work | Complete backtesting engine with transaction costs; reproducible pipeline from data fetch to visualisation; Streamlit dashboard for interactive parameter tuning; trade log CSV output | backtest/, dashboard/, run.py, visualization/ |
Full derivations with all intermediate steps are in
MATHS.md. Below is a summary of the core models.
1. Cointegration (Johansen, 1988)
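Two non-stationary I(1) series $X_t$ and $Y_t$ are cointegrated if some linear combination $Y_t - \beta X_t$ is stationary, i.e. I(0).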
The Johansen test is based on the VECM:
$$\Delta \mathbf{X}_t = \Pi \mathbf{X}_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta \mathbf{X}_{t-i} + \varepsilon_t$$
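Under cointegration, $\Pi = \alpha\beta^\top$ has reduced rank $r$: the columns of $\beta$ are the cointegrating vectors and $\alpha$ holds the adjustment speeds.
Rejecting $H_0: r = 0$ with the trace statistic indicates at least one cointegrating relationship.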
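For reference, the trace statistic used in that decision has the standard form below. The symbols $T$ (effective sample size), $n$ (number of series) and $\hat{\lambda}_i$ (ordered eigenvalues from the reduced-rank regression) are textbook notation and are assumed here rather than taken from this project's code.

$$\lambda_{\text{trace}}(r) = -T \sum_{i=r+1}^{n} \ln\left(1 - \hat{\lambda}_i\right)$$

$H_0$: "at most $r$ cointegrating relationships" is rejected when $\lambda_{\text{trace}}(r)$ exceeds the tabulated critical value, as in the sample run where the $r = 0$ statistic of 24.57 exceeds the 5% critical value of 15.49.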
2. Ornstein-Uhlenbeck Process
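The spread follows a mean-reverting diffusion:
$$dS_t = \theta(\mu - S_t)\,dt + \sigma\,dW_t$$
where $\theta > 0$ is the speed of mean reversion, $\mu$ the long-run mean, $\sigma$ the diffusion coefficient, and $W_t$ a standard Brownian motion.
Exact discrete-time transition (not an Euler approximation):
$$S_t \mid S_{t-1} \sim \mathcal{N}\left(\mu + (S_{t-1} - \mu)e^{-\theta \Delta t},\; \frac{\sigma^2}{2\theta}\left(1 - e^{-2\theta \Delta t}\right)\right)$$
MLE: minimise the negative log-likelihood:
$$-\ell(\theta, \mu, \sigma) = \sum_{t=1}^{n}\left[\frac{1}{2}\ln(2\pi v) + \frac{(S_t - m_{t-1})^2}{2v}\right]$$
with $m_{t-1} = \mu + (S_{t-1} - \mu)e^{-\theta \Delta t}$ and $v = \frac{\sigma^2}{2\theta}\left(1 - e^{-2\theta \Delta t}\right)$.
Half-life: $t_{1/2} = \ln 2 / \theta$.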
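A minimal sketch of the exact-transition idea, assuming a unit time step and using the parameter values from the sample console output as the "true" values. The function names `simulate_ou` and `fit_ou_exact` are hypothetical and are not the API of stats_models/ou_process.py. Because the exact OU transition is a Gaussian AR(1), the conditional MLE can be obtained in closed form from the AR(1) regression instead of a numerical optimiser.

```python
import numpy as np

def simulate_ou(theta, mu, sigma, n, dt=1.0, seed=0):
    """Simulate an OU path with the exact Gaussian transition (no Euler error)."""
    rng = np.random.default_rng(seed)
    decay = np.exp(-theta * dt)
    step_sd = sigma * np.sqrt((1.0 - decay**2) / (2.0 * theta))
    s = np.empty(n)
    s[0] = mu                                   # start at the long-run mean
    for t in range(1, n):
        s[t] = mu + (s[t - 1] - mu) * decay + step_sd * rng.standard_normal()
    return s

def fit_ou_exact(s, dt=1.0):
    """Conditional MLE of (theta, mu, sigma) via the equivalent AR(1) regression."""
    x, y = s[:-1], s[1:]
    xm, ym = x.mean(), y.mean()
    b = np.sum((x - xm) * (y - ym)) / np.sum((x - xm) ** 2)   # b = exp(-theta*dt)
    a = ym - b * xm
    if not 0.0 < b < 1.0:
        raise ValueError("no mean reversion detected (slope outside (0, 1))")
    resid_var = np.mean((y - (a + b * x)) ** 2)               # MLE of transition variance
    theta = -np.log(b) / dt
    mu = a / (1.0 - b)
    sigma = np.sqrt(resid_var * 2.0 * theta / (1.0 - b**2))
    return theta, mu, sigma

path = simulate_ou(theta=0.042, mu=-0.654, sigma=0.002, n=5000)
theta_hat, mu_hat, sigma_hat = fit_ou_exact(path)
half_life = np.log(2.0) / theta_hat
print(f"theta={theta_hat:.4f} mu={mu_hat:.4f} sigma={sigma_hat:.5f} half-life={half_life:.1f}")
```

The estimates should land near the inputs, with the mean-reversion speed the noisiest of the three because it is identified only through the slope being slightly below one.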
3. Kalman Filter for Dynamic β
State-space model with random-walk hedge ratio: $$\begin{aligned} \beta_t &= \beta_{t-1} + \eta_t, \quad \eta_t \sim \mathcal{N}(0, Q) \\ Y_t &= \beta_t X_t + \varepsilon_t, \quad \varepsilon_t \sim \mathcal{N}(0, R) \end{aligned}$$
The Kalman recursion (predict → gain → update) yields $\hat{\beta}_{t|t}$ at every time step. An RTS smoother provides the retrospective $\hat{\beta}_{t|T}$. This captures time-varying relationships that a static OLS/Johansen hedge ratio misses.
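The predict, gain and update steps for this scalar state-space model can be sketched as follows. This is an illustrative filter only (no RTS smoothing pass); the function name `kalman_beta`, the noise variances `q` and `r`, and the synthetic data are assumptions and not the contents of stats_models/kalman_filter.py.

```python
import numpy as np

def kalman_beta(x, y, q=1e-5, r=1e-3, beta0=0.0, p0=1.0):
    """Filter a random-walk hedge ratio beta_t in y_t = beta_t * x_t + eps_t."""
    n = len(y)
    beta = np.empty(n)
    var = np.empty(n)
    b, p = beta0, p0
    for t in range(n):
        p = p + q                       # predict: random walk, variance grows by Q
        innov = y[t] - b * x[t]         # one-step-ahead forecast error
        s = x[t] * p * x[t] + r         # innovation variance
        k = p * x[t] / s                # Kalman gain
        b = b + k * innov               # update the state estimate
        p = (1.0 - k * x[t]) * p        # update the state variance
        beta[t], var[t] = b, p
    return beta, var

# Synthetic example: the true hedge ratio drifts from 0.9 to 1.1.
rng = np.random.default_rng(1)
x = 3.0 + 0.01 * np.cumsum(rng.standard_normal(1000))
true_beta = np.linspace(0.9, 1.1, 1000)
y = true_beta * x + 0.03 * rng.standard_normal(1000)

beta_hat, beta_var = kalman_beta(x, y, q=1e-5, r=0.03**2)
print(f"filtered beta at the end of the sample: {beta_hat[-1]:.3f}")
```

The ratio `q / r` controls how quickly the filter follows changes in the hedge ratio: a larger `q` tracks faster but produces a noisier estimate.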
4. GARCH(1,1) Volatility
Estimated via the arch library. Conditional volatility $\sigma_t$ drives the volatility-targeted position sizing.
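To make the link between conditional volatility and position size concrete, the sketch below runs the standard GARCH(1,1) recursion $\sigma_t^2 = \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2$ for given parameters and converts the result into a volatility-targeting multiplier. Here $\alpha$ and $\beta$ are GARCH coefficients, unrelated to the hedge ratio. No estimation is performed; the parameter values, `target_annual_vol` and `max_leverage` are illustrative assumptions, and the project itself relies on the arch library for fitting.

```python
import numpy as np

def garch11_variance(returns, omega, alpha, beta):
    """Run the GARCH(1,1) variance recursion for fixed parameters."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    eps = returns - returns.mean()
    sigma2 = np.empty(len(eps))
    sigma2[0] = omega / (1.0 - alpha - beta)          # start at the unconditional variance
    for t in range(1, len(eps)):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

def vol_target_scale(sigma2, target_annual_vol=0.02, periods=252, max_leverage=3.0):
    """Position multiplier aiming for a constant annualised volatility."""
    annual_vol = np.sqrt(sigma2 * periods)
    return np.minimum(target_annual_vol / annual_vol, max_leverage)

rng = np.random.default_rng(2)
r = 0.001 * rng.standard_normal(500)                  # synthetic daily returns
sigma2 = garch11_variance(r, omega=2e-9, alpha=0.05, beta=0.948)   # persistence 0.998
scale = vol_target_scale(sigma2)
print(f"mean position multiplier: {scale.mean():.2f}")
```

Since `sigma2[t]` depends only on returns up to `t - 1`, the multiplier for day `t` is known before that day's return is realised, which keeps the sizing free of look-ahead.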
5. Return Distribution & Risk Metrics
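Both Normal and Student-$t$ distributions are fitted via MLE. A likelihood ratio test compares the two nested models.
Risk metrics computed analytically from both fitted distributions: $$\text{VaR}_\alpha = \mu + \sigma \cdot F^{-1}(\alpha)$$ $$\text{CVaR}_\alpha = \mu + \sigma \cdot \frac{f(F^{-1}(\alpha))}{1-\alpha}$$
Here $F$ and $f$ are the CDF and density of the standardised distribution. The CVaR expression is exact for the Normal; for a Student-$t$ with $\nu$ degrees of freedom the fraction is additionally multiplied by $\frac{\nu + (F^{-1}(\alpha))^2}{\nu - 1}$.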
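As a worked instance of the Normal case, the sketch below evaluates both formulas with the standard library only. It assumes the convention in which $\mu$ and $\sigma$ describe the loss distribution and $\alpha$ is the confidence level (for example 0.95); the function name and input values are illustrative and not taken from stats_models/distributions.py.

```python
from statistics import NormalDist

def normal_var_cvar(mu, sigma, alpha=0.95):
    """Analytical VaR and CVaR of a Normal(mu, sigma) loss at confidence level alpha."""
    std = NormalDist()                       # standard Normal: F and f in the formulas
    z = std.inv_cdf(alpha)                   # F^{-1}(alpha)
    var = mu + sigma * z
    cvar = mu + sigma * std.pdf(z) / (1.0 - alpha)
    return var, cvar

var95, cvar95 = normal_var_cvar(mu=0.0, sigma=0.01, alpha=0.95)
print(f"VaR 95% = {var95:.4%}, CVaR 95% = {cvar95:.4%}")
```

With a 1% daily loss volatility this gives a VaR of roughly 1.64% and a CVaR of roughly 2.06%; CVaR is always at least as large as VaR because it averages the losses beyond the VaR threshold.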
6. Bootstrap Sharpe Ratio Test
The sampling distribution of the Sharpe ratio is complex under autocorrelated, non-normal returns. The circular block bootstrap (Politis & Romano, 1992) constructs the empirical distribution of $\widehat{SR}$:
- Partition returns into non-overlapping blocks of length $L = 10$
- Resample blocks with replacement (wrapping circularly around the series)
- For each of $B = 2{,}000$ replicates, compute $SR^*_b$
- Centre under $H_0$ and compute: $p = \frac{1}{B}\sum_b \mathbb{1}(|SR^*_{b,\text{centred}}| \geq |\widehat{SR}|)$

If $p$ falls below the chosen significance level, $H_0$ is rejected and the Sharpe ratio is judged statistically distinguishable from zero.
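The resampling steps above can be sketched in a few lines of NumPy. This is not the project's `bootstrap_sharpe_test`; it is a hypothetical stand-in with two stated assumptions: block start points are drawn uniformly from every index with wrap-around (the usual Politis & Romano construction, which differs slightly from a fixed non-overlapping partition), and centring under $H_0: SR = 0$ is done by subtracting $\widehat{SR}$ from each replicate.

```python
import numpy as np

def sharpe(r):
    """Per-period Sharpe ratio with a zero risk-free rate."""
    return r.mean() / r.std(ddof=1)

def circular_block_bootstrap_sharpe(returns, block_len=10, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for H0: SR = 0 using circular blocks."""
    r = np.asarray(returns, dtype=float)
    n = len(r)
    rng = np.random.default_rng(seed)
    sr_hat = sharpe(r)
    n_blocks = int(np.ceil(n / block_len))
    offsets = np.arange(block_len)
    sr_star = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n, size=n_blocks)         # random block start points
        idx = (starts[:, None] + offsets[None, :]) % n     # wrap around the series end
        sr_star[b] = sharpe(r[idx.ravel()[:n]])            # trim to the original length
    centred = sr_star - sr_hat                             # impose H0: SR = 0
    p_value = np.mean(np.abs(centred) >= abs(sr_hat))
    ci = np.percentile(sr_star, [2.5, 97.5])               # percentile 95% interval
    return sr_hat, p_value, ci

rng = np.random.default_rng(42)
demo_returns = 0.002 + 0.01 * rng.standard_normal(1500)    # synthetic returns with a clear edge
sr_hat, p_value, ci = circular_block_bootstrap_sharpe(demo_returns)
print(f"SR = {sr_hat:.3f}, p = {p_value:.4f}, 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```

The Sharpe ratio here is per period; multiplying by $\sqrt{252}$ would annualise it under the usual assumption of daily data.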

    statsProj/
    ├── README.md, LICENSE, requirements.txt       # Meta
    ├── run.py                                     # Full pipeline orchestrator
    ├── stats_validation.py                        # Standalone statistical test battery
    ├── MATHS.md, ARCHITECTURE.md                  # Deep-dive documentation
    ├── CHANGELOG.md, CONTRIBUTING.md              # Project governance
    ├── demo_presentation.md, video_script.md      # Presentation materials
    ├── data/ - data_loader.py                     # yfinance + simulated fallback
    ├── stats_models/ - 7 modules                  # ADF, Johansen, OU, Kalman, GARCH, distributions, diagnostics
    ├── strategy/ - 2 modules                      # Pair selector, signal generation, position sizing
    ├── backtest/ - 2 modules                      # Vectorised engine, performance metrics
    ├── visualization/ - charts, interactive       # Static PNG + interactive Plotly HTML
    ├── dashboard/ - app.py                        # Streamlit parameter-tuning UI
    └── results/ - Auto-generated                  # Equity curve, trade log, z-score chart, etc.

31 tracked files, ~4,100 lines of Python, ~1,200 lines of documentation.
Requirements: Python 3.9+, pip.

    git clone https://github.com/YOUR_USERNAME/quant-spectre.git
    cd quant-spectre
    # Create a virtual environment (recommended)
    python -m venv .venv
    source .venv/bin/activate      # macOS / Linux
    # .venv\Scripts\activate       # Windows
    # Install dependencies
    pip install -r requirements.txt

| Package | Purpose |
|---|---|
| numpy, pandas | Data wrangling |
| scipy | Optimisation (MLE), distributions |
| statsmodels | ADF, KPSS, Johansen, Ljung-Box, ARCH-LM |
| arch | GARCH estimation |
| yfinance | ETF price download |
| matplotlib, seaborn | Static visualisation |
| plotly | Interactive HTML charts |
| streamlit | Dashboard |

    python run.py

Uses deterministic simulated cointegrated data. Runs the complete statistical pipeline: cointegration testing, OU MLE, GARCH, VaR/CVaR, backtesting, bootstrap validation, and visualisation. Expected runtime: 30–60 seconds.

    python run.py --live

Fetches actual GLD & SLV daily prices (2019–2024). The test results are reported as they come out: the pair may or may not show cointegration over this period, and the pipeline handles both cases gracefully.

    python run.py --live --skip-kalman   # Real data, skip Kalman filter (faster)
    python run.py --skip-kalman          # Simulated data, faster (~20 sec)

    python stats_validation.py

Runs the bootstrap Sharpe test, extreme value analysis, runs test, and residual diagnostics. Also importable:

    from stats_validation import bootstrap_sharpe_test, full_validation_report

    streamlit run dashboard/app.py

Opens a browser window with sliders for entry threshold, stop-loss, transaction costs, and target volatility. Adjust parameters and click Run Backtest to see updated equity curves, trade logs, and GARCH diagnostics in real time.

    flowchart TD
        A[Fetch Data] --> B[Stationarity Tests]
        B --> C[Johansen Cointegration]
        C --> D{H₀ rejected?}
        D -->|Yes| E[OU Process MLE]
        D -->|No| Z[Exit: no cointegration]
        E --> F[Kalman Dynamic β]
        E --> G[GARCH Volatility]
        G --> H[Distribution Fit + VaR]
        H --> I[Z-Score Signals]
        I --> J[Position Sizing]
        J --> K[Backtest Engine]
        K --> L[Performance Metrics]
        L --> M[Bootstrap Sharpe Test]
        M --> N[Residual Diagnostics]
        N --> O[Visualisation]

At each step, the statistical test result is printed to the console with its null hypothesis, p-value, and a plain-language interpretation.
Sample console output:

    ======================================================================
    JOINT STATIONARITY DIAGNOSIS - SLV
    ======================================================================
    ADF statistic = -2.4345, p-value = 0.1322.
    VERDICT: STRONG EVIDENCE OF NON-STATIONARITY
    ======================================================================
    STEP 3: JOHANSEN COINTEGRATION TEST
    ======================================================================
    Johansen test: AT LEAST 1 cointegrating relationship detected.
    Trace statistic r=0: 24.57 > 5% critical 15.49
    Implied hedge ratio (β): 0.9890
    ======================================================================
    STEP 4: OU PROCESS MAXIMUM LIKELIHOOD
    ======================================================================
    OU Parameter Estimates (MLE):
    θ = 0.042, μ = -0.654, σ = 0.002
    Half-life: 16.4 trading days | Stationary std: 0.007
    Log-likelihood: 4957.35 | Converged: True

Results from a run with simulated cointegrated data (entry $z = 2.0$, stop-loss $z = 3.5$, 5 bps transaction costs). Real-data results depend on market regime and cointegration strength.
| Category | Metric | Value |
|---|---|---|
| Cointegration | Johansen trace stat ($r = 0$) | 24.6 > 15.5 (5% crit) → cointegration confirmed |
| OU Process | Mean reversion speed $\theta$ | 0.042 → half-life: 16.4 trading days |
| | Stationary std | 0.0073 |
| GARCH | Persistence | 0.998 (near-IGARCH) |
| Distribution | Preferred model | Student-$t$ (AIC −9897 vs Normal −9889, LR |
| Performance | Sharpe ratio (annualised) | 1.28 |
| | Sortino ratio | 0.96 |
| | Calmar ratio | 1.36 |
| | Max drawdown | −1.67% |
| | Total return (6 years) | +14.2% |
| | Annualised return | +2.3% |
| | Annualised volatility | 1.8% |
| Trade Stats | Number of trades | 9 |
| | Hit rate | 100% |
| | Profit factor | ∞ |
| Inference | Bootstrap Sharpe | |
| | Bootstrap 95% CI for SR | [0.037, 0.122] |
| Diagnostics | Ljung-Box | Autocorrelation at higher lags (expected) |
| | Jarque-Bera | Non-normal ( |
| | ARCH-LM | Heteroskedasticity confirmed → GARCH warranted |
Note on the 100% hit rate: The simulated OU spread is perfectly mean-reverting by construction. Real-world pairs trades typically achieve 45–65% hit rates with non-trivial drawdowns. The bootstrap Sharpe test remains the most reliable quality gauge across both data sources.
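For readers who want to see how the performance and trade statistics in the table are typically defined, here is a minimal sketch. It assumes daily returns, 252 periods per year and a zero risk-free rate; the function names are hypothetical and the exact conventions in backtest/ may differ (for example in the Sortino denominator).

```python
import numpy as np

def performance_summary(returns, periods=252):
    """Annualised return/risk ratios from a series of per-period returns."""
    r = np.asarray(returns, dtype=float)
    equity = np.concatenate(([1.0], np.cumprod(1.0 + r)))     # equity curve starting at 1
    years = len(r) / periods
    ann_return = equity[-1] ** (1.0 / years) - 1.0
    ann_vol = r.std(ddof=1) * np.sqrt(periods)
    downside = np.sqrt(np.mean(np.minimum(r, 0.0) ** 2)) * np.sqrt(periods)
    max_dd = (equity / np.maximum.accumulate(equity) - 1.0).min()
    ann_mean = r.mean() * periods
    return {
        "sharpe": ann_mean / ann_vol,
        "sortino": ann_mean / downside if downside > 0 else float("inf"),
        "calmar": ann_return / abs(max_dd) if max_dd < 0 else float("inf"),
        "max_drawdown": max_dd,
        "ann_return": ann_return,
        "ann_vol": ann_vol,
    }

def trade_stats(trade_pnls):
    """Hit rate and profit factor from per-trade P&L."""
    pnl = np.asarray(trade_pnls, dtype=float)
    gains = pnl[pnl > 0].sum()
    losses = -pnl[pnl < 0].sum()
    return {
        "hit_rate": float(np.mean(pnl > 0)),
        "profit_factor": gains / losses if losses > 0 else float("inf"),
    }

rng = np.random.default_rng(7)
summary = performance_summary(0.0002 + 0.001 * rng.standard_normal(1500))
print({k: round(float(v), 4) for k, v in summary.items()})
print(trade_stats([0.4, 0.9, 0.2]))     # no losing trades, so the profit factor is infinite
```

The last line shows why a 100% hit rate goes hand in hand with an infinite profit factor: with no losing trades the denominator is zero.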
Eight output files are saved to results/ on each run:
| # | File | Description |
|---|---|---|
| 1 | price_and_spread.png | Log prices of both assets + spread with OU mean and ±2σ band |
| 2 | zscore_signals.png | Z-score with entry/exit signals overlaid on thresholds |
| 3 | equity_curve.png | Portfolio value over time with drawdown ribbon |
| 4 | trade_distribution.png | Bar chart of individual trade returns (green = win, red = loss) |
| 5 | distribution_fit.png | Histogram of returns with fitted Normal and Student-$t$ densities |
| 6 | dynamic_beta.png | Kalman-smoothed hedge ratio vs static Johansen estimate |
| 7 | spread_zscore.html | Interactive Plotly chart: hover, zoom, pan |
| 8 | equity_curve_interactive.html | Interactive equity curve with drawdown |
The bootstrap Sharpe test addresses whether the observed Sharpe ratio is statistically distinguishable from zero. A positive point estimate alone is not evidence: it could arise from a favourable draw of noise. The circular block bootstrap preserves the short-range autocorrelation structure of the returns and builds an empirical approximation to the sampling distribution of $\widehat{SR}$.
| Test | Null Hypothesis | What Failure Means |
|---|---|---|
| Ljung-Box | No autocorrelation up to the chosen lag | Unmodeled structure remains: potential for improvement or sign of misspecification |
| ARCH-LM | No ARCH effects (homoskedasticity) | Volatility clustering is present, which validates the use of GARCH for position scaling |
| Jarque-Bera | Returns are normally distributed | Fat tails or skewness: using Student-$t$ VaR/CVaR rather than Normal is more realistic |
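Of the three diagnostics, Jarque-Bera is simple enough to compute by hand, which makes the null hypothesis in the table concrete. The sketch below is an illustration in plain NumPy, not the statsmodels routine the project depends on; it uses the fact that the reference distribution is chi-square with 2 degrees of freedom, whose survival function is exactly $e^{-x/2}$.

```python
import numpy as np

def jarque_bera(x):
    """Jarque-Bera statistic and asymptotic p-value for H0: data are Normal."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    m2 = np.mean(d**2)
    skew = np.mean(d**3) / m2**1.5
    kurt = np.mean(d**4) / m2**2
    jb = n / 6.0 * (skew**2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = np.exp(-jb / 2.0)          # chi-square(2) survival function
    return jb, p_value

rng = np.random.default_rng(3)
heavy_tailed = rng.standard_t(3, size=2000)      # synthetic fat-tailed sample
jb_stat, jb_p = jarque_bera(heavy_tailed)
print(f"JB = {jb_stat:.1f}, p = {jb_p:.3g}")
```

A fat-tailed sample like this one produces a very large statistic and a p-value near zero, which is the situation the table describes as motivating Student-$t$ risk metrics.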
Financial returns famously violate normality. The likelihood ratio test between the nested Normal and Student-$t$ models checks whether the extra degrees-of-freedom parameter significantly improves the fit.
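As a rough cross-check, the likelihood ratio statistic can be backed out of the two AIC values quoted in the results table, since $\text{AIC} = 2k - 2\ln L$. The sketch assumes $k = 2$ parameters for the Normal and $k = 3$ for the Student-$t$ (an assumption about how the models are parameterised) and works from the rounded AIC figures, so the output is approximate.

```python
import math

def lr_from_aic(aic_restricted, k_restricted, aic_full, k_full):
    """Recover the likelihood ratio statistic from two AIC values (AIC = 2k - 2 logL)."""
    loglik_restricted = (2 * k_restricted - aic_restricted) / 2.0
    loglik_full = (2 * k_full - aic_full) / 2.0
    return 2.0 * (loglik_full - loglik_restricted)

def chi2_1_sf(x):
    """Survival function of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# AIC values as reported in the results table; parameter counts are assumed.
lr = lr_from_aic(aic_restricted=-9889, k_restricted=2, aic_full=-9897, k_full=3)
p_value = chi2_1_sf(lr)
print(f"LR = {lr:.1f}, p = {p_value:.4f}")
```

Under these assumptions the statistic is about 10 with a p-value below 0.01, in line with the table's preference for the Student-$t$. Because the Normal sits on the boundary of the Student-$t$ family (infinite degrees of freedom), the plain chi-square(1) p-value is, if anything, conservative.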
| # | Principle | Implementation |
|---|---|---|
| 1 | No look-ahead bias | Statistical models estimated on in-sample data (2019–2022). Hedge ratio applied out-of-sample (2023–2024). Z-score uses OU parameters from in-sample fit. |
| 2 | Separation of concerns | Estimation (stats_models/) is cleanly separated from trading logic (strategy/) and backtest execution (backtest/). |
| 3 | Every test explained | Each statistical test prints its null hypothesis, p-value, and a plain-language interpretation. |
| 4 | Reproducibility | Random seeds are set. Simulated data is deterministic. The pipeline runs identically every time (modulo live Yahoo Finance data). |
| 5 | Graceful degradation | If Yahoo Finance fails, the system falls back to simulated cointegrated data, so the full pipeline still executes. |
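The no-look-ahead principle can be illustrated with a small signal generator that receives the spread mean and stationary standard deviation as fixed inputs (in practice taken from the in-sample OU fit) and never re-estimates them. The entry and stop-loss thresholds are the documented 2.0 and 3.5; closing a trade when the z-score crosses zero is an assumption, as is the function name.

```python
import numpy as np

def zscore_positions(spread, mu, stationary_sd, entry=2.0, stop=3.5):
    """Map a spread path to positions in {-1, 0, +1} using fixed z-score thresholds."""
    z = (np.asarray(spread, dtype=float) - mu) / stationary_sd
    pos = np.zeros(len(z), dtype=int)
    current = 0
    for t in range(len(z)):
        if current == 0:
            if entry <= z[t] < stop:
                current = -1                  # spread rich: short it
            elif -stop < z[t] <= -entry:
                current = 1                   # spread cheap: go long
        elif abs(z[t]) >= stop:
            current = 0                       # stop-loss
        elif current * z[t] >= 0:
            current = 0                       # z crossed the mean: close the trade
        pos[t] = current
    return z, pos

# Toy out-of-sample spread, already expressed in z-score units (mu = 0, sd = 1).
toy_spread = [0.0, 1.0, 2.5, 1.0, -0.1, -2.2, -3.6, 0.5]
z, pos = zscore_positions(toy_spread, mu=0.0, stationary_sd=1.0)
print(pos.tolist())
```

The position at index `t` is decided from information available at `t`, so a backtest should apply it to the return of the following bar.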
- Survivorship bias: GLD and SLV were chosen because they survived. A rigorous study screens a universe.
- Simplified transaction costs: Fixed 5 bps. Real costs include bid-ask spread, market impact, and short-borrow fees.
- Static hedge ratio: The backtest uses the in-sample Johansen $\beta$ out-of-sample. Rolling recalibration (monthly Johansen) would be more realistic.
- No regime detection: Cointegrating relationships can break down during structural shifts. Markov-switching or structural break tests would help.
- Linear cointegration only: Non-linear cointegration (threshold, smooth-transition) is not explored.
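To show what the fixed-cost simplification amounts to, the sketch below charges a flat number of basis points on every unit of position change and lags the position by one bar. It is a hypothetical helper, not the backtest engine's code, and it deliberately omits the spread, impact and borrow components listed above.

```python
import numpy as np

def net_returns(positions, spread_returns, cost_bps=5.0):
    """Strategy returns after a flat per-unit-turnover cost in basis points."""
    pos = np.asarray(positions, dtype=float)
    ret = np.asarray(spread_returns, dtype=float)
    held = np.concatenate(([0.0], pos[:-1]))                  # position set at t-1 earns return at t
    turnover = np.abs(np.diff(np.concatenate(([0.0], pos))))  # size of each position change
    return held * ret - turnover * cost_bps * 1e-4

net = net_returns(positions=[0, 1, 1, 0], spread_returns=[0.0, 0.01, 0.02, -0.01])
print(net)
```

In this toy path the trade is opened on the second bar and closed on the fourth, so the 5 bps charge appears twice; a richer cost model would replace the constant `cost_bps` with terms that depend on volatility, trade size or holding period.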
- Copulas: Replace linear correlation with copula-based dependence (Clayton, Gumbel, Student-$t$) for tail-dependence modelling
- Bayesian OU model: Priors on $\theta$, $\mu$, $\sigma$ with MCMC (PyMC) for full posterior inference and parameter uncertainty in trading signals
- Multi-asset stat-arb: Johansen on 3–5 assets simultaneously; trade the portfolio spread
- High-frequency: Apply to intraday data with microstructure adjustments
- Online learning: Particle filter or sequential MCMC to update OU parameters in real time
The pipeline exits with "No cointegration detected"
The Johansen test reports the data as they are: on the GLD/SLV pair (2019–2024), cointegration is often borderline or absent. The default simulated data avoids this. Alternatives: (a) use a longer sample period, (b) try a different pair, or (c) use OLS + ADF (Engle-Granger) instead.
Yahoo Finance download fails
The default simulated data path requires no internet. With --live, a download failure automatically triggers the simulated fallback (console will say Source: simulated (yfinance fallback)).
The arch library won't install
The arch package (for GARCH) may conflict with a system package of the same name. Install in a fresh virtual environment:

    python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt

Streamlit says "command not found"
Run with: python -m streamlit run dashboard/app.py
Charts look blank or crash on matplotlib error
The visualisation module uses the Agg backend (file-only, non-interactive). If you get a TclError, the charts.py module already handles this; ensure you run from the project root.
Permission denied when saving to results/
The results/ directory is auto-created by run.py. If you cloned to a read-only location, either mkdir results manually or change RESULTS_DIR in run.py.
Why does simulated data show a 100% hit rate?
The simulated OU spread is perfectly mean-reverting with no structural breaks, which makes it an idealised benchmark. Real-world pairs trades typically achieve 45–65% hit rates. The simulated data demonstrates the statistical machinery; real-data performance reflects market reality.
ModuleNotFoundError: No module named 'stats_models'
The project uses local imports. Run scripts from the project root (statsProj/), not from inside a subdirectory. Do not install stats_models from PyPI.
- Engle, R. F. & Granger, C. W. J. (1987). Co-integration and Error Correction: Representation, Estimation, and Testing. Econometrica, 55(2), 251–276.
- Johansen, S. (1988). Statistical Analysis of Cointegration Vectors. Journal of Economic Dynamics and Control, 12(2–3), 231–254.
- Kwiatkowski, D., Phillips, P. C. B., Schmidt, P., & Shin, Y. (1992). Testing the Null Hypothesis of Stationarity Against the Alternative of a Unit Root. Journal of Econometrics, 54(1–3), 159–178.
- Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics, 31(3), 307–327.
- Politis, D. N. & Romano, J. P. (1992). A Circular Block-Resampling Procedure for Stationary Data. Exploring the Limits of Bootstrap, 263–270.
- Ljung, G. M. & Box, G. E. P. (1978). On a Measure of Lack of Fit in Time Series Models. Biometrika, 65(2), 297–303.
- Chan, E. (2013). Algorithmic Trading: Winning Strategies and Their Rationale. Wiley.
- Tsay, R. S. (2010). Analysis of Financial Time Series, 3rd ed. Wiley.
- Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press.
- Embrechts, P., Klüppelberg, C., & Mikosch, T. (1997). Modelling Extremal Events. Springer.
MIT; see LICENSE.
Built for demonstration of statistical rigour, not for live trading. Every test is stated, every p-value is interpreted, no black boxes.