Building a Live Pair Trading System: Position Sizing, Risk Management, and AWS Deployment

Zhenyu Wen
#Python#Pair Trading#Alpaca#AWS#Lambda#CDK#Quantitative Finance

Pair trading is one of the more satisfying quant strategies to build — not because it's simple, but because every component touches a different discipline: statistics for pair selection, signal generation for entry/exit, execution logic for orders, and cloud infrastructure to run it all. This post walks through the live execution layer of my pair trading system, including some non-obvious bugs I hit along the way.

The System at a Glance

The system runs entirely on AWS. At a high level:

  • MarketDataStack: downloads 15-minute OHLCV bars daily and stores them in DynamoDB
  • MyTradeSelectionStack: runs monthly via Step Functions — screens cointegrated pairs over 12 months of data, backtests the top candidates over 6 months, and emails an HTML report
  • MyTradeLiveStack: triggers every 15 minutes during market hours, reads the active pairs from DynamoDB, computes z-scores, and sends orders to Alpaca

The live Lambda handler calls a DailyTrader class that orchestrates the full tick:

  1. Check market is open
  2. Fetch account equity and open positions from Alpaca
  3. Evaluate portfolio-level risk (circuit-breaker, VIX filter, max open pairs)
  4. Fetch recent 15-min bars for all pairs
  5. Compute rolling z-scores
  6. Generate entry/exit signals via a state machine
  7. Apply per-pair risk gates (stale position timeout, single-stock stop-loss)
  8. Enforce one-trade-per-day rule
  9. Submit market orders via AlpacaExecutor
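The nine steps above map naturally onto a guard-clause pipeline: each risk check either lets the tick continue or short-circuits it. A minimal sketch of that flow — the class and method names here are illustrative stand-ins, not the actual DailyTrader API:

```python
from dataclasses import dataclass, field

@dataclass
class TickResult:
    skipped: bool
    reason: str = ""
    orders: list = field(default_factory=list)

class DailyTraderSketch:
    """Illustrative skeleton of one 15-minute tick — not the real class."""

    def run_tick(self) -> TickResult:
        if not self.market_is_open():
            return TickResult(skipped=True, reason="market closed")
        equity, positions = self.fetch_account_state()
        if not self.portfolio_risk_ok(equity, positions):
            return TickResult(skipped=True, reason="portfolio risk gate")
        bars = self.fetch_recent_bars()
        zscores = self.compute_zscores(bars)
        signals = self.generate_signals(zscores)
        # Per-pair gates, then the one-trade-per-day rule
        signals = [s for s in signals if self.pair_risk_ok(s, positions)]
        signals = self.enforce_one_trade_per_day(signals)
        return TickResult(skipped=False, orders=self.submit_orders(signals))

    # Stubbed dependencies so the skeleton runs standalone
    def market_is_open(self): return True
    def fetch_account_state(self): return 100_000.0, []
    def portfolio_risk_ok(self, equity, positions): return True
    def fetch_recent_bars(self): return {}
    def compute_zscores(self, bars): return {}
    def generate_signals(self, zscores): return []
    def pair_risk_ok(self, signal, positions): return True
    def enforce_one_trade_per_day(self, signals): return signals
    def submit_orders(self, signals): return signals

result = DailyTraderSketch().run_tick()
```

The early-return shape keeps the expensive steps (bar fetches, order submission) from running at all when a portfolio-level gate trips.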

Understanding OLS Hedge Ratios

The core of pair trading is the spread:

spread = Y - β * X - c

where β is fitted by OLS regression of Y on X. When the z-score of this spread crosses ±2, we enter a position — long the spread (buy Y, short X) or short the spread (sell Y, buy X).
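Concretely, the rolling z-score is just the spread standardized over a lookback window. A sketch with synthetic cointegrated series (the window length of 60 bars is an assumption for illustration; only the ±2 threshold comes from the strategy above):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 500
# Synthetic X prices: a random walk around 100
x = pd.Series(np.cumsum(rng.normal(0, 1, n)) + 100)
beta, c = 0.5, 10.0
# Y built to be cointegrated with X: the spread is pure stationary noise
y = beta * x + c + pd.Series(rng.normal(0, 1, n))

spread = y - beta * x - c
window = 60  # e.g. 60 fifteen-minute bars — illustrative choice
zscore = (spread - spread.rolling(window).mean()) / spread.rolling(window).std()

# Entry when |z| crosses 2
long_entry = zscore < -2   # long the spread: buy Y, short X
short_entry = zscore > 2   # short the spread: sell Y, buy X
print(f"bars with |z| > 2: {int((zscore.abs() > 2).sum())}")
```

The first `window - 1` values are NaN by construction, which is why the live Lambda needs enough warm-up bars before it can emit a signal.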

The key insight about β: it measures statistical co-movement, not price parity. If MMM and ROK have β = 0.1428, that means MMM historically moves $0.14 for every $1 ROK moves. To be spread-neutral, you need:

  • Buy 1 share of MMM, short 0.1428 shares of ROK

This is different from dollar-neutral, which would be:

qty_x_dollar_neutral = round(qty_y * price_y / price_x)
# For 68 MMM @ $145 vs ROK @ $357:
# = round(68 * 145 / 357) ≈ 28 shares

The β-neutral sizing gives you 10 ROK shares (notional: $3,570) against 68 MMM shares (notional: $9,860) — a 2.76× imbalance. For most pairs this doesn't matter much, but when β is very small (like 0.14), the X leg becomes almost decorative.

The fix isn't to switch to dollar-neutral sizing — β-hedging is theoretically correct for the spread you're trading. The fix is to filter out pairs with extreme β values during selection. I added a min_beta = 0.2 guard in screen_pairs() so pairs like MMM/ROK won't be selected in future runs.
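A minimal sketch of that guard — the function shape and candidate-tuple format are illustrative; only the 0.2 threshold comes from the actual change:

```python
MIN_BETA = 0.2  # reject pairs whose hedge ratio makes the X leg negligible

def filter_by_beta(candidates, min_beta=MIN_BETA):
    """Keep only pairs whose OLS hedge ratio is large enough in magnitude.

    candidates: list of (ticker_y, ticker_x, beta) tuples (illustrative shape,
    not the real screen_pairs() data structure).
    """
    return [(y, x, b) for (y, x, b) in candidates if abs(b) >= min_beta]

candidates = [
    ("MMM", "ROK", 0.1428),  # the problem pair from above — gets filtered out
    ("KO", "PEP", 0.85),
    ("XOM", "CVX", 1.10),
]
print(filter_by_beta(candidates))
# → [('KO', 'PEP', 0.85), ('XOM', 'CVX', 1.1)]
```

Using `abs(b)` also rejects near-zero negative betas, which have the same decorative-leg problem in the other direction.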

Fixing Position Sizing: Pair-Level Capital Cap

The next issue was capital allocation. I wanted each pair to use at most 10% of total account equity — for the pair as a whole, not per leg.

The original sizing code deployed capital only against the Y leg:

# Old: 10% goes entirely to Y leg, X leg is extra
qty_y = math.floor(capital * position_fraction / price_y)
qty_x = max(1, round(qty_y * hedge_ratio))

With 10% of $100k, the old code puts 68 MMM shares ($9,860) on the Y leg alone, and the 10-share ROK hedge adds another $3,570 on top. The pair total is ~13.4% of capital, not 10%.

The fix accounts for both legs together:

# New: total notional (both legs) ≈ capital * position_fraction
qty_y = math.floor(
    capital * position_fraction / (price_y + hedge_ratio * price_x)
)
qty_x = max(1, round(qty_y * hedge_ratio))

For MMM/ROK with β=0.1428:

qty_y = floor(10_000 / (145 + 0.1428 * 357))
      = floor(10_000 / 196.0)
      = 51 shares MMM
qty_x = max(1, round(51 * 0.1428)) = 7 shares ROK

Total notional: 51 * 145 + 7 * 357 = 7,395 + 2,499 = $9,894 ≈ 10%

The position_fraction itself is set dynamically: min(1.0 / n_pairs, 0.10). So with 10 active pairs you get 10% each; with 5 pairs you get 10% each (capped), not 20%.
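Putting the pair-level cap and the dynamic fraction together — a sketch of the combined sizing logic (the function name is mine, not the production module's):

```python
import math

def size_pair(capital, n_pairs, price_y, price_x, hedge_ratio, max_fraction=0.10):
    """Size both legs so total notional ≈ capital * position_fraction.

    position_fraction = min(1/n_pairs, max_fraction), per the rule above.
    Illustrative sketch — not the actual production function.
    """
    position_fraction = min(1.0 / n_pairs, max_fraction)
    qty_y = math.floor(
        capital * position_fraction / (price_y + hedge_ratio * price_x)
    )
    qty_x = max(1, round(qty_y * hedge_ratio))
    return qty_y, qty_x

# The MMM/ROK example from above: $100k account, 10 active pairs, β = 0.1428
qty_y, qty_x = size_pair(100_000, 10, price_y=145, price_x=357, hedge_ratio=0.1428)
print(qty_y, qty_x)                    # 51 7
print(qty_y * 145 + qty_x * 357)       # 9894 — just under the $10k cap
```

With 5 pairs instead of 10, `min(1/5, 0.10)` still caps the fraction at 10%, so the quantities come out identical.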

Alpaca API: QueryOrderStatus Enum Bug

While debugging the live handler logs I hit this warning:

[WARNING] Could not fetch today's orders:
type object 'QueryOrderStatus' has no attribute 'FILLED'

The culprit: QueryOrderStatus in alpaca-py defines only OPEN, CLOSED, and ALL — there is no FILLED member, so referencing it raises exactly the AttributeError in the log. To find today's fills, request closed orders and then filter by each order's own status:

from alpaca.trading.enums import OrderStatus, QueryOrderStatus
from alpaca.trading.requests import GetOrdersRequest

# Wrong — QueryOrderStatus has no FILLED member
# request = GetOrdersRequest(status=QueryOrderStatus.FILLED)

# Correct — fetch closed orders, then keep only the fills
request = GetOrdersRequest(status=QueryOrderStatus.CLOSED)
orders = trading_client.get_orders(filter=request)
filled = [o for o in orders if o.status == OrderStatus.FILLED]

Worth noting for anyone migrating from the older alpaca-trade-api library, where list_orders(status="closed") accepted plain strings and the status vocabulary was different.

Fetching Historical Data from Polygon

For testing and backfilling, I needed a year of 15-minute OHLCV bars from Polygon.io. The official Python client has a list_aggs() method that auto-paginates — and this is exactly where it goes wrong:

# This fires dozens of HTTP requests in rapid succession
for a in client.list_aggs("X:ETHUSD", 15, "minute", "2025-03-16", "2026-03-16"):
    aggs.append(a)

Each page transition triggers a new request immediately. On Polygon's free tier (5 req/min), you burn through your allowance in seconds and get flooded with 429s. Worse, the underlying urllib3 retry logic retries each 429 automatically — which counts as more requests and digs you deeper into the hole.

The fix: bypass the client entirely and use requests directly, fetching in 2-week chunks with a 13-second sleep between each:

import time
from datetime import timedelta

import requests

API_KEY = "..."
CHUNK = timedelta(days=14)  # date window per request
SLEEP_S = 13  # free tier is 5 req/min → 12s is the floor, 13s adds margin

def get_aggs(ticker, from_date, to_date, retries=5):
    url = (
        f"https://api.polygon.io/v2/aggs/ticker/{ticker}"
        f"/range/15/minute/{from_date}/{to_date}"
    )
    params = {"adjusted": "true", "sort": "asc", "limit": 50000, "apiKey": API_KEY}
    for attempt in range(retries):
        resp = requests.get(url, params=params, timeout=30)
        if resp.status_code == 200:
            return resp.json().get("results", [])
        if resp.status_code == 429:
            wait = 60 * (attempt + 1)
            print(f"429 rate limit — waiting {wait}s ...")
            time.sleep(wait)
        else:
            resp.raise_for_status()
    raise RuntimeError(f"Failed after {retries} retries")

A year of ETH/USD 15-min data (crypto trades 24/7) comes out to ~35,000 bars across 27 chunks. Two years is ~70,000 bars across 53 chunks and takes about 12 minutes with the sleep.
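The driver then walks the date range in CHUNK-sized steps, calling get_aggs for each window and sleeping in between. A sketch of the chunk iteration — the actual fetch loop is commented out so this runs without an API key:

```python
from datetime import date, timedelta
import time

CHUNK = timedelta(days=14)
SLEEP_S = 13

def iter_chunks(start: date, end: date, step: timedelta = CHUNK):
    """Yield (from_date, to_date) pairs covering [start, end] inclusive,
    in step-sized, non-overlapping windows."""
    cur = start
    while cur <= end:
        chunk_end = min(cur + step - timedelta(days=1), end)
        yield cur, chunk_end
        cur = chunk_end + timedelta(days=1)

chunks = list(iter_chunks(date(2025, 3, 16), date(2026, 3, 16)))
print(len(chunks))  # 27 two-week windows for the one-year range

# all_bars = []
# for from_d, to_d in chunks:
#     all_bars.extend(get_aggs("X:ETHUSD", from_d.isoformat(), to_d.isoformat()))
#     time.sleep(SLEEP_S)  # stay under the 5 req/min free-tier limit
```

The inclusive end-date handling matters: Polygon's aggregates endpoint treats both bounds as inclusive, so overlapping windows would double-count boundary bars.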

CDK Deployment Gotcha: Stack Name Mismatch

When deploying via CDK I kept hitting:

No stacks match the name(s) LiveStack

The actual stack name in infra/app.py is MyTradeLiveStack — not LiveStack. Always verify with:

VIRTUAL_ENV="" uv run cdk list --app "python infra/app.py"

The VIRTUAL_ENV="" prefix is required when using uv because CDK's subprocess invocation picks up the virtual env variable and gets confused.

To deploy just the live stack:

VIRTUAL_ENV="" uv run cdk deploy MyTradeLiveStack \
  --require-approval never \
  --app "python infra/app.py"

What's Next

The system is running live on paper trading. A few things I want to tackle next:

  • Polygon as primary data source: massive.com works fine for equities, but for crypto pairs I need Polygon. The fetcher script is a start — next step is wiring it into the Lambda pipeline.
  • Backtesting with 15-min data: the current backtester uses daily bars for β fitting and 15-min bars for simulation. I want to validate whether fitting β on 15-min data directly improves signal quality.
  • Metrics dashboard: the Lambda already pushes to CloudWatch, but I want a simple daily email summary of open positions, P&L, and any risk events.

The full codebase is at github.com/RayVenn/MyTrade.