Weather Prediction Market Trading Bot
A system that estimates probability vectors over temperature settlement bins and trades the gap against Polymarket market-implied probabilities. Targets Seattle (KSEA / Sea-Tac) daily high temperature markets.
Core idea: Ingest NWP ensemble forecasts -> post-process into calibrated bin probabilities -> compare vs market prices -> trade when edge exceeds threshold.
Project Structure
weatherPred/
├── src/
│ ├── types.py # Core domain types (bins, markets, forecasts, signals)
│ ├── market/
│ │ ├── discovery.py # Gamma API market discovery & bin parsing
│ │ ├── book.py # Order book fetching & implied probabilities
│ │ ├── history.py # Historical token price data
│ │ └── execution.py # Authenticated order placement (neg-risk routing)
│ ├── weather/
│ │ ├── openmeteo.py # Open-Meteo deterministic & ensemble forecasts
│ │ ├── previous_runs.py # Archived forecasts for backtesting
│ │ ├── stations.py # Station registry (KSEA)
│ │ └── wunderground.py # Weather Underground actuals scraping
│ ├── model/
│ │ ├── bins.py # Gaussian CDF, ensemble counting, KDE bin probs
│ │ ├── parametric.py # Forecast-to-bin conversion, BMA
│ │ ├── calibration.py # Brier, CRPS, PIT, reliability diagrams
│ │ └── emos.py # EMOS post-processing (NGR)
│ ├── strategy/
│ │ ├── edge.py # Edge calculation, trade filtering, signal generation
│ │ └── sizing.py # Kelly criterion position sizing
│ ├── backtest/
│ │ └── engine.py # Event-driven backtesting with lookahead guards
│ ├── storage/
│ │ └── parquet_store.py # Parquet + DuckDB storage layer
│ └── utils/
│ └── logging.py # Logging configuration
├── notebooks/
│ ├── 01_market_exploration.ipynb # Live market bins, prices, forecasts
│ ├── 04_edge_analysis.ipynb # Edge heatmaps, signal generation
│ ├── 05_backtest.ipynb # PnL curves, calibration, bias analysis
│ └── 06_live_monitor.ipynb # Position/PnL monitoring dashboard
├── scripts/
│ └── run_strategy.py # CLI for signal generation & execution
├── tests/ # 118 unit tests
├── data/ # Parquet storage (gitignored)
├── pyproject.toml
├── .env.example
└── PLAN.md
Setup
Requirements: Python >= 3.11
# Clone and install
git clone <repo-url> && cd weatherPred
pip install -e ".[dev]"
# Configure credentials
cp .env.example .env
# Edit .env with your Polymarket API keys
Environment Variables
| Variable | Description |
|---|---|
| POLYMARKET_PRIVATE_KEY | Ethereum/Polygon private key |
| POLYMARKET_API_KEY | CLOB API key |
| POLYMARKET_API_SECRET | CLOB API secret |
| POLYMARKET_API_PASSPHRASE | CLOB API passphrase |
| MAX_TRADE_USD | Per-trade size cap (default: $10) |
| MAX_POSITION_USD | Total position cap (default: $50) |
Usage
CLI — Signal Generation & Execution
# Dry run (default): discover markets, generate signals, print them
python scripts/run_strategy.py
# Target a specific date
python scripts/run_strategy.py --date 2026-02-15
# Custom bankroll and Kelly fraction
python scripts/run_strategy.py --bankroll 500 --kelly 0.10
# Live execution (requires funded wallet + .env credentials)
python scripts/run_strategy.py --live
Notebooks
jupyter lab notebooks/
| Notebook | Purpose |
|---|---|
| 01_market_exploration | Discover bins, fetch order books & forecasts, side-by-side comparison |
| 04_edge_analysis | Per-bin edge calculation, heatmap, prospective trade signals |
| 05_backtest | Historical PnL curves, calibration metrics, reliability diagrams |
| 06_live_monitor | Real-time positions, signals, dry-run execution, risk summary |
Tests
python -m pytest tests/ -v
How It Works
1. Market Discovery
Queries the Gamma API for active weather events tagged "weather", parses bin labels ("30 to 34", "45 or above"), and extracts token IDs.
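As an illustration of the label-parsing step, here is a minimal sketch. `TempBin` and `parse_bin_label` are hypothetical names, not necessarily what `src/market/discovery.py` uses, and the "X or below" form is an assumed counterpart to the "45 or above" label quoted above.

```python
import re
from typing import NamedTuple, Optional


class TempBin(NamedTuple):
    """Inclusive integer bounds in degrees F; None marks an open-ended side."""
    low: Optional[int]
    high: Optional[int]


# Label shapes quoted in the README, plus an assumed "X or below" form.
_RANGE = re.compile(r"(-?\d+)\s*to\s*(-?\d+)")
_ABOVE = re.compile(r"(-?\d+)\s*or\s*above")
_BELOW = re.compile(r"(-?\d+)\s*or\s*below")


def parse_bin_label(label: str) -> TempBin:
    """Turn a market bin label into inclusive integer bounds."""
    text = label.strip().lower()
    if m := _RANGE.search(text):
        low, high = int(m.group(1)), int(m.group(2))
        if low > high:
            raise ValueError(f"inverted range in label: {label!r}")
        return TempBin(low, high)
    if m := _ABOVE.search(text):
        return TempBin(int(m.group(1)), None)
    if m := _BELOW.search(text):
        return TempBin(None, int(m.group(1)))
    raise ValueError(f"unrecognised bin label: {label!r}")
```

Raising on unrecognised labels, rather than guessing, keeps a changed label format from silently producing wrong bins.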
2. Forecast Ingestion
Fetches deterministic (GFS, HRRR, NBM) and ensemble (GEFS, ECMWF IFS) forecasts from Open-Meteo. Each forecast produces a mean and standard deviation for the daily high temperature.
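The README does not spell out how the mean and standard deviation are derived. For an ensemble, one plausible reading is to take each member's daily maximum and summarise across members. The sketch below assumes the hourly temperatures for the local day have already been fetched; `daily_high_stats` is a hypothetical helper name.

```python
from statistics import fmean, stdev
from typing import Sequence, Tuple


def daily_high_stats(member_hourly_temps: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return (mean, std) of the daily high across ensemble members.

    member_hourly_temps holds one sequence of hourly temperatures (deg F)
    per ensemble member, all covering the same local calendar day.
    """
    # Each member's daily high is the max of its own hourly trace.
    member_highs = [max(hours) for hours in member_hourly_temps]
    mean = fmean(member_highs)
    # The sample std needs at least two members; fall back to 0.0 otherwise.
    spread = stdev(member_highs) if len(member_highs) > 1 else 0.0
    return mean, spread
```

Raw ensemble spread is commonly too narrow, which is one reason to apply a post-processing step such as EMOS before using these values.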
3. Bin Probability Model
Converts forecasts into per-bin probabilities via:
- Gaussian CDF integration with +/-0.5 rounding correction (Weather Underground, WU, reports integer degrees)
- Ensemble member counting for empirical distributions
- KDE smoothing for smoother ensemble-based estimates
- BMA (Bayesian Model Averaging) across multiple models
- EMOS (Ensemble Model Output Statistics) for calibrated post-processing
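The first two methods in the list can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/model/bins.py`: the function names are hypothetical, bins are inclusive integer ranges with `None` for an open side, and reported temperatures are assumed to round half up.

```python
import math
from typing import Optional, Sequence


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of a normal distribution with mean mu and std sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def gaussian_bin_prob(low: Optional[int], high: Optional[int],
                      mu: float, sigma: float) -> float:
    """P(reported integer temp falls in [low, high]) under N(mu, sigma^2).

    A reported value of n corresponds to a true value in [n - 0.5, n + 0.5),
    so the integration limits are widened by 0.5 on each closed side.
    """
    upper = 1.0 if high is None else normal_cdf(high + 0.5, mu, sigma)
    lower = 0.0 if low is None else normal_cdf(low - 0.5, mu, sigma)
    return upper - lower


def ensemble_bin_prob(low: Optional[int], high: Optional[int],
                      member_highs: Sequence[float]) -> float:
    """Fraction of ensemble members whose rounded daily high lands in the bin."""
    hits = 0
    for t in member_highs:
        reported = math.floor(t + 0.5)  # round half up (assumed convention)
        if (low is None or reported >= low) and (high is None or reported <= high):
            hits += 1
    return hits / len(member_highs)
```

Because adjacent bins share the same half-degree boundary (34.5 closes "30 to 34" and opens "35 to 39"), the Gaussian probabilities over a contiguous set of bins sum to 1.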
4. Edge & Signal Generation
edge = model_prob - market_implied_prob
Signals are generated when:
- |edge| > 5% (configurable)
- Liquidity > $50 on the relevant side
- Bid-ask spread < 15%
Direction routing for neg-risk markets:
- Positive edge (model says underpriced) -> BUY YES token
- Negative edge (model says overpriced) -> BUY NO token
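A minimal sketch of the filter and routing rules above. `BinQuote`, `Signal`, and `make_signal` are hypothetical names, and the 15% spread limit is interpreted here as 0.15 in price units, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BinQuote:
    label: str
    model_prob: float     # model probability that the bin wins
    market_prob: float    # market-implied probability of YES
    liquidity_usd: float  # resting size on the side we would trade against
    spread: float         # ask minus bid, in price units (0 to 1)


@dataclass
class Signal:
    label: str
    token: str   # "YES" or "NO"
    edge: float  # signed: model_prob - market_prob


def make_signal(q: BinQuote, min_edge: float = 0.05,
                min_liquidity: float = 50.0,
                max_spread: float = 0.15) -> Optional[Signal]:
    """Return a trade signal if the quote passes all three filters, else None."""
    edge = q.model_prob - q.market_prob
    if abs(edge) <= min_edge:
        return None
    if q.liquidity_usd <= min_liquidity:
        return None
    if q.spread >= max_spread:
        return None
    # Neg-risk routing: never sell YES; buy the NO complement instead.
    token = "YES" if edge > 0 else "NO"
    return Signal(q.label, token, edge)
```

For example, a bin with a model probability of 0.40 against a market price of 0.30 yields a BUY YES signal, while 0.10 against 0.25 yields BUY NO.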
5. Position Sizing
Quarter-Kelly by default: conservative enough to handle model uncertainty while still capturing edge.
f* = (edge / odds) * 0.25
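The shorthand above leaves "odds" undefined. One standard reading, for a binary token bought at price c that pays $1 if it wins, uses net odds b = (1 - c) / c; the textbook Kelly fraction (b·p - q) / b then simplifies to (p - c) / (1 - c). The sketch below follows that reading and applies the per-trade cap from MAX_TRADE_USD. The function names are hypothetical, and `src/strategy/sizing.py` may differ.

```python
def kelly_fraction(win_prob: float, price: float, kelly_mult: float = 0.25) -> float:
    """Fractional Kelly stake, as a share of bankroll, for a $1-payout binary token."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    net_odds = (1.0 - price) / price  # profit per $1 staked if the token pays out
    full_kelly = (net_odds * win_prob - (1.0 - win_prob)) / net_odds
    # Never size a bet with non-positive expected value.
    return max(0.0, full_kelly) * kelly_mult


def stake_usd(win_prob: float, price: float, bankroll: float,
              kelly_mult: float = 0.25, max_trade_usd: float = 10.0) -> float:
    """Dollar stake: fractional Kelly on the bankroll, capped per trade."""
    return min(bankroll * kelly_fraction(win_prob, price, kelly_mult), max_trade_usd)
```

With a model probability of 0.40 and a price of 0.30, full Kelly is 0.10 / 0.70, about 14.3% of bankroll, and quarter-Kelly is about 3.6%. For a NO position, pass the NO token's price and one minus the model's bin probability.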
6. Settlement
Markets resolve against Weather Underground's reported daily high for the KSEA station page. WU hourly max can differ from official NWS daily max by 1-2 degrees F.
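To make the settlement arithmetic concrete, here is a small sketch that maps an integer reported high to its winning bin and computes the profit on a settled position. It assumes each winning share pays $1 and ignores fees; `winning_bin` and `settled_pnl` are hypothetical helper names.

```python
from typing import Optional, Sequence, Tuple

Bin = Tuple[Optional[int], Optional[int]]  # inclusive bounds; None = open side


def winning_bin(observed_high_f: int, bins: Sequence[Bin]) -> int:
    """Index of the bin containing the reported integer daily high."""
    for i, (low, high) in enumerate(bins):
        if (low is None or observed_high_f >= low) and \
           (high is None or observed_high_f <= high):
            return i
    raise ValueError(f"no bin covers {observed_high_f}")


def settled_pnl(token: str, shares: float, entry_price: float, bin_won: bool) -> float:
    """Profit in USD on a settled position (fees ignored).

    A YES share pays $1 if the bin wins; a NO share pays $1 if it does not.
    """
    pays_out = bin_won if token == "YES" else not bin_won
    payout = 1.0 if pays_out else 0.0
    return shares * (payout - entry_price)
```

For instance, 10 YES shares bought at 0.30 return a profit of about $7 if the bin wins and a loss of $3 if it does not.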
Key Design Decisions
- WU as resolution truth: market rules specify the WU station page, not NWS
- +/-0.5 bin boundary correction: WU reports integer degrees, so bin integration accounts for rounding
- Quarter-Kelly sizing: under the usual small-edge approximation, betting a fraction k of full Kelly keeps k(2 - k) of the growth rate at k^2 of the variance, so k = 0.25 retains about 44% of full-Kelly growth with roughly 1/16 of the variance
- Neg-risk token routing: "selling" a bin = buying the NO complement token
- Information-driven cadence: re-price on model updates (HRRR hourly, GFS 4x/day), not fixed intervals
- Seattle/KSEA focus: smaller market = less competition but also less liquidity; size conservatively