predict da weather
Jeron Wong 4f1cf7da1c Fix datetime.utcnow() deprecation warnings in parametric.py
Replace deprecated datetime.utcnow() with timezone-aware
datetime.now(tz=timezone.utc) per Python 3.12+ deprecation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 22:23:44 -08:00

Weather Prediction Market Trading Bot

A system that estimates probability vectors over temperature settlement bins and trades the gap against Polymarket market-implied probabilities. Targets Seattle (KSEA / Sea-Tac) daily high temperature markets.

Core idea: Ingest numerical weather prediction (NWP) ensemble forecasts -> post-process into calibrated bin probabilities -> compare vs market prices -> trade when edge exceeds threshold.

Project Structure

weatherPred/
├── src/
│   ├── types.py              # Core domain types (bins, markets, forecasts, signals)
│   ├── market/
│   │   ├── discovery.py      # Gamma API market discovery & bin parsing
│   │   ├── book.py           # Order book fetching & implied probabilities
│   │   ├── history.py        # Historical token price data
│   │   └── execution.py      # Authenticated order placement (neg-risk routing)
│   ├── weather/
│   │   ├── openmeteo.py      # Open-Meteo deterministic & ensemble forecasts
│   │   ├── previous_runs.py  # Archived forecasts for backtesting
│   │   ├── stations.py       # Station registry (KSEA)
│   │   └── wunderground.py   # Weather Underground actuals scraping
│   ├── model/
│   │   ├── bins.py           # Gaussian CDF, ensemble counting, KDE bin probs
│   │   ├── parametric.py     # Forecast-to-bin conversion, BMA
│   │   ├── calibration.py    # Brier, CRPS, PIT, reliability diagrams
│   │   └── emos.py           # EMOS post-processing (NGR)
│   ├── strategy/
│   │   ├── edge.py           # Edge calculation, trade filtering, signal generation
│   │   └── sizing.py         # Kelly criterion position sizing
│   ├── backtest/
│   │   └── engine.py         # Event-driven backtesting with lookahead guards
│   ├── storage/
│   │   └── parquet_store.py  # Parquet + DuckDB storage layer
│   └── utils/
│       └── logging.py        # Logging configuration
├── notebooks/
│   ├── 01_market_exploration.ipynb   # Live market bins, prices, forecasts
│   ├── 04_edge_analysis.ipynb        # Edge heatmaps, signal generation
│   ├── 05_backtest.ipynb             # PnL curves, calibration, bias analysis
│   └── 06_live_monitor.ipynb         # Position/PnL monitoring dashboard
├── scripts/
│   └── run_strategy.py       # CLI for signal generation & execution
├── tests/                    # 118 unit tests
├── data/                     # Parquet storage (gitignored)
├── pyproject.toml
├── .env.example
└── PLAN.md

Setup

Requirements: Python >= 3.11

# Clone and install
git clone <repo-url> && cd weatherPred
pip install -e ".[dev]"

# Configure credentials
cp .env.example .env
# Edit .env with your Polymarket API keys

Environment Variables

| Variable | Description |
| --- | --- |
| POLYMARKET_PRIVATE_KEY | Ethereum/Polygon private key |
| POLYMARKET_API_KEY | CLOB API key |
| POLYMARKET_API_SECRET | CLOB API secret |
| POLYMARKET_API_PASSPHRASE | CLOB API passphrase |
| MAX_TRADE_USD | Per-trade size cap (default: $10) |
| MAX_POSITION_USD | Total position cap (default: $50) |
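
As a rough illustration of how the two risk caps could be read and applied, here is a minimal sketch. The `RiskLimits` class and the `load_risk_limits` and `cap_trade` helpers are hypothetical names, and the way the caps are combined is an assumption. Only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskLimits:
    max_trade_usd: float      # per-trade size cap
    max_position_usd: float   # total position cap


def load_risk_limits(env=None) -> RiskLimits:
    """Read the caps from a mapping (os.environ by default), using the documented defaults."""
    env = os.environ if env is None else env
    return RiskLimits(
        max_trade_usd=float(env.get("MAX_TRADE_USD", "10")),
        max_position_usd=float(env.get("MAX_POSITION_USD", "50")),
    )


def cap_trade(size_usd: float, current_position_usd: float, limits: RiskLimits) -> float:
    """Assumed policy: clip a proposed trade to the per-trade cap and to the remaining position room."""
    room = max(0.0, limits.max_position_usd - current_position_usd)
    return max(0.0, min(size_usd, limits.max_trade_usd, room))
```

With the defaults, a proposed $25 trade would be clipped to $10, or to $5 if $45 of exposure is already open.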

Usage

CLI — Signal Generation & Execution

# Dry run (default): discover markets, generate signals, print them
python scripts/run_strategy.py

# Target a specific date
python scripts/run_strategy.py --date 2026-02-15

# Custom bankroll and Kelly fraction
python scripts/run_strategy.py --bankroll 500 --kelly 0.10

# Live execution (requires funded wallet + .env credentials)
python scripts/run_strategy.py --live

Notebooks

jupyter lab notebooks/

| Notebook | Purpose |
| --- | --- |
| 01_market_exploration | Discover bins, fetch order books & forecasts, side-by-side comparison |
| 04_edge_analysis | Per-bin edge calculation, heatmap, prospective trade signals |
| 05_backtest | Historical PnL curves, calibration metrics, reliability diagrams |
| 06_live_monitor | Real-time positions, signals, dry-run execution, risk summary |

Tests

python -m pytest tests/ -v

How It Works

1. Market Discovery

Queries the Gamma API for active weather events tagged "weather", parses bin labels ("30 to 34", "45 or above"), and extracts token IDs.
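
The label-parsing step might look roughly like the sketch below. It handles the two label shapes quoted above, plus an "or below" form that is assumed here as the natural counterpart. `TempBin` and `parse_bin_label` are hypothetical names, not necessarily what `discovery.py` uses.

```python
import math
import re
from typing import NamedTuple


class TempBin(NamedTuple):
    low: float   # inclusive lower bound in whole degrees; -inf if open-ended
    high: float  # inclusive upper bound in whole degrees; +inf if open-ended


_RANGE = re.compile(r"(-?\d+)\s*to\s*(-?\d+)")
_ABOVE = re.compile(r"(-?\d+)\s*or\s*(?:above|higher)")
_BELOW = re.compile(r"(-?\d+)\s*or\s*(?:below|lower)")  # assumed label shape


def parse_bin_label(label: str) -> TempBin:
    """Turn a label such as "30 to 34" or "45 or above" into numeric bounds."""
    text = label.strip().lower()
    if m := _RANGE.search(text):
        low, high = int(m.group(1)), int(m.group(2))
        if low > high:
            raise ValueError(f"inverted bin bounds: {label!r}")
        return TempBin(low, high)
    if m := _ABOVE.search(text):
        return TempBin(int(m.group(1)), math.inf)
    if m := _BELOW.search(text):
        return TempBin(-math.inf, int(m.group(1)))
    raise ValueError(f"unrecognized bin label: {label!r}")
```

Raising on unknown labels, rather than skipping them, keeps a changed market format from silently producing an incomplete set of bins.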

2. Forecast Ingestion

Fetches deterministic (GFS, HRRR, NBM) and ensemble (GEFS, ECMWF IFS) forecasts from Open-Meteo. Each forecast produces a mean and standard deviation for the daily high temperature.
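
For an ensemble, the mean and standard deviation can be taken across members. The sketch below assumes each member arrives as a list of hourly temperatures for the target day, and `summarize_ensemble` is a hypothetical helper. A deterministic model has a single run, so its spread has to come from elsewhere (for example, historical forecast error), which this sketch does not cover.

```python
from statistics import fmean, stdev


def summarize_ensemble(hourly_temps_by_member):
    """Return (mean, sample std) of the daily high across ensemble members.

    hourly_temps_by_member: one sequence of hourly temperatures per member,
    all covering the same local calendar day.
    """
    member_highs = [max(hours) for hours in hourly_temps_by_member]
    if len(member_highs) < 2:
        raise ValueError("need at least two members to estimate a spread")
    return fmean(member_highs), stdev(member_highs)
```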

3. Bin Probability Model

Converts forecasts into per-bin probabilities via:

  • Gaussian CDF integration with +/-0.5 rounding correction (WU reports integer degrees)
  • Ensemble member counting for empirical distributions
  • KDE (kernel density estimation) over ensemble members for smoother estimates than raw counting
  • BMA (Bayesian Model Averaging) across multiple models
  • EMOS (Ensemble Model Output Statistics) for calibrated post-processing
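
The first two methods, and the linear mixing that BMA reduces to once weights are known, can be sketched as follows. The function names are hypothetical, bins are treated as inclusive integer ranges, and the weights are assumed to be given rather than fitted. Note that Python's `round` rounds halves to even, which may not match how WU rounds.

```python
import math


def normal_cdf(x: float, mean: float, std: float) -> float:
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def gaussian_bin_prob(low: float, high: float, mean: float, std: float) -> float:
    """P(reported integer temp in [low, high]) under a Gaussian forecast.

    A reported value of k covers true temperatures in [k - 0.5, k + 0.5),
    so the bin is widened by 0.5 on each finite side before integrating.
    """
    upper = 1.0 if math.isinf(high) else normal_cdf(high + 0.5, mean, std)
    lower = 0.0 if math.isinf(low) else normal_cdf(low - 0.5, mean, std)
    return upper - lower


def ensemble_bin_prob(low: float, high: float, member_highs) -> float:
    """Fraction of ensemble members whose rounded daily high lands in the bin."""
    hits = sum(1 for t in member_highs if low <= round(t) <= high)
    return hits / len(member_highs)


def mix_models(prob_vectors, weights):
    """Weighted average of per-model bin probability vectors (weights sum to 1)."""
    n_bins = len(prob_vectors[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_vectors))
        for i in range(n_bins)
    ]
```

Because adjacent bins share the same shifted boundary (for example 34.5 between "30 to 34" and "35 to 39"), the Gaussian bin probabilities sum to 1 across a full set of bins.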

4. Edge & Signal Generation

edge = model_prob - market_implied_prob

Signals are generated when:

  • |edge| > 5 percentage points, i.e. 0.05 in probability terms (configurable)
  • Liquidity > $50 on the relevant side
  • Bid-ask spread < 15%

Direction routing for neg-risk markets:

  • Positive edge (model says underpriced) -> BUY YES token
  • Negative edge (model says overpriced) -> BUY NO token
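
A minimal sketch of the filter and routing rules above. It assumes the market-implied probability is the bid/ask midpoint of the YES token and that the spread threshold is measured in absolute price units. Both are assumptions, as are the `Signal` and `make_signal` names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signal:
    token: str   # "YES" or "NO": which token to buy
    edge: float  # model_prob - market_implied_prob for the YES outcome


def make_signal(
    model_prob: float,
    best_bid: float,
    best_ask: float,
    liquidity_usd: float,
    min_edge: float = 0.05,
    min_liquidity_usd: float = 50.0,
    max_spread: float = 0.15,
) -> Optional[Signal]:
    """Return a Signal if all three filters pass, otherwise None."""
    market_prob = (best_bid + best_ask) / 2.0  # assumption: midpoint as implied probability
    spread = best_ask - best_bid               # assumption: absolute spread in price units
    edge = model_prob - market_prob
    if abs(edge) <= min_edge:
        return None
    if liquidity_usd <= min_liquidity_usd or spread >= max_spread:
        return None
    # Positive edge: the bin looks underpriced, buy YES. Negative: buy NO.
    return Signal(token="YES" if edge > 0 else "NO", edge=edge)
```

For example, a model probability of 0.40 against a 0.28 / 0.32 book gives an edge of about +0.10 and routes to the YES token.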

5. Position Sizing

Quarter-Kelly by default: conservative enough to handle model uncertainty while still capturing edge.

f* = (edge / odds) * 0.25
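
In the classic "edge over odds" form of the Kelly criterion, "edge" is the expected net profit per dollar staked and "odds" is the net payout per dollar staked. This is a different quantity from the probability gap defined in step 4. For a binary token bought at `price` with win probability `win_prob`, the formula simplifies to `(win_prob - price) / (1 - price)`. The sketch below uses that reading; the function names are hypothetical, and whether `sizing.py` does exactly this is not stated here.

```python
def kelly_fraction(win_prob: float, price: float, kelly_multiplier: float = 0.25) -> float:
    """Fractional-Kelly share of bankroll for buying a binary token at `price`.

    The token pays $1 if it wins, so each dollar staked nets (1 - price) / price.
    """
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    odds = (1.0 - price) / price                          # net payout per dollar staked
    expected_profit = win_prob * odds - (1.0 - win_prob)  # per dollar staked
    full_kelly = expected_profit / odds                   # equals (win_prob - price) / (1 - price)
    return max(0.0, full_kelly) * kelly_multiplier        # never bet on a negative expectation


def position_size_usd(
    bankroll: float,
    win_prob: float,
    price: float,
    kelly_multiplier: float = 0.25,
    max_trade_usd: float = 10.0,
) -> float:
    """Dollar size: fractional Kelly on the bankroll, clipped to the per-trade cap."""
    return min(bankroll * kelly_fraction(win_prob, price, kelly_multiplier), max_trade_usd)
```

With a model probability of 0.40 and a price of 0.30, full Kelly is 0.10 / 0.70, about 14.3% of bankroll, and quarter-Kelly is about 3.6%. For a NO purchase, pass `1 - model_prob` as `win_prob` and the NO token's price.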

6. Settlement

Markets resolve against Weather Underground's reported daily high for the KSEA station page. WU hourly max can differ from official NWS daily max by 1-2 degrees F.

Key Design Decisions

  1. WU as resolution truth — market rules specify the WU station page, not NWS
  2. +/-0.5 bin boundary correction — WU reports integer degrees, so bin integration accounts for rounding
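  3. Quarter-Kelly sizing — cuts return variance to roughly 1/16 of full Kelly while keeping a bit under half (about 44%) of its expected growth rate, under the standard continuous-time approximation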
  4. Neg-risk token routing — "selling" a bin = buying the NO complement token
  5. Information-driven cadence — re-price on model updates (HRRR hourly, GFS 4x/day), not fixed intervals
  6. Seattle/KSEA focus — smaller market = less competition but also less liquidity; size conservatively
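
The variance/growth trade-off in item 3 can be made concrete with the standard continuous-time approximation, a textbook idealization rather than a property of this codebase. Here $\mu$ and $\sigma^2$ are the per-bet expected excess return and its variance, $f$ is the fraction of bankroll staked, and $c$ is the Kelly multiplier:

```latex
\begin{align*}
g(f) &= f\mu - \tfrac{1}{2} f^{2}\sigma^{2},
  \qquad f^{*} = \frac{\mu}{\sigma^{2}},
  \qquad g(f^{*}) = \frac{\mu^{2}}{2\sigma^{2}} \\
\frac{g(c f^{*})}{g(f^{*})} &= 2c - c^{2},
  \qquad \frac{\operatorname{Var}(c f^{*})}{\operatorname{Var}(f^{*})} = c^{2} \\
c = \tfrac{1}{4}: \quad
\frac{g}{g(f^{*})} &= \tfrac{7}{16} \approx 0.44,
  \qquad \frac{\operatorname{Var}}{\operatorname{Var}(f^{*})} = \tfrac{1}{16}
\end{align*}
```

Under this approximation, quarter-Kelly gives up a little more than half of the growth rate in exchange for a sixteen-fold reduction in variance.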