Backtest engines

Doc map: intro · vbt-pro deep dive: vbtpro-integration · LOB / tick-replay: hft-backtest · Class hierarchy: class-diagram · Worked tutorial: tutorials/first-backtest · Recipe: how-to/recipes/run-a-backtest-from-yaml.

AlphaSwarm runs every backtest through one of seven interchangeable engines behind the BaseBacktestEngine ABC. The runner, persistence, MLflow tracking, and UI never branch on which engine produced a run — every engine returns the same BacktestResult.
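The contract above can be illustrated with a small stand-in. This is a sketch under assumptions, not the real classes: the `run` method name, the reduced `BacktestResult` fields, and the toy `HoldingEngine` are all hypothetical, and the actual signatures in alphaswarm.backtest may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class BacktestResult:
    # Reduced stand-in: the real result also carries trades and orders frames.
    equity_curve: list
    summary: dict = field(default_factory=dict)


class BaseBacktestEngine(ABC):
    @abstractmethod
    def run(self, prices):  # hypothetical method name and argument
        """Run a backtest over `prices` and return a BacktestResult."""


class HoldingEngine(BaseBacktestEngine):
    """Toy buy-and-hold engine used only to illustrate the contract."""

    def __init__(self, initial_cash=100_000.0):
        self.initial_cash = initial_cash

    def run(self, prices):
        units = self.initial_cash / prices[0]  # invest everything on the first bar
        equity = [units * p for p in prices]
        summary = {"final_equity": equity[-1], "engine": type(self).__name__}
        return BacktestResult(equity, summary)


def report(engine, prices):
    # Caller code reads the shared result shape and never branches on engine type.
    result = engine.run(prices)
    return result.summary["engine"], result.summary["final_equity"]
```

Any other engine that subclasses the same base and returns the same result type can be passed to `report` unchanged, which is the property the runner, persistence, and UI rely on.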

The seven engines fall into three tiers so you can pick one without scanning a 7-row table every time:

Tier 1 — Vectorised primary (VectorbtProEngine)

Default for research workloads, parameter screens, walk-forward optimisation, factor studies, and any backtest that does not need per-bar Python.

Five constructor modes select the inner vbt-pro path:

  • signals — array-based entries / exits / sizing
  • orders — column-of-orders DataFrame
  • optimizer — built-in vbt-pro Param sweeps
  • holding — buy-and-hold baseline
  • random — random-signal baseline

Implementation: alphaswarm/backtest/vbtpro/engine.py::VectorbtProEngine. Full mode dispatch + Numba-jit constraints in vbtpro-integration.

Tier 2 — Per-bar Python loop

Two engines run a true Python on_bar callback. Use them when you need synchronous decisions inside the inner loop — agent dispatch, event-sourced LOB replay, custom callbacks vbt-pro can't represent.

  • EventDrivenBacktester — the only engine that exposes context['agents'] to strategies via AgentDispatcher, with TTL + LRU dedup of LLM calls.
  • LobBacktestEngine — hftbacktest-driven LOB tick replay; latency + queue models; market-making + execution strategies.

Tier 3 — Fallback cascade

FallbackBacktestEngine tries the primary engine first, then walks the fallbacks in order until one returns a BacktestResult. The OSS engines exist mainly as cascade fallbacks and for license-constrained deployments:

  • VectorbtEngine — OSS vectorbt; signals only (Apache-2.0).
  • BacktestingPyEngine — single-symbol with .optimize(...) grid + SAMBO (AGPL-3.0).
  • ZvtBacktestEngine — permissive-licence CN-bar fallback (MIT).
  • AatBacktestEngine — async / synthetic LOB fallback (Apache-2.0).
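The cascade behaviour can be sketched in a few lines. This is an illustration only: `EngineUnavailable`, `run_with_fallbacks`, and the two toy engine callables are hypothetical names, and which failures actually trigger a fallback in FallbackBacktestEngine is an assumption here.

```python
class EngineUnavailable(RuntimeError):
    """Hypothetical error for an engine that cannot run (e.g. missing dependency)."""


def run_with_fallbacks(primary, fallbacks, config):
    """Try primary first, then each fallback in order; return (name, result)."""
    errors = {}
    for name, engine in [primary, *fallbacks]:
        try:
            return name, engine(config)
        except EngineUnavailable as exc:
            errors[name] = str(exc)  # remember why this engine was skipped
    raise RuntimeError(f"all engines failed: {errors}")


def vbt_pro(config):
    # Toy primary that simulates a missing licensed dependency.
    raise EngineUnavailable("vectorbtpro is not installed")


def event(config):
    # Toy fallback that returns a minimal summary.
    return {"final_equity": config["initial_cash"], "engine": "event"}
```

Calling `run_with_fallbacks(("vbt-pro", vbt_pro), [("event", event)], {"initial_cash": 100000})` skips the unavailable primary and returns the result of the first fallback that succeeds.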

NautilusTrader Bridge

NautilusTrader is the primary live execution engine for AlphaSwarm, providing research-to-live parity. It also powers high-fidelity backtesting through the NautilusBacktestEngine, enabling strategies to be deployed live with no code changes.

EngineCapabilities

Every engine declares its surface via an EngineCapabilities class attribute. Agents introspect it through the engine_capabilities tool; humans can call alphaswarm.backtest.engine_capabilities_index().

Pick by capability:

  • Vectorised research / parameter screens / WFO → VectorbtProEngine
  • Per-bar agent dispatch (LLM in the loop) → EventDrivenBacktester
  • LOB tick replay, latency + queue modelling → LobBacktestEngine
  • Synthetic LOB realism (OSS path) → AatBacktestEngine
  • Chinese-market data → ZvtBacktestEngine
  • Single-symbol grid optimisation → BacktestingPyEngine with .optimize(ranges, method="grid"|"sambo", ...)
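Picking by capability amounts to filtering engine classes on declared flags. The sketch below assumes a `capabilities` class attribute and two flag names (`supports_per_bar_python`, `supports_lob_replay`) that are hypothetical; only `supports_rl_injection` is named on this page. The classes are empty stand-ins, and the flag values mirror the tier descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineCapabilities:
    # Only supports_rl_injection is named in this page; the other flags are hypothetical.
    supports_rl_injection: bool = False
    supports_per_bar_python: bool = False
    supports_lob_replay: bool = False


class VectorbtProEngine:
    capabilities = EngineCapabilities()  # Numba inner loop, no per-bar Python


class EventDrivenBacktester:
    capabilities = EngineCapabilities(supports_per_bar_python=True)


class LobBacktestEngine:
    capabilities = EngineCapabilities(supports_per_bar_python=True, supports_lob_replay=True)


ENGINES = [VectorbtProEngine, EventDrivenBacktester, LobBacktestEngine]


def engines_with(flag, engines=ENGINES):
    """Return the names of engine classes whose capabilities set `flag`."""
    return [e.__name__ for e in engines if getattr(e.capabilities, flag)]
```

For example, `engines_with("supports_lob_replay")` narrows the candidates to the LOB engine, while `engines_with("supports_per_bar_python")` returns both Tier 2 engines.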

When NOT to use the primary engine

The vbt-pro inner loop is Numba-jit compiled, so signal_func_nb / order_func_nb cannot call Python objects per bar. This rules out two patterns:

  1. Per-bar agent consults. Switch to EventDrivenBacktester and call context['agents'].consult(spec_name, inputs, ttl=...) from inside on_bar. The AgentDispatcher handles TTL + LRU dedup so the LLM gateway is not hammered.
  2. Per-bar custom Python that vbt-pro cannot express. If the inner loop needs a stateful Python object (custom risk model, bespoke order book heuristics), use event-driven.
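To make "TTL + LRU dedup" concrete, here is a minimal cache sketch. It is not the AgentDispatcher implementation: `DedupCache`, its constructor arguments, and the `call` parameter are hypothetical, and input values are assumed to be hashable.

```python
import time
from collections import OrderedDict


class DedupCache:
    """Toy TTL + LRU cache keyed by (spec_name, inputs)."""

    def __init__(self, max_entries=128, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (expires_at, value)
        self._max_entries = max_entries
        self._clock = clock

    def consult(self, spec_name, inputs, ttl, call):
        key = (spec_name, tuple(sorted(inputs.items())))
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            self._entries.move_to_end(key)  # mark as most recently used
            return hit[1]
        value = call(spec_name, inputs)  # the expensive call, e.g. an LLM request
        self._entries[key] = (now + ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return value
```

Within the TTL, identical consults from consecutive bars return the cached value without a second call; once the TTL lapses, the next consult refreshes the entry.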

If you can vectorise, or precompute a panel of decisions ahead of time, use AgenticVbtAlpha in precompute mode on the vbt-pro path. The vbt-pro mode dispatch is documented in vbtpro-integration.
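The precompute idea can be shown without any engine at all: run the slow decision function once per bar up front, then hand plain boolean arrays to the vectorised path. The sketch below is not AgenticVbtAlpha; `precompute_signals` and `momentum_rule` are hypothetical names, and the rule is a stand-in for an agent decision.

```python
def precompute_signals(closes, decide):
    """Call `decide` once per bar up front and return boolean entry/exit arrays."""
    entries, exits = [], []
    in_position = False
    for i in range(len(closes)):
        action = decide(closes[: i + 1])  # past and current bars only, no look-ahead
        entries.append(action == "buy" and not in_position)
        exits.append(action == "sell" and in_position)
        if entries[-1]:
            in_position = True
        elif exits[-1]:
            in_position = False
    return entries, exits


def momentum_rule(history):
    # Stand-in for an agent decision: compare the last close with the previous one.
    if len(history) < 2:
        return "hold"
    return "buy" if history[-1] > history[-2] else "sell"
```

The resulting arrays contain no Python objects, so a compiled inner loop can consume them directly.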

Dispatching from YAML

Three equivalent ways to pick an engine inside a strategy recipe:

# 1) Engine shortcut (cleanest).
backtest:
  engine: vbt-pro:signals # or vbt-pro:orders / :optimizer / :holding / :random
  kwargs:
    initial_cash: 100000
    fees: 0.0005

# 2) Explicit class + module.
backtest:
  class: VectorbtProEngine
  module_path: alphaswarm.backtest.vbtpro.engine
  kwargs:
    mode: orders
    initial_cash: 100000

# 3) Fallback cascade.
backtest:
  engine: fallback
  primary: vbt-pro
  fallbacks: [event, aat, zvt, vectorbt]

| Shortcut | Resolves to | Notes |
| --- | --- | --- |
| default / event / event-driven | EventDrivenBacktester | Backward-compatible default. |
| primary / vbt-pro / vectorbt-pro | VectorbtProEngine | Tier 1. |
| vbt-pro:signals / :orders / :optimizer / :holding / :random | VectorbtProEngine | Mode injection. |
| vectorbt / vbt | VectorbtEngine | OSS fallback. |
| backtesting / bt | BacktestingPyEngine | Single-symbol. |
| zvt | ZvtBacktestEngine | Lazy import; CN bars. |
| aat | AatBacktestEngine | Lazy import; async LOB. |
| hft / lob | LobBacktestEngine | Tick replay. |
| fallback / cascade | FallbackBacktestEngine | Cascade with DEFAULT_FALLBACK_CHAIN = ("event", "aat", "zvt", "vectorbt"). |

alphaswarm.backtest.runner.run_backtest_from_config routes every YAML through the right engine and stamps engine into BacktestRun.metrics.
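Shortcut resolution can be pictured as an alias lookup plus an optional `:mode` suffix. The sketch below transcribes the aliases listed on this page; `resolve_engine` is a hypothetical helper, not the runner's actual function, and returning the mode as a `mode` kwarg is an assumption based on the explicit class + module form.

```python
# Aliases transcribed from the shortcut list on this page.
_TABLE = {
    "EventDrivenBacktester": ("default", "event", "event-driven"),
    "VectorbtProEngine": ("primary", "vbt-pro", "vectorbt-pro"),
    "VectorbtEngine": ("vectorbt", "vbt"),
    "BacktestingPyEngine": ("backtesting", "bt"),
    "ZvtBacktestEngine": ("zvt",),
    "AatBacktestEngine": ("aat",),
    "LobBacktestEngine": ("hft", "lob"),
    "FallbackBacktestEngine": ("fallback", "cascade"),
}
ALIASES = {alias: cls for cls, aliases in _TABLE.items() for alias in aliases}
VBT_PRO_MODES = {"signals", "orders", "optimizer", "holding", "random"}


def resolve_engine(shortcut):
    """Split 'name[:mode]' and return (class_name, kwargs)."""
    name, _, mode = shortcut.partition(":")
    class_name = ALIASES[name.strip().lower()]  # KeyError for unknown shortcuts
    kwargs = {}
    if mode:
        if class_name != "VectorbtProEngine" or mode not in VBT_PRO_MODES:
            raise ValueError(f"unsupported mode suffix in {shortcut!r}")
        kwargs["mode"] = mode
    return class_name, kwargs
```

Under these assumptions, `resolve_engine("vbt-pro:orders")` yields the same class and `mode` kwarg as the explicit form.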

Agent + ML components

Strategies plug agents and ML models into either path:

  • Vectorised (vbt-pro) — panel components in alphaswarm/strategies/vbtpro/:
    • AgenticVbtAlpha — precompute or per-window agent dispatch into wide entries / exits / size arrays.
    • MLVbtAlpha — wraps any alphaswarm_models.base.Model (or MLflow URI) and emits arrays via threshold / top-k / rank policies.
    • AgenticOrderModel — drives Portfolio.from_orders from cached agent decisions.
  • Event-driven — context['agents'] exposes AgentDispatcher. See AgentAwareMomentumAlpha for a worked example.

For RL injection, every engine that declares EngineCapabilities.supports_rl_injection=True accepts the WeightCentricPipeline output through context['rl_agent'] (AGENTS rule 38).

Unified result shape

Every engine returns a BacktestResult with:

  • equity_curve: pd.Series indexed by timestamp.
  • trades: pd.DataFrame with timestamp, vt_symbol, side, quantity, price, commission, slippage, strategy_id.
  • orders: pd.DataFrame.
  • summary: dict — sharpe, sortino, max_drawdown, calmar, total_return, final_equity, n_bars, volatility_ann, n_trades, turnover, engine. Engine-specific keys live under vbt_*, bt_*, zvt_*, aat_*, hft_* so downstream code can light up native stats without re-running.
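A few of the summary keys can be derived directly from the equity curve. The sketch below uses a plain list instead of a pd.Series to stay dependency-free; `summarize` is a hypothetical helper, and reporting max_drawdown as a negative fraction is an assumed sign convention.

```python
def summarize(equity):
    """Compute a handful of summary keys from an equity curve (list of floats)."""
    peak = equity[0]
    max_drawdown = 0.0
    for value in equity:
        peak = max(peak, value)  # running high-water mark
        max_drawdown = min(max_drawdown, value / peak - 1.0)
    return {
        "total_return": equity[-1] / equity[0] - 1.0,
        "max_drawdown": max_drawdown,
        "final_equity": equity[-1],
        "n_bars": len(equity),
    }
```

Ratios such as sharpe and sortino additionally need a return series and an annualisation factor, which are left out here.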

Hash-locked specs + audit ledger

Every dispatched backtest writes a row to backtest_runs with experiment_id (AGENTS rule 34) and a reference to the hash-locked StrategySpec version. The same spec hash returns the same *_spec_versions row on re-dispatch; content changes always create a new version. This makes every backtest replayable.
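The "same hash, same version" behaviour can be sketched with a content hash over a canonical serialisation. The hashing scheme (SHA-256 over sorted-key JSON) and the in-memory `SpecRegistry` are assumptions for illustration; the platform's actual hashing and storage may differ.

```python
import hashlib
import json


def spec_hash(spec):
    """Hash a spec dict via canonical JSON so key order does not matter."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SpecRegistry:
    """Toy stand-in for a *_spec_versions table."""

    def __init__(self):
        self._versions = {}  # digest -> version number

    def register(self, spec):
        digest = spec_hash(spec)
        if digest not in self._versions:
            # New content always creates a new version.
            self._versions[digest] = len(self._versions) + 1
        return digest, self._versions[digest]
```

Re-registering an unchanged spec returns the existing version, while any content change produces a new digest and a new version number.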

Gold-tier output lands at alphaswarm_gold_backtests.run_<run_id> via iceberg_catalog.append_arrow with medallion_layer="gold" (AGENTS rule 3, rule 21).

Worked example: dispatch + tearsheet

Goal: dispatch a backtest, tail its WebSocket frames, list the ledger row via DataMCP, and render an equity curve in your browser.

Step 1 — dispatch

Step 2 — tail the WebSocket

curl -N http://localhost:8000/chat/stream/<task_id>

Frames arrive in the canonical {task_id, stage, message, timestamp, **extras} envelope (AGENTS rule 4). Expected stages: start → bar.processed (×N) → metrics.computed → done.
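A client can check the envelope and stage order as frames arrive. The sketch below assumes each frame is delivered as one JSON object per line, which this page does not state; `parse_frame` and `stages_in_order` are hypothetical helpers.

```python
import json

REQUIRED_KEYS = ("task_id", "stage", "message", "timestamp")
EXPECTED_ORDER = ["start", "bar.processed", "metrics.computed", "done"]


def parse_frame(line):
    """Decode one frame and check the canonical envelope keys are present."""
    frame = json.loads(line)
    missing = [key for key in REQUIRED_KEYS if key not in frame]
    if missing:
        raise ValueError(f"frame is missing keys: {missing}")
    return frame


def stages_in_order(frames):
    """Collapse consecutive repeats of a stage and compare with the expected order."""
    seen = []
    for frame in frames:
        if not seen or seen[-1] != frame["stage"]:
            seen.append(frame["stage"])
    return seen == EXPECTED_ORDER
```

Extra keys beyond the four required ones are accepted, matching the `**extras` part of the envelope.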

Step 3 — list via DataMCP

The data.backtests.list tool is the agent-safe alternative to a raw SELECT * FROM backtest_runs. From any MCP client:

curl -X POST http://localhost:8000/mcp/data/tools/data.backtests.list/invoke \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(alphaswarm-cli auth token)" \
  -d '{"limit": 5, "order_by": "started_at_desc"}'

Step 4 — equity curve in Pyodide

Render the equity curve client-side from inline sample points so the snippet stays self-contained. Replace with a fetch to /analytics/portfolio/<run_id>/equity-curve.json when running against the real platform.

Step 5 — verify

  • backtest_runs row with non-NULL sharpe, engine='VectorbtProEngine'.
  • WebSocket emitted a stage=done frame with the matching run_id.
  • alphaswarm_gold_backtests.run_<run_id> Iceberg table exists.
  • data.backtests.describe { run_id } MCP call returns the full row.

What next

Deeper reads

  • vbtpro-integration — vbt-pro mode dispatch, Numba constraints, hooks, walk-forward, Param sweeps, IndicatorFactory bridge.
  • hft-backtest — LOB engine, latency profiles, queue models, the five HFT strategies under alphaswarm/strategies/hft/.
  • strategy-lifecycle — draft → backtested → paper → live.
  • strategy-development — composer / simulation / ideation / single / batch / compare routes in the operator UI.
  • factor-research — building factor / alpha strategies.
  • ml-alpha-backtest — AlphaBacktestExperiment orchestrator + MLAlphaBacktestRun schema.
  • class-diagram — full engine class hierarchy + BacktestResult shape.
  • reference/api — the backtest tag (interactive playground).
  • reference/python — auto-generated reference for alphaswarm.backtest and alphaswarm.strategies.