Backtesting¶
Dual-engine backtesting system: PnL simulation with historical prices and paper trading on Anvil forks.
PnL Backtester¶
PnLBacktester¶
almanak.framework.backtesting.PnLBacktester
dataclass
¶
PnLBacktester(
data_provider: HistoricalDataProvider,
fee_models: dict[str, FeeModel],
slippage_models: dict[str, SlippageModel],
strategy_type: str | None = "auto",
gas_provider: GasPriceProvider | None = None,
data_config: BacktestDataConfig | None = None,
_mev_simulator: MEVSimulator | None = None,
_current_backtest_id: str = "",
_adapter: StrategyBacktestAdapter | None = None,
_detected_strategy_type: StrategyTypeHint | None = None,
_error_handler: BacktestErrorHandler | None = None,
_fallback_usage: dict[str, int] | None = None,
_gas_price_records: list[GasPriceRecord] | None = None,
)
Main PnL backtesting engine for historical strategy simulation.
The PnLBacktester simulates strategy execution against historical price data to evaluate performance. It:
- Iterates through historical market data at configured intervals
- Calls strategy.decide() with a MarketSnapshot for each time step
- Simulates intent execution with configurable fee/slippage models
- Tracks portfolio state and builds an equity curve
- Calculates comprehensive performance metrics
Attributes:
| Name | Type | Description |
|---|---|---|
| data_provider | HistoricalDataProvider | Historical data provider (e.g., CoinGeckoDataProvider) |
| fee_models | dict[str, FeeModel] | Dict mapping protocol -> FeeModel (or "default" for all) |
| slippage_models | dict[str, SlippageModel] | Dict mapping protocol -> SlippageModel |
| gas_provider | GasPriceProvider \| None | Optional gas price provider for historical gas prices. When provided and config.use_historical_gas_gwei=True, the engine will fetch historical gas prices at each simulation timestamp. |
| mev_simulator | MEVSimulator \| None | Optional MEV simulator (created dynamically based on config) |
| strategy_type | str \| None | Optional explicit strategy type for adapter selection. If "auto" (default), the type is detected from strategy metadata. Valid values: "lp", "perp", "lending", "arbitrage", "swap", "auto", or None. |
| data_config | BacktestDataConfig \| None | Optional BacktestDataConfig for controlling historical data providers in adapters. When provided, adapters will use historical volume, funding rates, and APY data from real sources instead of fallback values. |
Example
backtester = PnLBacktester(
    data_provider=CoinGeckoDataProvider(),
    fee_models={"default": DefaultFeeModel()},
    slippage_models={"default": DefaultSlippageModel()},
)
result = await backtester.backtest(my_strategy, config)
print(result.summary())

# With explicit strategy type:
backtester = PnLBacktester(
    data_provider=CoinGeckoDataProvider(),
    fee_models={"default": DefaultFeeModel()},
    slippage_models={"default": DefaultSlippageModel()},
    strategy_type="lp",  # Force LP adapter
)

# With BacktestDataConfig for historical data:
from almanak.framework.backtesting.config import BacktestDataConfig

data_config = BacktestDataConfig(
    use_historical_volume=True,
    use_historical_funding=True,
    use_historical_apy=True,
)
backtester = PnLBacktester(
    data_provider=CoinGeckoDataProvider(),
    fee_models={"default": DefaultFeeModel()},
    slippage_models={"default": DefaultSlippageModel()},
    data_config=data_config,
)
gas_provider
class-attribute
instance-attribute
¶
Optional gas price provider for historical gas prices.
When provided and config.use_historical_gas_gwei=True, the engine will fetch historical gas prices at each simulation timestamp instead of using the static config.gas_price_gwei value.
Example
from almanak.framework.backtesting.pnl.providers import EtherscanGasPriceProvider

gas_provider = EtherscanGasPriceProvider(
    api_keys={"ethereum": "your-key"},
)
backtester = PnLBacktester(
    data_provider=data_provider,
    fee_models=fee_models,
    slippage_models=slippage_models,
    gas_provider=gas_provider,
)
data_config
class-attribute
instance-attribute
¶
Optional BacktestDataConfig for controlling historical data providers.
When provided, this configuration is passed to strategy-specific adapters (LP, Perp, Lending) to control historical data provider behavior:
- use_historical_volume: Fetch LP fee data from subgraphs
- use_historical_funding: Fetch perp funding rates from APIs
- use_historical_apy: Fetch lending APY from subgraphs
- strict_historical_mode: Fail if historical data unavailable
- Fallback values for when historical data is unavailable
- Rate limiting configuration for CoinGecko and The Graph
- Cache settings for persistent data storage
Example
from almanak.framework.backtesting.config import BacktestDataConfig

data_config = BacktestDataConfig(
    use_historical_volume=True,
    use_historical_funding=True,
    use_historical_apy=True,
    strict_historical_mode=False,
)
backtester = PnLBacktester(
    data_provider=data_provider,
    fee_models=fee_models,
    slippage_models=slippage_models,
    data_config=data_config,
)
run_preflight_validation
async
¶
Run preflight validation checks before starting a backtest.
Performs validation checks to ensure data requirements can be met:
- Checks price data availability for all tokens in config
- Verifies data provider capabilities match requirements
- Tests archive node accessibility if historical TWAP/Chainlink needed
- Estimates data coverage based on provider capabilities
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | PnLBacktestConfig | Backtest configuration specifying tokens, time range, etc. | required |
Returns:
| Type | Description |
|---|---|
| PreflightReport | PreflightReport with pass/fail status and detailed check results. |
Example
preflight = await backtester.run_preflight_validation(config)
if not preflight.passed:
    print(preflight.summary())
    # Handle validation failure
else:
    result = await backtester.backtest(strategy, config)
backtest
async
¶
Run a backtest for a strategy over the configured period.
This method:
1. Initializes a simulated portfolio with initial capital
2. Creates a HistoricalDataConfig from the backtest config
3. Iterates through historical market states
4. For each time step:
   a. Creates a MarketSnapshot from MarketState
   b. Calls strategy.decide(snapshot) to get intent
   c. Queues intent for execution (with inclusion delay)
   d. Executes queued intents
   e. Marks portfolio to market
5. Calculates final metrics and returns BacktestResult
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strategy | BacktestableStrategy | Strategy to backtest (must implement BacktestableStrategy) | required |
| config | PnLBacktestConfig | Backtest configuration (time range, capital, models, etc.) | required |
Returns:
| Type | Description |
|---|---|
| BacktestResult | BacktestResult with metrics, trades, and equity curve |
Raises:
| Type | Description |
|---|---|
| ValueError | If strategy is not compatible with backtesting |
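The loop that backtest() describes can be illustrated with a small, self-contained toy. This is a sketch under stated assumptions, not the engine's implementation: `run_ticks`, the `decide` callable, and the price list are hypothetical stand-ins, fees and slippage are ignored, and the inclusion delay is modeled as a whole number of ticks.

```python
from collections import deque
from decimal import Decimal


def run_ticks(prices, decide, initial_cash, delay_ticks=1):
    """Toy tick loop: decide, queue with a delay, execute, mark to market."""
    cash, units = Decimal(initial_cash), Decimal("0")
    queue = deque()  # (execute_at_tick, intent) pairs, in submission order
    equity_curve = []
    for tick, price in enumerate(prices):
        intent = decide(tick, price)  # hypothetical: ("buy", qty), ("sell", qty) or None
        if intent is not None:
            queue.append((tick + delay_ticks, intent))
        # Execute every queued intent whose delay has elapsed.
        while queue and queue[0][0] <= tick:
            _, (side, qty) = queue.popleft()
            sign = Decimal("1") if side == "buy" else Decimal("-1")
            units += sign * qty
            cash -= sign * qty * price
        equity_curve.append(cash + units * price)  # mark portfolio to market
    return equity_curve


prices = [Decimal("100"), Decimal("110"), Decimal("120")]
curve = run_ticks(
    prices,
    lambda tick, price: ("buy", Decimal("1")) if tick == 0 else None,
    "1000",
)
print(curve)
```

Because of the one-tick delay, the buy decided at the first price fills at the second price, which is why a nonzero inclusion delay changes the resulting equity curve.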
get_fee_model
¶
Get the fee model for a protocol.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| protocol | str | Protocol name (e.g., "uniswap_v3", "aave_v3") | required |
Returns:
| Type | Description |
|---|---|
| FeeModel | FeeModel for the protocol, or default if not found |
get_slippage_model
¶
Get the slippage model for a protocol.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| protocol | str | Protocol name (e.g., "uniswap_v3", "gmx") | required |
Returns:
| Type | Description |
|---|---|
| SlippageModel | SlippageModel for the protocol, or default if not found |
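Both getters describe the same lookup: use the protocol-specific entry when it exists, otherwise the "default" entry. A minimal sketch of that pattern follows; `lookup_model` is a hypothetical helper, the string values stand in for real model objects, and raising KeyError when no "default" is registered is an assumption of this sketch rather than documented behavior.

```python
def lookup_model(models, protocol):
    """Return the protocol-specific model, else the "default" entry."""
    if protocol in models:
        return models[protocol]
    return models["default"]  # assumption: KeyError if no default is registered


fee_models = {"default": "generic-fee-model", "uniswap_v3": "uniswap-fee-model"}
print(lookup_model(fee_models, "uniswap_v3"))
print(lookup_model(fee_models, "aave_v3"))
```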
PnLBacktestConfig¶
almanak.framework.backtesting.PnLBacktestConfig
dataclass
¶
PnLBacktestConfig(
start_time: datetime,
end_time: datetime,
interval_seconds: int = 3600,
initial_capital_usd: Decimal = Decimal("10000"),
fee_model: str = "realistic",
slippage_model: str = "realistic",
include_gas_costs: bool = True,
gas_price_gwei: Decimal = Decimal("30"),
inclusion_delay_blocks: int = 1,
chain: str = "arbitrum",
tokens: list[str] = (lambda: ["WETH", "USDC"])(),
benchmark_token: str = "WETH",
risk_free_rate: Decimal = Decimal("0.05"),
trading_days_per_year: int = 365,
initial_margin_ratio: Decimal = Decimal("0.1"),
maintenance_margin_ratio: Decimal = Decimal("0.05"),
mev_simulation_enabled: bool = False,
auto_correct_positions: bool = False,
reconciliation_alert_threshold_pct: Decimal = Decimal(
"0.05"
),
random_seed: int | None = None,
strict_reproducibility: bool = False,
staleness_threshold_seconds: int = 3600,
institutional_mode: bool = False,
min_data_coverage: Decimal = Decimal("0.98"),
allow_hardcoded_fallback: bool = False,
allow_degraded_data: bool = True,
require_symbol_mapping: bool = False,
use_historical_gas_prices: bool = False,
gas_eth_price_override: Decimal | None = None,
use_historical_gas_gwei: bool = False,
track_gas_prices: bool = False,
preflight_validation: bool = True,
fail_on_preflight_error: bool = True,
)
Configuration for a PnL backtest simulation.
Controls all parameters of the backtest including time range, initial capital, fee/slippage models, gas costs, and execution delay simulation.
Attributes:
| Name | Type | Description |
|---|---|---|
| start_time | datetime | Start of the backtest period (inclusive) |
| end_time | datetime | End of the backtest period (inclusive) |
| interval_seconds | int | Time between simulation ticks in seconds (default: 3600 = 1 hour) |
| initial_capital_usd | Decimal | Starting capital in USD |
| fee_model | str | Fee model to use - 'realistic', 'zero', or protocol-specific (e.g., 'uniswap_v3', 'aave_v3', 'gmx') |
| slippage_model | str | Slippage model to use - 'realistic', 'zero', or protocol-specific (e.g., 'liquidity_aware', 'constant') |
| include_gas_costs | bool | Whether to include gas costs in PnL calculations |
| gas_price_gwei | Decimal | Gas price to use for cost calculations (default: 30 gwei) |
| inclusion_delay_blocks | int | Number of blocks to delay intent execution to simulate realistic trade timing (default: 1). When > 0, intents are queued and executed in the next iteration(s) rather than immediately. |
| chain | str | Blockchain to simulate execution on (default: 'arbitrum') |
| tokens | list[str] | List of tokens to track prices for (default: ['WETH', 'USDC']) |
| benchmark_token | str | Token to use for benchmark comparisons (default: 'WETH') |
| risk_free_rate | Decimal | Annual risk-free rate for Sharpe ratio calculation (default: 0.05) |
| trading_days_per_year | int | Number of trading days for annualization (default: 365) |
| initial_margin_ratio | Decimal | Initial margin ratio for opening perp positions (default: 0.1 = 10%) |
| maintenance_margin_ratio | Decimal | Maintenance margin ratio for liquidation (default: 0.05 = 5%) |
| mev_simulation_enabled | bool | Enable MEV cost simulation for realistic execution costs (default: False) |
| auto_correct_positions | bool | Enable auto-correction of tracked positions when discrepancies are detected |
| reconciliation_alert_threshold_pct | Decimal | Threshold percentage for triggering reconciliation alerts (default: 5%) |
Example
from datetime import datetime
from decimal import Decimal

config = PnLBacktestConfig(
    start_time=datetime(2024, 1, 1),
    end_time=datetime(2024, 6, 1),
    initial_capital_usd=Decimal("10000"),
)
print(f"Duration: {config.duration_days:.1f} days")
print(f"Estimated ticks: {config.estimated_ticks}")
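The arithmetic behind a duration and tick estimate for this time range can be worked through with the standard library alone. This is only an illustration: how duration_days and estimated_ticks are actually computed (for example, whether both endpoints count as ticks) is not stated here, so the sketch just counts whole intervals.

```python
from datetime import datetime

start, end = datetime(2024, 1, 1), datetime(2024, 6, 1)
interval_seconds = 3600  # the documented default: one tick per hour

duration_seconds = (end - start).total_seconds()
duration_days = duration_seconds / 86400
whole_intervals = int(duration_seconds // interval_seconds)

print(f"Duration: {duration_days:.1f} days")
print(f"Whole intervals in range: {whole_intervals}")
```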
initial_margin_ratio
class-attribute
instance-attribute
¶
Initial margin ratio required to open a position (default: 0.1 = 10%). This is the minimum margin/position_value ratio required to open a new perp position.
maintenance_margin_ratio
class-attribute
instance-attribute
¶
Maintenance margin ratio for liquidation threshold (default: 0.05 = 5%). When margin/position_value falls below this, the position may be liquidated.
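The two margin thresholds can be read as simple comparisons on margin/position_value. A minimal sketch with the documented defaults follows; the helper names are hypothetical, and treating "falls below" as a strict inequality is an assumption.

```python
from decimal import Decimal

INITIAL_MARGIN_RATIO = Decimal("0.1")       # documented default: 10%
MAINTENANCE_MARGIN_RATIO = Decimal("0.05")  # documented default: 5%


def margin_ratio(margin_usd, position_value_usd):
    """Margin as a fraction of the position's notional value."""
    return margin_usd / position_value_usd


def can_open(margin_usd, position_value_usd):
    """A new position needs at least the initial margin ratio."""
    return margin_ratio(margin_usd, position_value_usd) >= INITIAL_MARGIN_RATIO


def may_be_liquidated(margin_usd, position_value_usd):
    """A position is at risk once its ratio drops below maintenance."""
    return margin_ratio(margin_usd, position_value_usd) < MAINTENANCE_MARGIN_RATIO


opens = can_open(Decimal("1000"), Decimal("10000"))          # 10% margin
at_risk = may_be_liquidated(Decimal("400"), Decimal("10000"))  # 4% margin
print(opens, at_risk)
```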
mev_simulation_enabled
class-attribute
instance-attribute
¶
Enable MEV (Maximal Extractable Value) cost simulation (default: False). When enabled, simulates sandwich attack probability and additional slippage based on trade size and token characteristics. Adds estimated MEV costs to trade records and total MEV cost to backtest metrics.
auto_correct_positions
class-attribute
instance-attribute
¶
Enable automatic position correction when discrepancies are detected (default: False). When enabled, the reconciliation process will update tracked positions to match actual on-chain state when discrepancies exceed the alert threshold. Corrected positions will have auto_corrected=True in their ReconciliationEvent.
reconciliation_alert_threshold_pct
class-attribute
instance-attribute
¶
Threshold percentage for triggering reconciliation alerts (default: 5%). When discrepancy_pct exceeds this threshold, an alert is emitted. Set to 0 to alert on any discrepancy, or higher values to only alert on significant drift.
random_seed
class-attribute
instance-attribute
¶
Random seed for reproducibility (default: None = no seed). When set, any randomness in the backtest (e.g., Monte Carlo simulations, random sampling) will use this seed for reproducibility. The seed is recorded in the config for re-running identical backtests.
strict_reproducibility
class-attribute
instance-attribute
¶
Enforce strict reproducibility mode (default: False).
When enabled, the backtest will raise errors instead of using fallbacks that could produce non-deterministic results:
- Raises ValueError if simulation timestamp is missing (instead of using datetime.now())
- Raises ValueError if required historical data is unavailable
- Requires all price sources to provide historical data, not just current prices
Use this mode when you need byte-identical results across multiple runs with the same configuration and random_seed. When disabled, the backtester will use reasonable defaults and log warnings instead of failing.
staleness_threshold_seconds
class-attribute
instance-attribute
¶
Threshold in seconds for marking price data as stale (default: 3600 = 1 hour).
Price data older than this threshold relative to the simulation timestamp will be counted as stale in the data quality report. This helps identify backtests that may be using outdated price information.
Set to 0 to disable staleness tracking.
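The staleness rule amounts to comparing a price's age, measured against the simulation timestamp, with the threshold. A minimal sketch, assuming a hypothetical `is_stale` helper and naive datetimes:

```python
from datetime import datetime, timedelta


def is_stale(price_time, simulation_time, threshold_seconds=3600):
    """Flag a price older than the threshold; a threshold of 0 disables the check."""
    if threshold_seconds == 0:
        return False
    return (simulation_time - price_time).total_seconds() > threshold_seconds


now = datetime(2024, 1, 1, 12, 0)
fresh = is_stale(now - timedelta(minutes=30), now)
old = is_stale(now - timedelta(hours=2), now)
print(fresh, old)
```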
institutional_mode
class-attribute
instance-attribute
¶
Enable institutional-grade enforcement mode (default: False).
When enabled, applies stricter data quality requirements suitable for institutional trading operations:
- Fails backtest if data coverage is below min_data_coverage threshold
- Disables hardcoded price fallbacks (allow_hardcoded_fallback=False)
- Requires historical price data from verified sources
- Enforces strict reproducibility (strict_reproducibility=True)
This mode is designed for production-grade backtests where data quality and reproducibility are critical. Use for institutional trading strategies or when accurate PnL calculations are required for compliance/reporting.
min_data_coverage
class-attribute
instance-attribute
¶
Minimum data coverage ratio required in institutional mode (default: 0.98 = 98%).
When institutional_mode is enabled, the backtest will fail if the actual data coverage ratio (successful price lookups / total lookups) falls below this threshold.
When institutional_mode is disabled, this threshold is only used for warnings in the data quality report, not enforcement.
Valid range: 0.0 to 1.0 (0% to 100%)
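The coverage check described above reduces to a ratio and a comparison. The sketch below is illustrative only: `coverage_ratio` and `check_coverage` are hypothetical helpers, the string results stand in for the real failure and warning paths, and treating zero lookups as full coverage is an assumption.

```python
from decimal import Decimal


def coverage_ratio(successful_lookups, total_lookups):
    """Successful price lookups divided by total lookups."""
    if total_lookups == 0:
        return Decimal("1")  # assumption: no lookups means nothing was missed
    return Decimal(successful_lookups) / Decimal(total_lookups)


def check_coverage(successful, total, min_coverage=Decimal("0.98"), institutional_mode=False):
    """Return "ok", "warn" (report only) or "fail" (institutional mode)."""
    if coverage_ratio(successful, total) >= min_coverage:
        return "ok"
    return "fail" if institutional_mode else "warn"


print(check_coverage(990, 1000))
print(check_coverage(970, 1000))
print(check_coverage(970, 1000, institutional_mode=True))
```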
allow_hardcoded_fallback
class-attribute
instance-attribute
¶
Allow hardcoded price fallbacks when price data is unavailable (default: False).
When disabled (default): The backtester will raise an error if it cannot find price data, ensuring that all valuations use actual market prices. This is the institutional-grade setting for production backtests.
When enabled: The backtester may use hardcoded fallback prices for tokens when historical price data is unavailable. This can mask data quality issues and should only be used for development/testing where price accuracy is not critical.
Note: This is automatically set to False when institutional_mode=True in __post_init__, as institutional-grade backtests should never use arbitrary hardcoded prices.
Environment variable: Set ALMANAK_ALLOW_HARDCODED_PRICES=1 to override for testing scenarios where you need relaxed defaults.
allow_degraded_data
class-attribute
instance-attribute
¶
Allow backtests to proceed with degraded or incomplete data (default: True).
When enabled, the backtester will continue execution even when:
- Some price data is missing or interpolated
- Data sources return stale information
- Historical data has gaps
When disabled, the backtester will fail fast if data quality issues are detected, ensuring only high-quality data is used for analysis.
Note: This is automatically set to False when institutional_mode=True in __post_init__, as institutional-grade backtests require complete data.
require_symbol_mapping
class-attribute
instance-attribute
¶
Require all token addresses to be resolved to symbols (default: False).
When enabled, the backtester will fail if any token address cannot be resolved to a human-readable symbol. This ensures all trade records and reports use consistent, recognizable token names.
When disabled, unresolved token addresses are used as-is (checksummed), which may make reports harder to read and audit.
Note: This is automatically set to True when institutional_mode=True in __post_init__, as institutional-grade backtests require clear symbol identification for compliance and reporting purposes.
use_historical_gas_prices
class-attribute
instance-attribute
¶
Use historical gas prices for accurate gas cost simulation (default: False).
When enabled, the backtester will attempt to fetch historical ETH prices at each simulation timestamp to calculate gas costs more accurately. This provides realistic gas cost estimates that reflect market conditions at the time of simulated trades.
When disabled, gas costs use the current ETH price or gas_eth_price_override if specified. This is faster but less accurate for historical backtests.
Note: Requires a data provider that supports historical price lookups.
gas_eth_price_override
class-attribute
instance-attribute
¶
Override ETH price for gas cost calculations (default: None = use market price).
When set, this value is used as the ETH price for all gas cost calculations, ignoring both historical and current market prices. This is useful for:
- Testing with a fixed ETH price for reproducibility
- Stress testing with extreme ETH price scenarios
- Backtests where gas cost accuracy is not critical
When None, gas costs use:
1. Historical ETH price (if use_historical_gas_prices=True)
2. Current ETH price from data provider
3. Default fallback ($3000) with warning if unavailable
Value should be in USD (e.g., Decimal("3000") for $3000 per ETH).
use_historical_gas_gwei
class-attribute
instance-attribute
¶
Use historical gas prices (gwei) from gas price provider (default: False).
When enabled and a gas_provider is attached to the PnLBacktester, the engine will fetch historical gas prices at each simulation timestamp instead of using the static gas_price_gwei value. This provides more realistic gas cost estimates that reflect network congestion at historical timestamps.
Priority order for gas price (gwei):
1. Historical gas price from gas_provider (if use_historical_gas_gwei=True)
2. MarketState.gas_price_gwei (if populated by data provider)
3. config.gas_price_gwei (static default: 30 gwei)
When disabled, gas costs use the static gas_price_gwei for all trades, which is faster but may not reflect actual network conditions.
Note: Requires a GasPriceProvider (e.g., EtherscanGasPriceProvider) to be passed to the PnLBacktester. If enabled without a provider, falls back to MarketState.gas_price_gwei or config.gas_price_gwei with a warning.
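The priority order for the gas price can be expressed as a short fallback chain. This is a sketch, not the engine's code: `resolve_gas_price_gwei` and its source labels are hypothetical, and the warning emitted on fallback is omitted.

```python
from decimal import Decimal


def resolve_gas_price_gwei(config_gwei, market_state_gwei=None,
                           historical_gwei=None, use_historical=False):
    """Pick a gas price (gwei) and report which source supplied it."""
    if use_historical and historical_gwei is not None:
        return historical_gwei, "historical"
    if market_state_gwei is not None:
        return market_state_gwei, "market_state"
    return config_gwei, "config"


static = Decimal("30")
print(resolve_gas_price_gwei(static, historical_gwei=Decimal("42"), use_historical=True))
print(resolve_gas_price_gwei(static, market_state_gwei=Decimal("25"), use_historical=True))
print(resolve_gas_price_gwei(static))
```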
track_gas_prices
class-attribute
instance-attribute
¶
Track detailed gas price records for each trade (default: False).
When enabled, the backtester records a GasPriceRecord for each trade, capturing the gas price in gwei, source, and USD cost. These records are stored in BacktestResult.gas_prices_used for detailed analysis.
This is useful for:
- Analyzing gas price volatility impact on strategy performance
- Understanding gas cost breakdown by source (historical vs config)
- Auditing gas costs in institutional-grade backtests
When disabled, only summary statistics (gas_price_summary) are populated from the TradeRecord.gas_price_gwei values, reducing result size.
Note: Gas price summary statistics are always calculated regardless of this setting, since TradeRecord already contains gas_price_gwei.
preflight_validation
class-attribute
instance-attribute
¶
Enable preflight validation before running backtest (default: True).
When enabled, the backtester performs validation checks before starting the simulation to ensure data requirements can be met:
- Checks price data availability for all tokens in config
- Verifies data provider capabilities match requirements
- Tests archive node accessibility if historical TWAP/Chainlink needed
- Reports estimated data coverage and potential gaps
Results are returned in a PreflightReport with pass/fail and details. This helps identify data issues early, before spending time on a backtest that would fail or produce inaccurate results.
When disabled, the backtest proceeds without validation, which is faster but may encounter data issues during simulation.
fail_on_preflight_error
class-attribute
instance-attribute
¶
Fail fast if preflight validation fails (default: True).
When enabled (True): If preflight validation detects critical issues (e.g., missing price data, insufficient data coverage), the backtester raises PreflightValidationError with an actionable error message that includes:
- What failed (specific checks that did not pass)
- Why it failed (the underlying cause)
- How to fix it (recommendations for resolution)
When disabled (False): The backtester logs warnings about preflight issues but continues in degraded mode. This is useful for exploratory backtests where you want to see partial results even with data gaps.
The preflight_passed field in BacktestResult indicates whether preflight validation passed, regardless of this setting.
Note: This setting only applies when preflight_validation=True.
get_gas_cost_usd
¶
Calculate gas cost in USD for a given amount of gas used.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gas_used | int | Amount of gas consumed by the transaction | required |
| eth_price_usd | Decimal | Current ETH price in USD | required |
Returns:
| Type | Description |
|---|---|
| Decimal | Gas cost in USD |
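The standard conversion from gas units to USD is gas_used × gas price (gwei) × 10⁻⁹ ETH per gwei × ETH price. The sketch below works through that arithmetic; it is an assumption that get_gas_cost_usd follows this formula using config.gas_price_gwei, and the standalone `gas_cost_usd` function here takes the gas price as an explicit argument instead.

```python
from decimal import Decimal

GWEI_PER_ETH = Decimal("1000000000")  # 1 ETH = 10^9 gwei


def gas_cost_usd(gas_used, gas_price_gwei, eth_price_usd):
    """Convert gas units to USD via the gas price and the ETH price."""
    gas_cost_eth = Decimal(gas_used) * gas_price_gwei / GWEI_PER_ETH
    return gas_cost_eth * eth_price_usd


# 150,000 gas at 30 gwei with ETH at $3000
cost = gas_cost_usd(150_000, Decimal("30"), Decimal("3000"))
print(cost)
```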
to_dict_with_metadata
¶
Serialize to dictionary with full metadata for reproducibility.
This method extends to_dict() to include additional metadata needed to reproduce a backtest exactly, such as:
- Data provider versions and timestamps
- SDK/framework versions
- Run timestamp
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_provider_info | dict[str, Any] \| None | Optional dict containing data provider information: name (provider name, e.g., "coingecko", "chainlink"); version (provider version if available); data_fetched_at (ISO timestamp when data was fetched); cache_hit_rate (optional cache hit rate percentage); additional provider-specific metadata | None |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Dictionary with full config and metadata for reproducibility |
calculate_config_hash
¶
Calculate a deterministic hash of the configuration for verification.
The hash is calculated from all configuration parameters that affect backtest results, excluding runtime metadata like timestamps. This enables verification that a backtest was run with identical config.
The hash uses SHA-256 and includes:
- Time range (start_time, end_time, interval_seconds)
- Capital settings (initial_capital_usd)
- Model settings (fee_model, slippage_model)
- Gas settings (include_gas_costs, gas_price_gwei)
- Execution settings (inclusion_delay_blocks)
- Chain and token settings
- Metrics settings (benchmark_token, risk_free_rate, etc.)
- Margin settings
- Other simulation parameters
Returns:
| Type | Description |
|---|---|
| str | 64-character hex string (SHA-256 hash) |
Example
config = PnLBacktestConfig(...)
hash1 = config.calculate_config_hash()

# Same config produces same hash
config2 = PnLBacktestConfig(...)  # identical params
hash2 = config2.calculate_config_hash()
assert hash1 == hash2
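One common way to get a deterministic configuration hash is to serialize the relevant fields to canonical JSON (sorted keys, fixed separators) and hash the bytes with SHA-256. The sketch below shows that general technique; it is not necessarily how calculate_config_hash is implemented, and `config_hash` plus the sample dictionaries are hypothetical.

```python
import hashlib
import json
from datetime import datetime
from decimal import Decimal


def config_hash(params):
    """SHA-256 over a canonical JSON encoding of the parameters."""
    def encode(value):
        # json cannot serialize these types natively; use stable string forms.
        if isinstance(value, datetime):
            return value.isoformat()
        if isinstance(value, Decimal):
            return str(value)
        raise TypeError(f"Unsupported type: {type(value).__name__}")

    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"), default=encode)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"start_time": datetime(2024, 1, 1), "gas_price_gwei": Decimal("30"), "chain": "arbitrum"}
b = {"chain": "arbitrum", "gas_price_gwei": Decimal("30"), "start_time": datetime(2024, 1, 1)}
print(config_hash(a) == config_hash(b))
```

Sorting keys makes the digest independent of field order, so two configurations with identical values always hash the same, while any changed value produces a different digest.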
from_dict
classmethod
¶
Deserialize from dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[str, Any] | Dictionary containing config fields | required |
Returns:
| Type | Description |
|---|---|
| PnLBacktestConfig | PnLBacktestConfig instance |
Paper Trader¶
PaperTrader¶
almanak.framework.backtesting.PaperTrader
dataclass
¶
PaperTrader(
fork_manager: RollingForkManager,
portfolio_tracker: PaperPortfolioTracker,
config: PaperTraderConfig,
event_callback: PaperTradeEventCallback | None = None,
)
Main paper trading engine for fork-based strategy simulation.
The PaperTrader executes strategy decisions on local Anvil forks, providing accurate simulation of real DeFi execution. It:
- Manages fork lifecycle via RollingForkManager
- Calls strategy.decide() at configured intervals
- Compiles intents to ActionBundles
- Executes transactions on the fork via ExecutionOrchestrator
- Tracks portfolio state and records trades
- Calculates comprehensive performance metrics
Attributes:
| Name | Type | Description |
|---|---|---|
| fork_manager | RollingForkManager | RollingForkManager for Anvil fork lifecycle |
| portfolio_tracker | PaperPortfolioTracker | PaperPortfolioTracker for state tracking |
| config | PaperTraderConfig | PaperTraderConfig with execution parameters |
| event_callback | PaperTradeEventCallback \| None | Optional callback for trading events |
Example
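trader = PaperTrader(
    fork_manager=fork_manager,
    portfolio_tracker=portfolio_tracker,
    # chain, rpc_url, and strategy_id have no defaults and must be supplied
    config=PaperTraderConfig(
        chain="arbitrum",
        rpc_url="https://arb1.arbitrum.io/rpc",
        strategy_id="momentum_v1",
        tick_interval_seconds=60,
    ),
)

# Run for 1 hour
result = await trader.run(my_strategy, duration_seconds=3600)

# Or run indefinitely until stopped
await trader.start(my_strategy)
# ... later ...
await trader.stop()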
run
async
¶
run(
strategy: PaperTradeableStrategy,
duration_seconds: float | None = None,
max_ticks: int | None = None,
) -> BacktestResult
Run a paper trading session for the specified duration.
This is the main entry point for paper trading. It:
1. Initializes the fork and orchestrator
2. Runs the trading loop for the specified duration
3. Calculates and returns metrics
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strategy | PaperTradeableStrategy | Strategy to paper trade | required |
| duration_seconds | float \| None | Maximum duration in seconds (None = config default) | None |
| max_ticks | int \| None | Maximum number of ticks (None = no limit) | None |
Returns:
| Type | Description |
|---|---|
| BacktestResult | BacktestResult with comprehensive metrics and trades |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If already running |
start
async
¶
Start continuous paper trading until stop() is called.
This method runs paper trading indefinitely. Call stop() to end the session gracefully.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strategy | PaperTradeableStrategy | Strategy to paper trade | required |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If already running |
stop
async
¶
Stop the current paper trading session.
Signals the trading loop to exit gracefully. The current tick will complete before stopping.
is_running
¶
Check if paper trading is currently active.
Returns:
| Type | Description |
|---|---|
| bool | True if a session is running |
tick
async
¶
Execute one trading cycle (tick) manually.
This method allows manual tick execution for testing or custom integration. It performs one complete trading cycle:
- Optionally resets fork to latest block (based on config)
- Creates MarketSnapshot from current fork state
- Calls strategy.decide(snapshot) to get intent
- If intent returned (non-HOLD), executes via orchestrator on fork
- Records trade result in portfolio_tracker
- Handles and records errors gracefully
Prerequisites
- PaperTrader must be initialized (call start() or run() first)
- A strategy must be set via _current_strategy
Returns:
| Type | Description |
|---|---|
| PaperTrade \| None | PaperTrade if a trade was executed successfully, None otherwise (including HOLD decisions, errors, or no strategy set) |
Example
# Manual tick control
trader = PaperTrader(fork_manager, portfolio_tracker, config)
await trader._initialize_fork()
await trader._initialize_orchestrator()
trader._current_strategy = my_strategy
trader._running = True

# Execute single tick
trade = await trader.tick()
if trade:
    print(f"Trade executed: {trade.tx_hash}")
run_loop
async
¶
Run a paper trading session with a simple tick loop.
This method implements the classic paper trading loop pattern:
1. Initialize fork and orchestrator
2. Loop: call tick(), sleep for tick_interval_seconds
3. Stop when max_ticks reached or _running becomes False
4. Cleanup in finally block
Unlike run(), which returns a comprehensive BacktestResult, this method returns a simpler PaperTradingSummary focused on trade statistics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strategy | PaperTradeableStrategy | Strategy to paper trade | required |
| max_ticks | int \| None | Maximum number of ticks to run (None = use config.max_ticks; if that is also None, runs until stop() is called) | None |
Returns:
| Type | Description |
|---|---|
| PaperTradingSummary | PaperTradingSummary with session statistics and trade details |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If already running |
Example
trader = PaperTrader(fork_manager, portfolio_tracker, config)
summary = await trader.run_loop(my_strategy, max_ticks=100)
print(summary.summary())
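The loop pattern that run_loop describes (tick, sleep, stop on a limit or flag, clean up in finally) can be sketched with asyncio alone. `ToyTickLoop` is a hypothetical stand-in that only counts ticks; it does not model forks, strategies, or the real stop() coroutine.

```python
import asyncio


class ToyTickLoop:
    """Hypothetical stand-in for the tick/sleep/stop pattern."""

    def __init__(self, tick_interval_seconds=0.01):
        self.tick_interval_seconds = tick_interval_seconds
        self._running = False
        self.ticks = 0

    async def tick(self):
        self.ticks += 1  # a real tick would snapshot, decide, and execute

    def stop(self):
        self._running = False  # the current tick finishes, then the loop exits

    async def run_loop(self, max_ticks=None):
        if self._running:
            raise RuntimeError("already running")
        self._running = True
        try:
            while self._running and (max_ticks is None or self.ticks < max_ticks):
                await self.tick()
                await asyncio.sleep(self.tick_interval_seconds)
        finally:
            self._running = False  # cleanup runs even if a tick raises
        return self.ticks


ticks_run = asyncio.run(ToyTickLoop().run_loop(max_ticks=3))
print(ticks_run)
```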
PaperTraderConfig¶
almanak.framework.backtesting.PaperTraderConfig
dataclass
¶
PaperTraderConfig(
chain: str,
rpc_url: str,
strategy_id: str,
initial_eth: Decimal = Decimal("10"),
initial_tokens: dict[str, Decimal] = dict(),
tick_interval_seconds: int = 60,
max_ticks: int | None = None,
anvil_port: int = 8546,
reset_fork_every_tick: bool = True,
startup_timeout_seconds: float = 30.0,
auto_impersonate: bool = True,
block_time: int | None = None,
wallet_address: str | None = None,
log_trades: bool = True,
log_level: str = "INFO",
price_source: Literal[
"coingecko", "chainlink", "twap", "auto"
] = "auto",
strict_price_mode: bool = True,
allow_hardcoded_fallback: bool | None = None,
)
Configuration for a paper trading session.
Controls all parameters of the paper trading session including chain, initial balances, tick intervals, and Anvil fork settings.
Paper trading executes real transactions on a local Anvil fork, allowing strategies to be validated with actual DeFi protocol interactions before deployment with real capital.
Attributes:
| Name | Type | Description |
|---|---|---|
| chain | str | Blockchain to paper trade on (e.g., "arbitrum", "ethereum") |
| rpc_url | str | Archive RPC URL to fork from (Alchemy, Infura, etc.) |
| strategy_id | str | Identifier of the strategy being tested |
| initial_eth | Decimal | Initial ETH balance for the paper wallet (default: 10) |
| initial_tokens | dict[str, Decimal] | Dict of token symbol to amount for initial balances |
| tick_interval_seconds | int | Time between trading ticks in seconds (default: 60) |
| max_ticks | int \| None | Maximum number of ticks to run, None = run indefinitely |
| anvil_port | int | Port to run Anvil on (default: 8546) |
| reset_fork_every_tick | bool | Whether to reset fork to latest block each tick (default: True) |
| startup_timeout_seconds | float | Timeout for Anvil startup (default: 30) |
| auto_impersonate | bool | Enable auto-impersonation for any address (default: True) |
| block_time | int \| None | Optional block time in seconds (default: None = instant) |
| wallet_address | str \| None | Optional paper wallet address (default: None = auto-generated) |
| log_trades | bool | Whether to log individual trades (default: True) |
| log_level | str | Logging level for paper trader (default: "INFO") |
| price_source | Literal['coingecko', 'chainlink', 'twap', 'auto'] | Price source to use ('coingecko', 'chainlink', 'twap', 'auto') |
Example
config = PaperTraderConfig(
    chain="arbitrum",
    rpc_url="https://arb1.arbitrum.io/rpc",
    strategy_id="momentum_v1",
    initial_eth=Decimal("10"),
    initial_tokens={"USDC": Decimal("10000")},
)
print(f"Chain: {config.chain} (ID: {config.chain_id})")
print(f"Max duration: {config.max_duration_seconds}s")
price_source
class-attribute
instance-attribute
¶
Price source to use for portfolio valuation.
strict_price_mode
class-attribute
instance-attribute
¶
Whether to fail when price providers cannot return a price.
When True (default): Raises ValueError if all price providers fail for a token. This is the institutional-grade setting that ensures all prices are from real data sources. Use this for production backtests where accuracy is critical. Error messages include the failed token and chain for debugging.
When False: Falls back to hardcoded prices for common tokens (ETH=$3000, BTC=$60000, etc.) when all price providers fail. This allows backtests to complete but may produce inaccurate results. Only use this for development/testing where price accuracy is not critical.
Note: This is the inverse of the deprecated allow_hardcoded_fallback field. If both are set, strict_price_mode takes precedence.
Environment variable: Set ALMANAK_ALLOW_HARDCODED_PRICES=1 to allow hardcoded fallback prices (the equivalent of strict_price_mode=False) in testing scenarios.
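The strict and fallback behaviours described above can be sketched as follows. This is a minimal illustration, not the framework's implementation: `resolve_price`, the provider callables, and the `HARDCODED_PRICES` table are hypothetical names, and only the ETH and BTC fallback values are taken from the text.

```python
from decimal import Decimal
from typing import Callable, Optional

# Hypothetical fallback table; the ETH and BTC values mirror the documentation.
HARDCODED_PRICES = {"ETH": Decimal("3000"), "BTC": Decimal("60000")}

# A provider returns a price, or None when it cannot supply one (assumed shape).
PriceProvider = Callable[[str, str], Optional[Decimal]]


def resolve_price(
    token: str,
    chain: str,
    providers: list[PriceProvider],
    strict: bool = True,
) -> Decimal:
    """Return the first price any provider yields, else fail or fall back."""
    for provider in providers:
        price = provider(token, chain)
        if price is not None:
            return price
    # All providers failed: strict mode raises, non-strict mode may fall back.
    if strict or token not in HARDCODED_PRICES:
        raise ValueError(f"No price available for {token} on {chain}")
    return HARDCODED_PRICES[token]


def failing_provider(token: str, chain: str) -> Optional[Decimal]:
    """Stand-in provider that never has data."""
    return None
```

With `strict=False` the call returns the hardcoded value for a known token; with `strict=True` the same call raises `ValueError` naming the token and chain.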
allow_hardcoded_fallback
class-attribute
instance-attribute
¶
DEPRECATED: Use strict_price_mode instead.
This field is kept for backward compatibility. If set, it will be converted to the equivalent strict_price_mode value (allow_hardcoded_fallback=False is equivalent to strict_price_mode=True).
Will be removed in a future version.
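One way to picture the conversion and precedence rules is the sketch below. It is an assumption-laden illustration: the helper name `effective_strict_price_mode` is hypothetical, and it models "not set" as `None` for both flags, whereas the real `strict_price_mode` field is a plain `bool` defaulting to `True`.

```python
from typing import Optional


def effective_strict_price_mode(
    strict_price_mode: Optional[bool] = None,
    allow_hardcoded_fallback: Optional[bool] = None,
) -> bool:
    """Resolve the two flags into one strictness value (hypothetical helper)."""
    # An explicitly set strict_price_mode takes precedence.
    if strict_price_mode is not None:
        return strict_price_mode
    # The deprecated flag is the inverse of strict_price_mode.
    if allow_hardcoded_fallback is not None:
        return not allow_hardcoded_fallback
    # Neither flag set: strict is the documented default.
    return True
```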
max_duration_seconds
property
¶
Get the maximum duration in seconds, or None if indefinite.
max_duration_minutes
property
¶
Get the maximum duration in minutes, or None if indefinite.
max_duration_hours
property
¶
Get the maximum duration in hours, or None if indefinite.
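The three duration properties are probably related as in the sketch below. It assumes the maximum duration is simply `max_ticks` multiplied by `tick_interval_seconds`; the source does not state the formula, so treat the free functions here as hypothetical stand-ins for the properties.

```python
from typing import Optional


def max_duration_seconds(
    tick_interval_seconds: int = 60, max_ticks: Optional[int] = None
) -> Optional[int]:
    """Assumed formula: ticks times interval, or None when unbounded."""
    if max_ticks is None:
        return None
    return tick_interval_seconds * max_ticks


def max_duration_minutes(
    tick_interval_seconds: int = 60, max_ticks: Optional[int] = None
) -> Optional[float]:
    seconds = max_duration_seconds(tick_interval_seconds, max_ticks)
    return None if seconds is None else seconds / 60


def max_duration_hours(
    tick_interval_seconds: int = 60, max_ticks: Optional[int] = None
) -> Optional[float]:
    seconds = max_duration_seconds(tick_interval_seconds, max_ticks)
    return None if seconds is None else seconds / 3600
```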
get_initial_balances
¶
Get all initial balances including ETH.
Returns:
| Type | Description |
|---|---|
dict[str, Decimal]
|
Dictionary of token symbol to initial balance amount |
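A plausible reading of "all initial balances including ETH" is a merge of `initial_eth` into `initial_tokens`, sketched below. The standalone function and the `"ETH"` key are assumptions made for illustration; the real method may key or order the result differently.

```python
from decimal import Decimal


def get_initial_balances(
    initial_eth: Decimal, initial_tokens: dict[str, Decimal]
) -> dict[str, Decimal]:
    """Combine the ETH balance with token balances (hypothetical helper)."""
    balances = dict(initial_tokens)  # copy so the config dict is not mutated
    balances["ETH"] = initial_eth    # assumed key for the native balance
    return balances


balances = get_initial_balances(Decimal("10"), {"USDC": Decimal("10000")})
```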
from_dict
classmethod
¶
Deserialize from dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data
|
dict[str, Any]
|
Dictionary containing config fields |
required |
Returns:
| Type | Description |
|---|---|
PaperTraderConfig
|
PaperTraderConfig instance |
Results¶
BacktestResult¶
almanak.framework.backtesting.BacktestResult
dataclass
¶
BacktestResult(
engine: BacktestEngine,
strategy_id: str,
start_time: datetime,
end_time: datetime,
metrics: BacktestMetrics,
trades: list[TradeRecord] = list(),
equity_curve: list[EquityPoint] = list(),
initial_capital_usd: Decimal = Decimal("10000"),
final_capital_usd: Decimal = Decimal("10000"),
chain: str = "arbitrum",
run_started_at: datetime | None = None,
run_ended_at: datetime | None = None,
run_duration_seconds: float = 0.0,
config: dict[str, Any] = dict(),
error: str | None = None,
lending_liquidations: list[
LendingLiquidationEvent
] = list(),
aggregated_portfolio_view: AggregatedPortfolioView
| None = None,
reconciliation_events: list[
ReconciliationEvent
] = list(),
walk_forward_results: WalkForwardResult | None = None,
monte_carlo_results: MonteCarloSimulationResult
| None = None,
crisis_results: CrisisMetrics | None = None,
errors: list[dict[str, Any]] = list(),
backtest_id: str | None = None,
phase_timings: list[dict[str, Any]] = list(),
config_hash: str | None = None,
execution_delayed_at_end: int = 0,
data_source_capabilities: dict[
str, HistoricalDataCapability
] = dict(),
data_source_warnings: list[str] = list(),
data_quality: DataQualityReport | None = None,
institutional_compliance: bool = True,
compliance_violations: list[str] = list(),
fallback_usage: dict[str, int] = dict(),
preflight_report: PreflightReport | None = None,
preflight_passed: bool = True,
gas_prices_used: list[GasPriceRecord] = list(),
gas_price_summary: GasPriceSummary | None = None,
parameter_sources: ParameterSourceTracker | None = None,
accuracy_estimate: AccuracyEstimate | None = None,
data_coverage_metrics: DataCoverageMetrics
| None = None,
)
Complete results from a backtest run.
This model is used by both the PnL Backtester and Paper Trader to provide consistent result formatting and analysis.
Attributes:
| Name | Type | Description |
|---|---|---|
engine |
BacktestEngine
|
Which backtesting engine was used (pnl or paper) |
strategy_id |
str
|
Identifier of the strategy being tested |
start_time |
datetime
|
When the backtest started (simulation time) |
end_time |
datetime
|
When the backtest ended (simulation time) |
metrics |
BacktestMetrics
|
Calculated performance metrics |
trades |
list[TradeRecord]
|
List of all trade records |
equity_curve |
list[EquityPoint]
|
Portfolio value over time |
initial_capital_usd |
Decimal
|
Starting capital in USD |
final_capital_usd |
Decimal
|
Ending capital in USD |
chain |
str
|
Target blockchain (arbitrum, base, etc.) |
run_started_at |
datetime | None
|
When the backtest run actually started (wall time) |
run_ended_at |
datetime | None
|
When the backtest run actually completed (wall time) |
run_duration_seconds |
float
|
Wall clock duration of the backtest run |
config |
dict[str, Any]
|
Configuration used for the backtest |
error |
str | None
|
Error message if backtest failed |
lending_liquidations |
list[LendingLiquidationEvent]
|
List of lending liquidation events that occurred |
aggregated_portfolio_view |
AggregatedPortfolioView | None
|
Tick-by-tick portfolio state snapshots with risk scores |
reconciliation_events |
list[ReconciliationEvent]
|
List of position reconciliation events (discrepancies detected) |
walk_forward_results |
WalkForwardResult | None
|
Results from walk-forward optimization (if run with --walk-forward) |
monte_carlo_results |
MonteCarloSimulationResult | None
|
Results from Monte Carlo simulation (if run with --monte-carlo). Contains return confidence intervals, drawdown probabilities, and path statistics. |
crisis_results |
CrisisMetrics | None
|
Crisis-specific metrics when backtest was run during a crisis scenario. Contains drawdown analysis, recovery time, and comparison to normal period performance. |
errors |
list[dict[str, Any]]
|
List of error records as dictionaries with timestamps and context for debugging and analysis. Each error dict contains: timestamp, error_type, error_message, classification (with error_type, category, is_recoverable, is_fatal, is_non_critical, suggested_action), context, and handled action. |
backtest_id |
str | None
|
Unique correlation ID (UUID) for this backtest run. Used for structured logging and tracing across all log messages generated during this backtest. |
phase_timings |
list[dict[str, Any]]
|
List of phase timing records showing how long each backtest phase took. Each record contains: phase_name, start_time, end_time, duration_seconds, error. Useful for performance analysis and identifying bottlenecks. |
config_hash |
str | None
|
SHA-256 hash of the configuration used for this backtest. Enables verification that a backtest was run with identical configuration. Calculated from all parameters that affect backtest results, excluding runtime metadata. |
execution_delayed_at_end |
int
|
Count of pending intents executed at simulation end. These were queued due to inclusion_delay_blocks > 0 and executed with the last market state when the simulation completed. |
data_source_capabilities |
dict[str, HistoricalDataCapability]
|
Dictionary mapping data provider names to their HistoricalDataCapability enum values. Shows which providers were used and their ability to provide accurate historical data (FULL, CURRENT_ONLY, PRE_CACHE). Useful for understanding potential data quality limitations in the backtest. |
data_source_warnings |
list[str]
|
List of warning messages about data source limitations. Generated when providers with CURRENT_ONLY or PRE_CACHE capability are used, as these may affect backtest accuracy. |
data_quality |
DataQualityReport | None
|
Data quality metrics for the backtest run. Includes coverage ratio, source breakdown, stale data count, and interpolation count. Useful for understanding data reliability and identifying potential accuracy issues. |
institutional_compliance |
bool
|
Whether the backtest run meets institutional standards. Set to False when any strict reproducibility, data quality, or compliance check fails. Use compliance_violations to see which checks failed. |
compliance_violations |
list[str]
|
List of compliance violations that caused institutional_compliance to be set to False. Each entry describes a specific compliance failure such as "CURRENT_ONLY data provider used", "Symbol mapping failed for 0x...", "Data coverage below minimum threshold (95% < 98%)". |
fallback_usage |
dict[str, int]
|
Dictionary tracking count of each fallback type used during the backtest. Keys include: "hardcoded_price", "default_gas_price", "default_usd_amount". Empty dict means no fallbacks were used, which is the desired state for institutional-grade backtests. |
preflight_report |
PreflightReport | None
|
Preflight validation report from checks run before the backtest. Contains pass/fail status, individual check results, estimated data coverage, and recommendations for fixing any issues. None if preflight validation was disabled. |
preflight_passed |
bool
|
Whether preflight validation passed (True) or failed (False). Defaults to True if preflight validation was disabled. This is a convenience field for quick checks - for full details, inspect preflight_report. |
parameter_sources |
ParameterSourceTracker | None
|
Tracks the source of all configuration parameters for audit purposes. Contains detailed records of where each configuration value came from (default, config_file, env_var, explicit) for config parameters, (asset_specific, protocol_default, global_default) for liquidation thresholds, and (historical, fixed, provider) for APY/funding rates. Critical for institutional compliance. |
accuracy_estimate |
AccuracyEstimate | None
|
Estimated accuracy of this backtest based on strategy type and data quality tier. Provides expected accuracy range (e.g., "90-95%") and primary error source. Derived from ACCURACY_MATRIX based on documented accuracy limitations. |
gas_prices_used
class-attribute
instance-attribute
¶
Optional detailed gas price records for each trade during the backtest.
When track_gas_prices=True in config, this list contains a GasPriceRecord for each trade showing the gas price used, its source, and USD cost. Useful for detailed gas cost analysis but may increase result size.
gas_price_summary
class-attribute
instance-attribute
¶
Summary statistics for gas prices used during the backtest.
Contains min, max, mean, std of gas prices in gwei plus source breakdown. Always populated when trades occurred, regardless of track_gas_prices setting.
parameter_sources
class-attribute
instance-attribute
¶
Tracks the source of all configuration parameters for audit purposes.
Contains detailed records of where each configuration value came from:
- Config parameters: default, config_file, env_var, explicit
- Liquidation thresholds: asset_specific, protocol_default, global_default
- APY/funding rates: historical, fixed, provider
This information is critical for institutional compliance and audit trails. When institutional_mode=True, this is always populated. The tracker provides summary dicts (config_sources, liquidation_sources, apy_funding_sources) for quick inspection and a full list of ParameterSourceRecord objects for detailed analysis.
accuracy_estimate
class-attribute
instance-attribute
¶
Estimated accuracy of this backtest based on strategy type and data quality.
Provides a quick reference showing expected accuracy range (e.g., "90-95%") based on the detected strategy type (LP, perp, lending, arbitrage, spot) and the data quality tier used (FULL, PRE_CACHE, CURRENT_ONLY).
The estimate is derived from the ACCURACY_MATRIX which is based on documented accuracy limitations and golden test tolerances. See docs/ACCURACY_LIMITATIONS.md for the full accuracy matrix and methodology.
Example usage
if result.accuracy_estimate:
    print(f"Expected accuracy: {result.accuracy_estimate.confidence_interval}")
    print(f"Primary error source: {result.accuracy_estimate.primary_error_source}")
data_coverage_metrics
class-attribute
instance-attribute
¶
Data coverage metrics tracking confidence levels across all position types.
Provides detailed breakdown of data quality for LP, Perp, Lending, and Slippage calculations. Includes confidence level breakdowns (high/medium/low) and data sources used for each position type.
The data_coverage_pct property gives overall percentage of HIGH confidence data points across all categories.
Example usage
if result.data_coverage_metrics:
    print(f"Data coverage: {result.data_coverage_metrics.data_coverage_pct:.1f}%")
    print(f"LP HIGH: {result.data_coverage_metrics.lp_metrics.high_confidence_pct:.1f}%")
simulation_duration_days
property
¶
Get the simulated duration in days.
used_any_fallback
property
¶
Check if any fallbacks were used during the backtest.
Returns True if the fallback_usage dict has any non-zero counts. When this is True, the backtest may have reduced accuracy due to using fallback values instead of real market data.
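The check described above amounts to scanning the counts for any non-zero value. A minimal sketch, written as a free function rather than the actual property:

```python
def used_any_fallback(fallback_usage: dict[str, int]) -> bool:
    """True when at least one fallback type has a non-zero count."""
    return any(count > 0 for count in fallback_usage.values())
```

An empty dict and a dict whose counts are all zero both yield `False`.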
add_error
¶
Add an error record and log it with timestamp and context.
This method is used to track errors that occurred during the backtest, along with their timestamps, classification, and handling.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
error_dict
|
dict[str, Any]
|
Serialized error record from ErrorRecord.to_dict() or equivalent dict with keys: timestamp, error_type, error_message, classification, context, handled |
required |
summary
¶
Generate a human-readable summary of backtest results.
Returns:
| Type | Description |
|---|---|
str
|
Multi-line string with formatted backtest results |
from_dict
classmethod
¶
Deserialize from dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data
|
dict[str, Any]
|
Dictionary with serialized BacktestResult data |
required |
Returns:
| Type | Description |
|---|---|
BacktestResult
|
BacktestResult instance |
BacktestMetrics¶
almanak.framework.backtesting.BacktestMetrics
dataclass
¶
BacktestMetrics(
total_pnl_usd: Decimal = Decimal("0"),
net_pnl_usd: Decimal = Decimal("0"),
sharpe_ratio: Decimal = Decimal("0"),
max_drawdown_pct: Decimal = Decimal("0"),
win_rate: Decimal = Decimal("0"),
total_trades: int = 0,
profit_factor: Decimal = Decimal("0"),
total_return_pct: Decimal = Decimal("0"),
annualized_return_pct: Decimal = Decimal("0"),
total_fees_usd: Decimal = Decimal("0"),
total_slippage_usd: Decimal = Decimal("0"),
total_gas_usd: Decimal = Decimal("0"),
winning_trades: int = 0,
losing_trades: int = 0,
avg_trade_pnl_usd: Decimal = Decimal("0"),
largest_win_usd: Decimal = Decimal("0"),
largest_loss_usd: Decimal = Decimal("0"),
avg_win_usd: Decimal = Decimal("0"),
avg_loss_usd: Decimal = Decimal("0"),
volatility: Decimal = Decimal("0"),
sortino_ratio: Decimal = Decimal("0"),
calmar_ratio: Decimal = Decimal("0"),
total_fees_earned_usd: Decimal = Decimal("0"),
fees_by_pool: dict[str, Decimal] = dict(),
lp_fee_confidence_breakdown: dict[str, int] = dict(),
total_funding_paid: Decimal = Decimal("0"),
total_funding_received: Decimal = Decimal("0"),
liquidations_count: int = 0,
liquidation_losses_usd: Decimal = Decimal("0"),
max_margin_utilization: Decimal = Decimal("0"),
total_interest_earned: Decimal = Decimal("0"),
total_interest_paid: Decimal = Decimal("0"),
min_health_factor: Decimal = Decimal("999"),
health_factor_warnings: int = 0,
avg_gas_price_gwei: Decimal = Decimal("0"),
max_gas_price_gwei: Decimal = Decimal("0"),
total_gas_cost_usd: Decimal = Decimal("0"),
total_mev_cost_usd: Decimal = Decimal("0"),
total_leverage: Decimal = Decimal("0"),
max_net_delta: dict[str, Decimal] = dict(),
correlation_risk: Decimal | None = None,
liquidation_cascade_risk: Decimal = Decimal("0"),
information_ratio: Decimal | None = None,
beta: Decimal | None = None,
alpha: Decimal | None = None,
benchmark_return: Decimal | None = None,
pnl_by_protocol: dict[str, Decimal] = dict(),
pnl_by_intent_type: dict[str, Decimal] = dict(),
pnl_by_asset: dict[str, Decimal] = dict(),
realized_pnl: Decimal = Decimal("0"),
unrealized_pnl: Decimal = Decimal("0"),
)
Performance metrics calculated from backtest results.
All financial values are in USD. Ratios are decimal (0.1 = 10%).
Attributes:
| Name | Type | Description |
|---|---|---|
total_pnl_usd |
Decimal
|
Total PnL before execution costs |
net_pnl_usd |
Decimal
|
Net PnL after all execution costs |
sharpe_ratio |
Decimal
|
Risk-adjusted return (annualized, assuming 0 risk-free rate) |
max_drawdown_pct |
Decimal
|
Maximum peak-to-trough decline as decimal (0.1 = 10%) |
win_rate |
Decimal
|
Fraction of profitable trades as decimal (0.6 = 60%) |
total_trades |
int
|
Total number of trades executed |
profit_factor |
Decimal
|
Ratio of gross profit to gross loss |
total_return_pct |
Decimal
|
Total return as decimal (0.15 = 15% return) |
annualized_return_pct |
Decimal
|
Annualized return as decimal |
total_fees_usd |
Decimal
|
Total protocol fees paid |
total_slippage_usd |
Decimal
|
Total slippage incurred |
total_gas_usd |
Decimal
|
Total gas costs |
winning_trades |
int
|
Number of profitable trades |
losing_trades |
int
|
Number of losing trades |
avg_trade_pnl_usd |
Decimal
|
Average PnL per trade |
largest_win_usd |
Decimal
|
Largest single winning trade |
largest_loss_usd |
Decimal
|
Largest single losing trade |
avg_win_usd |
Decimal
|
Average winning trade PnL |
avg_loss_usd |
Decimal
|
Average losing trade PnL |
volatility |
Decimal
|
Annualized volatility of returns as decimal |
sortino_ratio |
Decimal
|
Downside risk-adjusted return |
calmar_ratio |
Decimal
|
Return / max drawdown |
total_fees_earned_usd |
Decimal
|
Total fees earned from LP positions in USD |
fees_by_pool |
dict[str, Decimal]
|
Dict mapping pool identifier to fees earned in USD |
total_funding_paid |
Decimal
|
Total funding payments made from perp positions in USD |
total_funding_received |
Decimal
|
Total funding payments received by perp positions in USD |
liquidations_count |
int
|
Number of liquidation events that occurred |
liquidation_losses_usd |
Decimal
|
Total losses from liquidations in USD |
max_margin_utilization |
Decimal
|
Maximum margin utilization ratio observed during backtest (0-1) |
total_interest_earned |
Decimal
|
Total interest earned from lending supply positions in USD |
total_interest_paid |
Decimal
|
Total interest paid on borrow positions in USD |
min_health_factor |
Decimal
|
Minimum health factor observed for lending positions during backtest (lower = more risk) |
health_factor_warnings |
int
|
Number of times health factor dropped below warning threshold |
avg_gas_price_gwei |
Decimal
|
Average gas price in gwei across all trades (for cost analysis) |
max_gas_price_gwei |
Decimal
|
Maximum gas price in gwei observed during backtest (for peak cost analysis) |
total_gas_cost_usd |
Decimal
|
Total gas costs in USD (same as total_gas_usd, kept for API consistency) |
total_mev_cost_usd |
Decimal
|
Total estimated MEV (sandwich attack) costs in USD across all trades |
total_leverage |
Decimal
|
Total portfolio leverage ratio (sum of all position notionals / equity) |
max_net_delta |
dict[str, Decimal]
|
Maximum net delta exposure observed per asset (token symbol -> max delta) |
correlation_risk |
Decimal | None
|
Portfolio correlation risk score (0-1, higher = more correlated positions) |
liquidation_cascade_risk |
Decimal
|
Risk of cascading liquidations across protocols (0-1, higher = more risk) |
information_ratio |
Decimal | None
|
Information ratio measuring risk-adjusted excess return vs benchmark (None if not calculated) |
beta |
Decimal | None
|
Portfolio beta measuring sensitivity to benchmark movements (None if not calculated) |
alpha |
Decimal | None
|
Jensen's alpha measuring excess return beyond what beta would predict (None if not calculated) |
benchmark_return |
Decimal | None
|
Total return of the benchmark over the backtest period as decimal (None if not calculated) |
pnl_by_protocol |
dict[str, Decimal]
|
PnL breakdown by protocol (e.g., {"uniswap_v3": Decimal("100"), "aave_v3": Decimal("-50")}) |
pnl_by_intent_type |
dict[str, Decimal]
|
PnL breakdown by intent type (e.g., {"SWAP": Decimal("75"), "LP_OPEN": Decimal("25")}) |
pnl_by_asset |
dict[str, Decimal]
|
PnL breakdown by asset (e.g., {"ETH": Decimal("80"), "USDC": Decimal("20")}) |
realized_pnl |
Decimal
|
Total realized PnL from closed positions in USD |
unrealized_pnl |
Decimal
|
Total unrealized PnL from open positions in USD |
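To make the decimal conventions concrete, the sketch below computes three of the metrics from raw inputs using their textbook definitions. The function names are hypothetical and the framework's own calculations may differ in detail, for example in how a zero gross loss is reported.

```python
from decimal import Decimal


def max_drawdown(equity: list[Decimal]) -> Decimal:
    """Largest peak-to-trough decline as a decimal (0.1 = 10%)."""
    if not equity:
        return Decimal("0")
    peak = equity[0]
    worst = Decimal("0")
    for value in equity:
        peak = max(peak, value)
        if peak > 0:
            worst = max(worst, (peak - value) / peak)
    return worst


def win_rate(trade_pnls: list[Decimal]) -> Decimal:
    """Fraction of trades with positive PnL (0.6 = 60%)."""
    if not trade_pnls:
        return Decimal("0")
    wins = sum(1 for pnl in trade_pnls if pnl > 0)
    return Decimal(wins) / Decimal(len(trade_pnls))


def profit_factor(trade_pnls: list[Decimal]) -> Decimal:
    """Gross profit divided by gross loss; 0 when there are no losses (assumed)."""
    gross_profit = sum((p for p in trade_pnls if p > 0), Decimal("0"))
    gross_loss = -sum((p for p in trade_pnls if p < 0), Decimal("0"))
    return gross_profit / gross_loss if gross_loss > 0 else Decimal("0")
```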
lp_fee_confidence_breakdown
class-attribute
instance-attribute
¶
Count of LP positions by fee confidence level.
Example: {"high": 2, "medium": 1, "low": 0}
- high: Fees calculated using actual historical volume data from subgraph
- medium: Fees calculated using interpolated or estimated data
- low: Fees calculated using multiplier heuristic
total_execution_cost_usd
property
¶
Get total execution costs (fees + slippage + gas).
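The sum can be sketched as below. The execution-cost formula comes from the property description; the relationship `net_pnl_usd = total_pnl_usd - execution cost` is an assumption inferred from the attribute descriptions, and both functions are hypothetical stand-ins.

```python
from decimal import Decimal


def total_execution_cost_usd(
    total_fees_usd: Decimal, total_slippage_usd: Decimal, total_gas_usd: Decimal
) -> Decimal:
    """Fees plus slippage plus gas, all in USD."""
    return total_fees_usd + total_slippage_usd + total_gas_usd


def net_pnl_usd(total_pnl_usd: Decimal, execution_cost_usd: Decimal) -> Decimal:
    """Assumed relationship: gross PnL minus execution costs."""
    return total_pnl_usd - execution_cost_usd


cost = total_execution_cost_usd(Decimal("12.50"), Decimal("4.25"), Decimal("3.25"))
net = net_pnl_usd(Decimal("100"), cost)
```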
PaperTradingSummary¶
almanak.framework.backtesting.PaperTradingSummary
dataclass
¶
PaperTradingSummary(
strategy_id: str,
start_time: datetime,
duration: timedelta,
total_trades: int,
successful_trades: int,
failed_trades: int,
chain: str = "arbitrum",
initial_balances: dict[str, Decimal] = dict(),
final_balances: dict[str, Decimal] = dict(),
total_gas_used: int = 0,
total_gas_cost_usd: Decimal = Decimal("0"),
pnl_usd: Decimal | None = None,
error_summary: dict[str, int] = dict(),
trades: list[PaperTrade] = list(),
errors: list[PaperTradeError] = list(),
)
Summary of a paper trading session.
This dataclass provides an overview of the paper trading session, including trade counts, timing, and basic performance metrics.
Attributes:
| Name | Type | Description |
|---|---|---|
strategy_id |
str
|
Identifier of the strategy being tested |
start_time |
datetime
|
When the session started |
duration |
timedelta
|
How long the session ran |
total_trades |
int
|
Total number of trades attempted |
successful_trades |
int
|
Number of successful trades |
failed_trades |
int
|
Number of failed trades |
end_time |
datetime
|
When the session ended (computed) |
chain |
str
|
Target blockchain |
initial_balances |
dict[str, Decimal]
|
Starting token balances |
final_balances |
dict[str, Decimal]
|
Ending token balances |
total_gas_used |
int
|
Total gas consumed |
total_gas_cost_usd |
Decimal
|
Total gas cost in USD |
pnl_usd |
Decimal | None
|
Estimated PnL in USD (if available) |
error_summary |
dict[str, int]
|
Count of errors by type |
trades |
list[PaperTrade]
|
List of successful trades |
errors |
list[PaperTradeError]
|
List of trade errors |
avg_gas_per_trade
property
¶
Calculate average gas used per successful trade.
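A minimal sketch of this average, assuming it divides `total_gas_used` by `successful_trades` and returns zero when no trade succeeded. The zero-trade behaviour and the float return type are assumptions.

```python
def avg_gas_per_trade(total_gas_used: int, successful_trades: int) -> float:
    """Average gas per successful trade; 0.0 when there were none (assumed)."""
    if successful_trades == 0:
        return 0.0
    return total_gas_used / successful_trades
```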
summary
¶
Generate a human-readable summary.
Returns:
| Type | Description |
|---|---|
str
|
Multi-line string with formatted session summary |
from_dict
classmethod
¶
Deserialize from dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data
|
dict[str, Any]
|
Dictionary with serialized PaperTradingSummary data |
required |
Returns:
| Type | Description |
|---|---|
PaperTradingSummary
|
PaperTradingSummary instance |
Data Providers¶
HistoricalDataProvider¶
almanak.framework.backtesting.HistoricalDataProvider
¶
Bases: Protocol
Protocol defining the interface for historical data providers.
Historical data providers are responsible for fetching price and market data for past time periods. They are used by the PnL backtesting engine to simulate strategy execution.
Implementations should handle:
- Fetching historical prices for specified tokens
- Providing OHLCV data when available
- Rate limiting and caching as needed
- Graceful handling of missing data
Example implementation
class MyDataProvider:
    async def get_price(
        self, token: str, timestamp: datetime
    ) -> Decimal:
        # Fetch price from data source
        ...

    async def get_ohlcv(
        self, token: str, start: datetime, end: datetime, interval_seconds: int = 3600
    ) -> list[OHLCV]:
        # Fetch OHLCV data
        ...

    async def iterate(
        self, config: HistoricalDataConfig
    ) -> AsyncIterator[tuple[datetime, MarketState]]:
        # Yield market states for each time point
        ...
min_timestamp
property
¶
Return the earliest timestamp with available data, or None if unknown.
max_timestamp
property
¶
Return the latest timestamp with available data, or None if unknown.
get_price
async
¶
Get the price of a token at a specific timestamp.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
token
|
str
|
Token symbol (e.g., "WETH", "USDC", "ARB") |
required |
timestamp
|
datetime
|
The historical point in time |
required |
Returns:
| Type | Description |
|---|---|
Decimal
|
Price in USD at the specified timestamp |
Raises:
| Type | Description |
|---|---|
ValueError
|
If price data is not available for the token/timestamp |
DataSourceUnavailable
|
If the data source is unavailable |
get_ohlcv
async
¶
get_ohlcv(
token: str,
start: datetime,
end: datetime,
interval_seconds: int = 3600,
) -> list[OHLCV]
Get OHLCV data for a token over a time range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
token
|
str
|
Token symbol (e.g., "WETH", "USDC", "ARB") |
required |
start
|
datetime
|
Start of the time range (inclusive) |
required |
end
|
datetime
|
End of the time range (inclusive) |
required |
interval_seconds
|
int
|
Candle interval in seconds (default: 3600 = 1 hour) |
3600
|
Returns:
| Type | Description |
|---|---|
list[OHLCV]
|
List of OHLCV data points, sorted by timestamp ascending |
Raises:
| Type | Description |
|---|---|
ValueError
|
If data is not available for the token/range |
DataSourceUnavailable
|
If the data source is unavailable |
iterate
async
¶
Iterate through historical market states.
This is the primary method used by the backtesting engine. It yields market state snapshots at regular intervals throughout the configured time range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config
|
HistoricalDataConfig
|
Configuration specifying time range, interval, and tokens |
required |
Yields:
| Type | Description |
|---|---|
AsyncIterator[tuple[datetime, MarketState]]
|
Tuples of (timestamp, MarketState) for each time point |
Raises:
| Type | Description |
|---|---|
DataSourceUnavailable
|
If the data source is unavailable |
Example
async for timestamp, market_state in provider.iterate(config):
    eth_price = market_state.get_price("WETH")
    # Process market state
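For a version that runs end to end, the sketch below implements `iterate` as an async generator over an in-memory price table. It is a simplified stand-in: `FlatPriceProvider` and `collect_timestamps` are hypothetical, `iterate` takes plain arguments instead of a `HistoricalDataConfig`, and it yields a dict of prices rather than a `MarketState`.

```python
import asyncio
from datetime import datetime, timedelta
from decimal import Decimal
from typing import AsyncIterator


class FlatPriceProvider:
    """Hypothetical provider serving one fixed USD price per token."""

    def __init__(self, prices: dict[str, Decimal]) -> None:
        self._prices = prices

    async def get_price(self, token: str, timestamp: datetime) -> Decimal:
        if token not in self._prices:
            raise ValueError(f"No price data for {token} at {timestamp}")
        return self._prices[token]

    async def iterate(
        self, start: datetime, end: datetime, interval_seconds: int
    ) -> AsyncIterator[tuple[datetime, dict[str, Decimal]]]:
        # Step from start to end (both inclusive) at a fixed interval.
        current = start
        while current <= end:
            yield current, dict(self._prices)
            current += timedelta(seconds=interval_seconds)


async def collect_timestamps(
    provider: FlatPriceProvider, start: datetime, end: datetime, interval_seconds: int
) -> list[datetime]:
    return [ts async for ts, _ in provider.iterate(start, end, interval_seconds)]


start = datetime(2024, 1, 1)
provider = FlatPriceProvider({"WETH": Decimal("2500")})
timestamps = asyncio.run(
    collect_timestamps(provider, start, start + timedelta(hours=2), 3600)
)
```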
HistoricalDataConfig¶
almanak.framework.backtesting.HistoricalDataConfig
dataclass
¶
HistoricalDataConfig(
start_time: datetime,
end_time: datetime,
interval_seconds: int = 3600,
tokens: list[str] = (lambda: ["WETH", "USDC"])(),
chains: list[str] = (lambda: ["arbitrum"])(),
include_ohlcv: bool = True,
include_gas_prices: bool = False,
)
Configuration for historical data retrieval.
Specifies the time range, interval, and tokens to fetch for a backtest simulation.
Attributes:
| Name | Type | Description |
|---|---|---|
start_time |
datetime
|
Start of the historical period (inclusive) |
end_time |
datetime
|
End of the historical period (inclusive) |
interval_seconds |
int
|
Time between data points in seconds (default: 3600 = 1 hour) |
tokens |
list[str]
|
List of token symbols to fetch prices for |
chains |
list[str]
|
List of chain identifiers to fetch data for (default: ["arbitrum"]) |
include_ohlcv |
bool
|
Whether to fetch OHLCV data (default: True) |
include_gas_prices |
bool
|
Whether to fetch historical gas prices (default: False) |
estimated_data_points
property
¶
Get the estimated number of data points.
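The estimate can be approximated as the number of interval steps that fit in the time range. The sketch below assumes both endpoints are counted, since `start_time` and `end_time` are documented as inclusive; the actual property may round or count differently.

```python
from datetime import datetime, timedelta


def estimated_data_points(
    start_time: datetime, end_time: datetime, interval_seconds: int = 3600
) -> int:
    """Assumed formula: whole intervals in the range, plus the starting point."""
    span_seconds = (end_time - start_time).total_seconds()
    return int(span_seconds // interval_seconds) + 1
```

Under this assumption, one day at the default hourly interval gives 25 points.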
Crisis Scenarios¶
CrisisScenario¶
almanak.framework.backtesting.CrisisScenario
dataclass
¶
A historical crisis scenario for backtesting.
This dataclass represents a period of significant market stress that can be used for stress-testing trading strategies.
Attributes:
| Name | Type | Description |
|---|---|---|
name |
str
|
Unique identifier for the scenario (lowercase, underscores) |
start_date |
datetime
|
Beginning of the crisis period |
end_date |
datetime
|
End of the crisis period |
description |
str
|
Human-readable description of the crisis event |
Properties
duration_days: Number of days in the crisis period
Example
>>> scenario = CrisisScenario(
...     name="custom_crisis",
...     start_date=datetime(2023, 3, 10),
...     end_date=datetime(2023, 3, 15),
...     description="SVB collapse",
... )
>>> scenario.duration_days
5
to_dict
¶
Serialize to dictionary.
Returns:
| Type | Description |
|---|---|
dict[str, Any]
|
Dictionary with scenario data suitable for JSON serialization. |
from_dict
classmethod
¶
Deserialize from dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data
|
dict[str, Any]
|
Dictionary with serialized CrisisScenario data |
required |
Returns:
| Type | Description |
|---|---|
CrisisScenario
|
CrisisScenario instance |
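A round trip through `to_dict` and `from_dict` might look like the sketch below. `ScenarioSketch` is a hypothetical stand-in for `CrisisScenario`; storing dates as ISO 8601 strings and computing `duration_days` as a plain date difference are assumptions, not the documented wire format.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass
class ScenarioSketch:
    """Hypothetical stand-in mirroring the documented CrisisScenario fields."""

    name: str
    start_date: datetime
    end_date: datetime
    description: str = ""

    @property
    def duration_days(self) -> int:
        return (self.end_date - self.start_date).days

    def to_dict(self) -> dict[str, Any]:
        # Assumed format: datetimes as ISO 8601 strings so json.dumps accepts them.
        return {
            "name": self.name,
            "start_date": self.start_date.isoformat(),
            "end_date": self.end_date.isoformat(),
            "description": self.description,
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ScenarioSketch":
        return cls(
            name=data["name"],
            start_date=datetime.fromisoformat(data["start_date"]),
            end_date=datetime.fromisoformat(data["end_date"]),
            description=data.get("description", ""),
        )


scenario = ScenarioSketch(
    "custom_crisis", datetime(2023, 3, 10), datetime(2023, 3, 15), "SVB collapse"
)
restored = ScenarioSketch.from_dict(json.loads(json.dumps(scenario.to_dict())))
```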
Parallel Execution¶
almanak.framework.backtesting.run_parallel_backtests
async
¶
run_parallel_backtests(
configs: list[PnLBacktestConfig],
strategy_factory: Callable[[], Any],
data_provider_factory: Callable[[], Any],
backtester_factory: Callable[
[Any, dict[str, Any], dict[str, Any]], Any
],
fee_models: dict[str, Any] | None = None,
slippage_models: dict[str, Any] | None = None,
workers: int | None = None,
) -> list[ParallelBacktestResult]
Run multiple backtests in parallel using a process pool.
This function distributes backtest execution across multiple processes for improved performance on multi-core systems. Each backtest runs in its own process with its own instances of strategy, data provider, and backtester created via factory functions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
configs
|
list[PnLBacktestConfig]
|
List of PnLBacktestConfig objects to run |
required |
strategy_factory
|
Callable[[], Any]
|
Factory function that returns a new strategy instance. Must be picklable (e.g., a module-level function). |
required |
data_provider_factory
|
Callable[[], Any]
|
Factory function that returns a new data provider. Must be picklable (e.g., a module-level function). |
required |
backtester_factory
|
Callable[[Any, dict[str, Any], dict[str, Any]], Any]
|
Factory function that returns a new PnLBacktester. Takes (data_provider, fee_models, slippage_models) as arguments. |
required |
fee_models
|
dict[str, Any] | None
|
Optional dict of fee models to pass to backtester factory. |
None
|
slippage_models
|
dict[str, Any] | None
|
Optional dict of slippage models to pass to backtester factory. |
None
|
workers
|
int | None
|
Number of worker processes. Defaults to CPU count - 1. |
None
|
Returns:
| Type | Description |
|---|---|
list[ParallelBacktestResult]
|
List of ParallelBacktestResult in the same order as input configs. Each result indicates success or failure with associated data. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If configs list is empty |
Example
def create_strategy():
    return MyStrategy(param1=10, param2=0.5)

def create_data_provider():
    return CoinGeckoDataProvider()

def create_backtester(provider, fee_models, slippage_models):
    return PnLBacktester(provider, fee_models, slippage_models)

results = await run_parallel_backtests(
    configs=[config1, config2, config3],
    strategy_factory=create_strategy,
    data_provider_factory=create_data_provider,
    backtester_factory=create_backtester,
    workers=4,
)
Note
- Factory functions must be picklable (module-level functions, not lambdas)
- Each worker process creates its own instances to avoid sharing state
- Results are returned in the same order as input configs
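The picklability requirement and the default worker count can be checked up front with a couple of small helpers. Both helpers below are hypothetical and not part of the framework; clamping the worker count to at least 1 is an assumption, as the source only says "CPU count - 1".

```python
import os
import pickle
from typing import Any, Optional


def default_worker_count(cpu_count: Optional[int] = None) -> int:
    """CPU count minus one, never below 1 (the lower bound is an assumption)."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, cpus - 1)


def is_picklable(obj: Any) -> bool:
    """Return True when obj can be sent to a worker process via pickle."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True
```

Functions are pickled by reference to their importable name, so a lambda fails this check while a module-level function defined in an importable module passes.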