Backtesting¶

Dual-engine backtesting system: PnL simulation with historical prices and paper trading on Anvil forks.

PnL Backtester¶

PnLBacktester¶

almanak.framework.backtesting.PnLBacktester `dataclass` ¶

PnLBacktester(
    data_provider: HistoricalDataProvider,
    fee_models: dict[str, FeeModel],
    slippage_models: dict[str, SlippageModel],
    strategy_type: str | None = "auto",
    gas_provider: GasPriceProvider | None = None,
    data_config: BacktestDataConfig | None = None,
    _mev_simulator: MEVSimulator | None = None,
    _current_backtest_id: str = "",
    _adapter: StrategyBacktestAdapter | None = None,
    _detected_strategy_type: StrategyTypeHint | None = None,
    _error_handler: BacktestErrorHandler | None = None,
    _fallback_usage: dict[str, int] | None = None,
    _gas_price_records: list[GasPriceRecord] | None = None,
)

Main PnL backtesting engine for historical strategy simulation.

The PnLBacktester simulates strategy execution against historical price data to evaluate performance. It:

Iterates through historical market data at configured intervals
Calls strategy.decide() with a MarketSnapshot for each time step
Simulates intent execution with configurable fee/slippage models
Tracks portfolio state and builds an equity curve
Calculates comprehensive performance metrics

Attributes:

Name	Type	Description
`data_provider`	`HistoricalDataProvider`	Historical data provider (e.g., CoinGeckoDataProvider)
`fee_models`	`dict[str, FeeModel]`	Dict mapping protocol -> FeeModel (or "default" for all)
`slippage_models`	`dict[str, SlippageModel]`	Dict mapping protocol -> SlippageModel
`gas_provider`	`GasPriceProvider \| None`	Optional gas price provider for historical gas prices. When provided and config.use_historical_gas_gwei=True, the engine will fetch historical gas prices at each simulation timestamp.
`mev_simulator`	`MEVSimulator \| None`	Optional MEV simulator (created dynamically based on config)
`strategy_type`	`str \| None`	Optional explicit strategy type for adapter selection. If "auto" (default), the type is detected from strategy metadata. Valid values: "lp", "perp", "lending", "arbitrage", "swap", "auto", or None.
`data_config`	`BacktestDataConfig \| None`	Optional BacktestDataConfig for controlling historical data providers in adapters. When provided, adapters will use historical volume, funding rates, and APY data from real sources instead of fallback values.

Example

backtester = PnLBacktester(
    data_provider=CoinGeckoDataProvider(),
    fee_models={"default": DefaultFeeModel()},
    slippage_models={"default": DefaultSlippageModel()},
)

result = await backtester.backtest(my_strategy, config)
print(result.summary())

# With explicit strategy type:
backtester = PnLBacktester(
    data_provider=CoinGeckoDataProvider(),
    fee_models={"default": DefaultFeeModel()},
    slippage_models={"default": DefaultSlippageModel()},
    strategy_type="lp",  # Force LP adapter
)

# With BacktestDataConfig for historical data:
from almanak.framework.backtesting.config import BacktestDataConfig

data_config = BacktestDataConfig(
    use_historical_volume=True,
    use_historical_funding=True,
    use_historical_apy=True,
)
backtester = PnLBacktester(
    data_provider=CoinGeckoDataProvider(),
    fee_models={"default": DefaultFeeModel()},
    slippage_models={"default": DefaultSlippageModel()},
    data_config=data_config,
)

gas_provider `class-attribute` `instance-attribute` ¶

gas_provider: GasPriceProvider | None = None

Optional gas price provider for historical gas prices.

When provided and config.use_historical_gas_gwei=True, the engine will fetch historical gas prices at each simulation timestamp instead of using the static config.gas_price_gwei value.

Example

from almanak.framework.backtesting.pnl.providers import EtherscanGasPriceProvider

gas_provider = EtherscanGasPriceProvider(
    api_keys={"ethereum": "your-key"},
)
backtester = PnLBacktester(
    data_provider=data_provider,
    fee_models=fee_models,
    slippage_models=slippage_models,
    gas_provider=gas_provider,
)

data_config `class-attribute` `instance-attribute` ¶

data_config: BacktestDataConfig | None = None

Optional BacktestDataConfig for controlling historical data providers.

When provided, this configuration is passed to strategy-specific adapters (LP, Perp, Lending) to control historical data provider behavior:

use_historical_volume: Fetch LP fee data from subgraphs
use_historical_funding: Fetch perp funding rates from APIs
use_historical_apy: Fetch lending APY from subgraphs
strict_historical_mode: Fail if historical data unavailable
Fallback values for when historical data is unavailable
Rate limiting configuration for CoinGecko and The Graph
Cache settings for persistent data storage

Example

from almanak.framework.backtesting.config import BacktestDataConfig

data_config = BacktestDataConfig(
    use_historical_volume=True,
    use_historical_funding=True,
    use_historical_apy=True,
    strict_historical_mode=False,
)
backtester = PnLBacktester(
    data_provider=data_provider,
    fee_models=fee_models,
    slippage_models=slippage_models,
    data_config=data_config,
)

__post_init__ ¶

__post_init__() -> None

Validate configuration after initialization.

run_preflight_validation `async` ¶

run_preflight_validation(
    config: PnLBacktestConfig,
) -> PreflightReport

Run preflight validation checks before starting a backtest.

Performs validation checks to ensure data requirements can be met:

Checks price data availability for all tokens in config
Verifies data provider capabilities match requirements
Tests archive node accessibility if historical TWAP/Chainlink needed
Estimates data coverage based on provider capabilities

Parameters:

Name	Type	Description	Default
`config`	`PnLBacktestConfig`	Backtest configuration specifying tokens, time range, etc.	required

Returns:

Type	Description
`PreflightReport`	PreflightReport with pass/fail status and detailed check results.

Example

preflight = await backtester.run_preflight_validation(config)
if not preflight.passed:
    print(preflight.summary())
    # Handle validation failure
else:
    result = await backtester.backtest(strategy, config)

backtest `async` ¶

backtest(
    strategy: BacktestableStrategy,
    config: PnLBacktestConfig,
) -> BacktestResult

Run a backtest for a strategy over the configured period.

This method:

1. Initializes a simulated portfolio with initial capital
2. Creates a HistoricalDataConfig from the backtest config
3. Iterates through historical market states
4. For each time step:
    a. Creates a MarketSnapshot from MarketState
    b. Calls strategy.decide(snapshot) to get intent
    c. Queues intent for execution (with inclusion delay)
    d. Executes queued intents
    e. Marks portfolio to market
5. Calculates final metrics and returns BacktestResult
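The per-tick steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the engine's implementation: `run_ticks`, the `decide` callable, the signed-amount intent, and treating one block of delay as one tick are all hypothetical choices made for the example.

```python
from collections import deque
from decimal import Decimal


def run_ticks(prices, decide, inclusion_delay=1, cash=Decimal("1000")):
    """Toy tick loop: decide, queue with a delay, execute due intents, mark to market.

    Hypothetical helper. An "intent" here is just a signed token amount
    (positive = buy, negative = sell), which is an assumption for the sketch.
    """
    queue = deque()  # entries: (tick at which the intent becomes executable, amount)
    tokens = Decimal("0")
    equity_curve = []
    for tick, price in enumerate(prices):
        intent = decide(tick, price)
        if intent is not None:
            queue.append((tick + inclusion_delay, intent))
        # Execute every queued intent whose delay has elapsed.
        while queue and queue[0][0] <= tick:
            _, amount = queue.popleft()
            cash -= amount * price
            tokens += amount
        # Mark the portfolio to market at this tick's price.
        equity_curve.append(cash + tokens * price)
    return equity_curve


prices = [Decimal("100"), Decimal("110"), Decimal("121")]
# Buy one token on the first tick only; with a delay of 1 it fills at the second price.
curve = run_ticks(prices, lambda tick, price: Decimal("1") if tick == 0 else None)
```

Because the intent is queued rather than filled immediately, the purchase happens at 110 instead of 100, which is the effect the inclusion delay is meant to capture.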

Parameters:

Name	Type	Description	Default
`strategy`	`BacktestableStrategy`	Strategy to backtest (must implement BacktestableStrategy)	required
`config`	`PnLBacktestConfig`	Backtest configuration (time range, capital, models, etc.)	required

Returns:

Type	Description
`BacktestResult`	BacktestResult with metrics, trades, and equity curve

Raises:

Type	Description
`ValueError`	If strategy is not compatible with backtesting

get_fee_model ¶

get_fee_model(protocol: str) -> FeeModel

Get the fee model for a protocol.

Parameters:

Name	Type	Description	Default
`protocol`	`str`	Protocol name (e.g., "uniswap_v3", "aave_v3")	required

Returns:

Type	Description
`FeeModel`	FeeModel for the protocol, or default if not found
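The "protocol, or default if not found" lookup can be pictured as a dictionary lookup with a `"default"` fallback. The sketch below is an assumption about the behaviour, not the library's code: `lookup_model` is a hypothetical helper, the string values stand in for model objects, and raising `KeyError` when no `"default"` entry exists is a guess.

```python
def lookup_model(models, protocol):
    """Return the protocol-specific entry, else the "default" entry.

    Hypothetical helper; the error raised when neither key exists is an assumption.
    """
    if protocol in models:
        return models[protocol]
    if "default" in models:
        return models["default"]
    raise KeyError(f"no model for {protocol!r} and no 'default' entry")


# Plain strings stand in for FeeModel instances.
fee_models = {"default": "default-fees", "uniswap_v3": "uniswap-fees"}
specific = lookup_model(fee_models, "uniswap_v3")
fallback = lookup_model(fee_models, "aave_v3")
```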

get_slippage_model ¶

get_slippage_model(protocol: str) -> SlippageModel

Get the slippage model for a protocol.

Parameters:

Name	Type	Description	Default
`protocol`	`str`	Protocol name (e.g., "uniswap_v3", "gmx")	required

Returns:

Type	Description
`SlippageModel`	SlippageModel for the protocol, or default if not found

PnLBacktestConfig¶

almanak.framework.backtesting.PnLBacktestConfig `dataclass` ¶

PnLBacktestConfig(
    start_time: datetime,
    end_time: datetime,
    interval_seconds: int = 3600,
    initial_capital_usd: Decimal = Decimal("10000"),
    fee_model: str = "realistic",
    slippage_model: str = "realistic",
    include_gas_costs: bool = True,
    gas_price_gwei: Decimal = Decimal("30"),
    inclusion_delay_blocks: int = 1,
    chain: str = "arbitrum",
    tokens: list[str] = (lambda: ["WETH", "USDC"])(),
    benchmark_token: str = "WETH",
    risk_free_rate: Decimal = Decimal("0.05"),
    trading_days_per_year: int = 365,
    initial_margin_ratio: Decimal = Decimal("0.1"),
    maintenance_margin_ratio: Decimal = Decimal("0.05"),
    mev_simulation_enabled: bool = False,
    auto_correct_positions: bool = False,
    reconciliation_alert_threshold_pct: Decimal = Decimal(
        "0.05"
    ),
    random_seed: int | None = None,
    strict_reproducibility: bool = False,
    staleness_threshold_seconds: int = 3600,
    institutional_mode: bool = False,
    min_data_coverage: Decimal = Decimal("0.98"),
    allow_hardcoded_fallback: bool = False,
    allow_degraded_data: bool = True,
    require_symbol_mapping: bool = False,
    use_historical_gas_prices: bool = False,
    gas_eth_price_override: Decimal | None = None,
    use_historical_gas_gwei: bool = False,
    track_gas_prices: bool = False,
    preflight_validation: bool = True,
    fail_on_preflight_error: bool = True,
)

Configuration for a PnL backtest simulation.

Controls all parameters of the backtest including time range, initial capital, fee/slippage models, gas costs, and execution delay simulation.

Attributes:

Name	Type	Description
`start_time`	`datetime`	Start of the backtest period (inclusive)
`end_time`	`datetime`	End of the backtest period (inclusive)
`interval_seconds`	`int`	Time between simulation ticks in seconds (default: 3600 = 1 hour)
`initial_capital_usd`	`Decimal`	Starting capital in USD
`fee_model`	`str`	Fee model to use - 'realistic', 'zero', or protocol-specific (e.g., 'uniswap_v3', 'aave_v3', 'gmx')
`slippage_model`	`str`	Slippage model to use - 'realistic', 'zero', or protocol-specific (e.g., 'liquidity_aware', 'constant')
`include_gas_costs`	`bool`	Whether to include gas costs in PnL calculations
`gas_price_gwei`	`Decimal`	Gas price to use for cost calculations (default: 30 gwei)
`inclusion_delay_blocks`	`int`	Number of blocks to delay intent execution to simulate realistic trade timing (default: 1). When > 0, intents are queued and executed in the next iteration(s) rather than immediately.
`chain`	`str`	Blockchain to simulate execution on (default: 'arbitrum')
`tokens`	`list[str]`	List of tokens to track prices for (default: ['WETH', 'USDC'])
`benchmark_token`	`str`	Token to use for benchmark comparisons (default: 'WETH')
`risk_free_rate`	`Decimal`	Annual risk-free rate for Sharpe ratio calculation (default: 0.05)
`trading_days_per_year`	`int`	Number of trading days for annualization (default: 365)
`initial_margin_ratio`	`Decimal`	Initial margin ratio for opening perp positions (default: 0.1 = 10%)
`maintenance_margin_ratio`	`Decimal`	Maintenance margin ratio for liquidation (default: 0.05 = 5%)
`mev_simulation_enabled`	`bool`	Enable MEV cost simulation for realistic execution costs (default: False)
`auto_correct_positions`	`bool`	Enable auto-correction of tracked positions when discrepancies are detected
`reconciliation_alert_threshold_pct`	`Decimal`	Threshold percentage for triggering reconciliation alerts (default: 5%)

Example

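from datetime import datetime
from decimal import Decimal

from almanak.framework.backtesting import PnLBacktestConfig

config = PnLBacktestConfig(
    start_time=datetime(2024, 1, 1),
    end_time=datetime(2024, 6, 1),
    initial_capital_usd=Decimal("10000"),
)
print(f"Duration: {config.duration_days:.1f} days")
print(f"Estimated ticks: {config.estimated_ticks}")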

initial_margin_ratio `class-attribute` `instance-attribute` ¶

initial_margin_ratio: Decimal = Decimal('0.1')

Initial margin ratio required to open a position (default: 0.1 = 10%). This is the minimum margin/position_value ratio required to open a new perp position.

maintenance_margin_ratio `class-attribute` `instance-attribute` ¶

maintenance_margin_ratio: Decimal = Decimal('0.05')

Maintenance margin ratio for liquidation threshold (default: 0.05 = 5%). When margin/position_value falls below this, the position may be liquidated.
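The two margin ratios can be illustrated with a short worked example. The helper names are hypothetical, and the exact comparison operators (inclusive for opening, strict for liquidation) are assumptions read off the wording "minimum ... required" and "falls below".

```python
from decimal import Decimal


def margin_ratio(margin_usd, position_value_usd):
    """margin / position_value, as described for both thresholds."""
    return margin_usd / position_value_usd


def can_open(margin_usd, position_value_usd, initial_margin_ratio=Decimal("0.1")):
    # Assumption: meeting the minimum exactly is enough to open.
    return margin_ratio(margin_usd, position_value_usd) >= initial_margin_ratio


def is_liquidatable(margin_usd, position_value_usd, maintenance_margin_ratio=Decimal("0.05")):
    # Assumption: liquidation risk starts strictly below the threshold.
    return margin_ratio(margin_usd, position_value_usd) < maintenance_margin_ratio


position = Decimal("10000")
opens_at_10_pct = can_open(Decimal("1000"), position)          # ratio 0.10
at_risk_at_4_pct = is_liquidatable(Decimal("400"), position)   # ratio 0.04
```

With the defaults, a 10,000 USD position needs at least 1,000 USD of margin to open and becomes liquidatable once margin drops under 500 USD.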

mev_simulation_enabled `class-attribute` `instance-attribute` ¶

mev_simulation_enabled: bool = False

Enable MEV (Maximal Extractable Value) cost simulation (default: False). When enabled, simulates sandwich attack probability and additional slippage based on trade size and token characteristics. Adds estimated MEV costs to trade records and total MEV cost to backtest metrics.

auto_correct_positions `class-attribute` `instance-attribute` ¶

auto_correct_positions: bool = False

Enable automatic position correction when discrepancies are detected (default: False). When enabled, the reconciliation process will update tracked positions to match actual on-chain state when discrepancies exceed the alert threshold. Corrected positions will have auto_corrected=True in their ReconciliationEvent.

reconciliation_alert_threshold_pct `class-attribute` `instance-attribute` ¶

reconciliation_alert_threshold_pct: Decimal = Decimal(
    "0.05"
)

Threshold percentage for triggering reconciliation alerts (default: 5%). When discrepancy_pct exceeds this threshold, an alert is emitted. Set to 0 to alert on any discrepancy, or higher values to only alert on significant drift.

random_seed `class-attribute` `instance-attribute` ¶

random_seed: int | None = None

Random seed for reproducibility (default: None = no seed). When set, any randomness in the backtest (e.g., Monte Carlo simulations, random sampling) will use this seed for reproducibility. The seed is recorded in the config for re-running identical backtests.

strict_reproducibility `class-attribute` `instance-attribute` ¶

strict_reproducibility: bool = False

Enforce strict reproducibility mode (default: False).

When enabled, the backtest will raise errors instead of using fallbacks that could produce non-deterministic results:

Raises ValueError if simulation timestamp is missing (instead of using datetime.now())
Raises ValueError if required historical data is unavailable
Requires all price sources to provide historical data, not just current prices

Use this mode when you need byte-identical results across multiple runs with the same configuration and random_seed. When disabled, the backtester will use reasonable defaults and log warnings instead of failing.

staleness_threshold_seconds `class-attribute` `instance-attribute` ¶

staleness_threshold_seconds: int = 3600

Threshold in seconds for marking price data as stale (default: 3600 = 1 hour).

Price data older than this threshold relative to the simulation timestamp will be counted as stale in the data quality report. This helps identify backtests that may be using outdated price information.

Set to 0 to disable staleness tracking.
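A staleness check of this kind might look like the following sketch. `is_stale` is a hypothetical helper, and the strict "older than" comparison is an assumption based on the wording above.

```python
from datetime import datetime, timedelta


def is_stale(price_timestamp, simulation_timestamp, threshold_seconds=3600):
    """True if the price is older than the threshold relative to the simulation time."""
    if threshold_seconds == 0:
        return False  # a threshold of 0 disables staleness tracking
    age_seconds = (simulation_timestamp - price_timestamp).total_seconds()
    return age_seconds > threshold_seconds


sim_time = datetime(2024, 1, 1, 12, 0, 0)
two_hours_old = is_stale(sim_time - timedelta(hours=2), sim_time)
thirty_minutes_old = is_stale(sim_time - timedelta(minutes=30), sim_time)
```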

institutional_mode `class-attribute` `instance-attribute` ¶

institutional_mode: bool = False

Enable institutional-grade enforcement mode (default: False).

When enabled, applies stricter data quality requirements suitable for institutional trading operations:

Fails backtest if data coverage is below min_data_coverage threshold
Disables hardcoded price fallbacks (allow_hardcoded_fallback=False)
Requires historical price data from verified sources
Enforces strict reproducibility (strict_reproducibility=True)

This mode is designed for production-grade backtests where data quality and reproducibility are critical. Use for institutional trading strategies or when accurate PnL calculations are required for compliance/reporting.
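One way to picture how a single flag can force several others is a dataclass whose `__post_init__` applies the overrides. The class below is a hypothetical stand-in, not `PnLBacktestConfig` itself; the forced values are taken from the field notes on this page, and the mechanics are an assumption.

```python
from dataclasses import dataclass


@dataclass
class ModeFlags:
    """Hypothetical stand-in holding only the flags affected by institutional mode."""

    institutional_mode: bool = False
    strict_reproducibility: bool = False
    allow_hardcoded_fallback: bool = False
    allow_degraded_data: bool = True
    require_symbol_mapping: bool = False

    def __post_init__(self):
        # Institutional mode wins over whatever the caller passed for these flags.
        if self.institutional_mode:
            self.strict_reproducibility = True
            self.allow_hardcoded_fallback = False
            self.allow_degraded_data = False
            self.require_symbol_mapping = True


flags = ModeFlags(institutional_mode=True, allow_hardcoded_fallback=True)
```

Even though the caller asked for hardcoded fallbacks, the resulting object has them disabled, which mirrors the "automatically set" notes on the individual fields.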

min_data_coverage `class-attribute` `instance-attribute` ¶

min_data_coverage: Decimal = Decimal('0.98')

Minimum data coverage ratio required in institutional mode (default: 0.98 = 98%).

When institutional_mode is enabled, the backtest will fail if the actual data coverage ratio (successful price lookups / total lookups) falls below this threshold.

When institutional_mode is disabled, this threshold is only used for warnings in the data quality report, not enforcement.

Valid range: 0.0 to 1.0 (0% to 100%)
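The coverage rule can be sketched as a ratio plus a mode-dependent outcome. Both helpers are hypothetical; treating zero lookups as full coverage and returning string labels are assumptions made for the example.

```python
from decimal import Decimal


def coverage_ratio(successful_lookups, total_lookups):
    """successful price lookups / total lookups."""
    if total_lookups == 0:
        return Decimal("1")  # assumption: nothing requested counts as full coverage
    return Decimal(successful_lookups) / Decimal(total_lookups)


def check_coverage(successful, total, min_data_coverage=Decimal("0.98"), institutional_mode=False):
    """Return "ok", "warn" (report only), or "fail" (institutional mode)."""
    if coverage_ratio(successful, total) >= min_data_coverage:
        return "ok"
    return "fail" if institutional_mode else "warn"


relaxed = check_coverage(970, 1000)
strict = check_coverage(970, 1000, institutional_mode=True)
```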

allow_hardcoded_fallback `class-attribute` `instance-attribute` ¶

allow_hardcoded_fallback: bool = False

Allow hardcoded price fallbacks when price data is unavailable (default: False).

When disabled (default): The backtester will raise an error if it cannot find price data, ensuring that all valuations use actual market prices. This is the institutional-grade setting for production backtests.

When enabled: The backtester may use hardcoded fallback prices for tokens when historical price data is unavailable. This can mask data quality issues and should only be used for development/testing where price accuracy is not critical.

Note: This is automatically set to False when institutional_mode=True in __post_init__, as institutional-grade backtests should never use arbitrary hardcoded prices.

Environment variable: Set ALMANAK_ALLOW_HARDCODED_PRICES=1 to override for testing scenarios where you need relaxed defaults.

allow_degraded_data `class-attribute` `instance-attribute` ¶

allow_degraded_data: bool = True

Allow backtests to proceed with degraded or incomplete data (default: True).

When enabled, the backtester will continue execution even when:

Some price data is missing or interpolated
Data sources return stale information
Historical data has gaps

When disabled, the backtester will fail fast if data quality issues are detected, ensuring only high-quality data is used for analysis.

Note: This is automatically set to False when institutional_mode=True in __post_init__, as institutional-grade backtests require complete data.

require_symbol_mapping `class-attribute` `instance-attribute` ¶

require_symbol_mapping: bool = False

Require all token addresses to be resolved to symbols (default: False).

When enabled, the backtester will fail if any token address cannot be resolved to a human-readable symbol. This ensures all trade records and reports use consistent, recognizable token names.

When disabled, unresolved token addresses are used as-is (checksummed), which may make reports harder to read and audit.

Note: This is automatically set to True when institutional_mode=True in __post_init__, as institutional-grade backtests require clear symbol identification for compliance and reporting purposes.

use_historical_gas_prices `class-attribute` `instance-attribute` ¶

use_historical_gas_prices: bool = False

Use historical gas prices for accurate gas cost simulation (default: False).

When enabled, the backtester will attempt to fetch historical ETH prices at each simulation timestamp to calculate gas costs more accurately. This provides realistic gas cost estimates that reflect market conditions at the time of simulated trades.

When disabled, gas costs use the current ETH price or gas_eth_price_override if specified. This is faster but less accurate for historical backtests.

Note: Requires a data provider that supports historical price lookups.

gas_eth_price_override `class-attribute` `instance-attribute` ¶

gas_eth_price_override: Decimal | None = None

Override ETH price for gas cost calculations (default: None = use market price).

When set, this value is used as the ETH price for all gas cost calculations, ignoring both historical and current market prices. This is useful for:

Testing with a fixed ETH price for reproducibility
Stress testing with extreme ETH price scenarios
Backtests where gas cost accuracy is not critical

When None, gas costs use:

1. Historical ETH price (if use_historical_gas_prices=True)
2. Current ETH price from data provider
3. Default fallback ($3000) with warning if unavailable

Value should be in USD (e.g., Decimal("3000") for $3000 per ETH).
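The resolution order for the ETH price can be sketched as a chain of checks. `resolve_eth_price` and its parameters are hypothetical; only the ordering and the $3000 fallback come from the description above.

```python
from decimal import Decimal

DEFAULT_ETH_PRICE_USD = Decimal("3000")  # documented fallback value


def resolve_eth_price(override=None, use_historical=False, historical=None, current=None):
    """Return (price, source) following the documented priority order."""
    if override is not None:
        return override, "override"
    if use_historical and historical is not None:
        return historical, "historical"
    if current is not None:
        return current, "current"
    return DEFAULT_ETH_PRICE_USD, "fallback"


pinned = resolve_eth_price(override=Decimal("2500"), use_historical=True, historical=Decimal("1800"))
from_history = resolve_eth_price(use_historical=True, historical=Decimal("1800"), current=Decimal("3200"))
nothing_available = resolve_eth_price()
```

Returning the source alongside the price makes it easy to log a warning whenever the fallback is used.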

use_historical_gas_gwei `class-attribute` `instance-attribute` ¶

use_historical_gas_gwei: bool = False

Use historical gas prices (gwei) from gas price provider (default: False).

When enabled and a gas_provider is attached to the PnLBacktester, the engine will fetch historical gas prices at each simulation timestamp instead of using the static gas_price_gwei value. This provides more realistic gas cost estimates that reflect network congestion at historical timestamps.

Priority order for gas price (gwei):

1. Historical gas price from gas_provider (if use_historical_gas_gwei=True)
2. MarketState.gas_price_gwei (if populated by data provider)
3. config.gas_price_gwei (static default: 30 gwei)

When disabled, gas costs use the static gas_price_gwei for all trades, which is faster but may not reflect actual network conditions.

Note: Requires a GasPriceProvider (e.g., EtherscanGasPriceProvider) to be passed to the PnLBacktester. If enabled without a provider, falls back to MarketState.gas_price_gwei or config.gas_price_gwei with a warning.

track_gas_prices `class-attribute` `instance-attribute` ¶

track_gas_prices: bool = False

Track detailed gas price records for each trade (default: False).

When enabled, the backtester records a GasPriceRecord for each trade, capturing the gas price in gwei, source, and USD cost. These records are stored in BacktestResult.gas_prices_used for detailed analysis.

This is useful for:

Analyzing gas price volatility impact on strategy performance
Understanding gas cost breakdown by source (historical vs config)
Auditing gas costs in institutional-grade backtests

When disabled, only summary statistics (gas_price_summary) are populated from the TradeRecord.gas_price_gwei values, reducing result size.

Note: Gas price summary statistics are always calculated regardless of this setting, since TradeRecord already contains gas_price_gwei.

preflight_validation `class-attribute` `instance-attribute` ¶

preflight_validation: bool = True

Enable preflight validation before running backtest (default: True).

When enabled, the backtester performs validation checks before starting the simulation to ensure data requirements can be met:

Checks price data availability for all tokens in config
Verifies data provider capabilities match requirements
Tests archive node accessibility if historical TWAP/Chainlink needed
Reports estimated data coverage and potential gaps

Results are returned in a PreflightReport with pass/fail and details. This helps identify data issues early, before spending time on a backtest that would fail or produce inaccurate results.

When disabled, the backtest proceeds without validation, which is faster but may encounter data issues during simulation.

fail_on_preflight_error `class-attribute` `instance-attribute` ¶

fail_on_preflight_error: bool = True

Fail fast if preflight validation fails (default: True).

When enabled (True): If preflight validation detects critical issues (e.g., missing price data, insufficient data coverage), the backtester raises PreflightValidationError with an actionable error message that includes:

What failed (specific checks that did not pass)
Why it failed (the underlying cause)
How to fix it (recommendations for resolution)

When disabled (False): The backtester logs warnings about preflight issues but continues in degraded mode. This is useful for exploratory backtests where you want to see partial results even with data gaps.

The preflight_passed field in BacktestResult indicates whether preflight validation passed, regardless of this setting.

Note: This setting only applies when preflight_validation=True.
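The difference between failing fast and continuing in degraded mode can be sketched as follows. `handle_preflight` and the local `PreflightError` class are hypothetical stand-ins; the real exception is `PreflightValidationError`, whose constructor is not documented here.

```python
import logging

logger = logging.getLogger("preflight_sketch")


class PreflightError(Exception):
    """Hypothetical stand-in for PreflightValidationError."""


def handle_preflight(passed, failures, fail_on_preflight_error=True):
    """Return the value a preflight_passed field would carry, or raise."""
    if passed:
        return True
    message = "; ".join(failures)
    if fail_on_preflight_error:
        raise PreflightError(message)
    # Degraded mode: record the problem and let the run continue.
    logger.warning("preflight failed, continuing in degraded mode: %s", message)
    return False


degraded = handle_preflight(False, ["missing WETH prices"], fail_on_preflight_error=False)
```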

duration_seconds `property` ¶

duration_seconds: int

Get the total backtest duration in seconds.

duration_days `property` ¶

duration_days: float

Get the total backtest duration in days.

duration_hours `property` ¶

duration_hours: float

Get the total backtest duration in hours.

estimated_ticks `property` ¶

estimated_ticks: int

Get the estimated number of simulation ticks.

interval_hours `property` ¶

interval_hours: float

Get the interval between ticks in hours.
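The duration and interval properties are simple arithmetic on the config fields. The sketch below recomputes them by hand for the range used in the class example; floor division for the tick estimate is an assumption, and the real property may count the end point differently.

```python
from datetime import datetime

start_time = datetime(2024, 1, 1)
end_time = datetime(2024, 6, 1)
interval_seconds = 3600

duration_seconds = int((end_time - start_time).total_seconds())
duration_days = duration_seconds / 86400
duration_hours = duration_seconds / 3600
interval_hours = interval_seconds / 3600
# Assumption: one tick per full interval inside the range.
estimated_ticks = duration_seconds // interval_seconds
```

For this range the duration is 152 days, so an hourly interval gives an estimate of 3648 ticks under that assumption.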

__post_init__ ¶

__post_init__() -> None

Validate configuration after initialization.

get_gas_cost_usd ¶

get_gas_cost_usd(
    gas_used: int, eth_price_usd: Decimal
) -> Decimal

Calculate gas cost in USD for a given amount of gas used.

Parameters:

Name	Type	Description	Default
`gas_used`	`int`	Amount of gas consumed by the transaction	required
`eth_price_usd`	`Decimal`	Current ETH price in USD	required

Returns:

Type	Description
`Decimal`	Gas cost in USD
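The underlying conversion is gas units times gas price (gwei), converted to ETH, times the ETH price. The free function below is a hypothetical sketch: the documented method takes only `gas_used` and `eth_price_usd`, so passing the gas price explicitly is an assumption made to keep the example self-contained.

```python
from decimal import Decimal

GWEI_PER_ETH = Decimal("1000000000")  # 1 ETH = 10^9 gwei


def gas_cost_usd(gas_used, gas_price_gwei, eth_price_usd):
    """gas_used * gas_price_gwei / 1e9 * eth_price_usd, all in Decimal."""
    gas_cost_eth = Decimal(gas_used) * gas_price_gwei / GWEI_PER_ETH
    return gas_cost_eth * eth_price_usd


# 150,000 gas at 30 gwei with ETH at 3000 USD.
cost = gas_cost_usd(150_000, Decimal("30"), Decimal("3000"))
```

In this example the transaction costs 0.0045 ETH, or 13.50 USD.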

to_dict ¶

to_dict() -> dict[str, Any]

Serialize to dictionary.

to_dict_with_metadata ¶

to_dict_with_metadata(
    data_provider_info: dict[str, Any] | None = None,
) -> dict[str, Any]

Serialize to dictionary with full metadata for reproducibility.

This method extends to_dict() to include additional metadata needed to reproduce a backtest exactly, such as:

Data provider versions and timestamps
SDK/framework versions
Run timestamp

Parameters:

Name	Type	Description	Default
`data_provider_info`	`dict[str, Any] \| None`	Optional dict containing data provider information: - name: Provider name (e.g., "coingecko", "chainlink") - version: Provider version if available - data_fetched_at: ISO timestamp when data was fetched - cache_hit_rate: Optional cache hit rate percentage - additional provider-specific metadata	`None`

Returns:

Type	Description
`dict[str, Any]`	Dictionary with full config and metadata for reproducibility

calculate_config_hash ¶

calculate_config_hash() -> str

Calculate a deterministic hash of the configuration for verification.

The hash is calculated from all configuration parameters that affect backtest results, excluding runtime metadata like timestamps. This enables verification that a backtest was run with identical config.

The hash uses SHA-256 and includes:

Time range (start_time, end_time, interval_seconds)
Capital settings (initial_capital_usd)
Model settings (fee_model, slippage_model)
Gas settings (include_gas_costs, gas_price_gwei)
Execution settings (inclusion_delay_blocks)
Chain and token settings
Metrics settings (benchmark_token, risk_free_rate, etc.)
Margin settings
Other simulation parameters

Returns:

Type	Description
`str`	64-character hex string (SHA-256 hash)

Example

config = PnLBacktestConfig(...)
hash1 = config.calculate_config_hash()

# Same config produces same hash
config2 = PnLBacktestConfig(...)  # identical params
hash2 = config2.calculate_config_hash()
assert hash1 == hash2
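A deterministic hash of this kind is commonly built by serializing the parameters to canonical JSON and hashing the bytes. The sketch below shows that general technique; `config_hash` is a hypothetical helper, and whether the library serializes its fields this way is an assumption.

```python
import hashlib
import json
from decimal import Decimal


def config_hash(params):
    """SHA-256 over canonical JSON: sorted keys, compact separators, str() for non-JSON types."""
    canonical = json.dumps(params, sort_keys=True, default=str, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same parameters in a different insertion order.
params_a = {"interval_seconds": 3600, "initial_capital_usd": Decimal("10000"), "chain": "arbitrum"}
params_b = {"chain": "arbitrum", "initial_capital_usd": Decimal("10000"), "interval_seconds": 3600}
hash_a = config_hash(params_a)
hash_b = config_hash(params_b)
```

Sorting keys removes any dependence on insertion order, and leaving run timestamps out of `params` keeps the hash stable across runs.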

from_dict `classmethod` ¶

from_dict(data: dict[str, Any]) -> PnLBacktestConfig

Deserialize from dictionary.

Parameters:

Name	Type	Description	Default
`data`	`dict[str, Any]`	Dictionary containing config fields	required

Returns:

Type	Description
`PnLBacktestConfig`	PnLBacktestConfig instance

__repr__ ¶

__repr__() -> str

Return a human-readable representation.

Paper Trader¶

PaperTrader¶

almanak.framework.backtesting.PaperTrader `dataclass` ¶

PaperTrader(
    fork_manager: RollingForkManager,
    portfolio_tracker: PaperPortfolioTracker,
    config: PaperTraderConfig,
    event_callback: PaperTradeEventCallback | None = None,
)

Main paper trading engine for fork-based strategy simulation.

The PaperTrader executes strategy decisions on local Anvil forks, providing accurate simulation of real DeFi execution. It:

Manages fork lifecycle via RollingForkManager
Calls strategy.decide() at configured intervals
Compiles intents to ActionBundles
Executes transactions on the fork via ExecutionOrchestrator
Tracks portfolio state and records trades
Calculates comprehensive performance metrics

Attributes:

Name	Type	Description
`fork_manager`	`RollingForkManager`	RollingForkManager for Anvil fork lifecycle
`portfolio_tracker`	`PaperPortfolioTracker`	PaperPortfolioTracker for state tracking
`config`	`PaperTraderConfig`	PaperTraderConfig with execution parameters
`event_callback`	`PaperTradeEventCallback \| None`	Optional callback for trading events

Example

trader = PaperTrader(
    fork_manager=fork_manager,
    portfolio_tracker=portfolio_tracker,
    config=PaperTraderConfig(tick_interval_seconds=60),
)

# Run for 1 hour
result = await trader.run(my_strategy, duration_seconds=3600)

# Or run indefinitely until stopped
await trader.start(my_strategy)

# ... later ...
await trader.stop()

__post_init__ ¶

__post_init__() -> None

Validate configuration after initialization.

run `async` ¶

run(
    strategy: PaperTradeableStrategy,
    duration_seconds: float | None = None,
    max_ticks: int | None = None,
) -> BacktestResult

Run a paper trading session for the specified duration.

This is the main entry point for paper trading. It:

1. Initializes the fork and orchestrator
2. Runs the trading loop for the specified duration
3. Calculates and returns metrics

Parameters:

Name	Type	Description	Default
`strategy`	`PaperTradeableStrategy`	Strategy to paper trade	required
`duration_seconds`	`float \| None`	Maximum duration in seconds (None = config default)	`None`
`max_ticks`	`int \| None`	Maximum number of ticks (None = no limit)	`None`

Returns:

Type	Description
`BacktestResult`	BacktestResult with comprehensive metrics and trades

Raises:

Type	Description
`RuntimeError`	If already running

start `async` ¶

start(strategy: PaperTradeableStrategy) -> None

Start continuous paper trading until stop() is called.

This method runs paper trading indefinitely. Call stop() to end the session gracefully.

Parameters:

Name	Type	Description	Default
`strategy`	`PaperTradeableStrategy`	Strategy to paper trade	required

Raises:

Type	Description
`RuntimeError`	If already running

stop `async` ¶

stop() -> None

Stop the current paper trading session.

Signals the trading loop to exit gracefully. The current tick will complete before stopping.

is_running ¶

is_running() -> bool

Check if paper trading is currently active.

Returns:

Type	Description
`bool`	True if a session is running

tick `async` ¶

tick() -> PaperTrade | None

Execute one trading cycle (tick) manually.

This method allows manual tick execution for testing or custom integration. It performs one complete trading cycle:

Optionally resets fork to latest block (based on config)
Creates MarketSnapshot from current fork state
Calls strategy.decide(snapshot) to get intent
If intent returned (non-HOLD), executes via orchestrator on fork
Records trade result in portfolio_tracker
Handles and records errors gracefully

Prerequisites

PaperTrader must be initialized (call start() or run() first)
A strategy must be set via _current_strategy

Returns:

Type	Description
`PaperTrade \| None`	PaperTrade if a trade was executed successfully, None otherwise (including HOLD decisions, errors, or no strategy set)

Example

# Manual tick control
trader = PaperTrader(fork_manager, portfolio_tracker, config)
await trader._initialize_fork()
await trader._initialize_orchestrator()
trader._current_strategy = my_strategy
trader._running = True

# Execute single tick
trade = await trader.tick()
if trade:
    print(f"Trade executed: {trade.tx_hash}")

run_loop `async` ¶

run_loop(
    strategy: PaperTradeableStrategy,
    max_ticks: int | None = None,
) -> PaperTradingSummary

Run a paper trading session with a simple tick loop.

This method implements the classic paper trading loop pattern:

1. Initialize fork and orchestrator
2. Loop: call tick(), sleep for tick_interval_seconds
3. Stop when max_ticks reached or _running becomes False
4. Cleanup in finally block

Unlike run(), which returns a comprehensive BacktestResult, this method returns a simpler PaperTradingSummary focused on trade statistics.
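The loop pattern itself can be shown with a small asyncio example. `TickLoop` is a hypothetical miniature, not the PaperTrader class: it has no fork or orchestrator, its `tick()` only counts, and it returns the tick count instead of a PaperTradingSummary.

```python
import asyncio


class TickLoop:
    """Hypothetical miniature of the tick/sleep/stop pattern."""

    def __init__(self, tick_interval_seconds=0.0):
        self.tick_interval_seconds = tick_interval_seconds
        self._running = False
        self.ticks = 0
        self.cleaned_up = False

    async def tick(self):
        self.ticks += 1  # a real tick would call the strategy and execute on the fork

    def stop(self):
        self._running = False  # the current tick finishes before the loop exits

    async def run_loop(self, max_ticks=None):
        if self._running:
            raise RuntimeError("already running")
        self._running = True
        try:
            while self._running and (max_ticks is None or self.ticks < max_ticks):
                await self.tick()
                await asyncio.sleep(self.tick_interval_seconds)
        finally:
            # Cleanup runs whether the loop ended normally, was stopped, or raised.
            self._running = False
            self.cleaned_up = True
        return self.ticks


tick_loop = TickLoop()
completed = asyncio.run(tick_loop.run_loop(max_ticks=3))
```

Putting the cleanup in `finally` means the running flag is always reset, so a later call does not trip the "already running" check after a failed session.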

Parameters:

Name	Type	Description	Default
`strategy`	`PaperTradeableStrategy`	Strategy to paper trade	required
`max_ticks`	`int \| None`	Maximum number of ticks to run (None = use config.max_ticks, if that's also None, runs until stop() is called)	`None`

Returns:

Type	Description
`PaperTradingSummary`	PaperTradingSummary with session statistics and trade details

Raises:

Type	Description
`RuntimeError`	If already running

Example

trader = PaperTrader(fork_manager, portfolio_tracker, config)
summary = await trader.run_loop(my_strategy, max_ticks=100)
print(summary.summary())

PaperTraderConfig¶

almanak.framework.backtesting.PaperTraderConfig `dataclass` ¶

PaperTraderConfig(
    chain: str,
    rpc_url: str,
    strategy_id: str,
    initial_eth: Decimal = Decimal("10"),
    initial_tokens: dict[str, Decimal] = dict(),
    tick_interval_seconds: int = 60,
    max_ticks: int | None = None,
    anvil_port: int = 8546,
    reset_fork_every_tick: bool = True,
    startup_timeout_seconds: float = 30.0,
    auto_impersonate: bool = True,
    block_time: int | None = None,
    wallet_address: str | None = None,
    log_trades: bool = True,
    log_level: str = "INFO",
    price_source: Literal[
        "coingecko", "chainlink", "twap", "auto"
    ] = "auto",
    strict_price_mode: bool = True,
    allow_hardcoded_fallback: bool | None = None,
)

Configuration for a paper trading session.

Controls all parameters of the paper trading session including chain, initial balances, tick intervals, and Anvil fork settings.

Paper trading executes real transactions on a local Anvil fork, allowing strategies to be validated with actual DeFi protocol interactions before deployment with real capital.

Attributes:

Name	Type	Description
`chain`	`str`	Blockchain to paper trade on (e.g., "arbitrum", "ethereum")
`rpc_url`	`str`	Archive RPC URL to fork from (Alchemy, Infura, etc.)
`strategy_id`	`str`	Identifier of the strategy being tested
`initial_eth`	`Decimal`	Initial ETH balance for the paper wallet (default: 10)
`initial_tokens`	`dict[str, Decimal]`	Dict of token symbol to amount for initial balances
`tick_interval_seconds`	`int`	Time between trading ticks in seconds (default: 60)
`max_ticks`	`int \| None`	Maximum number of ticks to run, None = run indefinitely
`anvil_port`	`int`	Port to run Anvil on (default: 8546)
`reset_fork_every_tick`	`bool`	Whether to reset fork to latest block each tick (default: True)
`startup_timeout_seconds`	`float`	Timeout for Anvil startup (default: 30)
`auto_impersonate`	`bool`	Enable auto-impersonation for any address (default: True)
`block_time`	`int \| None`	Optional block time in seconds (default: None = instant)
`wallet_address`	`str \| None`	Optional paper wallet address (default: None = auto-generated)
`log_trades`	`bool`	Whether to log individual trades (default: True)
`log_level`	`str`	Logging level for paper trader (default: "INFO")
`price_source`	`Literal['coingecko', 'chainlink', 'twap', 'auto']`	Price source to use ('coingecko', 'chainlink', 'twap', 'auto')

Example

config = PaperTraderConfig(
    chain="arbitrum",
    rpc_url="https://arb1.arbitrum.io/rpc",
    strategy_id="momentum_v1",
    initial_eth=Decimal("10"),
    initial_tokens={"USDC": Decimal("10000")},
)
print(f"Chain: {config.chain} (ID: {config.chain_id})")
print(f"Max duration: {config.max_duration_seconds}s")

price_source `class-attribute` `instance-attribute` ¶

price_source: Literal[
    "coingecko", "chainlink", "twap", "auto"
] = "auto"

Price source to use for portfolio valuation.

Options

'coingecko': Use CoinGecko API for market prices. Best for: General tokens, off-chain price feeds, no RPC needed.
'chainlink': Use Chainlink oracles for on-chain prices. Best for: Major tokens with Chainlink feeds, trustless pricing.
'twap': Use time-weighted average price from DEX pools. Best for: On-chain pricing, newer tokens, DEX-native prices.
'auto' (default): Automatic fallback chain - tries Chainlink first, falls back to TWAP, then CoinGecko if others fail.
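The 'auto' ordering can be sketched as a first-success lookup over an ordered list of sources. This is an illustration only: `source_order`, `first_available`, and the three fake provider functions are hypothetical, and the real providers are certainly not plain callables like these.

```python
from decimal import Decimal
from typing import Callable, Dict, List, Tuple

# Order taken from the 'auto' description: Chainlink, then TWAP, then CoinGecko.
AUTO_ORDER = ["chainlink", "twap", "coingecko"]


def source_order(price_source: str) -> List[str]:
    """Return the sources to try, in order, for a given price_source value."""
    if price_source == "auto":
        return list(AUTO_ORDER)
    if price_source in AUTO_ORDER:
        return [price_source]
    raise ValueError(f"unknown price_source: {price_source!r}")


def first_available(
    token: str,
    price_source: str,
    providers: Dict[str, Callable[[str], Decimal]],
) -> Tuple[str, Decimal]:
    """Try each source in order; return (source_name, price) from the first that works."""
    for name in source_order(price_source):
        try:
            return name, providers[name](token)
        except LookupError:
            continue  # this source has no price for the token; try the next one
    raise ValueError(f"no price for {token} via {price_source}")


def _chainlink(token: str) -> Decimal:
    raise KeyError(token)  # pretend there is no Chainlink feed for this token


def _twap(token: str) -> Decimal:
    return Decimal("1.25")


def _coingecko(token: str) -> Decimal:
    return Decimal("1.30")


PROVIDERS = {"chainlink": _chainlink, "twap": _twap, "coingecko": _coingecko}
```

Returning the source name alongside the price makes it easy to see which tier of the chain actually answered.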

strict_price_mode `class-attribute` `instance-attribute` ¶

strict_price_mode: bool = True

Whether to fail when price providers cannot return a price.

When True (default): Raises ValueError if all price providers fail for a token. This is the institutional-grade setting that ensures all prices are from real data sources. Use this for production backtests where accuracy is critical. Error messages include the failed token and chain for debugging.

When False: Falls back to hardcoded prices for common tokens (ETH=$3000, BTC=$60000, etc.) when all price providers fail. This allows backtests to complete but may produce inaccurate results. Only use this for development/testing where price accuracy is not critical.

Note: This is the inverse of the deprecated allow_hardcoded_fallback field. If both are set, strict_price_mode takes precedence.

Environment variable: Set ALMANAK_ALLOW_HARDCODED_PRICES=1 to override strict_price_mode and allow hardcoded fallback prices (equivalent to strict_price_mode=False) in testing scenarios.
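The difference between the two modes can be shown with a small sketch. `price_or_fallback` and `FALLBACK_PRICES` are hypothetical names; only the ETH and BTC figures come from the description above, and `live_price=None` stands in for "all price providers failed".

```python
from decimal import Decimal
from typing import Optional

# Illustrative fallback table using the two values quoted in the text above.
FALLBACK_PRICES = {"ETH": Decimal("3000"), "BTC": Decimal("60000")}


def price_or_fallback(
    token: str,
    chain: str,
    live_price: Optional[Decimal],
    strict_price_mode: bool = True,
) -> Decimal:
    """Return the live price, or apply the strict/non-strict policy when it is missing."""
    if live_price is not None:
        return live_price
    if strict_price_mode:
        # Strict: refuse to continue, naming the token and chain for debugging.
        raise ValueError(f"All price providers failed for {token} on {chain}")
    try:
        return FALLBACK_PRICES[token]  # non-strict: hardcoded value, possibly inaccurate
    except KeyError:
        raise ValueError(f"No hardcoded fallback price for {token} on {chain}") from None
```

For example, `price_or_fallback("ETH", "arbitrum", None, strict_price_mode=False)` returns the hardcoded 3000, while the same call with the default strict setting raises ValueError.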

allow_hardcoded_fallback `class-attribute` `instance-attribute` ¶

allow_hardcoded_fallback: bool | None = None

DEPRECATED: Use strict_price_mode instead.

This field is kept for backward compatibility. If set, it will be converted to the equivalent strict_price_mode value (allow_hardcoded_fallback=False is equivalent to strict_price_mode=True).

Will be removed in a future version.

chain_id `property` ¶

chain_id: int

Get the chain ID for the configured chain.

max_duration_seconds `property` ¶

max_duration_seconds: int | None

Get the maximum duration in seconds, or None if indefinite.

max_duration_minutes `property` ¶

max_duration_minutes: float | None

Get the maximum duration in minutes, or None if indefinite.

max_duration_hours `property` ¶

max_duration_hours: float | None

Get the maximum duration in hours, or None if indefinite.

tick_interval_minutes `property` ¶

tick_interval_minutes: float

Get the tick interval in minutes.
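The duration properties are not given formulas in this reference. A plausible reading, and it is only an assumption, is that the maximum duration is `max_ticks` multiplied by `tick_interval_seconds`, with the minute and hour variants being unit conversions. The helper names below are hypothetical.

```python
from typing import Optional


def max_duration_seconds(max_ticks: Optional[int], tick_interval_seconds: int) -> Optional[int]:
    """Assumed formula: ticks x interval; None when the session is unbounded."""
    if max_ticks is None:
        return None
    return max_ticks * tick_interval_seconds


def to_minutes(seconds: Optional[float]) -> Optional[float]:
    return None if seconds is None else seconds / 60


def to_hours(seconds: Optional[float]) -> Optional[float]:
    return None if seconds is None else seconds / 3600
```

Under this assumption, 100 ticks at the default 60-second interval would bound a session at 6000 seconds, or 100 minutes.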

fork_rpc_url `property` ¶

fork_rpc_url: str

Get the local fork RPC URL.

__post_init__ ¶

__post_init__() -> None

Validate configuration after initialization.

get_initial_balances ¶

get_initial_balances() -> dict[str, Decimal]

Get all initial balances including ETH.

Returns:

Type	Description
`dict[str, Decimal]`	Dictionary of token symbol to initial balance amount
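A minimal sketch of what "all initial balances including ETH" could look like, assuming the ETH balance is stored under the key "ETH" and merged with `initial_tokens`. The function name and the precedence rule for a duplicate "ETH" key are assumptions, not documented behavior.

```python
from decimal import Decimal
from typing import Dict


def merge_initial_balances(
    initial_eth: Decimal, initial_tokens: Dict[str, Decimal]
) -> Dict[str, Decimal]:
    """Combine the ETH balance and token balances into one symbol -> amount dict."""
    balances = dict(initial_tokens)  # copy so the caller's dict is not mutated
    balances["ETH"] = initial_eth    # assumption: initial_eth wins over any "ETH" entry
    return balances
```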

to_dict ¶

to_dict() -> dict[str, Any]

Serialize to dictionary.

from_dict `classmethod` ¶

from_dict(data: dict[str, Any]) -> PaperTraderConfig

Deserialize from dictionary.

Parameters:

Name	Type	Description	Default
`data`	`dict[str, Any]`	Dictionary containing config fields	required

Returns:

Type	Description
`PaperTraderConfig`	PaperTraderConfig instance

__repr__ ¶

__repr__() -> str

Return a human-readable representation.

Results¶

BacktestResult¶

almanak.framework.backtesting.BacktestResult `dataclass` ¶

BacktestResult(
    engine: BacktestEngine,
    strategy_id: str,
    start_time: datetime,
    end_time: datetime,
    metrics: BacktestMetrics,
    trades: list[TradeRecord] = list(),
    equity_curve: list[EquityPoint] = list(),
    initial_capital_usd: Decimal = Decimal("10000"),
    final_capital_usd: Decimal = Decimal("10000"),
    chain: str = "arbitrum",
    run_started_at: datetime | None = None,
    run_ended_at: datetime | None = None,
    run_duration_seconds: float = 0.0,
    config: dict[str, Any] = dict(),
    error: str | None = None,
    lending_liquidations: list[
        LendingLiquidationEvent
    ] = list(),
    aggregated_portfolio_view: AggregatedPortfolioView
    | None = None,
    reconciliation_events: list[
        ReconciliationEvent
    ] = list(),
    walk_forward_results: WalkForwardResult | None = None,
    monte_carlo_results: MonteCarloSimulationResult
    | None = None,
    crisis_results: CrisisMetrics | None = None,
    errors: list[dict[str, Any]] = list(),
    backtest_id: str | None = None,
    phase_timings: list[dict[str, Any]] = list(),
    config_hash: str | None = None,
    execution_delayed_at_end: int = 0,
    data_source_capabilities: dict[
        str, HistoricalDataCapability
    ] = dict(),
    data_source_warnings: list[str] = list(),
    data_quality: DataQualityReport | None = None,
    institutional_compliance: bool = True,
    compliance_violations: list[str] = list(),
    fallback_usage: dict[str, int] = dict(),
    preflight_report: PreflightReport | None = None,
    preflight_passed: bool = True,
    gas_prices_used: list[GasPriceRecord] = list(),
    gas_price_summary: GasPriceSummary | None = None,
    parameter_sources: ParameterSourceTracker | None = None,
    accuracy_estimate: AccuracyEstimate | None = None,
    data_coverage_metrics: DataCoverageMetrics
    | None = None,
)

Complete results from a backtest run.

This model is used by both the PnL Backtester and Paper Trader to provide consistent result formatting and analysis.

Attributes:

Name	Type	Description
`engine`	`BacktestEngine`	Which backtesting engine was used (pnl or paper)
`strategy_id`	`str`	Identifier of the strategy being tested
`start_time`	`datetime`	When the backtest started (simulation time)
`end_time`	`datetime`	When the backtest ended (simulation time)
`metrics`	`BacktestMetrics`	Calculated performance metrics
`trades`	`list[TradeRecord]`	List of all trade records
`equity_curve`	`list[EquityPoint]`	Portfolio value over time
`initial_capital_usd`	`Decimal`	Starting capital in USD
`final_capital_usd`	`Decimal`	Ending capital in USD
`chain`	`str`	Target blockchain (arbitrum, base, etc.)
`run_started_at`	`datetime \| None`	When the backtest run actually started (wall time)
`run_ended_at`	`datetime \| None`	When the backtest run actually completed (wall time)
`run_duration_seconds`	`float`	Wall clock duration of the backtest run
`config`	`dict[str, Any]`	Configuration used for the backtest
`error`	`str \| None`	Error message if backtest failed
`lending_liquidations`	`list[LendingLiquidationEvent]`	List of lending liquidation events that occurred
`aggregated_portfolio_view`	`AggregatedPortfolioView \| None`	Tick-by-tick portfolio state snapshots with risk scores
`reconciliation_events`	`list[ReconciliationEvent]`	List of position reconciliation events (discrepancies detected)
`walk_forward_results`	`WalkForwardResult \| None`	Results from walk-forward optimization (if run with --walk-forward)
`monte_carlo_results`	`MonteCarloSimulationResult \| None`	Results from Monte Carlo simulation (if run with --monte-carlo). Contains return confidence intervals, drawdown probabilities, and path statistics.
`crisis_results`	`CrisisMetrics \| None`	Crisis-specific metrics when backtest was run during a crisis scenario. Contains drawdown analysis, recovery time, and comparison to normal period performance.
`errors`	`list[dict[str, Any]]`	List of error records as dictionaries with timestamps and context for debugging and analysis. Each error dict contains: timestamp, error_type, error_message, classification (with error_type, category, is_recoverable, is_fatal, is_non_critical, suggested_action), context, and handled action.
`backtest_id`	`str \| None`	Unique correlation ID (UUID) for this backtest run. Used for structured logging and tracing across all log messages generated during this backtest.
`phase_timings`	`list[dict[str, Any]]`	List of phase timing records showing how long each backtest phase took. Each record contains: phase_name, start_time, end_time, duration_seconds, error. Useful for performance analysis and identifying bottlenecks.
`config_hash`	`str \| None`	SHA-256 hash of the configuration used for this backtest. Enables verification that a backtest was run with identical configuration. Calculated from all parameters that affect backtest results, excluding runtime metadata.
`execution_delayed_at_end`	`int`	Count of pending intents executed at simulation end. These were queued due to inclusion_delay_blocks > 0 and executed with the last market state when the simulation completed.
`data_source_capabilities`	`dict[str, HistoricalDataCapability]`	Dictionary mapping data provider names to their HistoricalDataCapability enum values. Shows which providers were used and their ability to provide accurate historical data (FULL, CURRENT_ONLY, PRE_CACHE). Useful for understanding potential data quality limitations in the backtest.
`data_source_warnings`	`list[str]`	List of warning messages about data source limitations. Generated when providers with CURRENT_ONLY or PRE_CACHE capability are used, as these may affect backtest accuracy.
`data_quality`	`DataQualityReport \| None`	Data quality metrics for the backtest run. Includes coverage ratio, source breakdown, stale data count, and interpolation count. Useful for understanding data reliability and identifying potential accuracy issues.
`institutional_compliance`	`bool`	Whether the backtest run meets institutional standards. Set to False when any strict reproducibility, data quality, or compliance check fails. Use compliance_violations to see which checks failed.
`compliance_violations`	`list[str]`	List of compliance violations that caused institutional_compliance to be set to False. Each entry describes a specific compliance failure such as "CURRENT_ONLY data provider used", "Symbol mapping failed for 0x...", "Data coverage below minimum threshold (95% < 98%)".
`fallback_usage`	`dict[str, int]`	Dictionary tracking count of each fallback type used during the backtest. Keys include: "hardcoded_price", "default_gas_price", "default_usd_amount". Empty dict means no fallbacks were used, which is the desired state for institutional-grade backtests.
`preflight_report`	`PreflightReport \| None`	Preflight validation report from checks run before the backtest. Contains pass/fail status, individual check results, estimated data coverage, and recommendations for fixing any issues. None if preflight validation was disabled.
`preflight_passed`	`bool`	Whether preflight validation passed (True) or failed (False). Defaults to True if preflight validation was disabled. This is a convenience field for quick checks - for full details, inspect preflight_report.
`parameter_sources`	`ParameterSourceTracker \| None`	Tracks the source of all configuration parameters for audit purposes. Contains detailed records of where each configuration value came from (default, config_file, env_var, explicit) for config parameters, (asset_specific, protocol_default, global_default) for liquidation thresholds, and (historical, fixed, provider) for APY/funding rates. Critical for institutional compliance.
`accuracy_estimate`	`AccuracyEstimate \| None`	Estimated accuracy of this backtest based on strategy type and data quality tier. Provides expected accuracy range (e.g., "90-95%") and primary error source. Derived from ACCURACY_MATRIX based on documented accuracy limitations.
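The idea behind `config_hash` can be sketched with the standard library: serialize the result-affecting parameters canonically, then take a SHA-256 digest. The exact serialization and the set of excluded runtime keys used by the framework are not documented here, so `RUNTIME_KEYS` and the JSON canonicalization below are assumptions.

```python
import hashlib
import json
from typing import Any, Dict

# Assumed examples of runtime metadata that should not affect the hash.
RUNTIME_KEYS = {"run_started_at", "run_ended_at", "run_duration_seconds", "backtest_id"}


def config_hash(config: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON form of the non-runtime config entries."""
    stable = {k: v for k, v in config.items() if k not in RUNTIME_KEYS}
    # sort_keys makes the digest independent of dict insertion order;
    # default=str covers values such as Decimal or datetime.
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two runs whose configs differ only in key order or in the excluded runtime fields produce the same digest, which is the property the attribute description relies on.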

gas_prices_used `class-attribute` `instance-attribute` ¶

gas_prices_used: list[GasPriceRecord] = field(
    default_factory=list
)

Optional detailed gas price records for each trade during the backtest.

When track_gas_prices=True in config, this list contains a GasPriceRecord for each trade showing the gas price used, its source, and USD cost. Useful for detailed gas cost analysis but may increase result size.

gas_price_summary `class-attribute` `instance-attribute` ¶

gas_price_summary: GasPriceSummary | None = None

Summary statistics for gas prices used during the backtest.

Contains min, max, mean, std of gas prices in gwei plus source breakdown. Always populated when trades occurred, regardless of track_gas_prices setting.
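As an illustration of the summary statistics named above, the following sketch aggregates a list of (gas price in gwei, source) pairs. The dict layout and key names are hypothetical rather than the actual GasPriceSummary fields, and the population standard deviation is an assumed choice.

```python
from collections import Counter
from decimal import Decimal
from statistics import mean, pstdev
from typing import Dict, List, Optional, Tuple


def summarize_gas(records: List[Tuple[Decimal, str]]) -> Optional[Dict[str, object]]:
    """Return min/max/mean/std in gwei plus a per-source count, or None if no trades."""
    if not records:
        return None
    prices = [price for price, _source in records]
    return {
        "min_gwei": min(prices),
        "max_gwei": max(prices),
        "mean_gwei": mean(prices),
        "std_gwei": pstdev(prices),  # population std; a sample std is equally plausible
        "sources": dict(Counter(source for _price, source in records)),
    }
```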

parameter_sources `class-attribute` `instance-attribute` ¶

parameter_sources: ParameterSourceTracker | None = None

Tracks the source of all configuration parameters for audit purposes.

Contains detailed records of where each configuration value came from:

Config parameters: default, config_file, env_var, explicit
Liquidation thresholds: asset_specific, protocol_default, global_default
APY/funding rates: historical, fixed, provider

This information is critical for institutional compliance and audit trails. When institutional_mode=True, this is always populated. The tracker provides summary dicts (config_sources, liquidation_sources, apy_funding_sources) for quick inspection and a full list of ParameterSourceRecord objects for detailed analysis.

accuracy_estimate `class-attribute` `instance-attribute` ¶

accuracy_estimate: AccuracyEstimate | None = None

Estimated accuracy of this backtest based on strategy type and data quality.

Provides a quick reference showing expected accuracy range (e.g., "90-95%") based on the detected strategy type (LP, perp, lending, arbitrage, spot) and the data quality tier used (FULL, PRE_CACHE, CURRENT_ONLY).

The estimate is derived from the ACCURACY_MATRIX which is based on documented accuracy limitations and golden test tolerances. See docs/ACCURACY_LIMITATIONS.md for the full accuracy matrix and methodology.

Example usage

if result.accuracy_estimate:
    print(f"Expected accuracy: {result.accuracy_estimate.confidence_interval}")
    print(f"Primary error source: {result.accuracy_estimate.primary_error_source}")

data_coverage_metrics `class-attribute` `instance-attribute` ¶

data_coverage_metrics: DataCoverageMetrics | None = None

Data coverage metrics tracking confidence levels across all position types.

Provides detailed breakdown of data quality for LP, Perp, Lending, and Slippage calculations. Includes confidence level breakdowns (high/medium/low) and data sources used for each position type.

The data_coverage_pct property gives overall percentage of HIGH confidence data points across all categories.

Example usage

if result.data_coverage_metrics:
    print(f"Data coverage: {result.data_coverage_metrics.data_coverage_pct:.1f}%")
    print(f"LP HIGH: {result.data_coverage_metrics.lp_metrics.high_confidence_pct:.1f}%")

success `property` ¶

success: bool

Check if backtest completed successfully.

simulation_duration_days `property` ¶

simulation_duration_days: float

Get the simulated duration in days.

total_return_pct `property` ¶

total_return_pct: Decimal

Get total return as a percentage.

used_any_fallback `property` ¶

used_any_fallback: bool

Check if any fallbacks were used during the backtest.

Returns True if the fallback_usage dict has any non-zero counts. When this is True, the backtest may have reduced accuracy due to using fallback values instead of real market data.
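The rule stated above reduces to a one-line check. This standalone function is a sketch of that rule, not the property's actual source.

```python
from typing import Dict


def used_any_fallback(fallback_usage: Dict[str, int]) -> bool:
    """True if any fallback type has a non-zero count."""
    return any(count > 0 for count in fallback_usage.values())
```

An empty dict and a dict whose counts are all zero both yield False, so a key being present does not by itself signal that a fallback was used.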

add_error ¶

add_error(error_dict: dict[str, Any]) -> None

Add an error record and log it with timestamp and context.

This method is used to track errors that occurred during the backtest, along with their timestamps, classification, and handling.

Parameters:

Name	Type	Description	Default
`error_dict`	`dict[str, Any]`	Serialized error record from ErrorRecord.to_dict() or equivalent dict with keys: timestamp, error_type, error_message, classification, context, handled	required

summary ¶

summary() -> str

Generate a human-readable summary of backtest results.

Returns:

Type	Description
`str`	Multi-line string with formatted backtest results

to_dict ¶

to_dict() -> dict[str, Any]

Serialize to dictionary.

from_dict `classmethod` ¶

from_dict(data: dict[str, Any]) -> BacktestResult

Deserialize from dictionary.

Parameters:

Name	Type	Description	Default
`data`	`dict[str, Any]`	Dictionary with serialized BacktestResult data	required

Returns:

Type	Description
`BacktestResult`	BacktestResult instance

BacktestMetrics¶

almanak.framework.backtesting.BacktestMetrics `dataclass` ¶

BacktestMetrics(
    total_pnl_usd: Decimal = Decimal("0"),
    net_pnl_usd: Decimal = Decimal("0"),
    sharpe_ratio: Decimal = Decimal("0"),
    max_drawdown_pct: Decimal = Decimal("0"),
    win_rate: Decimal = Decimal("0"),
    total_trades: int = 0,
    profit_factor: Decimal = Decimal("0"),
    total_return_pct: Decimal = Decimal("0"),
    annualized_return_pct: Decimal = Decimal("0"),
    total_fees_usd: Decimal = Decimal("0"),
    total_slippage_usd: Decimal = Decimal("0"),
    total_gas_usd: Decimal = Decimal("0"),
    winning_trades: int = 0,
    losing_trades: int = 0,
    avg_trade_pnl_usd: Decimal = Decimal("0"),
    largest_win_usd: Decimal = Decimal("0"),
    largest_loss_usd: Decimal = Decimal("0"),
    avg_win_usd: Decimal = Decimal("0"),
    avg_loss_usd: Decimal = Decimal("0"),
    volatility: Decimal = Decimal("0"),
    sortino_ratio: Decimal = Decimal("0"),
    calmar_ratio: Decimal = Decimal("0"),
    total_fees_earned_usd: Decimal = Decimal("0"),
    fees_by_pool: dict[str, Decimal] = dict(),
    lp_fee_confidence_breakdown: dict[str, int] = dict(),
    total_funding_paid: Decimal = Decimal("0"),
    total_funding_received: Decimal = Decimal("0"),
    liquidations_count: int = 0,
    liquidation_losses_usd: Decimal = Decimal("0"),
    max_margin_utilization: Decimal = Decimal("0"),
    total_interest_earned: Decimal = Decimal("0"),
    total_interest_paid: Decimal = Decimal("0"),
    min_health_factor: Decimal = Decimal("999"),
    health_factor_warnings: int = 0,
    avg_gas_price_gwei: Decimal = Decimal("0"),
    max_gas_price_gwei: Decimal = Decimal("0"),
    total_gas_cost_usd: Decimal = Decimal("0"),
    total_mev_cost_usd: Decimal = Decimal("0"),
    total_leverage: Decimal = Decimal("0"),
    max_net_delta: dict[str, Decimal] = dict(),
    correlation_risk: Decimal | None = None,
    liquidation_cascade_risk: Decimal = Decimal("0"),
    information_ratio: Decimal | None = None,
    beta: Decimal | None = None,
    alpha: Decimal | None = None,
    benchmark_return: Decimal | None = None,
    pnl_by_protocol: dict[str, Decimal] = dict(),
    pnl_by_intent_type: dict[str, Decimal] = dict(),
    pnl_by_asset: dict[str, Decimal] = dict(),
    realized_pnl: Decimal = Decimal("0"),
    unrealized_pnl: Decimal = Decimal("0"),
)

Performance metrics calculated from backtest results.

All financial values are in USD. Ratios are decimal (0.1 = 10%).

Attributes:

Name	Type	Description
`total_pnl_usd`	`Decimal`	Total PnL before execution costs
`net_pnl_usd`	`Decimal`	Net PnL after all execution costs
`sharpe_ratio`	`Decimal`	Risk-adjusted return (annualized, assuming 0 risk-free rate)
`max_drawdown_pct`	`Decimal`	Maximum peak-to-trough decline as decimal (0.1 = 10%)
`win_rate`	`Decimal`	Fraction of profitable trades as decimal (0.6 = 60%)
`total_trades`	`int`	Total number of trades executed
`profit_factor`	`Decimal`	Ratio of gross profit to gross loss
`total_return_pct`	`Decimal`	Total return as decimal (0.15 = 15% return)
`annualized_return_pct`	`Decimal`	Annualized return as decimal
`total_fees_usd`	`Decimal`	Total protocol fees paid
`total_slippage_usd`	`Decimal`	Total slippage incurred
`total_gas_usd`	`Decimal`	Total gas costs
`winning_trades`	`int`	Number of profitable trades
`losing_trades`	`int`	Number of losing trades
`avg_trade_pnl_usd`	`Decimal`	Average PnL per trade
`largest_win_usd`	`Decimal`	Largest single winning trade
`largest_loss_usd`	`Decimal`	Largest single losing trade
`avg_win_usd`	`Decimal`	Average winning trade PnL
`avg_loss_usd`	`Decimal`	Average losing trade PnL
`volatility`	`Decimal`	Annualized volatility of returns as decimal
`sortino_ratio`	`Decimal`	Downside risk-adjusted return
`calmar_ratio`	`Decimal`	Return / max drawdown
`total_fees_earned_usd`	`Decimal`	Total fees earned from LP positions in USD
`fees_by_pool`	`dict[str, Decimal]`	Dict mapping pool identifier to fees earned in USD
`total_funding_paid`	`Decimal`	Total funding payments made from perp positions in USD
`total_funding_received`	`Decimal`	Total funding payments received by perp positions in USD
`liquidations_count`	`int`	Number of liquidation events that occurred
`liquidation_losses_usd`	`Decimal`	Total losses from liquidations in USD
`max_margin_utilization`	`Decimal`	Maximum margin utilization ratio observed during backtest (0-1)
`total_interest_earned`	`Decimal`	Total interest earned from lending supply positions in USD
`total_interest_paid`	`Decimal`	Total interest paid on borrow positions in USD
`min_health_factor`	`Decimal`	Minimum health factor observed for lending positions during backtest (lower = more risk)
`health_factor_warnings`	`int`	Number of times health factor dropped below warning threshold
`avg_gas_price_gwei`	`Decimal`	Average gas price in gwei across all trades (for cost analysis)
`max_gas_price_gwei`	`Decimal`	Maximum gas price in gwei observed during backtest (for peak cost analysis)
`total_gas_cost_usd`	`Decimal`	Total gas costs in USD (same as total_gas_usd, kept for API consistency)
`total_mev_cost_usd`	`Decimal`	Total estimated MEV (sandwich attack) costs in USD across all trades
`total_leverage`	`Decimal`	Total portfolio leverage ratio (sum of all position notionals / equity)
`max_net_delta`	`dict[str, Decimal]`	Maximum net delta exposure observed per asset (token symbol -> max delta)
`correlation_risk`	`Decimal \| None`	Portfolio correlation risk score (0-1, higher = more correlated positions)
`liquidation_cascade_risk`	`Decimal`	Risk of cascading liquidations across protocols (0-1, higher = more risk)
`information_ratio`	`Decimal \| None`	Information ratio measuring risk-adjusted excess return vs benchmark (None if not calculated)
`beta`	`Decimal \| None`	Portfolio beta measuring sensitivity to benchmark movements (None if not calculated)
`alpha`	`Decimal \| None`	Jensen's alpha measuring excess return beyond what beta would predict (None if not calculated)
`benchmark_return`	`Decimal \| None`	Total return of the benchmark over the backtest period as decimal (None if not calculated)
`pnl_by_protocol`	`dict[str, Decimal]`	PnL breakdown by protocol (e.g., {"uniswap_v3": Decimal("100"), "aave_v3": Decimal("-50")})
`pnl_by_intent_type`	`dict[str, Decimal]`	PnL breakdown by intent type (e.g., {"SWAP": Decimal("75"), "LP_OPEN": Decimal("25")})
`pnl_by_asset`	`dict[str, Decimal]`	PnL breakdown by asset (e.g., {"ETH": Decimal("80"), "USDC": Decimal("20")})
`realized_pnl`	`Decimal`	Total realized PnL from closed positions in USD
`unrealized_pnl`	`Decimal`	Total unrealized PnL from open positions in USD
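To make three of the definitions in the table concrete, here is a small sketch computing maximum drawdown from an equity series, and win rate and profit factor from per-trade PnL values. The function names are hypothetical and the framework's own calculations may handle edge cases differently, for instance the value reported when there are no losing trades.

```python
from decimal import Decimal
from typing import List


def max_drawdown(equity: List[Decimal]) -> Decimal:
    """Largest peak-to-trough decline as a decimal (0.1 = 10%)."""
    if not equity:
        return Decimal("0")
    peak = equity[0]
    worst = Decimal("0")
    for value in equity:
        peak = max(peak, value)
        if peak > 0:
            worst = max(worst, (peak - value) / peak)
    return worst


def win_rate(pnls: List[Decimal]) -> Decimal:
    """Share of trades with positive PnL, as a decimal."""
    if not pnls:
        return Decimal("0")
    wins = sum(1 for pnl in pnls if pnl > 0)
    return Decimal(wins) / Decimal(len(pnls))


def profit_factor(pnls: List[Decimal]) -> Decimal:
    """Gross profit divided by gross loss; 0 here when there are no losses (an assumption)."""
    gross_profit = sum((pnl for pnl in pnls if pnl > 0), Decimal("0"))
    gross_loss = -sum((pnl for pnl in pnls if pnl < 0), Decimal("0"))
    if gross_loss == 0:
        return Decimal("0")
    return gross_profit / gross_loss
```

For an equity series of 100, 120, 90, 110 the peak is 120 and the trough after it is 90, giving a drawdown of 0.25.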

lp_fee_confidence_breakdown `class-attribute` `instance-attribute` ¶

lp_fee_confidence_breakdown: dict[str, int] = field(
    default_factory=dict
)

Count of LP positions by fee confidence level.

Example: {"high": 2, "medium": 1, "low": 0}

high: Fees calculated using actual historical volume data from subgraph
medium: Fees calculated using interpolated or estimated data
low: Fees calculated using multiplier heuristic

total_execution_cost_usd `property` ¶

total_execution_cost_usd: Decimal

Get total execution costs (fees + slippage + gas).
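The sum stated above, and its relation to the two PnL fields, can be sketched as follows. The first function follows the property description directly; the second assumes that net PnL is total PnL minus exactly these three costs, which the reference implies but does not spell out (MEV costs, for example, may or may not be included).

```python
from decimal import Decimal


def total_execution_cost(fees: Decimal, slippage: Decimal, gas: Decimal) -> Decimal:
    """fees + slippage + gas, all in USD."""
    return fees + slippage + gas


def net_pnl(total_pnl: Decimal, fees: Decimal, slippage: Decimal, gas: Decimal) -> Decimal:
    """Assumed relation: PnL before costs minus total execution cost."""
    return total_pnl - total_execution_cost(fees, slippage, gas)
```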

to_dict ¶

to_dict() -> dict[str, Any]

Serialize to dictionary.

PaperTradingSummary¶

almanak.framework.backtesting.PaperTradingSummary `dataclass` ¶

PaperTradingSummary(
    strategy_id: str,
    start_time: datetime,
    duration: timedelta,
    total_trades: int,
    successful_trades: int,
    failed_trades: int,
    chain: str = "arbitrum",
    initial_balances: dict[str, Decimal] = dict(),
    final_balances: dict[str, Decimal] = dict(),
    total_gas_used: int = 0,
    total_gas_cost_usd: Decimal = Decimal("0"),
    pnl_usd: Decimal | None = None,
    error_summary: dict[str, int] = dict(),
    trades: list[PaperTrade] = list(),
    errors: list[PaperTradeError] = list(),
)

Summary of a paper trading session.

This dataclass provides an overview of the paper trading session, including trade counts, timing, and basic performance metrics.

Attributes:

Name	Type	Description
`strategy_id`	`str`	Identifier of the strategy being tested
`start_time`	`datetime`	When the session started
`duration`	`timedelta`	How long the session ran
`total_trades`	`int`	Total number of trades attempted
`successful_trades`	`int`	Number of successful trades
`failed_trades`	`int`	Number of failed trades
`end_time`	`datetime`	When the session ended (computed)
`chain`	`str`	Target blockchain
`initial_balances`	`dict[str, Decimal]`	Starting token balances
`final_balances`	`dict[str, Decimal]`	Ending token balances
`total_gas_used`	`int`	Total gas consumed
`total_gas_cost_usd`	`Decimal`	Total gas cost in USD
`pnl_usd`	`Decimal \| None`	Estimated PnL in USD (if available)
`error_summary`	`dict[str, int]`	Count of errors by type
`trades`	`list[PaperTrade]`	List of successful trades
`errors`	`list[PaperTradeError]`	List of trade errors

end_time `property` ¶

end_time: datetime

Get the session end time.

success_rate `property` ¶

success_rate: Decimal

Calculate the success rate as a decimal (0.0 to 1.0).

duration_seconds `property` ¶

duration_seconds: float

Get duration in seconds.

duration_minutes `property` ¶

duration_minutes: float

Get duration in minutes.

duration_hours `property` ¶

duration_hours: float

Get duration in hours.

trades_per_hour `property` ¶

trades_per_hour: Decimal

Calculate average trades per hour.

avg_gas_per_trade `property` ¶

avg_gas_per_trade: int

Calculate average gas used per successful trade.
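The derived properties above can be illustrated with a cut-down stand-in dataclass. `ToySummary` is hypothetical and carries only the fields needed for the arithmetic; the zero-division handling and the integer division for average gas are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from decimal import Decimal


@dataclass
class ToySummary:
    """Hypothetical stand-in showing how the derived properties could be computed."""

    start_time: datetime
    duration: timedelta
    total_trades: int
    successful_trades: int
    total_gas_used: int = 0

    @property
    def end_time(self) -> datetime:
        return self.start_time + self.duration

    @property
    def success_rate(self) -> Decimal:
        if self.total_trades == 0:
            return Decimal("0")
        return Decimal(self.successful_trades) / Decimal(self.total_trades)

    @property
    def duration_hours(self) -> float:
        return self.duration.total_seconds() / 3600

    @property
    def trades_per_hour(self) -> Decimal:
        hours = Decimal(str(self.duration_hours))
        return Decimal("0") if hours == 0 else Decimal(self.total_trades) / hours

    @property
    def avg_gas_per_trade(self) -> int:
        if self.successful_trades == 0:
            return 0
        return self.total_gas_used // self.successful_trades
```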

summary ¶

summary() -> str

Generate a human-readable summary.

Returns:

Type	Description
`str`	Multi-line string with formatted session summary

to_dict ¶

to_dict() -> dict[str, Any]

Serialize to dictionary.

from_dict `classmethod` ¶

from_dict(data: dict[str, Any]) -> PaperTradingSummary

Deserialize from dictionary.

Parameters:

Name	Type	Description	Default
`data`	`dict[str, Any]`	Dictionary with serialized PaperTradingSummary data	required

Returns:

Type	Description
`PaperTradingSummary`	PaperTradingSummary instance

Data Providers¶

HistoricalDataProvider¶

almanak.framework.backtesting.HistoricalDataProvider ¶

Bases: Protocol

Protocol defining the interface for historical data providers.

Historical data providers are responsible for fetching price and market data for past time periods. They are used by the PnL backtesting engine to simulate strategy execution.

Implementations should handle:

Fetching historical prices for specified tokens
Providing OHLCV data when available
Rate limiting and caching as needed
Graceful handling of missing data

Example implementation

from datetime import datetime
from decimal import Decimal
from typing import AsyncIterator

# OHLCV, MarketState, and HistoricalDataConfig are framework types.


class MyDataProvider:
    async def get_price(
        self, token: str, timestamp: datetime
    ) -> Decimal:
        # Fetch price from data source
        ...

    async def get_ohlcv(
        self, token: str, start: datetime, end: datetime, interval_seconds: int = 3600
    ) -> list[OHLCV]:
        # Fetch OHLCV data
        ...

    async def iterate(
        self, config: HistoricalDataConfig
    ) -> AsyncIterator[tuple[datetime, MarketState]]:
        # Yield market states for each time point
        ...

provider_name `property` ¶

provider_name: str

Return the unique name of this data provider.

supported_tokens `property` ¶

supported_tokens: list[str]

Return list of supported token symbols.

supported_chains `property` ¶

supported_chains: list[str]

Return list of supported chain identifiers.

min_timestamp `property` ¶

min_timestamp: datetime | None

Return the earliest timestamp with available data, or None if unknown.

max_timestamp `property` ¶

max_timestamp: datetime | None

Return the latest timestamp with available data, or None if unknown.

get_price `async` ¶

get_price(token: str, timestamp: datetime) -> Decimal

Get the price of a token at a specific timestamp.

Parameters:

Name	Type	Description	Default
`token`	`str`	Token symbol (e.g., "WETH", "USDC", "ARB")	required
`timestamp`	`datetime`	The historical point in time	required

Returns:

Type	Description
`Decimal`	Price in USD at the specified timestamp

Raises:

Type	Description
`ValueError`	If price data is not available for the token/timestamp
`DataSourceUnavailable`	If the data source is unavailable

get_ohlcv `async` ¶

get_ohlcv(
    token: str,
    start: datetime,
    end: datetime,
    interval_seconds: int = 3600,
) -> list[OHLCV]

Get OHLCV data for a token over a time range.

Parameters:

Name	Type	Description	Default
`token`	`str`	Token symbol (e.g., "WETH", "USDC", "ARB")	required
`start`	`datetime`	Start of the time range (inclusive)	required
`end`	`datetime`	End of the time range (inclusive)	required
`interval_seconds`	`int`	Candle interval in seconds (default: 3600 = 1 hour)	`3600`

Returns:

Type	Description
`list[OHLCV]`	List of OHLCV data points, sorted by timestamp ascending

Raises:

Type	Description
`ValueError`	If data is not available for the token/range
`DataSourceUnavailable`	If the data source is unavailable

iterate `async` ¶

iterate(
    config: HistoricalDataConfig,
) -> AsyncIterator[tuple[datetime, MarketState]]

Iterate through historical market states.

This is the primary method used by the backtesting engine. It yields market state snapshots at regular intervals throughout the configured time range.

Parameters:

Name	Type	Description	Default
`config`	`HistoricalDataConfig`	Configuration specifying time range, interval, and tokens	required

Yields:

Type	Description
`AsyncIterator[tuple[datetime, MarketState]]`	Tuples of (timestamp, MarketState) for each time point

Raises:

Type	Description
`DataSourceUnavailable`	If the data source is unavailable

Example

async for timestamp, market_state in provider.iterate(config):
    eth_price = market_state.get_price("WETH")
    # Process market state
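To illustrate the regular-interval stepping that `iterate` describes, here is a small sketch of an async generator that yields evenly spaced timestamps. `iter_timestamps` and `collect` are hypothetical helpers, not almanak functions, and treating the end of the range as inclusive is an assumption based on the `HistoricalDataConfig` attribute descriptions.

```python
import asyncio
from collections.abc import AsyncIterator
from datetime import datetime, timedelta, timezone


async def iter_timestamps(
    start: datetime, end: datetime, interval_seconds: int
) -> AsyncIterator[datetime]:
    """Hypothetical helper: yield timestamps from start to end (inclusive)."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    step = timedelta(seconds=interval_seconds)
    current = start
    while current <= end:
        yield current
        current += step


async def collect(
    start: datetime, end: datetime, interval_seconds: int
) -> list[datetime]:
    """Drain the async generator into a list."""
    return [ts async for ts in iter_timestamps(start, end, interval_seconds)]


start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)
for ts in asyncio.run(collect(start, end, 3600)):
    print(ts.isoformat())
```

A real provider would attach a `MarketState` to each timestamp; this sketch only shows the time stepping.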

HistoricalDataConfig¶

almanak.framework.backtesting.HistoricalDataConfig `dataclass` ¶

HistoricalDataConfig(
    start_time: datetime,
    end_time: datetime,
    interval_seconds: int = 3600,
    tokens: list[str] = (lambda: ["WETH", "USDC"])(),
    chains: list[str] = (lambda: ["arbitrum"])(),
    include_ohlcv: bool = True,
    include_gas_prices: bool = False,
)

Configuration for historical data retrieval.

Specifies the time range, interval, and tokens to fetch for a backtest simulation.

Attributes:

Name	Type	Description
`start_time`	`datetime`	Start of the historical period (inclusive)
`end_time`	`datetime`	End of the historical period (inclusive)
`interval_seconds`	`int`	Time between data points in seconds (default: 3600 = 1 hour)
`tokens`	`list[str]`	List of token symbols to fetch prices for (default: ["WETH", "USDC"])
`chains`	`list[str]`	List of chain identifiers to fetch data for (default: ["arbitrum"])
`include_ohlcv`	`bool`	Whether to fetch OHLCV data (default: True)
`include_gas_prices`	`bool`	Whether to fetch historical gas prices (default: False)

duration_seconds `property` ¶

duration_seconds: int

Get the total duration in seconds.

duration_days `property` ¶

duration_days: float

Get the total duration in days.

estimated_data_points `property` ¶

estimated_data_points: int

Get the estimated number of data points.

__post_init__ ¶

__post_init__() -> None

Validate configuration after initialization.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize to dictionary.
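The duration properties of `HistoricalDataConfig` can be reasoned about with plain `datetime` arithmetic. The sketch below works through a one-week range at the default hourly interval. The formulas are assumptions about how such properties are typically derived, not the library's confirmed implementation; in particular, the actual `estimated_data_points` may also count the inclusive end point.

```python
from datetime import datetime, timezone

start_time = datetime(2024, 1, 1, tzinfo=timezone.utc)
end_time = datetime(2024, 1, 8, tzinfo=timezone.utc)
interval_seconds = 3600  # default: one data point per hour

# Seven days of 86,400 seconds each.
duration_seconds = int((end_time - start_time).total_seconds())

# Fractional days, matching the float type of duration_days.
duration_days = duration_seconds / 86_400

# Assumption: one point per full interval (the inclusive end is not counted).
estimated_data_points = duration_seconds // interval_seconds

print(duration_seconds, duration_days, estimated_data_points)
```

Checking the estimate before a run is a cheap way to spot a misconfigured interval, for example a 60-second interval over a multi-year range.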

Crisis Scenarios¶

CrisisScenario¶

almanak.framework.backtesting.CrisisScenario `dataclass` ¶

CrisisScenario(
    name: str,
    start_date: datetime,
    end_date: datetime,
    description: str,
)

A historical crisis scenario for backtesting.

This dataclass represents a period of significant market stress that can be used for stress-testing trading strategies.

Attributes:

Name	Type	Description
`name`	`str`	Unique identifier for the scenario (lowercase, underscores)
`start_date`	`datetime`	Beginning of the crisis period
`end_date`	`datetime`	End of the crisis period
`description`	`str`	Human-readable description of the crisis event

Properties

duration_days: Number of days in the crisis period

Example

>>> scenario = CrisisScenario(
...     name="custom_crisis",
...     start_date=datetime(2023, 3, 10),
...     end_date=datetime(2023, 3, 15),
...     description="SVB collapse",
... )
>>> scenario.duration_days
5

duration_days `property` ¶

duration_days: int

Calculate the duration of the crisis in days.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize to dictionary.

Returns:

Type	Description
`dict[str, Any]`	Dictionary with scenario data suitable for JSON serialization.

from_dict `classmethod` ¶

from_dict(data: dict[str, Any]) -> CrisisScenario

Deserialize from dictionary.

Parameters:

Name	Type	Description	Default
`data`	`dict[str, Any]`	Dictionary with serialized CrisisScenario data	required

Returns:

Type	Description
`CrisisScenario`	CrisisScenario instance
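The `to_dict` / `from_dict` pair is meant to round-trip a scenario through JSON. The sketch below shows one way such a round trip could work, using a stand-in dataclass. `ScenarioSketch` is hypothetical, and the key names and ISO 8601 date encoding are assumptions; the real `CrisisScenario` may serialize differently.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass
class ScenarioSketch:
    """Hypothetical stand-in for CrisisScenario (same four fields)."""

    name: str
    start_date: datetime
    end_date: datetime
    description: str

    @property
    def duration_days(self) -> int:
        # Whole days between start and end.
        return (self.end_date - self.start_date).days

    def to_dict(self) -> dict[str, Any]:
        # Assumption: datetimes are encoded as ISO 8601 strings for JSON.
        return {
            "name": self.name,
            "start_date": self.start_date.isoformat(),
            "end_date": self.end_date.isoformat(),
            "description": self.description,
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ScenarioSketch":
        return cls(
            name=data["name"],
            start_date=datetime.fromisoformat(data["start_date"]),
            end_date=datetime.fromisoformat(data["end_date"]),
            description=data["description"],
        )


original = ScenarioSketch(
    name="custom_crisis",
    start_date=datetime(2023, 3, 10),
    end_date=datetime(2023, 3, 15),
    description="SVB collapse",
)
restored = ScenarioSketch.from_dict(json.loads(json.dumps(original.to_dict())))
print(restored == original, restored.duration_days)
```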

__str__ ¶

__str__() -> str

Human-readable string representation.

Parallel Execution¶

almanak.framework.backtesting.run_parallel_backtests `async` ¶

run_parallel_backtests(
    configs: list[PnLBacktestConfig],
    strategy_factory: Callable[[], Any],
    data_provider_factory: Callable[[], Any],
    backtester_factory: Callable[
        [Any, dict[str, Any], dict[str, Any]], Any
    ],
    fee_models: dict[str, Any] | None = None,
    slippage_models: dict[str, Any] | None = None,
    workers: int | None = None,
) -> list[ParallelBacktestResult]

Run multiple backtests in parallel using a process pool.

This function distributes backtest execution across multiple processes for improved performance on multi-core systems. Each backtest runs in its own process with its own instances of strategy, data provider, and backtester created via factory functions.

Parameters:

Name	Type	Description	Default
`configs`	`list[PnLBacktestConfig]`	List of PnLBacktestConfig objects to run	required
`strategy_factory`	`Callable[[], Any]`	Factory function that returns a new strategy instance. Must be picklable (e.g., a module-level function).	required
`data_provider_factory`	`Callable[[], Any]`	Factory function that returns a new data provider. Must be picklable (e.g., a module-level function).	required
`backtester_factory`	`Callable[[Any, dict[str, Any], dict[str, Any]], Any]`	Factory function that returns a new PnLBacktester. Takes (data_provider, fee_models, slippage_models) as arguments.	required
`fee_models`	`dict[str, Any] \| None`	Optional dict of fee models to pass to backtester factory.	`None`
`slippage_models`	`dict[str, Any] \| None`	Optional dict of slippage models to pass to backtester factory.	`None`
`workers`	`int \| None`	Number of worker processes. Defaults to CPU count - 1.	`None`

Returns:

Type	Description
`list[ParallelBacktestResult]`	List of ParallelBacktestResult in the same order as input configs. Each result indicates success or failure with associated data.

Raises:

Type	Description
`ValueError`	If configs list is empty

Example

def create_strategy():
    return MyStrategy(param1=10, param2=0.5)

def create_data_provider():
    return CoinGeckoDataProvider()

def create_backtester(provider, fee_models, slippage_models):
    return PnLBacktester(provider, fee_models, slippage_models)

results = await run_parallel_backtests(
    configs=[config1, config2, config3],
    strategy_factory=create_strategy,
    data_provider_factory=create_data_provider,
    backtester_factory=create_backtester,
    workers=4,
)

Note

Factory functions must be picklable (module-level functions, not lambdas)
Each worker process creates its own instances to avoid sharing state
Results are returned in the same order as input configs
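The picklability requirement in the note above can be checked before launching a run. The sketch below uses only the standard library: a module-level function is pickled by reference to its importable name, while a lambda has no such name and fails. `is_picklable` and `default_worker_count` are hypothetical helpers, and the worker formula is an assumed reading of "CPU count - 1" that never drops below one worker.

```python
import os
import pickle
from typing import Any


def is_picklable(obj: Any) -> bool:
    """Return True if obj can be serialized for transfer to a worker process."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


def default_worker_count() -> int:
    """Assumed reading of 'CPU count - 1', clamped to at least one worker."""
    return max(1, (os.cpu_count() or 1) - 1)


def create_strategy() -> dict[str, float]:
    # Module-level factory: pickle stores only its module and function name.
    return {"param1": 10, "param2": 0.5}


# A lambda has no importable name, so pickle cannot reference it.
inline_factory = lambda: {"param1": 10, "param2": 0.5}  # noqa: E731

print("module-level factory picklable:", is_picklable(create_strategy))
print("lambda factory picklable:", is_picklable(inline_factory))
print("default workers:", default_worker_count())
```

Running a check like this on each factory up front surfaces the problem as a clear boolean instead of an error raised from inside the process pool.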
