
Backtesting

Dual-engine backtesting system: PnL simulation with historical prices and paper trading on Anvil forks.

PnL Backtester

PnLBacktester

almanak.framework.backtesting.PnLBacktester dataclass

PnLBacktester(
    data_provider: HistoricalDataProvider,
    fee_models: dict[str, FeeModel],
    slippage_models: dict[str, SlippageModel],
    strategy_type: str | None = "auto",
    gas_provider: GasPriceProvider | None = None,
    data_config: BacktestDataConfig | None = None,
    _mev_simulator: MEVSimulator | None = None,
    _current_backtest_id: str = "",
    _adapter: StrategyBacktestAdapter | None = None,
    _detected_strategy_type: StrategyTypeHint | None = None,
    _error_handler: BacktestErrorHandler | None = None,
    _fallback_usage: dict[str, int] | None = None,
    _gas_price_records: list[GasPriceRecord] | None = None,
)

Main PnL backtesting engine for historical strategy simulation.

The PnLBacktester simulates strategy execution against historical price data to evaluate performance. It:

  1. Iterates through historical market data at configured intervals
  2. Calls strategy.decide() with a MarketSnapshot for each time step
  3. Simulates intent execution with configurable fee/slippage models
  4. Tracks portfolio state and builds an equity curve
  5. Calculates comprehensive performance metrics

Attributes:

Name Type Description
data_provider HistoricalDataProvider

Historical data provider (e.g., CoinGeckoDataProvider)

fee_models dict[str, FeeModel]

Dict mapping protocol -> FeeModel (or "default" for all)

slippage_models dict[str, SlippageModel]

Dict mapping protocol -> SlippageModel

gas_provider GasPriceProvider | None

Optional gas price provider for historical gas prices. When provided and config.use_historical_gas_gwei=True, the engine will fetch historical gas prices at each simulation timestamp.

mev_simulator MEVSimulator | None

Optional MEV simulator (created dynamically based on config)

strategy_type str | None

Optional explicit strategy type for adapter selection. If "auto" (default), the type is detected from strategy metadata. Valid values: "lp", "perp", "lending", "arbitrage", "swap", "auto", or None.

data_config BacktestDataConfig | None

Optional BacktestDataConfig for controlling historical data providers in adapters. When provided, adapters will use historical volume, funding rates, and APY data from real sources instead of fallback values.

Example

backtester = PnLBacktester(
    data_provider=CoinGeckoDataProvider(),
    fee_models={"default": DefaultFeeModel()},
    slippage_models={"default": DefaultSlippageModel()},
)

result = await backtester.backtest(my_strategy, config)
print(result.summary())

With explicit strategy type:

backtester = PnLBacktester(
    data_provider=CoinGeckoDataProvider(),
    fee_models={"default": DefaultFeeModel()},
    slippage_models={"default": DefaultSlippageModel()},
    strategy_type="lp",  # Force LP adapter
)

With BacktestDataConfig for historical data:

from almanak.framework.backtesting.config import BacktestDataConfig

data_config = BacktestDataConfig(
    use_historical_volume=True,
    use_historical_funding=True,
    use_historical_apy=True,
)
backtester = PnLBacktester(
    data_provider=CoinGeckoDataProvider(),
    fee_models={"default": DefaultFeeModel()},
    slippage_models={"default": DefaultSlippageModel()},
    data_config=data_config,
)

gas_provider class-attribute instance-attribute

gas_provider: GasPriceProvider | None = None

Optional gas price provider for historical gas prices.

When provided and config.use_historical_gas_gwei=True, the engine will fetch historical gas prices at each simulation timestamp instead of using the static config.gas_price_gwei value.

Example

from almanak.framework.backtesting.pnl.providers import EtherscanGasPriceProvider

gas_provider = EtherscanGasPriceProvider(
    api_keys={"ethereum": "your-key"},
)
backtester = PnLBacktester(
    data_provider=data_provider,
    fee_models=fee_models,
    slippage_models=slippage_models,
    gas_provider=gas_provider,
)

data_config class-attribute instance-attribute

data_config: BacktestDataConfig | None = None

Optional BacktestDataConfig for controlling historical data providers.

When provided, this configuration is passed to strategy-specific adapters (LP, Perp, Lending) to control historical data provider behavior:

  • use_historical_volume: Fetch LP fee data from subgraphs
  • use_historical_funding: Fetch perp funding rates from APIs
  • use_historical_apy: Fetch lending APY from subgraphs
  • strict_historical_mode: Fail if historical data unavailable
  • Fallback values for when historical data is unavailable
  • Rate limiting configuration for CoinGecko and The Graph
  • Cache settings for persistent data storage

Example

from almanak.framework.backtesting.config import BacktestDataConfig

data_config = BacktestDataConfig(
    use_historical_volume=True,
    use_historical_funding=True,
    use_historical_apy=True,
    strict_historical_mode=False,
)
backtester = PnLBacktester(
    data_provider=data_provider,
    fee_models=fee_models,
    slippage_models=slippage_models,
    data_config=data_config,
)

__post_init__

__post_init__() -> None

Validate configuration after initialization.

run_preflight_validation async

run_preflight_validation(
    config: PnLBacktestConfig,
) -> PreflightReport

Run preflight validation checks before starting a backtest.

Performs validation checks to ensure data requirements can be met:

  • Checks price data availability for all tokens in config
  • Verifies data provider capabilities match requirements
  • Tests archive node accessibility if historical TWAP/Chainlink needed
  • Estimates data coverage based on provider capabilities

Parameters:

Name Type Description Default
config PnLBacktestConfig

Backtest configuration specifying tokens, time range, etc.

required

Returns:

Type Description
PreflightReport

PreflightReport with pass/fail status and detailed check results.

Example

preflight = await backtester.run_preflight_validation(config)
if not preflight.passed:
    print(preflight.summary())
    # Handle validation failure
else:
    result = await backtester.backtest(strategy, config)

backtest async

backtest(
    strategy: BacktestableStrategy,
    config: PnLBacktestConfig,
) -> BacktestResult

Run a backtest for a strategy over the configured period.

This method:

  1. Initializes a simulated portfolio with initial capital
  2. Creates a HistoricalDataConfig from the backtest config
  3. Iterates through historical market states
  4. For each time step:
       a. Creates a MarketSnapshot from MarketState
       b. Calls strategy.decide(snapshot) to get intent
       c. Queues intent for execution (with inclusion delay)
       d. Executes queued intents
       e. Marks portfolio to market
  5. Calculates final metrics and returns BacktestResult
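
The per-step queueing can be pictured with a toy model. This is a simplified, synchronous sketch and not the engine's implementation: `run_loop_sketch`, the `decide` callable, and the string intents are hypothetical stand-ins, and one block of inclusion delay is assumed to equal one tick.

```python
from collections import deque
from typing import Callable, Optional


def run_loop_sketch(
    prices: list[float],
    decide: Callable[[float], Optional[str]],
    inclusion_delay: int = 1,
) -> list[tuple[int, str]]:
    """Return (execution_tick, intent) pairs for a toy price series."""
    # Each queue entry is (tick at which the intent becomes executable, intent).
    queue: deque[tuple[int, str]] = deque()
    executed: list[tuple[int, str]] = []

    for tick, price in enumerate(prices):
        # Steps a/b: build a snapshot (here just the price) and ask for an intent.
        intent = decide(price)
        # Step c: queue the intent so it executes `inclusion_delay` ticks later.
        if intent is not None:
            queue.append((tick + inclusion_delay, intent))
        # Step d: execute every queued intent whose delay has elapsed.
        while queue and queue[0][0] <= tick:
            _, ready = queue.popleft()
            executed.append((tick, ready))
        # Step e: a real engine would mark the portfolio to market here.

    return executed


# Hypothetical strategy: buy whenever the price drops below 100.
fills = run_loop_sketch(
    [101.0, 99.0, 102.0, 98.0],
    lambda p: "BUY" if p < 100 else None,
)
```

In this toy run the intent decided at tick 1 fills at tick 2, and the intent decided on the last tick never fills because the series ends before its delay elapses.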

Parameters:

Name Type Description Default
strategy BacktestableStrategy

Strategy to backtest (must implement BacktestableStrategy)

required
config PnLBacktestConfig

Backtest configuration (time range, capital, models, etc.)

required

Returns:

Type Description
BacktestResult

BacktestResult with metrics, trades, and equity curve

Raises:

Type Description
ValueError

If strategy is not compatible with backtesting

get_fee_model

get_fee_model(protocol: str) -> FeeModel

Get the fee model for a protocol.

Parameters:

Name Type Description Default
protocol str

Protocol name (e.g., "uniswap_v3", "aave_v3")

required

Returns:

Type Description
FeeModel

FeeModel for the protocol, or default if not found

get_slippage_model

get_slippage_model(protocol: str) -> SlippageModel

Get the slippage model for a protocol.

Parameters:

Name Type Description Default
protocol str

Protocol name (e.g., "uniswap_v3", "gmx")

required

Returns:

Type Description
SlippageModel

SlippageModel for the protocol, or default if not found
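
Both lookup methods fall back to a default model when the protocol has no dedicated entry. A minimal sketch of that pattern, assuming the fallback is stored under the "default" key described in the attributes above; `FlatFee` and `lookup_model` are hypothetical names, not part of the framework:

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class FlatFee:
    """Hypothetical fee model: a fixed fraction of the trade notional."""

    rate: Decimal

    def fee(self, notional_usd: Decimal) -> Decimal:
        return notional_usd * self.rate


def lookup_model(models: dict[str, FlatFee], protocol: str) -> FlatFee:
    """Return the protocol's model, or the "default" entry if none is registered."""
    if protocol in models:
        return models[protocol]
    # Assumption: a missing "default" entry surfaces as a KeyError.
    return models["default"]


fee_models = {
    "uniswap_v3": FlatFee(Decimal("0.003")),
    "default": FlatFee(Decimal("0.001")),
}

uniswap_fee = lookup_model(fee_models, "uniswap_v3").fee(Decimal("1000"))
gmx_fee = lookup_model(fee_models, "gmx").fee(Decimal("1000"))  # falls back to default
```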

PnLBacktestConfig

almanak.framework.backtesting.PnLBacktestConfig dataclass

PnLBacktestConfig(
    start_time: datetime,
    end_time: datetime,
    interval_seconds: int = 3600,
    initial_capital_usd: Decimal = Decimal("10000"),
    fee_model: str = "realistic",
    slippage_model: str = "realistic",
    include_gas_costs: bool = True,
    gas_price_gwei: Decimal = Decimal("30"),
    inclusion_delay_blocks: int = 1,
    chain: str = "arbitrum",
    tokens: list[str] = (lambda: ["WETH", "USDC"])(),
    benchmark_token: str = "WETH",
    risk_free_rate: Decimal = Decimal("0.05"),
    trading_days_per_year: int = 365,
    initial_margin_ratio: Decimal = Decimal("0.1"),
    maintenance_margin_ratio: Decimal = Decimal("0.05"),
    mev_simulation_enabled: bool = False,
    auto_correct_positions: bool = False,
    reconciliation_alert_threshold_pct: Decimal = Decimal(
        "0.05"
    ),
    random_seed: int | None = None,
    strict_reproducibility: bool = False,
    staleness_threshold_seconds: int = 3600,
    institutional_mode: bool = False,
    min_data_coverage: Decimal = Decimal("0.98"),
    allow_hardcoded_fallback: bool = False,
    allow_degraded_data: bool = True,
    require_symbol_mapping: bool = False,
    use_historical_gas_prices: bool = False,
    gas_eth_price_override: Decimal | None = None,
    use_historical_gas_gwei: bool = False,
    track_gas_prices: bool = False,
    preflight_validation: bool = True,
    fail_on_preflight_error: bool = True,
)

Configuration for a PnL backtest simulation.

Controls all parameters of the backtest including time range, initial capital, fee/slippage models, gas costs, and execution delay simulation.

Attributes:

Name Type Description
start_time datetime

Start of the backtest period (inclusive)

end_time datetime

End of the backtest period (inclusive)

interval_seconds int

Time between simulation ticks in seconds (default: 3600 = 1 hour)

initial_capital_usd Decimal

Starting capital in USD

fee_model str

Fee model to use - 'realistic', 'zero', or protocol-specific (e.g., 'uniswap_v3', 'aave_v3', 'gmx')

slippage_model str

Slippage model to use - 'realistic', 'zero', or protocol-specific (e.g., 'liquidity_aware', 'constant')

include_gas_costs bool

Whether to include gas costs in PnL calculations

gas_price_gwei Decimal

Gas price to use for cost calculations (default: 30 gwei)

inclusion_delay_blocks int

Number of blocks to delay intent execution to simulate realistic trade timing (default: 1). When > 0, intents are queued and executed in the next iteration(s) rather than immediately.

chain str

Blockchain to simulate execution on (default: 'arbitrum')

tokens list[str]

List of tokens to track prices for (default: ['WETH', 'USDC'])

benchmark_token str

Token to use for benchmark comparisons (default: 'WETH')

risk_free_rate Decimal

Annual risk-free rate for Sharpe ratio calculation (default: 0.05)

trading_days_per_year int

Number of trading days for annualization (default: 365)

initial_margin_ratio Decimal

Initial margin ratio for opening perp positions (default: 0.1 = 10%)

maintenance_margin_ratio Decimal

Maintenance margin ratio for liquidation (default: 0.05 = 5%)

mev_simulation_enabled bool

Enable MEV cost simulation for realistic execution costs (default: False)

auto_correct_positions bool

Enable auto-correction of tracked positions when discrepancies are detected

reconciliation_alert_threshold_pct Decimal

Threshold percentage for triggering reconciliation alerts (default: 5%)

Example

from datetime import datetime
from decimal import Decimal

config = PnLBacktestConfig(
    start_time=datetime(2024, 1, 1),
    end_time=datetime(2024, 6, 1),
    initial_capital_usd=Decimal("10000"),
)
print(f"Duration: {config.duration_days:.1f} days")
print(f"Estimated ticks: {config.estimated_ticks}")

initial_margin_ratio class-attribute instance-attribute

initial_margin_ratio: Decimal = Decimal('0.1')

Initial margin ratio required to open a position (default: 0.1 = 10%). This is the minimum margin/position_value ratio required to open a new perp position.

maintenance_margin_ratio class-attribute instance-attribute

maintenance_margin_ratio: Decimal = Decimal('0.05')

Maintenance margin ratio for liquidation threshold (default: 0.05 = 5%). When margin/position_value falls below this, the position may be liquidated.
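
The two margin ratios reduce to simple comparisons on margin/position_value. The sketch below uses the documented defaults; the function names are hypothetical, and how the engine derives the current margin (for example from unrealized PnL) is not modeled here.

```python
from decimal import Decimal

INITIAL_MARGIN_RATIO = Decimal("0.1")       # documented default (10%)
MAINTENANCE_MARGIN_RATIO = Decimal("0.05")  # documented default (5%)


def can_open(margin_usd: Decimal, position_value_usd: Decimal) -> bool:
    """A position may open only if margin/position_value meets the initial ratio."""
    return margin_usd / position_value_usd >= INITIAL_MARGIN_RATIO


def is_liquidatable(margin_usd: Decimal, position_value_usd: Decimal) -> bool:
    """A position may be liquidated once margin/position_value drops below maintenance."""
    return margin_usd / position_value_usd < MAINTENANCE_MARGIN_RATIO


# $1,000 of margin supports a $10,000 position (ratio 0.10, i.e. 10x leverage).
opens = can_open(Decimal("1000"), Decimal("10000"))
# If losses cut the margin to $400, the ratio is 0.04, below the 5% threshold.
liquidated = is_liquidatable(Decimal("400"), Decimal("10000"))
```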

mev_simulation_enabled class-attribute instance-attribute

mev_simulation_enabled: bool = False

Enable MEV (Maximal Extractable Value) cost simulation (default: False). When enabled, simulates sandwich attack probability and additional slippage based on trade size and token characteristics. Adds estimated MEV costs to trade records and total MEV cost to backtest metrics.

auto_correct_positions class-attribute instance-attribute

auto_correct_positions: bool = False

Enable automatic position correction when discrepancies are detected (default: False). When enabled, the reconciliation process will update tracked positions to match actual on-chain state when discrepancies exceed the alert threshold. Corrected positions will have auto_corrected=True in their ReconciliationEvent.

reconciliation_alert_threshold_pct class-attribute instance-attribute

reconciliation_alert_threshold_pct: Decimal = Decimal(
    "0.05"
)

Threshold percentage for triggering reconciliation alerts (default: 5%). When discrepancy_pct exceeds this threshold, an alert is emitted. Set to 0 to alert on any discrepancy, or higher values to only alert on significant drift.

random_seed class-attribute instance-attribute

random_seed: int | None = None

Random seed for reproducibility (default: None = no seed). When set, any randomness in the backtest (e.g., Monte Carlo simulations, random sampling) will use this seed for reproducibility. The seed is recorded in the config for re-running identical backtests.
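
As a general illustration of why a recorded seed makes runs repeatable, the sketch below draws from a private `random.Random` instance. It uses only the Python standard library; `sample_path` is a hypothetical helper, and how the backtester actually consumes random_seed is not shown in this reference.

```python
import random
from typing import Optional


def sample_path(seed: Optional[int], n: int = 5) -> list[float]:
    """Draw n Gaussian samples from a generator seeded with `seed`."""
    # A private generator avoids touching the global random state.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


first = sample_path(42)
second = sample_path(42)   # same seed, identical samples
other = sample_path(43)    # different seed, different samples
```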

strict_reproducibility class-attribute instance-attribute

strict_reproducibility: bool = False

Enforce strict reproducibility mode (default: False).

When enabled, the backtest will raise errors instead of using fallbacks that could produce non-deterministic results:

  • Raises ValueError if simulation timestamp is missing (instead of using datetime.now())
  • Raises ValueError if required historical data is unavailable
  • Requires all price sources to provide historical data, not just current prices

Use this mode when you need byte-identical results across multiple runs with the same configuration and random_seed. When disabled, the backtester will use reasonable defaults and log warnings instead of failing.

staleness_threshold_seconds class-attribute instance-attribute

staleness_threshold_seconds: int = 3600

Threshold in seconds for marking price data as stale (default: 3600 = 1 hour).

Price data older than this threshold relative to the simulation timestamp will be counted as stale in the data quality report. This helps identify backtests that may be using outdated price information.

Set to 0 to disable staleness tracking.
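
The staleness rule can be read as a single comparison. A minimal sketch, assuming "older than" means strictly greater than the threshold; `is_stale` is a hypothetical helper rather than a framework function:

```python
from datetime import datetime, timedelta


def is_stale(price_ts: datetime, sim_ts: datetime, threshold_seconds: int = 3600) -> bool:
    """Return True if the price is older than the threshold at simulation time."""
    if threshold_seconds == 0:
        return False  # a threshold of 0 disables staleness tracking
    age_seconds = (sim_ts - price_ts).total_seconds()
    return age_seconds > threshold_seconds


sim_time = datetime(2024, 1, 1, 12, 0)
two_hours_old = is_stale(sim_time - timedelta(hours=2), sim_time)       # stale
half_hour_old = is_stale(sim_time - timedelta(minutes=30), sim_time)    # fresh
tracking_off = is_stale(sim_time - timedelta(days=1), sim_time, 0)      # disabled
```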

institutional_mode class-attribute instance-attribute

institutional_mode: bool = False

Enable institutional-grade enforcement mode (default: False).

When enabled, applies stricter data quality requirements suitable for institutional trading operations:

  • Fails backtest if data coverage is below min_data_coverage threshold
  • Disables hardcoded price fallbacks (allow_hardcoded_fallback=False)
  • Requires historical price data from verified sources
  • Enforces strict reproducibility (strict_reproducibility=True)

This mode is designed for production-grade backtests where data quality and reproducibility are critical. Use for institutional trading strategies or when accurate PnL calculations are required for compliance/reporting.

min_data_coverage class-attribute instance-attribute

min_data_coverage: Decimal = Decimal('0.98')

Minimum data coverage ratio required in institutional mode (default: 0.98 = 98%).

When institutional_mode is enabled, the backtest will fail if the actual data coverage ratio (successful price lookups / total lookups) falls below this threshold.

When institutional_mode is disabled, this threshold is only used for warnings in the data quality report, not enforcement.

Valid range: 0.0 to 1.0 (0% to 100%)
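
The coverage check described above amounts to a ratio and a threshold comparison. In this sketch the helper names, the "ok"/"warning" return values, and the use of ValueError are assumptions made for illustration; the real engine may report failures differently.

```python
from decimal import Decimal


def coverage_ratio(successful_lookups: int, total_lookups: int) -> Decimal:
    """Successful price lookups divided by total lookups."""
    if total_lookups == 0:
        return Decimal("1")  # assumption: no lookups means nothing was missing
    return Decimal(successful_lookups) / Decimal(total_lookups)


def check_coverage(
    successful_lookups: int,
    total_lookups: int,
    min_data_coverage: Decimal = Decimal("0.98"),
    institutional_mode: bool = False,
) -> str:
    """Enforce the threshold in institutional mode, otherwise only warn."""
    ratio = coverage_ratio(successful_lookups, total_lookups)
    if ratio >= min_data_coverage:
        return "ok"
    if institutional_mode:
        raise ValueError(f"data coverage {ratio} is below {min_data_coverage}")
    return "warning"


good = check_coverage(990, 1000)   # 0.99 meets the 0.98 threshold
weak = check_coverage(970, 1000)   # 0.97 only warns outside institutional mode
```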

allow_hardcoded_fallback class-attribute instance-attribute

allow_hardcoded_fallback: bool = False

Allow hardcoded price fallbacks when price data is unavailable (default: False).

When disabled (default): The backtester will raise an error if it cannot find price data, ensuring that all valuations use actual market prices. This is the institutional-grade setting for production backtests.

When enabled: The backtester may use hardcoded fallback prices for tokens when historical price data is unavailable. This can mask data quality issues and should only be used for development/testing where price accuracy is not critical.

Note: This is automatically set to False when institutional_mode=True in __post_init__, as institutional-grade backtests should never use arbitrary hardcoded prices.

Environment variable: Set ALMANAK_ALLOW_HARDCODED_PRICES=1 to override for testing scenarios where you need relaxed defaults.

allow_degraded_data class-attribute instance-attribute

allow_degraded_data: bool = True

Allow backtests to proceed with degraded or incomplete data (default: True).

When enabled, the backtester will continue execution even when:

  • Some price data is missing or interpolated
  • Data sources return stale information
  • Historical data has gaps

When disabled, the backtester will fail fast if data quality issues are detected, ensuring only high-quality data is used for analysis.

Note: This is automatically set to False when institutional_mode=True in __post_init__, as institutional-grade backtests require complete data.

require_symbol_mapping class-attribute instance-attribute

require_symbol_mapping: bool = False

Require all token addresses to be resolved to symbols (default: False).

When enabled, the backtester will fail if any token address cannot be resolved to a human-readable symbol. This ensures all trade records and reports use consistent, recognizable token names.

When disabled, unresolved token addresses are used as-is (checksummed), which may make reports harder to read and audit.

Note: This is automatically set to True when institutional_mode=True in __post_init__, as institutional-grade backtests require clear symbol identification for compliance and reporting purposes.
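
The notes on allow_hardcoded_fallback, allow_degraded_data, and require_symbol_mapping describe the same mechanism: institutional_mode overrides related fields after initialization. A stripped-down sketch of that pattern, using a hypothetical `ConfigSketch` dataclass with only the relevant fields; it is assumed here that strict_reproducibility is forced in the same place.

```python
from dataclasses import dataclass


@dataclass
class ConfigSketch:
    """Hypothetical subset of the config fields, with the documented defaults."""

    institutional_mode: bool = False
    strict_reproducibility: bool = False
    allow_hardcoded_fallback: bool = False
    allow_degraded_data: bool = True
    require_symbol_mapping: bool = False

    def __post_init__(self) -> None:
        # Institutional mode wins over whatever the caller passed explicitly.
        if self.institutional_mode:
            self.strict_reproducibility = True
            self.allow_hardcoded_fallback = False
            self.allow_degraded_data = False
            self.require_symbol_mapping = True


relaxed = ConfigSketch(allow_hardcoded_fallback=True)
strict = ConfigSketch(institutional_mode=True, allow_hardcoded_fallback=True)
```

Here `relaxed` keeps the caller's choices, while `strict` ends up with the hardcoded fallback disabled even though it was requested.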

use_historical_gas_prices class-attribute instance-attribute

use_historical_gas_prices: bool = False

Use historical ETH prices for accurate gas cost simulation (default: False).

When enabled, the backtester will attempt to fetch historical ETH prices at each simulation timestamp to calculate gas costs more accurately. This provides realistic gas cost estimates that reflect market conditions at the time of simulated trades.

When disabled, gas costs use the current ETH price or gas_eth_price_override if specified. This is faster but less accurate for historical backtests.

Note: Requires a data provider that supports historical price lookups.

gas_eth_price_override class-attribute instance-attribute

gas_eth_price_override: Decimal | None = None

Override ETH price for gas cost calculations (default: None = use market price).

When set, this value is used as the ETH price for all gas cost calculations, ignoring both historical and current market prices. This is useful for:

  • Testing with a fixed ETH price for reproducibility
  • Stress testing with extreme ETH price scenarios
  • Backtests where gas cost accuracy is not critical

When None, gas costs use:

  1. Historical ETH price (if use_historical_gas_prices=True)
  2. Current ETH price from data provider
  3. Default fallback ($3000) with warning if unavailable

Value should be in USD (e.g., Decimal("3000") for $3000 per ETH).
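
The ETH price resolution order can be written as a short chain of checks. This is a sketch of the documented priority, not the engine's code; `resolve_eth_price` and its parameters are hypothetical, and the source label is returned only to make the chosen branch visible.

```python
from decimal import Decimal
from typing import Optional

DEFAULT_ETH_PRICE_USD = Decimal("3000")  # documented last-resort fallback


def resolve_eth_price(
    override: Optional[Decimal],
    historical: Optional[Decimal],
    current: Optional[Decimal],
    use_historical_gas_prices: bool,
) -> tuple[Decimal, str]:
    """Return (eth_price_usd, source) following the documented priority."""
    if override is not None:
        return override, "override"          # gas_eth_price_override wins outright
    if use_historical_gas_prices and historical is not None:
        return historical, "historical"
    if current is not None:
        return current, "current"
    return DEFAULT_ETH_PRICE_USD, "fallback"  # the real engine also logs a warning


pinned = resolve_eth_price(Decimal("2500"), Decimal("2200"), Decimal("3100"), True)
from_history = resolve_eth_price(None, Decimal("2200"), Decimal("3100"), True)
from_market = resolve_eth_price(None, Decimal("2200"), Decimal("3100"), False)
last_resort = resolve_eth_price(None, None, None, True)
```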

use_historical_gas_gwei class-attribute instance-attribute

use_historical_gas_gwei: bool = False

Use historical gas prices (gwei) from gas price provider (default: False).

When enabled and a gas_provider is attached to the PnLBacktester, the engine will fetch historical gas prices at each simulation timestamp instead of using the static gas_price_gwei value. This provides more realistic gas cost estimates that reflect network congestion at historical timestamps.

Priority order for gas price (gwei):

  1. Historical gas price from gas_provider (if use_historical_gas_gwei=True)
  2. MarketState.gas_price_gwei (if populated by data provider)
  3. config.gas_price_gwei (static default: 30 gwei)

When disabled, gas costs use the static gas_price_gwei for all trades, which is faster but may not reflect actual network conditions.

Note: Requires a GasPriceProvider (e.g., EtherscanGasPriceProvider) to be passed to the PnLBacktester. If enabled without a provider, falls back to MarketState.gas_price_gwei or config.gas_price_gwei with a warning.

track_gas_prices class-attribute instance-attribute

track_gas_prices: bool = False

Track detailed gas price records for each trade (default: False).

When enabled, the backtester records a GasPriceRecord for each trade, capturing the gas price in gwei, source, and USD cost. These records are stored in BacktestResult.gas_prices_used for detailed analysis.

This is useful for:

  • Analyzing gas price volatility impact on strategy performance
  • Understanding gas cost breakdown by source (historical vs config)
  • Auditing gas costs in institutional-grade backtests

When disabled, only summary statistics (gas_price_summary) are populated from the TradeRecord.gas_price_gwei values, reducing result size.

Note: Gas price summary statistics are always calculated regardless of this setting, since TradeRecord already contains gas_price_gwei.

preflight_validation class-attribute instance-attribute

preflight_validation: bool = True

Enable preflight validation before running backtest (default: True).

When enabled, the backtester performs validation checks before starting the simulation to ensure data requirements can be met:

  • Checks price data availability for all tokens in config
  • Verifies data provider capabilities match requirements
  • Tests archive node accessibility if historical TWAP/Chainlink needed
  • Reports estimated data coverage and potential gaps

Results are returned in a PreflightReport with pass/fail and details. This helps identify data issues early, before spending time on a backtest that would fail or produce inaccurate results.

When disabled, the backtest proceeds without validation, which is faster but may encounter data issues during simulation.

fail_on_preflight_error class-attribute instance-attribute

fail_on_preflight_error: bool = True

Fail fast if preflight validation fails (default: True).

When enabled (True): If preflight validation detects critical issues (e.g., missing price data, insufficient data coverage), the backtester raises PreflightValidationError with an actionable error message that includes:

  • What failed (specific checks that did not pass)
  • Why it failed (the underlying cause)
  • How to fix it (recommendations for resolution)

When disabled (False): The backtester logs warnings about preflight issues but continues in degraded mode. This is useful for exploratory backtests where you want to see partial results even with data gaps.

The preflight_passed field in BacktestResult indicates whether preflight validation passed, regardless of this setting.

Note: This setting only applies when preflight_validation=True.

duration_seconds property

duration_seconds: int

Get the total backtest duration in seconds.

duration_days property

duration_days: float

Get the total backtest duration in days.

duration_hours property

duration_hours: float

Get the total backtest duration in hours.

estimated_ticks property

estimated_ticks: int

Get the estimated number of simulation ticks.

interval_hours property

interval_hours: float

Get the interval between ticks in hours.
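
The duration and tick properties follow from the time range and interval. The arithmetic below reproduces them for the example range used earlier (2024-01-01 to 2024-06-01 at the default hourly interval). The floor division for the tick count is an assumption: the real property may, for instance, count the final tick as well.

```python
from datetime import datetime

start_time = datetime(2024, 1, 1)
end_time = datetime(2024, 6, 1)
interval_seconds = 3600  # documented default: one tick per hour

duration_seconds = int((end_time - start_time).total_seconds())
duration_hours = duration_seconds / 3600
duration_days = duration_seconds / 86400
interval_hours = interval_seconds / 3600

# Assumption: ticks are estimated by floor division of duration by interval.
estimated_ticks = duration_seconds // interval_seconds

print(f"Duration: {duration_days:.1f} days")   # 152.0 days (2024 is a leap year)
print(f"Estimated ticks: {estimated_ticks}")   # 3648 under the assumption above
```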

__post_init__

__post_init__() -> None

Validate configuration after initialization.

get_gas_cost_usd

get_gas_cost_usd(
    gas_used: int, eth_price_usd: Decimal
) -> Decimal

Calculate gas cost in USD for a given amount of gas used.

Parameters:

Name Type Description Default
gas_used int

Amount of gas consumed by the transaction

required
eth_price_usd Decimal

Current ETH price in USD

required

Returns:

Type Description
Decimal

Gas cost in USD
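
The conversion behind this method is standard unit arithmetic: gas used times the gas price in gwei gives a cost in gwei, and 1 ETH equals 10^9 gwei. The sketch below takes the gas price as an explicit argument for clarity; `gas_cost_usd_sketch` is a hypothetical stand-in, whereas the real method presumably reads the gas price from the config.

```python
from decimal import Decimal

GWEI_PER_ETH = Decimal(10**9)


def gas_cost_usd_sketch(
    gas_used: int,
    gas_price_gwei: Decimal,
    eth_price_usd: Decimal,
) -> Decimal:
    """Convert gas used into a USD cost."""
    cost_gwei = Decimal(gas_used) * gas_price_gwei
    cost_eth = cost_gwei / GWEI_PER_ETH
    return cost_eth * eth_price_usd


# 150,000 gas at 30 gwei is 0.0045 ETH; at $3000 per ETH that is $13.50.
swap_cost = gas_cost_usd_sketch(150_000, Decimal("30"), Decimal("3000"))
```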

to_dict

to_dict() -> dict[str, Any]

Serialize to dictionary.

to_dict_with_metadata

to_dict_with_metadata(
    data_provider_info: dict[str, Any] | None = None,
) -> dict[str, Any]

Serialize to dictionary with full metadata for reproducibility.

This method extends to_dict() to include additional metadata needed to reproduce a backtest exactly, such as:

  • Data provider versions and timestamps
  • SDK/framework versions
  • Run timestamp

Parameters:

Name Type Description Default
data_provider_info dict[str, Any] | None

Optional dict containing data provider information:

  • name: Provider name (e.g., "coingecko", "chainlink")
  • version: Provider version if available
  • data_fetched_at: ISO timestamp when data was fetched
  • cache_hit_rate: Optional cache hit rate percentage
  • additional provider-specific metadata

None

Returns:

Type Description
dict[str, Any]

Dictionary with full config and metadata for reproducibility

calculate_config_hash

calculate_config_hash() -> str

Calculate a deterministic hash of the configuration for verification.

The hash is calculated from all configuration parameters that affect backtest results, excluding runtime metadata like timestamps. This enables verification that a backtest was run with identical config.

The hash uses SHA-256 and includes:

  • Time range (start_time, end_time, interval_seconds)
  • Capital settings (initial_capital_usd)
  • Model settings (fee_model, slippage_model)
  • Gas settings (include_gas_costs, gas_price_gwei)
  • Execution settings (inclusion_delay_blocks)
  • Chain and token settings
  • Metrics settings (benchmark_token, risk_free_rate, etc.)
  • Margin settings
  • Other simulation parameters

Returns:

Type Description
str

64-character hex string (SHA-256 hash)

Example

config = PnLBacktestConfig(...)
hash1 = config.calculate_config_hash()

# Same config produces same hash
config2 = PnLBacktestConfig(...)  # identical params
hash2 = config2.calculate_config_hash()
assert hash1 == hash2
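
One common way to obtain a deterministic hash of this kind is to serialize the parameters into canonical JSON (sorted keys, fixed separators) and hash the bytes. The sketch below shows that general technique with the standard library only; `config_hash_sketch` is hypothetical, and the exact serialization used by calculate_config_hash is not documented here.

```python
import hashlib
import json
from datetime import datetime
from decimal import Decimal
from typing import Any


def config_hash_sketch(params: dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON rendering of the parameters."""
    canonical = json.dumps(
        params,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # fixed separators, no incidental whitespace
        default=str,             # render datetime and Decimal values as strings
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


params_a = {
    "start_time": datetime(2024, 1, 1),
    "end_time": datetime(2024, 6, 1),
    "initial_capital_usd": Decimal("10000"),
}
# Same values, different insertion order.
params_b = {
    "initial_capital_usd": Decimal("10000"),
    "end_time": datetime(2024, 6, 1),
    "start_time": datetime(2024, 1, 1),
}

hash_a = config_hash_sketch(params_a)
hash_b = config_hash_sketch(params_b)
hash_c = config_hash_sketch({**params_a, "initial_capital_usd": Decimal("20000")})
```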

from_dict classmethod

from_dict(data: dict[str, Any]) -> PnLBacktestConfig

Deserialize from dictionary.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary containing config fields

required

Returns:

Type Description
PnLBacktestConfig

PnLBacktestConfig instance

__repr__

__repr__() -> str

Return a human-readable representation.

Paper Trader

PaperTrader

almanak.framework.backtesting.PaperTrader dataclass

PaperTrader(
    fork_manager: RollingForkManager,
    portfolio_tracker: PaperPortfolioTracker,
    config: PaperTraderConfig,
    event_callback: PaperTradeEventCallback | None = None,
)

Main paper trading engine for fork-based strategy simulation.

The PaperTrader executes strategy decisions on local Anvil forks, providing accurate simulation of real DeFi execution. It:

  1. Manages fork lifecycle via RollingForkManager
  2. Calls strategy.decide() at configured intervals
  3. Compiles intents to ActionBundles
  4. Executes transactions on the fork via ExecutionOrchestrator
  5. Tracks portfolio state and records trades
  6. Calculates comprehensive performance metrics

Attributes:

Name Type Description
fork_manager RollingForkManager

RollingForkManager for Anvil fork lifecycle

portfolio_tracker PaperPortfolioTracker

PaperPortfolioTracker for state tracking

config PaperTraderConfig

PaperTraderConfig with execution parameters

event_callback PaperTradeEventCallback | None

Optional callback for trading events

Example

trader = PaperTrader(
    fork_manager=fork_manager,
    portfolio_tracker=portfolio_tracker,
    config=PaperTraderConfig(tick_interval_seconds=60),
)

# Run for 1 hour
result = await trader.run(my_strategy, duration_seconds=3600)

# Or run indefinitely until stopped
await trader.start(my_strategy)
# ... later ...
await trader.stop()

__post_init__

__post_init__() -> None

Validate configuration after initialization.

run async

run(
    strategy: PaperTradeableStrategy,
    duration_seconds: float | None = None,
    max_ticks: int | None = None,
) -> BacktestResult

Run a paper trading session for the specified duration.

This is the main entry point for paper trading. It:

  1. Initializes the fork and orchestrator
  2. Runs the trading loop for the specified duration
  3. Calculates and returns metrics

Parameters:

Name Type Description Default
strategy PaperTradeableStrategy

Strategy to paper trade

required
duration_seconds float | None

Maximum duration in seconds (None = config default)

None
max_ticks int | None

Maximum number of ticks (None = no limit)

None

Returns:

Type Description
BacktestResult

BacktestResult with comprehensive metrics and trades

Raises:

Type Description
RuntimeError

If already running

start async

start(strategy: PaperTradeableStrategy) -> None

Start continuous paper trading until stop() is called.

This method runs paper trading indefinitely. Call stop() to end the session gracefully.

Parameters:

Name Type Description Default
strategy PaperTradeableStrategy

Strategy to paper trade

required

Raises:

Type Description
RuntimeError

If already running

stop async

stop() -> None

Stop the current paper trading session.

Signals the trading loop to exit gracefully. The current tick will complete before stopping.

is_running

is_running() -> bool

Check if paper trading is currently active.

Returns:

Type Description
bool

True if a session is running

tick async

tick() -> PaperTrade | None

Execute one trading cycle (tick) manually.

This method allows manual tick execution for testing or custom integration. It performs one complete trading cycle:

  1. Optionally resets fork to latest block (based on config)
  2. Creates MarketSnapshot from current fork state
  3. Calls strategy.decide(snapshot) to get intent
  4. If intent returned (non-HOLD), executes via orchestrator on fork
  5. Records trade result in portfolio_tracker
  6. Handles and records errors gracefully

Prerequisites

  • PaperTrader must be initialized (call start() or run() first)
  • A strategy must be set via _current_strategy

Returns:

Type Description
PaperTrade | None

PaperTrade if a trade was executed successfully, None otherwise (including HOLD decisions, errors, or no strategy set)

Example

# Manual tick control
trader = PaperTrader(fork_manager, portfolio_tracker, config)
await trader._initialize_fork()
await trader._initialize_orchestrator()
trader._current_strategy = my_strategy
trader._running = True

# Execute single tick
trade = await trader.tick()
if trade:
    print(f"Trade executed: {trade.tx_hash}")

run_loop async

run_loop(
    strategy: PaperTradeableStrategy,
    max_ticks: int | None = None,
) -> PaperTradingSummary

Run a paper trading session with a simple tick loop.

This method implements the classic paper trading loop pattern:

  1. Initialize fork and orchestrator
  2. Loop: call tick(), sleep for tick_interval_seconds
  3. Stop when max_ticks reached or _running becomes False
  4. Cleanup in finally block

Unlike run(), which returns a comprehensive BacktestResult, this method returns a simpler PaperTradingSummary focused on trade statistics.
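
The loop pattern itself is small enough to show with plain asyncio. The class below is a hypothetical stand-in that only counts ticks: it does not fork a chain or call a strategy, and its names mirror the description above rather than the real PaperTrader internals.

```python
import asyncio
from typing import Optional


class LoopSketch:
    """Toy tick loop: counts ticks instead of trading."""

    def __init__(self, tick_interval_seconds: float = 0.0) -> None:
        self.tick_interval_seconds = tick_interval_seconds
        self._running = False
        self.ticks = 0

    async def tick(self) -> None:
        self.ticks += 1  # a real tick would snapshot, decide, and execute

    def stop(self) -> None:
        self._running = False  # the current tick finishes, then the loop exits

    async def run_loop(self, max_ticks: Optional[int] = None) -> int:
        if self._running:
            raise RuntimeError("already running")
        self._running = True
        try:
            while self._running and (max_ticks is None or self.ticks < max_ticks):
                await self.tick()
                await asyncio.sleep(self.tick_interval_seconds)
        finally:
            self._running = False  # cleanup runs even if a tick raises
        return self.ticks


completed = asyncio.run(LoopSketch().run_loop(max_ticks=3))
```

Resetting `_running` in the `finally` block means the object can be reused for a second session even when a tick raises.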

Parameters:

Name Type Description Default
strategy PaperTradeableStrategy

Strategy to paper trade

required
max_ticks int | None

Maximum number of ticks to run (None = use config.max_ticks; if that is also None, runs until stop() is called)

None

Returns:

Type Description
PaperTradingSummary

PaperTradingSummary with session statistics and trade details

Raises:

Type Description
RuntimeError

If already running

Example

trader = PaperTrader(fork_manager, portfolio_tracker, config)
summary = await trader.run_loop(my_strategy, max_ticks=100)
print(summary.summary())

PaperTraderConfig

almanak.framework.backtesting.PaperTraderConfig dataclass

PaperTraderConfig(
    chain: str,
    rpc_url: str,
    strategy_id: str,
    initial_eth: Decimal = Decimal("10"),
    initial_tokens: dict[str, Decimal] = dict(),
    tick_interval_seconds: int = 60,
    max_ticks: int | None = None,
    anvil_port: int = 8546,
    reset_fork_every_tick: bool = True,
    startup_timeout_seconds: float = 30.0,
    auto_impersonate: bool = True,
    block_time: int | None = None,
    wallet_address: str | None = None,
    log_trades: bool = True,
    log_level: str = "INFO",
    price_source: Literal[
        "coingecko", "chainlink", "twap", "auto"
    ] = "auto",
    strict_price_mode: bool = True,
    allow_hardcoded_fallback: bool | None = None,
)

Configuration for a paper trading session.

Controls all parameters of the paper trading session including chain, initial balances, tick intervals, and Anvil fork settings.

Paper trading executes real transactions on a local Anvil fork, allowing strategies to be validated with actual DeFi protocol interactions before deployment with real capital.

Attributes:

Name Type Description
chain str

Blockchain to paper trade on (e.g., "arbitrum", "ethereum")

rpc_url str

Archive RPC URL to fork from (Alchemy, Infura, etc.)

strategy_id str

Identifier of the strategy being tested

initial_eth Decimal

Initial ETH balance for the paper wallet (default: 10)

initial_tokens dict[str, Decimal]

Dict of token symbol to amount for initial balances

tick_interval_seconds int

Time between trading ticks in seconds (default: 60)

max_ticks int | None

Maximum number of ticks to run (default: None = run indefinitely)

anvil_port int

Port to run Anvil on (default: 8546)

reset_fork_every_tick bool

Whether to reset fork to latest block each tick (default: True)

startup_timeout_seconds float

Timeout for Anvil startup (default: 30)

auto_impersonate bool

Enable auto-impersonation for any address (default: True)

block_time int | None

Optional block time in seconds (default: None = instant)

wallet_address str | None

Optional paper wallet address (default: None = auto-generated)

log_trades bool

Whether to log individual trades (default: True)

log_level str

Logging level for paper trader (default: "INFO")

price_source Literal['coingecko', 'chainlink', 'twap', 'auto']

Price source to use ('coingecko', 'chainlink', 'twap', 'auto')

Example

config = PaperTraderConfig(
    chain="arbitrum",
    rpc_url="https://arb1.arbitrum.io/rpc",
    strategy_id="momentum_v1",
    initial_eth=Decimal("10"),
    initial_tokens={"USDC": Decimal("10000")},
)
print(f"Chain: {config.chain} (ID: {config.chain_id})")
print(f"Max duration: {config.max_duration_seconds}s")

price_source class-attribute instance-attribute

price_source: Literal[
    "coingecko", "chainlink", "twap", "auto"
] = "auto"

Price source to use for portfolio valuation.

Options
  • 'coingecko': Use CoinGecko API for market prices. Best for: General tokens, off-chain price feeds, no RPC needed.
  • 'chainlink': Use Chainlink oracles for on-chain prices. Best for: Major tokens with Chainlink feeds, trustless pricing.
  • 'twap': Use time-weighted average price from DEX pools. Best for: On-chain pricing, newer tokens, DEX-native prices.
  • 'auto' (default): Automatic fallback chain - tries Chainlink first, falls back to TWAP, then CoinGecko if others fail.
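The 'auto' fallback order listed above (Chainlink, then TWAP, then CoinGecko) can be illustrated with a small resolver. This is a sketch under assumptions: resolve_price, the stub providers, and the prices they return are hypothetical and do not reflect the library's internal provider interfaces.

```python
from decimal import Decimal
from typing import Callable, Optional

PriceFn = Callable[[str], Optional[Decimal]]


def resolve_price(token: str, providers: list[tuple[str, PriceFn]]) -> tuple[Decimal, str]:
    """Try each provider in order; return the first price and its source name."""
    for name, fetch in providers:
        try:
            price = fetch(token)
        except Exception:
            # A failing provider is skipped and the next one is tried.
            continue
        if price is not None:
            return price, name
    raise ValueError(f"No price available for {token}")


# Hypothetical stubs standing in for real price sources.
def chainlink_stub(token: str) -> Optional[Decimal]:
    raise LookupError("no feed for token")


def twap_stub(token: str) -> Optional[Decimal]:
    return Decimal("2950.25")


def coingecko_stub(token: str) -> Optional[Decimal]:
    return Decimal("2951.00")


AUTO_CHAIN = [
    ("chainlink", chainlink_stub),
    ("twap", twap_stub),
    ("coingecko", coingecko_stub),
]
```

Returning the source name alongside the price makes it possible to record which provider actually served each valuation.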

strict_price_mode class-attribute instance-attribute

strict_price_mode: bool = True

Whether to fail when price providers cannot return a price.

When True (default): Raises ValueError if all price providers fail for a token. This is the institutional-grade setting that ensures all prices are from real data sources. Use this for production backtests where accuracy is critical. Error messages include the failed token and chain for debugging.

When False: Falls back to hardcoded prices for common tokens (ETH=$3000, BTC=$60000, etc.) when all price providers fail. This allows backtests to complete but may produce inaccurate results. Only use this for development/testing where price accuracy is not critical.

Note: This is the inverse of the deprecated allow_hardcoded_fallback field. If both are set, strict_price_mode takes precedence.

Environment variable: Set ALMANAK_ALLOW_HARDCODED_PRICES=1 to override this setting and behave as if strict_price_mode=False, for testing scenarios.
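The difference between the two modes can be sketched as follows. The function name price_or_fallback and its signature are assumptions; only the ValueError in strict mode and the example fallback values (ETH=$3000, BTC=$60000) come from the description above.

```python
from decimal import Decimal
from typing import Optional

# Fallback values taken from the description above; the real table may differ.
HARDCODED_PRICES = {"ETH": Decimal("3000"), "BTC": Decimal("60000")}


def price_or_fallback(
    token: str,
    chain: str,
    provider_price: Optional[Decimal],
    strict_price_mode: bool = True,
) -> Decimal:
    """Return the provider price, or apply the strict/fallback policy if it is missing."""
    if provider_price is not None:
        return provider_price
    if strict_price_mode:
        # Strict mode: never substitute a hardcoded value.
        raise ValueError(f"All price providers failed for {token} on {chain}")
    if token not in HARDCODED_PRICES:
        raise ValueError(f"No hardcoded fallback for {token} on {chain}")
    return HARDCODED_PRICES[token]
```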

allow_hardcoded_fallback class-attribute instance-attribute

allow_hardcoded_fallback: bool | None = None

DEPRECATED: Use strict_price_mode instead.

This field is kept for backward compatibility. If set, it will be converted to the equivalent strict_price_mode value (allow_hardcoded_fallback=False is equivalent to strict_price_mode=True).

Will be removed in a future version.

chain_id property

chain_id: int

Get the chain ID for the configured chain.

max_duration_seconds property

max_duration_seconds: int | None

Get the maximum duration in seconds, or None if indefinite.

max_duration_minutes property

max_duration_minutes: float | None

Get the maximum duration in minutes, or None if indefinite.

max_duration_hours property

max_duration_hours: float | None

Get the maximum duration in hours, or None if indefinite.

tick_interval_minutes property

tick_interval_minutes: float

Get the tick interval in minutes.
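One plausible way these duration properties relate to the config fields is shown below. The formula max_ticks × tick_interval_seconds is an assumption, not confirmed by this page, and DurationSketch is a hypothetical stand-in for PaperTraderConfig.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DurationSketch:
    """Hypothetical subset of the config; not the real PaperTraderConfig."""

    tick_interval_seconds: int = 60
    max_ticks: Optional[int] = None

    @property
    def max_duration_seconds(self) -> Optional[int]:
        # Assumed formula: number of ticks times the interval between them.
        if self.max_ticks is None:
            return None
        return self.max_ticks * self.tick_interval_seconds

    @property
    def max_duration_minutes(self) -> Optional[float]:
        seconds = self.max_duration_seconds
        return None if seconds is None else seconds / 60

    @property
    def max_duration_hours(self) -> Optional[float]:
        seconds = self.max_duration_seconds
        return None if seconds is None else seconds / 3600

    @property
    def tick_interval_minutes(self) -> float:
        return self.tick_interval_seconds / 60
```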

fork_rpc_url property

fork_rpc_url: str

Get the local fork RPC URL.

__post_init__

__post_init__() -> None

Validate configuration after initialization.

get_initial_balances

get_initial_balances() -> dict[str, Decimal]

Get all initial balances including ETH.

Returns:

Type Description
dict[str, Decimal]

Dictionary of token symbol to initial balance amount

to_dict

to_dict() -> dict[str, Any]

Serialize to dictionary.

from_dict classmethod

from_dict(data: dict[str, Any]) -> PaperTraderConfig

Deserialize from dictionary.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary containing config fields

required

Returns:

Type Description
PaperTraderConfig

PaperTraderConfig instance
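A to_dict/from_dict round trip for a config holding Decimal values might look like the sketch below. Encoding Decimal as a string is an assumption made here so the dictionary survives JSON; the actual serialization format is not documented on this page, and ConfigSketch is a hypothetical reduced class.

```python
import json
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Any


@dataclass
class ConfigSketch:
    """Hypothetical reduced config used only to show the round trip."""

    chain: str
    strategy_id: str
    initial_eth: Decimal = Decimal("10")
    initial_tokens: dict[str, Decimal] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        # Decimals become strings so no precision is lost in JSON.
        return {
            "chain": self.chain,
            "strategy_id": self.strategy_id,
            "initial_eth": str(self.initial_eth),
            "initial_tokens": {k: str(v) for k, v in self.initial_tokens.items()},
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ConfigSketch":
        return cls(
            chain=data["chain"],
            strategy_id=data["strategy_id"],
            initial_eth=Decimal(data.get("initial_eth", "10")),
            initial_tokens={
                k: Decimal(v) for k, v in data.get("initial_tokens", {}).items()
            },
        )
```

A quick check is to pass the dictionary through json.dumps and json.loads and compare the restored object with the original.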

__repr__

__repr__() -> str

Return a human-readable representation.

Results

BacktestResult

almanak.framework.backtesting.BacktestResult dataclass

BacktestResult(
    engine: BacktestEngine,
    strategy_id: str,
    start_time: datetime,
    end_time: datetime,
    metrics: BacktestMetrics,
    trades: list[TradeRecord] = list(),
    equity_curve: list[EquityPoint] = list(),
    initial_capital_usd: Decimal = Decimal("10000"),
    final_capital_usd: Decimal = Decimal("10000"),
    chain: str = "arbitrum",
    run_started_at: datetime | None = None,
    run_ended_at: datetime | None = None,
    run_duration_seconds: float = 0.0,
    config: dict[str, Any] = dict(),
    error: str | None = None,
    lending_liquidations: list[
        LendingLiquidationEvent
    ] = list(),
    aggregated_portfolio_view: AggregatedPortfolioView
    | None = None,
    reconciliation_events: list[
        ReconciliationEvent
    ] = list(),
    walk_forward_results: WalkForwardResult | None = None,
    monte_carlo_results: MonteCarloSimulationResult
    | None = None,
    crisis_results: CrisisMetrics | None = None,
    errors: list[dict[str, Any]] = list(),
    backtest_id: str | None = None,
    phase_timings: list[dict[str, Any]] = list(),
    config_hash: str | None = None,
    execution_delayed_at_end: int = 0,
    data_source_capabilities: dict[
        str, HistoricalDataCapability
    ] = dict(),
    data_source_warnings: list[str] = list(),
    data_quality: DataQualityReport | None = None,
    institutional_compliance: bool = True,
    compliance_violations: list[str] = list(),
    fallback_usage: dict[str, int] = dict(),
    preflight_report: PreflightReport | None = None,
    preflight_passed: bool = True,
    gas_prices_used: list[GasPriceRecord] = list(),
    gas_price_summary: GasPriceSummary | None = None,
    parameter_sources: ParameterSourceTracker | None = None,
    accuracy_estimate: AccuracyEstimate | None = None,
    data_coverage_metrics: DataCoverageMetrics
    | None = None,
)

Complete results from a backtest run.

This model is used by both the PnL Backtester and Paper Trader to provide consistent result formatting and analysis.

Attributes:

Name Type Description
engine BacktestEngine

Which backtesting engine was used (pnl or paper)

strategy_id str

Identifier of the strategy being tested

start_time datetime

When the backtest started (simulation time)

end_time datetime

When the backtest ended (simulation time)

metrics BacktestMetrics

Calculated performance metrics

trades list[TradeRecord]

List of all trade records

equity_curve list[EquityPoint]

Portfolio value over time

initial_capital_usd Decimal

Starting capital in USD

final_capital_usd Decimal

Ending capital in USD

chain str

Target blockchain (arbitrum, base, etc.)

run_started_at datetime | None

When the backtest run actually started (wall time)

run_ended_at datetime | None

When the backtest run actually completed (wall time)

run_duration_seconds float

Wall clock duration of the backtest run

config dict[str, Any]

Configuration used for the backtest

error str | None

Error message if backtest failed

lending_liquidations list[LendingLiquidationEvent]

List of lending liquidation events that occurred

aggregated_portfolio_view AggregatedPortfolioView | None

Tick-by-tick portfolio state snapshots with risk scores

reconciliation_events list[ReconciliationEvent]

List of position reconciliation events (discrepancies detected)

walk_forward_results WalkForwardResult | None

Results from walk-forward optimization (if run with --walk-forward)

monte_carlo_results MonteCarloSimulationResult | None

Results from Monte Carlo simulation (if run with --monte-carlo). Contains return confidence intervals, drawdown probabilities, and path statistics.

crisis_results CrisisMetrics | None

Crisis-specific metrics when backtest was run during a crisis scenario. Contains drawdown analysis, recovery time, and comparison to normal period performance.

errors list[dict[str, Any]]

List of error records as dictionaries with timestamps and context for debugging and analysis. Each error dict contains: timestamp, error_type, error_message, classification (with error_type, category, is_recoverable, is_fatal, is_non_critical, suggested_action), context, and handled action.

backtest_id str | None

Unique correlation ID (UUID) for this backtest run. Used for structured logging and tracing across all log messages generated during this backtest.

phase_timings list[dict[str, Any]]

List of phase timing records showing how long each backtest phase took. Each record contains: phase_name, start_time, end_time, duration_seconds, error. Useful for performance analysis and identifying bottlenecks.

config_hash str | None

SHA-256 hash of the configuration used for this backtest. Enables verification that a backtest was run with identical configuration. Calculated from all parameters that affect backtest results, excluding runtime metadata.

execution_delayed_at_end int

Count of pending intents executed at simulation end. These were queued due to inclusion_delay_blocks > 0 and executed with the last market state when the simulation completed.

data_source_capabilities dict[str, HistoricalDataCapability]

Dictionary mapping data provider names to their HistoricalDataCapability enum values. Shows which providers were used and their ability to provide accurate historical data (FULL, CURRENT_ONLY, PRE_CACHE). Useful for understanding potential data quality limitations in the backtest.

data_source_warnings list[str]

List of warning messages about data source limitations. Generated when providers with CURRENT_ONLY or PRE_CACHE capability are used, as these may affect backtest accuracy.

data_quality DataQualityReport | None

Data quality metrics for the backtest run. Includes coverage ratio, source breakdown, stale data count, and interpolation count. Useful for understanding data reliability and identifying potential accuracy issues.

institutional_compliance bool

Whether the backtest run meets institutional standards. Set to False when any strict reproducibility, data quality, or compliance check fails. Use compliance_violations to see which checks failed.

compliance_violations list[str]

List of compliance violations that caused institutional_compliance to be set to False. Each entry describes a specific compliance failure such as "CURRENT_ONLY data provider used", "Symbol mapping failed for 0x...", "Data coverage below minimum threshold (95% < 98%)".

fallback_usage dict[str, int]

Dictionary tracking count of each fallback type used during the backtest. Keys include: "hardcoded_price", "default_gas_price", "default_usd_amount". Empty dict means no fallbacks were used, which is the desired state for institutional-grade backtests.

preflight_report PreflightReport | None

Preflight validation report from checks run before the backtest. Contains pass/fail status, individual check results, estimated data coverage, and recommendations for fixing any issues. None if preflight validation was disabled.

preflight_passed bool

Whether preflight validation passed (True) or failed (False). Defaults to True if preflight validation was disabled. This is a convenience field for quick checks - for full details, inspect preflight_report.

parameter_sources ParameterSourceTracker | None

Tracks the source of all configuration parameters for audit purposes. Records where each value came from: config parameters (default, config_file, env_var, explicit), liquidation thresholds (asset_specific, protocol_default, global_default), and APY/funding rates (historical, fixed, provider). Critical for institutional compliance.

accuracy_estimate AccuracyEstimate | None

Estimated accuracy of this backtest based on strategy type and data quality tier. Provides expected accuracy range (e.g., "90-95%") and primary error source. Derived from ACCURACY_MATRIX based on documented accuracy limitations.
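The config_hash attribute above is described as a SHA-256 hash over result-affecting parameters, excluding runtime metadata. A minimal sketch of that idea follows; the canonical JSON encoding and the RUNTIME_KEYS exclusion list are assumptions, since the exact fields and encoding used by the library are not specified here.

```python
import hashlib
import json
from typing import Any

# Hypothetical list of runtime-only keys that should not affect the hash.
RUNTIME_KEYS = {"backtest_id", "run_started_at", "run_ended_at"}


def config_hash(config: dict[str, Any]) -> str:
    """Hash a config dict deterministically, ignoring runtime metadata."""
    stable = {k: v for k, v in config.items() if k not in RUNTIME_KEYS}
    # sort_keys makes the hash independent of key order;
    # default=str covers values such as Decimal or datetime.
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two runs whose configs differ only in key order or run identifiers then produce the same digest, while any change to a result-affecting parameter changes it.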

gas_prices_used class-attribute instance-attribute

gas_prices_used: list[GasPriceRecord] = field(
    default_factory=list
)

Optional detailed gas price records for each trade during the backtest.

When track_gas_prices=True in config, this list contains a GasPriceRecord for each trade showing the gas price used, its source, and USD cost. Useful for detailed gas cost analysis but may increase result size.

gas_price_summary class-attribute instance-attribute

gas_price_summary: GasPriceSummary | None = None

Summary statistics for gas prices used during the backtest.

Contains min, max, mean, std of gas prices in gwei plus source breakdown. Always populated when trades occurred, regardless of track_gas_prices setting.
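The statistics listed above can be computed from per-trade gas records roughly as follows. The record shape (price in gwei, source label), the source labels, the dictionary keys, and the use of population standard deviation are assumptions for illustration; GasPriceSummary itself may be structured differently.

```python
import statistics
from collections import Counter
from typing import Any, Optional


def summarize_gas(records: list[tuple[float, str]]) -> Optional[dict[str, Any]]:
    """Summarize (gas_price_gwei, source) records; None when there were no trades."""
    if not records:
        return None
    prices = [price for price, _ in records]
    return {
        "min_gwei": min(prices),
        "max_gwei": max(prices),
        "mean_gwei": statistics.fmean(prices),
        # Population standard deviation; a sample estimate would use stdev().
        "std_gwei": statistics.pstdev(prices),
        "source_breakdown": dict(Counter(source for _, source in records)),
    }
```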

parameter_sources class-attribute instance-attribute

parameter_sources: ParameterSourceTracker | None = None

Tracks the source of all configuration parameters for audit purposes.

Contains detailed records of where each configuration value came from:
  • Config parameters: default, config_file, env_var, explicit
  • Liquidation thresholds: asset_specific, protocol_default, global_default
  • APY/funding rates: historical, fixed, provider

This information is critical for institutional compliance and audit trails. When institutional_mode=True, this is always populated. The tracker provides summary dicts (config_sources, liquidation_sources, apy_funding_sources) for quick inspection and a full list of ParameterSourceRecord objects for detailed analysis.

accuracy_estimate class-attribute instance-attribute

accuracy_estimate: AccuracyEstimate | None = None

Estimated accuracy of this backtest based on strategy type and data quality.

Provides a quick reference showing expected accuracy range (e.g., "90-95%") based on the detected strategy type (LP, perp, lending, arbitrage, spot) and the data quality tier used (FULL, PRE_CACHE, CURRENT_ONLY).

The estimate is derived from the ACCURACY_MATRIX which is based on documented accuracy limitations and golden test tolerances. See docs/ACCURACY_LIMITATIONS.md for the full accuracy matrix and methodology.

Example usage

if result.accuracy_estimate:
    print(f"Expected accuracy: {result.accuracy_estimate.confidence_interval}")
    print(f"Primary error source: {result.accuracy_estimate.primary_error_source}")

data_coverage_metrics class-attribute instance-attribute

data_coverage_metrics: DataCoverageMetrics | None = None

Data coverage metrics tracking confidence levels across all position types.

Provides detailed breakdown of data quality for LP, Perp, Lending, and Slippage calculations. Includes confidence level breakdowns (high/medium/low) and data sources used for each position type.

The data_coverage_pct property gives overall percentage of HIGH confidence data points across all categories.

Example usage

if result.data_coverage_metrics:
    print(f"Data coverage: {result.data_coverage_metrics.data_coverage_pct:.1f}%")
    print(f"LP HIGH: {result.data_coverage_metrics.lp_metrics.high_confidence_pct:.1f}%")

success property

success: bool

Check if backtest completed successfully.

simulation_duration_days property

simulation_duration_days: float

Get the simulated duration in days.

total_return_pct property

total_return_pct: Decimal

Get total return as a percentage.

used_any_fallback property

used_any_fallback: bool

Check if any fallbacks were used during the backtest.

Returns True if the fallback_usage dict has any non-zero counts. When this is True, the backtest may have reduced accuracy due to using fallback values instead of real market data.
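The check described above reduces to a single expression over the fallback_usage dict. The standalone function below is a sketch of that logic, written outside the class for clarity; the example keys are those listed in the fallback_usage description.

```python
def used_any_fallback(fallback_usage: dict[str, int]) -> bool:
    """True if any fallback type has a non-zero count."""
    return any(count > 0 for count in fallback_usage.values())


# An empty dict or all-zero counts both mean no fallbacks were used.
clean_run = {"hardcoded_price": 0, "default_gas_price": 0}
degraded_run = {"hardcoded_price": 2, "default_gas_price": 0}
```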

add_error

add_error(error_dict: dict[str, Any]) -> None

Add an error record and log it with timestamp and context.

This method is used to track errors that occurred during the backtest, along with their timestamps, classification, and handling.

Parameters:

Name Type Description Default
error_dict dict[str, Any]

Serialized error record from ErrorRecord.to_dict() or equivalent dict with keys: timestamp, error_type, error_message, classification, context, handled

required

summary

summary() -> str

Generate a human-readable summary of backtest results.

Returns:

Type Description
str

Multi-line string with formatted backtest results

to_dict

to_dict() -> dict[str, Any]

Serialize to dictionary.

from_dict classmethod

from_dict(data: dict[str, Any]) -> BacktestResult

Deserialize from dictionary.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary with serialized BacktestResult data

required

Returns:

Type Description
BacktestResult

BacktestResult instance

BacktestMetrics

almanak.framework.backtesting.BacktestMetrics dataclass

BacktestMetrics(
    total_pnl_usd: Decimal = Decimal("0"),
    net_pnl_usd: Decimal = Decimal("0"),
    sharpe_ratio: Decimal = Decimal("0"),
    max_drawdown_pct: Decimal = Decimal("0"),
    win_rate: Decimal = Decimal("0"),
    total_trades: int = 0,
    profit_factor: Decimal = Decimal("0"),
    total_return_pct: Decimal = Decimal("0"),
    annualized_return_pct: Decimal = Decimal("0"),
    total_fees_usd: Decimal = Decimal("0"),
    total_slippage_usd: Decimal = Decimal("0"),
    total_gas_usd: Decimal = Decimal("0"),
    winning_trades: int = 0,
    losing_trades: int = 0,
    avg_trade_pnl_usd: Decimal = Decimal("0"),
    largest_win_usd: Decimal = Decimal("0"),
    largest_loss_usd: Decimal = Decimal("0"),
    avg_win_usd: Decimal = Decimal("0"),
    avg_loss_usd: Decimal = Decimal("0"),
    volatility: Decimal = Decimal("0"),
    sortino_ratio: Decimal = Decimal("0"),
    calmar_ratio: Decimal = Decimal("0"),
    total_fees_earned_usd: Decimal = Decimal("0"),
    fees_by_pool: dict[str, Decimal] = dict(),
    lp_fee_confidence_breakdown: dict[str, int] = dict(),
    total_funding_paid: Decimal = Decimal("0"),
    total_funding_received: Decimal = Decimal("0"),
    liquidations_count: int = 0,
    liquidation_losses_usd: Decimal = Decimal("0"),
    max_margin_utilization: Decimal = Decimal("0"),
    total_interest_earned: Decimal = Decimal("0"),
    total_interest_paid: Decimal = Decimal("0"),
    min_health_factor: Decimal = Decimal("999"),
    health_factor_warnings: int = 0,
    avg_gas_price_gwei: Decimal = Decimal("0"),
    max_gas_price_gwei: Decimal = Decimal("0"),
    total_gas_cost_usd: Decimal = Decimal("0"),
    total_mev_cost_usd: Decimal = Decimal("0"),
    total_leverage: Decimal = Decimal("0"),
    max_net_delta: dict[str, Decimal] = dict(),
    correlation_risk: Decimal | None = None,
    liquidation_cascade_risk: Decimal = Decimal("0"),
    information_ratio: Decimal | None = None,
    beta: Decimal | None = None,
    alpha: Decimal | None = None,
    benchmark_return: Decimal | None = None,
    pnl_by_protocol: dict[str, Decimal] = dict(),
    pnl_by_intent_type: dict[str, Decimal] = dict(),
    pnl_by_asset: dict[str, Decimal] = dict(),
    realized_pnl: Decimal = Decimal("0"),
    unrealized_pnl: Decimal = Decimal("0"),
)

Performance metrics calculated from backtest results.

All financial values are in USD. Ratios are decimal (0.1 = 10%).

Attributes:

Name Type Description
total_pnl_usd Decimal

Total PnL before execution costs

net_pnl_usd Decimal

Net PnL after all execution costs

sharpe_ratio Decimal

Risk-adjusted return (annualized, assuming 0 risk-free rate)

max_drawdown_pct Decimal

Maximum peak-to-trough decline as decimal (0.1 = 10%)

win_rate Decimal

Fraction of profitable trades as decimal (0.6 = 60%)

total_trades int

Total number of trades executed

profit_factor Decimal

Ratio of gross profit to gross loss

total_return_pct Decimal

Total return as decimal (0.15 = 15% return)

annualized_return_pct Decimal

Annualized return as decimal

total_fees_usd Decimal

Total protocol fees paid

total_slippage_usd Decimal

Total slippage incurred

total_gas_usd Decimal

Total gas costs

winning_trades int

Number of profitable trades

losing_trades int

Number of losing trades

avg_trade_pnl_usd Decimal

Average PnL per trade

largest_win_usd Decimal

Largest single winning trade

largest_loss_usd Decimal

Largest single losing trade

avg_win_usd Decimal

Average winning trade PnL

avg_loss_usd Decimal

Average losing trade PnL

volatility Decimal

Annualized volatility of returns as decimal

sortino_ratio Decimal

Downside risk-adjusted return

calmar_ratio Decimal

Return / max drawdown

total_fees_earned_usd Decimal

Total fees earned from LP positions in USD

fees_by_pool dict[str, Decimal]

Dict mapping pool identifier to fees earned in USD

total_funding_paid Decimal

Total funding payments made from perp positions in USD

total_funding_received Decimal

Total funding payments received by perp positions in USD

liquidations_count int

Number of liquidation events that occurred

liquidation_losses_usd Decimal

Total losses from liquidations in USD

max_margin_utilization Decimal

Maximum margin utilization ratio observed during backtest (0-1)

total_interest_earned Decimal

Total interest earned from lending supply positions in USD

total_interest_paid Decimal

Total interest paid on borrow positions in USD

min_health_factor Decimal

Minimum health factor observed for lending positions during backtest (lower = more risk)

health_factor_warnings int

Number of times health factor dropped below warning threshold

avg_gas_price_gwei Decimal

Average gas price in gwei across all trades (for cost analysis)

max_gas_price_gwei Decimal

Maximum gas price in gwei observed during backtest (for peak cost analysis)

total_gas_cost_usd Decimal

Total gas costs in USD (same as total_gas_usd, kept for API consistency)

total_mev_cost_usd Decimal

Total estimated MEV (sandwich attack) costs in USD across all trades

total_leverage Decimal

Total portfolio leverage ratio (sum of all position notionals / equity)

max_net_delta dict[str, Decimal]

Maximum net delta exposure observed per asset (token symbol -> max delta)

correlation_risk Decimal | None

Portfolio correlation risk score (0-1, higher = more correlated positions)

liquidation_cascade_risk Decimal

Risk of cascading liquidations across protocols (0-1, higher = more risk)

information_ratio Decimal | None

Information ratio measuring risk-adjusted excess return vs benchmark (None if not calculated)

beta Decimal | None

Portfolio beta measuring sensitivity to benchmark movements (None if not calculated)

alpha Decimal | None

Jensen's alpha measuring excess return beyond what beta would predict (None if not calculated)

benchmark_return Decimal | None

Total return of the benchmark over the backtest period as decimal (None if not calculated)

pnl_by_protocol dict[str, Decimal]

PnL breakdown by protocol (e.g., {"uniswap_v3": Decimal("100"), "aave_v3": Decimal("-50")})

pnl_by_intent_type dict[str, Decimal]

PnL breakdown by intent type (e.g., {"SWAP": Decimal("75"), "LP_OPEN": Decimal("25")})

pnl_by_asset dict[str, Decimal]

PnL breakdown by asset (e.g., {"ETH": Decimal("80"), "USDC": Decimal("20")})

realized_pnl Decimal

Total realized PnL from closed positions in USD

unrealized_pnl Decimal

Total unrealized PnL from open positions in USD
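Several of the attributes above follow standard definitions. The sketch below computes max drawdown, an annualized Sharpe ratio with a zero risk-free rate, win rate, and profit factor from an equity curve and a list of trade PnLs. It uses floats rather than Decimal, and the annualization factor (periods_per_year) and the zero-value conventions for empty inputs are assumptions; the library's exact conventions may differ.

```python
import math
import statistics


def max_drawdown(equity: list[float]) -> float:
    """Largest peak-to-trough decline as a decimal (0.1 = 10%). Assumes positive equity."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def sharpe_ratio(equity: list[float], periods_per_year: int = 365) -> float:
    """Annualized mean/std of period returns, with a zero risk-free rate."""
    returns = [b / a - 1 for a, b in zip(equity, equity[1:])]
    if len(returns) < 2:
        return 0.0
    std = statistics.stdev(returns)
    if std == 0:
        return 0.0
    return statistics.fmean(returns) / std * math.sqrt(periods_per_year)


def win_rate_and_profit_factor(trade_pnls: list[float]) -> tuple[float, float]:
    """Win rate as a decimal, and gross profit divided by gross loss."""
    wins = [p for p in trade_pnls if p > 0]
    losses = [-p for p in trade_pnls if p < 0]
    win_rate = len(wins) / len(trade_pnls) if trade_pnls else 0.0
    profit_factor = sum(wins) / sum(losses) if losses else 0.0
    return win_rate, profit_factor
```

For an equity curve of 100, 120, 90, 110 the peak is 120 and the trough is 90, so the drawdown is 30/120 = 0.25.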

lp_fee_confidence_breakdown class-attribute instance-attribute

lp_fee_confidence_breakdown: dict[str, int] = field(
    default_factory=dict
)

Count of LP positions by fee confidence level.

Example: {"high": 2, "medium": 1, "low": 0}
  • high: Fees calculated using actual historical volume data from subgraph
  • medium: Fees calculated using interpolated or estimated data
  • low: Fees calculated using multiplier heuristic

total_execution_cost_usd property

total_execution_cost_usd: Decimal

Get total execution costs (fees + slippage + gas).
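The property above sums three cost fields. The sketch below shows that sum with Decimal arithmetic. The net_pnl_usd relation (total PnL minus execution costs) is an assumption inferred from the attribute descriptions, and it is not confirmed whether MEV costs are also subtracted.

```python
from decimal import Decimal


def total_execution_cost_usd(fees: Decimal, slippage: Decimal, gas: Decimal) -> Decimal:
    """Fees + slippage + gas, as stated for the property."""
    return fees + slippage + gas


def net_pnl_usd(total_pnl: Decimal, fees: Decimal, slippage: Decimal, gas: Decimal) -> Decimal:
    # Assumed relation: net PnL is total PnL minus all execution costs.
    return total_pnl - total_execution_cost_usd(fees, slippage, gas)


costs = total_execution_cost_usd(Decimal("12.50"), Decimal("3.25"), Decimal("4.25"))
net = net_pnl_usd(Decimal("100"), Decimal("12.50"), Decimal("3.25"), Decimal("4.25"))
```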

to_dict

to_dict() -> dict[str, Any]

Serialize to dictionary.

PaperTradingSummary

almanak.framework.backtesting.PaperTradingSummary dataclass

PaperTradingSummary(
    strategy_id: str,
    start_time: datetime,
    duration: timedelta,
    total_trades: int,
    successful_trades: int,
    failed_trades: int,
    chain: str = "arbitrum",
    initial_balances: dict[str, Decimal] = dict(),
    final_balances: dict[str, Decimal] = dict(),
    total_gas_used: int = 0,
    total_gas_cost_usd: Decimal = Decimal("0"),
    pnl_usd: Decimal | None = None,
    error_summary: dict[str, int] = dict(),
    trades: list[PaperTrade] = list(),
    errors: list[PaperTradeError] = list(),
)

Summary of a paper trading session.

This dataclass provides an overview of the paper trading session, including trade counts, timing, and basic performance metrics.

Attributes:

Name Type Description
strategy_id str

Identifier of the strategy being tested

start_time datetime

When the session started

duration timedelta

How long the session ran

total_trades int

Total number of trades attempted

successful_trades int

Number of successful trades

failed_trades int

Number of failed trades

end_time datetime

When the session ended (computed)

chain str

Target blockchain

initial_balances dict[str, Decimal]

Starting token balances

final_balances dict[str, Decimal]

Ending token balances

total_gas_used int

Total gas consumed

total_gas_cost_usd Decimal

Total gas cost in USD

pnl_usd Decimal | None

Estimated PnL in USD (if available)

error_summary dict[str, int]

Count of errors by type

trades list[PaperTrade]

List of successful trades

errors list[PaperTradeError]

List of trade errors

end_time property

end_time: datetime

Get the session end time.

success_rate property

success_rate: Decimal

Calculate the success rate as a decimal (0.0 to 1.0).

duration_seconds property

duration_seconds: float

Get duration in seconds.

duration_minutes property

duration_minutes: float

Get duration in minutes.

duration_hours property

duration_hours: float

Get duration in hours.

trades_per_hour property

trades_per_hour: Decimal

Calculate average trades per hour.

avg_gas_per_trade property

avg_gas_per_trade: int

Calculate average gas used per successful trade.
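The derived properties above can be expressed in terms of the stored fields. The sketch below is a hypothetical reduced class; the zero-trade and zero-duration guards and the integer division for average gas are assumptions about edge-case handling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from decimal import Decimal


@dataclass
class SummarySketch:
    """Hypothetical subset of PaperTradingSummary fields."""

    start_time: datetime
    duration: timedelta
    total_trades: int
    successful_trades: int
    total_gas_used: int = 0

    @property
    def end_time(self) -> datetime:
        return self.start_time + self.duration

    @property
    def success_rate(self) -> Decimal:
        if self.total_trades == 0:
            return Decimal("0")
        return Decimal(self.successful_trades) / Decimal(self.total_trades)

    @property
    def trades_per_hour(self) -> Decimal:
        hours = Decimal(str(self.duration.total_seconds())) / Decimal("3600")
        if hours == 0:
            return Decimal("0")
        return Decimal(self.total_trades) / hours

    @property
    def avg_gas_per_trade(self) -> int:
        # Averaged over successful trades, per the property description.
        if self.successful_trades == 0:
            return 0
        return self.total_gas_used // self.successful_trades
```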

summary

summary() -> str

Generate a human-readable summary.

Returns:

Type Description
str

Multi-line string with formatted session summary

to_dict

to_dict() -> dict[str, Any]

Serialize to dictionary.

from_dict classmethod

from_dict(data: dict[str, Any]) -> PaperTradingSummary

Deserialize from dictionary.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary with serialized PaperTradingSummary data

required

Returns:

Type Description
PaperTradingSummary

PaperTradingSummary instance

Data Providers

HistoricalDataProvider

almanak.framework.backtesting.HistoricalDataProvider

Bases: Protocol

Protocol defining the interface for historical data providers.

Historical data providers are responsible for fetching price and market data for past time periods. They are used by the PnL backtesting engine to simulate strategy execution.

Implementations should handle:
  • Fetching historical prices for specified tokens
  • Providing OHLCV data when available
  • Rate limiting and caching as needed
  • Graceful handling of missing data

Example implementation

class MyDataProvider:
    async def get_price(
        self, token: str, timestamp: datetime
    ) -> Decimal:
        # Fetch price from data source
        ...

    async def get_ohlcv(
        self, token: str, start: datetime, end: datetime, interval_seconds: int = 3600
    ) -> list[OHLCV]:
        # Fetch OHLCV data
        ...

    async def iterate(
        self, config: HistoricalDataConfig
    ) -> AsyncIterator[tuple[datetime, MarketState]]:
        # Yield market states for each time point
        ...

provider_name property

provider_name: str

Return the unique name of this data provider.

supported_tokens property

supported_tokens: list[str]

Return list of supported token symbols.

supported_chains property

supported_chains: list[str]

Return list of supported chain identifiers.

min_timestamp property

min_timestamp: datetime | None

Return the earliest timestamp with available data, or None if unknown.

max_timestamp property

max_timestamp: datetime | None

Return the latest timestamp with available data, or None if unknown.

get_price async

get_price(token: str, timestamp: datetime) -> Decimal

Get the price of a token at a specific timestamp.

Parameters:

Name Type Description Default
token str

Token symbol (e.g., "WETH", "USDC", "ARB")

required
timestamp datetime

The historical point in time

required

Returns:

Type Description
Decimal

Price in USD at the specified timestamp

Raises:

Type Description
ValueError

If price data is not available for the token/timestamp

DataSourceUnavailable

If the data source is unavailable

get_ohlcv async

get_ohlcv(
    token: str,
    start: datetime,
    end: datetime,
    interval_seconds: int = 3600,
) -> list[OHLCV]

Get OHLCV data for a token over a time range.

Parameters:

Name Type Description Default
token str

Token symbol (e.g., "WETH", "USDC", "ARB")

required
start datetime

Start of the time range (inclusive)

required
end datetime

End of the time range (inclusive)

required
interval_seconds int

Candle interval in seconds (default: 3600 = 1 hour)

3600

Returns:

Type Description
list[OHLCV]

List of OHLCV data points, sorted by timestamp ascending

Raises:

Type Description
ValueError

If data is not available for the token/range

DataSourceUnavailable

If the data source is unavailable

iterate async

iterate(
    config: HistoricalDataConfig,
) -> AsyncIterator[tuple[datetime, MarketState]]

Iterate through historical market states.

This is the primary method used by the backtesting engine. It yields market state snapshots at regular intervals throughout the configured time range.

Parameters:

Name Type Description Default
config HistoricalDataConfig

Configuration specifying time range, interval, and tokens

required

Yields:

Type Description
AsyncIterator[tuple[datetime, MarketState]]

Tuples of (timestamp, MarketState) for each time point

Raises:

Type Description
DataSourceUnavailable

If the data source is unavailable

Example

async for timestamp, market_state in provider.iterate(config):
    eth_price = market_state.get_price("WETH")
    # Process market state
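To make the iteration contract concrete, here is a self-contained in-memory provider that yields a snapshot at each interval. MarketStateSketch and ConstantPriceProvider are hypothetical stand-ins, and iterate here takes the time range directly instead of a HistoricalDataConfig so that the example runs without the library installed.

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal
from typing import AsyncIterator


@dataclass
class MarketStateSketch:
    """Hypothetical stand-in for MarketState, holding token prices only."""

    prices: dict[str, Decimal]

    def get_price(self, token: str) -> Decimal:
        return self.prices[token]


class ConstantPriceProvider:
    """Yields the same prices at every time step; useful for smoke tests."""

    def __init__(self, prices: dict[str, Decimal]):
        self._prices = prices

    async def iterate(
        self, start: datetime, end: datetime, interval_seconds: int
    ) -> AsyncIterator[tuple[datetime, MarketStateSketch]]:
        current = start
        step = timedelta(seconds=interval_seconds)
        # Both endpoints are inclusive, matching the config description.
        while current <= end:
            yield current, MarketStateSketch(dict(self._prices))
            current += step


async def collect_timestamps(
    provider: ConstantPriceProvider, start: datetime, end: datetime, interval_seconds: int
) -> list[datetime]:
    return [ts async for ts, _ in provider.iterate(start, end, interval_seconds)]
```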

HistoricalDataConfig

almanak.framework.backtesting.HistoricalDataConfig dataclass

HistoricalDataConfig(
    start_time: datetime,
    end_time: datetime,
    interval_seconds: int = 3600,
    tokens: list[str] = (lambda: ["WETH", "USDC"])(),
    chains: list[str] = (lambda: ["arbitrum"])(),
    include_ohlcv: bool = True,
    include_gas_prices: bool = False,
)

Configuration for historical data retrieval.

Specifies the time range, interval, and tokens to fetch for a backtest simulation.

Attributes:

Name Type Description
start_time datetime

Start of the historical period (inclusive)

end_time datetime

End of the historical period (inclusive)

interval_seconds int

Time between data points in seconds (default: 3600 = 1 hour)

tokens list[str]

List of token symbols to fetch prices for

chains list[str]

List of chain identifiers to fetch data for (default: ["arbitrum"])

include_ohlcv bool

Whether to fetch OHLCV data (default: True)

include_gas_prices bool

Whether to fetch historical gas prices (default: False)

duration_seconds property

duration_seconds: int

Get the total duration in seconds.

duration_days property

duration_days: float

Get the total duration in days.

estimated_data_points property

estimated_data_points: int

Get the estimated number of data points.

__post_init__

__post_init__() -> None

Validate configuration after initialization.

to_dict

to_dict() -> dict[str, Any]

Serialize to dictionary.
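The derived properties can be illustrated with a small stand-in dataclass. `DataConfigSketch` is hypothetical: the validation rules in `__post_init__` and the `+ 1` in `estimated_data_points` (both endpoints sampled) are assumptions, not documented behaviour. The `(lambda: [...])()` defaults in the signature above are presumably how the documentation renders a `default_factory`, which is what the sketch uses.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class DataConfigSketch:
    """Hypothetical stand-in, not the almanak HistoricalDataConfig."""
    start_time: datetime
    end_time: datetime
    interval_seconds: int = 3600
    tokens: list[str] = field(default_factory=lambda: ["WETH", "USDC"])

    def __post_init__(self) -> None:
        # Assumed validation rules; the real checks are not documented here.
        if self.end_time <= self.start_time:
            raise ValueError("end_time must be after start_time")
        if self.interval_seconds <= 0:
            raise ValueError("interval_seconds must be positive")

    @property
    def duration_seconds(self) -> int:
        return int((self.end_time - self.start_time).total_seconds())

    @property
    def duration_days(self) -> float:
        return self.duration_seconds / 86400

    @property
    def estimated_data_points(self) -> int:
        # Assumes both endpoints are sampled, hence the + 1.
        return self.duration_seconds // self.interval_seconds + 1
```

Under these assumptions, a seven-day range at the default hourly interval gives 604800 seconds and 169 estimated data points.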

Crisis Scenarios

CrisisScenario

almanak.framework.backtesting.CrisisScenario dataclass

CrisisScenario(
    name: str,
    start_date: datetime,
    end_date: datetime,
    description: str,
)

A historical crisis scenario for backtesting.

This dataclass represents a period of significant market stress that can be used for stress-testing trading strategies.

Attributes:

Name Type Description
name str

Unique identifier for the scenario (lowercase, underscores)

start_date datetime

Beginning of the crisis period

end_date datetime

End of the crisis period

description str

Human-readable description of the crisis event

Properties

duration_days: Number of days in the crisis period

Example

>>> scenario = CrisisScenario(
...     name="custom_crisis",
...     start_date=datetime(2023, 3, 10),
...     end_date=datetime(2023, 3, 15),
...     description="SVB collapse",
... )
>>> scenario.duration_days
5

duration_days property

duration_days: int

Calculate the duration of the crisis in days.

to_dict

to_dict() -> dict[str, Any]

Serialize to dictionary.

Returns:

Type Description
dict[str, Any]

Dictionary with scenario data suitable for JSON serialization.

from_dict classmethod

from_dict(data: dict[str, Any]) -> CrisisScenario

Deserialize from dictionary.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary with serialized CrisisScenario data

required

Returns:

Type Description
CrisisScenario

CrisisScenario instance

__str__

__str__() -> str

Human-readable string representation.
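A round trip through `to_dict` and `from_dict` might look like the sketch below. `ScenarioSketch` is a hypothetical stand-in for `CrisisScenario`, and encoding the dates as ISO 8601 strings is an assumption made so that the dictionary can be passed to `json.dumps`; the real key names and date format may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass
class ScenarioSketch:
    """Hypothetical stand-in, not the almanak CrisisScenario."""
    name: str
    start_date: datetime
    end_date: datetime
    description: str

    @property
    def duration_days(self) -> int:
        return (self.end_date - self.start_date).days

    def to_dict(self) -> dict[str, Any]:
        # Assumes ISO 8601 strings for dates so the result is JSON-safe.
        return {
            "name": self.name,
            "start_date": self.start_date.isoformat(),
            "end_date": self.end_date.isoformat(),
            "description": self.description,
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ScenarioSketch":
        return cls(
            name=data["name"],
            start_date=datetime.fromisoformat(data["start_date"]),
            end_date=datetime.fromisoformat(data["end_date"]),
            description=data["description"],
        )
```

Because the class is a dataclass, `ScenarioSketch.from_dict(s.to_dict()) == s` holds field by field.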

Parallel Execution

almanak.framework.backtesting.run_parallel_backtests async

run_parallel_backtests(
    configs: list[PnLBacktestConfig],
    strategy_factory: Callable[[], Any],
    data_provider_factory: Callable[[], Any],
    backtester_factory: Callable[
        [Any, dict[str, Any], dict[str, Any]], Any
    ],
    fee_models: dict[str, Any] | None = None,
    slippage_models: dict[str, Any] | None = None,
    workers: int | None = None,
) -> list[ParallelBacktestResult]

Run multiple backtests in parallel using a process pool.

This function distributes backtest execution across multiple processes for improved performance on multi-core systems. Each backtest runs in its own process with its own instances of strategy, data provider, and backtester created via factory functions.

Parameters:

Name Type Description Default
configs list[PnLBacktestConfig]

List of PnLBacktestConfig objects to run

required
strategy_factory Callable[[], Any]

Factory function that returns a new strategy instance. Must be picklable (e.g., a module-level function).

required
data_provider_factory Callable[[], Any]

Factory function that returns a new data provider. Must be picklable (e.g., a module-level function).

required
backtester_factory Callable[[Any, dict[str, Any], dict[str, Any]], Any]

Factory function that returns a new PnLBacktester. Takes (data_provider, fee_models, slippage_models) as arguments.

required
fee_models dict[str, Any] | None

Optional dict of fee models to pass to backtester factory.

None
slippage_models dict[str, Any] | None

Optional dict of slippage models to pass to backtester factory.

None
workers int | None

Number of worker processes. Defaults to CPU count - 1.

None

Returns:

Type Description
list[ParallelBacktestResult]

List of ParallelBacktestResult in the same order as input configs. Each result indicates success or failure with associated data.

Raises:

Type Description
ValueError

If configs list is empty

Example

def create_strategy():
    return MyStrategy(param1=10, param2=0.5)

def create_data_provider():
    return CoinGeckoDataProvider()

def create_backtester(provider, fee_models, slippage_models):
    return PnLBacktester(provider, fee_models, slippage_models)

results = await run_parallel_backtests(
    configs=[config1, config2, config3],
    strategy_factory=create_strategy,
    data_provider_factory=create_data_provider,
    backtester_factory=create_backtester,
    workers=4,
)

Note
  • Factory functions must be picklable (module-level functions, not lambdas)
  • Each worker process creates its own instances to avoid sharing state
  • Results are returned in the same order as input configs
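The picklability requirement can be checked before a run is submitted. The sketch below uses only the standard library: `is_picklable` and `default_workers` are hypothetical helpers, and the floor of one worker is an assumption layered on the documented "CPU count - 1" default.

```python
import os
import pickle


def create_strategy() -> dict:
    """Module-level factory; a plain dict stands in for a real strategy object."""
    return {"param1": 10, "param2": 0.5}


def is_picklable(obj: object) -> bool:
    """Return True if obj can be serialized for transfer to a worker process."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def default_workers() -> int:
    # Mirrors the documented default of CPU count - 1; the floor of 1 is an assumption.
    return max(1, (os.cpu_count() or 1) - 1)
```

Functions are pickled by qualified name, so a factory defined at module level in an importable module passes this check, while a lambda or a function nested inside another function does not.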