Overview
Standard HftBacktest provides highly accurate results by simulating every market event, tracking queue positions, and modeling latencies precisely. However, this accuracy comes at a cost: backtesting can be slow, especially when:
Testing hundreds of parameter combinations
Running multi-month backtests
Iterating on new strategy ideas
Performing walk-forward optimization
Accelerated backtesting sacrifices some accuracy for dramatic speed improvements (often 10-50x faster) by:
Precomputing fill conditions for time intervals
Skipping detailed queue position tracking
Ignoring order response latency
Processing data in larger chunks
When to Use Accelerated Backtesting:
Parameter optimization and grid searches
Rapid idea validation
Initial strategy development
When queue position is less critical (small tick size markets)
When to Use Standard Backtesting:
Final strategy validation
Production deployment decisions
Large tick size markets
Queue-sensitive strategies
Key Differences
Aspect                 | Standard Backtest               | Accelerated Backtest
Queue Position         | Tracked with probability models | Ignored (no partial fills)
Order Response Latency | Modeled accurately              | Ignored (immediate state update)
Feed Latency           | Modeled                         | Modeled (preserved)
Fill Simulation        | Event-by-event                  | Precomputed per interval
Partial Fills          | Supported                       | Not supported
Speed                  | 1x (baseline)                   | 10-50x faster
Accuracy               | Highest                         | Reduced, but sufficient for parameter search
How It Works
Fill Conditions
Instead of checking fills on every market event, accelerated backtesting precomputes fill prices for each time interval:
For Buy Orders:

bid_fill_price = min(
    lowest_best_ask_in_interval,
    lowest_sell_trade_price_in_interval + one_tick
)
# Your buy order fills if:
order_price >= bid_fill_price

For Sell Orders:

ask_fill_price = max(
    highest_best_bid_in_interval,
    highest_buy_trade_price_in_interval - one_tick
)
# Your sell order fills if:
order_price <= ask_fill_price
Important: Because queue position is not considered:
order_price == trade_price does NOT fill (the order must cross)
Orders either fill in full or not at all
No partial fills
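For intuition, the buy-side check can be written as a small standalone function over an interval's precomputed extremes. A minimal sketch (buy_fill_threshold is an illustrative helper, not part of the HftBacktest API):

import numpy as np

def buy_fill_threshold(lowest_best_ask, lowest_sell_trade, tick_size):
    # The lowest buy order price that fills during the interval.
    threshold = lowest_best_ask
    if np.isfinite(lowest_sell_trade):
        threshold = min(threshold, lowest_sell_trade + tick_size)
    return threshold

# Example: the lowest ask in the interval was 100.2 and the lowest sell
# print was 100.1 with a 0.1 tick, so the threshold is
# min(100.2, 100.1 + 0.1) = 100.2.
threshold = buy_fill_threshold(100.2, 100.1, 0.1)
print(100.2 >= threshold)  # True: a bid at 100.2 crosses and fills
print(100.1 >= threshold)  # False: a bid at the trade price does not fill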
Data Structure
Accelerated backtesting uses preprocessed data with precomputed fill prices:
row[t] row[t+1]
Local
+----------------------------------------------------------+-------------------
|local_ts[t] |local_ts[t+1]
|best_bid[t] |best_bid[t+1]
|best_ask[t] |best_ask[t+1]
+----------------------------------------------------------+-------------------
Exchange
+----------------------------------------------------------+-------------------
| bid_fill[t+1] |
| ask_fill[t+1] |
+-------------------------+--------------------------------+-------------------
| order entry latency |order_ack_ts[t] |
| at local_ts[t] |best_bid_ack[t] |
| |best_ask_ack[t] |
| bid_fill_ack[t]| bid_fill_after_ack[t]|
| ask_fill_ack[t]| ask_fill_after_ack[t]|
+-------------------------+--------------------------------+-------------------
At each interval:
bid_fill[t+1] : The lowest price at which a buy order fills during the interval
ask_fill[t+1] : The highest price at which a sell order fills during the interval
bid_fill_ack[t] : Buy fill threshold for orders sent before the ack time
bid_fill_after_ack[t] : Buy fill threshold for orders sent after the ack time
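The ack split exists so the simulator can apply the right threshold depending on when an order reached the exchange relative to order_ack_ts[t]. A minimal sketch of how that selection might look, following the field definitions above (select_buy_threshold is an illustrative helper, not part of the HftBacktest API):

def select_buy_threshold(order_exch_ts, row):
    # Orders at the exchange before the ack time are checked against the
    # pre-ack threshold; orders arriving later use the post-ack threshold.
    if order_exch_ts <= row['order_ack_ts']:
        return row['bid_fill_ack']
    return row['bid_fill_after_ack']

# A buy order then fills if its price crosses the selected threshold:
# order_price >= select_buy_threshold(order_exch_ts, row)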
Preprocessing Market Data
Step 1: Define the Running Interval
Choose your strategy’s decision interval:
running_interval = 100_000_000 # 100ms in nanoseconds
# This determines:
# - How often your strategy makes decisions
# - The granularity of precomputed fill prices
# - Speed vs accuracy tradeoff
Choosing the Interval:
10-100ms: A good balance for most strategies
100-500ms: Faster, sufficient for slower strategies
< 10ms: Defeats the purpose of acceleration
Step 2: Process Raw Market Data
Implement preprocessing using Numba for performance:
import numpy as np
from numba import njit

from hftbacktest import BUY_EVENT, DEPTH_EVENT, TRADE_EVENT

# The output dtype is defined at module level so Numba can allocate the
# structured array in nopython mode.
processed_dtype = np.dtype([
    ('local_ts', 'i8'),
    ('best_bid', 'f8'),
    ('best_ask', 'f8'),
    ('bid_fill', 'f8'),
    ('ask_fill', 'f8'),
    ('order_ack_ts', 'i8'),
    ('best_bid_ack', 'f8'),
    ('best_ask_ack', 'f8'),
    ('bid_fill_ack', 'f8'),
    ('ask_fill_ack', 'f8'),
    ('bid_fill_after_ack', 'f8'),
    ('ask_fill_after_ack', 'f8'),
])

@njit
def preprocess_data(raw_data, running_interval, tick_size, entry_latency):
    """
    Preprocess market data for accelerated backtesting.

    Args:
        raw_data: Raw market events with 'ev', 'exch_ts', 'local_ts', 'px', 'qty' fields
        running_interval: Strategy running interval in nanoseconds
        tick_size: Market tick size
        entry_latency: Order entry latency in nanoseconds

    Returns:
        Preprocessed data with precomputed fill prices
    """
    start_ts = raw_data[0]['local_ts']
    num_intervals = int((raw_data[-1]['local_ts'] - start_ts) / running_interval) + 1
    processed = np.zeros(num_intervals, dtype=processed_dtype)

    # Best bid/ask carry over across interval boundaries.
    current_best_bid = np.nan
    current_best_ask = np.nan
    # A single cursor over the events keeps preprocessing O(n) instead of
    # rescanning the data for every interval.
    event_idx = 0

    # For each interval
    for i in range(num_intervals):
        interval_start = start_ts + i * running_interval
        interval_end = interval_start + running_interval
        ack_time = interval_start + entry_latency

        # Initialize the per-interval extremes
        lowest_best_ask = np.inf
        highest_best_bid = -np.inf
        lowest_sell_trade = np.inf
        highest_buy_trade = -np.inf

        # Process the events falling in this interval
        while event_idx < len(raw_data) and raw_data[event_idx]['local_ts'] < interval_end:
            event = raw_data[event_idx]
            event_idx += 1

            # The event type lives in the low bits of 'ev'; side flags such
            # as BUY_EVENT are set in the high bits.
            ev_type = event['ev'] & 0xFF

            # Track best bid/ask
            if ev_type == DEPTH_EVENT:
                if event['ev'] & BUY_EVENT == BUY_EVENT:
                    current_best_bid = event['px']
                else:
                    current_best_ask = event['px']
            # Track trades
            elif ev_type == TRADE_EVENT:
                if event['ev'] & BUY_EVENT == BUY_EVENT:
                    highest_buy_trade = max(highest_buy_trade, event['px'])
                else:
                    lowest_sell_trade = min(lowest_sell_trade, event['px'])

            # Track the best prices seen in the interval
            if np.isfinite(current_best_ask):
                lowest_best_ask = min(lowest_best_ask, current_best_ask)
            if np.isfinite(current_best_bid):
                highest_best_bid = max(highest_best_bid, current_best_bid)

        # Compute the fill thresholds
        bid_fill = lowest_best_ask
        if np.isfinite(lowest_sell_trade):
            bid_fill = min(bid_fill, lowest_sell_trade + tick_size)
        ask_fill = highest_best_bid
        if np.isfinite(highest_buy_trade):
            ask_fill = max(ask_fill, highest_buy_trade - tick_size)

        # Store in the processed data
        processed[i]['local_ts'] = interval_start
        processed[i]['best_bid'] = current_best_bid
        processed[i]['best_ask'] = current_best_ask
        processed[i]['bid_fill'] = bid_fill
        processed[i]['ask_fill'] = ask_fill
        processed[i]['order_ack_ts'] = ack_time
        # ... compute ack-related values similarly

    return processed

# Save the preprocessed data (see Step 3 for end-to-end usage)
np.savez('btcusdt_20240101_accel.npz', data=processed)
Step 3: Simplified Preprocessing
For practical use, you can leverage HftBacktest’s data utilities and focus on interval-based aggregation:
from hftbacktest.data.utils import tardis
import numpy as np

# First convert to the standard format
tardis.convert(
    ['BTCUSDT_trades_20240101.csv.gz',
     'BTCUSDT_incremental_book_L2_20240101.csv.gz'],
    output_filename='BTCUSDT_20240101.npz'
)

# Then run your preprocessing
raw_data = np.load('BTCUSDT_20240101.npz')['data']
processed = preprocess_data(
    raw_data,
    running_interval=100_000_000,  # 100ms
    tick_size=0.1,
    entry_latency=1_000_000  # 1ms
)
np.savez('BTCUSDT_20240101_accel.npz', data=processed)
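Before backtesting against the preprocessed file, it can help to sanity-check it. A quick sketch, assuming the dtype produced by preprocess_data above and the 100ms running interval:

import numpy as np

data = np.load('BTCUSDT_20240101_accel.npz')['data']

# Rows should be spaced exactly one running interval apart.
assert np.all(np.diff(data['local_ts']) == 100_000_000)

# Once the book has warmed up, best bid/ask should be populated and the
# fill thresholds finite (they start at +/-inf before any events arrive).
full_book = np.isfinite(data['best_bid']) & np.isfinite(data['best_ask'])
print(f"Intervals with a full book: {full_book.mean():.1%}")
print(f"Intervals with a finite bid_fill: {np.isfinite(data['bid_fill']).mean():.1%}")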
Using Accelerated Backtesting
Once you have preprocessed data, use it with a simplified backtester:
from hftbacktest import BacktestAsset, AcceleratedBacktest, Recorder, GTC, LIMIT
from numba import njit

@njit
def fast_strategy(hbt, recorder):
    asset_no = 0
    tick_size = hbt.depth(asset_no).tick_size

    # The strategy runs at the preprocessed interval (e.g., 100ms)
    while hbt.elapse(100_000_000) == 0:
        depth = hbt.depth(asset_no)
        position = hbt.position(asset_no)

        best_bid = depth.best_bid
        best_ask = depth.best_ask
        mid_price = (best_bid + best_ask) / 2.0

        # Simple market-making logic
        half_spread = tick_size * 2
        bid_price = mid_price - half_spread
        ask_price = mid_price + half_spread
        order_qty = 0.1

        # Clear old orders
        hbt.clear_inactive_orders(asset_no)

        # Submit new orders
        if position < 10:
            hbt.submit_buy_order(asset_no, 1, bid_price, order_qty,
                                 GTC, LIMIT, False)
        if position > -10:
            hbt.submit_sell_order(asset_no, 2, ask_price, order_qty,
                                  GTC, LIMIT, False)

        recorder.record(hbt)
    return True

# Configure with accelerated data
asset = (
    BacktestAsset()
    .data(['BTCUSDT_20240101_accel.npz'])
    .accelerated()  # Use accelerated mode
    .linear_asset(1.0)
    .trading_value_fee_model(-0.00005, 0.0007)
    .tick_size(0.1)
    .lot_size(0.001)
)
hbt = AcceleratedBacktest([asset])
recorder = Recorder(1, 1_000_000)
fast_strategy(hbt, recorder)
In accelerated mode:
Queue position is not tracked
Orders fill immediately when the fill conditions are met (no response latency)
Much faster execution
Suitable for parameter optimization
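To check the speed gain on your own data, wrap the run in a wall-clock timer. A simple sketch (construct a fresh backtest instance for each timed run, and run once beforehand so Numba compilation time is excluded):

import time

hbt = AcceleratedBacktest([asset])
recorder = Recorder(1, 1_000_000)

start = time.perf_counter()
fast_strategy(hbt, recorder)
elapsed = time.perf_counter() - start
print(f"Accelerated run took {elapsed:.2f}s")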
Parameter Optimization
Accelerated backtesting shines in parameter optimization:
from itertools import product

import pandas as pd

from hftbacktest import BacktestAsset, AcceleratedBacktest, Recorder
from hftbacktest.stats import LinearAssetRecord

def optimize_parameters(data_files):
    """Grid search over strategy parameters."""
    # Define the parameter grid
    half_spreads = [1, 2, 3, 4, 5]  # In ticks
    max_positions = [5, 10, 15, 20]
    skews = [0.0, 0.5, 1.0, 1.5]

    results = []

    # Test all combinations
    for half_spread, max_pos, skew in product(half_spreads, max_positions, skews):
        asset = (
            BacktestAsset()
            .data(data_files)
            .accelerated()
            # ... config
        )
        hbt = AcceleratedBacktest([asset])
        recorder = Recorder(1, 1_000_000)

        # Run the parameterized strategy (a sketch of one possible
        # version appears after this block)
        fast_strategy(hbt, recorder, half_spread, max_pos, skew)

        # Collect results
        record = LinearAssetRecord(recorder.get_records(0))
        results.append({
            'half_spread': half_spread,
            'max_position': max_pos,
            'skew': skew,
            'sharpe': record.sharpe_ratio,
            'total_pnl': record.total_pnl,
            'num_trades': record.num_trades,
        })

    return pd.DataFrame(results)

# Run the optimization
results = optimize_parameters(['BTCUSDT_20240101_accel.npz'])

# Find the best parameters
best = results.loc[results['sharpe'].idxmax()]
print(f"Best parameters: {best}")
Validation Workflow
Use accelerated backtesting for optimization, then validate with standard backtesting:
from hftbacktest import BacktestAsset, ROIVectorMarketDepthBacktest, Recorder
from hftbacktest.stats import LinearAssetRecord

# Step 1: Fast parameter search (accelerated)
results = optimize_parameters_fast(accel_data)

# Step 2: Validate the top candidates (standard)
top_10_params = results.nlargest(10, 'sharpe')

validation_results = []
for idx, params in top_10_params.iterrows():
    # Use STANDARD backtesting with full accuracy
    asset = (
        BacktestAsset()
        .data(standard_data_files)  # Use full, non-preprocessed data
        .power_prob_queue_model(3.0)  # Enable the queue model
        .constant_latency(1_000_000, 1_000_000)  # Model latency
        # ... full config
    )
    hbt = ROIVectorMarketDepthBacktest([asset])
    recorder = Recorder(1, 50_000_000)

    # Pass only the strategy parameters, not the stats columns
    strategy(hbt, recorder, params['half_spread'], params['max_position'], params['skew'])

    record = LinearAssetRecord(recorder.get_records(0))
    validation_results.append({
        'params': params,
        'accel_sharpe': params['sharpe'],
        'standard_sharpe': record.sharpe_ratio,
    })

# Step 3: Select parameters that validate well
for result in validation_results:
    sharpe_diff = abs(result['standard_sharpe'] - result['accel_sharpe'])
    if sharpe_diff < 0.2:  # Similar performance
        print(f"Good parameters: {result['params']}")
Accuracy Tradeoffs
Understand what you lose with acceleration:
Lost Accuracy
Queue Position Effects
Can’t model “getting in line early”
Overestimates fills in congested markets
Underestimates fills when you’d be at front
Partial Fills
Orders either fully fill or don’t fill
Reality: large orders may partially fill
Order Response Latency
State updates happen immediately
Reality: you don’t know fill status until response arrives
Can lead to unrealistic hedging in backtest
Preserved Accuracy
Feed Latency
Still modeled correctly
You react to stale market data as in reality
Order Entry Latency
Still modeled correctly
Orders arrive at exchange with delay
Price Movements
Market dynamics preserved
Spread and volatility effects captured
Best Practices
Choose Appropriate Interval
The running interval should match your strategy’s natural decision frequency:
HFT strategies: 10-50ms
Market making: 50-200ms
Slower strategies: 200-1000ms
Smaller intervals = more accuracy but less speed gain.
Validate with Standard Backtest
Always validate your top parameter sets with standard backtesting before live deployment. Use accelerated mode for searching, standard mode for validation.
Avoid Queue-Sensitive Strategies
Accelerated backtesting works poorly for:
Strategies that rely on queue position
Large tick size markets
Strategies that use GTX orders aggressively
Use standard backtesting for these cases.
Monitor Accuracy Degradation
Compare accelerated vs standard results periodically:

accel_sharpe = 2.1
standard_sharpe = 1.9
degradation = (accel_sharpe - standard_sharpe) / standard_sharpe

if degradation > 0.15:  # >15% optimistic
    # Use standard backtesting or adjust expectations
    print(f"Accelerated results look {degradation:.0%} optimistic")
Typical speedups from accelerated backtesting:
Strategy Type          | Standard Time | Accelerated Time | Speedup
Simple Market Making   | 120s          | 8s               | 15x
Grid Trading           | 180s          | 5s               | 36x
Multi-Asset Strategy   | 300s          | 25s              | 12x
Complex Alpha Strategy | 240s          | 18s              | 13x
Next Steps
Queue Models Return to standard backtesting with accurate queue models
Pricing Framework Build sophisticated pricing models for multi-asset strategies