Backtest Validation - Referential Labs

Your backtest shows 40% annual returns. But will it hold up in live trading? Most don't—and the gap between simulation and reality is where capital goes to die.

The Problem

Backtesting errors are subtle and systematic. They don't throw exceptions; they silently inflate your expected returns until live trading reveals the truth. Common issues include (the first is illustrated with a short code sketch after the list):

  • Lookahead bias: Using information that wouldn't have been available at the time of the trade
  • Survivorship bias: Only testing on assets that still exist today, missing the losers that were delisted
  • Data leakage: Information bleeding from training data into test data, especially in ML pipelines
  • Overfitting: Tuning parameters until they fit historical noise rather than signal
  • Unrealistic execution: Assuming perfect fills, zero slippage, and infinite liquidity
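
To make the first of these concrete, here is a minimal, self-contained Python sketch (synthetic data, not our detection code) showing how a signal that peeks at the bar it trades inflates returns, and how lagging the decision by one bar removes the illusion:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))))
returns = prices.pct_change()

# Biased: the position held over a bar is decided using that bar's own
# return, information the strategy could not have had at entry.
signal = (returns > 0).astype(int)

biased = (signal * returns).mean()           # peeks at the future
honest = (signal.shift(1) * returns).mean()  # decision lagged one bar

print(f"biased mean return per bar: {biased:+.5f}")
print(f"honest mean return per bar: {honest:+.5f}")
```

On random-walk data the "honest" version hovers near zero while the biased version looks like reliable alpha, which is exactly how lookahead bias hides in plain sight.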

Our Approach

Systematic validation that surfaces methodology issues for review. Automation catches the obvious errors; your judgment handles the edge cases.

Bias Detection

Statistical tests for lookahead, survivorship, and selection bias flag suspicious patterns in your returns.
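
As one hedged illustration of the idea (a heuristic, not our production test): re-running the PnL with every position delayed an extra bar and watching for a collapse is a cheap lookahead screen. The function name and threshold below are assumptions for the sketch.

```python
import numpy as np

def lookahead_suspicion(positions: np.ndarray, returns: np.ndarray,
                        collapse_ratio: float = 0.5) -> bool:
    """True if delaying positions one bar destroys most of the reported PnL."""
    pnl = np.nanmean(positions * returns)
    pnl_lagged = np.nanmean(positions[:-1] * returns[1:])
    if pnl <= 0:
        return False  # nothing suspicious to inflate
    return pnl_lagged < collapse_ratio * pnl

# Synthetic example: a "peeking" strategy that holds sign(today's return)
# over today collapses completely once lagged, so it gets flagged.
rng = np.random.default_rng(0)
rets = rng.normal(0, 0.01, 2000)
print(lookahead_suspicion(np.sign(rets), rets))  # True
```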

Leakage Analysis

Trace data flow through your pipeline to identify where future information might contaminate historical analysis.
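
A minimal sketch of one such check, under the assumption that pipeline rows carry a "timestamp" column: training data must strictly precede the test window, with an embargo gap to cover features built from trailing windows.

```python
import pandas as pd

def check_temporal_leakage(train: pd.DataFrame, test: pd.DataFrame,
                           embargo: pd.Timedelta = pd.Timedelta("1D")) -> list[str]:
    """Return human-readable findings; an empty list means no issue found."""
    findings = []
    train_end = train["timestamp"].max()
    test_start = test["timestamp"].min()
    if train_end >= test_start:
        findings.append(f"train/test overlap: train ends {train_end}, "
                        f"test starts {test_start}")
    elif test_start - train_end < embargo:
        findings.append(f"embargo violated: only {test_start - train_end} "
                        f"between train end and test start")
    return findings
```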

Overfitting Metrics

Parameter sensitivity analysis, out-of-sample degradation tracking, and complexity penalties.
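
For instance, out-of-sample degradation can be tracked as the fraction of in-sample Sharpe retained out of sample. The sketch below assumes daily returns and a 252-day annualization; the exact metrics we compute may differ.

```python
import numpy as np

def sharpe(returns: np.ndarray, periods_per_year: int = 252) -> float:
    return np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1)

def oos_degradation(in_sample: np.ndarray, out_of_sample: np.ndarray) -> float:
    """Fraction of in-sample Sharpe retained out of sample (can be < 0)."""
    is_sharpe = sharpe(in_sample)
    if is_sharpe <= 0:
        return float("nan")  # nothing to retain
    return sharpe(out_of_sample) / is_sharpe
```

A retention ratio near 1 suggests the edge generalizes; a ratio near or below zero suggests the parameters were fit to historical noise.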

Execution Realism

Compare your execution assumptions against historical market data. Model realistic slippage and market impact.
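
One widely used baseline is the square-root impact model, where cost scales with volatility and the square root of participation (order size over daily volume). The sketch below is illustrative; the coefficient k is an assumption to calibrate against your own fills, not a universal constant.

```python
import math

def sqrt_impact_bps(order_size: float, daily_volume: float,
                    daily_vol_bps: float, k: float = 1.0) -> float:
    """Estimated market impact in basis points for a single order."""
    participation = order_size / daily_volume
    return k * daily_vol_bps * math.sqrt(participation)

# Example: 50k shares against 1M daily volume, 150 bps daily volatility
print(f"{sqrt_impact_bps(50_000, 1_000_000, 150):.1f} bps")  # ~33.5 bps
```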

Validation Reports

Every backtest run generates a validation report that includes (one possible report shape is sketched in code after the list):

  • Pass/fail status for each validation check
  • Specific line items where issues were detected
  • Severity ratings and recommended investigation
  • Historical comparison against previous backtest versions
  • Confidence intervals on reported performance metrics
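
One possible shape for a single report entry, sketched in Python (the field names and severity scale are illustrative assumptions, not our actual schema):

```python
from dataclasses import dataclass
from enum import Enum

class Severity(Enum):
    INFO = "info"
    WARNING = "warning"
    CRITICAL = "critical"

@dataclass
class ValidationFinding:
    check: str                  # e.g. "lookahead_bias"
    passed: bool
    severity: Severity
    detail: str                 # what was detected, and where
    recommendation: str = ""    # suggested investigation
    vs_previous_run: str = ""   # historical comparison, if available

finding = ValidationFinding(
    check="execution_realism",
    passed=False,
    severity=Severity.WARNING,
    detail="fill prices better than bid/ask in 12% of trades",
    recommendation="review fill model against historical quotes",
)
```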

Ongoing Refinement

Validation isn't one-and-done. As you develop new strategies, trade new asset classes, or refine your methodology, the checks need tuning (see the configuration sketch after this list):

  • Thresholds that work for equities may not work for crypto
  • New data sources require new validation rules
  • False positives need investigation and rule refinement
  • Your team builds institutional knowledge about what matters
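
A hedged sketch of what per-market tuning can look like in practice (the check names and values below are made up for illustration; checks calibrated for equities would fire constantly on crypto's higher volatility and thinner books):

```python
# Illustrative per-asset-class thresholds; every number here is an
# assumption for the sketch, not a recommended setting.
THRESHOLDS = {
    "equities": {"max_same_bar_corr": 0.05, "min_oos_retention": 0.5,
                 "max_assumed_participation": 0.05},
    "crypto":   {"max_same_bar_corr": 0.10, "min_oos_retention": 0.3,
                 "max_assumed_participation": 0.02},
}

def threshold_for(asset_class: str, check: str) -> float:
    # Fall back to the conservative equities defaults for unknown classes.
    return THRESHOLDS.get(asset_class, THRESHOLDS["equities"])[check]
```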

The platform handles routine detection so your researchers can focus on the judgment calls that actually require expertise.

Get Started

See what systematic validation surfaces in your backtests. Start with an assessment of your current methodology.

Contact Us

Frequently Asked Questions

What types of backtesting errors can you detect?
Our platform flags potential lookahead bias, survivorship bias, data leakage, overfitting patterns, and unrealistic execution assumptions. Flagged issues require human review—some will be real problems, others will be false positives that inform threshold tuning.
How does automated validation work?
You connect your backtesting system to our platform via API or file upload. We run statistical checks and methodology analysis, presenting results in a validation report. Your team reviews flagged items and decides which require action. Over time, you tune the checks to reduce noise for your specific strategies.
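
For illustration only, a hypothetical upload flow in Python; the endpoint URL, payload fields, and response shape below are assumptions for the sketch, not our documented API.

```python
import requests

def submit_backtest_results(path: str, api_key: str) -> dict:
    """Upload a results file and return the validation report as JSON."""
    with open(path, "rb") as f:
        resp = requests.post(
            "https://api.example.com/v1/validations",  # placeholder URL
            headers={"Authorization": f"Bearer {api_key}"},
            files={"results": f},
            timeout=60,
        )
    resp.raise_for_status()
    return resp.json()
```
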
Can you validate backtests from any framework?
Yes. We support common frameworks like Zipline, Backtrader, and QuantConnect, as well as custom Python/Rust implementations. Our validation focuses on the methodology and results rather than the specific framework.
Do you offer manual backtest review?
Yes. Beyond automated validation, our engineering team can perform detailed manual reviews of your backtesting methodology, data pipelines, and statistical analysis. This is especially valuable for complex multi-asset or ML-based strategies where automated checks need expert interpretation.