The EA vendor’s sales page shows a beautiful equity curve. Maximum drawdown: 4.7%. Five years of data. Hundreds of trades. You check the Myfxbook page and the numbers match. You buy it, deploy it on a live account, and two months later you’re sitting at 19% drawdown wondering what went wrong.
Nothing went wrong. The backtest told you the truth — you just didn’t know how to read it.
A single backtest drawdown figure is not a risk estimate. It is a historical observation from one specific sequence of trades, in one specific order, during one specific market period. Change any of those three things and the drawdown changes — often dramatically. Understanding this distinction is the difference between sizing your risk correctly and blowing an account you thought was safe.
Why Backtest Drawdown Is Almost Always Optimistic
When you run a backtest, the strategy tester executes trades in chronological order — the same order they actually occurred in history. This produces a single equity curve with a single maximum drawdown figure. That number feels authoritative because it comes from real historical price data.
But consider what that number actually represents: the deepest peak-to-trough equity decline that happened to occur in the specific order your strategy encountered its trades, during the specific historical period you tested. If a cluster of losing trades happened to be spread across several months rather than concentrated in a single week, the drawdown will look manageable. If the roughest stretch of the test window happened to arrive after the account had already built up gains rather than at the start, the peak-to-trough decline will be smaller as a percentage of equity.
The backtest drawdown doesn’t measure how bad it could get. It measures how bad it happened to be in one run of history. These are fundamentally different questions, and conflating them is one of the most expensive mistakes in algorithmic trading.
There is a reliable way to distinguish them: Monte Carlo simulation.
What Monte Carlo Simulation Actually Shows You
Monte Carlo analysis takes your backtest trade list and shuffles it thousands of times — typically 5,000 or more simulations. Each shuffle produces a different equity curve and a different maximum drawdown. The results are then ranked to produce a statistical distribution of possible outcomes.
The key output is a drawdown percentile rather than a single figure: the drawdown level that 95% of all simulated sequences stayed within. This is your P95 drawdown, a statistically grounded estimate of plausible worst-case behavior from a strategy with your trade history.
The gap between your backtest drawdown and your P95 Monte Carlo drawdown is the number that actually matters for position sizing and risk management. And for most retail EA backtests, that gap is substantial.
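The shuffle-and-rank procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular tool: it assumes each trade is a profit or loss in account currency with fixed position sizing, and the function names and the 10,000 starting balance are hypothetical.

```python
import math
import random


def max_drawdown_pct(trade_pnls, start_equity=10_000.0):
    """Largest peak-to-trough equity decline, as a percentage of the peak."""
    equity = peak = start_equity
    worst = 0.0
    for pnl in trade_pnls:
        equity += pnl
        peak = max(peak, equity)
        if peak > 0:
            worst = max(worst, (peak - equity) / peak * 100.0)
    return worst


def monte_carlo_drawdowns(trade_pnls, n_sims=5000, seed=42):
    """Shuffle the trade order n_sims times; return the sorted max drawdowns."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    trades = list(trade_pnls)  # copy so the caller's list is not reordered
    results = []
    for _ in range(n_sims):
        rng.shuffle(trades)
        results.append(max_drawdown_pct(trades))
    results.sort()
    return results


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100.0))
    return sorted_values[rank - 1]
```

Given a list of trade results called trades, percentile(monte_carlo_drawdowns(trades), 95) returns the P95 drawdown under these assumptions. Shuffling keeps the same set of trades and only changes their order, so with fixed sizing the final balance is identical in every simulation while the path to it differs.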
The Typical Gap: What the Data Actually Shows
Across thousands of EA backtests analyzed with Monte Carlo simulation, a consistent pattern emerges. The ratio between P95 Monte Carlo drawdown and the original backtest drawdown — what we call the drawdown expansion ratio — clusters around predictable ranges depending on the strategy type:
1.2x to 1.5x: A tight, well-behaved distribution. The strategy’s risk profile is consistent regardless of trade ordering. The backtest drawdown is a reasonable approximation of realistic risk. This range is relatively uncommon — it indicates a strategy with consistent, uncorrelated trade outcomes.
1.5x to 2.5x: The most common range for legitimate EA backtests. A 5% backtest drawdown translates to a 7.5% to 12.5% P95 drawdown. This is the zone where most traders are surprised — they sized their risk based on 5% and are now experiencing 10%. The backtest wasn’t lying; they were reading it incorrectly.
2.5x to 4x: A warning zone. At this expansion ratio, the backtest drawdown significantly understates realistic risk. A 6% backtest drawdown with a 3x expansion ratio means a 95th percentile drawdown of 18%. Most traders deploying this strategy with standard risk parameters will eventually experience drawdowns they didn’t budget for.
Above 4x: Almost always indicates a strategy with clustering — losing trades that are correlated and tend to occur together. When the sequence favors you, the backtest looks clean. When it doesn’t, the losses pile up rapidly. This pattern is common in news-sensitive strategies, strategies that trade correlated instruments, and martingale structures where the loss clustering is a design feature rather than a coincidence.
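The bands above can be expressed as a small lookup. This is a sketch only: the function names and labels are hypothetical, and because the stated ranges share their endpoints, the code assumes a boundary value such as 2.5x falls into the higher band and that ratios below 1.2x are grouped with the tight band.

```python
def expansion_ratio(backtest_dd_pct, p95_dd_pct):
    """P95 Monte Carlo drawdown divided by the backtest drawdown."""
    if backtest_dd_pct <= 0:
        raise ValueError("backtest drawdown must be positive")
    return p95_dd_pct / backtest_dd_pct


def classify_expansion(ratio):
    """Map a drawdown expansion ratio to the bands described in the text."""
    if ratio < 1.5:
        return "tight"  # up to 1.5x (assumption: also covers ratios below 1.2x)
    if ratio < 2.5:
        return "typical"  # 1.5x to 2.5x
    if ratio <= 4.0:
        return "warning"  # 2.5x to 4x
    return "clustering suspected"  # above 4x


# The 6% backtest / 18% P95 case from the warning-zone description.
print(classify_expansion(expansion_ratio(6.0, 18.0)))
```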
A Concrete Example: The Same Strategy, Three Different Realities
Take a real-world scenario: an EURUSD scalper with the following backtest statistics over 3 years:
- Total trades: 634
- Win rate: 67%
- Profit factor: 1.74
- Maximum drawdown (backtest): 5.2%
- Recovery factor: 8.3
On paper, this looks excellent. A 5.2% drawdown on a 3-year backtest with 634 trades and a recovery factor above 8 is a strong result. Most traders would deploy this at 2-5% risk per trade and feel comfortable.
Now run Monte Carlo simulation on those 634 trades:
- P50 drawdown (median simulation): 7.8%
- P90 drawdown: 11.4%
- P95 drawdown: 13.9%
- P99 drawdown: 18.7%
The backtest showed 5.2%. Half of all simulated trade sequences produce a drawdown above 7.8%. In 5% of scenarios — one-in-twenty — the drawdown exceeds 13.9%. In 1% of scenarios, it exceeds 18.7%.
None of this is a prediction of what will happen. It is a description of what the strategy’s trade history implies is statistically possible. And “statistically possible” drifts toward “likely at some point” when you run a strategy for years across hundreds of trades.
A trader who sized risk based on the 5.2% backtest figure is exposed to drawdowns roughly 2.7x deeper than budgeted at the P95 level (13.9% against 5.2%), and about 3.6x deeper at P99. That’s not bad luck. That’s a measurement error made at the point of position sizing.
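As a quick check on the arithmetic, the expansion ratio at each percentile of this example can be computed directly from the figures quoted above. The variable names are illustrative; the numbers are the ones from the example.

```python
# Figures from the EURUSD scalper example, in percent.
backtest_dd = 5.2
mc_drawdowns = {"P50": 7.8, "P90": 11.4, "P95": 13.9, "P99": 18.7}

# Ratio of each Monte Carlo percentile to the single backtest figure.
ratios = {level: round(dd / backtest_dd, 2) for level, dd in mc_drawdowns.items()}

for level, ratio in ratios.items():
    print(f"{level}: {ratio}x the backtest drawdown")
```

Even the median simulation sits at 1.5x the backtest figure, which means the original trade order was kinder than a typical reordering of the same trades.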
The Three Strategies That Look Safest and Aren’t
Monte Carlo analysis is particularly revealing for three categories of EA that consistently show low backtest drawdown while carrying much higher real-world risk:
High win rate scalpers. An EA with a 75-85% win rate typically shows a very low backtest drawdown because losing trades are rare and distributed. But when losses do cluster — as they inevitably do during news events, spread widening, or regime changes — the drawdown can expand rapidly. Monte Carlo simulation captures this clustering risk in a way that the single-path backtest cannot. High win rate EAs often show the largest P95/backtest drawdown ratios precisely because their smooth equity curves obscure tail risk.
Range-bound strategies in trending markets. A mean-reversion strategy backtested during a ranging period will show beautiful, low-drawdown results. The losses are small and quickly recovered. But the result carries a period-selection bias: the strategy happened to be tested during a period that suited it. If those same trades were reordered so that the losses from trending stretches were concentrated at the start, the drawdown profile would look very different. Monte Carlo doesn’t solve the regime problem, but it reveals how sensitive the strategy is to sequencing.
Strategies with infrequent but large losses. An EA that wins 90% of the time with small, consistent gains but occasionally takes a large loss looks excellent on a per-trade basis. The average win is consistent, losses are rare, and the equity curve is smooth. But if two of those large losses occur consecutively, as they eventually will given enough trades, the drawdown is roughly double what a single isolated loss produced in the backtest. P95 Monte Carlo drawdown captures the realistic worst case of these clustered outlier events.
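The claim that back-to-back large losses show up given enough trades can be checked with a short calculation. The sketch below assumes trades are independent with a fixed loss probability, which is a simplification (real losses may be correlated, which would make clustering more likely, not less). The function name is hypothetical.

```python
def prob_consecutive_losses(n_trades, loss_prob):
    """Probability of at least one pair of back-to-back losses in n_trades.

    Assumes independent trades with a constant loss probability.
    """
    if not 0.0 <= loss_prob <= 1.0:
        raise ValueError("loss_prob must be between 0 and 1")
    win_prob = 1.0 - loss_prob
    # Probability of having seen no back-to-back loss so far,
    # split by whether the most recent trade was a win or a loss.
    ends_in_win, ends_in_loss = 1.0, 0.0
    for _ in range(n_trades):
        ends_in_win, ends_in_loss = (
            (ends_in_win + ends_in_loss) * win_prob,  # any clean history, then a win
            ends_in_win * loss_prob,  # a loss is only allowed right after a win
        )
    return 1.0 - (ends_in_win + ends_in_loss)


# A 90% win rate EA over increasing numbers of trades.
for n in (50, 200, 500):
    print(n, round(prob_consecutive_losses(n, 0.10), 3))
```

With a 10% loss rate the probability of two consecutive losses is only 1% over any given pair of trades, but it climbs steadily with the trade count, which is why a smooth backtest with no consecutive large losses says little about a multi-year live run.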
How to Use the P95 Drawdown for Position Sizing
The practical application of Monte Carlo drawdown analysis is straightforward. Rather than sizing risk based on the backtest maximum drawdown, size it based on the P95 Monte Carlo drawdown.
If your risk tolerance is a 20% maximum account drawdown, and your P95 Monte Carlo drawdown is 14%, you have a reasonable buffer. If your P95 drawdown is already 18%, you are operating with almost no margin before your tolerance is breached — and you should reduce position size accordingly before the first trade is placed.
The formula is simple: Risk multiplier = Backtest drawdown ÷ P95 Monte Carlo drawdown. If your backtest shows 5% and your P95 is 15%, your risk multiplier is 0.33 — meaning you should deploy at roughly one-third of the position size you would have used based on backtest figures alone.
This is not a conservative adjustment — it is the correct adjustment. You are not reducing your risk below what the strategy can support. You are sizing to what the strategy actually implies, rather than to what one lucky historical sequence happened to show.
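The sizing rule from this section fits in two small helpers. This is a sketch with hypothetical function names; it reproduces the risk multiplier formula and the tolerance buffer comparison described above and nothing more.

```python
def risk_multiplier(backtest_dd_pct, p95_dd_pct):
    """Risk multiplier = backtest drawdown / P95 Monte Carlo drawdown."""
    if backtest_dd_pct <= 0 or p95_dd_pct <= 0:
        raise ValueError("drawdowns must be positive percentages")
    return backtest_dd_pct / p95_dd_pct


def drawdown_buffer(tolerance_pct, p95_dd_pct):
    """Percentage points left between the P95 drawdown and your tolerance."""
    return tolerance_pct - p95_dd_pct


# 5% backtest drawdown against a 15% P95: deploy at about one-third size.
print(round(risk_multiplier(5.0, 15.0), 2))

# 20% tolerance: a 14% P95 leaves 6 points of buffer, an 18% P95 leaves 2.
print(drawdown_buffer(20.0, 14.0), drawdown_buffer(20.0, 18.0))
```

A negative buffer means the P95 drawdown already exceeds the stated tolerance at the current position size.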
What Monte Carlo Doesn’t Tell You
Monte Carlo simulation works with the trades that exist in your backtest. It reshuffles them — it does not invent new ones. This means it cannot account for:
Future market regime changes. If the strategy’s edge was specific to the 2019-2024 market structure and conditions change materially, the Monte Carlo distribution — however wide — will not capture losses from an environment the strategy was never designed for.
Execution differences. Backtest fills are ideal. Live fills involve slippage, requotes, spread widening during volatility, and broker-specific execution behavior. These factors compress real-world profit factor and can increase effective drawdown beyond Monte Carlo projections.
Small sample sizes. With fewer than 100 trades, Monte Carlo simulation becomes less reliable as a risk estimator. The confidence intervals are wide and the distribution is poorly defined. A 60-trade backtest with a low drawdown is not made more credible by Monte Carlo analysis — the sample is too small for any statistical method to extract reliable conclusions.
Monte Carlo is a tool for understanding the range of outcomes implied by your existing trade history. It is one of several tests that together build a complete picture of strategy robustness — not a standalone verdict on whether to deploy.
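One rough way to gauge the execution point is to subtract a fixed cost from every trade and recompute the profit factor before running any simulation. The sketch below assumes a flat per-trade cost in account currency, which is a simplification of real slippage and spread behavior; the function names and sample values are hypothetical.

```python
def profit_factor(trade_pnls):
    """Gross profit divided by gross loss."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        return float("inf")  # no losing trades in the sample
    return gross_profit / gross_loss


def apply_cost(trade_pnls, cost_per_trade):
    """Return a new trade list with a flat execution cost taken off each trade."""
    return [p - cost_per_trade for p in trade_pnls]


# Illustrative trade list, not taken from any real backtest.
sample = [10.0, 10.0, -10.0]
print(profit_factor(sample))
print(round(profit_factor(apply_cost(sample, 1.0)), 2))
```

Because the cost shrinks every win and deepens every loss, the profit factor falls on both sides of the ratio at once. Feeding the cost-adjusted trade list into a Monte Carlo run gives a drawdown distribution that reflects at least part of the live execution drag.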
Checking Your EA’s Real Drawdown Range
If you have an MT4, MT5, or cTrader backtest report, you can run a full Monte Carlo simulation on it for free using the ErgodicLabs tool. Upload the strategy tester HTML file and you’ll see the complete drawdown distribution — P50, P90, P95, and P99 — alongside the drawdown expansion ratio and a robustness score that factors in martingale detection, trade sequence predictability, and distribution shape.
The tool runs 5,000 simulations in your browser. Your data never leaves your machine. The entire analysis takes under ten seconds.
If you’ve been evaluating EAs based on backtest drawdown alone, the P95 figure from that analysis will almost certainly change how you think about the risk you’ve been accepting — and how much capital you should actually be deploying.