What Is Overfitting?
Your trading bot’s backtest shows +500% returns. You deploy it. It loses money immediately.
Overfitting means your strategy learned the noise in historical data instead of the signal. It memorized the past instead of discovering patterns that repeat in the future.
This is the #1 reason trading bots fail in production.
The Overfitting Checklist
Use this checklist every time you build or modify a strategy. If you answer “yes” to any of these, you might be overfitting.
Red Flag 1: Too Many Parameters
Question: Does your strategy have more than 5-7 tunable parameters?
Every parameter is a degree of freedom. More freedom = more room to fit noise.
| Parameters | Risk Level |
|---|---|
| 1-3 | Low — probably capturing real signal |
| 4-7 | Medium — be careful |
| 8-12 | High — likely overfitting |
| 13+ | Almost certainly overfitting |
My approach: My trend-following bot has 6 main parameters (SL%, trailing activation, trailing stop, body threshold, volume ratio, CHOP threshold). I optimize 2-3 at a time, never all at once.
Red Flag 2: Optimization on Short Data
Question: Did you optimize on less than 3 months of data?
Short periods have specific market conditions. Parameters optimized on a 2-week rally will fail in the next sideways period.
| Data Period | Reliability |
|---|---|
| 1-2 weeks | Useless |
| 1 month | Dangerous |
| 3 months | Minimum |
| 6-12 months | Good |
| Multiple years | Best |
Red Flag 3: No Out-of-Sample Test
Question: Have you tested on data the strategy has never seen?
This is the single most important test. Split your data:
|-------- Training (70%) --------|----- Testing (30%) -----|
Optimize here Validate here
If performance drops significantly on the test set, you’re overfitting.
My experience:
| Strategy | Training PnL | Out-of-Sample PnL | Verdict |
|---|---|---|---|
| Momentum Bot | +800U | -200U | Overfitting — killed it |
| Trend Following | +343U | +245U | Real edge — deployed |
| FVG Bot | +444U | +548U (1yr) | Real edge — deployed |
Red Flag 4: Spike vs Plateau
Question: Does a 10% change in any parameter destroy performance?
Good parameters sit on plateaus — nearby values work almost as well. Overfit parameters sit on spikes — any small change collapses returns.
GOOD (Plateau):
SL 1.5%: +200U | SL 2.0%: +250U | SL 2.5%: +230U
BAD (Spike):
SL 1.5%: -100U | SL 2.0%: +500U | SL 2.5%: -80U
If your strategy only works with SL at exactly 2.0%, you don’t have an edge. You have a coincidence.
Red Flag 5: Win Rate Too High
Question: Is your backtest win rate above 65%?
In crypto, with typical trend-following or mean-reversion strategies, realistic win rates are:
- Trend following: 40-60%
- Mean reversion: 25-40%
- Scalping: 50-65%
If your backtest shows 80%+ win rate, something is wrong. You’re probably:
- Looking at too short a period
- Not accounting for slippage and fees
- Using future data accidentally (look-ahead bias)
Red Flag 6: No Losing Months
Question: Does your backtest have zero losing months?
Real strategies have drawdowns. Every strategy has market conditions where it underperforms. If your equity curve is a smooth line going up, it’s fiction.
My FVG bot’s quarterly results:
Q2 2025: +271U ✓
Q3 2025: +38U ✓ (barely)
Q4 2025: +273U ✓
Q1 2026: -36U ✗ (sideways market)
One losing quarter out of four. That’s realistic. If all four were positive with smooth returns, I’d be suspicious.
Red Flag 7: Complexity Without Justification
Question: Can you explain WHY each rule exists?
Every condition in your strategy should have a logical reason:
| Rule | Why It Exists |
|---|---|
| Body ≥ 0.7% | Filter out noise/sideways candles |
| Volume Ratio ≥ 1.5 | Confirm strong move, not just random fluctuation |
| CHOP < 50 | Avoid choppy markets where trend-following fails |
| SL at 2% | Based on typical crypto noise level on 5m timeframe |
If you added a rule just because it improved the backtest but you can’t explain the logic, it’s probably overfitting.
The Prevention Protocol
Step 1: Walk-Forward Analysis
Instead of one backtest, do this:
Period 1: Train on Jan-Mar → Test on Apr
Period 2: Train on Feb-Apr → Test on May
Period 3: Train on Mar-May → Test on Jun
...
If performance is consistent across all test periods, the edge is real.
Step 2: Multiple Market Conditions
Test in:
- Bull market (strong uptrend)
- Bear market (strong downtrend)
- Sideways (ranging/choppy)
- High volatility events (crashes, pumps)
A real strategy works in at least 2-3 of these. An overfit strategy only works in the specific condition it was trained on.
Step 3: Trimmed Returns
Calculate PnL with the top 10% and bottom 10% of trades removed.
If your total PnL is +500U but your trimmed PnL is -50U, your returns depend on a few outlier trades. That’s not a robust strategy — it’s luck.
Step 4: Trade Count Minimum
- Under 30 trades: Statistically meaningless
- 30-100 trades: Weak but indicative
- 100-500 trades: Reasonable confidence
- 500+ trades: Good statistical significance
My trend-following bot validation: 1,800 trades. FVG bot OOS test: 1,506 trades. Large enough samples that the results mean something.
Step 5: Keep It Simple
“Everything should be made as simple as possible, but no simpler.” — Einstein
The best trading strategies are surprisingly simple. When I added complexity to my bots (rolling RSI, regime detection, multiple indicator crossovers), performance almost always got worse on out-of-sample data.
The strategies that survived:
- Trend following: body + volume + CHOP filter → 3 conditions
- FVG: gap detection + size filter + trend filter → 3 conditions
Simple strategies are harder to overfit.
The Ultimate Test
Ask yourself:
“If I showed this strategy to someone with no knowledge of my training data, would they agree the logic makes sense?”
If the answer is yes, you probably have a real edge. If the answer is “well, the backtest shows…” — you’re probably overfitting.
The market doesn’t care about your backtest. It only cares about whether your edge is real.