It Will Happen
Your bot will crash. Not if — when.
- Network timeout at 3 AM
- Exchange maintenance window you didn’t know about
- Unhandled exception in an edge case
- Your VPS runs out of memory
- Python segfault (yes, really)
When it crashes, you might have open positions with no stop loss monitoring. This is where accounts blow up.
The Recovery Problem
When your bot restarts, it needs to answer:
- Do I have open positions?
- What were the entry prices?
- Are there stop loss orders on the exchange?
- What state was the trailing stop in?
If it can’t answer these questions, it’s blind. It might open duplicate positions, or worse, leave existing positions unmanaged.
My Recovery System
Step 1: State File
Every position change is saved to state.json:
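A minimal sketch of how such a file could be written and read. The schema (`positions`, `entry_price`, `sl_order_id`, `trailing_stop`) and the helper names `save_state` / `load_state` are assumptions for illustration; only the filename `state.json` comes from the text above. The write goes through a temp file and a rename so that a crash mid-write does not leave a half-written file behind.

```python
import json
import os
import tempfile

STATE_PATH = "state.json"  # filename from the text; the schema is assumed

# Example of what the file might hold (hypothetical fields):
# {"positions": {"BTC/USDT": {"side": "long", "entry_price": 64000.0,
#                             "size": 0.01, "sl_order_id": "abc123",
#                             "trailing_stop": {"active": false}}}}


def save_state(state: dict, path: str = STATE_PATH) -> None:
    """Write state to a temp file in the same directory, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f, indent=2)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path: str = STATE_PATH) -> dict:
    """Return the saved state, or an empty state if missing or unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {"positions": {}}
```

An empty result from `load_state` is the cold-start case: the bot knows nothing and has to rely entirely on what the exchange reports.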
This file is the bot’s memory. Without it, a restart is a cold start.
Step 2: Exchange Sync
State files can be wrong. Maybe the bot crashed between placing an order and updating the file. So on startup:
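One way to sketch that startup check is a pure comparison between the positions in the state file and the positions the exchange reports. The function name `reconcile` and the dict shapes are assumptions; fetching the live positions is left out because it depends on the exchange API in use.

```python
def reconcile(saved: dict, live: dict) -> dict:
    """Compare saved positions with positions the exchange reports.

    Both arguments map symbol -> position dict. The exchange is treated
    as the source of truth for what is actually open.
    """
    confirmed = {}  # in both: keep saved metadata, trust the exchange's size
    untracked = {}  # open on the exchange but missing from the file
    stale = {}      # in the file but no longer open on the exchange

    for symbol, pos in live.items():
        if symbol in saved:
            confirmed[symbol] = {**saved[symbol], "size": pos["size"]}
        else:
            untracked[symbol] = pos

    for symbol, pos in saved.items():
        if symbol not in live:
            stale[symbol] = pos

    return {"confirmed": confirmed, "untracked": untracked, "stale": stale}
```

The `untracked` bucket is the crash-between-order-and-save case: the position is real, but the bot has no record of its stop loss or trailing stop, so it needs the most attention.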
Step 3: SL Order Verification
For each recovered position, check if the stop loss order is still on the exchange:
- SL exists and active: Great, do nothing
- SL exists but triggered: Position might be closed, verify
- SL missing: Place a new one immediately
The scariest case is a position with no SL. That position is unprotected: nothing on the exchange caps the loss while the bot is down. The bot’s first priority on restart is making sure every position has a stop loss.
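The three cases above map naturally onto a small decision function. This is a sketch under assumptions: the status strings (`"open"`, `"triggered"`, `"filled"`) and the `sl_order_id` field are hypothetical, and real exchanges name their order states differently.

```python
from enum import Enum


class SLAction(Enum):
    NONE = "none"               # SL is live on the exchange, nothing to do
    VERIFY_POSITION = "verify"  # SL fired; the position may already be closed
    PLACE_NEW = "place_new"     # no usable SL; place one immediately


def check_stop_loss(position: dict, orders_by_id: dict) -> SLAction:
    """Decide what to do about one recovered position's stop loss.

    orders_by_id maps order id -> order dict as fetched from the exchange.
    """
    sl_id = position.get("sl_order_id")
    order = orders_by_id.get(sl_id) if sl_id else None
    if order is None:
        return SLAction.PLACE_NEW
    status = order.get("status")
    if status == "open":
        return SLAction.NONE
    if status in ("triggered", "filled"):
        return SLAction.VERIFY_POSITION
    # Canceled, rejected, expired, or unknown: treat as missing.
    return SLAction.PLACE_NEW
```

Treating every unknown status as "missing" errs on the side of placing a stop loss that turns out to be redundant, which is cheaper than leaving a position uncovered.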
Step 4: Deep Recovery Edge Cases
What if the bot crashed right after opening a position but before placing the SL?
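One possible way to handle this case is to scan the recovered positions for any without a recorded stop-loss order and build an emergency order for each. The default 5% stop distance, the field names, and the function names are all assumptions for illustration; the actual stop distance belongs to the strategy.

```python
def emergency_stop_price(entry_price: float, side: str, stop_pct: float) -> float:
    """Stop price a fixed percentage away from entry, on the losing side."""
    if side == "long":
        return entry_price * (1 - stop_pct)
    if side == "short":
        return entry_price * (1 + stop_pct)
    raise ValueError(f"unknown side: {side!r}")


def plan_emergency_stops(positions: dict, stop_pct: float = 0.05) -> list:
    """Return one stop-order plan per position that has no SL recorded."""
    plans = []
    for symbol, pos in positions.items():
        if pos.get("sl_order_id"):
            continue  # already has a stop loss on record
        plans.append({
            "symbol": symbol,
            # The stop order closes the position, so it trades the other way.
            "order_side": "sell" if pos["side"] == "long" else "buy",
            "size": pos["size"],
            "stop_price": emergency_stop_price(pos["entry_price"], pos["side"], stop_pct),
        })
    return plans
```

The plans still have to be sent to the exchange, and the resulting order ids written back to the state file right away.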
What if the position is already at a 20% loss?
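How to handle a deep loss is a strategy decision, so the following is only one possible policy and not necessarily what this bot does: if the unrealized loss on restart is past a maximum threshold, close at market instead of placing a stop loss that would trigger immediately. The 20% threshold mirrors the example in the question; `recovery_action` and its return strings are hypothetical.

```python
def unrealized_pnl_pct(entry_price: float, mark_price: float, side: str) -> float:
    """Unrealized PnL as a fraction of entry price (negative means a loss)."""
    change = (mark_price - entry_price) / entry_price
    return change if side == "long" else -change


def recovery_action(entry_price: float, mark_price: float, side: str,
                    max_loss_pct: float = 0.20) -> str:
    """Choose between closing the position and protecting it with an SL."""
    pnl = unrealized_pnl_pct(entry_price, mark_price, side)
    if pnl <= -max_loss_pct:
        return "close_at_market"
    return "place_stop_loss"
```

For example, a long entered at 100 and now marked at 75 is at a 25% loss, so this policy would close it; the same long marked at 95 would get a fresh stop loss.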
The Balance Check
Before any trading logic runs, verify you can actually fetch your balance:
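A sketch of that gate, assuming a hypothetical `fetch_balance` callable that returns the available balance as a number and raises on failure. The mode names `"trade"` and `"manage_only"` are also made up for this example.

```python
from typing import Callable, Optional, Tuple


def startup_mode(fetch_balance: Callable[[], float]) -> Tuple[str, Optional[float]]:
    """Return ("trade", balance) if the balance is readable, else ("manage_only", None)."""
    try:
        balance = float(fetch_balance())
    except Exception as exc:
        print(f"balance check failed, not opening new positions: {exc}")
        return "manage_only", None
    return "trade", balance
```

In `"manage_only"` mode the bot would keep watching existing positions and their stop losses but skip any logic that opens new ones. Note that the check is on whether the call succeeded, not on the value: a balance of zero is still a valid answer.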
If you can’t check your balance, you don’t know how much capital is available. Don’t open new positions blind.
Defensive Coding Patterns
Every API Call Gets a Try/Except
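A minimal sketch of the pattern as a reusable wrapper, so that each call site does not need its own try/except block. The name `safe_call`, the retry count, and the linear backoff are assumptions, not a prescribed design.

```python
import logging
import time

log = logging.getLogger("bot")


def safe_call(fn, *args, retries: int = 3, delay: float = 1.0, default=None, **kwargs):
    """Call fn(*args, **kwargs); retry on any exception, then return default."""
    name = getattr(fn, "__name__", repr(fn))
    for attempt in range(1, retries + 1):
        try:
            return fn(*args, **kwargs)
        except Exception:
            # Log the full traceback but keep the bot's main loop alive.
            log.exception("%s failed (attempt %d/%d)", name, attempt, retries)
            if attempt < retries:
                time.sleep(delay * attempt)  # wait a little longer each time
    return default
```

The caller has to treat `default` as "unknown" and skip the decision that depended on the call, rather than acting on a made-up value.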
State Saves After Every Change
Not at the end of the loop. Not every minute. After every state change.
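One way to make this hard to forget is to route every mutation through a small class that persists on each change. `PositionBook` and its fields are hypothetical, and the plain `open(..., "w")` write is kept short for the example; writing to a temp file and renaming it would be safer against a crash mid-write.

```python
import json


class PositionBook:
    """Holds open positions and writes them to disk after every mutation."""

    def __init__(self, path: str = "state.json"):
        self.path = path
        self.positions = {}

    def _save(self) -> None:
        with open(self.path, "w") as f:
            json.dump({"positions": self.positions}, f, indent=2)

    def open_position(self, symbol: str, side: str, entry_price: float, size: float) -> None:
        self.positions[symbol] = {
            "side": side,
            "entry_price": entry_price,
            "size": size,
            "sl_order_id": None,  # filled in once the SL order is placed
        }
        self._save()

    def set_stop_loss(self, symbol: str, order_id: str) -> None:
        self.positions[symbol]["sl_order_id"] = order_id
        self._save()

    def close_position(self, symbol: str) -> None:
        self.positions.pop(symbol, None)
        self._save()
```

Because `open_position` saves with `sl_order_id` still `None`, a crash before the stop loss is placed leaves a record that says so, and the restart logic can see the gap.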
If the bot crashes 1 second after opening a position, the state file has it.
Graceful Shutdown
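A sketch using Python's standard `signal` module. The handler only sets a flag; the main loop notices the flag, exits, and saves in a `finally` block. The names `run`, `tick`, and `save_state` are placeholders for the bot's own loop body and save function.

```python
import signal
import threading

stop_requested = threading.Event()


def request_stop(signum, frame):
    """Signal handler: do as little as possible, just ask the loop to stop."""
    stop_requested.set()


def install_signal_handlers() -> None:
    """Call once at startup, from the main thread."""
    signal.signal(signal.SIGINT, request_stop)   # Ctrl+C
    signal.signal(signal.SIGTERM, request_stop)  # default `kill`, service stop


def run(tick, save_state, interval: float = 1.0) -> None:
    """Run tick() until a stop is requested, then always save state."""
    try:
        while not stop_requested.is_set():
            tick()
            stop_requested.wait(interval)  # sleep, but wake early on stop
    finally:
        save_state()
```

At startup this would look like `install_signal_handlers()` followed by `run(tick, save_state)`. The `finally` block also runs if `tick` raises, so an unhandled exception still gets a final save.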
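Ctrl+C (SIGINT) and the default kill signal (SIGTERM) trigger a clean save before exit. SIGKILL and the kernel's out-of-memory killer cannot be caught, which is why saving after every state change still matters.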
The Lesson
The difference between a toy bot and a production bot is crash recovery.
A toy bot works great when everything is normal. A production bot works great when everything is on fire.
Assume your bot will crash with open positions. Build the recovery before you build the strategy.
Your bot’s job isn’t just to make money. It’s to not lose money when things go wrong.