Friday, December 6, 2013

Turning Backtesting Results into Live Profits

When building a new trading strategy, we have all seen the backtests that make you sit back in your chair and think you have just conquered the world. I am talking about the nice smooth equity curves, minimal drawdowns, and perfectly timed entries and exits. Inevitably, after containing your excitement enough to actually get the strategy running live, you watch in dismay as your strategy just doesn't perform as you expected. You are then sent back to the drawing board with a lightened bank account, a more cynical view of the world, and a deep sense of frustration. There is only one word to blame for this painful lesson: overfitting.
What is Overfitting?
Overfitting, also known as over-optimization or curve-fitting (though this last term is a misnomer, as every backtested strategy has some element of curve-fitting), is tailoring your strategy to fit a particular set of data rather than the underlying patterns in the market. The strategy is essentially memorizing one dataset, which is close to useless when running the strategy live. In machine learning terms, this problem is known as the bias-variance dilemma. Bias, or underfitting, occurs when your strategy is too simplistic to ever capture the underlying signal. Variance, or overfitting, occurs when the strategy is really just fitting the random noise in the data rather than the underlying patterns. The million-dollar question is how to find the balance between bias and variance, where the strategy captures enough of the underlying signal without getting caught up in the inherent random noise.
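To make this concrete, here is a minimal sketch in Python. The sine-plus-noise "market", the sample sizes, and every name in it are illustrative assumptions rather than anything from a real dataset; the point is the pattern in the output, where the most flexible fit typically posts the smallest in-sample error and the largest error on the held-out data:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Synthetic "market": a smooth underlying pattern buried in random noise.
x = np.linspace(0.0, 1.0, 80)
signal = np.sin(2 * np.pi * x)
y = signal + rng.normal(0.0, 0.4, x.size)

# Hold back the last quarter to stand in for unseen "live" data.
split = 60
x_train, y_train = x[:split], y[:split]
x_live, y_live = x[split:], y[split:]

for degree in (1, 4, 12):
    # Fit a polynomial of the given flexibility to the training data only.
    model = Polynomial.fit(x_train, y_train, degree)
    train_mse = np.mean((model(x_train) - y_train) ** 2)
    live_mse = np.mean((model(x_live) - y_live) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, live MSE {live_mse:.3f}")
```

The degree-1 fit is the underfit (high bias) end of the scale, the degree-12 fit is the overfit (high variance) end, and something in between tracks the signal best on the data it has never seen.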
What to do about Overfitting?
Now that you have found the culprit, what can you really do about it? There are generally three schools of thought on how to battle overfitting:
KISS: Keep It Simple, Stupid
The easiest way to avoid overfitting is to keep your strategy simple; some of the best strategies are the simplest. This means simpler models, fewer inputs, fewer rules, and minimal filters, keeping only the most basic elements of your strategy. While this introduces a high amount of bias (underfitting), at least you can be sure your backtesting results will be comparable to what you actually get when you run the strategy live. A simple strategy won't usually deliver the spectacular returns of something more sophisticated, but it is more likely to keep working in a variety of market conditions.
More Data
If a sophisticated strategy is more your style, you have to be sure you have the data to back it up. Generally speaking, using more data to build your strategy will reduce overfitting, as the model or optimization process has more information with which to separate the signal from the noise. This means having years of historical data and thousands of data points to create your strategy. While you generally want to keep your strategy as simple as possible, a more sophisticated strategy does have its advantages; just be sure you have enough data so you aren't fitting random noise.
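As a rough, hypothetical illustration of why more data helps (again on synthetic sine-plus-noise data, with the degree-12 model and all names being assumptions for the sketch), here the model's flexibility is held fixed while the sample grows, and its error against the underlying pattern, rather than against the noise it trained on, shrinks:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def pattern_mse(n_points, degree=12):
    """Error of a flexible fit against the true pattern, not the noise."""
    x = np.linspace(0.0, 1.0, n_points)
    signal = np.sin(2 * np.pi * x)                # the real structure
    y = signal + rng.normal(0.0, 0.4, n_points)   # the noisy history we see
    model = Polynomial.fit(x, y, degree)
    return np.mean((model(x) - signal) ** 2)

# Same model complexity, growing history: the fit locks onto the
# signal instead of the noise as the data piles up.
for n in (100, 1000, 10000):
    print(f"{n:6d} points: pattern MSE {pattern_mse(n):.4f}")
```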
Know Your Strategy
All strategies are curve-fit to some extent, due to their inherent reliance on past events to predict future behavior. The least you can do is understand the extent to which your strategy is fitting the data. The best way to do this is to keep some data separate from the training or optimization process. By running your strategy on data it has never seen, you get a better idea of how it will perform on new data. This can help you manage expectations and recognize when your strategy isn't performing as it should once it is running live.
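As a sketch of that idea, the toy example below, with a randomly generated price series and a simple moving-average crossover standing in for a real strategy, optimizes its lookback windows on the first 70% of the data and then scores the chosen parameters on the final 30% the search never touched. On noise-dominated data the out-of-sample return will usually look far humbler than the optimized in-sample one, which is exactly the gap this check is meant to expose:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical price history: replace with your own data in practice.
returns = rng.normal(0.0002, 0.01, 2500)
prices = 100 * np.cumprod(1 + returns)

split = int(len(prices) * 0.7)   # optimize on the first 70% only

def sma(series, window):
    """Trailing simple moving average."""
    return np.convolve(series, np.ones(window) / window, mode="valid")

def strategy_return(prices, fast, slow):
    """Total return of a long-only moving-average crossover."""
    fast_ma = sma(prices, fast)[slow - fast:]    # align both averages
    slow_ma = sma(prices, slow)
    # Hold the next day whenever the fast average closes above the slow
    # one, so the signal never uses information from the day it trades.
    position = (fast_ma[:-1] > slow_ma[:-1]).astype(float)
    daily = np.diff(prices[slow - 1:]) / prices[slow - 1:-1]
    return np.prod(1 + position * daily) - 1

# Grid-search the windows on the training data only...
candidates = [(f, s) for f in range(5, 50, 5) for s in range(60, 201, 20)]
best = max(candidates, key=lambda p: strategy_return(prices[:split], *p))

# ...then run the winner on data the search never saw.
print("best windows:", best)
print(f"in-sample return:     {100 * strategy_return(prices[:split], *best):6.1f}%")
print(f"out-of-sample return: {100 * strategy_return(prices[split:], *best):6.1f}%")
```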

If you are comparing multiple strategies, it is best to split your available data into three sets: a training set, a test set, and an evaluation set. The training set is used to build the strategies, the test set is used to select the best strategy, and the evaluation set is used to give you an idea of how the best strategy will perform in real life. (Be careful: you can only use the evaluation set once, or you reintroduce overfitting by selecting the strategy that happened to perform best on that particular dataset.) The general rule of thumb is 60% for the training set, 20% for the test set, and 20% for the evaluation set.
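The split itself is just a couple of slices. This sketch assumes a hypothetical prices.csv file and slices chronologically rather than shuffling, since shuffled market data would leak future information into the training set:

```python
import numpy as np

def three_way_split(data, train_frac=0.6, test_frac=0.2):
    """Chronological 60/20/20 split of an ordered series.

    Market data is ordered in time, so we slice instead of shuffling;
    shuffling would leak future information into the training set.
    """
    n = len(data)
    i = int(n * train_frac)
    j = int(n * (train_frac + test_frac))
    return data[:i], data[i:j], data[j:]

prices = np.loadtxt("prices.csv")   # hypothetical price history file
train, test, evaluation = three_way_split(prices)

# 1. Build every candidate strategy on `train` only.
# 2. Compare candidates and pick the winner on `test`.
# 3. Score the winner ONCE on `evaluation`; reusing it to choose between
#    strategies would quietly turn it into a second test set and bring
#    back the selection bias it exists to measure.
```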

Knowing where your strategy lies on the bias-variance scale is crucial to understanding how it will perform in real life. It is up to you as the strategy developer to strike the balance between capturing enough of the signal and filtering out the noise. Only once you understand the principles of overfitting and how to avoid it can you even start thinking about running a strategy on a live account.