Every ambitious trader dreams of a robot bringing, if not substantial, then at least stable, stress-free (and preferably hands-off) income. Developing such trading systems is not easy and has much in common with scientific research. A trader, in fact, tries to find and identify market patterns so that further price movement can be predicted. In the course of this "search" ("research" is essentially re(petitive)-search, starting to get it?), a trader has to form and test his assumptions or, in scientific terms, put forward hypotheses, check them on historical data to weed out the false ones, and only then allocate capital. At this stage mistakes often occur, usually stemming from not fully understanding the process, and the result is trading systems that show excellent results on historical data but do not earn. At all. This article tries to provide an unbiased view of the problem of overfitting.

Among the recommendations for novice traders, the following is often found: keep a log of your transactions, writing down everything that led you to open a position at precisely this price and moment. The asset being "overbought" or "oversold", support and resistance levels, indicator values, etc. - everything must be recorded, like a well-kept library catalogue (probably even more thoroughly - you never know when a small factor will play a crucial role). The trader is also advised to reread and analyze this journal periodically, trying to understand what he did or did not do when he received a profit or a loss. All this is done with one purpose in mind - **finding patterns** (yeah, you got it - finding patterns in a quest to find patterns, a meta-search).

The vast majority of trading algorithms and systems try to capitalize on some well-known patterns in price behavior. For example, suppose that by analyzing our journal we found out that on Tuesday afternoon the price always increases (or at least on most Tuesdays - that would be enough). In that case everything is plain and simple - we just spin up a nifty little robot that goes long at Tuesday lunch and closes the position by the evening. Of course, our little algorithm can be, and easily will become, more complicated, entangled with additional filters, indicator values or even moon phases, but the gist remains the same - trying to find patterns that (presumably) arise in price behavior and use them in one's favor. It is knowledge of a certain pattern that allows us to predict the future price.
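As an illustration, the "Tuesday robot" above can be sketched in a few lines. Everything here is hypothetical - a toy backtest over a list of daily open/close prices, not a real trading API:

```python
def tuesday_strategy_pnl(prices, weekdays):
    """Toy backtest of the 'always long on Tuesday' idea.

    `prices` is a list of (open, close) tuples, one per trading day,
    and `weekdays` holds the matching weekday index (0=Mon .. 4=Fri).
    Both inputs are made up for illustration; real data feeds differ.
    """
    pnl = 0.0
    trades = 0
    for (open_px, close_px), day in zip(prices, weekdays):
        if day == 1:  # Tuesday: buy at lunch, close by evening
                      # (simplified here to open and close prices)
            pnl += close_px - open_px
            trades += 1
    return pnl, trades
```

Everything interesting about overfitting happens *around* code like this: the robot itself is trivial, the question is whether the Tuesday pattern it exploits is real.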

Take a look at a sinusoid chart, or any other cyclically repeating one:

If it were a price chart, you would be rich. Everything is obvious here - when to go long, when to sell, the optimal closing points. The real market exhibits roughly the same meta-pattern: traders are looking for a combination of prices and indicators that repeats constantly. That is, they are trying to "cage" something so complex it is considered mostly random into a small and (mostly) simple set of calculations. Periodic repetition of a pattern allows forecasting future behavior. And a correct forecast is exactly what you need for a profitable trading strategy - even more so than for a weather one.

To check the credibility of these hypotheses and (we hope) educated guesses, they are backtested on historical data, and then so-called forward testing is done. These tests basically "look" at how the patterns would have worked in the past. If the results on past data look bright, it is concluded that the asset's future behavior will correspond to its past performance, at least for a while. Most traders reason approximately this way (also a pattern, albeit a cognitive one): "if this regularity existed for the last ten years, what are the odds it wouldn't last for another half a year and make me rich, huh?"
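The backtest/forward split can be sketched as a minimal harness. The `rule` callable and the per-bar return series are placeholders for whatever your own framework provides; the one important detail is that the rule only ever sees past data:

```python
def split_backtest(returns, rule, split=0.7):
    """Evaluate a trading `rule` separately on the in-sample ("backtest")
    and out-of-sample ("forward") parts of a return series.

    `rule(history)` returns +1 (long), -1 (short) or 0 (flat) for the
    next bar, given only the bars before it. Inputs are hypothetical.
    """
    cut = int(len(returns) * split)

    def run(segment, offset):
        pnl = 0.0
        for i in range(len(segment)):
            position = rule(returns[:offset + i])  # only past data visible
            pnl += position * segment[i]
        return pnl

    in_sample = run(returns[:cut], 0)
    out_of_sample = run(returns[cut:], cut)
    return in_sample, out_of_sample
```

A strategy that shines in-sample but collapses out-of-sample is the first warning sign discussed in the rest of this article.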

Obvious and natural at first glance, the trader's answer to this question may unfortunately turn out to be erroneous, and here is why: what if the pattern repetitions found are just coincidences? What if there is no real correlation, and the causation behind the pattern is bogus - a coincidence dressed up by cognitive bias? Like a tossed coin accidentally landing on the same side several times in a row. So an idea that vividly materializes in one's mind, is confirmed by experts and/or tested on a breadth of historical data with excellent results, may in reality have no predictive power at all. Each transaction of such a system will be like playing roulette: sooner or later you will lose. The main difference is that for overconfident traders this roulette is more of the Russian kind than the Vegas one. Trading algorithms can thus be optimized to history: combinations of parameters and indicator options are found that produce a beautiful upward-facing equity parabola. This is usually called "overfitting".
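The coin-toss analogy is easy to demonstrate. Simulate many strategies with zero real edge and pick the best historical performer: its equity curve looks great, yet its expected future edge is still exactly zero. The names and defaults below are illustrative:

```python
import random

def best_of_random_strategies(n_strategies=1000, n_trades=250, seed=7):
    """Simulate coin-flip strategies with no real edge: each trade wins
    or loses one unit with probability 1/2. With enough candidates, the
    best historical result looks impressive purely by chance - this is
    the selection effect behind many "excellent" backtests.
    """
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(n_strategies):
        pnl = sum(rng.choice((-1, 1)) for _ in range(n_trades))
        best = max(best, pnl)
    return best  # the "winner" still has zero expected future edge
```

The more parameter combinations an optimizer tries, the closer it gets to this situation: it is effectively searching a large pool of candidates for the luckiest one.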

How to distinguish an overfitted strategy from a robust one? It must be said that it is best to analyze this from the standpoint of probability, because there is always a chance of coincidence we can almost never escape (yeah, "probability" and "never", we know). We can only highlight some characteristics that a trading system with real predictive power should possess.

**Complexity**. More precisely, the complexity of the algorithm relative to the period of history on which it works. This one is intuitive: the more indicators and filters a trading algorithm uses, the easier it is to obtain the desired, cherry-picked "positive" result. Therefore, the simpler the trading algorithm, the more reliable it is. A complex enough algorithm will usually only work on the very data it was fitted to - a bit of an extreme example, but you get the point.

**Sensitivity**. This is a very important characteristic. Sensitivity shows how much a change in the algorithm's parameters affects the final result (the equity curve). The smaller this influence, the better - it means we are less likely to have received an overfitted, coincidental result. Overfitted trading robots are by definition very "sensitive": a minor change, for example to the moving average period, can easily turn a profit-making robot into a losing one. This attribute can even be used to test the robustness of trading algorithms. Successively changing each parameter of the trading algorithm over a large range (somewhat resembling a genetic optimization algorithm), one looks at how much this affects the testing results. If these perturbations do not lead to radical changes in the outcome, the probability that the trading system has some real predictive power is proportionally higher.

**Representativeness and reliability of backtesting**. The testing period of the trading system should be fairly long and include periods of different market behavior, so that the probability that a strategy tested and tuned during a bullish phase will quickly lose money in an emerging bearish one approaches zero. For example, it may be difficult to assess the results of a couple of months of forward testing if only a dozen positions were opened - in that case the probability of mere coincidence is high. Therefore, a profitable trading system should be backtested on a large period of history, preferably with hundreds or thousands of trades (the more, the better).
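The sensitivity check described above can be sketched as a parameter sweep. The `backtest` callable and the parameter names are placeholders for whatever your own testing framework exposes:

```python
def sensitivity_sweep(backtest, base_params, spans):
    """Perturb each parameter over a range and record how far the
    backtest result moves. `backtest(params) -> pnl` is a hypothetical
    hook into your own tester; `spans` maps each parameter name to the
    values to try. A robust system keeps min and max close together.
    """
    results = {}
    for name, values in spans.items():
        pnls = []
        for value in values:
            params = dict(base_params, **{name: value})  # vary one param
            pnls.append(backtest(params))
        results[name] = (min(pnls), max(pnls))  # spread of outcomes
    return results
```

If varying, say, the moving average period across a wide span flips the result from profit to loss, that is precisely the fragility this section warns about.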

Of course, even full compliance with the characteristics mentioned above cannot give us an absolute guarantee that the trading robot will work as expected live, but it will significantly increase the chances. Ultimately, the only thing available to us is backtesting on different data (including more advanced techniques such as Monte Carlo simulations or the genetic approach). Building systems with the above characteristics in mind is what can make a trading system more reliable and "long-lasting" in the unforeseeable future that awaits.
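One simple flavor of the Monte Carlo idea mentioned above is reshuffling a strategy's per-trade results and examining the worst drawdown across reorderings - a sketch under the (strong) assumption that trades are independent, with hypothetical per-trade P&L values as input:

```python
import random

def monte_carlo_drawdown(trade_pnls, n_runs=1000, seed=42):
    """Reshuffle the order of a strategy's per-trade results many times
    and return the worst maximum drawdown seen across reorderings.
    A strategy that only survives one lucky ordering of its trades is
    fragile. Assumes trades are independent, which real trades may not be.
    """
    rng = random.Random(seed)
    worst = 0.0
    pnls = list(trade_pnls)
    for _ in range(n_runs):
        rng.shuffle(pnls)
        equity = peak = drawdown = 0.0
        for pnl in pnls:
            equity += pnl
            peak = max(peak, equity)              # running equity high
            drawdown = max(drawdown, peak - equity)  # deepest dip so far
        worst = max(worst, drawdown)
    return worst
```

If the worst reshuffled drawdown is far deeper than the historical one, the backtest's smooth equity curve owed a lot to the lucky ordering of its trades.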