### CurrDayChange input discretization, ensemble of 10

This is the continuation of a previous post (see here), in which we studied the discretization of the RUT index current day change input. **Our previous backtest was performed without ensembles**, so its results were very volatile. We repeat most of the text of that post here.

We have **only 1 input now, the currDayChange**. We discretize the input into bins of equal sample size. For example, the 2 bins version discretizes the currDayChange according to its sign, mapping negative numbers to -1 and positive numbers to +1. We pursue this because we suspect that a continuous Gaussian function is inherently difficult to learn. By presenting the data in a simple form, we increase the chance that the ANN can learn it.
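A minimal sketch of equal sample size binning, assuming NumPy. The function name `discretize_equal_frequency` and the choice of evenly spaced output levels between -1 and +1 are illustrative assumptions, not the code used in the backtest. Note also that for 2 bins this sketch splits at the median of the window, which matches the sign rule only when the median is close to zero.

```python
import numpy as np

def discretize_equal_frequency(values, n_bins):
    """Map each value to one of n_bins levels in [-1, +1] so that
    every bin holds (approximately) the same number of samples."""
    values = np.asarray(values, dtype=float)
    # Interior quantile cut points, e.g. only the median for n_bins = 2.
    quantiles = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    edges = np.quantile(values, quantiles)
    # Bin index 0 .. n_bins-1 for every sample.
    bin_index = np.searchsorted(edges, values, side="right")
    # Assumed output coding: evenly spaced levels from -1 to +1.
    levels = np.linspace(-1.0, 1.0, n_bins)
    return levels[bin_index]
```

With a lookback window of 200 distinct samples and 10 bins, each bin receives 20 samples, which is the situation discussed in the conclusions.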

The parameters of the experiments:

`nNeurons = 2;`

`nEpoch = 5;`

`lookbackWindowSize = 200;`

`outlierFixLimit = 0.04;`

`nEnsembleMembers = [10, 0, 0];`

We made 8 different random experiments.
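The post does not spell out how the 10 ensemble members are combined. A minimal sketch, assuming the members' forecasts are simply averaged and the sign of the average decides the position; both function names and the combination rule are hypothetical:

```python
import numpy as np

def ensemble_forecast(member_forecasts):
    """Average the forecasts of all ensemble members (assumed combination rule)."""
    return float(np.mean(member_forecasts))

def position_from_forecast(forecast):
    """Long (+1) on a positive forecast, short (-1) on a negative one, flat (0) at zero."""
    return int(np.sign(forecast))
```

Averaging several independently initialized networks tends to cancel part of the randomness of each individual training run, which is why an ensemble is expected to give a less volatile backtest than a single network.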

The non-discretized experiments:

Note the STD here. The 2 bins case is very stable: almost all the experiments give the same result. This is the beauty of discretization. We made the objective function simple, so the ANN can learn it easily and consistently. Compare that STD with the STD of the continuous case.

The performance numbers in one table:

And we make a plot of these performance numbers.

Conclusions:

– **the 10.84% CAGR of the continuous case can be nearly doubled to 20.48% CAGR in the 10 bins case**. So, this test revealed that **it is advantageous to discretize the input.** We should decide whether we want to use the 2, 4, 6 or 10 bins version.

– the **highest D_stat (52.82%) and the highest TR (536%) are in the 10 bins case.** (However, note that the STD increases as we increase the number of bins, which shows high randomness.)

– **even the most primitive discretization, the 2 bins case, is better than the continuous, non-discretized input.**

– **compare this performance to the performance of the 1 member ensemble case:**

In this **10 member ensemble version, all the performance measures are better than in the 1 member ensemble case.** For example, **in the 10 bins version, the CAGR improved from 12.3% to 20.48% and the TR improved from 174% to 536%. (The backtest is performed for 12 years.)**

– **Personally, I would love to use the 6 bins version** (maybe with more nNeurons). Intuitively it feels right to discretize today's change as very oversold, oversold, slightly oversold, slightly overbought, overbought, very overbought. I like these 6 categories. However, the tests show unusually high STD in the 6 bins case. So, I hesitate.

– On the other hand, it is perfectly plausible that the ANN is better than a human observer. So instead of categorizing the input into 6 bins (which is reasonable to a human), **it is best for the machine to categorize it into 10 bins.** Machines are omnipotent. We should rely on them. So this test showed me that **for the ANN the 10 bins version is the best. We should use this one in the future (in the 1 dimensional discretization case)**. Note also that we have only 200 samples in the lookback window. So, **in the case of 10 bins, there are 20 samples in each bin** (it seems this is the minimum we need). No wonder that increasing the number of bins to 20 doesn’t work: beyond a point there are too many bins and too much randomness. This has consequences for the future. If we have 2 input dimensions and we discretize both to 10 bins per dimension, that will result in 100 bins **for the 2 dimensional space. Maybe** that will be too much. **Better to stick to 6 bins** in the 1 dim. case, **which will give 36 bins on the 2 dim. surface.**

– The 2 bins case has the lowest randomness. If stability (for the backtest) or a low nEnsembleMembers is important, use the 2 bins case.

– With this backtest, **we again set our record: this is our best strategy so far: 20% CAGR, 536% TR in 12 years. And the only input is the RUT index current day change. This is basically a strategy that learns whether the market is in MR (mean reversion) or FT (follow through) mode and acts accordingly. It should be better than simple DV2, DVB and other fixed rules.**
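The bin-count arithmetic from the conclusions (200 samples in the lookback window, split into equal sample size bins along each input dimension) can be checked with a few lines. The helper name `samples_per_bin` is hypothetical, and the sketch assumes the samples spread evenly over the grid cells, which is the best case in 2 dimensions:

```python
def samples_per_bin(n_samples, bins_per_dim, n_dims):
    """Average number of samples per cell when each of n_dims inputs
    is discretized into bins_per_dim bins."""
    total_bins = bins_per_dim ** n_dims
    return n_samples / total_bins

for bins_per_dim, n_dims in [(10, 1), (20, 1), (10, 2), (6, 2)]:
    avg = samples_per_bin(200, bins_per_dim, n_dims)
    print(f"{bins_per_dim} bins x {n_dims} dim(s): {avg:.1f} samples per bin")
```

With 10 bins in 1 dimension this gives the 20 samples per bin mentioned above. In 2 dimensions, 10 bins per dimension leave only 2 samples per cell, while 6 bins per dimension leave between 5 and 6, still well below the 20 that seemed to be the minimum.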
