### Clip the input outliers (not eliminate).

The **Neural Network FAQ** is an immense source of information. In its chapter about normalization, it notes that **removing outliers is not the only way to attack outlier contamination: clipping outliers to specific thresholds is also viable.** An obvious idea is to use the SD of the training set to determine the clipping threshold: **every value that lies further from the arithmetic mean than a multiple of the SD is clipped to that threshold.**

In a practical case, we use 200 samples in the training set. The arithmetic mean is always around zero. Suppose the SD of those 200 samples is 1.3%. Using thresholdMultiplier = 2.0 (and assuming the mean is zero), we clip every value below -2.6% up to -2.6%, and every value above 2.6% down to 2.6%.
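The clipping step described above can be sketched in a few lines. This is a minimal illustration, not the post's actual code; the function name and the use of the population SD are assumptions.

```python
from statistics import mean, pstdev

def clip_by_sd(values, threshold_multiplier=2.0):
    """Clip every value that lies further from the mean than
    threshold_multiplier * SD (population SD; an assumption).
    Illustrative sketch, not the original backtest code."""
    m = mean(values)
    sd = pstdev(values)
    lo = m - threshold_multiplier * sd
    hi = m + threshold_multiplier * sd
    return [min(max(v, lo), hi) for v in values]
```

With a mean of zero, an SD of 1.3%, and a multiplier of 2.0, this clips everything outside the ±2.6% band to the band edges; a multiplier of `float('inf')` leaves the inputs untouched.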

In our backtest, **we used the following multipliers: 0.5, 1, 2, 3, and Double.PositiveInfinity** (i.e. no clipping).

Note that **we use this outlier clipping only for the input. For the output, we have always used outlier elimination with a fixed 4% threshold.**

The **columns represent the number of neurons** used. We don't intend to attach too much importance to that dependence, so just **ignore it; the focus is on the clipping multiplier.**

ensembleMembers = 5;

nTestsPerCell = 5;

We ignore the PV table, because it is dominated by randomness (the variation of the next day's daily % change).

We regard dStat as the measure of choice in this test.

**Based on dStat, it seems that Double.PositiveInfinity is generally the winner.** Even when it is not the winner, it is not far from the winner.

**So, this kind of input normalization doesn’t work in our case.**

(We may later test the same kind of clipping for the output. For the output it definitely works in some cases; the proof is that output outlier elimination is essential, and it works.)


