Normalize data!

02Feb10

The dismal performance of the SP Close prediction in a previous post let me down.
A couple of suggestions pointed out that it is worth normalizing the data.

So far, the problem has been the overly wide swings in the forecast: after a closePrice of 1138 on the previous day,
the NN sometimes predicts a closePrice of 1299 for the next day.  That is +14%… Ouch.
That can never happen in real life. (Never say never. :))
I attributed these wide swings to the fact that my ClosePrice and volume data weren’t normalized.
They swing from 990 to 1600, as the SP500 did in the period from 2000 to 2009.

My normalization is simple:
normalizedClosePrices = (closes - minCloses) ./ (maxCloses - minCloses);
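
For readers who do not use MATLAB, the same min-max scaling can be sketched in plain Python. The function name and the sample prices are made up for illustration; only the formula comes from the line above.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no range to scale by.
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical closing prices, for illustration only (not the real data set).
closes = [990.0, 1138.0, 1300.0, 1600.0]
normalized_closes = min_max_normalize(closes)
```

Every value ends up between 0 and 1, so the close prices and the volume data can be put on the same scale before they are fed to the NN.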

The problem with this simple normalization is that the NN will never produce a forecast outside that range.
So, the NN cannot forecast the SP500 going above 1600.
But for now, we can live with this limitation.
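
A small Python sketch of why the range is a ceiling. It assumes the network output is bounded to [0, 1] (for example by a sigmoid output layer), which is my assumption here and not something stated above; the helper names are hypothetical.

```python
MIN_CLOSE, MAX_CLOSE = 990.0, 1600.0  # the range quoted in this post


def normalize(price, lo=MIN_CLOSE, hi=MAX_CLOSE):
    """Map a price into normalized units (0.0 at lo, 1.0 at hi)."""
    return (price - lo) / (hi - lo)


def denormalize(value, lo=MIN_CLOSE, hi=MAX_CLOSE):
    """Map a normalized value back to index points."""
    return value * (hi - lo) + lo


# Assumption: the NN output never exceeds 1.0, so this is the largest forecast.
largest_forecast = denormalize(1.0)

# A hypothetical close above the training range falls outside [0, 1].
above_range = normalize(1650.0)
```

Under that assumption the largest possible forecast is exactly 1600, and a real close of 1650 would correspond to a normalized value above 1 that the network could not output.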

The good news is that it worked.

After normalization, the forecasted values are better.
After a closePrice of 1138 yesterday, it usually predicts 1138.9.
And the max. absolute difference is -58 points (not 160). -58 points is still -5%, which is big,
but not as big as +14%. And this -58 point difference happened only once in 30 tests. The usual values are now in the range of ±1-2%.
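
The percentages quoted here can be rechecked with a few lines of Python. This is only a sketch: it assumes every deviation is measured from the same previous close of 1138, and the function name is made up.

```python
def pct_change(previous_close, forecast):
    """Percentage move from the previous close to the forecast."""
    return (forecast - previous_close) / previous_close * 100.0


PREVIOUS_CLOSE = 1138.0

before_normalization = pct_change(PREVIOUS_CLOSE, 1299.0)          # roughly +14%
worst_after = pct_change(PREVIOUS_CLOSE, PREVIOUS_CLOSE - 58.0)    # roughly -5%
typical_after = pct_change(PREVIOUS_CLOSE, 1138.9)                 # under 0.1%
```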

That change also calmed the wide swings in performance.
The results don’t vary so much.

1. With these parameters:
nNeurons:40. nDaysInTest:260. 20 tests were performed
The WinLoseRatio:
53.08%    49.23%    48.46%    46.15%    46.92%    51.92%    53.46%    50.00%    51.15%    50.77%    50.38%    53.85%    50.00%    53.85%    46.92%    46.92%    44.62%    50.00%    52.31%    46.15%
The average: 49.81%. StDev: 2.82%.

The chart of these values shows no information:

The gain in that period (260 days):
63.60%    -33.70%    -5.10%    -26.30%    -14.20%    12.10%    53.70%    -1.60%    13.80%    15.80%    5.00%    9.20%    48.20%    41.90%    -16.20%    -13.70%    -25.20%    19.00%    7.60%    -5.20%
The average: 7.44%. StDev: 27.33%.

2. With 200 neurons:
nNeurons:200. nDaysInTest:260, 9 tests were performed.
WinLoseRatio:
53.08%    47.31%    52.69%    50.38%    53.08%    51.92%    50.00%    50.00%    43.85%
The average: 50.26%. StDev: 3.05%.

The gain in that period (260 days):
20.80%    7.80%    29.10%    43.30%    -4.30%    2.30%    18.70%    -22.20%    -21.80%
The average: 8.19%. StDev: 22.23%.
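
As a cross-check, the averages and standard deviations of the 200-neuron run can be recomputed from the nine values listed above. A Python sketch: the reported figures are reproduced when the sample (n-1) standard deviation is used; the variable names are mine.

```python
import statistics

# Values copied from the 200-neuron run above (percent).
win_ratio_200 = [53.08, 47.31, 52.69, 50.38, 53.08, 51.92, 50.00, 50.00, 43.85]
gain_200 = [20.80, 7.80, 29.10, 43.30, -4.30, 2.30, 18.70, -22.20, -21.80]


def summarize(values):
    """Return (mean, sample standard deviation) of a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


win_mean, win_std = summarize(win_ratio_200)
gain_mean, gain_std = summarize(gain_200)
```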

So, clearly, neither of them has predictive power.
But their results are more consistent (less variance). There is less room for pure randomness.
That is great news for future backtesting and stability.

Also note that the nNeurons = 200 case was no better than nNeurons = 40.
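
One way to make "no predictive power" concrete is to compare the spread of the WinLoseRatio values with what pure coin flipping would give. The sketch below assumes each daily up/down call is an independent fair coin flip, which is an assumption of mine and not something tested in this post.

```python
import math


def coin_flip_std(n_days, p=0.5):
    """Std. deviation of the win ratio if every day were an independent coin flip."""
    return math.sqrt(p * (1.0 - p) / n_days)


# 260 test days, as in both experiments above; result expressed in percent.
baseline_std_pct = coin_flip_std(260) * 100.0
```

Under that assumption the baseline is about 3.1%, which is close to the observed StDev of 2.82% and 3.05%, with averages near 50% in both runs.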
