### Match input range across dimensions (GRNN)

We are already well aware of the importance of normalizing the output. (See previous posts.) However, we trusted the Matlab FFNN (FeedForward Neural Network) to use the default ‘mapminmax’ input preprocessing to map the input to -1..+1, so we thought it doesn’t really matter how we select our input range, because it will be mapped later anyway. The story is different for **GRNN (Generalized Regression Neural Network), which doesn’t do automatic input normalization.** **We think this is a flaw in the Matlab NN Toolbox design.** Why do some NN implementations use default input normalization and others don’t? To confuse the users?

Testing the new input, the currDayChange, in backtests, we learned another thing about the range of inputs: in the GRNN case, it does matter if the range is very different across dimensions. In our first test, the first input dimension, **the dayOfTheWeek, had a range of [1..5]; the second input dimension, the currDayChange, had a range of about [-0.05..+0.05]. The difference is about 100x.** Guess what happened. **The GRNN that in theory used both the dayOfTheWeek and the currDayChange inputs completely ignored the currDayChange: the version with the extra currDayChange input gives exactly the same prediction as the one without it.**

It completely neglected the new input with its tiny range. We were surprised, but in hindsight it is perfectly understandable. **The GRNN works as an RBF network: it stores the input samples in the network, optimizes (learns) the r radius for every stored sample, and forms spheres in the hyperspace. However, the radius of the sphere is the same across all the input dimensions.** (It is a hyper-sphere, not a hyper-ellipsoid.) So, **in the GRNN case, it is very important that the input dimensions have the same (or similar) range.**
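The effect is easy to reproduce. Below is a minimal NumPy sketch (not the Matlab toolbox code): a GRNN prediction is a Gaussian-kernel weighted average with a single shared radius, so when one dimension’s range is ~100x smaller, the squared distances — and hence the weights — are dominated by the wide dimension. The `grnn_predict` helper, the random sample data, and the `sigma` value are our own illustrative choices, not from the original experiment.

```python
import numpy as np

def grnn_predict(X_train, y_train, x, sigma):
    """GRNN / kernel-regression prediction with one shared radius
    (sigma) across ALL input dimensions -- a hyper-sphere, not an
    ellipsoid."""
    d2 = np.sum((X_train - x) ** 2, axis=1)   # squared Euclidean distance
    w = np.exp(-d2 / (2.0 * sigma ** 2))      # Gaussian kernel weights
    return np.sum(w * y_train) / np.sum(w)

rng = np.random.default_rng(0)
day = rng.integers(1, 6, size=200).astype(float)  # range [1..5]
chg = rng.normal(0.0, 0.02, size=200)             # range ~[-0.05..+0.05]
y = rng.normal(0.0, 0.01, size=200)               # dummy target

X2 = np.column_stack([day, chg])  # both inputs
X1 = day[:, None]                 # dayOfTheWeek only

p2 = grnn_predict(X2, y, np.array([3.0, 0.04]), sigma=1.0)
p1 = grnn_predict(X1, y, np.array([3.0]), sigma=1.0)
print(abs(p2 - p1))  # tiny: the 0.05-range input barely moves the prediction
```

With the raw ranges, the two predictions are practically identical — the tiny-range input is effectively invisible to the shared-radius kernel.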

And in the FeedForward ANN case, it doesn’t hurt either.

As a start, we used this normalization function to map the maximum absolute value of the input to 1 (or, with the /2 in the multiplier as in the code below, to 2):

```matlab
function [nnInput, multiplier] = NormalizeInput(nnInput, inputVectorDim, isUseCurrBarChanges, isDirectionCurrBarChange)
    indexCurrBarChangesMin = min(nnInput(inputVectorDim, :));
    indexCurrBarChangesMax = max(nnInput(inputVectorDim, :));
    indexCurrBarChangesAbsMinMax = max(abs(indexCurrBarChangesMin), abs(indexCurrBarChangesMax));
    % dividing by this maps the extreme value of the dimension to +-2
    multiplier = indexCurrBarChangesAbsMinMax / 2;

    if (isUseCurrBarChanges)
        if (isDirectionCurrBarChange)
            % keep only the sign (direction) of the change
            nnInput(inputVectorDim, :) = sign(nnInput(inputVectorDim, :));
        else
            nnInput(inputVectorDim, :) = nnInput(inputVectorDim, :) / multiplier;
        end
    end
end
```
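For readers without Matlab, here is a NumPy sketch of the same max-abs idea (our own translation, not the code we run in production): take the largest absolute value of the chosen dimension, halve it, and divide the dimension by that, so the extremes land on ±2.

```python
import numpy as np

def normalize_input(nn_input, dim, use_curr_bar_changes=True,
                    direction_only=False):
    """Max-abs normalization of one input dimension; nn_input is
    (dims, samples), mirroring the Matlab NormalizeInput layout."""
    abs_min_max = np.max(np.abs(nn_input[dim, :]))
    multiplier = abs_min_max / 2.0  # dividing by this maps the extreme to +-2
    out = nn_input.copy()
    if use_curr_bar_changes:
        if direction_only:
            out[dim, :] = np.sign(out[dim, :])   # keep only the direction
        else:
            out[dim, :] = out[dim, :] / multiplier
    return out, multiplier

x = np.array([[1, 2, 3, 4, 5],
              [-0.05, 0.01, 0.0, -0.02, 0.05]], dtype=float)
out, m = normalize_input(x, dim=1)
print(out[1])  # extremes of the currDayChange row now map to -2 and +2
```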

Later we thought that using **a fixed 2x multiplier is awkward: it doesn’t adapt to different regimes, for example different volatility environments. So, we introduced the stdev as a multiplier.** **Two standard deviations away from the mean account for roughly 95 percent of the samples.** When we map 2x stdev to 1, only about 5% of the currDayChange samples fall outside the range [-1..1]. As we use a rolling window of about 200 samples, that is only about 10 samples out of 200 in our case. And this solution is adaptive.

```matlab
indexCurrBarChangesStdDev = std(nnInput(inputVectorDim, :));
multiplier = indexCurrBarChangesStdDev * 2;   % 2x stdev is mapped to 1
```
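A quick NumPy sanity check of the 2x-stdev claim (our own sketch with synthetic Gaussian data, not the real currDayChange series): dividing by twice the standard deviation should leave only about 5% of a ~200-sample window outside [-1..1].

```python
import numpy as np

rng = np.random.default_rng(1)
# synthetic stand-in for a rolling window of ~200 currDayChange samples
curr_day_change = rng.normal(0.0, 0.01, size=200)

multiplier = 2.0 * np.std(curr_day_change)   # 2x stdev as the adaptive divisor
normalized = curr_day_change / multiplier

outside = np.mean(np.abs(normalized) > 1.0)
print(outside)  # roughly 0.05: only ~5% of samples fall outside [-1..1]
```

Unlike the fixed max-abs multiplier, this divisor is recomputed from the window, so it tracks changing volatility regimes.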

The result, for the GRNN that extends the 1-dimensional dayOfTheWeek input with currDayChange:

- using the raw [-0.05..0.05] currDayChange range: D_stat: 51.19%, projectedCAGR: -5.89%, TR: -67.41%
- using the currDayChange range with 2x stdev mapped to [-1..1]: D_stat: 51.21%, projectedCAGR: 0.83%, TR: -26.62%

Yes, the TR is still negative. We know that the GRNN doesn’t really work for this task. But notice that the **projectedCAGR has turned from roughly -6% to +1%**. Overall this change is an improvement.

We think that **it doesn’t help much in the FF ANN case (because of the default mapminmax), but as it doesn’t hurt either, in the future we will always normalize all the input dimensions to the [-1..+1] range.**
