### The number of neurons is a cardinal factor.

In this post, we step back and dissect how the NN works.

It is essential to understand the NN concept in order to see its advantages and limitations.

As a start, I don’t really like the name.

Instead of the term Artificial Neural Network (ANN), I prefer the less hyped term Computational Network (CN).

When the ANN was invented, people believed that it modeled the human nervous system.

However, our understanding of the human brain has advanced so much since then that it is naive to claim today that the ANN is a model of how the brain works.

One of the problems of ANN (CN) design is how we select the number of hidden layers (I prefer one, since the universal approximation theorem shows that a single-hidden-layer network can approximate any continuous function, and therefore the output of any N-layer network), and how we select the number of neurons in each layer.

In this post, we will set up a very simple function and forecast it with a one-dimensional (single binary input) NN predictor.

Let’s see the distribution of the next week’s average up/down direction as a function of the RUT’s percentage difference from its SMA(180).

The chart is based on the last 10 years of data.

The interpretation of this is:

– when the RUT is far below its SMA(180) (by -35%), its next week direction is Up with a probability of 80%-90%.

– when the RUT is more than 20% above its SMA(180), the probability that the next week is Up is less than 50% (about 40%).

This is the usual spring model, as expected. When the RUT moves too far from its equilibrium point, it swings back toward it.

Let’s see it with more precision:

in Matlab:

sum((p_indexNextBarChanges > 0) .* (p_indexFromLongMA > 0.2)) / sum(p_indexFromLongMA > 0.2) = 39%

The number of samples that fell into the >0.2 region, sum(p_indexFromLongMA > 0.2), is about 4% of the total samples.
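The same conditional frequency can be sketched in Python/NumPy. The arrays below are random stand-ins, since the original p_indexNextBarChanges and p_indexFromLongMA series are not available here; on the real RUT data the post reports ~39% and ~4%:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-ins for the post's series (assumption: real data unavailable).
# p_indexFromLongMA: RUT's fractional distance from its SMA(180)
# p_indexNextBarChanges: next week's change
p_indexFromLongMA = rng.normal(0.0, 0.12, 2500)
p_indexNextBarChanges = rng.normal(0.001, 0.03, 2500)

above = p_indexFromLongMA > 0.2
up = p_indexNextBarChanges > 0

# P(next week Up | RUT more than 20% above its SMA(180))
p_up_given_above = (up & above).sum() / above.sum()
print(p_up_given_above)   # ~39% on the post's real data
print(above.mean())       # fraction of samples in the >0.2 region (~4% in the post)
```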

That is acceptable, I think, so I expect this is not pure randomness.

The data shows the effect, there is a logical explanation, and the number of samples is sufficient.

This means, by the way, that we could build a strategy that shorts the RUT when it is more than 20% above its SMA(180).

A great strategy, except that we would very rarely be in the market, because this occurs only about 4% of the time (roughly 10 trading days per year, and probably those 10 days are consecutive).
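A quick sanity check of that figure, assuming roughly 252 trading days per year:

```python
# ~4% of roughly 252 trading days per year in the market
days_in_market = 0.04 * 252
print(round(days_in_market))  # about 10 days per year
```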

We create the input of the NN as follows:

nnInput(:, i) = p_indexFromLongMA(i + inputStartOffset) > 0.2;

This input is binary: true or false, 1 or 0.

It is 1 if the current-day RUT is more than 20% above its SMA(180), and 0 otherwise.

Let’s create the output as binary:

nnTarget(i) = sign(p_indexNextBarChanges(i + inputStartOffset));

This gives +1, if the next week is an Up week, and -1 otherwise.
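A hypothetical NumPy translation of these two Matlab lines (the series here are random stand-ins for the real RUT data; the names follow the post):

```python
import numpy as np

# Hypothetical stand-ins; in the post these come from 10 years of RUT data.
rng = np.random.default_rng(1)
p_indexFromLongMA = rng.normal(0.0, 0.12, 500)
p_indexNextBarChanges = rng.normal(0.001, 0.03, 500)
inputStartOffset = 0
n = len(p_indexFromLongMA)

# Binary input: 1 if the RUT is more than 20% above its SMA(180), else 0.
nnInput = (p_indexFromLongMA[inputStartOffset:inputStartOffset + n] > 0.2).astype(int)
# Binary target: +1 for an Up week, -1 otherwise.
nnTarget = np.where(p_indexNextBarChanges[inputStartOffset:inputStartOffset + n] > 0, 1, -1)
```

Note that Matlab's `sign` returns 0 for a zero change; the `np.where` above folds that case into -1, matching "-1 otherwise".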

So, what is the function that the ANN should learn?

Note that over the 10 years of inputs: if the input is 1, the output is +1 39% of the time and -1 61% of the time.

If the input is 0, the output is +1 55% of the time and -1 45% of the time.

This is a crucial feature of our input.

We have 2500 input samples. When the input is 1, there are many (1, -1) (input, target) pairs, but sometimes (39% of the time) there is a (1, +1) pair.

Poor NN. It struggles because of these semi-random samples.
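The best a least-squares regressor can do with such semi-random targets is to output the conditional expectation of the target, which follows directly from the frequencies above:

```python
# E[target | input] for the +1/-1 targets, using the post's frequencies
e_given_1 = 0.39 * (+1) + 0.61 * (-1)   # input = 1: P(up) = 39%
e_given_0 = 0.55 * (+1) + 0.45 * (-1)   # input = 0: P(up) = 55%
print(e_given_1, e_given_0)             # approximately -0.22 and +0.10
```

So an ideal fit would output about -0.22 when the input is 1 and about +0.10 when the input is 0.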

This is it:

Note that the target function is a discrete function, not continuous.

The red dotted line is not part of the function. I just drew it to show what I think could be the optimal continuous representation of it.

Let’s see the NN surface: how the NN predicts this for different numbers of neurons (nNeurons).

The numbers of neurons are: 1, 2, 3, 4, 5, 15.

I have circled in red the features that are not good.

So, as we see, increasing the number of neurons can be very, very bad for the ANN prediction.

The best nNeurons is the one-neuron case.

Summary: always start with a minimal number of neurons, and increase it progressively only as long as the predictive power improves, but no further.
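As a sketch of this procedure (not the post's Matlab code): the snippet below trains a tiny one-hidden-layer tanh network on synthetic data generated with the post's frequencies, grows the hidden layer step by step, and keeps the smallest size whose validation error is within a tolerance of the best seen:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with the post's conditional frequencies (assumption:
# the real RUT series is unavailable): input = 1 on ~4% of 2500 samples,
# P(+1 | input=1) = 39%, P(+1 | input=0) = 55%.
n = 2500
x = (rng.random(n) < 0.04).astype(float)
y = np.where(rng.random(n) < np.where(x == 1, 0.39, 0.55), 1.0, -1.0)
x_tr, y_tr, x_va, y_va = x[:2000], y[:2000], x[2000:], y[2000:]

def mlp_train(x, y, k, epochs=3000, lr=0.05):
    """One hidden tanh layer with k neurons; full-batch gradient descent on MSE."""
    w1 = rng.normal(0, 0.5, k); b1 = np.zeros(k)   # input -> hidden (1-D input)
    w2 = rng.normal(0, 0.5, k); b2 = 0.0           # hidden -> output
    for _ in range(epochs):
        h = np.tanh(np.outer(x, w1) + b1)          # (n, k) hidden activations
        err = h @ w2 + b2 - y                      # output-layer error
        gh = np.outer(err, w2) * (1 - h ** 2)      # backprop through tanh
        w2 -= lr * h.T @ err / len(x); b2 -= lr * err.mean()
        w1 -= lr * x @ gh / len(x);    b1 -= lr * gh.mean(axis=0)
    return lambda xs: np.tanh(np.outer(xs, w1) + b1) @ w2 + b2

val_mse = {}
for k in (1, 2, 3, 5, 15):
    pred = mlp_train(x_tr, y_tr, k)
    val_mse[k] = np.mean((pred(x_va) - y_va) ** 2)

# Keep the smallest k whose validation MSE is within 1% of the best seen.
best = min(val_mse.values())
chosen_k = min(k for k, e in val_mse.items() if e <= best * 1.01)
print(val_mse, chosen_k)
```

With a single binary input there are only two distinct network outputs to fit, so on this toy problem extra neurons buy nothing and can only add overfitting noise, which is the post's point.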

