Standalone or ensemble ANN?

17Sep10

Common sense says a group is better than a standalone expert. But what if the expert is very good…?
We ran two tests to determine the optimal number of training epochs: one for a standalone ANN and one for an ensemble of 20 ANNs. See the images here:
For the standalone case:

For the ensemble case:

Surprisingly, there seems to be little point in using the group policy. If the standalone experts are weak (ANNs trained for only 1 or 2 epochs), democratic voting is worth using. However, if the experts are already good, there is no point in aggregating their results.
We can even make things worse by aggregating. Every standalone expert gets stuck in its own local minimum while traversing the ANN weight surface, and the 20 ANNs can end up in different local minima. So, for example, one ANN expert is very good at predicting Fridays; he has specialized in them. Another expert is very good at predicting Mondays but mediocre on Fridays. A third is very good on Wednesdays but mediocre on Fridays. What happens when we aggregate them and ask for a Friday forecast? The Friday expert will be in the minority, and the other two can vote him out. (At least under a democratic policy; but if Karl Popper doesn’t notice our blog, we may get rid of this big democracy idea later.)
Conclusion:
– If the experts are specialized, take the forecast from the expert who is best at forecasting that day. This is not democratic voting, i.e. it does not average the votes with equal weights. It should be an adaptive meta-strategy that gives more weight to the ANN that performed best in the past.
– If you want a democratic, equal-weight voting system, then don’t specialize the members too much.
– An individual ANN achieved 14% CAGR after training for 5 epochs. The ensemble achieved almost the same, 15%, after training its members for only 2 epochs.
– Overall, at epoch 5, the ensemble was only slightly better (by 0.4% CAGR) than the best individual. (That is not the single best individual, but the average best individual. More about it later.) This suggests that there is no point in using the ensemble method for this task.
– It even carries a danger: when overtrained, the ensemble performed worse than the standalone expert ANN.
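To make the Friday example concrete, here is a minimal sketch of the two policies. It is not the code behind our tests: the three hit rates and forecasts are made-up numbers, and the function names and the weighting rule (each expert's past edge over a coin flip) are just one simple assumption of how an adaptive meta-strategy could look.

```python
def majority_vote(forecasts):
    """Democratic policy: every expert's +1 (up) / -1 (down) vote counts equally."""
    total = sum(forecasts)
    return 1 if total > 0 else -1 if total < 0 else 0


def weighted_vote(forecasts, weights):
    """Adaptive policy: each vote is scaled by a weight before summing."""
    total = sum(f * w for f, w in zip(forecasts, weights))
    return 1 if total > 0 else -1 if total < 0 else 0


# Hypothetical past directional hit rates on Fridays for three experts:
# the Friday specialist, the Monday specialist, the Wednesday specialist.
friday_hit_rates = [0.60, 0.50, 0.50]

# Hypothetical forecasts for next Friday: the specialist says up,
# the two experts that are mediocre on Fridays say down.
forecasts = [+1, -1, -1]

# Assumed weighting rule: past edge over a coin flip, never negative.
weights = [max(rate - 0.5, 0.0) for rate in friday_hit_rates]

democratic = majority_vote(forecasts)         # specialist is outvoted
adaptive = weighted_vote(forecasts, weights)  # specialist decides

print("democratic:", democratic, "adaptive:", adaptive)
```

With equal weights the Friday specialist loses 2 to 1; with weights based on past Friday performance, his vote is the only one that counts.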

However, compare the standard deviation charts.
For the ensemble, the D_stat average is a little better (51.6% vs. 51.4%) and the other averages are about the same, but the std. is about 3-4 times lower for all three statistics. Take the TR% std., for example. In the standalone case it is about 30%. That means it is quite likely that one run of the strategy differs from another by 30% in TR% performance (a deviation of more than 1 std. happens in about one third of cases). In the ensemble case, one run is likely to differ from another by only about 10% in TR%. The ensemble strategy backtest is much more reliable. Overall, we found that the std. of the ensemble is about 4 times lower than that of the standalone version (the CAGR% std. is 4% in the standalone case vs. 1% in the ensemble case). This alone is a significant reason to use ensemble backtests. In the future we will use the ensemble method instead of the standalone one, because only that gives a backtest we dare to trust.
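A roughly fourfold drop in std. is about what one would expect from averaging 20 independent results, since the std. of a mean of N independent values shrinks by a factor of sqrt(N), and sqrt(20) is about 4.47. The sketch below illustrates this with a toy simulation. It is a simplification, not our backtest: it treats an ensemble result as the plain average of 20 independent normally distributed member results, whereas the real members are trained on the same data and vote on forecasts. The 14% mean and 4% std. are the CAGR figures from this post; the function name, run count and seed are arbitrary.

```python
import math
import random
import statistics

N_MEMBERS = 20        # ensemble size used in the post
STANDALONE_STD = 4.0  # CAGR% std. of a standalone ANN, from the post
MEAN_CAGR = 14.0      # rough CAGR% level; only centres the random draws


def simulate_std(n_members, n_runs=2000, seed=0):
    """Std. across runs when each run averages n_members independent results."""
    rng = random.Random(seed)
    run_results = []
    for _ in range(n_runs):
        members = [rng.gauss(MEAN_CAGR, STANDALONE_STD) for _ in range(n_members)]
        run_results.append(statistics.fmean(members))
    return statistics.stdev(run_results)


theoretical_ratio = math.sqrt(N_MEMBERS)  # about 4.47
standalone_std = simulate_std(1)
ensemble_std = simulate_std(N_MEMBERS)
simulated_ratio = standalone_std / ensemble_std

print(f"theoretical ratio: {theoretical_ratio:.2f}")
print(f"simulated ratio:   {simulated_ratio:.2f}")
```

If the members were strongly correlated, the reduction would be smaller than sqrt(N), so the measured factor of about 4 suggests the 20 ANNs behave fairly independently in this respect.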
