Open/High/Low prices are back in the input. Use SPY instead of SPX


Previously, when I realized that the Open/High/Low prices of the SPX are unreliable, I reverted to using only the daily close price and volume in the input.
However, if we want the NN to find patterns in candlestick charts, we have to give it the candlestick inputs as well: the open/high/low prices.
Because that is not possible with SPX, I decided to use SPY from now on.
However, care has to be taken.
SPY is an exchange-traded fund that trades like a stock, and occasionally it pays dividends. Luckily, it had no splits in the period.
Therefore, instead of the raw open/high/low/close prices, we have to use the dividend-adjusted open/high/low/close prices.
The volume would have to be split-adjusted if there had been any splits, but luckily SPY didn’t have any.
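
The dividend adjustment can be sketched as follows. This is only an illustration, assuming the data source supplies a raw close and an adjusted close for every day, so that the same scaling factor can be applied to open/high/low. The `Bar` class, the `adjust_bar` helper and the sample prices are made up for the example; they are not the code used for the tests.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    adj_close: float  # dividend-adjusted close from the data source (assumption)
    volume: float


def adjust_bar(bar: Bar) -> Bar:
    """Scale open/high/low by the factor that maps close to adj_close."""
    factor = bar.adj_close / bar.close
    return Bar(
        open=bar.open * factor,
        high=bar.high * factor,
        low=bar.low * factor,
        close=bar.adj_close,
        adj_close=bar.adj_close,
        volume=bar.volume,  # untouched: no splits in the period
    )


# Hypothetical day, not real SPY data.
raw = Bar(open=100.0, high=102.0, low=99.0, close=101.0,
          adj_close=100.495, volume=1_000_000)
adj = adjust_bar(raw)
print(adj)
```

Because all four prices of a day are multiplied by the same factor, the shape of the candlestick (the ratios between open, high, low and close) stays the same.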

I ran a couple of tests over the weekend.
Each of them took hours to run.
Let’s see the results.

The input contains adjusted SPY open/high/low/close and volume data, per day.
If the input contains 2 days, the input vector has 2*5 = 10 dimensions.
If the input contains 5 days, the input vector has 5*5 = 25 dimensions.
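
A minimal sketch of how such an input vector could be assembled, assuming each day is stored as a row of 5 numbers (open, high, low, close, volume) and the rows are simply concatenated. The `build_input` function and the dummy data are hypothetical, and the ordering of values inside the vector is my assumption.

```python
from typing import List, Sequence

FEATURES_PER_DAY = 5  # open, high, low, close, volume


def build_input(days: Sequence[Sequence[float]], end: int, lookback: int) -> List[float]:
    """Flatten the `lookback` daily rows ending at index `end` (inclusive)."""
    start = end - lookback + 1
    if start < 0:
        raise ValueError("not enough history for the requested lookback")
    window = days[start:end + 1]
    return [value for row in window for value in row]


# Six dummy days; every row has FEATURES_PER_DAY values.
days = [[float(d)] * FEATURES_PER_DAY for d in range(6)]

two_day_input = build_input(days, end=5, lookback=2)
five_day_input = build_input(days, end=5, lookback=5)
print(len(two_day_input), len(five_day_input))  # 10 25
```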
The date of inspection is 2010-02-10.
So when we say 5days test, it means the performance was measured only over the last 5 days (2010-02-03 to 2010-02-10, excluding the weekend).

2days input
5days test
After 310 tests:
winLoseRatios Arithmetic Mean: 64.58%, stdev: 21.19%
avgDailyGainPercents Arithmetic Mean: 0.13%, stdev: 0.37%. The would-be CAGR is: 38.90%

5days input
5days test
After 1095 tests:
winLoseRatios Arithmetic Mean: 63.34%, stdev: 21.26%
avgDailyGainPercents Arithmetic Mean: 0.12%, stdev: 0.35%. The would-be CAGR is: 36.50%

2days input
260days test
After 218 tests:
winLoseRatios Arithmetic Mean: 51.44%, stdev: 3.05%
avgDailyGainPercents Arithmetic Mean: 0.04%, stdev: 0.09%. The would-be CAGR is: 9.53%

5days input
260days test
After 99 tests:
winLoseRatios Arithmetic Mean: 50.37%, stdev: 2.84%
avgDailyGainPercents Arithmetic Mean: 0.01%, stdev: 0.09%. The would-be CAGR is: 3.02%
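
For reference, statistics like the ones above could be computed along these lines. This is a sketch under assumptions: the would-be CAGR is taken to be the average daily gain compounded over 260 trading days, and the stdev is the sample standard deviation. The post does not state either convention, and the function names and sample numbers are made up.

```python
import statistics
from typing import Dict, Sequence

TRADING_DAYS_PER_YEAR = 260  # assumption: the convention is not stated in the post


def would_be_cagr_pct(avg_daily_gain_pct: float) -> float:
    """Compound an average daily gain (in percent) over one trading year."""
    daily = avg_daily_gain_pct / 100.0
    return ((1.0 + daily) ** TRADING_DAYS_PER_YEAR - 1.0) * 100.0


def summarize(win_ratios_pct: Sequence[float],
              daily_gains_pct: Sequence[float]) -> Dict[str, float]:
    """Aggregate per-run results; each sequence has one entry per repeated test."""
    mean_gain = statistics.mean(daily_gains_pct)
    return {
        "win_mean": statistics.mean(win_ratios_pct),
        "win_stdev": statistics.stdev(win_ratios_pct),  # sample stdev (assumption)
        "gain_mean": mean_gain,
        "gain_stdev": statistics.stdev(daily_gains_pct),
        "cagr": would_be_cagr_pct(mean_gain),
    }


# Made-up results of four repeated runs, not the real test output.
summary = summarize([60.0, 40.0, 80.0, 60.0], [0.10, -0.05, 0.30, 0.05])
print(summary)
```

Feeding the rounded daily means printed above into `would_be_cagr_pct` will not reproduce the reported CAGRs exactly, because a rounding error in the daily figure is amplified by a year of compounding.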

CAGRs in table form:

               5days test    260days test
2days input    38.90%        9.53%
5days input    36.50%        3.02%

The 260days result is not so good. Using 5 days of input (CAGR: 3.02%) is worse than using 2 days of input (CAGR: 9.53%).
So using 5 days seems to introduce more (random) noise into the estimate.
Should I test it with 1 day of input?
We can try.

1. Testing only for the last 5 days
(2010-02-03 to 2010-02-10, excluding the weekend),
the winLoseRatios Arithmetic Mean was 64.58%. That is great.
We seemed to have an edge in forecasting the market direction in the last 5 days.
That seems to be good news.

However, we have no prediction power over the whole year.
Note that the 260 days period runs from 2009-02 to 2010-02, so it contains the worst of the 2009 stock market crash.
We found the direction only 51% of the time. Not significant.
Bad news.
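
To put "not significant" into numbers, here is a rough check. It treats one test as a series of independent coin flips, which is my simplifying assumption and not something the tests themselves establish. Under pure chance the win ratio of an n-day test has a standard deviation of sqrt(0.25 / n), and the z-score says how far a measured win ratio is from 50% in those units. The helper names are made up.

```python
import math


def chance_stdev_pct(n_days: int) -> float:
    """Stdev (in percent) of the win ratio of n fair coin flips."""
    return 100.0 * math.sqrt(0.25 / n_days)


def z_score(win_ratio_pct: float, n_days: int) -> float:
    """Distance of a single test's win ratio from 50%, in chance stdevs."""
    return (win_ratio_pct - 50.0) / chance_stdev_pct(n_days)


print(round(chance_stdev_pct(260), 2))  # 3.1
print(round(chance_stdev_pct(5), 2))    # 22.36
print(round(z_score(51.44, 260), 2))    # 0.46
```

The chance-level stdevs (about 3.10% for 260 days and 22.36% for 5 days) are close to the measured ones (3.05% and 21.19%), and a z-score below 0.5 for a single 260 days test is well within what coin flipping would produce. The repeated tests run on the same price data, so they are not independent samples, and averaging many of them does not make the result more convincing.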

Using 5 days of data (instead of 2 days) in the input doesn’t improve the prediction.

Overall, there is no significant prediction power.
I think we have to dig seriously deeper into Neural Network theory.
We want to compete here with researchers who have studied the topic for decades.
And we only started on this topic 2 weeks ago.
There is more to learn. I am going back to reading publications and theses about NNs in stock price prediction to get some ideas about why we cannot reproduce good predictions.

