“Deep Future” Crypto Predictions with HTM Classifiers

looking t+N steps into the future

Mark Cleverley
5 min read · Aug 24, 2020

Algorithms are threatening to eat day traders’ collective lunch.

Machine learning models are arguably more attentive to small fluctuations than humans.
The more those models trade, the more volatile the micro-changes become, increasing the potential for other models to pick up “machine generated” signals that could indicate coming booms or busts.

My first foray into price forecasting with Hierarchical Temporal Memory algorithms was what I would call cute. Trying to predict a high-volume, somewhat volatile cryptocurrency using a bit-array biological neural network with only 3200 cell-columns in the Spatial Pooler is like trying to surf a rogue wave with a pool noodle.

It made some halfway-decent predictions (green and blue lines vs red actual BTC price), but nothing worth gambling on.

There are plenty of avenues to make a net like this competitive. To start, give it more data than Bitcoin prices alone: a situation where only BTC is moving is quite different from one where several or all of the top 20 coins are moving (in any which way).

But there’s something else to consider for price forecasting:

If I gave you a mystical algorithm that could predict BTC price quite well, but it would only tell you “here is the price exactly 5 minutes from now” — would it really be that useful?

The price generally doesn’t change by much in 5 minutes, and 5 minutes is quite a short window. Unless you have deep pockets and a pre-configured trading bot hooked to your account, you probably wouldn’t get much out of it.

The answer is to look beyond the immediate future: in our case, look beyond “5 minutes from now”.

So let’s fix an embarrassingly large misunderstanding I made last time and simultaneously generate interesting “Deep Future” predictions for each line of data.

Fixing my prior misreading

When I worked through htm.core’s hotgym.py example, we were predicting “the gym’s power consumption one hour from now”, since our data was recorded every hour.

I wrote that steps=[1,5] in the SDRClassifier would govern two parallel channels of predictions for each power_consumption input: one looking 1 step into the past, one looking 5 steps behind.

However I was puzzled by my interpretation, since the ‘depth’ of a Temporal Memory system — its ability to look N time-steps into the past — should technically be determined by how many neurons it has in each cell column (here’s a superb explanation of why).

I looked into the C++ bindings for the SDRClassifier, and found that I had indeed misunderstood the code:

from htm.bindings.algorithms import Predictor

predictor = Predictor(steps=[1,5], alpha=0.1)
# will make 2 predictions: 1 and 5 steps into the future

Our Predictor replaces the SDRClassifier’s function here. What’s important to note is that steps=[1,5] configures the predictor to return 2 predictions: one 1 time-step into the future and one 5 time-steps beyond the present.
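
For context, here’s roughly how the Predictor slots into the per-row training loop, following the hotgym.py pattern. Treat it as a sketch: tm (the Temporal Memory), count (the row index), price (the current BTC price) and resolution (an assumed bucket width for converting prices to integer buckets) all come from the surrounding loop.

import numpy as np

pdf = predictor.infer(tm.getActiveCells())  # {1: [bucket probabilities], 5: [...]}
for n in (1, 5):
    if pdf[n]:
        predictions[n].append(np.argmax(pdf[n]) * resolution)  # most likely bucket -> price
    else:
        predictions[n].append(float('nan'))  # no prediction available yet
predictor.learn(count, tm.getActiveCells(), int(price / resolution))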

This also explains the strange alignment block whose purpose I couldn’t previously discern:

for n_steps, pred_list in predictions.items():  # for 1 or 5, list_of_price_predictions
    for x in range(n_steps):                    # for x in range(1) or range(5)
        pred_list.insert(0, float('nan'))       # insert nan at position 0
        pred_list.pop()                         # remove the last element to re-align

predictions = {1: [...], 5: [...]}
Our output from the training loop is a dictionary with keys 1 and 5, whose values are the lists of 1-step and 5-step predictions. This alignment loop shifts the 1-step predictions one index forward and the 5-step predictions five indices forward, so that each prediction lines up with the actual value it was forecasting and can be accurately measured and graphed.
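
A toy example (made-up numbers) makes the shift concrete:

# 1-step predictions: preds[i] was made AT time i, FOR time i+1
actual = [100.0, 102.0, 101.0]
preds = [101.5, 100.8, 103.0]

preds.insert(0, float('nan'))  # pad the front with one nan (n=1)
preds.pop()                    # drop the overhanging final prediction
# preds is now [nan, 101.5, 100.8]: preds[1] is the forecast made at t=0
# for t=1, so it sits directly under actual[1] for comparison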

Deep Future Loop

The best way to think of “Deep Future Predictions” via the SDRClassifier is quite simple.
If a Predictor can give you “the output 1 time-step from now (t+1)”, then it can also feed that output_t+1 back into the Temporal Memory circuit and get the output of that, which would be 2 steps from now (t+2).

This is a form of recursive prediction called “Temporal Unfolding”, which is:

  1. live-calculated (no batch post-processing needed; run it on every new row of data)
  2. increasingly unreliable with added depth

The accuracy of t+N predictions decreases with N, for much the same reason it’s harder to predict the weather farther into the future.
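
Conceptually, the unfolding loop looks like the sketch below. Note that predict_one_step is a hypothetical stand-in for a full encoder → Spatial Pooler → Temporal Memory → classifier pass; whether the SDRClassifier actually recurses this way internally is a question I come back to at the end.

def deep_future(current_input, depth, predict_one_step):
    """Temporal unfolding: feed each prediction back in as the next input."""
    trajectory = []
    x = current_input
    for _ in range(depth):
        x = predict_one_step(x)  # the t+1 forecast becomes the next "observation"
        trajectory.append(x)     # errors compound with every recursion
    return trajectory            # [t+1, t+2, ..., t+depth]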

With that said, I’ve updated the predictor and training loop to return 10 time-states ahead of the present, so we can check the predicted fluctuations over the next 50 minutes. I’ll detail the exact changes below. We start by modifying the Predictor instantiation to be programmatically configurable:

predictor = Predictor(steps=[1,5], alpha=0.1)
# becomes:
future_length = [i+1 for i in range(10)]  # ints, 1 to 10
predictor = Predictor(steps=future_length, alpha=0.1)

Then we change the output dictionary created before the training loop in the same fashion:

predictions = {1: [], 5:[]}
# becomes:
holders = [ [] for e in future_length ]  # empty lists matching future depth
predictions = dict(zip(future_length, holders)) # same format as original
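
(A dict comprehension builds the same structure in one line, if you prefer:)

predictions = {n: [] for n in future_length}  # {1: [], 2: [], ..., 10: []}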

Now for the loop that unpacks predictor.infer()’s resulting PDF into each step’s prediction:

for n in (1,5):
# becomes
for n in tuple(future_length):

The re-alignment loop is already programmatic, so we’ll just update the accuracy calculation:

accuracy = {1: 0, 5: 0}
# becomes
accuracy = dict(zip(future_length, [0 for e in future_length]))
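
The per-step error tally generalizes the same way. Here’s a sketch assuming the RMS-style error from the hotgym example, with inputs holding the actual prices collected in the training loop:

import math

samples = dict(zip(future_length, [0 for e in future_length]))

for idx, actual_price in enumerate(inputs):
    for n in predictions:                    # n = 1 .. 10
        val = predictions[n][idx]
        if not math.isnan(val):              # skip the nan padding from the alignment shift
            accuracy[n] += (actual_price - val) ** 2
            samples[n] += 1

for n in sorted(predictions):
    accuracy[n] = (accuracy[n] / samples[n]) ** 0.5  # RMS error at depth n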

Next Steps

Running the altered code on my notebook should give you 10-step deep predictions for each BTC price, every 5 minutes since 2017. I still wouldn’t stake your portfolio on this relatively tiny model, however.

The major caveat is that this substantially increases runtime compared to t+1-only predictions; it starts to feel more like a classic neural network in terms of calculation time.

I’m curious about the recursive logic behind the steps parameter: if you pass it steps=[5,10] versus steps=[1,10], theoretically the computational load should be the same, since it needs to calculate through to step 10 either way and would just record predictions at different steps in each configuration.
Of course, if the SDRClassifier doesn’t use temporal unfolding, the story is quite different. I’ll look more into the C++ base behind the classes in question.

So we fixed a huge misunderstanding on my part and successfully configured the HTM net to predict any N steps into the future. This will come in handy when we’re dealing with ambiguous outputs — of which cryptocurrency prediction is a fine example.

No network can (yet) give you “the price of BTC will be X exactly N minutes from now” with 99% accuracy. But as we’ll see next week, there may just be an HTM-based algorithm that can deliver efficient probability ranges that can leave you firmly in the green.
