Saturday, October 21, 2017

Systematic Trading | Using Autoencoder for Momentum Trading

In a previous post, we discussed the basic nature of various technical indicators and noted some observations. One of the ideas was: at a basic level, most indicators capture the concept of momentum vs mean-reversion. Most do so in the price returns space, but some in a non-linear transformation of the returns space, like signed returns or time since a new high/low. We presented the idea of a PCA approach to extract the momentum signal embedded in these indicators. To go from there to a trading model, the steps are to combine this momentum signal (the first PCA component, or higher components if required) with other input variables (like returns volatility and/or other fundamental indicators) and train a separate regression/classification model (like a random forest or a deep NN).

One of the issues with using simple PCA is that it is linear, and hence may not be appropriate for summarizing the different measures captured across all these indicators. Here we discuss the next logical improvement - a nonlinear dimensionality reduction approach using an autoencoder.

As discussed here, the new Keras R interface has made it very easy to develop deep learning models in R using the TensorFlow framework. Here we use this interface to train an autoencoder to fit the same set of technical indicators on the NSE Nifty 50 Index as before. The steps involved are relatively straightforward. First we generate and standardize the inputs (the technical indicator levels). Then we build the computation graph.
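As a minimal sketch of this first step (assuming an indicators matrix of indicator levels and a matching dates vector, which are not from the original post), the standardization would look something like this:

```r
# hypothetical inputs: 'indicators' is a numeric matrix of indicator levels
# (rows = dates, columns = indicators), 'dates' is the matching date vector
x_all <- as.matrix(indicators)
train <- which(dates <  as.Date("2014-01-01"))
test  <- which(dates >= as.Date("2014-01-01"))

# standardize with training-period means and standard deviations only,
# so no look-ahead information leaks into the test period
mu      <- apply(x_all[train, ], 2, mean)
sigma   <- apply(x_all[train, ], 2, sd)
x_train <- scale(x_all[train, ], center = mu, scale = sigma)
x_test  <- scale(x_all[test, ],  center = mu, scale = sigma)
```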

To do so, we first define the encoding layers (two hidden layers, with a latent code of size 3 to match the first three PCA components we use for comparison) and two separate decoding layers. The decoding layers are defined separately so that we can train the full autoencoder and also run the decoding step independently.
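A minimal sketch of these layer definitions with the keras R package is below; the hidden-layer sizes and activations are illustrative assumptions, only the latent size of 3 comes from the text.

```r
library(keras)

input_dim  <- ncol(x_train)   # number of technical indicators
latent_dim <- 3               # matches the first 3 PCA components

# encoding layers: two hidden layers feeding a 3-unit latent code
enc_input <- layer_input(shape = input_dim)
enc_code  <- enc_input %>%
  layer_dense(units = 16, activation = "tanh") %>%
  layer_dense(units = 8,  activation = "tanh") %>%
  layer_dense(units = latent_dim, activation = "linear")

# decoding layers, created as standalone layer objects so the same
# weights can be wired both into the full autoencoder and into a
# separate decoder-only model
dec_hidden <- layer_dense(units = 8, activation = "tanh")
dec_output <- layer_dense(units = input_dim, activation = "linear")
```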


Next we combine these layers to create three models on the computational graph: one for the encoder only, another for the decoder, and a third for the end-to-end autoencoder, which is the one we actually train.
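Continuing the sketch above, the three models could be wired up roughly as follows:

```r
# end-to-end autoencoder: the graph that actually gets trained
ae_output   <- enc_code %>% dec_hidden() %>% dec_output()
autoencoder <- keras_model(inputs = enc_input, outputs = ae_output)

# encoder-only model: indicator levels -> 3-dimensional latent code
encoder <- keras_model(inputs = enc_input, outputs = enc_code)

# decoder-only model: the same decoding layers applied to a fresh latent input
dec_input <- layer_input(shape = latent_dim)
decoder   <- keras_model(inputs  = dec_input,
                         outputs = dec_input %>% dec_hidden() %>% dec_output())
```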

The rest of it is standard. We define a mean squared error loss between the input and the reconstructed output and train the model. Training is done on data up to 2013, and the test set runs from 2014 to the present. Once training is done, we can use the encoder and decoder separately to generate a dimensionality reduction of the input space and to map back from the latent space.
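The training and the subsequent use of the encoder and decoder might look roughly like this (the optimizer, epochs and batch size are illustrative choices, not the original settings):

```r
# mean squared reconstruction error; the target is the input itself
autoencoder %>% compile(optimizer = "adam", loss = "mean_squared_error")

autoencoder %>% fit(
  x = x_train, y = x_train,
  epochs = 100, batch_size = 32,
  validation_data = list(x_test, x_test),
  verbose = 0
)

# dimensionality reduction of the indicator space over the test period
codes <- predict(encoder, x_test)

# and the reverse map from the latent space back to the indicator space
recon <- predict(decoder, codes)
```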

The output of the dimensionality reduction is compared with the PCA. As the correlations show, the PCA components map almost one-to-one to the three latent dimensions of the encoding layer. So in our case the encoded dimensions are close to orthogonal, although this need not be true in general.

        V1     V2     V3
PC1    1.0   -0.3    0.2
PC2    0.1   -0.2    0.8
PC3   -0.2   -0.9    0.5
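The comparison itself can be reproduced with a few lines of base R (assuming the x_train, x_test and codes objects from the sketches above):

```r
# PCA fitted on the training period, scores computed over the test period
pca        <- prcomp(x_train)
pca_scores <- predict(pca, x_test)[, 1:3]

# correlation of the first three PCA components with the latent dimensions
round(cor(pca_scores, codes), 1)

# pairwise scatter plots of PCA components vs latent dimensions
pairs(cbind(pca_scores, codes))
```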

The scatter plot below captures the same picture, but also highlights some non-linearity, especially between the first PCA component and the first latent dimension from the autoencoder.


From here the next step is obvious: replace the PCA factor inputs in the momentum trading model described in the first paragraph with these latent dimensions from the autoencoder and re-evaluate. This captures a richer set of inputs that can handle non-linearity and will hopefully perform better than linear PCA. Here are some results that others have reported (opens PDF). Here are some more (opens PDF) on using an autoencoder for cross-sectional momentum trading. The entire code is available here.