Ordinary Least Squares (OLS) regression is a statistical method used to estimate the relationship between one or more independent variables and a dependent variable. It has wide applications within finance and trading and can be used as a method of producing alpha; in this post I'll go over one such method. I'll likely brush over a lot of detail and context when explaining regression, so I would recommend reading up on it, as I want to keep this post brief. As with most of my posts this is for demonstrative purposes only, so some key assumptions around slippage, transaction costs etc. have been ignored. The dataset used is BTC and ETH time series data in 4 hour intervals ranging from 1 January 2017 to 20 June 2020. On to the post…
Now I have my dataset all ready to model. As I only have two assets in my dataset I am using a bivariate model, that is, a model in which there is only one independent variable (x) predicting one dependent variable (y); this can be extended to a multivariate model by adding more independent variables. For my dataset containing BTC and ETH I chose ETH as my x and BTC as my y. I then set a lookback length of the last 50 periods to fit my model. This step means I will only be able to use past data to estimate my parameters, which avoids data snooping and relying on 'future' data. Fitting the model will produce output like so:
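To make the fitting step concrete, here is a minimal Python sketch of a single fit on the last 50 periods. It assumes numpy and uses synthetic stand-in prices rather than the actual BTC/ETH data; the function name fit_window is made up for illustration, and what it prints is not the model output referred to above.

```python
import numpy as np

LOOKBACK = 50  # number of past periods used for the fit (value taken from the post)

def fit_window(x, y, lookback=LOOKBACK):
    """Fit y = alpha + beta * x on the most recent `lookback` observations."""
    x_win = np.asarray(x, dtype=float)[-lookback:]
    y_win = np.asarray(y, dtype=float)[-lookback:]
    # A degree-1 polynomial fit is an OLS line; polyfit returns [slope, intercept].
    beta, alpha = np.polyfit(x_win, y_win, 1)
    return alpha, beta

# Synthetic stand-in series, NOT real market data.
rng = np.random.default_rng(0)
eth = 200 + np.cumsum(rng.normal(0, 1, 300))    # plays the role of x
btc = 1000 + 30 * eth + rng.normal(0, 5, 300)   # plays the role of y

alpha, beta = fit_window(eth, btc)
print(f"alpha={alpha:.2f}, beta={beta:.2f}")
```

Only the last 50 rows of each series enter the fit, so nothing later than the current bar is used to estimate the parameters.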
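Without going into too much theory, this model is trying to fit your data in the form y = alpha + x.beta + error, where alpha is the y intercept, beta is the slope and error is the random error term: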
By ignoring the y intercept and random error term we can model our data in the form y = x.beta, and rearranging that we can say y – x.beta = 0. This means we only have to worry about finding the beta term (which I've circled in red) for our dataset and then we can (hopefully) model our data as a mean reverting process. In code this would look like:
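A minimal Python sketch of this step, under stated assumptions: numpy is used, the beta for each bar is the slope of an OLS line (fit with an intercept) over the 50 bars before it, and the intercept is then dropped when forming the spread. The function name rolling_ols_spread and the synthetic input series are illustrative only, and the original implementation may differ.

```python
import numpy as np

def rolling_ols_spread(x, y, lookback=50):
    """Return (beta, spread) where spread[t] = y[t] - beta[t] * x[t].

    beta[t] is estimated only from the `lookback` bars before bar t,
    so no future data leaks into the estimate. The first `lookback`
    entries are NaN because there is not enough history yet.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.full(len(x), np.nan)
    for t in range(lookback, len(x)):
        # polyfit with degree 1 returns [slope, intercept]; keep the slope.
        slope, _intercept = np.polyfit(x[t - lookback:t], y[t - lookback:t], 1)
        beta[t] = slope
    spread = y - beta * x
    return beta, spread

# Synthetic stand-in series, NOT real BTC/ETH prices.
rng = np.random.default_rng(1)
eth = 200 + np.cumsum(rng.normal(0, 1, 500))
btc = 30 * eth + rng.normal(0, 50, 500)

beta, OLS_spread = rolling_ols_spread(eth, btc)
print(OLS_spread[-5:])
```

Because the intercept is discarded, the spread is not forced to centre exactly on zero. If the model is instead fit through the origin, the slope on each window is sum(x*y) / sum(x*x).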
I created a vector labelled 'OLS_spread' which contains our mean reverting process, y – x.beta. If we plot it we can see we've generated a process that reverts around zero, which we can now use as a source of alpha.
Using the OLS spread I simply set entry signals at -200 and +200. I would go long the spread (Buy ETH, sell BTC) when OLS_spread < -200 and would short the spread (Sell ETH, Buy BTC) when OLS_spread > 200. This could be further optimised, but generally speaking the higher the entry threshold the fewer the trades and the higher the success rate, and the lower the threshold the more trades and the lower the success rate.
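The entry rule can be sketched in Python as below, assuming numpy and a spread series that is already computed. The function name entry_signals and the +1 / -1 / 0 encoding are choices made for this illustration. No exit rule is described, so none is modelled, and the sketch stops at a position in the spread rather than mapping it to the individual BTC and ETH legs.

```python
import numpy as np

ENTRY_THRESHOLD = 200.0  # entry level from the post

def entry_signals(spread, threshold=ENTRY_THRESHOLD):
    """Return +1 (long the spread), -1 (short the spread) or 0 (no entry) per bar."""
    spread = np.asarray(spread, dtype=float)
    signals = np.zeros(spread.shape, dtype=int)
    signals[spread < -threshold] = 1   # spread far below zero: go long the spread
    signals[spread > threshold] = -1   # spread far above zero: go short the spread
    return signals                     # NaN bars compare False, so they stay at 0

example = np.array([np.nan, -250.0, -50.0, 0.0, 199.0, 260.0])
print(entry_signals(example).tolist())  # [0, 1, 0, 0, 0, -1]
```

Raising the threshold argument leaves fewer bars that qualify as entries, which is the trade-off between trade count and selectivity mentioned above.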
Onto results. I have two sets of results, one containing the full sample set and one containing the final year of results, as I found data from 2017 made BTC and ETH performance stats a bit wild. Note my dataset only goes to 20 June and as such the final year performance stats are also quite unforgiving on BTC and ETH, but nonetheless the regression strategy performs well in almost every market condition it faced.
and visualisations of returns over time:
Thanks for reading. Hopefully this was useful and made some sense; I'll try to clarify any questions in the comments.
Submitted October 10, 2020 at 06:34PM by Tacoslim