Short Term Directional Equity Forecasting with SVMs and R

This post outlines a framework for forecasting short term (i.e. daily) directional movements of equity prices. The method relies on support vector machines and treats the system like a Markov chain. Historical data is downloaded automatically given ticker symbols and a date range.

This is not investment advice or a recommendation of an investment strategy; it is provided for educational purposes only. The following code comes with no warranties whatsoever.

The code, which can be found in its entirety on GitHub, attempts to model the directional movement (i.e. above or below the previous close) of the closing price of a stock on the following variables:

  • return of the equity at lag = 1
  • return of the equity at lag = 3
  • return of SPY at lag = 1
  • return of SPY at lag = 3
  • return of QQQ at lag = 1
  • return of QQQ at lag = 3
  • return of UVXY at lag = 1
  • return of UVXY at lag = 3

Return at time t is defined as $(\text{close}_t / \text{close}_{t-1}) - 1$.
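As a quick illustration of the return definition, here is a minimal base R sketch. The prices are made-up values, and the variable names (close, ret, up) are assumptions for this example rather than names taken from the post's code.

```r
# Hypothetical closing prices, oldest first (illustrative values only).
close <- c(100, 102, 101, 103.02)

# Simple return at time t: close_t / close_{t-1} - 1.
# The first observation has no previous close, so its return is NA.
ret <- c(NA, close[-1] / close[-length(close)] - 1)

# Direction dummy: 1 if the close is above the previous close, else 0.
up <- as.integer(ret > 0)

print(round(ret, 4))
print(up)
```

The direction dummy is the quantity the models try to predict; the returns themselves only enter as lagged predictors.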

The code is broken up into four parts:

  • helpers.r – a file containing helper functions
  • setup.r – a file that specifies which data to mine (ticker symbols, start and end dates). This file also reduces the necessary data to a matrix-like extensible time series object and splits this data into a training set and testing set.
  • svm_trend_follower.r – a file that specifies and evaluates models. Evaluation results are saved to a text file.
  • svm_predict.r – a file that predicts directional movement 1 step ahead. Prediction is saved to a text file.

This code relies on several R packages, including xts and e1071.

From the helpers file, the functions GrabData, Frame2Xts, Plag4, TTSplit, and LiveTestSplit are used to acquire, format, and manipulate data for use in SVM models.

GrabData downloads data given a stock symbol and start and end dates.

Frame2Xts will do some formatting and convert a data frame to an extensible time series object.
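To give a feel for this kind of conversion, here is a rough sketch using the xts constructor. It is not the Frame2Xts helper itself: the data frame, its column names, and the prices are assumptions made up for the example.

```r
library(xts)

# Hypothetical price data frame with a character date column.
prices <- data.frame(
  Date  = c("2016-01-04", "2016-01-05", "2016-01-06"),
  Close = c(10.0, 10.5, 10.2),
  stringsAsFactors = FALSE
)

# Convert to xts: the price column becomes the data, the dates become the index.
prices_xts <- xts(prices[, "Close", drop = FALSE],
                  order.by = as.Date(prices$Date))

# xts objects can be subset with ISO-8601 style date strings.
recent <- prices_xts["2016-01-05/"]
print(recent)
```

Date-string subsetting like this is what makes xts convenient for carving out training and testing windows.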

Plag4 will calculate and store daily returns, dummy variables for whether returns were positive or negative, and vectors of lagged returns up to lag 4. Models that require more lags will likely need this function to be changed accordingly.
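The idea of storing a direction dummy alongside lagged returns can be sketched in base R as follows. The function lag_matrix is a hypothetical stand-in written for this example, not the Plag4 helper, and the returns are made-up values.

```r
# Build a direction dummy and lagged copies of a return vector (base R only).
# lag_matrix is a hypothetical name, not the Plag4 helper from the post.
lag_matrix <- function(ret, max_lag = 4) {
  n <- length(ret)
  stopifnot(n > max_lag)
  # Column k holds the return from k periods earlier, padded with NA at the top.
  lags <- sapply(seq_len(max_lag),
                 function(k) c(rep(NA, k), ret[seq_len(n - k)]))
  colnames(lags) <- paste0("lag", seq_len(max_lag))
  cbind(ret = ret, up = as.integer(ret > 0), lags)
}

# Illustrative returns only.
ret <- c(0.01, -0.02, 0.03, 0.00, -0.01, 0.02)
features <- lag_matrix(ret)
print(features)
```

Making the number of lags an argument is one way to avoid rewriting the function when a model needs more history.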

TTSplit splits data into training and testing sets given an xts object and a proportion of data to be used as the training set.
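A proportion-based split of this kind might look like the sketch below. The function split_by_proportion is a hypothetical stand-in, not the TTSplit helper, and it assumes the rows are already in time order so that the test set always comes after the training set.

```r
# Chronological train/test split: the first train_prop of rows are training data.
# split_by_proportion is a hypothetical name, not the TTSplit helper from the post.
split_by_proportion <- function(x, train_prop = 0.8) {
  stopifnot(train_prop > 0, train_prop < 1)
  n <- nrow(x)
  n_train <- floor(n * train_prop)
  stopifnot(n_train >= 1, n_train < n)
  list(
    train = x[seq_len(n_train), , drop = FALSE],
    test  = x[(n_train + 1):n, , drop = FALSE]
  )
}

x <- matrix(1:20, ncol = 2)  # 10 rows, assumed to be in time order
parts <- split_by_proportion(x, 0.8)
print(dim(parts$train))
print(dim(parts$test))
```

The rows are deliberately not shuffled: with lagged predictors, a random split would let information from the test period leak into training.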

LiveTestSplit is a convenience function for formatting dates when subsetting extensible time series objects.

We put all of this together in the setup.r file.

Next we discuss the svm_trend_follower.r file, where we specify and evaluate models. We use the helper function SvmEval2 to determine the accuracy of a model when applied to out-of-sample data. The function returns the proportion of accurate directional predictions and a confusion matrix.
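An out-of-sample evaluation along these lines can be sketched with e1071 as follows. The function eval_svm_sketch is a hypothetical stand-in, not SvmEval2, and the data is synthetic; the kernel and classification type are assumptions chosen only for illustration.

```r
library(e1071)

# eval_svm_sketch is a hypothetical name, not the SvmEval2 helper from the post.
eval_svm_sketch <- function(train, test, target = "up") {
  # Treat the 0/1 direction dummy as a class label.
  train[[target]] <- factor(train[[target]], levels = c(0, 1))
  test[[target]]  <- factor(test[[target]],  levels = c(0, 1))

  # Fit on the training set using all other columns as predictors.
  fit <- svm(as.formula(paste(target, "~ .")), data = train,
             type = "C-classification", kernel = "radial")
  pred <- predict(fit, newdata = test)

  list(
    accuracy  = mean(pred == test[[target]]),
    confusion = table(predicted = pred, actual = test[[target]])
  )
}

# Synthetic data only: direction depends on lag1 plus noise.
set.seed(1)
n <- 200
d <- data.frame(lag1 = rnorm(n), lag3 = rnorm(n))
d$up <- as.integer(d$lag1 + rnorm(n, sd = 0.5) > 0)

res <- eval_svm_sketch(d[1:150, ], d[151:200, ])
print(res$accuracy)
print(res$confusion)
```

The confusion matrix is worth reading alongside the accuracy, since a model that always predicts one direction can still score near 50%.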

Given our ‘basket’ setup in the setup.r file here is an example of how we would model AMD:

After we similarly specify models for the other stocks in our basket we can filter the models by their out of sample accuracy and write to a text file. Here we filter for models that are accurate more than 50% of the time:
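Filtering by accuracy and writing the result to disk could look like this minimal sketch. The tickers, accuracy values, and file name are placeholders invented for the example, not results from the post.

```r
# Hypothetical out-of-sample accuracies; tickers and values are placeholders.
results <- data.frame(
  ticker   = c("AAA", "BBB", "CCC"),
  accuracy = c(0.56, 0.48, 0.52),
  stringsAsFactors = FALSE
)

# Keep models that are right more than 50% of the time out of sample.
promising <- results[results$accuracy > 0.5, ]

# Write the surviving models to a tab-separated text file.
out_file <- file.path(tempdir(), "model_accuracy.txt")
write.table(promising, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

print(promising)
```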

Finally, if we find a model that appears promising, we can use it for forecasting and prediction, as demonstrated in svm_predict.r. We create a data frame containing the variables our model needs for prediction and attempt to forecast the next day's directional movement.
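A one-step-ahead forecast of this kind can be sketched as below. This is not the code from svm_predict.r: the training data is synthetic, the predictor names (lag1, lag3) and the values in the new row are assumptions, and the output file name is a placeholder.

```r
library(e1071)

# Synthetic training data only: direction depends on lag1 plus noise.
set.seed(2)
n <- 200
train <- data.frame(lag1 = rnorm(n), lag3 = rnorm(n))
train$up <- factor(as.integer(train$lag1 + rnorm(n, sd = 0.5) > 0),
                   levels = c(0, 1))

fit <- svm(up ~ lag1 + lag3, data = train,
           type = "C-classification", kernel = "radial")

# One row of predictors for the next day: the most recent returns
# become the lagged values (hypothetical numbers).
newest <- data.frame(lag1 = 0.012, lag3 = -0.004)

forecast  <- predict(fit, newdata = newest)
direction <- if (forecast == "1") "up" else "down"

# Save the prediction to a text file.
pred_file <- file.path(tempdir(), "prediction.txt")
writeLines(paste("Predicted direction:", direction), pred_file)
print(direction)
```

The new data frame must use the same column names as the training data, otherwise predict cannot match the predictors to the fitted model.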

Although the models for AMD and PayPal appear promising based on their accuracy, the prediction for the following day's directional movement is incorrect. To improve accuracy, one can consider optimizing the SVM parameters, selecting a different kernel type, or changing the sampling procedure (dates sampled, sampling methods, etc.). This code and framework can be adapted to other methods that require matrix-like objects and lagged values for forecasting. If one were to hypothetically trade such a strategy, short term option positions near the money may be interesting trading vehicles.
