Data Crunching
The data that we are going to use for this article is downloaded from Yahoo Finance. For training our algorithm, we will use roughly five years of AMD stock prices. To fetch them, we first define a class that scrapes the Yahoo Finance website.
import pandas as pd
import numpy as np
import requests
import re
from datetime import datetime, timedelta
from io import StringIO

###############################################################################
#                  Data Scraping Class for Yahoo Finance                      #
###############################################################################
class YahooFinanceHistory:
    timeout = 2
    crumb_link = 'https://finance.yahoo.com/quote/{0}/history?p={0}'
    crumble_regex = r'CrumbStore":{"crumb":"(.*?)"}'
    quote_link = 'https://query1.finance.yahoo.com/v7/finance/download/{quote}?period1={dfrom}&period2={dto}&interval=1d&events=history&crumb={crumb}'

    def __init__(self, symbol, days_back=7):
        self.symbol = symbol
        self.session = requests.Session()
        self.dt = timedelta(days=days_back)

    def get_crumb(self):
        # The history page embeds a "crumb" token that the download API requires
        response = self.session.get(self.crumb_link.format(self.symbol), timeout=self.timeout)
        response.raise_for_status()
        match = re.search(self.crumble_regex, response.text)
        if not match:
            raise ValueError('Could not get crumb from Yahoo Finance')
        else:
            self.crumb = match.group(1)

    def get_quote(self):
        # Fetch a fresh crumb if we don't have one or the session cookies expired
        if not hasattr(self, 'crumb') or len(self.session.cookies) == 0:
            self.get_crumb()
        now = datetime.utcnow()
        dateto = int(now.timestamp())
        datefrom = int((now - self.dt).timestamp())
        url = self.quote_link.format(quote=self.symbol, dfrom=datefrom, dto=dateto, crumb=self.crumb)
        response = self.session.get(url)
        response.raise_for_status()
        # The endpoint returns CSV; parse it straight into a DataFrame
        return pd.read_csv(StringIO(response.text), parse_dates=['Date'])

stock = 'AMD'
df = YahooFinanceHistory(stock, days_back=2000).get_quote()
df.tail()
|      | Date       | Open      | High      | Low       | Close     | Adj Close | Volume   |
|------|------------|-----------|-----------|-----------|-----------|-----------|----------|
| 1375 | 2019-11-21 | 40.419998 | 40.709999 | 38.639999 | 39.520000 | 39.520000 | 88069400 |
| 1376 | 2019-11-22 | 39.360001 | 39.889999 | 38.189999 | 39.150002 | 39.150002 | 56931900 |
| 1377 | 2019-11-25 | 39.500000 | 40.169998 | 39.490002 | 39.790001 | 39.790001 | 45769500 |
| 1378 | 2019-11-26 | 39.299999 | 39.480000 | 38.810001 | 38.990002 | 38.990002 | 43603300 |
| 1379 | 2019-11-27 | 39.459999 | 39.759998 | 39.070000 | 39.410000 | 39.410000 | 33630100 |
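The crumb-based download endpoint above is unofficial and Yahoo has changed it in the past. As a fallback, here is a minimal sketch using the third-party yfinance package (an assumption: it is installed, e.g. via pip install yfinance) that pulls the same daily history:

import yfinance as yf

# Download roughly five years of daily AMD prices; yields the same
# Open/High/Low/Close/Adj Close/Volume columns, indexed by date
df = yf.download('AMD', period='5y', interval='1d').reset_index()
df.tail()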
import matplotlib.pyplot as plt
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
# Work on a copy so the original frame keeps its integer index
df_plot = df.copy()

# Setting index as date
df_plot['Date'] = pd.to_datetime(df_plot.Date, format='%Y-%m-%d')
df_plot.index = df_plot['Date']

# Plot the closing price history
plt.figure(figsize=(16,8));
plt.plot(df_plot['Close'], label='Close Price history');
plt.title('{} 5 Year Stock Price'.format(stock), fontweight='bold', fontsize=16);
plt.xlabel('Date', fontweight='bold');
plt.ylabel('Price (USD)', fontweight='bold');
You can see that the trend is highly non-linear and very difficult to capture from this information alone. This is where the power of a Long Short-Term Memory (LSTM) network can be utilized. An LSTM is a type of recurrent neural network capable of remembering past information and taking it into account when predicting future values.
Predicting Future Stock Prices
Stock price prediction is similar to any other machine learning problem: we are given a set of features and we have to predict a corresponding value. We will perform the same steps we would take to solve any machine learning problem.
As a rule of thumb, whenever you use a neural network, you should normalize or scale your data. We will use the MinMaxScaler class from the sklearn.preprocessing library to scale our data between 0 and 1.
As mentioned earlier, in a time series problem we have to predict a value at time T based on the data from days T-N, where N can be any number of steps. In this tutorial, we are going to predict the closing stock price based on the closing prices of the past 60 days (prediction_window_size). I have tried and tested different numbers and found that the best results are obtained when the past 60 time steps are used. You can try different numbers and see how your algorithm performs.
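To make the windowing concrete, here is a toy sketch (hypothetical numbers, a window of 3 instead of 60) of how a series is sliced into (features, label) pairs:

series = [10, 11, 12, 13, 14, 15]
window = 3
# Each window of 3 past values becomes the features for the next value
pairs = [(series[i-window:i], series[i]) for i in range(window, len(series))]
print(pairs)  # [([10, 11, 12], 13), ([11, 12, 13], 14), ([12, 13, 14], 15)]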
from sklearn.preprocessing import MinMaxScaler
# Number of days to use for prediction
prediction_window_size = 60
# Creating a dataframe with only the Date and Close columns
data = df.sort_index(ascending=True, axis=0)
new_data = pd.DataFrame(index=range(0, len(df)), columns=['Date', 'Close'])
for i in range(0, len(data)):
    new_data['Date'][i] = data['Date'][i]
    new_data['Close'][i] = data['Close'][i]

# Setting index
new_data.index = new_data.Date
new_data.drop('Date', axis=1, inplace=True)

# Scaling the closing prices to the [0, 1] range
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(new_data.values)

# Converting the series into x_train (windows) and y_train (next values)
x_train, y_train = [], []
for i in range(prediction_window_size, len(scaled_data)):
    x_train.append(scaled_data[i-prediction_window_size:i, 0])
    y_train.append(scaled_data[i, 0])
x_train, y_train = np.array(x_train), np.array(y_train)

# Reshaping to the 3-D format the LSTM expects: (samples, time steps, indicators)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
In the script above we create two lists: x_train (features) and y_train (labels). There are 1380 records in scaled_data. The loop starts at the 61st record: at each step it stores the previous 60 records in the x_train list and the current record in the y_train list. We then convert both lists to NumPy arrays before using them for training.
In order to train the LSTM on our data, we need to convert it into the three-dimensional shape the LSTM accepts. The first dimension is the number of records, which is 1320 in our case (1380 minus the 60-day window). The second dimension is the number of time steps, which is 60, while the last dimension is the number of indicators. Since we are only using one feature, i.e., Close, the number of indicators is one.
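As a quick sanity check (a small sketch, assuming the cells above have been run), you can print the resulting shapes:

print(x_train.shape)  # expected: (1320, 60, 1) -> (samples, time steps, indicators)
print(y_train.shape)  # expected: (1320,)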
Create and Fit LSTM Network
We have preprocessed our data and have converted it into the desired format. Now is the time to create our LSTM. The LSTM model that we are going to create will be a sequential model with multiple layers. We will add four LSTM layers to our model followed by a dense layer that predicts the future stock price.
In the script below we import the Sequential class from keras.models and the Dense, LSTM, and Dropout classes from keras.layers. To add a layer to the sequential model, the add method is used. Inside the add method, we pass our LSTM layer. The first parameter to the LSTM layer is the number of neurons or nodes that we want in the layer. The second parameter is return_sequences, which is set to True since we will add more layers to the model. The first element of input_shape is the number of time steps, while the last element is the number of indicators.
Creating LSTM and Dropout Layers
Let’s now add a dropout layer to our model. A dropout layer is added to avoid over-fitting, a phenomenon where a machine learning model performs better on the training data than on the test data. We will then add three more LSTM and dropout layers to our model.
Creating Dense Layer
To produce the prediction, we add a dense layer at the end of the model. The number of neurons in the dense layer is set to 1, since we want to predict a single value in the output.
Model Compilation
Finally, we need to compile our LSTM before we can train it on the training data. The following script compiles our model. We call the compile method on the Sequential model object, which is model in our case. We use mean squared error as the loss function, and to minimize the loss we use the adam optimizer.
Algorithm Training
Now is the time to train the model that we defined. To do so, we call the fit method on the model and pass it our training features and labels. Depending upon your hardware, model training can take some time.
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM
# Initialising the RNN
model = Sequential()
# Adding the first LSTM layer and some Dropout regularisation
model.add(LSTM(units = 50, return_sequences = True, input_shape = (x_train.shape[1], 1)))
model.add(Dropout(0.2))
# Adding a second LSTM layer and some Dropout regularisation
model.add(LSTM(units = 50, return_sequences = True))
model.add(Dropout(0.2))
# Adding a third LSTM layer and some Dropout regularisation
model.add(LSTM(units = 50, return_sequences = True))
model.add(Dropout(0.2))
# Adding a fourth LSTM layer and some Dropout regularisation
model.add(LSTM(units = 50))
model.add(Dropout(0.2))
# Adding the output layer
model.add(Dense(units = 1))
# Compiling the RNN
model.compile(optimizer = 'adam', loss = 'mean_squared_error')
# Fitting the RNN to the Training set
model.fit(x_train, y_train, epochs = 100, batch_size = 32, verbose = 0);
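Training for a fixed 100 epochs can over- or under-train the network. As an optional refinement (a sketch, not part of the original pipeline), you can hold out the most recent training windows for validation and stop once the validation loss stalls:

from keras.callbacks import EarlyStopping

# Stop when validation loss hasn't improved for 5 epochs,
# restoring the best weights seen so far
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
model.fit(x_train, y_train, epochs=100, batch_size=32,
          validation_split=0.1, callbacks=[early_stop], verbose=0);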
Testing our LSTM
We have successfully trained our LSTM; now it is time to test its performance by predicting the closing stock prices for the last 365 trading days. However, as we did with the training data, we first need to convert the test data into the right format.
# Predicting 365 values, each using the 60 preceding days
total_prediction_days = 365
inputs = new_data[-(total_prediction_days + prediction_window_size):].values
inputs = inputs.reshape(-1, 1)
inputs = scaler.transform(inputs)

# Building the test windows in the same 3-D shape used for training
X_test = []
for i in range(prediction_window_size, inputs.shape[0]):
    X_test.append(inputs[i-prediction_window_size:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
Making Predictions
Now is the time to see the magic. We preprocessed our test data and can now use it to make predictions by simply calling the predict method on the model that we trained. Since we scaled our data, the predictions made by the LSTM are also scaled, so we need to reverse them back to actual prices. To do so, we use the inverse_transform method of the scaler object we created during training. Take a look at the following script:
closing_price = model.predict(X_test)
closing_price = scaler.inverse_transform(closing_price)
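As a quick check (a small sketch using the variables defined above), you can compare the last few predictions against the actual closing prices:

# Compare the last three predicted closes with the actual ones
actual = new_data['Close'].values[-len(closing_price):]
for pred, act in zip(closing_price[-3:, 0], actual[-3:]):
    print('predicted {:.2f} vs actual {:.2f}'.format(pred, act))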
How Did We Do?
# Split the data into the training segment and the predicted (validation) segment
data_idx = len(new_data) - len(closing_price)
train = new_data[:data_idx]
valid = new_data[data_idx:].copy()
valid['Predictions'] = closing_price
fig, ax = plt.subplots(figsize=(10,6))
plt1 = ax.plot(train['Close']);
plt2 = ax.plot(valid[['Close', 'Predictions']]);
plt.title('{} Closing Price'.format(stock), fontweight='bold', fontsize=16)
plt.xlabel('Date', fontweight='bold')
plt.ylabel('Price (USD)', fontweight='bold')
plt.legend(plt2, ('Actual', 'Prediction'), loc='center left', bbox_to_anchor=(1, 0.5));
# Trading signal: go long when tomorrow's predicted close is above today's
# predicted close, short otherwise (consecutive predictions compared pairwise)
y_pred = closing_price[1:] > closing_price[:-1]

# Root-mean-squared error of the predictions against the actual closes
valid_true = new_data.values[-len(closing_price):].astype(float)
rms = np.sqrt(np.mean(np.power((valid_true - closing_price), 2)))
print('RMSE: {:.2f}'.format(rms))
# Align the up/down signal with the original price frame
trade_dataset_temp = df.copy()
trade_dataset_temp['y_pred'] = np.nan
trade_dataset_temp.iloc[(len(trade_dataset_temp) - len(y_pred)):, -1:] = y_pred
trade_dataset = trade_dataset_temp.dropna()
# Tomorrow's log return: log(close_{t+1} / close_t), shifted so each row
# holds the return realised on the following day
trade_dataset['Tomorrows Returns'] = np.log(trade_dataset['Close'] / trade_dataset['Close'].shift(1))
trade_dataset['Tomorrows Returns'] = trade_dataset['Tomorrows Returns'].shift(-1)

# Strategy return: earn tomorrow's return when the signal says up,
# the negated return (a short position) when it says down
trade_dataset['Strategy Returns'] = np.where(trade_dataset['y_pred'] == True,
                                             trade_dataset['Tomorrows Returns'],
                                             -trade_dataset['Tomorrows Returns'])

# Cumulative log returns for buy-and-hold vs the strategy
trade_dataset['Cumulative Market Returns'] = np.cumsum(trade_dataset['Tomorrows Returns'])
trade_dataset['Cumulative Strategy Returns'] = np.cumsum(trade_dataset['Strategy Returns'])
plt.figure(figsize=(10,5))
plt.plot(trade_dataset['Cumulative Market Returns'], color='r', label='Market Returns')
plt.plot(trade_dataset['Cumulative Strategy Returns'], color='g', label='Strategy Returns')
plt.legend()
plt.show()
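Since both columns hold cumulative log returns, exponentiating converts them into simple percentage returns for a quick numeric comparison (a small sketch; the last row is dropped because its forward-shifted return is NaN):

# Convert the final cumulative log returns to simple returns
final_market = np.exp(trade_dataset['Cumulative Market Returns'].dropna().iloc[-1]) - 1
final_strategy = np.exp(trade_dataset['Cumulative Strategy Returns'].dropna().iloc[-1]) - 1
print('Buy-and-hold return: {:.2%}'.format(final_market))
print('Strategy return:     {:.2%}'.format(final_strategy))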