Author: Raoul Malm

Description:

This notebook demonstrates how to predict future prices of different stocks with recurrent neural networks in TensorFlow. Networks with basic RNN, LSTM, or GRU cells are implemented.

Outline:

  1. Libraries and settings
  2. Analyze data
  3. Manipulate data
  4. Model and validate data
  5. Predictions

Reference:

LSTM_Stock_prediction-20170507 by BenF

1. Libraries and settings

import numpy as np
import pandas as pd
import math
import sklearn
import sklearn.preprocessing
import datetime
import os
import matplotlib.pyplot as plt
import tensorflow as tf

# split data in 80%/10%/10% train/validation/test sets
valid_set_size_percentage = 10 
test_set_size_percentage = 10 

# display the contents of the parent directory and the working directory
cwd = os.getcwd()
parent = os.path.dirname(cwd)
print(parent + ':', os.listdir(parent))
print(cwd + ':', os.listdir(cwd))
/kaggle: ['src', 'lib', 'input', 'working']
/kaggle/working: ['__notebook__.ipynb']
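
The two percentage settings above imply an 80%/10%/10% split. As a minimal sketch of how such percentages could be turned into chronological train/validation/test slices, consider the following. The helper names `split_sizes` and `chronological_split` are hypothetical and are not taken from this notebook; the split is done in order, without shuffling, on the assumption that time-series samples should not leak future data into training.

```python
import numpy as np

valid_set_size_percentage = 10
test_set_size_percentage = 10

def split_sizes(n_samples, valid_pct, test_pct):
    """Return (train, valid, test) sizes; train takes whatever remains."""
    valid_size = int(np.round(valid_pct / 100 * n_samples))
    test_size = int(np.round(test_pct / 100 * n_samples))
    train_size = n_samples - valid_size - test_size
    return train_size, valid_size, test_size

def chronological_split(data, valid_pct, test_pct):
    """Slice data in order: oldest samples train, newest samples test."""
    train_size, valid_size, _ = split_sizes(len(data), valid_pct, test_pct)
    train = data[:train_size]
    valid = data[train_size:train_size + valid_size]
    test = data[train_size + valid_size:]
    return train, valid, test

# Hypothetical stand-in for 1000 time-ordered samples
samples = np.arange(1000)
train, valid, test = chronological_split(
    samples, valid_set_size_percentage, test_set_size_percentage
)
print(len(train), len(valid), len(test))  # 800 100 100
```

Deriving the training size as the remainder keeps the three parts summing to the full sample count even when rounding the percentages would otherwise lose or duplicate a sample.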

2. Analyze data

  • load stock prices from prices-split-adjusted.csv
  • analyze data

# import all stock prices
df = pd.read_csv("../input/prices-split-adjusted.csv", index_col = 0)
df.info()
df.head()

# number of different stocks
print('\nnumber of different stocks: ', len(list(set(df.symbol))))
print(list(set(df.symbol))[:10])
<class 'pandas.core.frame.DataFrame'>
Index: 851264 entries, 2016-01-05 to 2016-12-30
Data columns (total 6 columns):
symbol    851264 non-null object
open      851264 non-null float64
close     851264 non-null float64
low       851264 non-null float64
high      851264 non-null float64
volume    851264 non-null float64
dtypes: float64(5), object(1)
memory usage: 45.5+ MB

number of different stocks: 501
['DGX', 'PPL', 'VMC', 'ARNC', 'HOLX', 'KIM', 'COF', 'F', 'HCP', 'V']
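
Since all symbols share one table, a natural next step in the analysis is to count rows per symbol and pull out a single stock. The sketch below illustrates this on a tiny made-up frame with the same six columns. The symbols `AAA` and `BBB`, the values, and the index name `date` are assumptions for illustration only, not data from prices-split-adjusted.csv.

```python
import pandas as pd

# Made-up rows that mimic the column layout reported by df.info() above
toy = pd.DataFrame(
    {
        "symbol": ["AAA", "BBB", "AAA", "BBB", "AAA"],
        "open": [10.0, 20.0, 10.5, 19.5, 11.0],
        "close": [10.4, 19.8, 10.9, 19.9, 11.2],
        "low": [9.9, 19.5, 10.3, 19.2, 10.8],
        "high": [10.6, 20.3, 11.0, 20.1, 11.4],
        "volume": [1000.0, 2000.0, 1100.0, 1900.0, 1200.0],
    },
    index=pd.Index(
        ["2016-01-04", "2016-01-04", "2016-01-05", "2016-01-05", "2016-01-06"],
        name="date",  # assumed name for the index column
    ),
)

# Number of distinct symbols, without building an intermediate set
n_symbols = toy["symbol"].nunique()

# Rows available per symbol; short histories may need special handling
rows_per_symbol = toy["symbol"].value_counts()

# Price history of one stock, with the now-constant symbol column dropped
one_stock = toy[toy["symbol"] == "AAA"].drop(columns="symbol")

print(n_symbols)
print(rows_per_symbol)
print(one_stock.describe())
```

On the real data, the same three expressions would be applied to `df` in place of `toy`.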