Predict Amazon Inc Stock Price with Machine Learning

This article shows how to predict the Amazon stock price with machine learning, using an LSTM network trained on historical closing prices.

Import library

import pandas as pd

Load data

I have used five years of historical data for Amazon.com, Inc. (AMZN). You can download the data from the following link: Amazon.com, Inc. (AMZN)

inputFolder = "input/"

filePath = inputFolder + "AMZN.csv"
filePath

'input/AMZN.csv'

Read csv file using pandas library

pandas.read_csv(): Read a comma-separated values (csv) file into DataFrame.

df = pd.read_csv(filePath)
df
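As a small illustration of read_csv, the sketch below reads a two-row stand-in for AMZN.csv from memory instead of from disk. The rows are the first two records shown in this article; the names sample_csv and sample_df are made up for the example, and parse_dates is an optional extra that the article's own code does not use.

```python
import io
import pandas as pd

# Two-row stand-in for AMZN.csv, using the column layout shown in this article.
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2016-04-14,615.070007,624.380005,615.070007,620.750000,620.750000,3512100\n"
    "2016-04-15,621.919983,626.770020,618.109985,625.890015,625.890015,2887700\n"
)

# parse_dates converts the Date column to datetime64 instead of leaving it as text.
sample_df = pd.read_csv(sample_csv, parse_dates=["Date"])

print(sample_df.dtypes)
print(sample_df.shape)
```

Parsing the dates at load time is a design choice: it makes later date-based slicing and plotting easier, but the rest of this article works with the dates as plain strings.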

View dataframe shape

pandas.DataFrame.shape:

Return a tuple representing the dimensionality of the DataFrame.

df.shape

(1258, 7)

Data has 1258 rows and 7 columns.

Print first five records

DataFrame.head(n=5):

Return the first n rows.

This function returns the first n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it. By default, 5 rows are returned.

For a negative n, this function returns all rows except the last |n| rows, equivalent to df[:n].

df.head()

Print last five records

DataFrame.tail(n=5)

Return the last n rows.

This function returns the last n rows from the object based on position. It is useful for quickly verifying data, for example, after sorting or appending rows. By default, 5 rows are returned.

For a negative n, this function returns all rows except the first |n| rows, equivalent to df[|n|:].

df.tail()
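The behaviour of head and tail with positive and negative n can be checked on a small frame. This is a minimal sketch; the frame toy and its six rounded closing prices are only for illustration.

```python
import pandas as pd

# Toy frame with six rows (rounded closing prices, for illustration only).
toy = pd.DataFrame({"Close": [620.75, 625.89, 635.35, 627.90, 632.99, 631.00]})

first_two = toy.head(2)            # rows 0 and 1
all_but_last_two = toy.head(-2)    # rows 0 to 3, same as toy[:-2]
last_two = toy.tail(2)             # rows 4 and 5
all_but_first_two = toy.tail(-2)   # rows 2 to 5, same as toy[2:]

print(all_but_last_two)
print(all_but_first_two)
```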

Create a new dataframe

Create a new dataframe with two columns, 'Date' and 'Close'. For this prediction we need only the date and the closing price. The new dataframe uses the same integer index as the original, range(0, len(df)).

class pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)

Two-dimensional, size-mutable, potentially heterogeneous tabular data.

Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

Parameters

data: ndarray, Iterable, dict, or DataFrame

    Dict can contain Series, arrays, constants, dataclass or list-like objects. If data is a dict, column order follows insertion-order.

index: Index or array-like

    Index to use for resulting frame. Will default to RangeIndex if no indexing information part of input data and no index provided.

columns: Index or array-like

    Column labels to use for resulting frame. Will default to RangeIndex (0, 1, 2, …, n) if no column labels are provided.

dtype: dtype, default None

    Data type to force. Only a single dtype is allowed. If None, infer.

copy: bool, default False

    Copy data from inputs. Only affects DataFrame / 2d ndarray input.

new_df = pd.DataFrame(index = range(0,len(df)), columns=['Date', 'Close'])
new_df

Sort the dataframe

DataFrame.sort_index(axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True, ignore_index=False, key=None)

Sort object by labels (along an axis).

Returns a new DataFrame sorted by label if inplace argument is False, otherwise updates the original DataFrame and returns None.

df = df.sort_index(ascending = True, axis = 0)
df

Fill data in new dataframe

Copy the 'Date' and 'Close' values from the original dataframe (df) into the new dataframe (new_df).

for i in range(0, len(df)):
    # .loc writes into new_df directly and avoids chained assignment
    new_df.loc[i, 'Date'] = df.loc[i, 'Date']
    new_df.loc[i, 'Close'] = df.loc[i, 'Close']

new_df

Set date as index

new_df.index = new_df.Date
new_df

Drop Date column

The dates are now in the index, so the 'Date' column is no longer needed and can be dropped.

DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

Drop specified labels from rows or columns.

Remove rows or columns by specifying label names and corresponding axis, or by specifying directly index or column names.

Parameters

labels: single label or list-like

Index or column labels to drop.

axis: {0 or ‘index’, 1 or ‘columns’}, default 0

Whether to drop labels from the index (0 or ‘index’) or columns (1 or ‘columns’).

index: single label or list-like

Alternative to specifying axis (labels, axis=0 is equivalent to index=labels).

columns: single label or list-like

Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels).

inplace: bool, default False

If False, return a copy. Otherwise, do operation inplace and return None.

Returns

    DataFrame or None. DataFrame without the removed index or column labels or None if inplace=True.

new_df.drop('Date', axis=1, inplace=True)
new_df
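The four steps above (create, fill, set the index, drop the column) can also be written as one vectorized expression. The sketch below is an alternative, not the article's method, and it uses a two-row stand-in frame with the hypothetical names df_sample and new_df_sample.

```python
import pandas as pd

# Small stand-in for the AMZN frame loaded earlier (two rows from the article's data).
df_sample = pd.DataFrame({
    "Date": ["2016-04-14", "2016-04-15"],
    "Open": [615.070007, 621.919983],
    "Close": [620.750000, 625.890015],
})

# Select the two columns, then move Date into the index; no row-by-row loop is needed.
new_df_sample = df_sample[["Date", "Close"]].set_index("Date")

print(new_df_sample)
print(new_df_sample["Close"].dtype)
```

One difference to be aware of: this version keeps 'Close' as float64, whereas filling an empty dataframe cell by cell leaves the column with dtype object, as the array output in the next step shows.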

pandas.DataFrame.values

DataFrame.values: Return a Numpy representation of the DataFrame.

dataset = new_df.values
dataset[:10]

array([[620.75],
       [625.8900150000001],
       [635.349976],
       [627.900024],
       [632.98999],
       [631.0],
       [620.5],
       [626.200012],
       [616.880005],
       [606.570007]], dtype=object)

Scaling features to a range

It is important to scale features before training a neural network. Normalization is a common way of doing this scaling.

A way to normalize the input features/variables is the Min-Max scaler. By doing so, all features will be transformed into the range [0,1] meaning that the minimum and maximum value of a feature/variable is going to be 0 and 1, respectively.

from sklearn.preprocessing import MinMaxScaler

class sklearn.preprocessing.MinMaxScaler(feature_range=(0, 1), *, copy=True, clip=False)

Transform features by scaling each feature to a given range.

This estimator scales and translates each feature individually such that it is in the given range on the training set, e.g. between zero and one.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

X: array-like of shape (n_samples, n_features)

    Input samples.
    
y: array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

    Target values (None for unsupervised transformations).
    
**fit_params: dict

    Additional fit parameters.

Returns

X_new: ndarray of shape (n_samples, n_features_new)

    Transformed array.

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(dataset)
scaled_data[:10]

array([[0.00640052],
       [0.00815512],
       [0.01138438],
       [0.00884126],
       [0.01057877],
       [0.00989947],
       [0.00631518],
       [0.00826094],
       [0.00507945],
       [0.00156002]])
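The min-max scaling described above follows the formula x_scaled = (x - min) / (max - min), and the inverse is x = x_scaled * (max - min) + min. The sketch below works this out with plain NumPy for a single feature and the default range [0, 1]. The helper names min_max_scale and min_max_unscale are made up for this example; they are not part of scikit-learn.

```python
import numpy as np

def min_max_scale(values):
    """Scale a 1-D array linearly so its minimum maps to 0 and its maximum to 1."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo), lo, hi

def min_max_unscale(scaled, lo, hi):
    """Undo min_max_scale, returning values in the original units."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

# Five rounded closing prices, for illustration only.
prices = np.array([620.75, 625.89, 635.35, 627.90, 632.99])

scaled, lo, hi = min_max_scale(prices)
restored = min_max_unscale(scaled, lo, hi)

print(scaled)
print(restored)
```

The minimum and maximum have to be kept, because the same two numbers are needed later to turn scaled predictions back into prices.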

Split data in train and test

We split the data into a training set and a validation set. Of the 1258 records, rows 0 to 699 are used for training and rows 700 to 1257 for validation.

train = dataset[0:700,:]
valid = dataset[700:,:]

train.shape

(700, 1)

valid.shape

(558, 1)
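The split is positional rather than random, so the validation rows all come after the training rows in time. A minimal sketch with a placeholder array (dataset_demo, train_demo and valid_demo are hypothetical names, and the values are just row numbers) shows how the slices divide 1258 rows:

```python
import numpy as np

# Placeholder column of 1258 values standing in for the closing prices.
dataset_demo = np.arange(1258, dtype=float).reshape(-1, 1)

train_demo = dataset_demo[0:700, :]   # rows 0 to 699
valid_demo = dataset_demo[700:, :]    # rows 700 to 1257

print(train_demo.shape, valid_demo.shape)
print(train_demo[-1, 0], valid_demo[0, 0])
```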

train[:5]

array([[620.75],
       [625.8900150000001],
       [635.349976],
       [627.900024],
       [632.98999]], dtype=object)

valid[:5]

array([[1670.569946],
       [1637.890015],
       [1593.880005],
       [1670.430054],
       [1718.72998]], dtype=object)

Converting dataset into X_train and y_train

len(train)

700

X_train, y_train = [], []

for i in range(60, len(train)):
    # the previous 60 scaled prices are the input; the next price is the target
    X_train.append(scaled_data[i-60: i, 0])
    y_train.append(scaled_data[i,0])

X_train[0]

array([0.00640052, 0.00815512, 0.01138438, 0.00884126, 0.01057877,
       0.00989947, 0.00631518, 0.00826094, 0.00507945, 0.00156002,
       0.        , 0.01965899, 0.02794039, 0.02366315, 0.02351978,
       0.01948831, 0.02456093, 0.02654082, 0.03450136, 0.03796958,
       0.03957398, 0.03683967, 0.03709228, 0.03183875, 0.03258291,
       0.03294817, 0.03440919, 0.03234396, 0.0348871 , 0.03630374,
       0.03854306, 0.03763163, 0.04123299, 0.04008944, 0.04309341,
       0.04217173, 0.04257795, 0.04155729, 0.04254724, 0.04289202,
       0.03956715, 0.03865572, 0.04004164, 0.03832119, 0.03943061,
       0.03563468, 0.03823585, 0.03885371, 0.0370718 , 0.04099064,
       0.03309837, 0.03050401, 0.0361672 , 0.0387786 , 0.03878544,
       0.04221953, 0.04304562, 0.04629196, 0.04593695, 0.04909113])
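The loop above is a sliding window: each sample is 60 consecutive values and its target is the value that follows. The same idea can be sketched as a small reusable function. The name make_windows is made up for this example, and the demo uses a window of 3 on the numbers 0 to 9 so the result is easy to read.

```python
import numpy as np

def make_windows(series, window):
    """Return (X, y): X[k] holds `window` consecutive values, y[k] is the value right after them."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i - window:i] for i in range(window, len(series))])
    y = series[window:]
    return X, y

# Small demo: window of 3 over the values 0..9.
series = np.arange(10, dtype=float)
X_demo, y_demo = make_windows(series, window=3)

print(X_demo[0], y_demo[0])
print(X_demo.shape, y_demo.shape)

# With 700 training rows and a window of 60, the number of samples is 700 - 60.
X_big, y_big = make_windows(np.zeros(700), window=60)
print(X_big.shape)
```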

Convert X_train and y_train into numpy array

import numpy as np

X_train, y_train = np.array(X_train), np.array(y_train)

print(X_train[1])
print(y_train[1])

[0.00815512 0.01138438 0.00884126 0.01057877 0.00989947 0.00631518
 0.00826094 0.00507945 0.00156002 0.         0.01965899 0.02794039
 0.02366315 0.02351978 0.01948831 0.02456093 0.02654082 0.03450136
 0.03796958 0.03957398 0.03683967 0.03709228 0.03183875 0.03258291
 0.03294817 0.03440919 0.03234396 0.0348871  0.03630374 0.03854306
 0.03763163 0.04123299 0.04008944 0.04309341 0.04217173 0.04257795
 0.04155729 0.04254724 0.04289202 0.03956715 0.03865572 0.04004164
 0.03832119 0.03943061 0.03563468 0.03823585 0.03885371 0.0370718
 0.04099064 0.03309837 0.03050401 0.0361672  0.0387786  0.03878544
 0.04221953 0.04304562 0.04629196 0.04593695 0.04909113 0.05181178]
0.049910401080615674

X_train.shape[0]

640

X_train.shape[1]

60

X_train.shape

(640, 60)

Reshape X_train array

NumPy's reshape turns this two-dimensional array of shape (640, 60) into a three-dimensional array with 640 samples, 60 time steps, and 1 feature at each time step.

numpy.reshape(a, newshape, order='C')

Gives a new shape to an array without changing its data.

X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))

X_train.shape

(640, 60, 1)

Now X_train data is ready to be used as input (X) to the LSTM with an input_shape of (60, 1).
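The reshape only adds a trailing axis of length 1; the values and their order stay the same. A minimal sketch on a placeholder array (X_2d is a hypothetical stand-in for X_train) shows two equivalent ways to do it:

```python
import numpy as np

# Placeholder with the same shape as X_train before reshaping: (samples, time steps).
X_2d = np.arange(640 * 60, dtype=float).reshape(640, 60)

# Add a feature axis of length 1: (samples, time steps, features).
X_3d = np.reshape(X_2d, (X_2d.shape[0], X_2d.shape[1], 1))

# Indexing with np.newaxis gives the same result.
X_3d_alt = X_2d[..., np.newaxis]

print(X_3d.shape)
print(np.array_equal(X_3d, X_3d_alt))
```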

Create model

import tensorflow as tf

model = tf.keras.Sequential()

Add layers in model

The input to every LSTM layer must be three-dimensional.

The three dimensions of this input are:

Samples. One sequence is one sample. A batch is comprised of one or more samples.
Time Steps. One time step is one point of observation in the sample.
Features. One feature is one observation at a time step.

This means that the input layer expects a 3D array of data when fitting the model and when making predictions, even if specific dimensions of the array contain a single value, e.g. one sample or one feature.

Units: the number of LSTM cells (neurons) in the layer, which is also the size of the layer's output at each time step.

The LSTM input layer is defined by the input_shape argument on the first hidden layer. The input_shape argument takes a tuple of two values that define the number of time steps and features.

Hidden layer 1: 50 units/ 50 neurons
Hidden layer 2: 50 units/ 50 neurons
Last layer: 1 unit

model.add(tf.keras.layers.LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
model.add(tf.keras.layers.LSTM(units = 50))
model.add(tf.keras.layers.Dense(1))

Model summary

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm (LSTM)                  (None, 60, 50)            10400     
_________________________________________________________________
lstm_1 (LSTM)                (None, 50)                20200     
_________________________________________________________________
dense (Dense)                (None, 1)                 51        
=================================================================
Total params: 30,651
Trainable params: 30,651
Non-trainable params: 0
_________________________________________________________________
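The parameter counts in the summary can be reproduced by hand. A standard LSTM layer has four gates, and each gate has input weights, recurrent weights, and a bias vector, which gives 4 * (units * (input_dim + units) + units) parameters. A dense layer has units * input_dim weights plus units biases. The helper functions below are made up for this check and are not Keras functions.

```python
def lstm_params(units, input_dim):
    """Parameters of a standard LSTM layer: 4 gates, each with input weights, recurrent weights, and biases."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """Parameters of a dense layer: one weight per input per unit, plus one bias per unit."""
    return units * input_dim + units

layer1 = lstm_params(units=50, input_dim=1)    # first LSTM sees 1 feature per time step
layer2 = lstm_params(units=50, input_dim=50)   # second LSTM sees the 50 outputs of the first
output = dense_params(units=1, input_dim=50)   # dense layer maps 50 values to 1 prediction
total = layer1 + layer2 + output

print(layer1, layer2, output, total)
```

The four numbers match the Param # column and the total reported by model.summary().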

Compile model

The loss function is the mean squared error (MSE), also called the mean squared deviation (MSD): the average of the squared differences between the estimated values and the actual values. The optimizer is Adam.

model.compile(loss = 'mean_squared_error', optimizer = 'adam')
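As a worked example of the loss, the sketch below computes the mean squared error for three made-up pairs of actual and predicted values. The function here is a plain NumPy illustration with a hypothetical name, not the Keras implementation.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Errors are 0.1, 0.0 and 0.2, so the squared errors are 0.01, 0.0 and 0.04.
mse_demo = mean_squared_error([0.0, 0.5, 1.0], [0.1, 0.5, 0.8])

print(mse_demo)
```

Because the model is trained on scaled prices in [0, 1], the loss values reported during training are in scaled units, not in dollars.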

Train the model

history = model.fit(X_train, y_train, epochs = 100, batch_size=10)

Epoch 1/100
64/64 [==============================] - 1s 14ms/step - loss: 0.0059
Epoch 2/100
64/64 [==============================] - 1s 14ms/step - loss: 3.9926e-04
Epoch 3/100
64/64 [==============================] - 1s 14ms/step - loss: 4.2879e-04
Epoch 4/100
64/64 [==============================] - 1s 14ms/step - loss: 4.6579e-04
Epoch 5/100
64/64 [==============================] - 1s 14ms/step - loss: 4.0886e-04
Epoch 6/100
64/64 [==============================] - 1s 14ms/step - loss: 3.3498e-04
Epoch 7/100
64/64 [==============================] - 1s 15ms/step - loss: 3.2333e-04
Epoch 8/100
64/64 [==============================] - 1s 15ms/step - loss: 3.7809e-04
Epoch 9/100
64/64 [==============================] - 1s 14ms/step - loss: 3.5955e-04
Epoch 10/100
64/64 [==============================] - 1s 14ms/step - loss: 3.2380e-04
Epoch 11/100
64/64 [==============================] - 1s 14ms/step - loss: 3.0329e-04
Epoch 12/100
64/64 [==============================] - 1s 14ms/step - loss: 3.2191e-04
Epoch 13/100
64/64 [==============================] - 1s 14ms/step - loss: 2.8677e-04
Epoch 14/100
64/64 [==============================] - 1s 14ms/step - loss: 2.5705e-04
Epoch 15/100
64/64 [==============================] - 1s 14ms/step - loss: 2.5431e-04
Epoch 16/100
64/64 [==============================] - 1s 14ms/step - loss: 2.9078e-04
Epoch 17/100
64/64 [==============================] - 1s 14ms/step - loss: 2.5680e-04
Epoch 18/100
64/64 [==============================] - 1s 14ms/step - loss: 2.7394e-04
Epoch 19/100
64/64 [==============================] - 1s 14ms/step - loss: 2.6471e-04
Epoch 20/100
64/64 [==============================] - 1s 14ms/step - loss: 2.6454e-04
Epoch 21/100
64/64 [==============================] - 1s 14ms/step - loss: 2.0183e-04
Epoch 22/100
64/64 [==============================] - 1s 14ms/step - loss: 2.2347e-04
Epoch 23/100
64/64 [==============================] - 1s 14ms/step - loss: 2.0584e-04
Epoch 24/100
64/64 [==============================] - 1s 14ms/step - loss: 2.0493e-04
Epoch 25/100
64/64 [==============================] - 1s 14ms/step - loss: 1.8271e-04
Epoch 26/100
64/64 [==============================] - 1s 14ms/step - loss: 1.7624e-04
Epoch 27/100
64/64 [==============================] - 1s 14ms/step - loss: 1.6914e-04
Epoch 28/100
64/64 [==============================] - 1s 14ms/step - loss: 1.5706e-04
Epoch 29/100
64/64 [==============================] - 1s 14ms/step - loss: 1.5758e-04
Epoch 30/100
64/64 [==============================] - 1s 14ms/step - loss: 1.7418e-04
Epoch 31/100
64/64 [==============================] - 1s 14ms/step - loss: 1.7235e-04
Epoch 32/100
64/64 [==============================] - 1s 14ms/step - loss: 1.5608e-04
Epoch 33/100
64/64 [==============================] - 1s 14ms/step - loss: 1.4036e-04
Epoch 34/100
64/64 [==============================] - 1s 14ms/step - loss: 2.0073e-04
Epoch 35/100
64/64 [==============================] - 1s 14ms/step - loss: 1.4431e-04
Epoch 36/100
64/64 [==============================] - 1s 14ms/step - loss: 1.4923e-04
Epoch 37/100
64/64 [==============================] - 1s 14ms/step - loss: 1.3997e-04
Epoch 38/100
64/64 [==============================] - 1s 14ms/step - loss: 1.3732e-04
Epoch 39/100
64/64 [==============================] - 1s 14ms/step - loss: 1.3894e-04
Epoch 40/100
64/64 [==============================] - 1s 14ms/step - loss: 1.4029e-04
Epoch 41/100
64/64 [==============================] - 1s 14ms/step - loss: 1.1931e-04
Epoch 42/100
64/64 [==============================] - 1s 14ms/step - loss: 1.1095e-04
Epoch 43/100
64/64 [==============================] - 1s 14ms/step - loss: 1.5364e-04
Epoch 44/100
64/64 [==============================] - 1s 14ms/step - loss: 1.2021e-04
Epoch 45/100
64/64 [==============================] - 1s 14ms/step - loss: 1.3986e-04
Epoch 46/100
64/64 [==============================] - 1s 14ms/step - loss: 1.1162e-04
Epoch 47/100
64/64 [==============================] - 1s 16ms/step - loss: 1.1788e-04
Epoch 48/100
64/64 [==============================] - 1s 15ms/step - loss: 1.1198e-04
Epoch 49/100
64/64 [==============================] - 1s 14ms/step - loss: 1.1119e-04
Epoch 50/100
64/64 [==============================] - 1s 14ms/step - loss: 1.2146e-04
Epoch 51/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0934e-04
Epoch 52/100
64/64 [==============================] - 1s 14ms/step - loss: 1.3719e-04
Epoch 53/100
64/64 [==============================] - 1s 15ms/step - loss: 1.5263e-04
Epoch 54/100
64/64 [==============================] - 1s 15ms/step - loss: 1.0821e-04
Epoch 55/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0546e-04
Epoch 56/100
64/64 [==============================] - 1s 15ms/step - loss: 1.1087e-04
Epoch 57/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0679e-04
Epoch 58/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0365e-04
Epoch 59/100
64/64 [==============================] - 1s 14ms/step - loss: 1.2739e-04
Epoch 60/100
64/64 [==============================] - 1s 14ms/step - loss: 9.7388e-05
Epoch 61/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0564e-04
Epoch 62/100
64/64 [==============================] - 1s 14ms/step - loss: 1.2045e-04
Epoch 63/100
64/64 [==============================] - 1s 14ms/step - loss: 1.1745e-04
Epoch 64/100
64/64 [==============================] - 1s 14ms/step - loss: 1.1334e-04
Epoch 65/100
64/64 [==============================] - 1s 14ms/step - loss: 9.8233e-05
Epoch 66/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0251e-04
Epoch 67/100
64/64 [==============================] - 1s 14ms/step - loss: 9.1577e-05
Epoch 68/100
64/64 [==============================] - 1s 14ms/step - loss: 1.5062e-04
Epoch 69/100
64/64 [==============================] - 1s 14ms/step - loss: 1.2784e-04
Epoch 70/100
64/64 [==============================] - 1s 14ms/step - loss: 1.2438e-04
Epoch 71/100
64/64 [==============================] - 1s 14ms/step - loss: 9.2123e-05
Epoch 72/100
64/64 [==============================] - 1s 14ms/step - loss: 8.7226e-05
Epoch 73/100
64/64 [==============================] - 1s 14ms/step - loss: 8.9998e-05
Epoch 74/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0895e-04
Epoch 75/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0280e-04
Epoch 76/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0889e-04
Epoch 77/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0240e-04
Epoch 78/100
64/64 [==============================] - 1s 14ms/step - loss: 9.2744e-05
Epoch 79/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0128e-04
Epoch 80/100
64/64 [==============================] - 1s 14ms/step - loss: 8.8985e-05
Epoch 81/100
64/64 [==============================] - 1s 14ms/step - loss: 9.8151e-05
Epoch 82/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0025e-04
Epoch 83/100
64/64 [==============================] - 1s 14ms/step - loss: 1.2269e-04
Epoch 84/100
64/64 [==============================] - 1s 14ms/step - loss: 9.3485e-05
Epoch 85/100
64/64 [==============================] - 1s 14ms/step - loss: 9.5440e-05
Epoch 86/100
64/64 [==============================] - 1s 14ms/step - loss: 8.4447e-05
Epoch 87/100
64/64 [==============================] - 1s 14ms/step - loss: 8.2244e-05
Epoch 88/100
64/64 [==============================] - 1s 14ms/step - loss: 8.3451e-05
Epoch 89/100
64/64 [==============================] - 1s 14ms/step - loss: 8.5823e-05
Epoch 90/100
64/64 [==============================] - 1s 14ms/step - loss: 9.1595e-05
Epoch 91/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0452e-04
Epoch 92/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0908e-04
Epoch 93/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0312e-04
Epoch 94/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0181e-04
Epoch 95/100
64/64 [==============================] - 1s 14ms/step - loss: 1.0090e-04
Epoch 96/100
64/64 [==============================] - 1s 14ms/step - loss: 8.3450e-05
Epoch 97/100
64/64 [==============================] - 1s 15ms/step - loss: 8.9355e-05
Epoch 98/100
64/64 [==============================] - 1s 15ms/step - loss: 8.7632e-05
Epoch 99/100
64/64 [==============================] - 1s 14ms/step - loss: 8.4219e-05
Epoch 100/100
64/64 [==============================] - 1s 14ms/step - loss: 9.7970e-05

Model history

history.history['loss'][:10]

[0.005947669502347708,
 0.000399257056415081,
 0.00042879246757365763,
 0.0004657871031668037,
 0.0004088585264980793,
 0.0003349783073645085,
 0.00032332821865566075,
 0.0003780880942940712,
 0.0003595463349483907,
 0.0003238031640648842]

Prepare validation data for prediction

print(len(new_df))
print(len(valid))

1258
558

test_inputs = new_df[len(new_df) - len(valid) - 60:].values
test_inputs[:10]

array([[1642.810059],
       [1538.880005],
       [1530.420044],
       [1598.01001],
       [1665.530029],
       [1665.530029],
       [1627.800049],
       [1642.810059],
       [1755.4899899999998],
       [1754.910034]], dtype=object)

Reshape and transform test_inputs

test_inputs = test_inputs.reshape(-1,1)
test_inputs  = scaler.transform(test_inputs)
test_inputs[:10]

array([[0.35529198],
       [0.31981431],
       [0.31692641],
       [0.33999899],
       [0.36304769],
       [0.36304769],
       [0.35016814],
       [0.35529198],
       [0.39375651],
       [0.39355854]])

Create X_test

X_test = []
for i in range(60, test_inputs.shape[0]):
    X_test.append(test_inputs[i-60:i, 0])
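The index arithmetic behind test_inputs and X_test can be checked with the row counts printed above. This is only a sanity-check sketch; the variable names are made up for the example.

```python
n_rows, n_valid, window = 1258, 558, 60

# First row included in test_inputs: 60 rows before the validation set starts.
start = n_rows - n_valid - window

# Rows in test_inputs: the 60 look-back rows plus the validation rows.
n_inputs = n_rows - start

# Windows produced by range(window, n_inputs): one per validation row.
n_windows = n_inputs - window

print(start, n_inputs, n_windows)
```

The extra 60 rows are needed so that the first validation day also has a full 60-day history to predict from.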

Convert X_test into numpy array

X_test = np.array(X_test)

print(X_test)
print(X_test.shape)

[[0.35529198 0.31981431 0.31692641 ... 0.35165989 0.35433956 0.35942927]
 [0.31981431 0.31692641 0.33999899 ... 0.35433956 0.35942927 0.36476812]
 [0.31692641 0.33999899 0.36304769 ... 0.35942927 0.36476812 0.35361246]
 ...
 [0.85983038 0.87521205 0.86209699 ... 0.89498715 0.91395652 0.92075307]
 [0.87521205 0.86209699 0.85417059 ... 0.91395652 0.92075307 0.94563826]
 [0.86209699 0.85417059 0.85980647 ... 0.92075307 0.94563826 0.94809262]]
(558, 60)

Reshape X_test

X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
print(X_test.shape)

(558, 60, 1)

Predict X_test data

closing_price = model.predict(X_test)
closing_price[:10]

array([[0.35466692],
       [0.3603092 ],
       [0.35042325],
       [0.3362159 ],
       [0.36019176],
       [0.37736833],
       [0.34817305],
       [0.3489977 ],
       [0.35752347],
       [0.35195965]], dtype=float32)

Scaler inverse transformation

closing_price = scaler.inverse_transform(closing_price)
closing_price[:10]

array([[1640.979 ],
       [1657.5078],
       [1628.5474],
       [1586.9277],
       [1657.1638],
       [1707.4817],
       [1621.9556],
       [1624.3713],
       [1649.347 ],
       [1633.0482]], dtype=float32)
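Once the predictions are back in price units, they can be compared with the actual closing prices. The sketch below computes the root mean squared error (RMSE) and mean absolute error (MAE) for only the first five validation days, using the five actual values and five predictions printed in this article. It is an illustration of the calculation, not an evaluation of the whole validation set.

```python
import numpy as np

# First five actual closing prices of the validation set (from valid[:5]).
actual = np.array([1670.569946, 1637.890015, 1593.880005, 1670.430054, 1718.72998])

# First five predictions after inverse scaling (from closing_price[:5]).
predicted = np.array([1640.979, 1657.5078, 1628.5474, 1586.9277, 1657.1638])

errors = predicted - actual
rmse = float(np.sqrt(np.mean(errors ** 2)))   # penalizes large errors more heavily
mae = float(np.mean(np.abs(errors)))          # average size of the error in dollars

print(rmse, mae)
```

To score the full validation set, the same two lines can be applied to all 558 actual and predicted values.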

Visualize actual and predicted stock price

import matplotlib.pyplot as plt

Actual and predicted stock price for test data

train = new_df[:700]
# .copy() makes valid an independent dataframe, so adding a column does not trigger SettingWithCopyWarning
valid = new_df[700:].copy()
valid['Predictions'] = closing_price

plt.figure(figsize=(16,8))
plt.plot(valid['Close'], color = 'green', label = 'Actual Amazon Inc. Stock Price',ls='--')
plt.plot(valid['Predictions'], color = 'red', label = 'Predicted Amazon Inc. Stock Price',ls='-')
plt.title('Predicted Amazon Inc. Stock Price')
plt.xlabel('Time in days')
plt.ylabel('Stock Price')
plt.legend()
plt.show()

Visualize training and test data

plt.figure(figsize=(16,8))
plt.plot(train['Close'], color = 'blue')
plt.plot(valid[['Close','Predictions']])
plt.title('Amazon Inc. Stock Price')
plt.xlabel('Time in days')
plt.ylabel('Stock Price')
plt.show()

Dataframe outputs

The tables below are the outputs of the dataframe cells above, in the order the cells appear.

Output of df after pd.read_csv(filePath):

	Date	Open	High	Low	Close	Adj Close	Volume
0	2016-04-14	615.070007	624.380005	615.070007	620.750000	620.750000	3512100
1	2016-04-15	621.919983	626.770020	618.109985	625.890015	625.890015	2887700
2	2016-04-18	625.349976	637.640015	624.960022	635.349976	635.349976	4360900
3	2016-04-19	637.140015	638.010010	620.799988	627.900024	627.900024	4055900
4	2016-04-20	630.000000	636.549988	623.000000	632.989990	632.989990	2609400
...	...	...	...	...	...	...	...
1253	2021-04-07	3233.800049	3303.610107	3223.649902	3279.389893	3279.389893	3346200
1254	2021-04-08	3310.899902	3324.500000	3292.000000	3299.300049	3299.300049	2812100
1255	2021-04-09	3304.699951	3372.199951	3288.899902	3372.199951	3372.199951	4334600
1256	2021-04-12	3355.209961	3395.040039	3351.149902	3379.389893	3379.389893	3281800
1257	2021-04-13	3400.850098	3432.000000	3395.629883	3400.000000	3400.000000	3304900

Output of df.head():

	Date	Open	High	Low	Close	Adj Close	Volume
0	2016-04-14	615.070007	624.380005	615.070007	620.750000	620.750000	3512100
1	2016-04-15	621.919983	626.770020	618.109985	625.890015	625.890015	2887700
2	2016-04-18	625.349976	637.640015	624.960022	635.349976	635.349976	4360900
3	2016-04-19	637.140015	638.010010	620.799988	627.900024	627.900024	4055900
4	2016-04-20	630.000000	636.549988	623.000000	632.989990	632.989990	2609400

Output of df.tail():

	Date	Open	High	Low	Close	Adj Close	Volume
1253	2021-04-07	3233.800049	3303.610107	3223.649902	3279.389893	3279.389893	3346200
1254	2021-04-08	3310.899902	3324.500000	3292.000000	3299.300049	3299.300049	2812100
1255	2021-04-09	3304.699951	3372.199951	3288.899902	3372.199951	3372.199951	4334600
1256	2021-04-12	3355.209961	3395.040039	3351.149902	3379.389893	3379.389893	3281800
1257	2021-04-13	3400.850098	3432.000000	3395.629883	3400.000000	3400.000000	3304900

Output of new_df right after creation (every value is NaN because no data was passed):

	Date	Close
0	NaN	NaN
1	NaN	NaN
2	NaN	NaN
3	NaN	NaN
4	NaN	NaN
...	...	...
1253	NaN	NaN
1254	NaN	NaN
1255	NaN	NaN
1256	NaN	NaN
1257	NaN	NaN

The output of df after df.sort_index(ascending = True, axis = 0) is identical to the first table, because the index was already in ascending order.

Output of new_df after the fill loop:

	Date	Close
0	2016-04-14	620.75
1	2016-04-15	625.89
2	2016-04-18	635.35
3	2016-04-19	627.9
4	2016-04-20	632.99
...	...	...
1253	2021-04-07	3279.39
1254	2021-04-08	3299.3
1255	2021-04-09	3372.2
1256	2021-04-12	3379.39
1257	2021-04-13	3400
