import torch
from xv.util import listAttr
torch.__version__
use_cuda = torch.cuda.is_available()
use_cuda
This machine has no GPU, so the check returns False.
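A common follow-up is to resolve a torch.device once and create tensors (or move models) on it, so the same code runs with or without a GPU. A minimal sketch of that pattern; the rest of this notebook sticks with the use_cuda flag and .cuda() calls instead:

import torch

# Pick the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Example: create a tensor directly on the chosen device.
x = torch.randn(3, 3, device=device)
print(x.device)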
PyTorch provides two data primitives: torch.utils.data.DataLoader and torch.utils.data.Dataset that allow you to use pre-loaded datasets as well as your own data.
Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset to enable easy access to the samples.
PyTorch domain libraries provide a number of pre-loaded datasets (such as FashionMNIST) that subclass torch.utils.data.Dataset.
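For reference, a custom Dataset only has to implement __len__ and __getitem__; a minimal sketch (the class name and the in-memory tensors are made up for illustration):

import torch
from torch.utils.data import Dataset

class InMemoryDataset(Dataset):
    # Toy Dataset wrapping a tensor of features and a tensor of labels.
    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

# Usage: 100 random "images" of shape 1x28x28 with integer labels 0-9.
ds = InMemoryDataset(torch.randn(100, 1, 28, 28), torch.randint(0, 10, (100,)))
print(len(ds), ds[0][0].shape)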
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor
listAttr(datasets)
We load the MNIST dataset with the following parameters:
root is the path where the train/test data is stored.
train selects the training split or the test split.
download=True downloads the data from the internet if it is not available at root.
transform (callable, optional) – a function/transform that takes in a PIL image and returns a transformed version, e.g. transforms.RandomCrop.
target_transform (callable, optional) – a function/transform that takes in the target and transforms it (an example is sketched after the datasets are created below).
training_data = datasets.MNIST(
root="data",
train=True,
download=True,
transform=ToTensor()
)
test_data = datasets.MNIST(
root="data",
train=False,
download=True,
transform=ToTensor()
)
training_data
test_data
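The target_transform hook can, for example, turn the integer label into a one-hot vector. A minimal sketch, reusing the same MNIST download as above (the variable name one_hot_data is just for illustration):

import torch
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda

one_hot_data = datasets.MNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
    # Scatter a 1 into a zero vector of length 10 at the index given by the label.
    target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))
)
img, target = one_hot_data[0]
print(target)  # e.g. tensor([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.]) for a label of 5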
The Dataset retrieves our dataset’s features and labels one sample at a time. While training a model, we typically want to pass samples in “minibatches”, reshuffle the data at every epoch to reduce model overfitting, and use Python’s multiprocessing to speed up data retrieval.
DataLoader is an iterable that abstracts this complexity for us in an easy API.
from torch.utils.data import DataLoader
loaders = {
'train' : torch.utils.data.DataLoader(training_data,
batch_size=100,
shuffle=True,
num_workers=1),
'test' : torch.utils.data.DataLoader(test_data,
batch_size=100,
shuffle=True,
num_workers=1),
}
loaders
We have loaded the dataset into the DataLoader and can iterate through it as needed.
Each iteration below returns a batch of images and labels (batch_size=100 of each, as configured in the loaders above). Because we specified shuffle=True, the data is reshuffled after we have iterated over all batches.
sample = next(iter(loaders['train']))
imgs, lbls = sample
lbls
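Since each batch is a pair of tensors, the shapes can be checked directly; with the loaders defined above this should show 100 images of size 1x28x28 and 100 labels:

print(f"Image batch shape: {imgs.size()}")  # torch.Size([100, 1, 28, 28])
print(f"Label batch shape: {lbls.size()}")  # torch.Size([100])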
import matplotlib.pyplot as plt
figure = plt.figure(figsize=(10, 8))
cols, rows = 5, 5
for i in range(1, cols * rows + 1):
sample_idx = torch.randint(len(training_data), size=(1,)).item()
img, label = training_data[sample_idx]
figure.add_subplot(rows, cols, i)
plt.title(label)
plt.axis("off")
plt.imshow(img.squeeze(), cmap="gray")
plt.show()
Neural networks are composed of layers/modules that perform operations on data. The torch.nn namespace provides all the building blocks you need to build your own neural network.
Every module in PyTorch subclasses the nn.Module. A neural network is a module itself that consists of other modules (layers). This nested structure allows for building and managing complex architectures easily.
from torch import nn
import torch.nn.functional as F
class CustomPytorchModel(nn.Module):
def __init__(self, input_size = 784):
super().__init__()
self.fc1 = nn.Linear(input_size, 512)
self.fc2 = nn.Linear(512, 256)
self.fc3 = nn.Linear(256, 128)
self.fc4 = nn.Linear(128, 64)
self.fc5 = nn.Linear(64,10)
self.dropout = nn.Dropout(p=0.2)
def forward(self, input_data):
x1 = input_data.view(input_data.shape[0], -1)
x2 = self.dropout(F.relu(self.fc1(x1)))
x3 = self.dropout(F.relu(self.fc2(x2)))
x4 = self.dropout(F.relu(self.fc3(x3)))
x5 = self.dropout(F.relu(self.fc4(x4)))
x6 = F.log_softmax(self.fc5(x5), dim=1)
return x6
# Create the network, define the loss_function and optimizer
model = CustomPytorchModel()
# move model to GPU if CUDA is available
if use_cuda:
model = model.cuda()
print(model)
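For comparison, the same stack of layers could also be written with nn.Sequential; this is only an illustrative sketch, and the class-based model above is the one we actually train:

sequential_model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 512), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(512, 256), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(256, 128), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(128, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 10), nn.LogSoftmax(dim=1)
)
print(sequential_model)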
The loss measures how far the model's predicted output is from the actual target. Because the network's forward pass ends in log_softmax (log-probabilities), we use the negative log-likelihood loss, nn.NLLLoss.
loss_function = nn.NLLLoss()
loss_function
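Because the model outputs log-probabilities, nn.NLLLoss simply picks out the negative log-probability of the correct class for each sample and averages over the batch. A tiny worked example with made-up numbers:

# Two samples, three classes: log-probabilities plus the true class indices.
log_probs = F.log_softmax(torch.tensor([[2.0, 0.5, 0.1],
                                        [0.2, 1.5, 0.3]]), dim=1)
targets = torch.tensor([0, 1])
print(nn.NLLLoss()(log_probs, targets))          # mean of -log_probs[0, 0] and -log_probs[1, 1]
print(-(log_probs[0, 0] + log_probs[1, 1]) / 2)  # the same value, computed by hand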
The weights are updated by an optimizer. torch.optim is a package implementing various optimization algorithms. To use torch.optim, we construct an optimizer object that holds the current state and updates the parameters based on the computed gradients.
To construct an Optimizer you have to give it an iterable containing the parameters to optimize. Then, you can specify optimizer-specific options such as the learning rate, weight decay, etc.
from torch import optim
optimizer = optim.Adam(model.parameters(), lr=0.001)
optimizer
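Optimizer-specific options are passed the same way; for instance, switching to SGD with momentum and weight decay would only change this one line (a sketch for illustration; Adam above is what the rest of the notebook uses):

sgd_optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-5)
sgd_optimizer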
def train(start_epochs, n_epochs, model):
for epoch in range(start_epochs, n_epochs + 1):
print(f"epoch = {epoch}")
pass
# return trained model
return model
pass
train(0, 2, model)
def train(start_epochs, n_epochs, model):
for epoch in range(start_epochs, n_epochs + 1):
# initialize variables to monitor training and validation loss
train_loss = 0.0
valid_loss = 0.0
#Set the model in training mode
model.train()
print(f"epoch = {epoch}")
# return trained model
return model
pass
train(0, 2, model)
def train(start_epochs, n_epochs, model, loaders):
for epoch in range(start_epochs, n_epochs + 1):
# initialize variables to monitor training and validation loss
train_loss = 0.0
valid_loss = 0.0
#Set the model in training mode
model.train()
print(f"batch started: ")
for batch_idx, (data, target) in enumerate(loaders['train']):
#print(f"batch_idx: {batch_idx}")
if batch_idx % 50 == 0:
print(f"{batch_idx}, ", end = "")
pass
print(f"epoch = {epoch}")
# return trained model
return model
pass
train(0, 2, model, loaders)
def train_process_batches(model, loaders, optimizer, loss_function, verbose = True ):
train_loss = 0.0
model.train()
if verbose:
print(f"Training data batch process: ", end = "")
for batch_idx, (data, target) in enumerate(loaders['train']):
# move to GPU
if use_cuda:
data, target = data.cuda(), target.cuda()
#we need to set the gradients to zero before starting to do backpropragation
#because PyTorch accumulates the gradients on subsequent backward passes
optimizer.zero_grad()
#forward pass: compute predicted outputs by passing inputs to the model
output = model(data)
#calculate the batch loss
loss = loss_function(output, target)
#backward pass: compute gradient of the loss with respect to model parameters
loss.backward()
# perform a single optimization step (parameter update)
optimizer.step()
## calculate train_loss
        train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.item() - train_loss))
if batch_idx % 50 == 0:
if verbose:
print(f"\t{batch_idx}, {train_loss}", end = "\n")
else:
print(f"\t{batch_idx}, ", end = "")
pass
return train_loss
pass
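The update train_loss + (1 / (batch_idx + 1)) * (loss - train_loss) is just an incremental (running) mean of the per-batch losses, so we never have to keep a list of them. A quick numeric check with made-up values:

losses = [0.9, 0.7, 0.5]
running = 0.0
for i, l in enumerate(losses):
    running = running + (1 / (i + 1)) * (l - running)
print(running, sum(losses) / len(losses))  # both print 0.7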
def train(start_epochs, n_epochs, model, loaders):
for epoch in range(start_epochs, n_epochs + 1):
print(f"Epoch: {epoch}, ", end = "\n")
# initialize variables to monitor training and validation loss
valid_loss = 0.0
#train model
train_loss = train_process_batches(model, loaders, optimizer, loss_function)
print(f"\ntrain_loss = {train_loss}")
# return trained model
return model
train(0, 1, model, loaders)
def eval_process_batches(model, loaders, optimizer, loss_function, verbose = True ):
valid_loss = 0.0
model.eval()
if verbose:
print(f"Test data batch process: ", end = "")
for batch_idx, (data, target) in enumerate(loaders['test']):
# move to GPU
if use_cuda:
data, target = data.cuda(), target.cuda()
## update the average validation loss
# forward pass: compute predicted outputs by passing inputs to the model
output = model(data)
# calculate the batch loss
loss = loss_function(output, target)
# update average validation loss
        valid_loss = valid_loss + ((1 / (batch_idx + 1)) * (loss.item() - valid_loss))
if batch_idx % 20 == 0:
if verbose:
print(f"\t{batch_idx}, {valid_loss}", end = "\n")
else:
print(f"\t{batch_idx}, ", end = "")
pass
print()
return valid_loss
pass
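As a side note, since no parameters are updated during evaluation, the forward passes in eval_process_batches could also be wrapped in torch.no_grad() to skip gradient bookkeeping; a minimal sketch of the pattern:

model.eval()
with torch.no_grad():  # disable autograd for faster, memory-light inference
    for data, target in loaders['test']:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        output = model(data)
        loss = loss_function(output, target)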
def train(start_epochs, n_epochs, model, loaders):
for epoch in range(start_epochs, n_epochs+1):
print(f"Epoch: {epoch}, ", end = "\n")
# initialize variables to monitor training and validation loss
valid_loss = 0.0
#train model
train_loss = train_process_batches(model, loaders, optimizer, loss_function, verbose = False)
valid_loss = eval_process_batches(model, loaders, optimizer, loss_function, verbose = True)
print(f"\ntrain_loss = {train_loss}")
print(f"\nvalid_loss = {valid_loss}")
# return trained model
return model
train(0, 1, model, loaders)
def train(start_epochs, n_epochs, model, loaders):
for epoch in range(start_epochs, n_epochs+1):
print(f"Epoch: {epoch}, ", end = "\n")
# initialize variables to monitor training and validation loss
valid_loss = 0.0
#train model
train_loss = train_process_batches(model, loaders, optimizer,
loss_function, verbose = False)
valid_loss = eval_process_batches(model, loaders, optimizer,
loss_function, verbose = False)
        # train_loss and valid_loss are already running averages over the batches,
        # so no further normalisation by the dataset size is needed here.
# print training/validation statistics
print('Epoch: {} Training Loss: {:.6f}, Validation Loss: {:.6f}'.format(
epoch,
train_loss,
valid_loss
))
print(f" Over")
# return trained model
return model
train(0, 10, model, loaders)
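After training, a quick sanity check is to measure accuracy on the test set; a minimal sketch using the existing loaders (the exact figure depends on how long the model was trained):

correct, total = 0, 0
model.eval()
with torch.no_grad():
    for data, target in loaders['test']:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        output = model(data)           # log-probabilities of shape [batch, 10]
        pred = output.argmax(dim=1)    # most likely class per sample
        correct += (pred == target).sum().item()
        total += target.size(0)
print(f"Test accuracy: {correct / total:.4f}")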