Learn practical skills, build real-world projects, and advance your career

PyTorch - Introduction to Tensors and some interesting functions

What are "Tensors" and why do we need them?

Let's start with a very simple definition of "Tensors" - tensors are a generalization of matrices and can be understood as multidimensional arrays. A tensor can be a zero-dimensional scalar/constant, a one-dimensional vector, or any n-dimensional array.

Python/NumPy users know well that NumPy ndarrays make mathematical and scientific computation efficient thanks to the ability to "broadcast" mathematical operations. Broadcasting lets many independent element-wise operations run in parallel, increasing the efficiency of code manifold. This makes ndarrays very useful in fields like Deep Learning, where a lot of calculation (algebra/calculus) is needed to optimize the underlying neural networks.
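As a quick illustration of broadcasting (a made-up example, not from the original post), a scalar and a 1-D array can be combined with a 2-D array without writing any explicit loops:

```python
import numpy as np

a = np.array([[1., 2., 3.],
              [4., 5., 6.]])   # shape (2, 3)
b = np.array([10., 20., 30.])  # shape (3,)

# The scalar 2 and the 1-D array b are "broadcast" across
# every row of a; NumPy performs all the element-wise
# operations in one vectorized pass.
result = a * 2 + b
print(result)
# [[12. 24. 36.]
#  [18. 30. 42.]]
```

Writing the same computation with Python loops would be far slower for large arrays, which is why broadcasting matters for numerical work.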

What makes PyTorch so important?

According to Wikipedia: "PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook's AI Research lab (FAIR)."

The PyTorch library interoperates closely with NumPy, so it is easy to read big datasets into Python as ndarray objects and then convert them into Tensor objects, which allow us to have very fast computations. Several other libraries, such as TensorFlow/Keras and Theano, use the same concept, and each has its own framework to build, train, and optimize neural networks. PyTorch can also use GPUs to further speed up calculations, since GPUs are built to perform very large numbers of matrix operations in parallel - exactly what tensor computations require.
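A minimal sketch of that ndarray-to-tensor workflow (assuming torch and numpy are installed; the array contents are arbitrary):

```python
import numpy as np
import torch

data = np.array([[1., 2.], [3., 4.]])

# torch.from_numpy creates a tensor that shares memory
# with the ndarray - no copy is made.
t = torch.from_numpy(data)
print(t.dtype)   # torch.float64, inherited from the ndarray

# Move the tensor to the GPU only if one is available,
# so the same code runs on CPU-only machines too.
device = "cuda" if torch.cuda.is_available() else "cpu"
t = t.to(device)
print(t.device)
```

Because `torch.from_numpy` shares memory with the source array, modifying one on the CPU modifies the other; use `torch.tensor(data)` instead when you want an independent copy.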

This is what makes libraries like PyTorch incredibly important. A deep learning model that would otherwise take a day to train on CPUs can often be trained with PyTorch on a GPU in just a few hours.

Some basic functions

Here I've chosen some basic PyTorch tensor functions and explained their application through examples and errors:

  • backward()
  • flatten()
  • eq()
  • flip()
  • exp()
# Import torch and other required modules
import jovian
import torch
import numpy as np

Function 1 - torch.Tensor.backward()

Computes the gradient of current tensor w.r.t. graph leaves.

The graph is differentiated using the chain rule. If the tensor is non-scalar (i.e. its data has more than one element) and requires gradient, the function additionally requires specifying gradient. It should be a tensor of matching type and location, that contains the gradient of the differentiated function w.r.t. self.

# Example 1 - working
x = torch.tensor(10.)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(25., requires_grad=True)
y = w * x + b
y.backward()
print("Gradient for w :",w.grad)
print("Gradient for b :",b.grad)
Gradient for w : tensor(10.)
Gradient for b : tensor(1.)

In the above code block, we have defined a scalar tensor "y" that depends on x, w and b.

The tensors "w" and "b" require gradients, so the backward() function can be called on y to get the partial derivatives of y w.r.t. w and b.

These gradients are printed after calling the backward function. Everything works fine.
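To illustrate the non-scalar case mentioned earlier, here is a small hypothetical example (not part of the original notebook): when the tensor has more than one element, backward() requires an explicit gradient argument of matching shape.

```python
import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
y = x * 2                 # y is a non-scalar tensor

# y.backward() alone would raise an error here.
# Passing a gradient of ones computes d(sum(y))/dx.
y.backward(gradient=torch.ones_like(y))
print(x.grad)             # tensor([2., 2., 2.])
```

Since each element of y is 2 * x_i, the gradient with respect to each element of x is 2, which is what gets accumulated in x.grad.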