Machine Learning Basics - Part 1
This is going to be a series for those who want to learn machine learning fundamentals, written as I document my own journey through the process.
Before we delve further into the newsletter, let me quickly introduce myself and the purpose of this newsletter. For those who wish to jump right ahead, feel free to skip this section.
- This newsletter reflects my personal and straightforward approach to learning, teaching, and documenting my journey in understanding the fundamentals of Machine Learning.
- I strive to explain concepts in a simple and clear manner, making them accessible to everyone.
- Each blog post will include a Colab button, allowing you to try out the provided code yourself.
- Lastly, you can expect a new blog post every week, ensuring a continuous stream of quality content. Please subscribe to this newsletter to stay updated and not miss any upcoming insights.
🚀 Mastering the Basics of PyTorch Tensors
Before we delve into the coding, let me provide you with a quick introduction to tensors.
In PyTorch, a tensor is a fundamental data structure - think of it as a powerful multi-dimensional array. Tensors are the building blocks for representing and manipulating data in machine learning.
In short, every operation we perform in machine learning is carried out on tensors. Consider it a distinct type, similar to int or float, but specifically designed to take advantage of GPUs, making machine learning training more efficient.
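To make the GPU point concrete, here is a minimal sketch (not from the original examples) that moves a tensor onto the GPU only if one is available:
import torch

t = torch.ones(3)
if torch.cuda.is_available():
    t = t.to("cuda")  # move the tensor onto the GPU
print(t.device)  # prints cpu, or cuda:0 if a GPU is present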
1️⃣ Creating Tensors with torch.arange
import torch
x = torch.arange(12, dtype=torch.float32)
x
Output:
tensor([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11.])
Using the operation above, you can generate a tensor with 12 elements, containing the values 0 through 11.
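torch.arange also accepts start, stop, and step arguments if you need more control. A quick sketch with illustrative values (note that the end value is exclusive):
# Values from 2 up to (but not including) 12, in steps of 2
y = torch.arange(2, 12, 2, dtype=torch.float32)
print(y)  # tensor([ 2.,  4.,  6.,  8., 10.])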
2️⃣ Tensor Operations
# Number of elements
print(f"Number of elements: {x.numel()}")
# Shape
print(f"Shape: {x.shape}")
Output:
Number of elements: 12
Shape: torch.Size([12])
You can inspect a previously declared tensor using .numel() and .shape to retrieve the total number of elements and the shape, respectively.
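Another handy inspection method is .dim(), which returns the number of axes; a small sketch on the same tensor:
# Number of axes (dimensions) of the tensor
print(f"Dimensions: {x.dim()}")  # Dimensions: 1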
3️⃣ Reshaping Tensors
X = x.reshape(3, 4)
X
Output:
tensor([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
You can reshape an existing tensor into your desired shape, provided the total number of elements stays the same (here, 3 × 4 = 12). This allows for flexibility in adapting the structure of tensors to suit the requirements of your machine learning operations.
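As a convenience, you can pass -1 for one dimension and let PyTorch infer it from the total element count; a minimal sketch (X2 is just an illustrative name):
# -1 tells PyTorch to infer this dimension (12 elements / 4 columns = 3 rows)
X2 = x.reshape(-1, 4)
print(X2.shape)  # torch.Size([3, 4])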
4️⃣ Zeros and Ones Tensors
# Tensors with all zeros
print(f"Tensors with all zeros:\n{torch.zeros((2, 3, 4))}")
# Tensors with all ones
print(f"Tensors with all ones:\n{torch.ones((2, 3, 4))}")
Output:
Tensors with all zeros:
tensor([[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]],
[[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]]])
Tensors with all ones:
tensor([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
You can create tensors of zeros and ones in any desired shape using the torch.zeros() and torch.ones() functions, respectively.
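Two related helpers worth knowing, shown here as a small sketch with illustrative values: torch.zeros_like matches the shape of an existing tensor, and torch.full fills a new tensor with an arbitrary constant.
template = torch.ones((2, 3))
# Zeros tensor with the same shape and dtype as template
print(torch.zeros_like(template))
# A (2, 3) tensor filled with the constant 7.0
print(torch.full((2, 3), 7.0))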
5️⃣ Gaussian Distribution
# Random elements from a standard Gaussian distribution
print(f"Random elements from Gaussian distribution:\n{torch.randn(3, 4)}")
Output:
Random elements from Gaussian distribution:
tensor([[-0.2341, -0.6406, 0.3683, 1.6027],
[ 0.8805, -1.0081, -0.5039, -0.7385],
[-1.2223, 0.4821, -0.0089, -0.4169]])
Instead of declaring tensors with zeros and ones, you can create tensors with random values. This is particularly useful when training a neural network, as it helps initialize weights with diverse random values drawn from a Gaussian distribution.
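For instance, a common initialization pattern is to scale standard Gaussian samples down to small values; a minimal sketch (the shape and scale here are illustrative, not a prescription):
# Small random weights: standard Gaussian samples scaled by 0.01
weights = 0.01 * torch.randn(4, 3)
print(weights)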
6️⃣ Manual Tensor Creation
# Creating a tensor with manual values
print(f"Manual values:\n{torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])}")
Output:
Manual values:
tensor([[2, 1, 4, 3],
[1, 2, 3, 4],
[4, 3, 2, 1]])
This is the simplest way to declare a tensor with your own values. Note that torch.tensor infers the dtype from the values; the integers above produce an integer tensor, which is why the output has no decimal points.
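If you want floating-point values instead, you can pass the dtype explicitly; a quick sketch:
# Same values, stored as 32-bit floats
print(torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]], dtype=torch.float32))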
7️⃣ Saving Memory
Y = torch.randn(3, 4)
X = torch.randn(3, 4)

# Y = Y + X allocates a brand-new tensor, so Y's address changes
before = id(Y)
Y = Y + X
print(id(Y) == before)

# Slice assignment writes into Z's existing memory, so its address is stable
Z = torch.zeros_like(Y)
print('id(Z):', id(Z))
Z[:] = X + Y
print('id(Z):', id(Z))

# X += Y updates X in place, so the address is unchanged
before = id(X)
X += Y
print(id(X) == before)
Output:
False
id(Z): 140594373912080
id(Z): 140594373912080
True
This is one of the most useful tricks I picked up while preparing this post, and it matters for memory-efficient operations.
In summary, Y = Y + X creates a new tensor, changing the memory address referenced by Y. In contrast, the slice assignment Z[:] = X + Y writes the result into Z's existing memory, and X += Y modifies X in place, so both preserve the original tensor's memory address.
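Another way to get the same memory-reusing behavior (a small sketch, not from the original examples) is the out= argument that many PyTorch operations accept:
Z = torch.zeros_like(Y)
before = id(Z)
# Write the result of X + Y directly into Z's existing memory
torch.add(X, Y, out=Z)
print(id(Z) == before)  # True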
8️⃣ Conversion
Converting between a torch Tensor and a NumPy array is straightforward.
The torch Tensor and NumPy array share their underlying memory, so changes to one will affect the other.
# NumPy array and torch tensor backed by the same memory
A = X.numpy()
B = torch.from_numpy(A)
print(type(A), type(B))

# A size-1 tensor and its scalar conversions
a = torch.tensor([3.5])
print(a, a.item(), float(a), int(a))
To convert a size-1 tensor to a Python scalar, we can use the item() method or Python’s built-in float() and int() functions.
Output:
<class 'numpy.ndarray'> <class 'torch.Tensor'>
tensor([3.5000]) 3.5 3.5 3
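To see the shared memory in action, here is a quick sketch: mutating the NumPy array A also changes the tensors that share its storage.
A[0, 0] = 100.0  # modify the NumPy array in place
print(X[0, 0], B[0, 0])  # both reflect the change, since memory is shared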
This was just an overview and introduction to tensors. If you want to explore further, check out the PyTorch documentation on tensors.