Convolutional Neural Networks (CNN) Fresco Play Handson Solution HackerRank

Lab 1: Welcome to CNN Numpy

Lab 1: Convolution using NumPy

Task 1: Run the cell below to import the necessary packages.

# Task 1:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

Task 2: Read the image file 'home.png' and return the pixel intensities in numpy format.

Instructions!

Read the image file 'home.png' (in the current directory) using the mpimg.imread("file_path") function provided by the matplotlib.image module. This function reads the image and returns the pixel intensities in numpy format. Assign the result to the variable img.
The dimension of img will now be: 𝑛_H x 𝑛_w x 𝑛_c
Reshape img to dimension "𝑚 x 𝑛_H x 𝑛_w x 𝑛_c" and assign it to the variable data. The dimension m is one because we are dealing with a single image. (Use numpy's reshape().)
Expected Output:
- class 'numpy.ndarray'
- Image dimension (252, 362, 3)
- input data dimension (1, 252, 362, 3)

# Task 2:
###Start code here
img = mpimg.imread("home.png")
data = np.reshape(img,(1,img.shape[0], img.shape[1], img.shape[2]))
###End code

print(type(img))
print("Image dimension ",img.shape)
print("Input data dimension ", data.shape)

# Output:
# <class 'numpy.ndarray'>
# Image dimension  (252, 362, 3)
# Input data dimension  (1, 252, 362, 3)
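
The leading batch axis can also be added without spelling out every dimension. The following is a minimal sketch, assuming a placeholder array of zeros stands in for 'home.png' (it has the same shape as the image above but not its pixel values); the names data_reshape, data_expand and data_newaxis are illustrative only.

```python
import numpy as np

# Placeholder with the same shape as the lab image (assumption: not the real pixels).
img = np.zeros((252, 362, 3), dtype=np.float32)

# Three equivalent ways to add a leading batch axis of size 1.
data_reshape = np.reshape(img, (1, *img.shape))
data_expand = np.expand_dims(img, axis=0)
data_newaxis = img[np.newaxis, ...]

print(data_reshape.shape, data_expand.shape, data_newaxis.shape)
```

All three give an array of shape (1, 252, 362, 3); np.reshape with explicit dimensions is what the task asks for.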

Task 3: Run the below cell to view the image from the data.

# Task 3:
plt.imshow(data[0,:,:,:])
plt.grid(False)
plt.axis("off")

Task 4: Define a function zero_pad that returns the nd-array after padding.

Define a method named zero_pad that adds the specified amount of zero padding around the height and width of the input data.

Parameters:
- data: the data on which padding is performed
- pad: the amount of padding around the data
Returns:
- data_padded: the nd-array after padding

# Task 4: 
def zero_pad(data, pad):
    ### Start code here
    # Use np.pad to add padding around the data
    data_padded = np.pad(data, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode='constant', constant_values=0)
    ### End code 
    return data_padded
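
To see what the pad-width tuple does, here is a small sketch on a toy array (the 1 x 2 x 2 x 1 array of ones is an assumption chosen for readability, not lab data). Only the height and width axes receive padding; the batch and channel axes are left alone.

```python
import numpy as np

def zero_pad(data, pad):
    # Pad only axes 1 (height) and 2 (width); batch and channel axes get (0, 0).
    return np.pad(data, ((0, 0), (pad, pad), (pad, pad), (0, 0)),
                  mode="constant", constant_values=0)

x = np.ones((1, 2, 2, 1))   # one sample, 2x2 pixels, one channel
x_pad = zero_pad(x, 1)

print(x_pad.shape)          # height and width each grow by 2 * pad
print(x_pad[0, :, :, 0])    # ones in the centre, zeros on the border
```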

Task 5: Run the cell below to add zero padding using the method defined above.

Expected output:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 1. 1.]
 [0. 0. 1. 1.]]

# Task 5:
print("dimension before padding: ", data.shape)
img_pad = zero_pad(data, 10)
print("dimension after padding: ", img_pad.shape)
print(img_pad[0,8:12,8:12,1])
plt.imshow(img_pad[0,:,:,:], cmap = "gray")
plt.grid(False)

output1 = np.mean(img_pad)

# Output:
'''
dimension before padding:  (1, 252, 362, 3)
dimension after padding:  (1, 272, 382, 3)
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 1. 1.]
 [0. 0. 1. 1.]]
'''

Task 6: Write a single-step convolution function that applies the filter to one slice of the input data.

Define the function named conv_single_step() to convolve a slice of input data using the specified filter:

Parameters:
- data_slice: the receptive field over which convolution is performed
- W: the filter used for convolution
- b: the bias term
Returns:
- Z: convolved output over the receptive field

# Task 6:
def conv_single_step(data_slice, W, b):
    ### Start code
    # Element-wise multiplication of the data slice and the filter W
    conv = np.sum(data_slice * W) + b  # Perform convolution and add the bias
    Z = conv  # Assign the result to Z
    ### End code
    return Z
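
A worked example makes the single step concrete. This is a sketch with hand-picked values (the slice, filter and bias below are assumptions for illustration). Note that the filter is not flipped, so strictly speaking this is cross-correlation, which is the usual convention in deep-learning code.

```python
import numpy as np

def conv_single_step(data_slice, W, b):
    # Element-wise product, summed over the whole receptive field, plus bias.
    return np.sum(data_slice * W) + b

data_slice = np.array([[1.0, 2.0],
                       [3.0, 4.0]]).reshape(2, 2, 1)
W = np.array([[1.0, 0.0],
              [0.0, -1.0]]).reshape(2, 2, 1)
b = 0.5

# 1*1 + 2*0 + 3*0 + 4*(-1) = -3, then add the bias 0.5
Z = conv_single_step(data_slice, W, b)
print(Z)  # -2.5
```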

Task 7: Write a strided-convolution function and apply it to the given data.

Define the method conv_forward to perform strided convolution on the input data.
Use conv_single_step() to perform the convolution at each stride.
Refer to the code snippet provided in the course.
Parameters:
- data: input data on which convolution is performed
- W: the filter used for convolution operation
- b: the bias term
- hparams: dictionary defined by {"stride": s, "pad": p}
Returns:
- Z: the convolved output

# Task 7:
def conv_forward(data, W, b, hparams):
    ### Start code here
    
    # Extracting parameters
    stride = hparams["stride"]
    pad = hparams["pad"]

    # Get the dimensions of the input data and the filter
    (m, n_H, n_W, n_C) = data.shape  # Input data dimensions (m: number of samples)
    (f, f, n_C, n_filters) = W.shape  # Filter dimensions (assuming W is f x f x n_C x n_filters)

    # Calculate the dimensions of the output
    n_H_out = int((n_H - f + 2 * pad) / stride) + 1
    n_W_out = int((n_W - f + 2 * pad) / stride) + 1
    
    # Initialize the output array with zeros
    Z = np.zeros((m, n_H_out, n_W_out, n_filters))  # Adjusted to include the batch dimension and number of filters

    # Apply padding to the input data
    data_padded = zero_pad(data, pad)

    # Perform the convolution operation for each sample in the batch
    for i in range(m):  # Loop over each sample
        for h in range(n_H_out):
            for w in range(n_W_out):
                for k in range(n_filters):
                    # Define the corners of the slice
                    h_start = h * stride
                    h_end = h_start + f
                    w_start = w * stride
                    w_end = w_start + f

                    # Extract the current slice from the padded data for the i-th sample,
                    # convolve it with the k-th filter, and add that filter's scalar bias
                    Z[i, h, w, k] = conv_single_step(data_padded[i, h_start:h_end, w_start:w_end, :], W[:,:,:,k], b[0, 0, 0, k])

    ### End code
    return Z  # (convolved output)
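
The output-size arithmetic used above is n_out = floor((n + 2*pad - f) / stride) + 1. The helper below is a small sketch of that formula (the name conv_output_size is hypothetical, not part of the lab), evaluated for the shapes that appear in Tasks 8 and 10.

```python
def conv_output_size(n, f, pad, stride):
    # Number of positions a filter of size f can take along an axis of length n.
    return (n + 2 * pad - f) // stride + 1

# Task 8: 4x4 input, 2x2 filter, pad 1, stride 1
print(conv_output_size(4, 2, pad=1, stride=1))    # 5

# Task 10: 252x362 image, 3x3 filter, pad 0, stride 1
print(conv_output_size(252, 3, pad=0, stride=1))  # 250
print(conv_output_size(362, 3, pad=0, stride=1))  # 360
```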

Task 8: Run the cell to perform the convolution operation using the method defined above.

Expected output: 0.145

# Task 8:
np.random.seed(1)
input_ = np.random.randn(10, 4, 4, 3)
W = np.random.randn(2, 2, 3, 8)
b = np.random.randn(1, 1, 1, 8)
hparameters = {"pad" : 1,
               "stride": 1}

output_ = conv_forward(input_, W, b, hparameters)
print(np.mean(output_))
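
One way to gain confidence in a loop-based convolution is to compare it against a vectorized version. The sketch below is not part of the lab: it re-implements the loops compactly and checks them against an alternative built on NumPy's sliding_window_view and einsum. The function names and the random test data are assumptions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv_forward_loops(data, W, b, stride, pad):
    """Reference version with explicit loops, in the spirit of Task 7."""
    padded = np.pad(data, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant")
    m, n_H, n_W, _ = data.shape
    f, _, _, n_filters = W.shape
    n_H_out = (n_H + 2 * pad - f) // stride + 1
    n_W_out = (n_W + 2 * pad - f) // stride + 1
    Z = np.zeros((m, n_H_out, n_W_out, n_filters))
    for i in range(m):
        for h in range(n_H_out):
            for w in range(n_W_out):
                patch = padded[i, h * stride:h * stride + f, w * stride:w * stride + f, :]
                for k in range(n_filters):
                    Z[i, h, w, k] = np.sum(patch * W[:, :, :, k]) + b[0, 0, 0, k]
    return Z


def conv_forward_vectorized(data, W, b, stride, pad):
    """Same computation using sliding windows and einsum (no Python loops)."""
    padded = np.pad(data, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant")
    f = W.shape[0]
    # windows has shape (m, H - f + 1, W - f + 1, C, f, f)
    windows = sliding_window_view(padded, (f, f), axis=(1, 2))
    windows = windows[:, ::stride, ::stride]
    # Sum over channel c and window offsets i, j for every filter k.
    return np.einsum("mhwcij,ijck->mhwk", windows, W) + b.reshape(1, 1, 1, -1)


rng = np.random.default_rng(0)
x = rng.standard_normal((10, 4, 4, 3))
W = rng.standard_normal((2, 2, 3, 8))
b = rng.standard_normal((1, 1, 1, 8))

Z_loops = conv_forward_loops(x, W, b, stride=1, pad=1)
Z_vec = conv_forward_vectorized(x, W, b, stride=1, pad=1)
print(Z_loops.shape)
print(np.allclose(Z_loops, Z_vec))
```

The vectorized form is far faster on real images, but the loop version is easier to read and is what the task asks for.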

Task 9: Run the cell below to define the edge_detect filter. The filter values for edge detection have been defined for you.

# Task 9:
edge_detect = np.array([[-1,-1,-1],[-1,8,-1],[-1,-1,-1]]).reshape((3,3,1,1))
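
The entries of this kernel sum to zero, which is why uniform regions produce no response while isolated intensity changes stand out. A quick sketch on two hand-made 3 x 3 patches (the patches flat and spot are assumptions for illustration):

```python
import numpy as np

kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]])

flat = np.full((3, 3), 0.5)   # uniform region, no edges
spot = np.zeros((3, 3))       # a single bright pixel on a dark background
spot[1, 1] = 1.0

print(kernel.sum())             # 0
print(np.sum(flat * kernel))    # 0.0
print(np.sum(spot * kernel))    # 8.0
```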

Task 10: Perform strided convolution.

Instructions!

Define a dictionary hparams with stride = 1 and pad = 0.
Initialize the bias parameter b to zeros with dimension (1,1,1,1). Hint: use np.zeros().
Perform strided convolution using the method conv_forward() you defined previously:
- pass the edge_detect filter, bias b and hparams as parameters to perform convolution on the data variable defined previously.
- assign the result to the variable Z

# Task 10:
###Start code
hparams = {"stride": 1, "pad": 0}
b = np.zeros((1, 1, 1, 1))  # Bias initialized to zeros

# Perform convolution using conv_forward
Z = conv_forward(data, edge_detect, b, hparams)

plt.clf()
plt.imshow(Z[0,:,:,0], cmap='gray',vmin=0, vmax=1)
plt.grid(False)
print("dimension of image before convolution: ", data.shape)
print("dimension of image after convolution: ", Z.shape)

output2 = np.mean(Z[0,100:200,200:300,0])


##Below are the filters for vertical and horizontal edge detection; try these filters once you have completed this hands-on.
##vertical_filter = np.array([[-1,2,-1],[-1,2,-1],[-1,2,-1]]).reshape(3,3,1,1)
##horizontal_filter = np.array([[-1,-1,-1],[2,2,2],[-1,-1,-1]]).reshape((3,3,1,1))
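
To get a feel for these two filters before running them on the image, the sketch below applies each to a 3 x 3 patch containing a single line (the patches vertical_line and horizontal_line are made-up test inputs). Each filter responds strongly to a line in its own orientation and not at all to the other.

```python
import numpy as np

vertical_filter = np.array([[-1, 2, -1],
                            [-1, 2, -1],
                            [-1, 2, -1]])
horizontal_filter = np.array([[-1, -1, -1],
                              [ 2,  2,  2],
                              [-1, -1, -1]])

vertical_line = np.zeros((3, 3))
vertical_line[:, 1] = 1.0      # bright middle column
horizontal_line = np.zeros((3, 3))
horizontal_line[1, :] = 1.0    # bright middle row

print(np.sum(vertical_line * vertical_filter))      # 6.0
print(np.sum(horizontal_line * vertical_filter))    # 0.0
print(np.sum(horizontal_line * horizontal_filter))  # 6.0
print(np.sum(vertical_line * horizontal_filter))    # 0.0
```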

Task 11: Define a function Max pooling.

Define method max_pool to perform max pooling on the input data:

Parameters:
- data: input data on which pooling is performed
- hparams: dictionary defined by {"f": f, "stride": s}, where 'f' is the pooling window size and 's' is the stride
Returns:
- output: output after pooling
Refer to the code snippet provided in the course.

# Task 11:
import numpy as np

def max_pool(data, hparams):
    ### Start code
    
    m, h_prev, w_prev, c_prev = data.shape  # Input shape
    f = hparams["f"]  # Filter size (pooling window)
    stride = hparams["stride"]  # Stride size
    
    # Calculate output dimensions
    h_out = int((h_prev - f) / stride) + 1
    w_out = int((w_prev - f) / stride) + 1
    
    # Initialize output array
    output = np.zeros((m, h_out, w_out, c_prev))
    
    # Perform max pooling operation
    for i in range(m):  # Loop over each sample
        for h in range(h_out):  # Loop over output height
            for w in range(w_out):  # Loop over output width
                for c in range(c_prev):  # Loop over each channel
                    h_start = h * stride  # Starting height index
                    h_end = h_start + f  # Ending height index
                    w_start = w * stride  # Starting width index
                    w_end = w_start + f  # Ending width index
                    
                    # Extract the current slice and compute the maximum
                    output[i, h, w, c] = np.max(data[i, h_start:h_end, w_start:w_end, c])
    
    ### End code
    return output
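
For the special case where the stride equals the window size (non-overlapping windows), max pooling can also be written with a reshape. This is an alternative sketch, not the loop-based method the task asks for; the name max_pool_blocks and the toy input are assumptions.

```python
import numpy as np

def max_pool_blocks(data, f):
    # Only valid when stride == f, so the windows tile the input without overlap.
    m, h, w, c = data.shape
    h_out, w_out = h // f, w // f
    trimmed = data[:, :h_out * f, :w_out * f, :]   # drop rows/cols that do not fill a window
    blocks = trimmed.reshape(m, h_out, f, w_out, f, c)
    return blocks.max(axis=(2, 4))                 # max over each f x f block

x = np.arange(16, dtype=float).reshape(1, 4, 4, 1)
pooled = max_pool_blocks(x, 2)
print(pooled[0, :, :, 0])
# [[ 5.  7.]
#  [13. 15.]]
```

With overlapping windows (for example f = 2 and stride = 1, as in Task 13) this shortcut does not apply and the loop version above is needed.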

Task 12: Run the cell below to test the method you defined above.

Expected output: 1.075

# Task 12:
pool_params = {"stride" : 2, "f" : 2}
output_ = max_pool(input_, pool_params)
print(np.mean(output_))
# Output: 
# 1.0753012177728354

Task 13: Define the pooling parameters "stride" and filter size "f" as a dictionary named hparams with stride = 1 and f = 2. Then call the method max_pool with the parameters 'Z' (the convolved output) and 'hparams'.

# Task 13:
###start code
# Define pooling parameters
hparams = {"stride": 1, "f": 2}  # Filter size and stride

# Perform max pooling on the convolved output Z
Z_pool = max_pool(Z, hparams)
###End code

print("dimension before pooling :", Z.shape)
print("dimension after pooling :", Z_pool.shape)

plt.imshow(Z_pool[0,:,:,0], cmap = "gray")

with open("output.txt", "w+") as file:
    file.write("output1 = %f" %output1)
    file.write("\noutput2 = %f" %output2)

Lab 2: Welcome to Convolutional Neural Network TensorFlow

Lab 2: Convolution Using TensorFlow: CNN_tensorflow.

Task 1: Read the sample image "bird.png".

Instruction!

Read the image file 'bird.png' (in the current directory) using the mpimg.imread("file_path") function provided by the matplotlib.image module. This function reads the image and returns the pixel intensities in numpy format. Assign the result to the variable img.
The dimension of img will now be 𝑛_𝐻 x 𝑛_𝑤 x 𝑛_𝑐
Reshape img to dimension 𝑚 x 𝑛_𝐻 x 𝑛_𝑤 x 𝑛_𝑐 and assign it to the variable data. The dimension m is one because we are dealing with a single image. (Use numpy's reshape().)


###Start code here
img = mpimg.imread("bird.png")
data = np.reshape(img,(1,img.shape[0], img.shape[1], img.shape[2]))
###End code

print(type(img))
print("Image dimension ",img.shape)
print(img.shape)
print("input data dimension ", data.shape)

Task 2: Run the below cell to plot the image

plt.imshow(data[0,:,:,:])

Task 3: Single layer convolution