Convolutional Neural Networks (CNN) Fresco Play Handson Solution HackerRank

Learn crucial concepts of Convolutional Neural Networks including Convolution Operations using NumPy and TensorFlow for pattern recognition & Vision.

Lab 1: Welcome to CNN Numpy

Lab 1: Convolution using NumPy

Task 1: Run the cell below to import the necessary packages

# Task 1:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

Task 2: Read the image file 'home.png' and return the pixel intensities in NumPy format.

Instructions!
  • Read the image file 'home.png' (in the current directory) using the mpimg.imread("file_path") function provided by the matplotlib.image module. This function reads the image and returns the pixel intensities as a NumPy array. Assign the result to variable img.
  • The dimension of img will now be n_H x n_W x n_C (height x width x channels).
  • Reshape img to dimension m x n_H x n_W x n_C and assign it to variable data. The dimension m is one because we are dealing with a single image. (Use NumPy's reshape().)
  • Expected Output:
    • class 'numpy.ndarray'
    • Image dimension (252, 362, 3)
    • input data dimension (1, 252, 362, 3)
# Task 2:
###Start code here
img = mpimg.imread("home.png")
data = np.reshape(img,(1,img.shape[0], img.shape[1], img.shape[2]))
###End code

print(type(img))
print("Image dimension ",img.shape)
print("Input data dimension ", data.shape)

# Output:
# <class 'numpy.ndarray'>
# Image dimension  (252, 362, 3)
# Input data dimension  (1, 252, 362, 3)
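The reshape in Task 2 only adds a leading batch axis of length one. As a small sketch (using a zero-filled stand-in array with the same shape as the image, since 'home.png' itself is not assumed to be available), the following shows three equivalent ways to add that axis:

```python
import numpy as np

# Stand-in for the image; the lab reads 'home.png' with mpimg.imread instead.
img = np.zeros((252, 362, 3), dtype=np.float32)

a = np.reshape(img, (1, *img.shape))   # explicit target shape, as in Task 2
b = img[np.newaxis, ...]               # index with a new leading axis
c = np.expand_dims(img, axis=0)        # helper function for the same operation

print(a.shape, b.shape, c.shape)
```

All three produce an array of shape (1, 252, 362, 3); which one to use is a matter of taste.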

Task 3: Run the below cell to view the image from the data.

# Task 3:
plt.imshow(data[0,:,:,:])
plt.grid(False)
plt.axis("off")

Task 4: Define a function zero_pad and return nd-array after padding.

Define a method named zero_pad that adds the specified amount of zero padding around the height and width of the input data.
  • Parameters:
    • data: the data on which padding is performed
    • pad: the amount of padding around the data
  • Returns:
    • data_padded: the nd-array after padding
# Task 4: 
def zero_pad(data, pad):
    ### Start code here
    # Use np.pad to add padding around the data
    data_padded = np.pad(data, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode='constant', constant_values=0)
    ### End code 
    return data_padded

Task 5: Run the cell below to add zero padding using the method defined above.

Expected output:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 1. 1.]
 [0. 0. 1. 1.]]
# Task 5:
print("dimension before padding: ", data.shape)
img_pad = zero_pad(data, 10)
print("dimension after padding: ", img_pad.shape)
print(img_pad[0,8:12,8:12,1])
plt.imshow(img_pad[0,:,:,:], cmap = "gray")
plt.grid(False)

output1 = np.mean(img_pad)

# Output:
'''
dimension before padding:  (1, 252, 362, 3)
dimension after padding:  (1, 272, 382, 3)
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 1. 1.]
 [0. 0. 1. 1.]]
'''
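To see what the padding does on a scale that fits on screen, here is a minimal sketch using the same np.pad call on a made-up 2x2 single-channel input (the array x is illustrative and not part of the lab):

```python
import numpy as np

x = np.ones((1, 2, 2, 1))   # one 2x2 single-channel "image" of ones
pad = 1

# Pad only the height and width axes; batch and channel axes are left alone.
x_pad = np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)),
               mode="constant", constant_values=0)

print(x_pad.shape)
print(x_pad[0, :, :, 0])
```

The ones end up surrounded by a one-pixel border of zeros. The same reasoning explains the Task 5 printout: with pad = 10, rows and columns 8 and 9 of img_pad are padding, while 10 and 11 are the first image pixels, so the 4x4 patch has zeros in its first two rows and columns.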

Task 6: Write a function for a single convolution step that applies the filter to one slice of the input data.

Define the function named conv_single_step() to convolve a slice of input data using the specified filter:
  • Parameters:
    • data_slice: the receptive field over which convolution is performed
    • W: the filter used for convolution
    • b: the bias term
  • Returns:
    • Z: convolved output over the receptive field
# Task 6:
def conv_single_step(data_slice, W, b):
    ### Start code
    # Multiply the slice and the filter element-wise, sum the products, and add the bias.
    # b may arrive as a (1, 1, 1) array, so squeeze it to keep Z a scalar.
    Z = float(np.sum(data_slice * W) + np.squeeze(b))
    ### End code
    return Z
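A single convolution step is just an element-wise product, a sum, and a bias. The following worked example uses a hand-picked 2x2 slice and filter (both invented for illustration) so the arithmetic can be checked by eye:

```python
import numpy as np

# Illustrative 2x2 single-channel receptive field and filter.
data_slice = np.array([[1., 2.],
                       [3., 4.]]).reshape(2, 2, 1)
W = np.array([[1., 0.],
              [0., -1.]]).reshape(2, 2, 1)
b = 0.5

# 1*1 + 2*0 + 3*0 + 4*(-1) + 0.5 = -2.5
Z = np.sum(data_slice * W) + b
print(Z)  # -2.5
```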

Task 7: Write a function for strided convolution and apply it to the given data.

Define the function named conv_forward() to perform strided convolution on the input data:
  • Use conv_single_step() to perform the convolution at each stride.
  • Refer to the code snippet provided in the course.
  • Parameters:
    • data: input data on which convolution is performed
    • W: the filter used for convolution operation
    • b: the bias term
    • hparams: dictionary defined by {"stride": s, "pad": p}
  • Returns:
    • Z: the convolved output
# Task 7:
def conv_forward(data, W, b, hparams):
    ### Start code here
    
    # Extracting parameters
    stride = hparams["stride"]
    pad = hparams["pad"]

    # Get the dimensions of the input data and the filter
    (m, n_H, n_W, n_C) = data.shape  # Input data dimensions (m: number of samples)
    (f, f, n_C, n_filters) = W.shape  # Filter dimensions (assuming W is f x f x n_C x n_filters)

    # Calculate the dimensions of the output
    n_H_out = int((n_H - f + 2 * pad) / stride) + 1
    n_W_out = int((n_W - f + 2 * pad) / stride) + 1
    
    # Initialize the output array with zeros
    Z = np.zeros((m, n_H_out, n_W_out, n_filters))  # Adjusted to include the batch dimension and number of filters

    # Apply padding to the input data
    data_padded = zero_pad(data, pad)

    # Perform the convolution operation for each sample in the batch
    for i in range(m):  # Loop over each sample
        for h in range(n_H_out):
            for w in range(n_W_out):
                for k in range(n_filters):
                    # Define the corners of the slice
                    h_start = h * stride
                    h_end = h_start + f
                    w_start = w * stride
                    w_end = w_start + f

                    # Extract the current slice from the padded data for the i-th sample
                    # Perform convolution for each filter
                    Z[i, h, w, k] = conv_single_step(data_padded[i, h_start:h_end, w_start:w_end, :], W[:,:,:,k], b[:,:,:,k])

    ### End code
    return Z  # (convolved output)
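The nested loops above are easy to follow but slow on a full-size image. As a sketch of one possible alternative (not the lab's expected solution), the same computation can be expressed with NumPy's sliding_window_view and einsum. The function name conv_forward_vectorized is hypothetical, and the sketch assumes the filter has the same number of channels as the input:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_forward_vectorized(data, W, b, stride=1, pad=0):
    """Hypothetical vectorized variant of conv_forward (illustration only).

    data: (m, n_H, n_W, n_C), W: (f, f, n_C, n_filters), b: (1, 1, 1, n_filters)
    """
    f = W.shape[0]
    x = np.pad(data, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant")
    # All f x f windows over height and width: (m, H', W', n_C, f, f)
    windows = sliding_window_view(x, (f, f), axis=(1, 2))
    # Keep only the windows that start at multiples of the stride.
    windows = windows[:, ::stride, ::stride]
    # Sum over the channel axis and both window axes for every filter k.
    return np.einsum("mhwcij,ijck->mhwk", windows, W) + b.reshape(1, 1, 1, -1)

np.random.seed(1)
x = np.random.randn(10, 4, 4, 3)
W_demo = np.random.randn(2, 2, 3, 8)
b_demo = np.random.randn(1, 1, 1, 8)
out = conv_forward_vectorized(x, W_demo, b_demo, stride=1, pad=1)
print(out.shape)
```

The output shape matches what the loop version produces for the same inputs, (10, 5, 5, 8). sliding_window_view requires NumPy 1.20 or newer.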

Task 8: Run the cell to perform the convolution operation using the method defined above.

Expected output: 0.145

# Task 8:
np.random.seed(1)
input_ = np.random.randn(10, 4, 4, 3)
W = np.random.randn(2, 2, 3, 8)
b = np.random.randn(1, 1, 1, 8)
hparameters = {"pad" : 1,
               "stride": 1}

output_ = conv_forward(input_, W, b, hparameters)
print(np.mean(output_))
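The output size used inside conv_forward follows the formula floor((n - f + 2 * pad) / stride) + 1 along each spatial axis. A small helper (the name conv_output_size is made up for this sketch) makes it easy to check shapes before running the loops:

```python
def conv_output_size(n, f, pad, stride):
    """Spatial output size of a convolution along one axis."""
    return (n - f + 2 * pad) // stride + 1

# Task 8 input: 4x4 spatial size, 2x2 filter, pad 1, stride 1.
print(conv_output_size(4, 2, pad=1, stride=1))     # 5

# The 252x362 image with a 3x3 filter, no padding, stride 1.
print(conv_output_size(252, 3, pad=0, stride=1),
      conv_output_size(362, 3, pad=0, stride=1))   # 250 360
```

So output_ in Task 8 has shape (10, 5, 5, 8): ten samples, a 5x5 spatial grid, and one channel per filter.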

Task 9: Run the cell below to define the edge_detect filter. The filter values for edge detection have been defined for you.

# Task 9:
edge_detect = np.array([[-1,-1,-1],[-1,8,-1],[-1,-1,-1]]).reshape((3,3,1,1))

Task 10: Perform strided convolution.

Instructions!
  • Define a dictionary hparams with stride = 1 and pad = 0
  • Initialize the bias parameter b to zeros with dimension (1,1,1,1) (hint: use np.zeros())
  • Perform strided convolution using the method conv_forward() you defined previously:
    • Pass the edge_detect filter, bias b, and hparams as parameters to perform convolution on the data variable defined previously.
    • Assign the result to variable Z
# Task 10:
###Start code
hparams = {"stride": 1, "pad": 0}
b = np.zeros((1, 1, 1, 1))  # Bias initialized to zeros

# Perform convolution using conv_forward
Z = conv_forward(data, edge_detect, b, hparams)

plt.clf()
plt.imshow(Z[0,:,:,0], cmap='gray',vmin=0, vmax=1)
plt.grid(False)
print("dimension of image before convolution: ", data.shape)
print("dimension of image after convolution: ", Z.shape)

output2 = np.mean(Z[0,100:200,200:300,0])


##Below are the filters for vertical and horizontal edge detection; try these filters once you have completed this hands-on.
##vertical_filter = np.array([[-1,2,-1],[-1,2,-1],[-1,2,-1]]).reshape(3,3,1,1)
##horizontal_filter = np.array([[-1,-1,-1],[2,2,2],[-1,-1,-1]]).reshape((3,3,1,1))
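The edge_detect kernel works because its weights sum to zero: a region of uniform intensity produces no response, while a change in intensity around the centre pixel produces a non-zero one. The sketch below applies the 3x3 kernel to two hand-made patches (both invented for illustration):

```python
import numpy as np

kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]], dtype=float)

# Uniform patch: every pixel has the same intensity.
flat = np.full((3, 3), 0.5)

# Patch with a vertical edge: dark left column, bright centre and right columns.
edge = np.array([[0., 1., 1.],
                 [0., 1., 1.],
                 [0., 1., 1.]])

flat_response = np.sum(flat * kernel)   # 8*0.5 - 8*0.5 = 0
edge_response = np.sum(edge * kernel)   # 8*1 - 5*1 = 3
print(flat_response, edge_response)
```

This is why the convolved image in Task 10 is mostly dark, with bright pixels concentrated along outlines.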

Task 11: Define a function Max pooling.

Define the method max_pool to perform max pooling on the input data:
  • Parameters:
    • data: input data on which pooling is performed
    • hparams: dictionary defined by {"f": f, "stride": s}, where 'f' is the pooling window size and 's' is the stride
  • Returns:
    • output: output after pooling
  • Refer to the code snippet provided in the course.
# Task 11:
import numpy as np

def max_pool(data, hparams):
    ### Start code
    
    m, h_prev, w_prev, c_prev = data.shape  # Input shape
    f = hparams["f"]  # Filter size (pooling window)
    stride = hparams["stride"]  # Stride size
    
    # Calculate output dimensions
    h_out = int((h_prev - f) / stride) + 1
    w_out = int((w_prev - f) / stride) + 1
    
    # Initialize output array
    output = np.zeros((m, h_out, w_out, c_prev))
    
    # Perform max pooling operation
    for i in range(m):  # Loop over each sample
        for h in range(h_out):  # Loop over output height
            for w in range(w_out):  # Loop over output width
                for c in range(c_prev):  # Loop over each channel
                    h_start = h * stride  # Starting height index
                    h_end = h_start + f  # Ending height index
                    w_start = w * stride  # Starting width index
                    w_end = w_start + f  # Ending width index
                    
                    # Extract the current slice and compute the maximum
                    output[i, h, w, c] = np.max(data[i, h_start:h_end, w_start:w_end, c])
    
    ### End code
    return output

Task 12: Run the cell below to test the method you defined above.

Expected output: 1.075

# Task 12:
pool_params = {"stride" : 2, "f" : 2}
output_ = max_pool(input_, pool_params)
print(np.mean(output_))
# Output: 
# 1.0753012177728354
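To make the pooling windows concrete, here is a small sketch that applies 2x2 max pooling with stride 2 to a made-up 4x4 single-channel input holding the values 0 to 15, using the same slicing logic as max_pool:

```python
import numpy as np

x = np.arange(16, dtype=float).reshape(1, 4, 4, 1)
f, stride = 2, 2
out = np.zeros((1, 2, 2, 1))

for h in range(2):
    for w in range(2):
        # Window covering rows h*stride..h*stride+f and the matching columns.
        window = x[0, h * stride:h * stride + f, w * stride:w * stride + f, 0]
        out[0, h, w, 0] = window.max()

print(out[0, :, :, 0])
# [[ 5.  7.]
#  [13. 15.]]
```

Each output value is the largest entry of one non-overlapping 2x2 block, so the 4x4 input shrinks to 2x2.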

Task 13: Define the pooling parameters as a dictionary named hparams with stride = 1 and filter size f = 2, then call max_pool with 'Z' (the convolved output) and 'hparams'.

# Task 13:
###start code
# Define pooling parameters
hparams = {"stride": 1, "f": 2}  # Filter size and stride

# Perform max pooling on the convolved output Z
Z_pool = max_pool(Z, hparams)
###End code

print("dimension before pooling :", Z.shape)
print("dimension after pooling :", Z_pool.shape)

plt.imshow(Z_pool[0,:,:,0], cmap = "gray")

with open("output.txt", "w+") as file:
    file.write("output1 = %f" %output1)
    file.write("\noutput2 = %f" %output2)

Lab 2: Welcome to Convolution neural network tensorflow

Lab 2: Convolution Using TensorFlow: CNN_tensorflow.

Task 1: Read the sample image "bird.png".

Instruction!
  • Read the image file 'bird.png' (in the current directory) using the mpimg.imread("file_path") function provided by the matplotlib.image module. This function reads the image and returns the pixel intensities as a NumPy array. Assign the result to variable img.
  • The dimension of img will now be n_H x n_W x n_C (height x width x channels).
  • Reshape img to dimension m x n_H x n_W x n_C and assign it to variable data. The dimension m is one because we are dealing with a single image. (Use NumPy's reshape().)

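import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import tensorflow as tf  # used from Task 3 onwards

###Start code here
img = mpimg.imread("bird.png")
data = np.reshape(img,(1,img.shape[0], img.shape[1], img.shape[2]))
###End code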

print(type(img))
print("Image dimension ",img.shape)
print(img.shape)
print("input data dimension ", data.shape)

Task 2: Run the below cell to plot the image

plt.imshow(data[0,:,:,:])

Task 3: Single layer convolution

Instruction!
  • Initialise the filter variable W with random values using tf.random_normal() (tf.random.normal() in TensorFlow 2) and the following filter configuration:
    • num_filters = 32
    • num_rows, num_columns, num_channels = 5,5,3
  • Initialize the bias variable b using tf.random_normal() with shape 32
  • Using tf.nn.conv2d(), perform strided convolution on input_ with filter W, a stride of one, and same padding; assign the result to variable conv
  • Use tf.nn.bias_add() to add bias b to conv and assign the result to variable conv_bias
  • Apply ReLU activation to conv_bias and assign the result to variable conv_out
  • Perform max pooling with tf.nn.pool() using a window of size 3 x 3 and valid padding, and assign the result to variable conv_pool

graph = tf.Graph()   
with graph.as_default():
    tf.random.set_seed(1)
    input_= tf.constant(data.astype(np.float32))  ## The input data is converted into a tensor of type float32
    ### Start code here
    # Filter configurations
    num_rows, num_columns, num_channels,num_filters = 5, 5, 3, 32

    # Initialize filter W with random values 
    W = tf.Variable(tf.random.normal(shape=(num_rows, num_columns, num_channels, num_filters)))
    
    # Initialize bias b with random values of shape 32 
    b = tf.Variable(tf.random.normal(shape=(num_filters,)))

    # Perform strided convolution with same padding (the bias is added once, in the next step)
    conv = tf.nn.conv2d(input_, W, strides=[1, 1, 1, 1], padding='SAME')

    # Add bias to the convolution output
    conv_bias = tf.nn.bias_add(conv, b)

    # Apply ReLU activation
    conv_out = tf.nn.relu(conv_bias)

    # Perform max pooling with filter size 3x3 and valid padding
    conv_pool = tf.nn.pool(conv_out, window_shape=[3, 3], pooling_type='MAX', padding='VALID')

    ### End code
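The tensor shapes produced by this graph follow from TensorFlow's documented padding rules: with 'SAME' padding the output size is ceil(n / stride), and with 'VALID' padding it is ceil((n - f + 1) / stride). The helper names below are invented for this sketch, and the 194 x 259 input size is taken from the expected output listed for this lab:

```python
import math

def same_out(n, stride=1):
    """Output size along one axis with 'SAME' padding."""
    return math.ceil(n / stride)

def valid_out(n, f, stride=1):
    """Output size along one axis with 'VALID' padding and window size f."""
    return math.ceil((n - f + 1) / stride)

h, w = 194, 259                                  # spatial size of the input image

conv_shape = (same_out(h), same_out(w))          # 5x5 conv, stride 1, SAME
pool_shape = (valid_out(conv_shape[0], 3),       # 3x3 max pool, stride 1, VALID
              valid_out(conv_shape[1], 3))

print(conv_shape, pool_shape)
```

With 32 filters this gives (1, 194, 259, 32) after the convolution and (1, 192, 257, 32) after pooling.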

Task 4: Run the below cell to run the tensorflow graph defined in the above steps.


with tf.compat.v1.Session(graph=graph) as sess:
    sess.run(tf.compat.v1.global_variables_initializer())
    filters = sess.run(W)
    
    conv_output = sess.run(conv_out)
    
    after_pooling = sess.run(conv_pool)

###sanity check
print(conv_out)
print(conv_pool)
print(conv_output[0,100:105,200:205, 7])
print("\n", after_pooling[0,100:105,200:205, 7])

print('mean1',np.mean(conv_output))
print('mean2', np.mean(after_pooling))
with open("output.txt", "w+") as file:
    file.write("mean1 = %f" %np.mean(conv_output))
    file.write("\nmean2 = %f" %np.mean(after_pooling))
    
# Expected output:
'''
Tensor("Relu:0", shape=(1, 194, 259, 32), dtype=float32)

Tensor("max_pool:0", shape=(1, 192, 257, 32), dtype=float32)

[[ 2.35204768 2.43864083 2.06985545 2.01861191 2.53203893]
[ 2.50827527 2.18754387 1.9434787 1.68445456 2.16825724]
[ 2.24186778 2.29028106 2.66557431 2.32409024 2.51346755]
[ 2.09425473 2.65057802 3.0601604 2.65026021 2.57551527]
[ 2.33120751 2.55626559 2.69701314 2.72019339 2.46118355]]

[[ 2.66557431 2.66557431 2.66557431 3.11053085 3.11053085]
[ 3.0601604 3.0601604 3.0601604 3.11053085 3.11053085]
[ 3.0601604 3.0601604 3.0601604 3.11053085 3.11053085]
[ 3.0601604 3.0601604 3.0601604 2.99760103 2.99760103]
[ 2.69701314 2.89145637 3.06911826 3.06911826 3.06911826]]
'''

Task 5: Run the below cell to visualize the actual filters and plot the convolution output.


def show_conv_results(data, title):
    fig1 = plt.figure()
    fig1.suptitle(title, fontsize=30)
    rows, cols = 4, 8
    for i in range(np.shape(data)[3]):
        img = data[0, :, :, i]
        ax1 = fig1.add_subplot(rows, cols, i + 1)
        ax1.imshow(img, interpolation='none')
        ax1.axis('off')
        

def show_weights(W,title):
    fig2 = plt.figure()
    fig2.suptitle(title, fontsize=30)
    rows, cols = 4, 8
    for i in range(np.shape(W)[3]):
        img = W[:, :, 0, i]
        ax2 = fig2.add_subplot(rows, cols, i + 1)
        ax2.imshow(img, interpolation='none')
        ax2.axis('off')

show_weights(filters, title = "filters, "+"shape:" +str(filters.shape))
show_conv_results(conv_output, title = "after_convolution, "+ "shape:" + str(conv_output.shape))
show_conv_results(after_pooling, title = "after_pooling, "+"shape:"+str(after_pooling.shape))
        

About the author

D Shwari
I'm a professor in the Department of Computer Science at National University. My main areas are data science and data analysis, along with project management for several computer-science-related sectors. My next project is on AI with deep learning.
