Lab 1: Welcome to CNN Numpy
Lab 1: Convolution Using NumPy
Task 1: Run the below cell to import the necessary packages.
# Task 1:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
Task 2: Read the image file 'home.png' and return the pixel intensities in NumPy format.
Instructions!
- Read the image file 'home.png' (in the current directory) using the mpimg.imread("file_path") function provided by the matplotlib.image module. This function reads the image and returns the pixel intensities as a NumPy array. Assign the result to the variable img.
- The dimension of img will now be n_H x n_W x n_C.
- Reshape img to dimension m x n_H x n_W x n_C and assign it to the variable data. The dimension m is 1 because we are dealing with a single image. (Use NumPy's reshape().)
- Expected output:
  - <class 'numpy.ndarray'>
  - Image dimension (252, 362, 3)
  - Input data dimension (1, 252, 362, 3)
# Task 2:
###Start code here
img = mpimg.imread("home.png")
data = np.reshape(img,(1,img.shape[0], img.shape[1], img.shape[2]))
###End code
print(type(img))
print("Image dimension ",img.shape)
print("Input data dimension ", data.shape)
# Output:
# <class 'numpy.ndarray'>
# Image dimension (252, 362, 3)
# Input data dimension (1, 252, 362, 3)
Task 3: Run the below cell to view the image from the data.
# Task 3:
plt.imshow(data[0,:,:,:])
plt.grid(False)
plt.axis("off")
Task 4: Define a function zero_pad that returns the nd-array after padding.
Define a function named zero_pad that adds the specified amount of zero padding to the input data.
- Parameters:
  - data: the data on which padding is performed
  - pad: the amount of padding around the data
- Returns:
  - data_padded: the nd-array after padding
# Task 4:
def zero_pad(data, pad):
    ### Start code here
    # Use np.pad to zero-pad the height and width axes only
    data_padded = np.pad(data, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode='constant', constant_values=0)
    ### End code
    return data_padded
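As a quick check of the padding behaviour, the sketch below applies the same np.pad call to a small array of ones. The toy shape (1, 2, 2, 1) and the names toy and toy_padded are made up for illustration; only the height and width axes should grow.

```python
import numpy as np

def zero_pad(data, pad):
    # Pad only the height and width axes; batch and channel axes are untouched.
    return np.pad(data, ((0, 0), (pad, pad), (pad, pad), (0, 0)),
                  mode="constant", constant_values=0)

# Hypothetical toy input: one 2x2 single-channel "image" of ones.
toy = np.ones((1, 2, 2, 1))
toy_padded = zero_pad(toy, 1)

print(toy_padded.shape)        # (1, 4, 4, 1)
print(toy_padded[0, :, :, 0])  # ones in the centre, zeros on the border
```

With pad = p, the height and width each grow by 2p, while m and the channel count stay the same.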
Task 5: Run the below cell to add zero padding using the function defined above.
Expected output:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 1. 1.]
 [0. 0. 1. 1.]]
# Task 5:
print("dimension before padding: ", data.shape)
img_pad = zero_pad(data, 10)
print("dimension after padding: ", img_pad.shape)
print(img_pad[0,8:12,8:12,1])
plt.imshow(img_pad[0,:,:,:], cmap = "gray")
plt.grid(False)
output1 = np.mean(img_pad)
# Output:
'''
dimension before padding: (1, 252, 362, 3)
dimension after padding: (1, 272, 382, 3)
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 1. 1.]
[0. 0. 1. 1.]]
'''
Task 6: Write a function conv_single_step that convolves one slice of the input data with the filter.
Define the function named conv_single_step() to convolve a slice of input data using the specified filter:
- Parameters:
  - data_slice: the receptive field over which convolution is performed
  - W: the filter used for convolution
  - b: the bias term
- Returns:
  - Z: convolved output over the receptive field
# Task 6:
def conv_single_step(data_slice, W, b):
    ### Start code
    # Multiply the data slice and the filter W element-wise, sum the products, and add the bias
    conv = np.sum(data_slice * W) + b
    Z = conv  # Assign the result to Z
    ### End code
    return Z
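To make the single step concrete, here is a small worked example. The 2x2x1 patch, kernel, and bias values are hypothetical and chosen only so that the sum is easy to verify by hand.

```python
import numpy as np

def conv_single_step(data_slice, W, b):
    # Multiply element-wise, sum everything, then add the bias.
    return np.sum(data_slice * W) + b

# Hypothetical 2x2x1 receptive field and filter.
patch = np.array([[1.0, 2.0], [3.0, 4.0]]).reshape(2, 2, 1)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]]).reshape(2, 2, 1)
bias = 0.5

z = conv_single_step(patch, kernel, bias)
print(z)  # 1*1 + 2*0 + 3*0 + 4*(-1) + 0.5 = -2.5
```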
Task 7: Write a function conv_forward that performs strided convolution on the given data.
- Define the function conv_forward to perform strided convolution on the input data.
- Use conv_single_step() to perform the convolution at each stride.
- Refer to the code snippet provided in the course.
- Parameters:
  - data: input data on which convolution is performed
  - W: the filter used for the convolution operation
  - b: the bias term
  - hparams: dictionary defined by {"stride": s, "pad": p}
- Returns:
  - Z: the convolved output
# Task 7:
def conv_forward(data, W, b, hparams):
    ### Start code here
    # Extract the hyperparameters
    stride = hparams["stride"]
    pad = hparams["pad"]
    # Get the dimensions of the input data and the filter
    (m, n_H, n_W, n_C) = data.shape  # m: number of samples
    (f, _, _, n_filters) = W.shape  # W is assumed to be f x f x channels x n_filters
    # Calculate the dimensions of the output
    n_H_out = (n_H - f + 2 * pad) // stride + 1
    n_W_out = (n_W - f + 2 * pad) // stride + 1
    # Initialize the output array with zeros (batch dimension first, number of filters last)
    Z = np.zeros((m, n_H_out, n_W_out, n_filters))
    # Apply padding to the input data
    data_padded = zero_pad(data, pad)
    # Convolve every sample, output position, and filter
    for i in range(m):
        for h in range(n_H_out):
            for w in range(n_W_out):
                for k in range(n_filters):
                    # Define the corners of the slice
                    h_start = h * stride
                    h_end = h_start + f
                    w_start = w * stride
                    w_end = w_start + f
                    # Extract the slice for the i-th sample and convolve it with the k-th filter.
                    # b[0, 0, 0, k] is a scalar, so the result can be stored directly in Z.
                    data_slice = data_padded[i, h_start:h_end, w_start:w_end, :]
                    Z[i, h, w, k] = conv_single_step(data_slice, W[:, :, :, k], b[0, 0, 0, k])
    ### End code
    return Z  # (convolved output)
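The output height and width computed inside conv_forward follow floor((n + 2*pad - f) / stride) + 1. The helper below is a minimal sketch of that arithmetic; the name conv_output_size is hypothetical, and the inputs are the sizes used in the test cell and the edge-detection cell of this lab.

```python
def conv_output_size(n, f, pad, stride):
    # floor((n + 2*pad - f) / stride) + 1, using integer division
    return (n + 2 * pad - f) // stride + 1

# Test cell: 4x4 input, 2x2 filter, pad 1, stride 1.
print(conv_output_size(4, 2, 1, 1))    # 5
# Edge-detection cell: 252x362 image, 3x3 filter, pad 0, stride 1.
print(conv_output_size(252, 3, 0, 1))  # 250
print(conv_output_size(362, 3, 0, 1))  # 360
```

Under these assumptions, the test cell should produce an output of shape (10, 5, 5, 8) and the edge-detection cell an output of shape (1, 250, 360, 1).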
Task 8: Run the cell to perform the convolution operation using the function defined above.
Expected output: 0.145
# Task 8:
np.random.seed(1)
input_ = np.random.randn(10, 4, 4, 3)
W = np.random.randn(2, 2, 3, 8)
b = np.random.randn(1, 1, 1, 8)
hparameters = {"pad" : 1,
"stride": 1}
output_ = conv_forward(input_, W, b, hparameters)
print(np.mean(output_))
Task 9: Run the below cell to define the edge_detect filter. The filter values for edge detection have been defined for you.
# Task 9:
edge_detect = np.array([[-1,-1,-1],[-1,8,-1],[-1,-1,-1]]).reshape((3,3,1,1))
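One way to see why this kernel highlights edges: its entries sum to zero, so a uniform patch gives no response, while a pixel that differs from its neighbours gives a strong one. The two 3x3 patches below are hypothetical and serve only to illustrate this.

```python
import numpy as np

edge_detect = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]])

# Hypothetical 3x3 patches: one uniform, one with a single bright centre pixel.
flat_patch = np.full((3, 3), 0.5)
spot_patch = np.zeros((3, 3))
spot_patch[1, 1] = 1.0

print(edge_detect.sum())                 # 0
print(np.sum(flat_patch * edge_detect))  # 0.0
print(np.sum(spot_patch * edge_detect))  # 8.0
```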
Task 10: Perform strided convolution.
Instructions!
- Define a dictionary hparams with stride = 1 and pad = 0.
- Initialize the bias parameter b to zeros with dimension (1,1,1,1). Hint: use np.zeros().
- Perform strided convolution using the conv_forward() function you defined previously:
  - Pass the edge_detect filter, bias b, and hparams as parameters to perform convolution on the data variable defined previously.
  - Assign the result to the variable Z.
# Task 10:
###Start code
hparams = {"stride": 1, "pad": 0}
b = np.zeros((1, 1, 1, 1)) # Bias initialized to zeros
# Perform convolution using conv_forward
Z = conv_forward(data, edge_detect, b, hparams)
plt.clf()
plt.imshow(Z[0,:,:,0], cmap='gray',vmin=0, vmax=1)
plt.grid(False)
print("dimension of image before convolution: ", data.shape)
print("dimension of image after convolution: ", Z.shape)
output2 = np.mean(Z[0,100:200,200:300,0])
## Below are the filters for vertical and horizontal edge detection. Try these filters once you have completed this hands-on.
##vertical_filter = np.array([[-1,2,-1],[-1,2,-1],[-1,2,-1]]).reshape(3,3,1,1)
##horizontal_filter = np.array([[-1,-1,-1],[2,2,2],[-1,-1,-1]]).reshape((3,3,1,1))
Task 11: Define a function for max pooling.
Define a function max_pool to perform max pooling on the input data:
- Parameters:
  - data: input data on which pooling is performed
  - hparams: dictionary defined by {"f": f, "stride": s}, where 'f' is the filter size and 's' is the stride
- Returns:
  - output: output after pooling
- Refer to the code snippet provided in the course.
# Task 11:
import numpy as np
def max_pool(data, hparams):
    ### Start code
    m, h_prev, w_prev, c_prev = data.shape  # Input shape
    f = hparams["f"]  # Filter size (pooling window)
    stride = hparams["stride"]  # Stride size
    # Calculate output dimensions
    h_out = (h_prev - f) // stride + 1
    w_out = (w_prev - f) // stride + 1
    # Initialize output array
    output = np.zeros((m, h_out, w_out, c_prev))
    # Perform max pooling operation
    for i in range(m):  # Loop over each sample
        for h in range(h_out):  # Loop over output height
            for w in range(w_out):  # Loop over output width
                for c in range(c_prev):  # Loop over each channel
                    h_start = h * stride  # Starting height index
                    h_end = h_start + f  # Ending height index
                    w_start = w * stride  # Starting width index
                    w_end = w_start + f  # Ending width index
                    # Extract the current slice and compute the maximum
                    output[i, h, w, c] = np.max(data[i, h_start:h_end, w_start:w_end, c])
    ### End code
    return output
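A small worked example can help verify the pooling logic by hand. The sketch below pools a hypothetical 4x4 single-channel input holding the values 1 to 16, with f = 2 and stride = 2, so each output cell is the maximum of one non-overlapping 2x2 window.

```python
import numpy as np

# Hypothetical 4x4 single-channel input holding the values 1..16.
x = np.arange(1, 17, dtype=float).reshape(1, 4, 4, 1)
f, stride = 2, 2
h_out = (x.shape[1] - f) // stride + 1
w_out = (x.shape[2] - f) // stride + 1

pooled = np.zeros((1, h_out, w_out, 1))
for h in range(h_out):
    for w in range(w_out):
        # Take the f x f window at this output position and keep its maximum.
        window = x[0, h * stride:h * stride + f, w * stride:w * stride + f, 0]
        pooled[0, h, w, 0] = window.max()

print(pooled[0, :, :, 0])
# [[ 6.  8.]
#  [14. 16.]]
```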
Task 12: Run the below cell to test the function you defined above.
Expected output: 1.075
# Task 12:
pool_params = {"stride" : 2, "f" : 2}
output_ = max_pool(input_, pool_params)
print(np.mean(output_))
# Output:
# 1.0753012177728354
Task 13: Define the pooling parameters "stride" and filter size "f" as a dictionary named hparams with stride = 1 and f = 2. Then call the function max_pool with parameters 'Z' (the convolved output) and 'hparams'.
# Task 13:
###start code
# Define pooling parameters
hparams = {"stride": 1, "f": 2} # Filter size and stride
# Perform max pooling on the convolved output Z
Z_pool = max_pool(Z, hparams)
###End code
print("dimension before pooling :", Z.shape)
print("dimension after pooling :", Z_pool.shape)
plt.imshow(Z_pool[0,:,:,0], cmap = "gray")
with open("output.txt", "w+") as file:
    file.write("output1 = %f" %output1)
    file.write("\noutput2 = %f" %output2)
Lab 2: Welcome to Convolution neural network tensorflow
Lab 2: Convolution Using TensorFlow: CNN_tensorflow.
Task 1: Read the sample image "bird.png".
Instructions!
- Read the image file 'bird.png' (in the current directory) using the mpimg.imread("file_path") function provided by the matplotlib.image module. This function reads the image and returns the pixel intensities as a NumPy array. Assign the result to the variable img.
- The dimension of img will now be n_H x n_W x n_C.
- Reshape img to dimension m x n_H x n_W x n_C and assign it to the variable data. The dimension m is 1 because we are dealing with a single image. (Use NumPy's reshape().)
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import tensorflow as tf
###Start code here
img = mpimg.imread("bird.png")
data = np.reshape(img,(1,img.shape[0], img.shape[1], img.shape[2]))
###End code
print(type(img))
print("Image dimension ",img.shape)
print("input data dimension ", data.shape)
Task 2: Run the below cell to plot the image
plt.imshow(data[0,:,:,:])
Task 3: Single layer convolution
Instructions!
- Initialize the filter variable W with random values using tf.random.normal() and the following filter configuration:
  - num_filters = 32
  - num_rows, num_columns, num_channels = 5, 5, 3
- Initialize the bias variable b using tf.random.normal() with shape 32.
- Using tf.nn.conv2d(), perform strided convolution on input_ using filter W with stride 1 and same padding. Assign the result to the variable conv.
- Use tf.nn.bias_add() to add the bias b to conv and assign the result to the variable conv_bias.
- Apply ReLU activation to conv_bias and assign the result to the variable conv_out.
- Perform max pooling using tf.nn.pool() with a 3 x 3 window and valid padding. Assign the result to the variable conv_pool.
graph = tf.Graph()
with graph.as_default():
    tf.random.set_seed(1)
    input_ = tf.constant(data.astype(np.float32))  ## The input data is converted into a tensor of type float32
    ### Start code here
    # Filter configurations
    num_rows, num_columns, num_channels, num_filters = 5, 5, 3, 32
    # Initialize filter W with random values
    W = tf.Variable(tf.random.normal(shape=(num_rows, num_columns, num_channels, num_filters)))
    # Initialize bias b with random values of shape 32
    b = tf.Variable(tf.random.normal(shape=(num_filters,)))
    # Perform strided convolution with same padding (the bias is added in the next step, not here)
    conv = tf.nn.conv2d(input_, W, strides=[1, 1, 1, 1], padding='SAME')
    # Add bias to the convolution output
    conv_bias = tf.nn.bias_add(conv, b)
    # Apply ReLU activation
    conv_out = tf.nn.relu(conv_bias)
    # Perform max pooling with window size 3x3 and valid padding
    conv_pool = tf.nn.pool(conv_out, window_shape=[3, 3], pooling_type='MAX', padding='VALID')
    ### End code
Task 4: Run the below cell to run the tensorflow graph defined in the above steps.
with tf.compat.v1.Session(graph=graph) as sess:
    sess.run(tf.compat.v1.global_variables_initializer())
    filters = sess.run(W)
    conv_output = sess.run(conv_out)
    after_pooling = sess.run(conv_pool)
###sanity check
print(conv_out)
print(conv_pool)
print(conv_output[0,100:105,200:205, 7])
print("\n", after_pooling[0,100:105,200:205, 7])
print('mean1',np.mean(conv_output))
print('mean2', np.mean(after_pooling))
with open("output.txt", "w+") as file:
    file.write("mean1 = %f" %np.mean(conv_output))
    file.write("\nmean2 = %f" %np.mean(after_pooling))
# Expected output:
'''
Tensor("Relu:0", shape=(1, 194, 259, 32), dtype=float32)
Tensor("max_pool:0", shape=(1, 192, 257, 32), dtype=float32)
[[ 2.35204768 2.43864083 2.06985545 2.01861191 2.53203893]
[ 2.50827527 2.18754387 1.9434787 1.68445456 2.16825724]
[ 2.24186778 2.29028106 2.66557431 2.32409024 2.51346755]
[ 2.09425473 2.65057802 3.0601604 2.65026021 2.57551527]
[ 2.33120751 2.55626559 2.69701314 2.72019339 2.46118355]]
[[ 2.66557431 2.66557431 2.66557431 3.11053085 3.11053085]
[ 3.0601604 3.0601604 3.0601604 3.11053085 3.11053085]
[ 3.0601604 3.0601604 3.0601604 3.11053085 3.11053085]
[ 3.0601604 3.0601604 3.0601604 2.99760103 2.99760103]
[ 2.69701314 2.89145637 3.06911826 3.06911826 3.06911826]]
'''
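The two tensor shapes in the expected output follow from TensorFlow's padding rules: with 'SAME' padding the spatial size is ceil(n / stride), and with 'VALID' padding it is ceil((n - f + 1) / stride). The helper names below are hypothetical; the sketch only reproduces the arithmetic for the 194 x 259 input with stride 1.

```python
import math

def same_output_size(n, stride):
    # 'SAME' padding: the output size depends only on the input size and stride.
    return math.ceil(n / stride)

def valid_output_size(n, f, stride):
    # 'VALID' padding: no padding, so the window must fit entirely inside the input.
    return math.ceil((n - f + 1) / stride)

# Height and width taken from the expected output: 194 x 259, stride 1.
conv_shape = (same_output_size(194, 1), same_output_size(259, 1))
pool_shape = (valid_output_size(194, 3, 1), valid_output_size(259, 3, 1))
print(conv_shape)  # (194, 259)
print(pool_shape)  # (192, 257)
```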
Task 5: Run the below cell to visualize the actual filters and plot the convolution output.
def show_conv_results(data, title):
    fig1 = plt.figure()
    fig1.suptitle(title, fontsize=30)
    rows, cols = 4, 8
    for i in range(np.shape(data)[3]):
        img = data[0, :, :, i]
        ax1 = fig1.add_subplot(rows, cols, i + 1)
        ax1.imshow(img, interpolation='none')
        ax1.axis('off')
def show_weights(W,title):
    fig2 = plt.figure()
    fig2.suptitle(title, fontsize=30)
    rows, cols = 4, 8
    for i in range(np.shape(W)[3]):
        img = W[:, :, 0, i]
        ax2 = fig2.add_subplot(rows, cols, i + 1)
        ax2.imshow(img, interpolation='none')
        ax2.axis('off')
show_weights(filters, title = "filters, "+"shape:" +str(filters.shape))
show_conv_results(conv_output, title = "after_convolution, "+ "shape:" + str(conv_output.shape))
show_conv_results(after_pooling, title = "after_pooling, "+"shape:"+str(after_pooling.shape))