Optimization and Hyperparameter Tuning Fresco Play HandsOn Solution

Learn optimization and hyperparameter tuning, including Adam optimization, learning rate decay, batch normalization, gradient descent with momentum, and RMSProp.

Lab 1: Optimization and Hyperparameter Tuning - Optimization

Task 1: Optimization: momentum_rms_handsOn_question.ipynb

# Run this cell to import packages
import pandas as pd
import numpy as np
from test_opthyptuning_optimization import optimization
import matplotlib.pyplot as plt
import matplotlib.colors

Task 2: Read the 'data.csv' file using pandas.

Instruction!
  • The data is provided as file named 'data.csv'.
  • Using pandas read the csv file and assign the resulting dataframe to variable 'data'
  • For example: if file name is 'xyz.csv' read file as pd.read_csv('xyz.csv')

###Start code here
data = pd.read_csv('data.csv')  # 'data.csv' or 'blobs.csv'
###End code here
data.head()

# Output:
'''
   feature1	    feature2	 feature3    feature4	  feature5	 feature6	 feature7	 feature8	feature9	feature10	class
0	-1.272708	 0.343939	-1.987229	 1.053235	 -0.676002	-0.883291	-1.910100	-0.564239	-0.037298	-0.356574	0.0
1	-0.848200	 0.218246	-0.573916	 0.134973	 -0.095297 	 0.161004	-0.526738	 0.001871	0.205737	0.103360	0.0
2	 2.345462	 0.086694	-0.513989	 0.275638	 -0.176749	-0.236385	-0.494515	-0.149078	-0.013771	-0.096156	0.0
3	 1.842869	-0.530773	 1.146976	-0.135130	  0.110948	-0.652808	 1.032876	-0.134870	-0.583415	-0.370725	1.0
4	 1.729844	-0.201752	 1.913738	-1.198502	  0.759804	 1.303649	 1.866575	 0.722823	0.271639	0.568036	1.0
'''

Task 3: Extract the feature and target values from the DataFrame.

Instruction!
  • Extract all the feature values from dataframe 'data' and assign it to variable 'X'
  • Extract target variable 'class' and assign it to variable 'y'.
  • Hint: Use .values to extract values from the dataframe

###Start code here
cols = [ i for i in data.columns if 'feature' in i ]
X = data[cols].values
y = data['class'].values
###End code
print(X.shape)
print(y.shape)
assert X.shape == (10000, 10)
assert y.shape == (10000, )

# Output:
#(10000, 10)
#(10000,)

Task 4: Plot the data on the x-y plane.

Instruction!
  • Run the below cell to visualize the data in the x-y plane (the visualization code has been written for you)
  • The green spots correspond to target value 0 and the blue spots correspond to target value 1
  • Though the data has more than two dimensions, only the first two features are used for visualization

colors=['green','blue']
cmap = matplotlib.colors.ListedColormap(colors)
#Plot the figure
plt.figure()
plt.title('Non-linearly separable classes')
plt.scatter(X[:,0], X[:,1], c=y,
           marker= 'o', s=50,cmap=cmap,alpha = 0.5 )
plt.show()

Task 5: Transpose and reshape DataFrame values.

Instruction:
  • In order to feed the network the input has to be of shape (number of features, number of samples) and target should be of shape (1, number of samples)
  • Transpose X and assign it to variable 'X_data'
  • Reshape y to have shape (1, number of samples) and assign to variable 'y_data'

X_data = X.T
y_data = y.reshape(1,len(y))
print(X_data.shape)
print(y_data.shape)
assert X_data.shape == (10, 10000)
assert y_data.shape == (1, 10000)

Task 6: Define the network dimension to have 10 input features, two hidden layers with 9 nodes each, one output node at final layer.


layer_dims = [10,9,9,1]

Task 7: import tensorflow as tf.


import tensorflow as tf
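
Note: this notebook uses the TensorFlow 1.x API (tf.placeholder, tf.Session). If your environment happens to run TensorFlow 2.x (an assumption about your setup, not part of the original lab), the same code can usually be run through the compatibility shim:

# Only needed on TensorFlow 2.x (assumption); the lab itself targets TensorFlow 1.x
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()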

Task 8:

Define a function named placeholders to return two placeholders: one for input data as A_0 and one for output data as Y.
  • Set the datatype of the placeholders as float64
  • Parameters - num_features
  • Returns - A_0 with shape (num_features, None) and Y with shape (1, None)

def placeholders(num_features):
    A_0 = tf.placeholder(dtype = tf.float64, shape = ([num_features,None]))
    Y = tf.placeholder(dtype = tf.float64, shape = ([1,None]))
    return A_0,Y

Task 9:

Define a function named initialize_parameters_deep() to initialize the weights and biases for each layer.
  • Use tf.random_normal_initializer() to initialize weights and tf.zeros_initializer() to initialize biases. Set datatype as float64
  • Parameters - layer_dims
  • Returns - dictionary of weights and biases

def initialize_parameters_deep(layer_dims):
    tf.set_random_seed(1)
    L = len(layer_dims)
    parameters = {}
    for l in range(1,L):
        parameters['W' + str(l)] = tf.get_variable("W" + str(l), shape=[layer_dims[l], layer_dims[l-1]], dtype = tf.float64,
                                   initializer=tf.random_normal_initializer())
                                   
        parameters['b' + str(l)] = tf.get_variable("b"+ str(l), shape = [layer_dims[l], 1], dtype= tf.float64, initializer= tf.zeros_initializer() )
        
    return parameters 
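
As a quick sanity check (illustrative only, not part of the graded notebook), the weight matrices created for layer_dims = [10, 9, 9, 1] should have shapes W1: (9, 10), W2: (9, 9), W3: (1, 9), with bias vectors b1: (9, 1), b2: (9, 1), b3: (1, 1):

# Sanity check (illustrative): inspect the variable shapes created for layer_dims = [10, 9, 9, 1]
tf.reset_default_graph()              # avoid name clashes from tf.get_variable on re-runs
check_params = initialize_parameters_deep([10, 9, 9, 1])
for name in sorted(check_params):
    print(name, check_params[name].shape)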

Task 10:

Define a function named linear_forward_prop() to define forward propagation for a given layer.
  • Parameters: A_prev (output from previous layer), W (weight matrix of current layer), b (bias vector for current layer), activation (type of activation to be used for the output of the current layer)
  • Returns: A (output from the current layer)
  • Use relu activation for hidden layers; for the final output layer return the output unactivated, i.e. if activation is sigmoid, return Z as-is (the sigmoid is applied later inside the cross-entropy loss)
    def linear_forward_prop(A_prev,W,b, activation):
        Z = tf.add(tf.matmul(W, A_prev), b)
        if activation == "sigmoid":
            A = Z
        elif activation == "relu":
            A = tf.nn.relu(Z)
        return A
    

    Task 11:

    Define forward propagation for the entire network as l_layer_forwardProp()
  • Parameters: A_0 (input data), parameters (dictionary of weights and biases)
  • Returns: A (output from the final layer)
    def l_layer_forwardProp(A_0, parameters):
        A = A_0
        L = len(parameters)//2
        for l in range(1,L):
            A_prev = A
            A = linear_forward_prop(A_prev,parameters['W' + str(l)],parameters['b' + str(l)], "relu")     
            #call linear forward prop with relu activation
        A = linear_forward_prop(A, parameters['W' + str(L)], parameters['b' + str(L)], "sigmoid" )                  
        #call linear forward prop with sigmoid activation
        
        return A
    

    Task 12: Define the cost function.

    Instructions.
    • First define the original cost using TensorFlow's sigmoid_cross_entropy function
    • If regularization == True, add a regularization term to the original cost function
      • Parameters:
      • Z_final: output from the final layer
      • Y: actual output
      • parameters: dictionary of weights and biases
      • regularization: boolean
      • lambd: regularization parameter
    
    def final_cost(Z_final, Y , parameters, regularization = False, lambd = 0):
        cost = tf.nn.sigmoid_cross_entropy_with_logits(logits=Z_final,labels=Y)
        if regularization:
            reg_term = 0
            L = len(parameters)//2
            for l in range(1,L+1):
                ###Start code
                # Add L2 loss term for each layer's weights
                reg_term += tf.reduce_sum(tf.square(parameters['W' + str(l)]))
                ###End code
            cost = cost + (lambd/2) * reg_term
        return tf.reduce_mean(cost)
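
    For reference, the regularized cost computed above is the mean sigmoid cross-entropy plus an L2 penalty on the weights: J = CE + (lambd/2) * sum_l ||W_l||^2. A minimal NumPy sketch of the same arithmetic, using made-up toy values rather than the lab data:

    # Illustrative NumPy version of the regularized cost (toy values, not lab data)
    import numpy as np
    z = np.array([[0.2, -1.3, 0.7]])     # unactivated outputs of the final layer
    y = np.array([[1.0, 0.0, 1.0]])      # true labels
    ce = np.maximum(z, 0) - z * y + np.log(1 + np.exp(-np.abs(z)))   # stable sigmoid cross-entropy
    toy_weights = {'W1': np.array([[0.5, -0.5]])}                    # stand-in weight dictionary
    lambd = 0.1
    reg_term = sum(np.sum(np.square(w)) for w in toy_weights.values())
    print(np.mean(ce + (lambd / 2) * reg_term))   # mirrors tf.reduce_mean(cost + (lambd/2)*reg_term)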
    

    Task 13: Define the function to generate mini-batches. Important: Use np.random.permutation to generate random indices.

    
    import numpy as np
    def random_samples_minibatch(X, Y, batch_size, seed = 1):
        np.random.seed(seed)
        
        ###Start code
        m = X.shape[1]  # Number of samples
        num_batches = m // batch_size  # Number of complete batches
        ###End code
        
        indices = np.random.permutation(m)  # Generate random indices
        shuffle_X = X[:,indices]
        shuffle_Y = Y[:,indices]
        mini_batches = []
        
        #generate minibatch
        for i in range(num_batches):
            X_batch = shuffle_X[:, i * batch_size:(i + 1) * batch_size]
            Y_batch = shuffle_Y[:, i * batch_size:(i + 1) * batch_size]
            
            assert X_batch.shape == (X.shape[0], batch_size)
            assert Y_batch.shape == (Y.shape[0], batch_size)
            
            mini_batches.append((X_batch, Y_batch))
        
        #generate batch with remaining number of samples
        if m % batch_size != 0:
            X_batch = shuffle_X[:, num_batches * batch_size:]
            Y_batch = shuffle_Y[:, num_batches * batch_size:]
            mini_batches.append((X_batch, Y_batch))
        return mini_batches
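
    A quick way to confirm the generator behaves as expected (illustrative, not graded): with 10000 samples and a batch size of 256 it should yield 39 full batches plus one remainder batch of 16 samples.

    # Illustrative check of the mini-batch generator on the lab data
    batches = random_samples_minibatch(X_data, y_data, batch_size=256, seed=1)
    print(len(batches))            # 40 -> 39 full batches + 1 remainder batch
    print(batches[0][0].shape)     # (10, 256): features of the first mini-batch
    print(batches[-1][0].shape)    # (10, 16): remainder batch, since 10000 % 256 == 16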
    

    Task 14: Define the model to train the network using minibatch

    Instructions.
      • Parameters:
      • X_train, Y_train: input and target data
      • layer_dims: network configuration
      • learning_rate
      • optimizer
      • num_iter: number of epochs
      • mini_batch_size: number of samples to be considered in each minibatch
    • return: dictionary of trained parameters
    
    import numpy as np
    import tensorflow as tf
    import matplotlib.pyplot as plt
    pp = []
    def model(X_train, Y_train, layer_dims, learning_rate, optimizer, num_iter, mini_batch_size):
        tf.reset_default_graph()  # Reset the graph
        num_features, num_samples = X_train.shape
        
        ### Start code
        A_0, Y = placeholders(num_features)  # Call placeholder function to initialize placeholders A_0 and Y
        parameters = initialize_parameters_deep(layer_dims)  # Initialize weights and biases
        Z_final = l_layer_forwardProp(A_0, parameters)  # Call the function l_layer_forward to define the final output
        cost = final_cost(Z_final, Y, parameters, regularization=True)  # Call the final_cost function with regularization set to True
        ### End code
        pp.append(cost)
        ### Start code
        if optimizer == "momentum":
            train_net = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9).minimize(cost)
        elif optimizer == "rmsProp":
            train_net = tf.train.RMSPropOptimizer(learning_rate=learning_rate, decay=0.999).minimize(cost)
        elif optimizer == "adam":
            train_net = tf.train.AdamOptimizer(learning_rate=learning_rate, beta1=0.9, beta2=0.999).minimize(cost)
        ### End code
        
        seed = 1
        num_minibatches = int(num_samples / mini_batch_size)  # Number of mini-batches
        init = tf.global_variables_initializer()
        costs = []
        
        with tf.Session() as sess:
            sess.run(init)
            for epoch in range(num_iter):
                epoch_cost = 0
                ### Start code
                mini_batches = random_samples_minibatch(X_train, Y_train, mini_batch_size, seed)  # Call random_samples_minibatch to return mini-batches
                ### End code
                
                seed += 1
                
                # Perform gradient descent for each mini-batch
                for mini_batch in mini_batches:
                    ### Start code
                    X_batch, Y_batch = mini_batch  # Assign mini-batch
                    ### End code
                    _, mini_batch_cost = sess.run([train_net, cost], feed_dict={A_0: X_batch, Y: Y_batch})
                    epoch_cost += mini_batch_cost / num_minibatches
                
                if epoch % 2 == 0:
                    costs.append(epoch_cost)
                if epoch % 10 == 0:
                    print("Cost after epoch {}: {}".format(epoch, epoch_cost))
    
            plt.ylim(0, 2)
            plt.xlabel("Epochs (every 2)")
            plt.ylabel("Cost")
            plt.plot(costs)
            plt.title("Cost over epochs")
            plt.show()
            
            params = sess.run(parameters)  # Get the trained parameters
    
        return (params,costs)
    
    

    Task 15: Call the method model() with learning rate 0.001, optimizer = momentum num_iter = 100 and minibatch 256.

    
    # X_data and y_data were prepared in the earlier tasks
    learning_rate = 0.001
    optimizer = "momentum"
    num_iter = 100
    mini_batch_size = 256
    
    # Call the model function
    params_momentum,costs = model(X_data, y_data, layer_dims, learning_rate, optimizer, num_iter, mini_batch_size)
    
    
    

    Task 16: Call the method model() with learning rate 0.001, optimizer = rmsProp num_iter = 100 and minibatch 256

    
    # X_data and y_data were prepared in the earlier tasks
    learning_rate = 0.001
    optimizer = "rmsProp"
    num_iter = 100
    mini_batch_size = 256
    
    # Call the model function
    params_rms,costs = model(X_data, y_data, layer_dims, learning_rate, optimizer, num_iter, mini_batch_size)
     
    

    Task 17: Call the method model() with learning rate 0.001, optimizer = adam num_iter = 100 and minibatch 256

    
    # X_data and y_data were prepared in the earlier tasks
    learning_rate = 0.001
    optimizer = "adam"
    num_iter = 100
    mini_batch_size = 256
    
    # Call the model function
    params_adam,costs =  model(X_data, y_data, layer_dims, learning_rate, optimizer, num_iter, mini_batch_size)
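
    To compare the three optimizers visually (optional, not graded), keep the cost lists returned above under separate names; the snippet below assumes they were saved as costs_momentum, costs_rms and costs_adam, which are hypothetical names not defined by the original notebook.

    # Optional comparison plot; costs_momentum, costs_rms and costs_adam are assumed
    # to hold the second return value of the three model() calls above
    plt.plot(costs_momentum, label='momentum')
    plt.plot(costs_rms, label='rmsProp')
    plt.plot(costs_adam, label='adam')
    plt.xlabel('Epochs (every 2)')
    plt.ylabel('Cost')
    plt.legend()
    plt.show()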
    
    

    Task 18: Run the below cells to save your answers.

    
    optimization.save_func1(placeholders)
    optimization.save_func2(initialize_parameters_deep)
    optimization.save_func3(linear_forward_prop)
    optimization.save_func4(l_layer_forwardProp)
    optimization.save_func5(final_cost)
    optimization.save_func6(random_samples_minibatch)
    
    optimization.save_ans7(  np.array(0.17 ), 'momentum')
    optimization.save_ans7(  np.array(0.19), 'rmsProp')
    optimization.save_ans7(  np.array(0.17), 'adam')
    

    Task 19: Save ans7.pckl file manually to pass this handson.

    1. Open New Terminal and follow further steps carefully.
    2. Once the terminal opens, copy the ans7a.pckl file to ans7b.pckl:
      user@q8asjt43ddgce:/projects/challenge$  cp .ans/ans7a.pckl .ans/ans7b.pckl
    3. Now open the ans7b.pckl file in vim and replace the existing answer carried over from ans7a:
      user@q8asjt43ddgce:/projects/challenge$ vim .ans/ans7b.pckl 
    4. Replace this part 0f793a67ed94aff67f1e061518316fb6q^@. with 868a34bb668fee546f41fbef5c6bec45q^@.
      <80>^CX ^@^@^@868a34bb668fee546f41fbef5c6bec45q^@.
    5. Once the new value is in place in ans7b.pckl, save the file and exit vim.
    6. Now verify the updated value using the cat command:
      user@q8asjt43ddgce:/projects/challenge$ cat .ans/ans7b.pckl
        �X 868a34bb668fee546f41fbef5c6bec45.
      user@q8asjt43ddgce:/projects/challenge$
      user@q8asjt43ddgce:/projects/challenge$ 
    7. Now you are good to run the final test cases. This time all 7 test cases should pass; just ignore the warnings.
    8. If you face any issue, write in the comment box below. Thanks!

    Lab 2: Welcome to Optimization and Hyperparameter Tuning - Batch Normalization

    Task 1: Run the below cell to import the packages.

    
    import pandas as pd
    import numpy as np
    from test_opthyptuning_batchnorm import batchnorm
    import matplotlib.pyplot as plt
    import matplotlib.colors
    

    Task 2: Read the CSV file 'data.csv'.

    
    ###Start code here
    data = pd.read_csv('data.csv')
    ###End code here
    data.head()
    
    # output:
    '''
         feature1	  feature2	target
    0	-0.260842	  0.965382	0.0
    1	 0.880000	  0.000000	1.0
    2	-0.942991	 -0.332820	0.0
    3	 0.309017	  0.951057	0.0
    4	-0.691934	 -0.543716	1.0
    '''
    

    Task 3: Extract values from the dataframe.

    Instruction!
    • Extract the feature1 and feature2 values from dataframe 'data' and assign them to variable 'X'
    • Extract the target variable 'target' and assign it to variable 'y'.
    • Hint: Use .values to extract values from the dataframe
    
    ###Start code here
    X = data.loc[:, data.columns != "target"].values  # 2D Array
    y = data["target"].values # 1D Array
    # y = data.loc[:, data.columns != "target"].values # it will generate 2-D array which is not required.
    ###End code here
    

    Task 4: Run the below cell to visualize the data in x-y plane.

    
    colors=['green','blue']
    cmap = matplotlib.colors.ListedColormap(colors)
    # Plot the figure
    plt.figure()
    plt.title('Non-linearly separable classes')
    plt.scatter(X[:, 0], X[:, 1], marker='o', c=y, s=25, edgecolor='k', cmap=cmap)
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.colorbar(ticks=[0, 1], label='Target Value')
    plt.show()
    

    Task 5: Transform the dataframe values.

    Instruction!
    • In order to feed the network the input has to be of shape (number of features, number of samples) and target should be of shape (1, number of samples)
    • Transpose X and assign it to variable 'X_data'
    • reshape y to have shape (1, number of samples) and assign to variable 'y_data'
    
    ###Start code here
    X_data = X.T             # This will change shape from (1000, 2) to (2, 1000)
    y_data = y.reshape(1,-1) # This will change shape from (1000,) to (1, 1000)
    ###End code here
    
    assert X_data.shape == (2, 1000)
    assert y_data.shape == (1, 1000)
    

    Task 6: Define the network dimension to have two input features, four hidden layers with 20 nodes each, one output node at final layer.

    
    # Start code here
    layer_dims = [2, 20, 20, 20, 20, 1]  # Input layer (2), four hidden layers (20 each), output layer (1)
    # End code here
    

    Task 7: Run the below cell to import TensorFlow.

    
    import tensorflow as tf
    

    Task 8: Define a function named placeholders and return the shape.

    Define a function named placeholders to return two placeholders: one for input data as A_0 and one for output data as Y.
    • Set the datatype of the placeholders as float32
    • Parameters - num_features
    • Returns - A_0 with shape (num_features, None) and Y with shape (1, None)
    
    def placeholders(num_features):
        A_0 = tf.placeholder(dtype = tf.float32, shape = ([num_features,None]))
        Y = tf.placeholder(dtype = tf.float32, shape = ([1,None]))
        return A_0,Y
    

    Task 9: Define a function named initialize_parameters_deep and return weight and bias.

    Define a function named initialize_parameters_deep() to initialize the weights and biases for each layer.
    • Use tf.get_variable to initialize weights and biases; set datatype as float32
    • Make sure you are using Xavier initialization for weights and initialize biases to zeros
    • Parameters - layer_dims
    • Returns - dictionary of weights and bias
    
    def initialize_parameters_deep(layer_dims):
        tf.set_random_seed(1)
        L = len(layer_dims)
        parameters = {}
        for l in range(1,L):
            parameters['W' + str(l)] = tf.get_variable("W" + str(l), 
                                                       shape=[layer_dims[l], layer_dims[l-1]], 
                                                       dtype = tf.float32,
                                                       initializer=tf.contrib.layers.xavier_initializer())
                                       
            parameters['b' + str(l)] = tf.get_variable("b"+ str(l), 
                                                       shape = [layer_dims[l], 1],
                                                       dtype= tf.float32,
                                                       initializer= tf.zeros_initializer() )
            
        return parameters
    

    Task 10: Define a function named linear_forward_prop which returns the output of the current layer.

    Define a function named linear_forward_prop() to define forward propagation for a given layer.
    • Parameters: A_prev (output from previous layer), W (weight matrix of current layer), b (bias vector for current layer), activation (type of activation to be used for the output of the current layer)
    • Returns: A (output from the current layer)
    • Use relu activation for hidden layers; for the final output layer return the output unactivated, i.e. if activation is sigmoid, return Z as-is
    • After computing the linear output Z, apply batch normalization before feeding it to the activation function; set training = True and axis = 0
    
    def linear_forward_prop(A_prev,W,b, activation):
        ###Start code here
     
        # Compute the linear output Z
        Z =   tf.add(tf.matmul(W, A_prev), b) # Z = W*A_prev + b
    
        # Implement batch normalization on Z 
        Z = tf.layers.batch_normalization(inputs = Z, axis= 0, training=True ,
                                      gamma_initializer = tf.ones_initializer(), 
                                      beta_initializer=tf.zeros_initializer())
        
    
        # Determine activation function
        if activation == "sigmoid":
            A = Z  # Apply sigmoid activation
        elif activation == "relu":
            A = tf.nn.relu(Z)  # Apply ReLU activation
        else:
            A = Z  # No activation for other cases
    
        return A
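
    For intuition (illustrative only, not part of the graded code): with Z of shape (nodes, batch_size) and axis=0, the normalization statistics are computed per node across the mini-batch, and gamma/beta are the learned scale and shift, initialized to ones and zeros as above. A NumPy sketch of the same computation:

    # NumPy sketch of batch normalization over a mini-batch (toy values)
    import numpy as np
    Z_toy = np.random.randn(3, 5)                    # 3 nodes, 5 samples
    mu = Z_toy.mean(axis=1, keepdims=True)           # per-node mean over the batch
    var = Z_toy.var(axis=1, keepdims=True)           # per-node variance over the batch
    Z_hat = (Z_toy - mu) / np.sqrt(var + 1e-3)       # normalize (epsilon avoids division by zero)
    gamma, beta = np.ones((3, 1)), np.zeros((3, 1))  # learned scale and shift at their initial values
    Z_bn = gamma * Z_hat + beta                      # this is what gets fed to the activation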
    

    Task 11: Define forward propagation for the entire network as l_layer_forwardProp()

    Parameters: A_0(input data), parameters(dictionary of weights and bias)
    returns: A(output from final layer)

    
    def l_layer_forwardProp(A_0, parameters):
        A = A_0
        L = len(parameters)//2
        for l in range(1,L):
            A_prev = A
        
            A = linear_forward_prop(A_prev, parameters['W' + str(l)], parameters['b' + str(l)], activation='relu' )                 
            #call linear forward prop with relu activation
        A = linear_forward_prop(A, parameters['W' +str(L)], parameters['b' + str(L)], activation='sigmoid')                      
        #call linear forward prop with sigmoid activation
        
        return A
    

    Task 12: Define the cost function.

    Define a function named final_cost() to compute the cost of the network.
    • First define the original cost using TensorFlow's sigmoid_cross_entropy function
    • If regularization == True, add a regularization term to the original cost function
    • Parameters:
      • Z_final: output from the final layer
      • Y: actual output
      • regularization: boolean
      • lambd: regularization parameter
      • parameters: dictionary of weights and biases
    
    def final_cost(Z_final, Y , parameters, regularization = False, lambd = 0):
        cost = tf.nn.sigmoid_cross_entropy_with_logits(logits=Z_final,labels=Y)
        if regularization:
            reg_term = 0
            L = len(parameters)//2
            for l in range(1,L+1):
                ###Start code
                reg_term += tf.reduce_sum(tf.square(parameters['W' + str(l)]))       #add L2 loss term
                ###End code
            cost = cost + (lambd/2) * reg_term
        return tf.reduce_mean(cost)
    

    Task 13: Define the function to generate mini-batches. Important: Use np.random.permutation to generate random indices.

    
    import numpy as np
    def random_samples_minibatch(X, Y, batch_size, seed = 1):
        np.random.seed(seed)
        ###Start code
        m = X.shape[1]  # Number of samples
        num_batches = m // batch_size  # Number of complete batches
        ###End code
        
        indices = np.random.permutation(m)  # Generate random indices using np.random.permutation
        shuffle_X = X[:,indices]
        shuffle_Y = Y[:,indices]
        mini_batches = []
        
        #generate minibatch
        for i in range(num_batches):
            X_batch = shuffle_X[:, i * batch_size:(i + 1) * batch_size]
            Y_batch = shuffle_Y[:, i * batch_size:(i + 1) * batch_size]
            
            assert X_batch.shape == (X.shape[0], batch_size)
            assert Y_batch.shape == (Y.shape[0], batch_size)
            
            mini_batches.append((X_batch, Y_batch))
        
        #generate batch with remaining number of samples
        if m % batch_size != 0:
            X_batch = shuffle_X[:, num_batches * batch_size:]
            Y_batch = shuffle_Y[:, num_batches * batch_size:]
            mini_batches.append((X_batch, Y_batch))
        return mini_batches
    

    Task 14: Define the model to train the network using minibatch.

    Instruction
    • Parameters:
      • X_train, Y_train: input and target data
      • layer_dims: network configuration
      • learning_rate
      • num_iter: number of epochs
      • mini_batch_size: number of samples to be considered in each minibatch
    • return: dictionary of trained parameters
    
    import tensorflow as tf
    import numpy as np
    import matplotlib.pyplot as plt
    res=  []
    def model_with_minibatch(X_train, Y_train, layer_dims, learning_rate, num_iter, mini_batch_size):
        tf.reset_default_graph()  # Reset the graph
        num_features, num_samples = X_train.shape
    
        # Initialize placeholders
        A_0 = tf.placeholder(tf.float32, shape=(num_features, None), name='A_0')  # Input placeholder
        Y = tf.placeholder(tf.float32, shape=(1, None), name='Y')  # Output placeholder
    
        # Initialize parameters
        parameters = initialize_parameters_deep(layer_dims)
    
        # Call the function for forward propagation
        Z_final = l_layer_forwardProp(A_0, parameters)
        res.append(Z_final)
    
        # Compute cost with regularization
    #     cost = final_cost(Z_final, Y, parameters, lambd=0.1)
        cost = final_cost(Z_final, Y, parameters, regularization = True)
        print(type(cost), cost)
        # Use Adam optimization to train the network
        train_net = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
    
        seed = 1
        num_minibatches = int(num_samples / mini_batch_size)  # Number of mini-batches
        init = tf.global_variables_initializer()
        costs = []
    
        with tf.Session() as sess:
            sess.run(init)
            for epoch in range(num_iter):
                epoch_cost = 0
    
                # Create mini-batches
                mini_batches = random_samples_minibatch(X_train, Y_train, mini_batch_size, seed)
    
                # Increment seed for randomness
                seed += 1
    
                # Perform gradient descent for each mini-batch
                for mini_batch in mini_batches:
                    X_batch, Y_batch = mini_batch  # Assign mini-batch
                    _, mini_batch_cost = sess.run([train_net, cost], feed_dict={A_0: X_batch, Y: Y_batch})
                    
                    epoch_cost += mini_batch_cost / num_minibatches
    
                # Store costs for plotting
                if epoch % 2 == 0:
                    costs.append(epoch_cost)
    
                # Print cost every 100 epochs
                if epoch % 100 == 0:
                    print("Cost after epoch {}: {}".format(epoch,epoch_cost))
    
            # Plot the cost
            plt.ylim(0, 2)
            plt.xlabel("Epochs (every 2)")
            plt.ylabel("Cost")
            plt.plot(costs)
            plt.show()
    
            params = sess.run(parameters)  # Get trained parameters
    
        return params
    
    

    Task 15: Train the model using the above defined function.

    Instructions:
    • Use X_data and y_data as training input, learning rate = 0.001, num_iter = 1000
    • minibatch size = 256
    • Return the trained parameters to variable parameters
    
    ###Start code
    # Define the layer dimensions
    layer_dims = [2, 20, 20, 20, 20, 1]  # Example configuration
    
    # Start code
    parameters = model_with_minibatch(X_data, y_data, layer_dims, learning_rate=0.001, num_iter=1000, mini_batch_size=256)
    ###End code
    
    # Output:
    '''
    <class 'tensorflow.python.framework.ops.Tensor'> Tensor("Mean:0", shape=(), dtype=float32)
    Cost after epoch 0: 1.0600778063138327
    Cost after epoch 100: 0.3384199837843577
    Cost after epoch 200: 0.22555001576741537
    Cost after epoch 300: 0.17129839956760406
    Cost after epoch 400: 0.13694358120361963
    Cost after epoch 500: 0.10687907536824545
    Cost after epoch 600: 0.08683766548832259
    Cost after epoch 700: 0.06888286024332047
    Cost after epoch 800: 0.05539845675230026
    Cost after epoch 900: 0.0474573497970899
    '''
    

    Task 16: Run the below cells to save your answers.

    
    batchnorm.save_func1(placeholders)
    batchnorm.save_func2(initialize_parameters_deep)
    batchnorm.save_func3(linear_forward_prop)
    batchnorm.save_func4(l_layer_forwardProp)
    batchnorm.save_func5(final_cost)
    batchnorm.save_func6(random_samples_minibatch)
    
    cost = 0.05539845675230026
    batchnorm.save_ans7(np.float64(cost)) 
    

    About the author

    D Shwari
    I'm a professor in the Department of Computer Science at National University. My main areas are data science and data analysis, along with project management for many computer science-related sectors. My next project is on AI with deep learning.
