Ask Ghassem - Recent questions tagged ml-final

How to update the weights in the backpropagation algorithm when the activation function is not linear?

Mon, 10 Aug 2020 21:55:19 +0000

The goal of backpropagation is to optimize the weights so that the neural network can learn how to correctly map arbitrary inputs to outputs.

Assume that, for the following neural network, the inputs are [$i_1,i_2$] = [0.05, 0.10], the desired outputs are [$o_1$,$o_2$] = [0.01, 0.99], and the learning rate is $\alpha=0.5$.
In addition, the activation function for the hidden layer (both $h_1$ and $h_2$) is the sigmoid (logistic) function:

$S(x)=\frac{1}{1+e^{-x}}$

https://i.imgur.com/cnY5feu.png

Hint:
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$

$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$

a) Show a step-by-step solution for calculating the weights $w_1$ to $w_8$ after one update, and fill in the table below.
b) Calculate the initial error and the error after one update (assume the biases $[b_1,b_2]$ do not change during the updates).

Updating weights in backpropagation algorithm
Weights	Initialization	New weights after one step
$w_1$	0.15	?
$w_2$	0.20	?
$w_3$	0.25	?
$w_4$	0.30	?
$w_5$	0.40	?
$w_6$	0.45	?
$w_7$	0.50	?
$w_8$	0.55	?
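
A minimal sketch of one update step for this question, applying the hint formula with the sigmoid derivative $S'(x)=S(x)(1-S(x))$. Several things here are assumptions, because they appear only in the figure: the wiring ($w_1,w_2$ into $h_1$; $w_3,w_4$ into $h_2$; $w_5,w_6$ into $o_1$; $w_7,w_8$ into $o_2$), the bias values (placeholders of 0.0 below), and an identity activation on the output neurons. Adjust them to match the figure before trusting any numbers.

```python
import math

# Values stated in the question.
INPUTS = (0.05, 0.10)
TARGETS = (0.01, 0.99)
ALPHA = 0.5
WEIGHTS = [0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55]  # w1..w8

# ASSUMPTION: the bias values appear only in the figure, so these are
# placeholders. Replace them with the values shown there.
B1, B2 = 0.0, 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w, b1=B1, b2=B2):
    """ASSUMED wiring: w1,w2 -> h1; w3,w4 -> h2; w5,w6 -> o1; w7,w8 -> o2."""
    i1, i2 = INPUTS
    h1 = sigmoid(w[0] * i1 + w[1] * i2 + b1)
    h2 = sigmoid(w[2] * i1 + w[3] * i2 + b1)
    # ASSUMPTION: identity activation on the output neurons.
    o1 = w[4] * h1 + w[5] * h2 + b2
    o2 = w[6] * h1 + w[7] * h2 + b2
    return h1, h2, o1, o2


def total_error(w):
    """E_total = sum of 1/2 * (target - output)^2 over both outputs."""
    _, _, o1, o2 = forward(w)
    return 0.5 * (TARGETS[0] - o1) ** 2 + 0.5 * (TARGETS[1] - o2) ** 2


def gradients(w):
    """Chain rule: dE/dw for each of w1..w8 (biases are held fixed)."""
    i1, i2 = INPUTS
    h1, h2, o1, o2 = forward(w)
    d_o1 = o1 - TARGETS[0]  # dE/do1
    d_o2 = o2 - TARGETS[1]  # dE/do2
    # Error reaching each hidden neuron, times sigmoid'(x) = S(x) * (1 - S(x)).
    d_h1 = (d_o1 * w[4] + d_o2 * w[6]) * h1 * (1.0 - h1)
    d_h2 = (d_o1 * w[5] + d_o2 * w[7]) * h2 * (1.0 - h2)
    return [d_h1 * i1, d_h1 * i2, d_h2 * i1, d_h2 * i2,
            d_o1 * h1, d_o1 * h2, d_o2 * h1, d_o2 * h2]


def update(w, alpha=ALPHA):
    """w_new = w_old - alpha * dE/dw, applied to all weights at once."""
    return [wi - alpha * gi for wi, gi in zip(w, gradients(w))]


new_weights = update(WEIGHTS)
print("new weights: ", new_weights)
print("error before:", total_error(WEIGHTS))
print("error after: ", total_error(new_weights))
```

All gradients are computed from the old weights before any weight is changed, which is what "one update" means in part a.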

How to calculate convolutions on a CONV layer for a Convolutional Neural Network?

Wed, 26 Jun 2019 08:54:12 +0000

Assume we have a $5\times5$ px RGB image with 3 channels, one each for R, G, and B, with the following values:

R
2	0	0	0	0
1	2	0	0	1
2	0	1	0	2
1	2	1	0	1
0	1	0	2	0

G
0	2	1	2	2
1	1	1	0	0
0	0	2	2	0
2	0	0	2	0
0	2	1	1	1

B
0	1	0	0	1
1	1	2	0	1
1	0	2	0	2
1	0	1	1	0
1	2	1	1	2

We have one $3\times3$ px kernel (filter) with 3 channels as follows:

Filter - R
0	0	1
1	0	1
1	0	0

Filter - G
0	0	-1
1	0	0
1	-1	0

Filter - B
1	0	1
0	1	-1
1	-1	0

a) If Stride = 2, Zero-padding = 1, and Bias = 1, what is the result of the convolution?

b) What is the result after applying a ReLU layer ($\max(z,0)$) to the result of part a? (The output has the same size as the result of part a.)

c) Calculate the output of applying a max-pooling layer of size $2\times2$ with Stride = 1 to the output of part b. (Hint: max-pooling layers, here and in general, do not use zero-padding.)

d) What is the result of flattening the output of part c into a vector?

e) Assume the vector you created contains $m$ elements. Use it as the input vector for a Softmax Regression classifier (fully connected, with no hidden layers and no biases). Assume there are 2 classes, 0 and 1. The optimized weights leaving each element of the feature vector are 1 for odd-positioned elements and 2 for even-positioned elements. For example, if the feature vector is [10,11,12,13,14], all the weights from 10 are 1 (because 10 is element 1, and 1 is odd), all the weights from 11 are 2, all the weights from 12 are 1, all the weights from 13 are 2, all the weights from 14 are 1, and so on. Draw the Softmax Regression network and determine whether the predicted class should be 0 or 1.

Hint:
Softmax Regression: $p_{i}=\frac{e^{z_{i}}}{\sum_{j=1}^{c} e^{z_{j}}}$
where $p_{i}$ is the probability of class $i$ and $c$ is the number of classes.
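
Parts a to d can be checked with a short pure-Python sketch. It assumes the usual CNN convention that "convolution" means sliding the filter without flipping it (cross-correlation), summing over the three channels, and adding the bias once per output cell. The function names are illustrative, not taken from any library.

```python
# Input channels and filter channels copied from the question.
IMAGE = {
    "R": [[2, 0, 0, 0, 0], [1, 2, 0, 0, 1], [2, 0, 1, 0, 2],
          [1, 2, 1, 0, 1], [0, 1, 0, 2, 0]],
    "G": [[0, 2, 1, 2, 2], [1, 1, 1, 0, 0], [0, 0, 2, 2, 0],
          [2, 0, 0, 2, 0], [0, 2, 1, 1, 1]],
    "B": [[0, 1, 0, 0, 1], [1, 1, 2, 0, 1], [1, 0, 2, 0, 2],
          [1, 0, 1, 1, 0], [1, 2, 1, 1, 2]],
}
KERNEL = {
    "R": [[0, 0, 1], [1, 0, 1], [1, 0, 0]],
    "G": [[0, 0, -1], [1, 0, 0], [1, -1, 0]],
    "B": [[1, 0, 1], [0, 1, -1], [1, -1, 0]],
}


def zero_pad(channel, pad):
    """Surround a 2-D list with `pad` rows/columns of zeros."""
    width = len(channel[0]) + 2 * pad
    top = [[0] * width for _ in range(pad)]
    middle = [[0] * pad + list(row) + [0] * pad for row in channel]
    bottom = [[0] * width for _ in range(pad)]
    return top + middle + bottom


def conv2d(image, kernel, stride, pad, bias):
    """ASSUMPTION: cross-correlation (no kernel flip), square inputs."""
    padded = {c: zero_pad(image[c], pad) for c in image}
    k = len(kernel["R"])
    out_size = (len(padded["R"]) - k) // stride + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = bias  # bias is added once per output cell
            for c in kernel:
                for di in range(k):
                    for dj in range(k):
                        total += (kernel[c][di][dj]
                                  * padded[c][i * stride + di][j * stride + dj])
            row.append(total)
        out.append(row)
    return out


def relu(feature_map):
    return [[max(v, 0) for v in row] for row in feature_map]


def max_pool(feature_map, size, stride):
    """Max-pooling without zero-padding, as the hint in part c states."""
    n = (len(feature_map) - size) // stride + 1
    return [[max(feature_map[i * stride + di][j * stride + dj]
                 for di in range(size) for dj in range(size))
             for j in range(n)]
            for i in range(n)]


def flatten(feature_map):
    return [v for row in feature_map for v in row]


conv = conv2d(IMAGE, KERNEL, stride=2, pad=1, bias=1)  # part a
activated = relu(conv)                                 # part b
pooled = max_pool(activated, size=2, stride=1)         # part c
vector = flatten(pooled)                               # part d
print(conv, activated, pooled, vector, sep="\n")
```

With a $5\times5$ input, padding 1, a $3\times3$ filter, and stride 2, the output side length is $(5 + 2\cdot1 - 3)/2 + 1 = 3$, so part a yields a $3\times3$ map and part c a $2\times2$ map.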

How to update weights in backpropagation algorithm (a numerical example)?

Thu, 11 Apr 2019 17:02:04 +0000

Assume we have the following neural network and all activation functions are $f(z)=z$. If the weights are initialized with the values shown in the table below, what will the updated weights be after one step if the learning rate is $\alpha = 0.05$?

Assume the input values are [$i_1$,$i_2$] = [2,3] and the target value is $out = 1$.

Hint:
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$

$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$

Updating weights in backpropagation algorithm
Weights	Initialization	New weights after one step
$w1$	0.11	?
$w2$	0.21	?
$w3$	0.12	?
$w4$	0.08	?
$w5$	0.14	?
$w6$	0.15	?

https://i.imgur.com/v0RMeOQ.png
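
A minimal sketch of the single update step for this linear network. The wiring is an assumption, since it is visible only in the figure: $w_1,w_2$ feed $h_1$, $w_3,w_4$ feed $h_2$, $w_5,w_6$ feed the output, and there are no biases. With $f(z)=z$ every activation derivative is 1, so each gradient is just the output error times the values along the path to that weight.

```python
# Values stated in the question.
INPUTS = (2.0, 3.0)
TARGET = 1.0
ALPHA = 0.05
WEIGHTS = [0.11, 0.21, 0.12, 0.08, 0.14, 0.15]  # w1..w6


def forward(w):
    """ASSUMED wiring: w1,w2 -> h1; w3,w4 -> h2; w5,w6 -> out; no biases."""
    i1, i2 = INPUTS
    h1 = w[0] * i1 + w[1] * i2
    h2 = w[2] * i1 + w[3] * i2
    out = w[4] * h1 + w[5] * h2
    return h1, h2, out


def error(w):
    """E = 1/2 * (target - output)^2 for the single output."""
    _, _, out = forward(w)
    return 0.5 * (TARGET - out) ** 2


def update(w, alpha=ALPHA):
    """w_new = w_old - alpha * dE/dw, using the old weights throughout."""
    i1, i2 = INPUTS
    h1, h2, out = forward(w)
    delta = out - TARGET  # dE/d(out)
    grads = [
        delta * w[4] * i1,  # dE/dw1
        delta * w[4] * i2,  # dE/dw2
        delta * w[5] * i1,  # dE/dw3
        delta * w[5] * i2,  # dE/dw4
        delta * h1,         # dE/dw5
        delta * h2,         # dE/dw6
    ]
    return [wi - alpha * gi for wi, gi in zip(w, grads)]


new_weights = update(WEIGHTS)
print("output before:", forward(WEIGHTS)[2])
print("new weights:  ", new_weights)
print("error before/after:", error(WEIGHTS), error(new_weights))
```

Under these assumptions the forward pass gives $h_1 = 0.85$, $h_2 = 0.48$, and an output of $0.191$ before the update.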

How to calculate Softmax Regression probabilities in this example?

Thu, 04 Apr 2019 18:20:53 +0000

The scatter plot of the Iris dataset is shown in the figure below. Assume Softmax Regression is used to classify irises as Setosa, Versicolor, or Virginica using just petal length and petal width. If the weights required for Softmax Regression are initialized to 1 for class Setosa, 2 for class Versicolor, and 3 for class Virginica,

1) What is the probability that an iris with petal length = 4.6 and petal width = 1.7 is classified as Virginica?

2) What is the probability of Virginica if we use all four features (petal length = 4.6, petal width = 1.7, sepal length = 5.5, and sepal width = 3.0) with the same weight initialization?
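
One way to set up both parts is sketched below. It assumes there are no bias terms and that every feature weight of a class equals that class's initial value (1, 2, or 3), so the score of a class is its weight times the sum of the features. The helper names are illustrative.

```python
import math

# Initial weight per class, as stated in the question.
CLASS_WEIGHT = {"Setosa": 1.0, "Versicolor": 2.0, "Virginica": 3.0}


def softmax(scores):
    """p_i = exp(z_i) / sum_j exp(z_j), shifted by max(z) for stability."""
    shift = max(scores.values())
    exps = {name: math.exp(z - shift) for name, z in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def class_probabilities(features):
    # ASSUMPTION: no bias, and all feature weights of a class share its value,
    # so z_class = weight * (sum of features).
    scores = {name: w * sum(features) for name, w in CLASS_WEIGHT.items()}
    return softmax(scores)


petal_only = class_probabilities([4.6, 1.7])              # part 1
all_features = class_probabilities([4.6, 1.7, 5.5, 3.0])  # part 2
print(petal_only["Virginica"], all_features["Virginica"])
```

Subtracting the largest score before exponentiating does not change the probabilities; it only avoids overflow for large scores.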

How to calculate feed-forward (forward-propagation) in neural network?

Thu, 04 Apr 2019 15:54:17 +0000

In the figure below, a neural network is shown. Calculate the following:

1) How many neurons do we have in the input layer and the output layer?

2) How many hidden layers do we have?

3) If all the weights are initialized to 1 ($w1=w2=w3=...=w19=1$), what is the output of this network after feed-forward for the sample shown in the figure (X = (x1,x2,x3) = (2,5,3) and y=10)? What is the error of the network ($\text { Error }=\frac{1}{2}(\hat{y}-y)^{2}$)? Assume the activation function for all neurons except the output neuron is $f(z)=z$.

4) If we change the activation function of all the neurons in the second hidden layer to Sigmoid ($S(x)=\frac{1}{1+e^{-x}}=\frac{e^{x}}{e^{x}+1}$), what would be the output of the network after this change? Calculate the error as well.

https://i.imgur.com/rtqPiRa.jpg

What is the difference between a batch and an epoch in a Neural Network?

Tue, 30 Oct 2018 14:45:56 +0000

Both the batch size and the number of epochs are integer values, and they seem to do the same thing in stochastic gradient descent. What do these two hyperparameters of the learning algorithm mean?
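
The two can be told apart by where they sit in the training loop: the batch size is the number of samples processed before one parameter update, and an epoch is one full pass over the training set. The sketch below only counts updates and does no actual training; the function names are hypothetical.

```python
def iterate_minibatches(n_samples, batch_size):
    """Yield index ranges that cover the dataset once (one epoch)."""
    for start in range(0, n_samples, batch_size):
        yield range(start, min(start + batch_size, n_samples))


def count_updates(n_samples, batch_size, n_epochs):
    """Count parameter updates: one per batch, repeated for every epoch."""
    updates = 0
    for _epoch in range(n_epochs):           # outer loop: passes over the data
        for _batch in iterate_minibatches(n_samples, batch_size):
            updates += 1                     # a real trainer would step here
    return updates


# 200 samples, batch size 5 -> 40 updates per epoch.
print(count_updates(n_samples=200, batch_size=5, n_epochs=1000))
```

If the dataset size is not a multiple of the batch size, the last batch of each epoch is simply smaller in this sketch.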