<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Ask Ghassem - Recent questions tagged ml-final</title>
<link>https://ask.ghassem.com/tag/ml-final</link>
<description>Powered by Question2Answer</description>
<item>
<title>How to update the weights in the backpropagation algorithm when the activation function is not linear?</title>
<link>https://ask.ghassem.com/901/update-weights-backpropagation-algorithm-activation-function</link>
<description>&lt;p&gt;The goal of backpropagation is to optimize the weights so that the neural network can learn how to correctly map arbitrary inputs to outputs.&lt;/p&gt;

&lt;p&gt;Assume that, for the following neural network, the inputs are [$i_1,i_2$] = [0.05,&amp;nbsp;0.10], the target outputs are [$o_1$,$o_2$] = [0.01,&amp;nbsp;0.99], and the learning rate is $\alpha=0.5$.&lt;br&gt;
In addition, the activation function for the hidden layer (both $h_1$ and $h_2$) is the sigmoid (logistic) function:&lt;/p&gt;

&lt;p&gt;$S(x)=\frac{1}{1+e^{-x}}$&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/cnY5feu.png&quot;&gt;https://i.imgur.com/cnY5feu.png&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hint:&lt;/strong&gt;&lt;br&gt;
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$&lt;/p&gt;

&lt;p&gt;$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$&lt;/p&gt;
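
&lt;p&gt;A minimal Python sketch of the two hint formulas, together with the sigmoid defined above. The function names and the numbers in the last lines are illustrative assumptions, not part of the original question. Because the network's exact wiring is only given in the linked figure, the gradient helper covers just one hypothetical case: a weight feeding a sigmoid neuron whose output enters the error directly.&lt;/p&gt;

```python
import math


def sigmoid(x):
    """Logistic function S(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def total_error(targets, outputs):
    """E_total = sum of 1/2 * (target - output)^2 over the output neurons."""
    return sum(0.5 * (t - o) ** 2 for t, o in zip(targets, outputs))


def gradient_step(w_old, grad, alpha):
    """w_new = w_old - alpha * dE/dw."""
    return w_old - alpha * grad


def sigmoid_neuron_weight_gradient(target, out, neuron_input):
    """Chain rule for a weight feeding a sigmoid neuron whose output enters E directly.

    dE/dw = dE/d(out) * d(out)/d(net) * d(net)/dw
          = (out - target) * out * (1 - out) * neuron_input
    """
    return (out - target) * out * (1.0 - out) * neuron_input


# Hypothetical numbers, chosen only to exercise the functions.
out = sigmoid(0.3)
grad = sigmoid_neuron_weight_gradient(target=0.01, out=out, neuron_input=0.6)
print(gradient_step(w_old=0.40, grad=grad, alpha=0.5))
```

&lt;p&gt;The factor out * (1 - out) is the derivative of the sigmoid; it is the only term that differs from the case where the activation is $f(z)=z$.&lt;/p&gt;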

&lt;p&gt;&lt;strong&gt;a) &lt;/strong&gt;Show a step-by-step solution for calculating the weights $w_1$ to $w_8$ after one update, and fill in the table below.&lt;br&gt;
&lt;strong&gt;b) &lt;/strong&gt;Calculate the initial error and the error after one update (assume the biases $[b_1,b_2]$ do not change during the updates).&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot;&gt;
&lt;caption&gt;Updating weights in backpropagation algorithm&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Weights&lt;/td&gt;
&lt;td&gt;Initialization&lt;/td&gt;
&lt;td&gt;New weights after one step&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w1$&lt;/td&gt;
&lt;td&gt;0.15&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w2$&lt;/td&gt;
&lt;td&gt;0.20&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w3$&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w4$&lt;/td&gt;
&lt;td&gt;0.30&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w5$&lt;/td&gt;
&lt;td&gt;0.40&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w6$&lt;/td&gt;
&lt;td&gt;0.45&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w7$&lt;/td&gt;
&lt;td&gt;0.50&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w8$&lt;/td&gt;
&lt;td&gt;0.55&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/901/update-weights-backpropagation-algorithm-activation-function</guid>
<pubDate>Mon, 10 Aug 2020 21:55:19 +0000</pubDate>
</item>
<item>
<title>How to calculate convolutions on a CONV layer for a Convolutional Neural Network?</title>
<link>https://ask.ghassem.com/650/calculate-convolutions-layer-convolutional-neural-network</link>
<description>&lt;p&gt;Assume we have a $5\times5$ px RGB image whose 3 channels (R, G, and B) are given below:&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;R&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;G&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;B&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;We have one&amp;nbsp;$3\times3$ px kernel (filter) with 3 channels as follows:&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;Filter - R&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;Filter - G&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;0&quot; style=&quot;height:100px; width:100px&quot;&gt;
&lt;caption&gt;Filter - B&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;a)&lt;/strong&gt; If &lt;strong&gt;Stride = 2&lt;/strong&gt;, &lt;strong&gt;Zero-padding = 1&lt;/strong&gt;, and &lt;strong&gt;Bias = 1&lt;/strong&gt;, what is the result of the convolution?&lt;/p&gt;
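
&lt;p&gt;A pure-Python sketch of the computation in part a, using the image and filter tables above. It assumes the usual CNN convention that "convolution" means sliding the filter without flipping it (cross-correlation), summing over the three channels, and adding the bias once per output cell; the function names pad and conv2d are illustrative. With a 5x5 input, a 3x3 filter, padding 1, and stride 2, the output size is (5 - 3 + 2*1) / 2 + 1 = 3 per side.&lt;/p&gt;

```python
IMAGE = [
    [[2, 0, 0, 0, 0], [1, 2, 0, 0, 1], [2, 0, 1, 0, 2], [1, 2, 1, 0, 1], [0, 1, 0, 2, 0]],  # R
    [[0, 2, 1, 2, 2], [1, 1, 1, 0, 0], [0, 0, 2, 2, 0], [2, 0, 0, 2, 0], [0, 2, 1, 1, 1]],  # G
    [[0, 1, 0, 0, 1], [1, 1, 2, 0, 1], [1, 0, 2, 0, 2], [1, 0, 1, 1, 0], [1, 2, 1, 1, 2]],  # B
]
KERNEL = [
    [[0, 0, 1], [1, 0, 1], [1, 0, 0]],     # Filter - R
    [[0, 0, -1], [1, 0, 0], [1, -1, 0]],   # Filter - G
    [[1, 0, 1], [0, 1, -1], [1, -1, 0]],   # Filter - B
]


def pad(channel, p):
    """Surround a 2D list with p rows and p columns of zeros on every side."""
    width = len(channel[0]) + 2 * p
    top = [[0] * width for _ in range(p)]
    bottom = [[0] * width for _ in range(p)]
    middle = [[0] * p + list(row) + [0] * p for row in channel]
    return top + middle + bottom


def conv2d(image, kernel, stride, padding, bias):
    """Multi-channel 2D convolution (no kernel flip) over a square image."""
    padded = [pad(channel, padding) for channel in image]
    k = len(kernel[0])
    n = (len(padded[0]) - k) // stride + 1  # output size per side
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            total = bias
            for c in range(len(image)):
                for a in range(k):
                    for b in range(k):
                        total += padded[c][i * stride + a][j * stride + b] * kernel[c][a][b]
            row.append(total)
        out.append(row)
    return out


feature_map = conv2d(IMAGE, KERNEL, stride=2, padding=1, bias=1)
for row in feature_map:
    print(row)
```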

&lt;p&gt;&lt;strong&gt;b)&lt;/strong&gt; What is the result of applying a &lt;strong&gt;ReLU layer ($\max(z,0)$)&lt;/strong&gt; to the result of part a? (The output has the same size as the result of part a.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;c)&lt;/strong&gt; Calculate the output of applying a &lt;strong&gt;max-pooling&lt;/strong&gt; layer of size $2\times2$ with &lt;strong&gt;Stride = 1&lt;/strong&gt; to the output of part b. (Hint: max-pooling layers, here and in general, usually do not use zero-padding.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;d)&lt;/strong&gt; What is the result after applying &lt;strong&gt;flatten&lt;/strong&gt; on the output of part c and creating a vector?&lt;/p&gt;
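
&lt;p&gt;Parts b to d chain three simple operations, which can be sketched as follows. The 3x3 input used here is a made-up feature map, not the answer to part a, and the function names are illustrative.&lt;/p&gt;

```python
def relu(fmap):
    """Element-wise max(z, 0); the output has the same shape as the input."""
    return [[max(z, 0) for z in row] for row in fmap]


def max_pool(fmap, size, stride):
    """Max-pooling without zero-padding over a 2D list."""
    n_rows = (len(fmap) - size) // stride + 1
    n_cols = (len(fmap[0]) - size) // stride + 1
    pooled = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            window = [
                fmap[i * stride + a][j * stride + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


def flatten(fmap):
    """Row-major flattening of a 2D list into a vector."""
    return [z for row in fmap for z in row]


# Hypothetical 3x3 feature map, used only to show the shapes involved.
example = [[-1, 2, 0], [3, -4, 1], [0, 5, -2]]
activated = relu(example)
pooled = max_pool(activated, size=2, stride=1)
vector = flatten(pooled)
print(activated, pooled, vector)
```

&lt;p&gt;A 3x3 map pooled with a 2x2 window and stride 1 gives a 2x2 map, so the flattened vector has 4 elements.&lt;/p&gt;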

&lt;p&gt;&lt;strong&gt;e)&lt;/strong&gt; Assume the vector you created contains m elements. Use it as the input vector for a fully connected &lt;strong&gt;Softmax Regression classifier&lt;/strong&gt; with no hidden layers and no biases. Assume there are 2 classes, 0 and 1. The optimized weights leaving each element of the feature vector are 1 for odd-numbered elements and 2 for even-numbered elements. For example, if the feature vector is [10,11,12,13,14], all the weights &lt;strong&gt;from&lt;/strong&gt; 10 are 1 (because 10 is element 1, and 1 is odd), all the weights &lt;strong&gt;from&lt;/strong&gt; 11 are 2, all the weights &lt;strong&gt;from&lt;/strong&gt; 12 are 1, all the weights &lt;strong&gt;from&lt;/strong&gt; 13 are 2, all the weights &lt;strong&gt;from&lt;/strong&gt; 14 are 1, and so on. Draw the Softmax Regression network and determine whether the predicted class is 0 or 1.&lt;/p&gt;

&lt;p&gt;Hint:&lt;br&gt;
&lt;strong&gt;Softmax Regression:&lt;/strong&gt; $p_{i}=\frac{e^{z_{i}}}{\sum_{j=1}^{c} e^{z_{j}}}$&lt;br&gt;
where $p_{i}$ is the probability of class $i$ and $c$ is the number of classes.&lt;/p&gt;</description>
<category>Deep Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/650/calculate-convolutions-layer-convolutional-neural-network</guid>
<pubDate>Wed, 26 Jun 2019 08:54:12 +0000</pubDate>
</item>
<item>
<title>How to update weights in backpropagation algorithm (a numerical example)?</title>
<link>https://ask.ghassem.com/612/update-weights-backpropagation-algorithm-numerical-example</link>
<description>&lt;p&gt;Assume we have the following neural network and all activation functions are $f(z)=z$. If the weights are initialized with the values shown in the table below, what are the updated weights after one step if the learning rate is $\alpha = 0.05$?&lt;/p&gt;

&lt;p&gt;Assume the input values are [$i_1$,$i_2$] = [2,3] and target value&amp;nbsp;$out = 1$.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hint:&lt;/strong&gt;&lt;br&gt;
$w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$&lt;/p&gt;

&lt;p&gt;$E_{\text {total}}=\sum \frac{1}{2}(\text {target}-\text {output})^{2}$&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot; style=&quot;height:225px; width:394px&quot;&gt;
&lt;caption&gt;Updating weights in backpropagation algorithm&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Weights&lt;/td&gt;
&lt;td&gt;Initialization&lt;/td&gt;
&lt;td&gt;New weights after one step&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w1$&lt;/td&gt;
&lt;td&gt;0.11&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w2$&lt;/td&gt;
&lt;td&gt;0.21&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w3$&lt;/td&gt;
&lt;td&gt;0.12&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w4$&lt;/td&gt;
&lt;td&gt;0.08&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w5$&lt;/td&gt;
&lt;td&gt;0.14&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$w6$&lt;/td&gt;
&lt;td&gt;0.15&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
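
&lt;p&gt;A sketch of one gradient-descent step for this setup. The wiring is an assumption, since the text does not spell it out and the network is only shown in the linked figure: two hidden neurons with $h_1 = i_1 w_1 + i_2 w_2$ and $h_2 = i_1 w_3 + i_2 w_4$, one output $out = h_1 w_5 + h_2 w_6$, and no biases. If the figure wires the weights differently, the gradient lines need to change accordingly. The function name one_step is illustrative.&lt;/p&gt;

```python
def one_step(inputs, target, weights, alpha):
    """One backpropagation update for an assumed 2-2-1 network with f(z) = z."""
    i1, i2 = inputs
    w1, w2, w3, w4, w5, w6 = weights

    # Forward pass (identity activations, assumed wiring, no biases).
    h1 = i1 * w1 + i2 * w2
    h2 = i1 * w3 + i2 * w4
    out = h1 * w5 + h2 * w6

    # For E = 1/2 * (target - out)^2, dE/d(out) = out - target.
    delta = out - target

    # Chain rule; the gradients use the weights from before the update.
    grads = [
        delta * w5 * i1,  # dE/dw1
        delta * w5 * i2,  # dE/dw2
        delta * w6 * i1,  # dE/dw3
        delta * w6 * i2,  # dE/dw4
        delta * h1,       # dE/dw5
        delta * h2,       # dE/dw6
    ]
    new_weights = [w - alpha * g for w, g in zip(weights, grads)]
    return out, new_weights


out, new_weights = one_step(
    inputs=(2, 3),
    target=1,
    weights=[0.11, 0.21, 0.12, 0.08, 0.14, 0.15],
    alpha=0.05,
)
print(out, new_weights)
```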

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/v0RMeOQ.png&quot;&gt;https://i.imgur.com/v0RMeOQ.png&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/612/update-weights-backpropagation-algorithm-numerical-example</guid>
<pubDate>Thu, 11 Apr 2019 17:02:04 +0000</pubDate>
</item>
<item>
<title>How to calculate Softmax Regression probabilities in this example?</title>
<link>https://ask.ghassem.com/605/calculate-softmax-regression-probabilities-this-example</link>
<description>&lt;p&gt;The scatter plot of the Iris dataset is shown in the figure below. Assume &lt;strong&gt;Softmax Regression&lt;/strong&gt; is used to classify an iris as Setosa, Versicolor, or Virginica using just petal length and petal width. If the weights for Softmax Regression are initialized to 1 for class Setosa, 2 for class Versicolor, and 3 for class Virginica:&lt;/p&gt;

&lt;p&gt;1) What is the probability that an iris with petal length = 4.6 and petal width = 1.7 is classified as Virginica?&lt;/p&gt;

&lt;p&gt;2) What is the probability of Virginica if we use all four features (petal length = 4.6, petal width = 1.7, sepal length = 5.5, and sepal width = 3.0) with the same weight initialization?&lt;/p&gt;
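
&lt;p&gt;A sketch of how these probabilities can be computed. It rests on one reading of the question that is an assumption: every feature weight of a class equals that class's initial value (1, 2, or 3) and there is no bias, so the score of a class is its weight times the sum of the features. The function names are illustrative.&lt;/p&gt;

```python
import math


def softmax(scores):
    """p_i = e^(z_i) / sum_j e^(z_j), shifted by max(z) for numerical stability."""
    shift = max(scores)
    exps = [math.exp(z - shift) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(features, class_weights):
    """Assumes each class applies one shared weight to every feature and no bias."""
    scores = [w * sum(features) for w in class_weights]
    return softmax(scores)


CLASS_WEIGHTS = [1, 2, 3]  # Setosa, Versicolor, Virginica

petal_only = class_probabilities([4.6, 1.7], CLASS_WEIGHTS)
all_features = class_probabilities([4.6, 1.7, 5.5, 3.0], CLASS_WEIGHTS)
print("P(Virginica), petal features:", petal_only[2])
print("P(Virginica), all features:", all_features[2])
```

&lt;p&gt;Subtracting the largest score before exponentiating does not change the probabilities; it only avoids overflow for large scores.&lt;/p&gt;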

&lt;p&gt;&lt;img alt=&quot;&quot; src=&quot;https://i.imgur.com/CezSTPM.png&quot;&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/605/calculate-softmax-regression-probabilities-this-example</guid>
<pubDate>Thu, 04 Apr 2019 18:20:53 +0000</pubDate>
</item>
<item>
<title>How to calculate feed-forward (forward-propagation) in neural network?</title>
<link>https://ask.ghassem.com/603/calculate-feed-forward-forward-propagation-neural-network</link>
<description>&lt;p&gt;In the figure&amp;nbsp;below, a neural network is shown. Calculate the following:&lt;/p&gt;

&lt;p&gt;1) How many neurons do we have in the input layer and the output layer?&lt;/p&gt;

&lt;p&gt;2) How many hidden layers do we have?&lt;/p&gt;

&lt;p&gt;3) If all the weights are initialized to 1 ($w1=w2=w3=...=w19=1$), what is the output of this network after feed-forward for the sample shown in the figure (X = (x1,x2,x3) = (2,5,3) and y=10)? What is the error of the network ($\text { Error }=\frac{1}{2}(\hat{y}-y)^{2}$)? Assume the activation function for all neurons except the output neuron is $f(z)=z$.&lt;br&gt;
&lt;br&gt;
4) If we change the activation function of all&amp;nbsp;the neurons in the second hidden layer to Sigmoid ($S(x)=\frac{1}{1+e^{-x}}=\frac{e^{x}}{e^{x}+1}$), what would be the output of the network after this change? Calculate the error as well.&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/rtqPiRa.jpg&quot;&gt;https://i.imgur.com/rtqPiRa.jpg&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/603/calculate-feed-forward-forward-propagation-neural-network</guid>
<pubDate>Thu, 04 Apr 2019 15:54:17 +0000</pubDate>
</item>
<item>
<title>What is the difference between a batch and an epoch in a Neural Network?</title>
<link>https://ask.ghassem.com/497/what-the-difference-between-batch-and-epoch-neural-network</link>
<description>Both the batch size and the number of epochs are integer values, and they seem to do the same thing in stochastic gradient descent. What are these two hyperparameters of this learning algorithm, and how do they differ?</description>
<category>Machine Learning Interview Questions</category>
<guid isPermaLink="true">https://ask.ghassem.com/497/what-the-difference-between-batch-and-epoch-neural-network</guid>
<pubDate>Tue, 30 Oct 2018 14:45:56 +0000</pubDate>
</item>
</channel>
</rss>