<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Ask Ghassem - Recent questions tagged epoch</title>
<link>https://ask.ghassem.com/tag/epoch</link>
<description>Powered by Question2Answer</description>
<item>
<title>How to optimize weights in Logistic Regression?</title>
<link>https://ask.ghassem.com/639/how-to-optimize-weights-in-logistic-regression</link>
<description>&lt;p&gt;The hypothesis (model) of Logistic Regression, which is a binary classifier ($y \in \{0,1\}$), is given in the equation below:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hypothesis&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;$S(z)=P(y=1 | x)=h_{\theta}(x)=\frac{1}{1+\exp \left(-\theta^{\top} x\right)}$&lt;/p&gt;

&lt;p&gt;This gives the probability of class 1; by setting a threshold (such as $h_{\theta}(x) &amp;gt; 0.5$), we can classify each sample as 1 or 0.&lt;/p&gt;
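
&lt;p&gt;As a quick check on the threshold (a short worked step added for illustration, using a hypothetical score rather than a row of the dataset): because $S(0)=\frac{1}{1+\exp(0)}=0.5$ and $S$ is strictly increasing, $h_{\theta}(x) &amp;gt; 0.5$ holds exactly when $\theta^{\top} x &amp;gt; 0$. For a hypothetical score $\theta^{\top} x = 2$:&lt;/p&gt;

&lt;p&gt;$S(2)=\frac{1}{1+\exp(-2)} \approx 0.881$&lt;/p&gt;

&lt;p&gt;so that sample would be classified as 1.&lt;/p&gt;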

&lt;p&gt;&lt;strong&gt;Cost function&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The cost function for Logistic Regression is defined below. It is called the &lt;em&gt;binary cross-entropy loss function&lt;/em&gt;:&lt;/p&gt;

&lt;p&gt;$J(\theta)=-\frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)} \log \left(h_{\theta}\left(x^{(i)}\right)\right)+\left(1-y^{(i)}\right) \log \left(1-h_{\theta}\left(x^{(i)}\right)\right)\right)$&lt;/p&gt;
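
&lt;p&gt;To see how a single sample contributes to this sum, here is a small worked example. It assumes the natural logarithm and a hypothetical prediction $h_{\theta}(x) = 0.9$ (not a value taken from the dataset). Only one of the two terms is active for any given label:&lt;/p&gt;

&lt;p&gt;$y = 1: \quad -\log(0.9) \approx 0.105$&lt;/p&gt;

&lt;p&gt;$y = 0: \quad -\log(1 - 0.9) = -\log(0.1) \approx 2.303$&lt;/p&gt;

&lt;p&gt;A confident prediction that agrees with the label costs little, while the same prediction with the opposite label is penalized heavily.&lt;/p&gt;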

&lt;p&gt;&lt;strong&gt;Iterative updates&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Assume we start from some initial value for all the model parameters (in this case the only model parameters are $\theta_j$, and we initialize all of them with 1: $\theta_j = 1$ for $j \in \{0,1,\ldots,n\}$, where $n$ is the number of features).&lt;/p&gt;

&lt;p&gt;$\theta_{j_{\text{new}}} \leftarrow \theta_{j_{\text{old}}}+\alpha \times \frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)}-\sigma\left(\theta_{\text{old}}^{\top} x^{(i)}\right)\right] x_{j}^{(i)}$&lt;/p&gt;

&lt;p&gt;Where:&lt;br&gt;
$m =$ number of rows in the training batch&lt;br&gt;
$x^{(i)} = $ the feature &lt;em&gt;vector&lt;/em&gt; for sample $i$&lt;br&gt;
$\theta = $ the coefficient &lt;em&gt;vector&lt;/em&gt; corresponding to the features ($\theta_j$ is its element $j$)&lt;br&gt;
$\sigma = $ the sigmoid function, written $S$ in the hypothesis above&lt;br&gt;
$y^{(i)} = $ actual class label for sample $i$ in the training batch&lt;br&gt;
$x_{j}^{(i)} = $ the element (column) $j$ in the feature &lt;em&gt;vector&lt;/em&gt; for sample $i$&lt;br&gt;
$\alpha =$ the learning rate&lt;/p&gt;
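
&lt;p&gt;For readers wondering where this rule comes from, the following short derivation sketch shows that it is an ordinary gradient descent step on $J(\theta)$. It uses the standard sigmoid identity $S'(z) = S(z)\left(1-S(z)\right)$ and the fact that $\sigma\left(\theta^{\top} x\right) = h_{\theta}(x)$. Differentiating the cost with respect to one coefficient gives:&lt;/p&gt;

&lt;p&gt;$\frac{\partial J}{\partial \theta_{j}}=\frac{1}{m} \sum_{i=1}^{m}\left[h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right] x_{j}^{(i)}$&lt;/p&gt;

&lt;p&gt;Gradient descent moves against the gradient, which flips the sign inside the bracket and yields the update above:&lt;/p&gt;

&lt;p&gt;$\theta_{j} \leftarrow \theta_{j}-\alpha \frac{\partial J}{\partial \theta_{j}}=\theta_{j}+\alpha \times \frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)}-h_{\theta}\left(x^{(i)}\right)\right] x_{j}^{(i)}$&lt;/p&gt;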

&lt;p&gt;&lt;strong&gt;Dataset&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The training dataset of pass/fail in an exam for 5 students is given in the table below:&lt;br&gt;
&lt;img alt=&quot;&quot; height=&quot;203&quot; src=&quot;https://i.imgur.com/aVDAxTj.png&quot; width=&quot;300&quot;&gt;&lt;/p&gt;

&lt;p&gt;If we initialize all the model parameters with 1 (all $\theta_j = 1$), and the learning rate is $\alpha = 0.1$, and if we use &lt;strong&gt;batch gradient descent&lt;/strong&gt;, what will be the:&lt;/p&gt;

&lt;p&gt;$a)$ Accuracy of the model on the training set at initialization ($\text{accuracy} = \frac{\text{number of correct classifications}}{\text{all classifications}}$)?&lt;br&gt;
$b)$&amp;nbsp;Cost at initialization?&lt;br&gt;
$c)$ Cost after 1 epoch?&lt;br&gt;
$d)$ Repeat steps $a$, $b$, and $c$ using &lt;strong&gt;mini-batch gradient descent&lt;/strong&gt; with $\text{batch size} = 2$&lt;/p&gt;
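
&lt;p&gt;A note on what one epoch means in each setting, as a small worked count. It assumes the samples are taken in table order without shuffling and that the last, incomplete mini-batch is kept; both are assumptions, since the question does not specify them. With $m = 5$ training rows and batch size $b$, the number of parameter updates in one epoch is:&lt;/p&gt;

&lt;p&gt;$\text{updates per epoch} = \left\lceil \frac{m}{b} \right\rceil$&lt;/p&gt;

&lt;p&gt;Batch gradient descent uses $b = m = 5$, so one epoch is a single update computed from all 5 rows. Mini-batch gradient descent with $b = 2$ gives $\left\lceil \frac{5}{2} \right\rceil = 3$ updates per epoch, on batches of 2, 2, and 1 rows, and each update starts from the $\theta$ produced by the previous one.&lt;/p&gt;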

&lt;p&gt;(Hint: for $j=0$, we have $x_{0}^{(i)} = 1$ for all $i$.)&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/639/how-to-optimize-weights-in-logistic-regression</guid>
<pubDate>Wed, 05 Jun 2019 17:38:50 +0000</pubDate>
</item>
<item>
<title>What is the difference between a batch and an epoch in a Neural Network?</title>
<link>https://ask.ghassem.com/497/what-the-difference-between-batch-and-epoch-neural-network</link>
<description>Both the batch size and the number of epochs are integer values, and they seem to do the same thing in stochastic gradient descent. What are these two hyperparameters of the learning algorithm?</description>
<category>Machine Learning Interview Questions</category>
<guid isPermaLink="true">https://ask.ghassem.com/497/what-the-difference-between-batch-and-epoch-neural-network</guid>
<pubDate>Tue, 30 Oct 2018 14:45:56 +0000</pubDate>
</item>
</channel>
</rss>