<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Ask Ghassem - Recent questions tagged logistic-regression</title>
<link>https://ask.ghassem.com/tag/logistic-regression</link>
<description>Powered by Question2Answer</description>
<item>
<title>How to calculate the probability and accuracy of a Logistic Regression classifier?</title>
<link>https://ask.ghassem.com/795/calculate-probability-accuracy-logistic-regression-classifier</link>
<description>&lt;p&gt;How to solve this problem?&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/8urywpf.jpg&quot;&gt;https://i.imgur.com/8urywpf.jpg&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Q1) Complete the ? sections&lt;/p&gt;

&lt;p&gt;Q2) Accuracy of system if threshold = 0.5?&lt;/p&gt;

&lt;p&gt;Q3)&amp;nbsp;Accuracy of system if threshold = 0.95?&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/795/calculate-probability-accuracy-logistic-regression-classifier</guid>
<pubDate>Mon, 03 Feb 2020 20:31:49 +0000</pubDate>
</item>
<item>
<title>How to optimize weights in Logistic Regression?</title>
<link>https://ask.ghassem.com/639/how-to-optimize-weights-in-logistic-regression</link>
<description>&lt;p&gt;The hypothesis (model) of Logistic Regression, a binary classifier ($y \in \{0,1\}$), is given in the equation below:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hypothesis&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;$S(z)=P(y=1 | x)=h_{\theta}(x)=\frac{1}{1+\exp \left(-\theta^{\top} x\right)}$&lt;/p&gt;

&lt;p&gt;This gives the probability of Class 1. By applying a threshold (for example, predict 1 when $h_{\theta}(x) &amp;gt; 0.5$), we can classify each sample as 1 or 0.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost function&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The cost function for Logistic Regression is defined below. It is called the &lt;em&gt;binary cross-entropy loss function&lt;/em&gt;:&lt;/p&gt;

&lt;p&gt;$J(\theta)=-\frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)} \log \left(h_{\theta}\left(x^{(i)}\right)\right)+\left(1-y^{(i)}\right) \log \left(1-h_{\theta}\left(x^{(i)}\right)\right)\right)$&lt;/p&gt;
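
&lt;p&gt;A minimal sketch of the hypothesis and the cost above in Python. The function names and the toy inputs are illustrative assumptions and are not the table from this exercise; $x_0 = 1$ is assumed to be the bias term.&lt;/p&gt;

```python
import math

def sigmoid(z):
    # Logistic function S(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    # h_theta(x) = S(theta^T x); x[0] is assumed to be the bias term 1
    z = sum(t * v for t, v in zip(theta, x))
    return sigmoid(z)

def cost(theta, X, y):
    # Binary cross-entropy J(theta), averaged over the m samples
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        h = hypothesis(theta, xi)
        total += yi * math.log(h) + (1 - yi) * math.log(1 - h)
    return -total / m

# Hypothetical toy data (bias term first), not the exercise table
X = [[1.0, 0.0], [1.0, 1.0]]
y = [0, 1]
# With theta = 0 every h is 0.5, so J equals ln 2
print(cost([0.0, 0.0], X, y))
```

&lt;p&gt;Setting all parameters to zero is a handy sanity check, because every prediction is then 0.5 and the cost reduces to $\ln 2$ regardless of the labels.&lt;/p&gt;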

&lt;p&gt;&lt;strong&gt;Iterative updates&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Assume we start with an initial value for every model parameter (these could be random numbers). In this case the only model parameters are the $\theta_j$, and we initialize all of them to 1: $\theta_j = 1$ for all $j \in \{0,1,...,n\}$, where $n$ is the number of features. Each parameter is then updated as follows:&lt;/p&gt;

&lt;p&gt;$\theta_{j_{new}} \leftarrow \theta_{j_{old}}+\alpha \times \frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)}-\sigma\left(\theta_{old}^{\top} x^{(i)}\right)\right] x_{j}^{(i)}$&lt;/p&gt;

&lt;p&gt;Where:&lt;br&gt;
$m =$ number of rows in the training batch&lt;br&gt;
$x^{(i)} = $ the feature &lt;em&gt;vector&lt;/em&gt; for sample $i$&lt;br&gt;
$\theta_j = $ the coefficient for feature $j$ (element $j$ of the coefficient &lt;em&gt;vector&lt;/em&gt; $\theta$)&lt;br&gt;
$y^{(i)} = $ actual class label for sample $i$ in the training batch&lt;br&gt;
$x_{j}^{(i)} = $ the element (column) $j$ in&amp;nbsp;the feature &lt;em&gt;vector&lt;/em&gt; for sample $i$&lt;br&gt;
$\alpha =$ the learning rate&lt;/p&gt;
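
&lt;p&gt;The update rule can be sketched as a single batch gradient-descent step. This is only an illustration: the function name &lt;code&gt;batch_update&lt;/code&gt; and the toy batch are assumptions, not the exercise data. All $\theta_j$ are updated from the same old parameter vector.&lt;/p&gt;

```python
import math

def sigmoid(z):
    # Logistic function 1 / (1 + exp(-z))
    return 1.0 / (1.0 + math.exp(-z))

def batch_update(theta, X, y, alpha):
    # One batch gradient-descent step using all m rows of the batch
    m = len(X)
    # Residuals y^(i) - sigma(theta_old^T x^(i)), computed once from old theta
    errors = []
    for xi, yi in zip(X, y):
        z = sum(t * v for t, v in zip(theta, xi))
        errors.append(yi - sigmoid(z))
    new_theta = []
    for j in range(len(theta)):
        # Average of residual times feature j over the batch
        step_j = sum(e * xi[j] for e, xi in zip(errors, X)) / m
        new_theta.append(theta[j] + alpha * step_j)
    return new_theta

# Hypothetical toy batch (x_0 = 1 is the bias term), not the exercise table
X = [[1.0, 2.0], [1.0, -2.0]]
y = [1, 0]
theta = batch_update([0.0, 0.0], X, y, alpha=0.1)
print(theta)
```

&lt;p&gt;For mini-batch gradient descent the same function would be called once per mini-batch, passing only the rows of that mini-batch, so the parameters change several times within one epoch.&lt;/p&gt;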

&lt;p&gt;&lt;strong&gt;Dataset&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The training dataset of pass/fail in an exam for 5 students is given in the table below:&lt;br&gt;
&lt;img alt=&quot;&quot; height=&quot;203&quot; src=&quot;https://i.imgur.com/aVDAxTj.png&quot; width=&quot;300&quot;&gt;&lt;/p&gt;

&lt;p&gt;If we initialize all the model parameters with 1 (all $\theta_j = 1$), and the learning rate is $\alpha = 0.1$, and if we use &lt;strong&gt;batch gradient descent&lt;/strong&gt;, what will be the:&lt;/p&gt;

&lt;p&gt;$a)$ Accuracy of the model on the training set at initialization ($\text{accuracy} = \frac{\text{number of correct classifications}}{\text{all classifications}}$)?&lt;br&gt;
$b)$ Cost at initialization?&lt;br&gt;
$c)$ Cost after 1 epoch?&lt;br&gt;
$d)$ Repeat steps $a$, $b$, and $c$ using &lt;strong&gt;mini-batch gradient descent&lt;/strong&gt; with $\text{batch size} = 2$.&lt;/p&gt;

&lt;p&gt;(Hint: For $x_{j}^{(i)}$, when $j=0$ we have $x_{0}^{(i)} = 1$ for all $i$.)&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/639/how-to-optimize-weights-in-logistic-regression</guid>
<pubDate>Wed, 05 Jun 2019 17:38:50 +0000</pubDate>
</item>
<item>
<title>How to calculate LogLoss in logistic regression?</title>
<link>https://ask.ghassem.com/588/how-to-calculate-logloss-in-logistic-regression</link>
<description>&lt;p&gt;The dataset of pass/fail in an exam for 5 students is given in the table below. Suppose we use&amp;nbsp;&lt;strong&gt;Logistic Regression&lt;/strong&gt;&amp;nbsp;as the classifier and the optimizer returns the following model for the log-odds of passing the course:&lt;/p&gt;

&lt;p&gt;$\log_e(Odds) = -64 + 2 \times hours$&lt;/p&gt;
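
&lt;p&gt;As a rough sketch, the log-odds model above can be converted to a probability and then to a per-sample log loss, $-(y \log_e p + (1-y)\log_e(1-p))$. The function names are illustrative assumptions. The true label of each student is in the table image, so the example evaluates both possible labels rather than guessing one.&lt;/p&gt;

```python
import math

def pass_probability(hours):
    # Model from the question: ln(odds) = -64 + 2 * hours
    log_odds = -64 + 2 * hours
    return 1.0 / (1.0 + math.exp(-log_odds))

def sample_logloss(y, p):
    # Per-sample log loss: -(y ln p + (1 - y) ln(1 - p))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

p = pass_probability(33)
# The label must be read from the table, so both cases are printed
print(round(sample_logloss(1, p), 4))  # loss if this student passed
print(round(sample_logloss(0, p), 4))  # loss if this student failed
```

&lt;p&gt;The total loss is the mean of these per-sample values over all 5 students, each evaluated with that student's own hours and label.&lt;/p&gt;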

&lt;p&gt;&lt;img alt=&quot;&quot; height=&quot;203&quot; src=&quot;https://i.imgur.com/aVDAxTj.png&quot; width=&quot;300&quot;&gt;&lt;/p&gt;

&lt;p&gt;1) How do we calculate&amp;nbsp;&lt;strong&gt;the loss of the model&lt;/strong&gt;&amp;nbsp;for the student who studied 33 hours?&lt;/p&gt;

&lt;p&gt;2) What is the &lt;strong&gt;total loss&lt;/strong&gt; of the model, using the equation below?&lt;/p&gt;

&lt;p&gt;$Logloss = -\frac{1}{N} \sum_{i=1}^N(y_i\log_e(p_i) + (1 - y_i)\log_e(1 - p_i))$&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/588/how-to-calculate-logloss-in-logistic-regression</guid>
<pubDate>Mon, 18 Mar 2019 20:34:40 +0000</pubDate>
</item>
<item>
<title>How to calculate probability in Logistic Regression?</title>
<link>https://ask.ghassem.com/587/how-to-calculate-probability-in-logistic-regression</link>
<description>&lt;p&gt;The dataset of pass/fail in an exam for 5 students is given in the table below. Suppose we use&amp;nbsp;&lt;strong&gt;Logistic Regression&lt;/strong&gt;&amp;nbsp;as the classifier and the optimizer returns the following model for the log-odds of passing the course:&lt;/p&gt;

&lt;p&gt;$\log (Odds) = -64 + 2 \times hours$&lt;/p&gt;
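
&lt;p&gt;A minimal sketch of how this model maps hours to a probability and back, assuming $\log$ denotes the natural logarithm. The names &lt;code&gt;pass_probability&lt;/code&gt; and &lt;code&gt;hours_for_probability&lt;/code&gt; are illustrative, not part of the original question.&lt;/p&gt;

```python
import math

INTERCEPT = -64.0  # from log(odds) = -64 + 2 * hours
SLOPE = 2.0

def pass_probability(hours):
    # p = 1 / (1 + exp(-log_odds))
    log_odds = INTERCEPT + SLOPE * hours
    return 1.0 / (1.0 + math.exp(-log_odds))

def hours_for_probability(p):
    # Invert the model: ln(p / (1 - p)) = INTERCEPT + SLOPE * hours
    return (math.log(p / (1.0 - p)) - INTERCEPT) / SLOPE

print(round(pass_probability(33), 4))
print(round(hours_for_probability(0.95), 2))
```

&lt;p&gt;Because the probability increases with hours, any study time above the value returned by &lt;code&gt;hours_for_probability(0.95)&lt;/code&gt; gives a pass probability of more than 95%.&lt;/p&gt;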

&lt;p&gt;&lt;img alt=&quot;&quot; height=&quot;203&quot; src=&quot;https://i.imgur.com/aVDAxTj.png&quot; width=&quot;300&quot;&gt;&lt;/p&gt;

&lt;p&gt;1) How do we calculate the&amp;nbsp;&lt;strong&gt;probability of Pass&lt;/strong&gt;&amp;nbsp;for the student who studied 33 hours?&lt;/p&gt;

&lt;p&gt;2) &lt;strong&gt;At least how many hours&lt;/strong&gt; should a student study to pass the course with a probability of more than 95%?&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/587/how-to-calculate-probability-in-logistic-regression</guid>
<pubDate>Mon, 18 Mar 2019 20:22:35 +0000</pubDate>
</item>
</channel>
</rss>