<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Ask Ghassem - Recent activity in Machine Learning</title>
<link>https://ask.ghassem.com/activity/machine-learning</link>
<description>Powered by Question2Answer</description>
<item>
<title>Answered: Step-by-Step Hidden State Calculation in a Recurrent Neural Network</title>
<link>https://ask.ghassem.com/1049/step-step-hidden-state-calculation-recurrent-neural-network?show=1050#a1050</link>
<description>&lt;p&gt;We compute each hidden state step-by-step using&lt;/p&gt;

&lt;p&gt;$$ h_t = \text{ReLU}(W_{ih} \cdot x_t + W_{hh} \cdot h_{t-1}). $$&lt;/p&gt;

&lt;p&gt;\( h_1 = \text{ReLU}(0.4 \cdot 3 + 0.6 \cdot 0) = 1.2 \)&lt;/p&gt;

&lt;p&gt;\( h_2 = \text{ReLU}(0.4 \cdot 3 + 0.6 \cdot 1.2) = 1.92 \)&lt;/p&gt;

&lt;p&gt;\( h_3 = \text{ReLU}(0.4 \cdot 3 + 0.6 \cdot 1.92) = 2.352 \)&lt;/p&gt;

&lt;p&gt;\( h_4 = \text{ReLU}(0.4 \cdot 3 + 0.6 \cdot 2.352) = 2.6112 \)&lt;/p&gt;
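
&lt;p&gt;The recursion can be double-checked with a few lines of Python (a sketch; the scalar weights W_ih = 0.4, W_hh = 0.6 and the constant input x_t = 3 are taken from the numbers above):&lt;/p&gt;

```python
# Verify the hand calculation of h_t = ReLU(W_ih * x_t + W_hh * h_{t-1})
W_ih, W_hh = 0.4, 0.6
xs = [3, 3, 3, 3]      # x_1 .. x_4, all equal to 3
h = 0.0                # initial hidden state h_0 = 0
history = []
for x in xs:
    h = max(0.0, W_ih * x + W_hh * h)   # ReLU is max(0, .) for a scalar
    history.append(round(h, 4))
print(history)         # [1.2, 1.92, 2.352, 2.6112]
```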

&lt;p&gt;&lt;strong&gt;Final Answer: \( h_4 = 2.6112 \)&lt;/strong&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1049/step-step-hidden-state-calculation-recurrent-neural-network?show=1050#a1050</guid>
<pubDate>Mon, 01 Dec 2025 18:33:19 +0000</pubDate>
</item>
<item>
<title>Answered: How to calculate feed-forward (forward-propagation) in neural network for classification?</title>
<link>https://ask.ghassem.com/1047/calculate-forward-forward-propagation-network-classification?show=1048#a1048</link>
<description>&lt;p&gt;The answer is provided below. Please comment if you have questions or find mistakes.&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/Qchg1sl.jpeg&quot;&gt;https://i.imgur.com/Qchg1sl.jpeg&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1047/calculate-forward-forward-propagation-network-classification?show=1048#a1048</guid>
<pubDate>Wed, 09 Oct 2024 13:00:11 +0000</pubDate>
</item>
<item>
<title>Commented: How to update weights using gradient descent algorithm?</title>
<link>https://ask.ghassem.com/596/how-to-update-weights-using-gradient-decent-algorithm?show=1046#c1046</link>
<description>Isn&amp;#039;t the derivative in 3) wrong? I got 2w-5 instead of 4w-10. I think the person in the solution forgot to incorporate the 1/2.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/596/how-to-update-weights-using-gradient-decent-algorithm?show=1046#c1046</guid>
<pubDate>Wed, 17 Apr 2024 22:58:56 +0000</pubDate>
</item>
<item>
<title>Commented: How to update weights in backpropagation algorithm (a numerical example)?</title>
<link>https://ask.ghassem.com/612/update-weights-backpropagation-algorithm-numerical-example?show=1045#c1045</link>
<description>It does make a difference, because after taking the derivative it should be (target - output) but in the solution it&amp;#039;s (output - target)</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/612/update-weights-backpropagation-algorithm-numerical-example?show=1045#c1045</guid>
<pubDate>Fri, 05 Apr 2024 21:48:20 +0000</pubDate>
</item>
<item>
<title>When to use one hot encode a category and when to segment by category?</title>
<link>https://ask.ghassem.com/1034/when-to-use-one-hot-encode-category-and-when-segment-category</link>
<description>When preprocessing data for machine learning, is there any difference between using one-hot encoding to turn categorical variables into numeric variables and segmenting the data (and the model being used) along the category? Say you run a multivariate regression model on data covering 5 cities. Would a single model with one variable for each city be better or worse than 5 models, one specific to each city? Or is there no difference? Or does it depend on certain factors and intuition?</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1034/when-to-use-one-hot-encode-category-and-when-segment-category</guid>
<pubDate>Wed, 22 Feb 2023 20:30:38 +0000</pubDate>
</item>
<item>
<title>Answered: How to calculate the residual errors, (MSE),(MAE), and (RMSE)?</title>
<link>https://ask.ghassem.com/1031/how-to-calculate-the-residual-errors-mse-mae-and-rmse?show=1032#a1032</link>
<description>&lt;p&gt;1. First, we need to calculate the residual errors. Residual errors are the difference between the actual values and predicted values.&lt;/p&gt;

&lt;table border=&quot;1&quot; cellpadding=&quot;1&quot; style=&quot;width:500px&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;th&gt;Sample&lt;/th&gt;
&lt;th&gt;Feature 1&lt;/th&gt;
&lt;th&gt;Feature 2&lt;/th&gt;
&lt;th&gt;Actual Value&lt;/th&gt;
&lt;th&gt;Predicted Value&lt;/th&gt;
&lt;th&gt;Residual Error (Actual - Predicted)&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;-2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;-1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;ol start=&quot;2&quot;&gt;
&lt;li&gt;Next, we can calculate the MSE by taking the average of the squared residual errors.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;$MSE = ((-2)^2 + (-1)^2 + (-1)^2 + (-1)^2 + (-1)^2) / 5 = 8 / 5 = 1.6$&lt;/p&gt;

&lt;ol start=&quot;3&quot;&gt;
&lt;li&gt;To calculate the MAE, we take the average of the absolute residual errors.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;$MAE = (|-2| + |-1| + |-1| + |-1| + |-1|) / 5 = 6 / 5 = 1.2$&lt;/p&gt;

&lt;ol start=&quot;4&quot;&gt;
&lt;li&gt;Finally, to calculate the RMSE, we take the square root of the MSE.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;$RMSE = \sqrt{1.6} \approx 1.26$&lt;/p&gt;
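
&lt;p&gt;A short script reproduces all three metrics from the table (a sketch in plain Python, no libraries needed):&lt;/p&gt;

```python
# Recompute residuals, MSE, MAE and RMSE for the five samples in the table
actual    = [4, 5, 6, 7, 8]
predicted = [6, 6, 7, 8, 9]

residuals = [a - p for a, p in zip(actual, predicted)]   # [-2, -1, -1, -1, -1]
n = len(residuals)
mse  = sum(r ** 2 for r in residuals) / n   # (4 + 1 + 1 + 1 + 1) / 5 = 1.6
mae  = sum(abs(r) for r in residuals) / n   # (2 + 1 + 1 + 1 + 1) / 5 = 1.2
rmse = mse ** 0.5                           # sqrt(1.6), about 1.26
```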

&lt;p&gt;Therefore, the residual errors are [-2, -1, -1, -1, -1], the MSE is 1.6, the MAE is 1.2, and the RMSE is approximately 1.26.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1031/how-to-calculate-the-residual-errors-mse-mae-and-rmse?show=1032#a1032</guid>
<pubDate>Fri, 27 Jan 2023 04:16:33 +0000</pubDate>
</item>
<item>
<title>Creating tables from unstructured texts about stock market</title>
<link>https://ask.ghassem.com/1026/creating-tables-from-unstructured-texts-about-stock-market</link>
<description>&lt;div&gt;
&lt;div&gt;
&lt;div&gt;
&lt;p&gt;I am trying to extract information such as profits, revenues and others, along with their corresponding dates and quarters, from unstructured text about the stock market, and convert it into a report in table form. But since the input text has no fixed format, it is hard to know which entity belongs to which date and quarter, and which value belongs to which entity. Chunking works on a few documents but is not enough. Is there any unsupervised way of linking entities with their corresponding dates, values and quarters?&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1026/creating-tables-from-unstructured-texts-about-stock-market</guid>
<pubDate>Tue, 02 Aug 2022 00:47:49 +0000</pubDate>
</item>
<item>
<title>Kmeans clustering in python - Giving original labels to predicted clusters</title>
<link>https://ask.ghassem.com/1022/kmeans-clustering-python-giving-original-predicted-clusters</link>
<description>&lt;p&gt;I have a dataset with 7 labels in the target variable.&lt;/p&gt;

&lt;pre class=&quot;prettyprint lang-python&quot; data-pbcklang=&quot;python&quot; data-pbcktabsize=&quot;4&quot;&gt;
X = data.drop(&#039;target&#039;, axis=1)
Y = data[&#039;target&#039;]
Y.unique()&lt;/pre&gt;

&lt;p&gt;array([&#039;Normal_Weight&#039;, &#039;Overweight_Level_I&#039;, &#039;Overweight_Level_II&#039;,&lt;br&gt;
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&#039;Obesity_Type_I&#039;, &#039;Insufficient_Weight&#039;, &#039;Obesity_Type_II&#039;,&lt;br&gt;
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&#039;Obesity_Type_III&#039;], dtype=object)&lt;/p&gt;

&lt;pre class=&quot;prettyprint lang-python&quot; data-pbcklang=&quot;python&quot; data-pbcktabsize=&quot;4&quot;&gt;
km = KMeans(n_clusters=7, init=&quot;k-means++&quot;, random_state=300)
km.fit_predict(X)
np.unique(km.labels_)&lt;/pre&gt;

&lt;p&gt;array([0, 1, 2, 3, 4, 5, 6])&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;After performing the KMeans clustering algorithm with the number of clusters set to 7, the resulting clusters are labeled 0, 1, 2, 3, 4, 5, 6. But how do I know which real label matches which predicted label?&lt;/p&gt;
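
&lt;p&gt;One common approach (a sketch, with hypothetical toy arrays standing in for Y and km.labels_): map each cluster id to the most frequent true label among its members, then measure accuracy against the true labels.&lt;/p&gt;

```python
import numpy as np

# Toy stand-ins for the real Y (true labels) and km.labels_ (cluster ids)
y_true   = np.array(["Normal_Weight", "Normal_Weight", "Obesity_Type_I",
                     "Obesity_Type_I", "Insufficient_Weight", "Insufficient_Weight"])
clusters = np.array([2, 2, 0, 0, 1, 1])

# Majority vote: each cluster id gets the most frequent true label inside it
mapping = {}
for c in np.unique(clusters):
    labels, counts = np.unique(y_true[clusters == c], return_counts=True)
    mapping[int(c)] = labels[np.argmax(counts)]

predicted = np.array([mapping[int(c)] for c in clusters])
accuracy = float(np.mean(predicted == y_true))   # 1.0 for this toy example
```

Note that a majority vote can assign the same label to several clusters; for a strict one-to-one matching, the Hungarian algorithm (scipy.optimize.linear_sum_assignment on the confusion matrix) is the usual tool.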

&lt;p&gt;In other words, I want to know how to give the original label names to the newly predicted labels, so that they can be compared, e.g. how many values are clustered correctly (accuracy).&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1022/kmeans-clustering-python-giving-original-predicted-clusters</guid>
<pubDate>Wed, 27 Apr 2022 05:32:54 +0000</pubDate>
</item>
<item>
<title>Bankruptcy prediction and credit card</title>
<link>https://ask.ghassem.com/1021/bankruptcy-prediction-and-credit-card</link>
<description>Hello everyone, newbie data scientist here.&lt;br /&gt;
I&amp;#039;m working on a project to predict companies&amp;#039; bankruptcy probability (probability of default) and to assign them a credit rating/score based on that:&lt;br /&gt;
For example, below 50% probability is good and above is bad (just for the example).&lt;br /&gt;
I have a dataset containing financial ratios and a class indicating whether the company is bankrupt or not (0 and 1).&lt;br /&gt;
I&amp;#039;m planning to use these models:&lt;br /&gt;
Logistic regression, linear discriminant analysis, decision trees, random forest, ANN, AdaBoost, SVM.&lt;br /&gt;
&lt;br /&gt;
The question is (and I know it is a dumb question):&lt;br /&gt;
Do those models return a probability which I can transform into labels? I saw that in a thesis and I&amp;#039;m not sure about it.&lt;br /&gt;
&lt;br /&gt;
Otherwise, any guidance or tips would be appreciated.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1021/bankruptcy-prediction-and-credit-card</guid>
<pubDate>Sun, 10 Apr 2022 05:50:14 +0000</pubDate>
</item>
<item>
<title>Answered: When dealing with categorical values, should the &#039;year&#039; column be encoded using OHE or OrdinalEncoder?</title>
<link>https://ask.ghassem.com/1012/dealing-categorical-values-should-encoded-ordinalencoder?show=1013#a1013</link>
<description>You should ask yourself whether the order of the years has an effect on predicting the price. It seems it does, so OrdinalEncoder seems the better choice. If you use OneHotEncoder, you treat the years as unordered categories with equal weight in predicting the price.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1012/dealing-categorical-values-should-encoded-ordinalencoder?show=1013#a1013</guid>
<pubDate>Mon, 20 Dec 2021 18:10:13 +0000</pubDate>
</item>
<item>
<title>Answer selected: How to create a Decision Tree using the ID3 algorithm?</title>
<link>https://ask.ghassem.com/1008/how-to-create-a-decision-tree-using-the-id3-algorithm?show=1009#a1009</link>
<description>&lt;p&gt;&lt;strong&gt;a)&lt;/strong&gt; See the following figure for the ID3 decision tree:&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/kizNjoc.png&quot;&gt;https://i.imgur.com/kizNjoc.png&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;b)&lt;/strong&gt; Only the disjunction of conjunctions for Martians was required:&lt;/p&gt;

&lt;p&gt;$\begin{aligned}&lt;br&gt;
&amp;amp;(\text { Legs }=3) \vee \\&lt;br&gt;
&amp;amp;(\text { Legs }=2 \wedge \text { Green }=\text { Yes } \wedge \text { Height }=\text { Tall }) \vee \\&lt;br&gt;
&amp;amp;(\text { Legs }=2 \wedge \text { Green }=\text { No } \wedge \text { Height }=\text { Short } \wedge \text { Smelly }=\text { Yes })&lt;br&gt;
\end{aligned}$&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://github.com/tofighi/MachineLearning/blob/master/Decision_Tree_Example.ipynb&quot;&gt;Python Code&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Step 1: Organize the Dataset&lt;/h2&gt;

&lt;p&gt;Our data has the following features and values:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Species&lt;/strong&gt;: Target variable (M or H)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Features&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Green&lt;/strong&gt;: \( N \) or \( Y \)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Legs&lt;/strong&gt;: \( 2 \) or \( 3 \)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Height&lt;/strong&gt;: \( S \) or \( T \)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Smelly&lt;/strong&gt;: \( N \) or \( Y \)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;table border=&quot;1&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;th&gt;Index&lt;/th&gt;
&lt;th&gt;Species&lt;/th&gt;
&lt;th&gt;Green&lt;/th&gt;
&lt;th&gt;Legs&lt;/th&gt;
&lt;th&gt;Height&lt;/th&gt;
&lt;th&gt;Smelly&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;M&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;S&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;M&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;M&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;M&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;S&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;M&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;S&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;S&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;h2&gt;Step 2: Calculate the Initial Entropy for the Target Variable (Species)&lt;/h2&gt;

&lt;p&gt;We start by calculating the entropy of the target variable, &lt;strong&gt;Species&lt;/strong&gt;, which has two classes: &lt;strong&gt;M&lt;/strong&gt; (Martian) and &lt;strong&gt;H&lt;/strong&gt; (Human).&lt;/p&gt;

&lt;h3&gt;Total Counts&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Martians (M): 5&lt;/li&gt;
&lt;li&gt;Humans (H): 5&lt;/li&gt;
&lt;li&gt;Total: 10&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Entropy Formula&lt;/h3&gt;

&lt;p&gt;The entropy \( E \) for a binary classification is calculated as:&lt;/p&gt;

&lt;p&gt;$$ E = -p_+ \log_2(p_+) - p_- \log_2(p_-) $$&lt;/p&gt;

&lt;p&gt;Where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;\( p_+ \): Probability of positive class (M)&lt;/li&gt;
&lt;li&gt;\( p_- \): Probability of negative class (H)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Calculation&lt;/h3&gt;

&lt;p&gt;$$ p(M) = \frac{5}{10} = 0.5 $$&lt;/p&gt;

&lt;p&gt;$$ p(H) = \frac{5}{10} = 0.5 $$&lt;/p&gt;

&lt;p&gt;$$ E(Species) = -0.5 \cdot \log_2(0.5) - 0.5 \cdot \log_2(0.5) $$&lt;/p&gt;

&lt;p&gt;$$ = -0.5 \cdot (-1) - 0.5 \cdot (-1) $$&lt;/p&gt;

&lt;p&gt;$$ = 1.0 $$&lt;/p&gt;

&lt;h2&gt;Step 3: Calculate Entropy and Information Gain for Each Feature&lt;/h2&gt;

&lt;p&gt;We’ll calculate the entropy for each feature split and determine the information gain.&lt;/p&gt;

&lt;h3&gt;Feature: Green&lt;/h3&gt;

&lt;p&gt;Green can be either &lt;strong&gt;Y&lt;/strong&gt; or &lt;strong&gt;N&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For Green = Y:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Martians (M): 3&lt;/li&gt;
&lt;li&gt;Humans (H): 1&lt;/li&gt;
&lt;li&gt;Total: 4&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Entropy:&lt;/p&gt;

&lt;p&gt;$$ E(Green = Y) = -\left(\frac{3}{4}\right) \log_2\left(\frac{3}{4}\right) - \left(\frac{1}{4}\right) \log_2\left(\frac{1}{4}\right) $$&lt;/p&gt;

&lt;p&gt;$$ = -0.75 \cdot \log_2(0.75) - 0.25 \cdot \log_2(0.25) $$&lt;/p&gt;

&lt;p&gt;$$ = -0.75 \cdot (-0.415) - 0.25 \cdot (-2) $$&lt;/p&gt;

&lt;p&gt;$$ = 0.311 + 0.5 = 0.811 $$&lt;/p&gt;

&lt;p&gt;For Green = N:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Martians (M): 2&lt;/li&gt;
&lt;li&gt;Humans (H): 4&lt;/li&gt;
&lt;li&gt;Total: 6&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Entropy:&lt;/p&gt;

&lt;p&gt;$$ E(Green = N) = -\left(\frac{2}{6}\right) \log_2\left(\frac{2}{6}\right) - \left(\frac{4}{6}\right) \log_2\left(\frac{4}{6}\right) $$&lt;/p&gt;

&lt;p&gt;$$ = -0.333 \cdot \log_2(0.333) - 0.667 \cdot \log_2(0.667) $$&lt;/p&gt;

&lt;p&gt;$$ = -0.333 \cdot (-1.585) - 0.667 \cdot (-0.585) $$&lt;/p&gt;

&lt;p&gt;$$ = 0.528 + 0.389 = 0.917 $$&lt;/p&gt;

&lt;h3&gt;Weighted Entropy for Green&lt;/h3&gt;

&lt;p&gt;$$ E(Green) = \frac{4}{10} \cdot 0.811 + \frac{6}{10} \cdot 0.917 $$&lt;/p&gt;

&lt;p&gt;$$ = 0.3244 + 0.5502 = 0.8746 $$&lt;/p&gt;

&lt;h3&gt;Information Gain for Green&lt;/h3&gt;

&lt;p&gt;$$ IG(Species, Green) = E(Species) - E(Green) $$&lt;/p&gt;

&lt;p&gt;$$ = 1.0 - 0.8746 = 0.1254 $$&lt;/p&gt;
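
&lt;p&gt;These numbers can be reproduced with a short script (a sketch; with exact arithmetic the gain is about 0.1245, and the 0.1254 above comes from the rounded intermediate entropies):&lt;/p&gt;

```python
from math import log2

def entropy(counts):
    """Shannon entropy of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

e_species = entropy([5, 5])    # 5 M and 5 H: entropy 1.0
e_green_y = entropy([3, 1])    # Green = Y: 3 M, 1 H, entropy about 0.811
e_green_n = entropy([2, 4])    # Green = N: 2 M, 4 H, entropy about 0.918
e_green   = 4 / 10 * e_green_y + 6 / 10 * e_green_n   # weighted entropy
ig_green  = e_species - e_green                       # information gain, about 0.1245
```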

&lt;p&gt;Continue this process to calculate the entropy and information gain for each feature (Legs, Height, and Smelly) similarly.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1008/how-to-create-a-decision-tree-using-the-id3-algorithm?show=1009#a1009</guid>
<pubDate>Wed, 01 Dec 2021 11:56:37 +0000</pubDate>
</item>
<item>
<title>Answer selected: How to calculate LogLoss in logistic regression?</title>
<link>https://ask.ghassem.com/588/how-to-calculate-logloss-in-logistic-regression?show=874#a874</link>
<description>&lt;p&gt;Answer#2: Total Loss of the model&lt;/p&gt;

&lt;p&gt;First, we have to find the probability of each student passing the course.&lt;/p&gt;

&lt;p&gt;Let i represent the sampling index of the student.&lt;/p&gt;

&lt;p&gt;P1:&lt;/p&gt;

&lt;p&gt;Z=-64+(2*29)=-6&lt;/p&gt;

&lt;p&gt;P=1/(1+e^6)=0.0024&lt;/p&gt;

&lt;p&gt;P2:&lt;/p&gt;

&lt;p&gt;Z=-64+(2*15)=-34&lt;/p&gt;

&lt;p&gt;P=1/(1+e^34)=0 (the value is vanishingly small)&lt;/p&gt;

&lt;p&gt;P3: already known = 0.88&lt;/p&gt;

&lt;p&gt;P4:&lt;/p&gt;

&lt;p&gt;Z=-64+(2*28)=-8&lt;/p&gt;

&lt;p&gt;P=1/(1+e^8)=0.00033&lt;/p&gt;

&lt;p&gt;P5:&lt;/p&gt;

&lt;p&gt;Z=-64+(2*39)=14&lt;/p&gt;

&lt;p&gt;P=1/(1+e^-14)=0.999&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;The total loss of the model is calculated below, using the formula&lt;/p&gt;

&lt;p&gt;Log-loss_i = -(y_i*ln(P_i)+(1-y_i)*ln(1-P_i))&lt;/p&gt;

&lt;p&gt;Log-loss 1 = 2.4E-3 (y_1 = 0)&lt;/p&gt;

&lt;p&gt;Log-loss 2 = 0 (y_2 = 0)&lt;/p&gt;

&lt;p&gt;Log-loss 3 = 0.128 (y_3 = 1)&lt;/p&gt;

&lt;p&gt;Log-loss 4 = 8.0164 (y_4 = 1)&lt;/p&gt;

&lt;p&gt;Log-loss 5 = 0.001 (y_5 = 1)&lt;/p&gt;

&lt;p&gt;Note that each log-loss term is nonnegative. Total loss of the model = (1/5)(2.4E-3 + 0 + 0.128 + 8.0164 + 0.001) = 1.6296&lt;/p&gt;
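
&lt;p&gt;The computation above can be reproduced in a few lines (a sketch; the study hours and the labels y = [0, 0, 1, 1, 1] are read off the worked numbers rather than stated in the answer, and exact arithmetic gives about 1.626 versus the 1.6296 obtained with rounded probabilities):&lt;/p&gt;

```python
from math import exp, log

hours = [29, 15, 33, 28, 39]   # assumed study hours behind P1..P5
y     = [0, 0, 1, 1, 1]        # assumed pass/fail labels implied by the losses above

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

# P_i = sigmoid(z_i) with z_i = -64 + 2 * hours_i
probs = [sigmoid(-64 + 2 * h) for h in hours]

eps = 1e-12   # clip probabilities away from 0 and 1 to keep log() finite
losses = [-(yi * log(max(p, eps)) + (1 - yi) * log(max(1 - p, eps)))
          for yi, p in zip(y, probs)]
avg_log_loss = sum(losses) / len(losses)   # about 1.63
```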

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Answer 1: &lt;strong&gt;the loss of the model&lt;/strong&gt;&amp;nbsp;for the student who studied 33 hours&lt;/p&gt;

&lt;p&gt;Step 1: we have to find the probability of passing the course&lt;/p&gt;

&lt;p&gt;P=1/(1+e^-z)&lt;/p&gt;

&lt;p&gt;where z = log-odds = -64+(2*33) = 2&lt;/p&gt;

&lt;p&gt;after putting in the values... P=1/(1+e^-2)=0.88&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Now, let&#039;s calculate the log-loss of the model for that particular student, sample number 3, where &quot;i&quot; is the sampling index&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Log-loss = -(y_i*ln(P_i)+(1-y_i)*ln(1-P_i))&lt;/p&gt;

&lt;p&gt;Log-loss = -[1*ln(0.88)+(1-1)*ln(1-0.88)] = -ln(0.88)&lt;/p&gt;

&lt;p&gt;Answer#1: Log-loss = 0.128, the loss of the model for that student&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/588/how-to-calculate-logloss-in-logistic-regression?show=874#a874</guid>
<pubDate>Sun, 17 Oct 2021 16:48:29 +0000</pubDate>
</item>
<item>
<title>Which algorithm is best to detect anomalies within a data set of 5k+ user-login events?</title>
<link>https://ask.ghassem.com/1000/which-algorithm-best-detect-anomalies-within-login-events</link>
<description>I am trying to build an unsupervised ML model to detect anomalies within 5000+ users&amp;#039; login data. I selected 5 features contained in each of the user-login events (e.g. IP, hour of day, day of week, device_id, OS). I am looking for the best algorithm to use. I am considering using a density function to determine probabilities of the feature values and whether an event is an outlier. The problem is that feature values are only relevant to the specific user. For example, you cannot compare login IPs across users; a login IP is only applicable to its user.&lt;br /&gt;
Ultimately, I want to detect events that are changes in a user login behavior, like different IP, day, hour, device_id, or OS, where the more features that have changed increase the probability of an outlier. &lt;br /&gt;
At this point, I am not sure how to build a model with data that contains multiple users, because I don&amp;#039;t know how to separate the user data so that the model is trained per user and finds anomalies within the individual user&amp;#039;s features.&lt;br /&gt;
&lt;br /&gt;
I also don&amp;#039;t have any labeled data to use for testing, should I fabricate some?&lt;br /&gt;
&lt;br /&gt;
Any advice greatly appreciated.&lt;br /&gt;
&lt;br /&gt;
Thank you!</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/1000/which-algorithm-best-detect-anomalies-within-login-events</guid>
<pubDate>Tue, 05 Oct 2021 17:45:38 +0000</pubDate>
</item>
<item>
<title>How can we incorporate polylines in machine learning tools?</title>
<link>https://ask.ghassem.com/999/how-we-incorporate-the-polyline-in-machine-learnning-tools</link>
<description>Suppose I have to predict the traffic of a road segment based on available data such as the number of houses and businesses along the road segment. Which machine learning tool would be a suitable option that can incorporate the road segment (polylines) through coordinates in the attributes?</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/999/how-we-incorporate-the-polyline-in-machine-learnning-tools</guid>
<pubDate>Wed, 29 Sep 2021 06:16:30 +0000</pubDate>
</item>
<item>
<title>Commented: How to calculate Softmax Regression probabilities?</title>
<link>https://ask.ghassem.com/591/how-to-calculate-softmax-regression-probabilities?show=998#c998</link>
<description>The question doesn&amp;#039;t make any mention of a bias, so we just assume it is 1?</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/591/how-to-calculate-softmax-regression-probabilities?show=998#c998</guid>
<pubDate>Mon, 05 Jul 2021 18:14:35 +0000</pubDate>
</item>
<item>
<title>Commented: How to update the weights in backpropagation algorithm when activation function in not linear?</title>
<link>https://ask.ghassem.com/901/update-weights-backpropagation-algorithm-activation-function?show=996#c996</link>
<description>While the question says the activation function is for the hidden layer, the solution applies the same activation function to the output layer as well. Do we also need to apply the activation function to the output layer?</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/901/update-weights-backpropagation-algorithm-activation-function?show=996#c996</guid>
<pubDate>Sat, 03 Jul 2021 14:44:08 +0000</pubDate>
</item>
<item>
<title>Commented: What are the main branches of Machine Learning?</title>
<link>https://ask.ghassem.com/13/what-are-the-main-branches-of-machine-learning?show=995#c995</link>
<description>Hi, can I use this picture as reference in my master thesis? Thank you</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/13/what-are-the-main-branches-of-machine-learning?show=995#c995</guid>
<pubDate>Mon, 28 Jun 2021 11:17:30 +0000</pubDate>
</item>
<item>
<title>Answered: Do I need to save the standardization transformation?</title>
<link>https://ask.ghassem.com/970/do-i-need-to-save-the-standardization-transformation?show=971#a971</link>
<description>&lt;p&gt;The details of the standardization are stored in the transformer object. For example,&amp;nbsp;&lt;strong&gt;X_scaled&lt;/strong&gt;&amp;nbsp;on this &lt;a rel=&quot;nofollow&quot; href=&quot;https://scikit-learn.org/stable/modules/preprocessing.html&quot;&gt;page&lt;/a&gt; comes from a scaler that stores the mean and SD of the original vectors in the training data, and you can use it to scale new vectors.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/970/do-i-need-to-save-the-standardization-transformation?show=971#a971</guid>
<pubDate>Tue, 15 Dec 2020 14:40:42 +0000</pubDate>
</item>
<item>
<title>Why should I use Dynamic Time Warping over GMM for time series clustering?</title>
<link>https://ask.ghassem.com/962/why-should-dynamic-time-warping-over-timer-series-clustering</link>
<description></description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/962/why-should-dynamic-time-warping-over-timer-series-clustering</guid>
<pubDate>Fri, 04 Dec 2020 03:19:16 +0000</pubDate>
</item>
<item>
<title>Answered: How to predict from unseen data?</title>
<link>https://ask.ghassem.com/954/how-to-predict-from-unseen-data?show=960#a960</link>
<description>&lt;p&gt;My recommendation:&lt;/p&gt;

&lt;p&gt;Speak to, or think like, a football fan (obviously I am not that type of person :)&amp;nbsp;&lt;/p&gt;

&lt;p&gt;Try to find out what can help us predict the &quot;next&quot; game&#039;s result, as an expert would. Collect that data to feed into your model (and/or any relevant data available).&lt;/p&gt;

&lt;p&gt;For example, all the matches played between Arsenal and Chelsea so far might have value in your model. Also, the last games each team played might have an effect on the next match&#039;s result.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;As you stated in your question, go with pre-match variables to &quot;predict&quot; the next game&#039;s score.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;Another model could be:&lt;/p&gt;

&lt;p&gt;You can take the features (data) in your question for the first &lt;em&gt;t&lt;/em&gt; minutes of the match and try to predict the result. Let&#039;s say, use the data from the first half of the match to predict the second half&#039;s result.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;On the other hand, what you are doing at the moment can be helpful if you are looking for some exploratory analysis, for example which feature(s) have more impact on winning a game.&lt;/p&gt;

&lt;p&gt;Hope this helps; looking forward to seeing other answers and your analysis results.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/954/how-to-predict-from-unseen-data?show=960#a960</guid>
<pubDate>Fri, 20 Nov 2020 10:01:23 +0000</pubDate>
</item>
<item>
<title>Answer selected: How to model unknown yet data</title>
<link>https://ask.ghassem.com/943/how-to-model-unknown-yet-data?show=944#a944</link>
<description>Your answer is actually based on what we always do in machine learning: we collect datasets, split them into a training and a testing set, train using the training set, and evaluate performance on the testing set.&lt;br /&gt;
&lt;br /&gt;
Assume you have 100 matches with all the statistics and parameters you want to use in training (such as ball possession, number of shots, corners, etc). You can take 80 of these matches for training and the remaining 20 matches for evaluating the model you created from 80% of the data, simply because you already know those &amp;quot;future&amp;quot; statistics and outcomes, so you can compare them with the output of your model to check its performance.&lt;br /&gt;
&lt;br /&gt;
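A rough sketch of that 80/20 split (hypothetical data; real match statistics would replace the placeholder dicts):&lt;br /&gt;

```python
import random

# Hypothetical: 100 matches, each with some statistics and a known outcome
matches = [{"id": i, "possession": 50, "shots": 10, "outcome": i % 3}
           for i in range(100)]

random.seed(0)          # reproducible shuffle
random.shuffle(matches)
train, test = matches[:80], matches[80:]   # 80 matches to train, 20 to evaluate
```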
I hope this answers your question.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/943/how-to-model-unknown-yet-data?show=944#a944</guid>
<pubDate>Tue, 27 Oct 2020 15:04:19 +0000</pubDate>
</item>
<item>
<title>Answered: From microarray data, which tools of pattern recognition can you apply to identify the genes responsible for diseases?</title>
<link>https://ask.ghassem.com/936/microarray-pattern-recognition-identify-responsible-diseases?show=937#a937</link>
<description>&lt;p&gt;I am not sure if this is the best tool, but there is a company acquired by Nvidia that gives you access to a GPU cloud for applications such as the one you mentioned:&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://www.nvidia.com/en-us/healthcare/clara-parabricks/&quot;&gt;https://www.nvidia.com/en-us/healthcare/clara-parabricks/&lt;/a&gt;&lt;/p&gt;


</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/936/microarray-pattern-recognition-identify-responsible-diseases?show=937#a937</guid>
<pubDate>Thu, 15 Oct 2020 20:26:53 +0000</pubDate>
</item>
<item>
<title>Answered: Can we use a trained model to supervise the other machine learning models?</title>
<link>https://ask.ghassem.com/930/can-trained-model-supervise-other-machine-learning-models?show=931#a931</link>
<description>&lt;p&gt;If your goal is training a machine learning model using other machine learning models, it is called&amp;nbsp;&lt;strong&gt;meta-learning&lt;/strong&gt;.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;You can find more information in &lt;a rel=&quot;nofollow&quot; href=&quot;https://en.wikipedia.org/wiki/Meta_learning_(computer_science)&quot;&gt;this article&lt;/a&gt;:&amp;nbsp;&lt;strong&gt;&quot;Meta-learning&amp;nbsp;&lt;/strong&gt;is a subfield of&amp;nbsp;machine learning&amp;nbsp;where automatic learning algorithms are applied to&amp;nbsp;metadata&amp;nbsp;about machine learning experiments. As of 2017, the term had not found a standard interpretation, however, the main goal is to use such metadata to understand how automatic learning can become flexible in solving learning problems, hence to improve the performance of existing&amp;nbsp;learning algorithms&amp;nbsp;or to learn (induce) the learning algorithm itself, hence the alternative term&amp;nbsp;&lt;strong&gt;learning to learn.&quot;&lt;/strong&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/930/can-trained-model-supervise-other-machine-learning-models?show=931#a931</guid>
<pubDate>Mon, 28 Sep 2020 14:47:39 +0000</pubDate>
</item>
<item>
<title>Where can I find illustrative real life machine learning examples (In business,  work. etc.)?</title>
<link>https://ask.ghassem.com/924/where-find-illustrative-machine-learning-examples-business</link>
<description>Is there a website for finding illustrative real-life examples of using machine learning? For instance: for End-to-End Machine Learning, Classification, Clustering, and Unsupervised Learning.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/924/where-find-illustrative-machine-learning-examples-business</guid>
<pubDate>Tue, 22 Sep 2020 00:47:09 +0000</pubDate>
</item>
<item>
<title>Where can I find simple machine learning mathematics explained visually?</title>
<link>https://ask.ghassem.com/923/where-simple-machine-learning-mathematics-explained-visually</link>
<description>Could you please let me know where I can find simple machine learning mathematics explained visually?</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/923/where-simple-machine-learning-mathematics-explained-visually</guid>
<pubDate>Mon, 21 Sep 2020 23:55:12 +0000</pubDate>
</item>
<item>
<title>Commented: How to calculate the class probabilities and classify using Naive Bayes classifier?</title>
<link>https://ask.ghassem.com/899/calculate-class-probabilities-classify-using-classifier?show=913#c913</link>
<description>We can apply Laplace smoothing, but it still will not affect the result. It would definitely matter if we had one more feature that makes the Banana probability zero, e.g. the colour RED; in that case we have no choice but to apply Laplace smoothing.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/899/calculate-class-probabilities-classify-using-classifier?show=913#c913</guid>
<pubDate>Thu, 13 Aug 2020 06:27:31 +0000</pubDate>
</item>
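The comment above can be illustrated with a short sketch. The counts below are hypothetical (not from the original question): with add-one (Laplace) smoothing, a feature value never observed in a class, such as a RED Banana, still gets a small non-zero likelihood instead of zeroing out the whole product of probabilities.

```python
def smoothed_likelihood(count, class_total, n_values, alpha=1):
    """P(feature = value | class) with Laplace (add-alpha) smoothing.

    count:       times this feature value was seen in the class
    class_total: total examples of the class
    n_values:    number of distinct values the feature can take
    """
    return (count + alpha) / (class_total + alpha * n_values)

# Hypothetical: colour RED never observed among 10 Banana examples,
# and colour takes 3 possible values (RED, YELLOW, GREEN).
p_unsmoothed = 0 / 10                        # 0: zeroes out the whole product
p_smoothed = smoothed_likelihood(0, 10, 3)   # 1/13: small but non-zero
```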
<item>
<title>Commented: How to calculate Accuracy, Precision, Recall or F1?</title>
<link>https://ask.ghassem.com/789/how-to-calculate-accuracy-precision-recall-or-f1?show=911#c911</link>
<description>Is the F1 here the F score in the attached document?</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/789/how-to-calculate-accuracy-precision-recall-or-f1?show=911#c911</guid>
<pubDate>Wed, 12 Aug 2020 16:23:54 +0000</pubDate>
</item>
<item>
<title>Comment edited: How to calculate Covariance Matrix and Principal Components for PCA?</title>
<link>https://ask.ghassem.com/652/how-calculate-covariance-matrix-and-principal-components?show=910#c910</link>
<description>Thank you so much!</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/652/how-calculate-covariance-matrix-and-principal-components?show=910#c910</guid>
<pubDate>Wed, 12 Aug 2020 15:54:22 +0000</pubDate>
</item>
<item>
<title>Commented: How to calculate feed-forward (forward-propagation) in neural network?</title>
<link>https://ask.ghassem.com/603/calculate-feed-forward-forward-propagation-neural-network?show=904#c904</link>
<description>Agreed with hamzasi; I got the same answer: y hat = 0.993, Error = 40.56</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/603/calculate-feed-forward-forward-propagation-neural-network?show=904#c904</guid>
<pubDate>Tue, 11 Aug 2020 18:48:12 +0000</pubDate>
</item>
<item>
<title>Commented: How to optimize weights in Logistic Regression?</title>
<link>https://ask.ghassem.com/639/how-to-optimize-weights-in-logistic-regression?show=882#c882</link>
<description>Thanks for clarification Wahab, really appreciate it</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/639/how-to-optimize-weights-in-logistic-regression?show=882#c882</guid>
<pubDate>Wed, 15 Jul 2020 21:39:37 +0000</pubDate>
</item>
<item>
<title>Answered: How to calculate Softmax Regression probabilities in this example?</title>
<link>https://ask.ghassem.com/605/calculate-softmax-regression-probabilities-this-example?show=880#a880</link>
<description>Class 1: Setosa&lt;br /&gt;
&lt;br /&gt;
Class 2: Versicolor&lt;br /&gt;
&lt;br /&gt;
Class 3: Virginica&lt;br /&gt;
&lt;br /&gt;
The initialized weights:&lt;br /&gt;
&lt;br /&gt;
$w_{01} = w_{11}=w_{21} = 1$&lt;br /&gt;
&lt;br /&gt;
$w_{02} = w_{12}=w_{22} = 2$&lt;br /&gt;
&lt;br /&gt;
$w_{03} = w_{13}=w_{23} = 3$&lt;br /&gt;
&lt;br /&gt;
The weight equations:&lt;br /&gt;
&lt;br /&gt;
$z_1 = x_0w_{01} + x_1 w_{11} + x_2w_{21}$&lt;br /&gt;
&lt;br /&gt;
$z_2 = x_0w_{02} + x_1 w_{12} + x_2w_{22}$&lt;br /&gt;
&lt;br /&gt;
$z_3 = x_0w_{03} + x_1 w_{13} + x_2w_{23}$&lt;br /&gt;
&lt;br /&gt;
1) $x_0 = 1 \quad x_1 = 4.6 \quad \text{and} \quad x_2 = 1.7$&lt;br /&gt;
&lt;br /&gt;
$z_1 = w_{01} + x_1 w_{11} + x_2w_{21} = 1 + 4.6 + 1.7 = 7.3$&lt;br /&gt;
&lt;br /&gt;
$z_2 = w_{02} + x_1 w_{12} + x_2w_{22} = 2 + 4.6(2) + 1.7(2) = 16.3$&lt;br /&gt;
&lt;br /&gt;
$z_3 = w_{03} + x_1 w_{13} + x_2w_{23} = 3 + 4.6(3) + 1.7(3) = 21.9$&lt;br /&gt;
&lt;br /&gt;
$e^{z_1} + e^{z_2} + e^{z_3} = e^{7.3} + e^{16.3} + e^{21.9} = 1480.3 + 11994994 + 3243763284 = 3255759758$&lt;br /&gt;
&lt;br /&gt;
$p^3 = \frac{e^{z_3}}{\sum_{i=1}^3 e^{z_i}} = \frac{3243763284}{3255759758} = 0.996315307$&lt;br /&gt;
&lt;br /&gt;
$\therefore$ the probability of classifying as Virginica is 99.6%&lt;br /&gt;
&lt;br /&gt;
2) $x_0 = 1 \quad x_1 = 4.6 \quad x_2 = 1.7 \quad x_3 = 5.5 \quad x_4 = 3$&lt;br /&gt;
$\begin{align*} z_1 &amp;amp;= w_{01}x_0 + w_{11}x_1 + w_{21}x_2 + w_{31}x_3 + w_{41}x_4 \\ &amp;amp;= 1 + 4.6 + 1.7+5.5+3\\ &amp;amp;=15.8\end{align*}$&lt;br /&gt;
&lt;br /&gt;
$\begin{align*} z_2 &amp;amp;= w_{02}x_0 + w_{12}x_1 + w_{22}x_2 + w_{32}x_3 + w_{42}x_4 \\ &amp;amp;= 2 + (2)4.6 + (2)1.7+(2)5.5+(2)3\\ &amp;amp;=31.6 \end{align*}$&lt;br /&gt;
&lt;br /&gt;
$\begin{align*} z_3 &amp;amp;= w_{03}x_0 + w_{13}x_1 + w_{23}x_2 + w_{33}x_3 + w_{43}x_4 \\ &amp;amp;= 3 + (3)4.6 + (3)1.7+(3)5.5+(3)3\\ &amp;amp;=47.4 \end{align*}$&lt;br /&gt;
&lt;br /&gt;
$\begin{align*}\sum_{i=1}^3 e^{z_i}&amp;amp;=e^{z_1} + e^{z_2} + e^{z_3}\\ &amp;amp;= e^{15.8} + e^{31.6} + e^{47.4} \\&amp;amp;= 7275332 + 5.29e13 + 3.85e20 \\&amp;amp;= 3.850866845e20\end{align*}$&lt;br /&gt;
&lt;br /&gt;
$p^3 = \frac{e^{z_3}}{\sum_{i=1}^3 e^{z_i}} = \frac{3.850866316e20}{3.850866845e20} = 0.999999863$&lt;br /&gt;
&lt;br /&gt;
$\therefore$ the probability of classifying as Virginica is 99.9%</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/605/calculate-softmax-regression-probabilities-this-example?show=880#a880</guid>
<pubDate>Tue, 14 Jul 2020 19:28:03 +0000</pubDate>
</item>
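The arithmetic in part 1 can be checked with a few lines of Python (a sketch; subtracting the maximum before exponentiating is the usual trick to avoid overflow and does not change the result):

```python
import math

def softmax(z):
    """Numerically stable softmax: shift by max(z) before exponentiating."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

# z values for Setosa, Versicolor, Virginica from the answer above
p = softmax([7.3, 16.3, 21.9])
# p[2] ≈ 0.9963, matching the hand calculation
```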
<item>
<title>Commented: How to calculate the probability and accuracy of a Logistic Regression classifier?</title>
<link>https://ask.ghassem.com/795/calculate-probability-accuracy-logistic-regression-classifier?show=877#c877</link>
<description>Just compare your estimated results with column 3 (&amp;quot;Sold&amp;quot;).&lt;br /&gt;
&lt;br /&gt;
Accuracy = (number of correctly estimated values) / (number of values in column 3, which is 4)&lt;br /&gt;
&lt;br /&gt;
Accuracy = 3/4 = 75% in both cases if you compare your results with column 3.&lt;br /&gt;
&lt;br /&gt;
Hope this helps</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/795/calculate-probability-accuracy-logistic-regression-classifier?show=877#c877</guid>
<pubDate>Mon, 13 Jul 2020 20:10:22 +0000</pubDate>
</item>
<item>
<title>Answered: What is difference between Support vector machine and Support Vector Classification?</title>
<link>https://ask.ghassem.com/863/difference-between-support-machine-support-classification?show=864#a864</link>
<description>&lt;p&gt;In machine learning, support-vector machines (SVMs, also support-vector networks) are supervised learning models with associated learning algorithms that analyze data used for &lt;strong&gt;both classification and regression analysis&lt;/strong&gt;. When you call it Support Vector Classification, it means you are using these models for classification tasks.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/863/difference-between-support-machine-support-classification?show=864#a864</guid>
<pubDate>Sun, 17 May 2020 22:35:51 +0000</pubDate>
</item>
<item>
<title>Reshown: How to calculate k-means clustering with a numerical example?</title>
<link>https://ask.ghassem.com/656/how-to-calculate-k-means-clustering-with-numerical-example?show=656#q656</link>
<description>&lt;p&gt;Use the k-means algorithm and Euclidean distance to cluster the following 8 examples into 3 clusters:&lt;/p&gt;

&lt;p&gt;$A1=(2,10),&amp;nbsp;A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9)$.&lt;/p&gt;

&lt;p&gt;Suppose that the initial seeds (centers of each cluster) are $A1$, $A4$ and $A7$. Run the k-means algorithm for 1 epoch only. At the end of this epoch show:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;a)&lt;/strong&gt; The new clusters (i.e. the examples belonging to each cluster)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;b)&lt;/strong&gt; The centers of the new clusters&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;c)&lt;/strong&gt; Draw a 10 by 10 space with all the 8 points and show the clusters after the first epoch and the new centroids.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;d)&lt;/strong&gt; How many more iterations are needed to converge? Draw the result for each epoch&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/656/how-to-calculate-k-means-clustering-with-numerical-example?show=656#q656</guid>
<pubDate>Fri, 17 Apr 2020 18:34:37 +0000</pubDate>
</item>
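Parts (a) and (b) of the exercise above can be sketched as one epoch of k-means (assignment step, then centroid update); running it gives the clusters {A1}, {A3, A4, A5, A6, A8}, {A2, A7} and new centers (2, 10), (6, 6), (1.5, 3.5):

```python
import math

points = {'A1': (2, 10), 'A2': (2, 5), 'A3': (8, 4), 'A4': (5, 8),
          'A5': (7, 5), 'A6': (6, 4), 'A7': (1, 2), 'A8': (4, 9)}
centers = [points['A1'], points['A4'], points['A7']]  # initial seeds

# Assignment step: each point joins its nearest center (Euclidean distance).
clusters = [[] for _ in centers]
for name, p in points.items():
    nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
    clusters[nearest].append(name)

# Update step: each center moves to the mean of its cluster.
centers = [tuple(sum(points[n][d] for n in c) / len(c) for d in (0, 1))
           for c in clusters]
# clusters -> [['A1'], ['A3', 'A4', 'A5', 'A6', 'A8'], ['A2', 'A7']]
# centers  -> [(2.0, 10.0), (6.0, 6.0), (1.5, 3.5)]
```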
<item>
<title>Answered: Pre-trained word embeddings and preprocessing</title>
<link>https://ask.ghassem.com/849/pre-trainned-word-embeddings-and-preproceess?show=850#a850</link>
<description>&lt;p&gt;I believe &lt;a rel=&quot;nofollow&quot; href=&quot;https://www.guru99.com/word-embedding-word2vec.html&quot;&gt;this article&lt;/a&gt; will help you to understand where to use each of the techniques you mentioned.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/849/pre-trainned-word-embeddings-and-preproceess?show=850#a850</guid>
<pubDate>Sat, 11 Apr 2020 07:23:18 +0000</pubDate>
</item>
<item>
<title>Answered: Can PCA be used for supervised learning?</title>
<link>https://ask.ghassem.com/832/can-pca-be-used-for-supervised-learning?show=833#a833</link>
<description>PCA can be used indirectly in supervised learning tasks such as classification and regression. When you have a huge number of features, one way to reduce them and probably avoid overfitting is to use a feature-reduction method such as PCA. Therefore, PCA can be used in the preprocessing step to reduce the number of features.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/832/can-pca-be-used-for-supervised-learning?show=833#a833</guid>
<pubDate>Tue, 18 Feb 2020 22:12:16 +0000</pubDate>
</item>
<item>
<title>Answered: How to calculate residual errors for linear regression and interpret regression metrics?</title>
<link>https://ask.ghassem.com/829/calculate-residual-regression-interpret-regression-metrics?show=830#a830</link>
<description>&lt;p&gt;You can take a look at &lt;a rel=&quot;nofollow&quot; href=&quot;https://www.dataquest.io/blog/understanding-regression-error-metrics/&quot;&gt;this article&lt;/a&gt; which shows with an example linear regression equation. For example, the definition of MAE is given in the following figure:&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/tqnei6J.jpg&quot;&gt;https://i.imgur.com/tqnei6J.jpg&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/829/calculate-residual-regression-interpret-regression-metrics?show=830#a830</guid>
<pubDate>Tue, 18 Feb 2020 18:37:34 +0000</pubDate>
</item>
<item>
<title>Answered: How can I find the &quot;State of the art&quot; approaches in Machine Learning?</title>
<link>https://ask.ghassem.com/801/how-can-find-the-sate-of-the-art-approaches-machine-learning?show=826#a826</link>
<description>&lt;p&gt;A great website, called &lt;a rel=&quot;nofollow&quot; href=&quot;https://paperswithcode.com/sota&quot;&gt;Papers with Code&lt;/a&gt;,&amp;nbsp;lists the latest approaches&amp;nbsp;with their source code on GitHub. For example, if you are looking for the latest&amp;nbsp;methods&amp;nbsp;for object detection, you can take a look at the latest Object Detection methods on the COCO test-dev&amp;nbsp;timeline &lt;a rel=&quot;nofollow&quot; href=&quot;https://paperswithcode.com/sota/object-detection-on-coco&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/kTtsmtQ.png&quot;&gt;https://i.imgur.com/kTtsmtQ.png&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/801/how-can-find-the-sate-of-the-art-approaches-machine-learning?show=826#a826</guid>
<pubDate>Tue, 18 Feb 2020 16:48:24 +0000</pubDate>
</item>
<item>
<title>Answered: How to map (string compare) a string with 10000+ strings in DB? which is the best way to do it?</title>
<link>https://ask.ghassem.com/809/how-map-string-compare-string-with-10000-strings-which-best?show=825#a825</link>
<description>I think the problem you mentioned can be solved using a tree structure such as a Trie:&lt;br /&gt;
&lt;a href=&quot;https://www.cs.helsinki.fi/u/tpkarkka/opetus/10s/spa/lecture07.pdf&quot; rel=&quot;nofollow&quot; target=&quot;_blank&quot;&gt;https://www.cs.helsinki.fi/u/tpkarkka/opetus/10s/spa/lecture07.pdf&lt;/a&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/809/how-map-string-compare-string-with-10000-strings-which-best?show=825#a825</guid>
<pubDate>Tue, 18 Feb 2020 16:39:42 +0000</pubDate>
</item>
<item>
<title>Answered: Why does OneHotEncoder create one more column, and what is that column for?</title>
<link>https://ask.ghassem.com/814/after-applied-onehotencoder-will-create-column-whatis-column?show=824#a824</link>
<description>As you described in the comments, the additional columns are used as dummy variables.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/814/after-applied-onehotencoder-will-create-column-whatis-column?show=824#a824</guid>
<pubDate>Tue, 18 Feb 2020 16:38:41 +0000</pubDate>
</item>
<item>
<title>Answered: Can I use a single Pipeline for multiple estimators in scikit-learn?</title>
<link>https://ask.ghassem.com/819/can-use-single-pipeline-for-multiple-estimators-scikit-learn?show=820#a820</link>
<description>&lt;p&gt;Yes, it is possible. Please take a look at &lt;a rel=&quot;nofollow&quot; href=&quot;https://stackoverflow.com/questions/50285973/pipeline-multiple-classifiers?answertab=votes#tab-top&quot;&gt;this post&lt;/a&gt; on Stack Overflow.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/819/can-use-single-pipeline-for-multiple-estimators-scikit-learn?show=820#a820</guid>
<pubDate>Tue, 18 Feb 2020 14:19:54 +0000</pubDate>
</item>
<item>
<title>Answer selected: score() vs accuracy_score() in sklearn</title>
<link>https://ask.ghassem.com/777/score-vs-accuracyscore-in-sklearn?show=780#a780</link>
<description>&lt;p&gt;Q1: knn.score(X_test, y_test) calls accuracy_score of&amp;nbsp;sklearn.metrics&amp;nbsp;for classifiers. For regressors, it calls r2_score, the coefficient of determination from statistics.&lt;/p&gt;

&lt;p&gt;You can find the source code of knn.score here. It’s open source.&amp;nbsp;&lt;a rel=&quot;nofollow&quot; href=&quot;https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/base.py#L324&quot;&gt;https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/base.py#L324&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Q2: accuracy_score is not a method of knn, but a method of sklearn.metrics. If the normalize argument is true, accuracy_score(knn.predict(X_test), y_test) returns the same result as knn.score(X_test, y_test). You can check the document below for more details:&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html&quot;&gt;https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Q3: As explained above, yes, they return the same result,&amp;nbsp;but only in the given situation.&lt;/p&gt;

&lt;p&gt;Q4: If there is bias after the split, the bias still exists whichever data set is compared. Here, bias exists when the data distribution in the train set and the data distribution in the whole set are not the same. Taking the Iris dataset as an example, if the distribution of the three classes (Setosa, Versicolour, Virginica) is 50-50-50 in the 150 samples, and you make a 20-80 split, then the distribution of the three classes in the train set should be 40-40-40. If not, there&#039;s bias, because your train set differs from the population in terms of data distribution.&lt;/p&gt;

&lt;p&gt;This may be why Elon&amp;nbsp;doesn&#039;t trust the simulation and insists on using data from the real world to train the Tesla auto-pilot system.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/777/score-vs-accuracyscore-in-sklearn?show=780#a780</guid>
<pubDate>Wed, 22 Jan 2020 01:53:33 +0000</pubDate>
</item>
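A pure-Python mimic of what accuracy_score computes (a sketch, not sklearn's actual implementation) makes the normalize argument in Q2 concrete: normalize=True returns the fraction of exact matches, normalize=False the raw count.

```python
def accuracy_score(y_true, y_pred, normalize=True):
    """Fraction (or, with normalize=False, count) of predictions that match."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true) if normalize else correct

y_true = [0, 1, 2, 2]
y_pred = [0, 1, 1, 2]
accuracy_score(y_true, y_pred)                   # 0.75
accuracy_score(y_true, y_pred, normalize=False)  # 3
```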
<item>
<title>Commented: Best algorithm for table reservation</title>
<link>https://ask.ghassem.com/733/best-algorithm-for-table-reservation?show=737#c737</link>
<description>Thanks for the reply. :)&lt;br /&gt;
I meant a restaurant table. You can also reframe this as how to forecast reservations for a single hotel room.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/733/best-algorithm-for-table-reservation?show=737#c737</guid>
<pubDate>Tue, 22 Oct 2019 04:22:58 +0000</pubDate>
</item>
<item>
<title>Answered: What are the types of Classification and regression algorithms in Machine learning ?</title>
<link>https://ask.ghassem.com/660/types-classification-regression-algorithms-machine-learning?show=661#a661</link>
<description>Depending on how the model and its cost function are defined, one estimator can be used for both classification and regression. Some estimators, such as k-NN or SVM, can be used for both. Logistic Regression and Softmax Regression are (normally) only useful for classification. Therefore, you need to check whether there is a version of the estimator with the same name that can be used for a different purpose.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/660/types-classification-regression-algorithms-machine-learning?show=661#a661</guid>
<pubDate>Fri, 28 Jun 2019 19:10:38 +0000</pubDate>
</item>
<item>
<title>Answer selected: How to calculate the class probabilities and classify using Naive Bayes classifier for NLP?</title>
<link>https://ask.ghassem.com/654/calculate-class-probabilities-classify-using-classifier?show=655#a655</link>
<description>&lt;p&gt;The solution is provided in this &lt;a rel=&quot;nofollow&quot; href=&quot;https://monkeylearn.com/blog/practical-explanation-naive-bayes-classifier/&quot;&gt;url&lt;/a&gt;. The classifier will assign &lt;strong&gt;Sport&lt;/strong&gt;&amp;nbsp;for the tag of &quot;&lt;em&gt;A very close game&lt;/em&gt;&quot;.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/654/calculate-class-probabilities-classify-using-classifier?show=655#a655</guid>
<pubDate>Thu, 27 Jun 2019 03:23:08 +0000</pubDate>
</item>
<item>
<title>Answer selected: How to perform a classification or regression using k-NN?</title>
<link>https://ask.ghassem.com/658/how-to-perform-a-classification-or-regression-using-k-nn?show=659#a659</link>
<description>&lt;p&gt;&lt;strong&gt;a)&lt;/strong&gt; You can calculate the distances or simply visualize it:&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/PxZn1Sp.png&quot;&gt;https://i.imgur.com/PxZn1Sp.png&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(The code of above visualization is available&amp;nbsp;&lt;a rel=&quot;nofollow&quot; href=&quot;https://github.com/tofighi/MachineLearning/blob/master/knn_simple_example.ipynb&quot;&gt;here&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;Based on the above visualization, 3 closest neighbors&amp;nbsp;are blue points. Therefore, &lt;strong&gt;the predicted class is blue.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;b) &lt;/strong&gt;The 3 closest neighbors are data points: $(0,1,\$50)$,&amp;nbsp;$(1,0,\$30)$, and&amp;nbsp;$(1,2,\$40)$. Therefore, the estimated price is the mean of the target values $\frac{50+40+30}{3}=\$40$&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/658/how-to-perform-a-classification-or-regression-using-k-nn?show=659#a659</guid>
<pubDate>Thu, 27 Jun 2019 03:22:07 +0000</pubDate>
</item>
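Part (b) above can be sketched in a few lines of Python. The three labelled points are taken from the answer; the fourth point and the query are made up for illustration.

```python
import math

def knn_regress(data, query, k=3):
    """k-NN regression: average the targets of the k nearest points."""
    nearest = sorted(data, key=lambda item: math.dist(item[0], query))[:k]
    return sum(target for _, target in nearest) / k

data = [((0, 1), 50.0), ((1, 0), 30.0), ((1, 2), 40.0),
        ((9, 9), 500.0)]   # last point: hypothetical far-away outlier
knn_regress(data, (1, 1))  # (50 + 30 + 40) / 3 = 40.0
```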
<item>
<title>Answered: What is the difference between cross-validation and validation set?</title>
<link>https://ask.ghassem.com/648/what-the-difference-between-cross-validation-and-validation?show=649#a649</link>
<description>There are several variations of the cross-validation algorithm. In k-fold cross-validation, we divide the training set into k folds and train k times, each time validating on a different held-out fold. In some cases, usually when we are running machine learning diagnostic tests to see whether our problem suffers from high variance or high bias, we split the dataset into 3 different splits of train, validation, and test, and measure the model trained on the train set against all 3 splits while adding more data points each time.&lt;br /&gt;
&lt;br /&gt;
Another reason for having a separate validation set is when we have a complex model that takes a long time to train, or when we deal with big data. In either of these cases, such as when we are training deep neural networks, k-fold cross-validation is too expensive. Instead, in each epoch we validate the trained model on the validation set and, based on those results, continue to update the hyper-parameters. We also compare the results against a test set that we never used during training, to make sure the model generalizes.</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/648/what-the-difference-between-cross-validation-and-validation?show=649#a649</guid>
<pubDate>Wed, 19 Jun 2019 19:06:12 +0000</pubDate>
</item>
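The three-way split described above can be sketched as follows (the fractions and seed are arbitrary choices, not part of the original answer):

```python
import random

def three_way_split(data, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle, then carve off test and validation sets; the rest is train."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(range(100))
# len(train), len(val), len(test) -> 60, 20, 20
```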
<item>
<title>Answered: In DBSCAN algorithm, how should we choose optimal eps and minimum points?</title>
<link>https://ask.ghassem.com/646/dbscan-algorithm-how-should-choose-optimal-minimum-points?show=647#a647</link>
<description>&lt;p&gt;There is no general way of choosing minPts. It depends on the context of the problem and what you are looking for. As in other unsupervised learning problems, the results could be totally wrong even if you choose optimal values for the hyperparameters. However, we can mention some useful facts:&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rule of Thumb values for minPts and eps:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://en.wikipedia.org/wiki/DBSCAN#Parameter_estimation&quot;&gt;On this page&lt;/a&gt;, the rule-of-thumb values are discussed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;minPts:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A low minPts will build clusters out of outliers or noise, so we&amp;nbsp;&lt;strong&gt;don&#039;t choose too small a value for it.&amp;nbsp;&lt;/strong&gt;minPts is best set by a domain expert who understands the data well. Unfortunately, in many cases we don&#039;t have that domain knowledge, especially after the data is normalized. One heuristic is to use&amp;nbsp;$\ln(n)$, where&amp;nbsp;$n$&amp;nbsp;is the total number of points to be clustered.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;epsilon (eps):&lt;/strong&gt;&lt;br&gt;
For epsilon, there are various aspects. It again boils down to choosing whatever works on&amp;nbsp;&lt;em&gt;this&lt;/em&gt;&amp;nbsp;data set and&amp;nbsp;&lt;em&gt;this&lt;/em&gt;&amp;nbsp;minPts and&amp;nbsp;&lt;em&gt;this&lt;/em&gt;&amp;nbsp;distance function and&amp;nbsp;&lt;em&gt;this&lt;/em&gt;&amp;nbsp;normalization. You can try to do a kNN distance (k-distance plot) histogram for your dataset and choose a &quot;&lt;em&gt;knee&lt;/em&gt;&quot; there, but there might be no visible one, or multiple.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;Basically, you compute the k-nearest neighbors (k-NN) for each data point, for different k, to understand the density distribution of your data; k-NN is handy because it is a non-parametric method. Once you choose minPts&amp;nbsp;(which again strongly depends on your data), you fix k to that value. Then you use as epsilon the k-distance corresponding to the region of the k-distance plot (for your fixed k) with a low slope. The method consists of computing the&amp;nbsp;&lt;strong&gt;k-nearest neighbor distances&lt;/strong&gt;&amp;nbsp;in a matrix of points: calculate the distance of every point to its k nearest neighbors, where the value of $k$ is specified by the user and corresponds to minPts. Next, these k-distances are plotted in ascending order. The aim is to determine the &quot;knee&quot;, which corresponds to the optimal&amp;nbsp;&lt;strong&gt;eps&lt;/strong&gt;&amp;nbsp;parameter: a threshold where a sharp change occurs along the k-distance curve.&amp;nbsp;In the figure below, the optimal&amp;nbsp;&lt;strong&gt;eps&lt;/strong&gt;&amp;nbsp;value is around a distance of 0.15.&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/2Om1mD8.png&quot;&gt;https://i.imgur.com/2Om1mD8.png&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OPTICS and other extensions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Some extensions have been created on top of DBSCAN, such as OPTICS. OPTICS produces hierarchical clusters, from which we can extract significant flat clusters by visual inspection; an OPTICS implementation is available in the Python module&amp;nbsp;&lt;a rel=&quot;nofollow&quot; href=&quot;https://codedocs.xyz/annoviko/pyclustering/classpyclustering_1_1cluster_1_1optics_1_1optics.html#afd44312d254b38fc1161be290e077eec&quot;&gt;pyclustering&lt;/a&gt;. One of the original authors of DBSCAN and OPTICS also proposed an automatic way to extract flat clusters that requires no human intervention; for more information you can read&amp;nbsp;&lt;a rel=&quot;nofollow&quot; href=&quot;https://pdfs.semanticscholar.org/a426/67d0b7f8ed0a97e6d7e2881c6a35c8b23616.pdf&quot;&gt;this paper&lt;/a&gt;. There are also other popular extensions, such as HDBSCAN, which can be found &lt;a rel=&quot;nofollow&quot; href=&quot;https://en.wikipedia.org/wiki/OPTICS_algorithm#Extensions&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;At the end of the day, as with other clustering algorithms, it is hard to get reliable results because we do not have labels, and this remains an area needing improvement.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/646/dbscan-algorithm-how-should-choose-optimal-minimum-points?show=647#a647</guid>
<pubDate>Thu, 13 Jun 2019 17:54:32 +0000</pubDate>
</item>
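The k-distance procedure described above can be sketched as follows (a toy sketch; in practice you would plot the resulting curve and look for the knee):

```python
import math

def k_distances(points, k):
    """For each point, the distance to its k-th nearest neighbour, sorted
    ascending. Plotting this curve and finding its 'knee' suggests eps."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)

# Three tightly packed points and one outlier: the jump at the end of the
# curve marks roughly where eps should be cut off.
curve = k_distances([(0, 0), (0, 1), (1, 0), (10, 10)], k=1)
# curve -> [1.0, 1.0, 1.0, 13.45...]
```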
<item>
<title>Answered: How do I Plot the linear classifier calculated with LIBLINEAR using sklearn?</title>
<link>https://ask.ghassem.com/629/plot-linear-classifier-calculated-liblinear-using-sklearn?show=638#a638</link>
<description>&lt;p&gt;I think &lt;a rel=&quot;nofollow&quot; href=&quot;https://python-graph-gallery.com/43-use-categorical-variable-to-color-scatterplot-seaborn/&quot;&gt;this article&lt;/a&gt; shows you how to achieve your goal by showing some examples of using a categorical variable to color scatterplot.&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/629/plot-linear-classifier-calculated-liblinear-using-sklearn?show=638#a638</guid>
<pubDate>Mon, 20 May 2019 16:25:15 +0000</pubDate>
</item>
<item>
<title>Answered: Could you please explain math symbols behind Machine Learning equations?</title>
<link>https://ask.ghassem.com/631/please-explain-symbols-behind-machine-learning-equations?show=632#a632</link>
<description>&lt;p&gt;The following figure shows the symbols which are common in machine learning:&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;nofollow&quot; href=&quot;https://i.imgur.com/nJVT8cN.png&quot;&gt;https://i.imgur.com/nJVT8cN.png&lt;/a&gt;&lt;/p&gt;</description>
<category>Machine Learning</category>
<guid isPermaLink="true">https://ask.ghassem.com/631/please-explain-symbols-behind-machine-learning-equations?show=632#a632</guid>
<pubDate>Sat, 18 May 2019 20:00:59 +0000</pubDate>
</item>
</channel>
</rss>