Ask Ghassem - Recent questions tagged predict

Forecast log-transformed fitted values for 2 years using an ARMA model

Wed, 04 May 2022 20:31:44 +0000

The input is a stock price series under an exponential transformation. We are asked to forecast it 2 years ahead using the ARMA results.
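
A minimal sketch of the general workflow, assuming the model is fitted on the log scale and the forecast is then mapped back with the exponential. To stay self-contained it fits only an AR(1), the simplest ARMA special case, by least squares with NumPy; a full ARMA(p, q) would normally be fitted with a time-series library instead. The monthly data, the helper names `fit_ar1` and `forecast_ar1`, and the 24-step horizon are all assumptions made for illustration.

```python
import numpy as np

def fit_ar1(y):
    """Least-squares fit of y[t] = c + phi * y[t-1] + e[t]."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    c, phi = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    return c, phi

def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion forward `steps` periods."""
    out = np.empty(steps)
    prev = last_value
    for i in range(steps):
        prev = c + phi * prev
        out[i] = prev
    return out

# Synthetic monthly log-prices (hypothetical data, not from the question).
rng = np.random.default_rng(0)
log_price = np.empty(120)
log_price[0] = 4.0
for t in range(1, 120):
    log_price[t] = 0.4 + 0.9 * log_price[t - 1] + rng.normal(scale=0.05)

c, phi = fit_ar1(log_price)

# 24 monthly steps = 2 years, forecast on the log scale.
log_forecast = forecast_ar1(log_price[-1], c, phi, steps=24)

# Back-transform to the price scale.
price_forecast = np.exp(log_forecast)
```

Note that exponentiating a log-scale point forecast gives the median of the implied price distribution rather than its mean, if the log-scale errors are roughly Gaussian.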

How can you build a dynamic pricing model with data only from rigid pricing?

Fri, 21 Jan 2022 06:44:31 +0000

I want to build a dynamic pricing model. That means that if the product is too expensive for a client and there is a risk that we might lose them, we lower the price for them, but if the client doesn't care that much about the price, we might increase it a little.

All the articles I've seen describe some kind of A/B testing for the pricing and then create a model.

I want to build a model using only the existing rigid pricing data. So I have the prices offered to customers, and I know who bought the product and who went to another company.

How can I do the increasing price part?
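
One possible starting point, sketched under strong assumptions: model the probability of purchase as a function of the offered price, then look at expected revenue (price times purchase probability) over the range of prices that were actually offered. The data below is synthetic and the variable names are hypothetical. Because historical prices were not randomized, the fitted curve may partly reflect who was offered which price, and it says nothing reliable about prices outside the observed range.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic history (hypothetical): offered price and whether the client bought.
rng = np.random.default_rng(0)
n = 2000
price = rng.uniform(50, 150, size=n)
true_p = 1 / (1 + np.exp(-(6.0 - 0.06 * price)))
bought = rng.random(n) < true_p

# Purchase probability as a function of price.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(price.reshape(-1, 1), bought)

# Evaluate only inside the observed price range.
grid = np.linspace(price.min(), price.max(), 101)
p_buy = model.predict_proba(grid.reshape(-1, 1))[:, 1]

# Expected revenue per offer at each candidate price.
expected_revenue = grid * p_buy
best_price = grid[np.argmax(expected_revenue)]
```

Customer attributes could be added as extra feature columns so that the estimated price sensitivity differs per client, which is where a "raise the price a little" decision would come from.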

When dealing with categorical values, should the 'year' column be encoded using OHE or OrdinalEncoder?

Sat, 18 Dec 2021 18:46:07 +0000

It's a car prices dataset, so I'm assuming that the more recent a car is, the more it should be worth. The values in the 'year' column are simply years from 1995 to 2020.
I am trying to predict the selling price of the car.

I'm a bit new to ML and still doing my undergraduate degree, so any help or tips are appreciated. Thank you.
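
To make the options concrete, here is a small sketch comparing three treatments of a year column. The toy values and the reference year 2020 used to compute age are assumptions for illustration. Since a year is already an ordered number, keeping it numeric is often the simplest choice; OrdinalEncoder only replaces each year by its rank, and one-hot encoding discards the ordering.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Toy data (hypothetical values within the 1995-2020 range).
df = pd.DataFrame({"year": [1995, 2003, 2010, 2020, 2003]})

# Option 1: keep the year numeric, for example as the car's age.
df["age"] = 2020 - df["year"]

# Option 2: OrdinalEncoder maps the sorted distinct years to 0, 1, 2, ...
# The gaps between years are lost (2003 -> 2010 counts as one step).
ordinal = OrdinalEncoder().fit_transform(df[["year"]]).ravel()
# ordinal is [0., 1., 2., 3., 1.]

# Option 3: one-hot, one column per distinct year, no ordering kept.
one_hot = OneHotEncoder().fit_transform(df[["year"]]).toarray()
# one_hot has shape (5, 4)
```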

How do I know which encoder to use to convert from categorical variables to numerical?

Mon, 29 Nov 2021 04:09:06 +0000

So say I have a column with categorical data like different styles of temperature: 'Lukewarm', 'Hot', 'Scalding', 'Cold', 'Frostbite',... etc.

I know that we can use pd.get_dummies to convert the column to numerical data within the dataframe, but I also know that there are other 'converters' (not sure if that's the correct terminology) that we can use, e.g. OneHotEncoder from scikit-learn (I could use the pipeline module to build a pipeline and feed my dataframe through it so that my categorical data also gets encoded as numerical).

How do I know which to use? Does it matter? If it does, when does it matter the most (i.e. for what types of problems, and with many categorical variables or few)? If anyone can give me any pointers on this, I'd greatly appreciate it.
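
One practical difference can be shown with a short sketch (the toy frames `train` and `new` are made up for illustration). pd.get_dummies builds columns from whatever categories appear in the frame it is given, so new data can end up with different columns than the training data. A fitted OneHotEncoder remembers the training categories and, with handle_unknown="ignore", encodes an unseen category as all zeros.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy data (hypothetical): training rows and later, unseen rows.
train = pd.DataFrame({"temp": ["Hot", "Cold", "Lukewarm", "Hot"]})
new = pd.DataFrame({"temp": ["Cold", "Scalding"]})

# get_dummies: columns depend on the frame being encoded.
train_dummies = pd.get_dummies(train["temp"])  # columns: Cold, Hot, Lukewarm
new_dummies = pd.get_dummies(new["temp"])      # columns: Cold, Scalding

# OneHotEncoder: categories are learned once, at fit time.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train[["temp"]])
new_encoded = encoder.transform(new[["temp"]]).toarray()
# new_encoded is [[1., 0., 0.], [0., 0., 0.]]
```

For categories with a natural order, such as the temperature levels in the question, OrdinalEncoder with an explicit `categories` list is another option, since it lets you state the order yourself.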

How many samples do we need to test image segmentation using synthetic data?

Mon, 21 Jun 2021 12:26:32 +0000

Hello,

I trained a CNN on synthetic data to perform a segmentation task on human faces. To evaluate the network's predictions at test time, I used 200 examples from the database to compute precision and recall.

Is this number sufficient, given that I control the data generator myself and that I build the database by randomly drawing the elements from centered Gaussian distributions?
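
One way to judge whether 200 test examples are enough is to look at how uncertain the reported metric is. The sketch below assumes a per-image recall score is available for each test image and bootstraps the mean; the scores here are simulated and the helper `bootstrap_ci` is a hypothetical name. If the resulting interval is narrow enough for the comparison you care about, the sample is probably adequate; otherwise, generate more.

```python
import numpy as np

def bootstrap_ci(scores, n_boot=2000, alpha=0.05, seed=0):
    """Mean of `scores` with a percentile bootstrap confidence interval."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    # Resample the test images with replacement, n_boot times.
    idx = rng.integers(0, len(scores), size=(n_boot, len(scores)))
    means = scores[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return scores.mean(), lo, hi

# Simulated per-image recall for 200 test images (hypothetical values).
rng = np.random.default_rng(1)
per_image_recall = np.clip(rng.normal(0.9, 0.05, size=200), 0, 1)

mean_recall, lo, hi = bootstrap_ci(per_image_recall)
interval_width = hi - lo
```

The width of the interval shrinks roughly with the square root of the number of test images, so halving it takes about four times as many samples.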

Thank you,

Do I need to save the standardization transformation?

Tue, 15 Dec 2020 13:06:48 +0000

I standardized my data when I created my model. Do I need to save the standardization transformation in order to predict on new data with that model?
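
A minimal sketch of one common pattern, assuming scikit-learn is used: put the scaler and the model in a single pipeline and persist the pipeline, so that new data is scaled with the mean and standard deviation learned from the training data. The synthetic data and variable names are illustrative only.

```python
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic training data (hypothetical): y is an exact linear function of X.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=50.0, scale=10.0, size=(100, 2))
y_train = X_train @ np.array([0.3, -0.2]) + 1.0

# The scaler is fitted on the training data as part of the pipeline.
pipeline = make_pipeline(StandardScaler(), LinearRegression())
pipeline.fit(X_train, y_train)

# Persist scaler and model together (in practice, write the bytes to a file).
blob = pickle.dumps(pipeline)
restored = pickle.loads(blob)

# New data goes in unscaled; the restored pipeline applies the saved scaling.
X_new = np.array([[55.0, 40.0]])
prediction = restored.predict(X_new)
```

Refitting a fresh scaler on the new data instead would shift and rescale the inputs differently from what the model saw during training.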

How to predict from unseen data?

Tue, 17 Nov 2020 16:18:28 +0000

Hi. I have a question about model-based predictions when the data is only available after the fact. Let me give you an example. I am trying to predict the result of a match (HOME, AWAY or DRAW) based on data like number of shots, ball possession, number of fouls, etc.

TARGET	TEAM 1	TEAM 2	possession team 1	possession team 2	shots team 1	shots team 2	fouls team 1	fouls team 2
HOME	Arsenal	Chelsea	60	40	12	8	5	7

Let's say I have already trained the model and I want to see whether I can predict an upcoming match. However, this match is still a few days away and I want the model's prediction today. I understand that if the match had already taken place and I had the data, I could run it through the model and get a result. The goal is for the model to predict what will happen before the match.

Is it possible at all? What are my options? Should I select only pre-match variables, for example last game form, match referee, etc.? Or should I aggregate the variables and include average possession, average shots and average number of fouls from recent matches?
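
The aggregation idea can be sketched as follows, assuming one row per team per match. The values, the window of 2 matches and the column names are made up for illustration. The key point is the shift(1): each training row only sees matches played before it, and the upcoming match gets the averages of the most recent matches already played.

```python
import pandas as pd

# Toy match history (hypothetical values), one row per team per match.
matches = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-10-03", "2020-10-03", "2020-10-17",
        "2020-10-17", "2020-10-24", "2020-10-24",
    ]),
    "team": ["Arsenal", "Chelsea", "Arsenal", "Chelsea", "Arsenal", "Chelsea"],
    "shots": [12, 8, 8, 15, 10, 11],
    "possession": [60, 40, 45, 58, 55, 52],
}).sort_values(["team", "date"]).reset_index(drop=True)

stats = ["shots", "possession"]

# Training features: per-team average over the 2 previous matches.
# shift(1) excludes the current match, so no post-match data leaks in.
pre_match = (
    matches.groupby("team")[stats]
    .transform(lambda s: s.shift(1).rolling(window=2).mean())
    .add_prefix("avg_prev2_")
)
features = pd.concat([matches[["date", "team"]], pre_match], axis=1)

# Features for the upcoming match: average of each team's last 2 played matches.
upcoming = matches.groupby("team").tail(2).groupby("team")[stats].mean()
```

The model is then trained on `features` joined with the match results, and `upcoming` supplies the inputs for the match that has not been played yet.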

How to model data that is not yet known

Tue, 27 Oct 2020 10:39:47 +0000

So far, I have modeled on known historical data. What if there are variables that are known only after the fact?
Let me give you an example. I want to predict the outcome of a match: win, lose or draw. I use variables from previous games such as ball possession, number of shots, corners, etc. Let's say the Chelsea-Arsenal game is coming up on Saturday. How am I supposed to build a model and predict the result if this data is not yet available for my event? What should I do in such cases? Is it possible to forecast such data?