Ask Ghassem - Recent questions tagged natural-language-process

Very short text classification when category text should be replaced by another category text?

Thu, 11 Feb 2021 12:48:47 +0000

I need a tool to classify articles based on a short category text consisting of two or three words separated by '-'. The RSS/XML tag content is, for example:

Foreign - News

Football - Foreign

I created my own categories in the DB, and now I need to map the categories parsed from this news source's RSS feed onto the news categories I defined.

For example, I need every article whose category contains "football" to be assigned to the category Sport, but sometimes the category XML tag is an exact match: Foreign - News should go into the DB category I defined as Foreign.

So far I have only used trained decision-tree frameworks, for another project, so I would like advice on an AI-based approach, technique, or particular framework I could use to solve this problem. I am not very experienced in AI, and I don't want to end up in a dead end through a poor decision of my own.

While it could be solved with many ifs and a 'contains' function, that does not seem like a very good solution to me.

TL;DR: I basically need something like a "clever, flexible and universal if-elseif".

NOTE: I can also use the article description text if necessary, but the category text seems unambiguous enough for this kind of problem.
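
To make the "many ifs and 'contains'" baseline concrete, here is a minimal sketch that moves the conditions into a data table, so new rules are added as rows instead of new branches. The rule list, the function name `map_category`, and the first-match-wins priority are assumptions for illustration only; nothing here comes from a particular framework.

```python
# Hypothetical keyword -> target category rules. Order encodes priority:
# the first rule whose keyword appears in the source category wins.
RULES = [
    ("football", "Sport"),
    ("foreign", "Foreign"),
]


def map_category(raw, rules=RULES, default=None):
    """Map a source category such as 'Football - Foreign' to one of my categories."""
    # Split on '-' and normalise each part so matching is case-insensitive.
    parts = [part.strip().lower() for part in raw.split("-")]
    for keyword, target in rules:
        if keyword in parts:
            return target
    # No rule matched: return the fallback so the caller can review it.
    return default


print(map_category("Football - Foreign"))  # matched by the "football" rule
print(map_category("Foreign - News"))      # matched by the "foreign" rule
```

Unmatched categories fall through to `default`, which gives a natural place to log them or hand them to a trained classifier later.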

Binary Classification and neutral tag

Sat, 30 Jan 2021 10:08:01 +0000

I am trying to create a sentiment analysis model trained with a binary classification loss. I have a batch of tweets, some tagged as positive (labeled 1) and some as negative (labeled 0). I managed to gather some tweets tagged as neutral, but there are fewer of them than positive and negative ones. My thinking is to label them 0.5 to balance the classification probability. Is this legitimate?
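
As a small illustration of what a 0.5 label does, the sketch below assumes the loss is binary cross-entropy (the question does not name the exact loss) and scans a grid of predicted probabilities to find where the loss for a 0.5 target is lowest. The helper name `bce` and the grid are made up for this example.

```python
import math


def bce(p, target):
    """Binary cross-entropy for one prediction p in (0, 1) and a (possibly soft) target."""
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))


# Candidate predictions 0.01, 0.02, ..., 0.99.
grid = [i / 100 for i in range(1, 100)]

# The prediction that minimises the loss for each kind of label.
best_for_neutral = min(grid, key=lambda p: bce(p, 0.5))
best_for_positive = min(grid, key=lambda p: bce(p, 1.0))

print(best_for_neutral, best_for_positive)
```

Under this assumption, a 0.5 target pulls the model's output toward the midpoint for neutral tweets, while hard labels pull it toward the ends of the range. It does not add a separate "neutral" class.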

Pre-trained word embeddings and preprocessing

Fri, 10 Apr 2020 12:08:09 +0000

How should I preprocess my data if I am going to use a pre-trained word embedding like GloVe or word2vec? Should I use stemming or stop-word removal?
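
One way to explore this question empirically is to measure how many tokens still have a vector after each preprocessing step. The sketch below is only an illustration: `EMBEDDING_VOCAB` is a toy stand-in for the vocabulary of a real pre-trained model, and `crude_stem` is a deliberately naive, hypothetical suffix stripper, not a real stemming algorithm.

```python
# Toy stand-in for the vocabulary of a pre-trained embedding (assumption).
EMBEDDING_VOCAB = {"the", "movies", "movie", "were", "was", "amazing"}


def crude_stem(token):
    """Naive suffix stripping, used here only to mimic what a stemmer does."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def coverage(tokens, vocab):
    """Fraction of tokens that have an entry in the embedding vocabulary."""
    return sum(token in vocab for token in tokens) / len(tokens)


tokens = "The movies were amazing".lower().split()
stemmed = [crude_stem(token) for token in tokens]

raw_coverage = coverage(tokens, EMBEDDING_VOCAB)
stemmed_coverage = coverage(stemmed, EMBEDDING_VOCAB)
print(raw_coverage, stemmed_coverage)
```

If coverage drops after a step, as it does here for the stem "amaz", that step is producing tokens the embedding never saw. The same check can be run against the real vocabulary of whichever embedding is used.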

How to calculate the class probabilities and classify using Naive Bayes classifier for NLP?

Wed, 26 Jun 2019 19:43:41 +0000

We want to use Naive Bayes for tagging documents. It is a classification task in which we assign a class (tag) to each string. We currently have two tags: Sports and Not sports.

Which tag does the sentence "A very close game" belong to? Using a Naive Bayes classifier, calculate the class probabilities for Sports and Not sports for this sentence based on the dataset below, and decide on the tag.

Text	Tag
“A great game”	Sports
“The election was over”	Not sports
“Very clean match”	Sports
“A clean but forgettable game”	Sports
“It was a close election”	Not sports
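
The calculation can be sketched in a few lines of Python. This is one possible setup, not necessarily the one the exercise intends: it assumes a multinomial Naive Bayes model, lowercased whitespace tokenization, and add-one (Laplace) smoothing so that words unseen in a class do not zero out the product. All function names are made up for the example.

```python
import math
from collections import Counter

TRAIN = [
    ("A great game", "Sports"),
    ("The election was over", "Not sports"),
    ("Very clean match", "Sports"),
    ("A clean but forgettable game", "Sports"),
    ("It was a close election", "Not sports"),
]


def tokenize(text):
    return text.lower().split()


def train(examples):
    """Count documents per tag, words per tag, and collect the vocabulary."""
    doc_counts, word_counts, vocab = Counter(), {}, set()
    for text, tag in examples:
        words = tokenize(text)
        doc_counts[tag] += 1
        word_counts.setdefault(tag, Counter()).update(words)
        vocab.update(words)
    return doc_counts, word_counts, vocab


def log_scores(text, doc_counts, word_counts, vocab):
    """log P(tag) + sum of log P(word | tag), with add-one smoothing."""
    total_docs = sum(doc_counts.values())
    scores = {}
    for tag in doc_counts:
        total_words = sum(word_counts[tag].values())
        score = math.log(doc_counts[tag] / total_docs)
        for word in tokenize(text):
            score += math.log((word_counts[tag][word] + 1) / (total_words + len(vocab)))
        scores[tag] = score
    return scores


def normalize(scores):
    """Turn log scores into probabilities that sum to 1 across tags."""
    top = max(scores.values())
    exp = {tag: math.exp(s - top) for tag, s in scores.items()}
    total = sum(exp.values())
    return {tag: value / total for tag, value in exp.items()}


doc_counts, word_counts, vocab = train(TRAIN)
scores = log_scores("A very close game", doc_counts, word_counts, vocab)
probs = normalize(scores)
predicted = max(probs, key=probs.get)
print(predicted, probs)
```

Under these assumptions the vocabulary has 14 distinct words, with 11 word tokens in Sports and 9 in Not sports. The unnormalized scores are 3/5 × (3 × 2 × 1 × 3)/25^4 ≈ 2.76e-5 for Sports and 2/5 × (2 × 1 × 2 × 1)/23^4 ≈ 5.72e-6 for Not sports, so the sentence is tagged Sports, with a normalized probability of about 0.83.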