TF-IDF Weighted W2V Intuition

Nihar Jamdar
2 min read · Feb 23, 2021

TF-IDF W2V

Let's take a sentence:

text = "I’m going to make him an offer he can’t refuse"

Step 1: Clean the text

text = str(text).lower() → convert the text to lowercase

text = text.replace("i’m", "i am").replace("can’t", "cannot") → expand contractions

>>> Cleaned text → "i am going to make him an offer he cannot refuse"
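The two cleaning lines above can be wrapped into a small runnable function. This is only a sketch: `clean_text` is a name made up for illustration, and it handles just the two contractions that appear in this sentence. It also normalises the curly apostrophe first, so both "I'm" and "I’m" are matched.

```python
def clean_text(text: str) -> str:
    """Lowercase the text and expand the two contractions used in this example."""
    text = str(text).lower()
    # Normalise the curly apostrophe so "i’m" and "i'm" are treated the same.
    text = text.replace("’", "'")
    # Expand contractions (only the two that occur in the example sentence).
    text = text.replace("i'm", "i am").replace("can't", "cannot")
    return text


text = "I’m going to make him an offer he can’t refuse"
cleaned_text = clean_text(text)
print(cleaned_text)  # i am going to make him an offer he cannot refuse
```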

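Step 2: Remove stopwords

The pronouns "i", "him" and "he" are first taken out of the stopword list, so they stay in the sentence. The remaining stopwords ("am", "to", "an") are then dropped from the text.

stopwords.remove("i")

stopwords.remove("him")

stopwords.remove("he")

>>> final_text → "i going make him offer he cannot refuse"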
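Here is a minimal sketch of this step. It assumes a small hand-written stopword set instead of a real library list (which would be much longer), so the contents of `stopwords` below are an assumption for illustration only.

```python
# A tiny hand-written stopword set (an assumption for this example;
# real stopword lists contain many more words).
stopwords = {"i", "am", "to", "an", "him", "he", "the", "a"}

# Take the pronouns out of the stopword set so they survive the filtering.
for keep in ("i", "him", "he"):
    stopwords.discard(keep)

cleaned_text = "i am going to make him an offer he cannot refuse"

# Keep every word that is not a stopword.
final_text = " ".join(w for w in cleaned_text.split() if w not in stopwords)
print(final_text)  # i going make him offer he cannot refuse
```

A set is used here so that the membership check `w not in stopwords` stays fast even with a long stopword list.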

Step 3: Apply TF-IDF weighted W2V to final_text

a] Compute the TF-IDF weight of each word in final_text.

b] Compute the TF-IDF weighted W2V for final_text: for word 1 (w1), first compute its W2V vector w2v(w1), then multiply it by t1, the TF-IDF weight of w1 in final_text. This gives t1 * w2v(w1). Do the same for every word and add the results.

Complete formula: t1 * w2v(w1) + t2 * w2v(w2) + t3 * w2v(w3) + … + t8 * w2v(w8)

c] Divide it by the sum of the TF-IDF weights: t1 + t2 + … + t8

This is known as a weighted average.
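Step a] needs a corpus to compute the TF-IDF weights, and the example does not specify one. The sketch below therefore assumes a tiny made-up corpus of three tokenised sentences and a smoothed idf, which is one common variant and not necessarily the one intended here. `tfidf_weights` and `corpus` are hypothetical names.

```python
import math
from collections import Counter


def tfidf_weights(doc_tokens, corpus):
    """Return {word: tf-idf weight} for one tokenised document."""
    n_docs = len(corpus)
    counts = Counter(doc_tokens)
    weights = {}
    for word, count in counts.items():
        # Term frequency: share of the document taken up by this word.
        tf = count / len(doc_tokens)
        # Document frequency: number of corpus documents containing the word.
        df = sum(1 for doc in corpus if word in doc)
        # Smoothed inverse document frequency (an assumed variant).
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        weights[word] = tf * idf
    return weights


final_text = "i going make him offer he cannot refuse"
# Made-up corpus, for illustration only.
corpus = [
    final_text.split(),
    "i going home".split(),
    "he cannot make it".split(),
]

weights = tfidf_weights(final_text.split(), corpus)
for word, t in weights.items():
    print(f"{word}: {t:.3f}")
```

Words that appear in fewer documents of the corpus (such as "refuse") get a larger weight than words that appear in several (such as "i").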

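Final formula revisited:

TF-IDF W2V(sentence) = Σ_i ( t_i * w2v(w_i) ) / Σ_i t_i

where i runs over the words in the sentence, t_i is the TF-IDF weight of word i, and w2v(w_i) is the W2V vector of word i.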
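To make the weighted average concrete, here is a small sketch in plain Python. The TF-IDF weights and the 2-dimensional "W2V" vectors below are invented toy values, not output from a trained model, and `tfidf_weighted_w2v` is a hypothetical helper name.

```python
def tfidf_weighted_w2v(tokens, tfidf, w2v, dim):
    """Weighted average of word vectors, using TF-IDF values as weights."""
    weighted_sum = [0.0] * dim
    weight_total = 0.0
    for word in tokens:
        # Skip words that have no weight or no vector.
        if word in tfidf and word in w2v:
            t = tfidf[word]
            # Step b]: add t_i * w2v(w_i) to the running sum.
            weighted_sum = [s + t * x for s, x in zip(weighted_sum, w2v[word])]
            weight_total += t
    if weight_total == 0:
        return weighted_sum  # no usable words: return the zero vector
    # Step c]: divide by the sum of the TF-IDF weights.
    return [s / weight_total for s in weighted_sum]


# Toy values, for illustration only.
tokens = ["offer", "refuse"]
tfidf = {"offer": 1.0, "refuse": 3.0}
w2v = {"offer": [1.0, 0.0], "refuse": [0.0, 1.0]}

sentence_vector = tfidf_weighted_w2v(tokens, tfidf, w2v, dim=2)
print(sentence_vector)  # [0.25, 0.75]
```

Because "refuse" has three times the weight of "offer", the sentence vector ends up three times closer to the vector of "refuse".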
