TF-IDF Weighted W2V Intuition
TF-IDF-W2V
Let's take a sentence:
text = "I’m going to make him an offer he can’t refuse"
Step 1: Clean the text
text = str(text).lower() → convert the text to lowercase
text = text.replace("i’m", "i am").replace("can’t", "cannot") → expand contractions
>>> Cleaned text → "i am going to make him an offer he cannot refuse"
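The cleaning step can be sketched as a small function. This is only a minimal illustration: the name clean_text is hypothetical, and it handles just the two contractions that occur in the example sentence. It also normalises the curly apostrophe to a straight one first, which is an extra assumption not spelled out in the step itself.

```python
def clean_text(text):
    """Lowercase the text and expand the two contractions used in the example."""
    # Normalise the typographic apostrophe so one replace rule covers both forms.
    text = str(text).lower().replace("’", "'")
    # Expand contractions (only the two needed here; a real list would be longer).
    text = text.replace("i'm", "i am").replace("can't", "cannot")
    return text


text = "I’m going to make him an offer he can’t refuse"
cleaned = clean_text(text)
print(cleaned)
```

A fuller pipeline would usually keep the contractions in a dictionary and loop over it, but the chained replace calls mirror the step as written.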
Step 2: Remove stopwords
First take "i", "him" and "he" out of the stopword list, so that these words are kept in the sentence:
stopwords.remove("i")
stopwords.remove("him")
stopwords.remove("he")
>>> final_text → "i going make him offer he cannot refuse"
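Here is a rough sketch of the stopword step. The stopword set below is a tiny hand-written stand-in (an assumption for illustration only); in practice the list would come from an NLP library. The point is that "i", "him" and "he" are dropped from the stopword list first, so they survive the filtering.

```python
# Tiny illustrative stopword set; a real list from an NLP library is much longer.
stopwords = {"i", "am", "to", "an", "a", "the", "him", "he", "is", "of"}

# Keep these three words in the sentence by removing them from the stopword set.
for keep in ("i", "him", "he"):
    stopwords.discard(keep)

cleaned = "i am going to make him an offer he cannot refuse"

# Drop every remaining stopword ("am", "to", "an" in this sentence).
final_text = " ".join(word for word in cleaned.split() if word not in stopwords)
print(final_text)
```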
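Step 3: Apply TF-IDF weighted W2V to final_text
a] Compute the TF-IDF weight of each word in final_text: t1, t2, ..., t8 (one weight per word w1, w2, ..., w8).
b] Compute the TF-IDF weighted W2V for final_text: t1 * w2v(w1), i.e. first compute the W2V vector of word 1 (w1), then multiply it by t1, the TF-IDF weight of w1 in final_text. Do the same for every word and add the results.
Complete formula: t1 * w2v(w1) + t2 * w2v(w2) + t3 * w2v(w3) + ... + t8 * w2v(w8)
c] Divide this sum by the sum of the TF-IDF weights: t1 + t2 + t3 + ... + t8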
Final formula revisited: (t1 * w2v(w1) + t2 * w2v(w2) + ... + t8 * w2v(w8)) / (t1 + t2 + ... + t8)
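To make the weighted average concrete, here is a minimal pure-Python sketch: multiply each word vector by its TF-IDF weight, add them up, and divide by the sum of the weights. The function name tfidf_weighted_w2v is hypothetical, and the weights and 3-dimensional vectors below are made-up placeholders, not real TF-IDF or Word2Vec output. In practice the weights would come from a TF-IDF model fitted on a whole corpus and the vectors from a trained Word2Vec model.

```python
def tfidf_weighted_w2v(tokens, tfidf, w2v, dim):
    """Return sum(t_i * w2v(w_i)) / sum(t_i) over the tokens of one sentence."""
    total = [0.0] * dim   # running sum of the weighted word vectors
    weight_sum = 0.0      # running sum of the TF-IDF weights
    for word in tokens:
        # Skip words that have no weight or no vector.
        if word in tfidf and word in w2v:
            t = tfidf[word]
            total = [acc + t * x for acc, x in zip(total, w2v[word])]
            weight_sum += t
    if weight_sum == 0:
        return total      # no usable words: return the zero vector
    return [x / weight_sum for x in total]


tokens = "i going make him offer he cannot refuse".split()

# Placeholder weights t1..t8 and placeholder 3-d vectors (illustration only).
tfidf = {w: 1.0 + 0.1 * i for i, w in enumerate(tokens)}
w2v = {w: [float(i), float(i % 3), 1.0] for i, w in enumerate(tokens)}

sentence_vector = tfidf_weighted_w2v(tokens, tfidf, w2v, dim=3)
print(sentence_vector)
```

Dividing by the sum of the weights keeps the sentence vector on the same scale as the individual word vectors, whatever the sentence length. Words with a higher TF-IDF weight simply pull the average closer to their own vector.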