Word Embeddings (Word2Vec, GloVe)

What embeddings are

Embeddings map each word to a dense, low-dimensional vector of real numbers:

  • similar words have similar vectors

Example intuition:

  • king and queen vectors are close
  • dog and puppy vectors are close

Why embeddings are better than BoW (sometimes)

BoW/TF-IDF vectors are sparse and treat every word as an independent dimension, so they can’t tell that two different words are related.

Embeddings capture:

  • semantic relationships
  • analogies (to some extent)

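The contrast is easy to see numerically. Below is a minimal sketch (the dense embedding values are made-up toy numbers, not learned ones): in a one-hot/BoW space, "dog" and "puppy" are orthogonal, while dense embeddings can encode their relatedness.

```python
import numpy as np

# One-hot (BoW-style) vectors: each word gets its own dimension,
# so different words are always orthogonal.
dog_bow   = np.array([1, 0, 0])
puppy_bow = np.array([0, 1, 0])

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cos(dog_bow, puppy_bow))    # 0.0 — no similarity signal at all

# Toy dense embeddings (illustrative values only).
dog_emb   = np.array([0.8, 0.1, 0.6])
puppy_emb = np.array([0.7, 0.2, 0.6])
print(cos(dog_emb, puppy_emb))    # high — embeddings can encode relatedness
```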
```mermaid
flowchart LR
  W[Word] --> E["Embedding vector (dense)"]
  E --> M[Model]
```

Word2Vec

Learns embeddings by training a shallow neural network to predict:

  • a word from its context (CBOW)
  • context words from a word (Skip-gram)
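To make the skip-gram objective concrete, here is a minimal NumPy sketch. It uses a full softmax for clarity (real Word2Vec uses negative sampling or hierarchical softmax for speed), and the toy corpus, window size, and hyperparameters are made up.

```python
import numpy as np

corpus = "the dog chased the cat the puppy chased the cat".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, D, window, lr = len(vocab), 8, 2, 0.05

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))   # input (word) embeddings
W_out = rng.normal(scale=0.1, size=(D, V))  # output (context) weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for epoch in range(200):
    for pos, word in enumerate(corpus):
        center = idx[word]
        # Skip-gram: for each center word, predict each context word
        # within `window` positions.
        for off in range(-window, window + 1):
            ctx_pos = pos + off
            if off == 0 or ctx_pos < 0 or ctx_pos >= len(corpus):
                continue
            ctx = idx[corpus[ctx_pos]]
            h = W_in[center]               # "hidden layer" = center embedding
            p = softmax(h @ W_out)         # predicted context distribution
            err = p.copy()
            err[ctx] -= 1                  # gradient of cross-entropy loss
            W_out -= lr * np.outer(h, err)
            W_in[center] -= lr * (W_out @ err)

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Words that appear in similar contexts ("the dog chased", "the puppy
# chased") should drift toward similar embeddings.
print(cos(W_in[idx["dog"]], W_in[idx["puppy"]]))
```

CBOW is the mirror image: average the context embeddings and predict the center word instead.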

GloVe

Learns embeddings from global word co-occurrence statistics: it fits vectors so that their dot products approximate the logarithm of how often the two words co-occur across the whole corpus.
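The input GloVe trains on is just a big table of co-occurrence counts. A toy sketch of building that table (corpus and window size are made up):

```python
from collections import Counter

corpus = "the dog chased the cat the puppy chased the cat".split()
window = 2

# Count how often each word pair co-occurs within the window,
# over the entire corpus — these are the global statistics X_ij.
cooc = Counter()
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            cooc[(w, corpus[j])] += 1

# GloVe then fits vectors and biases so that
#   w_i . w_j + b_i + b_j ≈ log X_ij
print(cooc[("dog", "chased")])   # 1
```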

Practical note

In modern NLP, embeddings are often learned as part of a deep model (Transformers). But Word2Vec/GloVe are great for understanding the concept.

Mini-checkpoint

What does it mean if two words have a high cosine similarity between embeddings?

  • they appear in similar contexts and are semantically related.
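Cosine similarity itself is just the angle between two vectors, ignoring their lengths. A quick sketch with made-up toy embeddings:

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); ranges from -1 to 1.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative values only — not real learned embeddings.
dog   = np.array([0.8, 0.1, 0.6])
puppy = np.array([0.7, 0.2, 0.6])
car   = np.array([-0.5, 0.9, 0.1])

print(cosine_similarity(dog, puppy))  # close to 1 → similar meaning
print(cosine_similarity(dog, car))    # low/negative → less related
```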
