How Machines Understand Language
A guide to word embeddings — where meaning becomes mathematics, and vectors do the talking.
When a search engine retrieves a document about automobiles in response to a query about cars, it is not matching text character by character. Somewhere beneath the interface, the system understands that these two words are semantically related. The mechanism behind that understanding is the word embedding — and once you see the geometry, you cannot unsee it.
This article walks through the key mathematical operations that make embeddings work: distance, similarity, arithmetic, scaling, and the dot product. Each concept is illustrated with concrete numerical vectors so the math is visible, not just described. Real embeddings typically use hundreds of dimensions; the 3- and 4-dimensional examples here exhibit the same geometric structure while staying readable on a page.
Quick Reference: Embedding Operations
| Operation | Formula | Result (Sections 2-5) |
|---|---|---|
| Euclidean Distance | d(a, b) = √( Σᵢ (aᵢ − bᵢ)² ) | d(Hot, Warm) = 0.346; d(Hot, Cold) = 2.163 |
| Cosine Similarity | cos(a, b) = (a · b) / (‖a‖ × ‖b‖) | cos(Hot, Warm) = +0.998; cos(Hot, Cold) = −0.499 |
| Vector Arithmetic | a ± b | King − Man + Woman → nearest Queen (d = 0.400) |
| Scalar Multiplication | λ · a | Large × 2 → near Gigantic; Loud ÷ 2 → near Soft |
| Dot Product | a · b = Σᵢ aᵢbᵢ | cos = 1.00 for both pairs; dot = 0.29 (Soft) vs 2.61 (Loud) |
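All five operations reduce to a few lines of NumPy. The sketch below is illustrative only: the 3-dimensional vectors (hot, warm, king, soft, and so on) are made-up stand-ins for the ones defined in Sections 2-5, so its printed numbers will not match the table exactly, but the qualitative relationships (small vs. large distance, positive vs. negative cosine, equal cosine with unequal dot product) come out the same.

```python
import numpy as np

# Hypothetical 3-dimensional stand-ins for the word vectors defined in
# Sections 2-5; the published values differ, so only the qualitative
# relationships (not the exact numbers) match the table above.
hot  = np.array([ 2.0,  1.5, 0.5])
warm = np.array([ 1.9,  1.4, 0.6])
cold = np.array([-1.8, -1.3, 0.5])

def distance(a, b):
    # Euclidean distance: square root of the summed squared differences
    return np.sqrt(np.sum((a - b) ** 2))

def cosine(a, b):
    # Cosine similarity: dot product normalized by both vector lengths
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(distance(hot, warm), distance(hot, cold))  # small vs. large: 0.17 vs. 4.72
print(cosine(hot, warm), cosine(hot, cold))      # aligned vs. opposed: +0.999 vs. -0.91

# Vector arithmetic: King - Man + Woman lands nearest to Queen
king  = np.array([1.00, 2.0, 0.5])
man   = np.array([1.20, 0.4, 0.3])
woman = np.array([1.10, 0.5, 1.4])
queen = np.array([0.95, 2.0, 1.5])
print(distance(king - man + woman, queen))       # small: 0.15

# Scalar multiplication: scaling stretches magnitude, direction is unchanged
large = np.array([0.8, 0.6, 0.4])
print(large * 2)                                 # same direction, twice the length

# Dot product vs. cosine: parallel vectors have identical cosine (1.0),
# but the dot product still tells them apart by magnitude
soft = np.array([0.3, 0.4, 0.2])
loud = soft * 3.0
print(cosine(soft, loud))                        # 1.0 for the parallel pair
print(np.dot(soft, soft), np.dot(soft, loud))    # 0.29 vs. 0.87: magnitude shows
```

With real embeddings the only change is dimensionality: the same functions run unchanged on vectors with hundreds of components.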