Cosine similarity in 60 seconds
Cosine similarity is the one math concept of this module. It is straightforward, and once you see it, every "vector database" makes sense.
If two arrows point in the same direction, they are similar. If they point opposite ways, they are dissimilar. Cosine similarity answers one question: are these arrows pointing the same way? Only direction counts; the length of each arrow does not.
Cosine similarity is the cosine of the angle between two vectors. It ranges from -1 (opposite directions) through 0 (perpendicular) to 1 (identical direction). For embeddings, it usually lies between 0 and 1.
Formula: cos(theta) = (a . b) / (|a| * |b|), where a . b is the dot product and |a| is the length (Euclidean norm) of a. With NumPy, the implementation is a one-line return statement.
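import numpy as np

def cosine(a, b):
    # Convert inputs to float arrays so plain lists and tuples work too.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    # Dot product divided by the product of the lengths (undefined for a zero vector).
    return float(a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b)))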
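A quick way to check the three landmark values is to run the function on small hand-picked vectors. The sketch below repeats the NumPy-based definition so it runs on its own; the 2-D vectors are made up for illustration and are not real embeddings.

```python
import numpy as np

def cosine(a, b):
    # Same definition as above, repeated so this snippet is self-contained.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative 2-D vectors (made up for this example).
same = cosine([1, 2], [2, 4])            # same direction, different length
perpendicular = cosine([1, 0], [0, 3])   # at right angles
opposite = cosine([1, 2], [-1, -2])      # opposite direction

# Rounded to hide tiny floating-point error; expect values near 1, 0 and -1.
print(round(same, 6), round(perpendicular, 6), round(opposite, 6))
```

Note that [1, 2] and [2, 4] score as identical even though one is twice as long: cosine similarity ignores length and compares direction only.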
OpenAI's embeddings are normalized to length 1, so |a| * |b| = 1 and cosine similarity reduces to the dot product a . b.
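The shortcut for length-1 vectors can be checked numerically. This is a minimal sketch, assuming NumPy; normalize is a hypothetical helper written for this example, and the vectors are made-up stand-ins for embeddings rather than output from any embedding API.

```python
import numpy as np

def normalize(v):
    # Hypothetical helper for this sketch: scale a vector to length 1.
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

# Made-up vectors standing in for embeddings.
a = normalize([3, 4])
b = normalize([4, 3])

# Full formula versus the dot product alone.
full_cosine = float(a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b)))
dot_only = float(a.dot(b))

# Both lengths are 1, so the two values agree up to floating-point rounding.
print(np.isclose(full_cosine, dot_only))
```

Skipping the two norm computations saves work when comparing one query against many stored vectors, which is one reason to normalize vectors once up front.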
Quick recall
What is cosine similarity?
Quiz time
1. Two identical embeddings have cosine similarity