What Are Embeddings? How AI Represents Meaning as Numbers
At some point in almost any substantive conversation about AI architecture, the word "embeddings" comes up. It gets mentioned in the context of semantic search, RAG systems, recommendation engines, and vector databases, usually without much explanation, on the assumption that the audience already knows what it means. If you've been nodding along while making a mental note to look it up later, this is the piece for you.
An embedding is a representation of text, images, audio, or almost any other kind of data as a list of numbers (a vector), constructed so that similar things end up with similar lists of numbers. That definition is simple enough to state in one sentence, but what it makes possible is worth unpacking carefully.
Start with the problem embeddings are solving. Computers are good at working with numbers and not naturally good at working with language. A word like "dog" means nothing to a machine unless you give it a representation the machine can compute with. Early approaches to this were blunt: assign each word an arbitrary number, or represent a document as a list of which words appeared in it and how often. These approaches work for some tasks but they throw away almost everything that makes language meaningful. They have no way of knowing that "dog" and "puppy" are closely related, or that "bank" means something different in "river bank" than in "savings bank," or that "the food was not bad" is closer in meaning to "the food was good" than to "the food was terrible."
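To make that limitation concrete, here is a minimal sketch of the word-counting approach in Python. The function names and example sentences are made up for illustration. It shows only that counting words gives no credit for related meanings.

```python
from collections import Counter

def bag_of_words(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(text.lower().split())

def shared_words(a, b):
    """Return the set of words that appear in both bags."""
    return set(a) & set(b)

# "dog" and "puppy" are different strings, so the overlap is empty.
dog = bag_of_words("the dog barked")
puppy = bag_of_words("a puppy yelped")
print(shared_words(dog, puppy))

# Word overlap cannot tell "not bad" apart from "terrible":
# both comparisons share exactly the same three words.
review = bag_of_words("the food was not bad")
print(len(shared_words(review, bag_of_words("the food was good"))))
print(len(shared_words(review, bag_of_words("the food was terrible"))))
```

By this measure "the food was not bad" is exactly as close to "the food was terrible" as it is to "the food was good", which is the kind of blind spot embeddings are meant to remove.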
Embeddings solve this by placing words, sentences, or documents in a high-dimensional mathematical space where proximity reflects semantic similarity. Two words or phrases that mean similar things end up close together in that space. Things that are unrelated end up far apart. The classic demonstration is that in a well-trained embedding space, the vector for "king" minus the vector for "man" plus the vector for "woman" lands close to the vector for "queen." The model has learned something genuine about the relationships between concepts, not because it was told about those relationships explicitly, but because it absorbed them from patterns in large amounts of text.
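The arithmetic behind that demonstration can be sketched with a toy example. The three-dimensional vectors below are hand-written for illustration, not produced by any trained model, and real embeddings have hundreds or thousands of dimensions with no such tidy interpretation. The sketch assumes cosine similarity as the closeness measure and, as is common in these demonstrations, leaves the three input words out of the candidates.

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written toy vectors; the axes loosely stand for
# (royalty, maleness, femaleness). These are assumptions, not model output.
TOY_VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.1, 0.1],
}

def analogy(a, b, c, vectors):
    """Return the word nearest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the input words so the answer is a genuinely different word.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

print(analogy("king", "man", "woman", TOY_VECTORS))  # queen
```

In a trained model nobody assigns these coordinates by hand; the point of the toy is only to show what "king minus man plus woman" means as an operation on lists of numbers.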
This is what makes embeddings useful in practice. When you build a semantic search system, you convert both your documents and the user's query into embeddings, and then find the documents whose embeddings are closest to the query embedding. Unlike keyword search, which looks for literal matches, semantic search finds documents that mean the same thing even if they use different words. A search for "how do I cancel my subscription" will surface documents about "ending your membership" or "stopping your plan" because those phrases live close together in embedding space.
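As a rough sketch of that ranking step, the example below scores a few help-center titles against a query using cosine similarity. The vectors are hand-written stand-ins for what an embedding model would produce, and the `search` function and its parameters are hypothetical names chosen for this illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written stand-ins for document embeddings (not real model output).
DOCUMENTS = {
    "Ending your membership":     [0.9, 0.2, 0.0],
    "Stopping your plan":         [0.8, 0.3, 0.1],
    "Updating your payment card": [0.1, 0.9, 0.1],
    "Resetting your password":    [0.0, 0.1, 0.9],
}

# Stand-in embedding for the query "how do I cancel my subscription".
QUERY_VECTOR = [0.9, 0.3, 0.0]

def search(query_vector, documents, k=2):
    """Return the k document titles whose vectors are closest to the query."""
    ranked = sorted(
        documents,
        key=lambda title: cosine(query_vector, documents[title]),
        reverse=True,
    )
    return ranked[:k]

print(search(QUERY_VECTOR, DOCUMENTS))
# ['Ending your membership', 'Stopping your plan']
```

None of the top results contains the words "cancel" or "subscription"; they rank first purely because their vectors point in nearly the same direction as the query's.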
RAG systems, which were covered earlier in this blog, depend on embeddings for exactly this reason. When a user asks a question, the system converts it to an embedding, searches a vector database for the stored embeddings closest to it, retrieves the corresponding documents, and passes them to the language model. The retrieval step only works well if the embedding model understands meaning well enough to find genuinely relevant content rather than just keyword matches. The quality of the embedding model is therefore a significant variable in the quality of the overall system.
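The retrieval flow just described can be sketched in a few lines. Everything here is a simplified assumption. `InMemoryVectorStore` is a hypothetical stand-in for a vector database, `embed` is a lookup table of hand-written vectors rather than a real embedding model, and the final call to a language model is left out. The sketch stops once the prompt is assembled.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self):
        self._rows = []  # list of (vector, text) pairs

    def add(self, vector, text):
        self._rows.append((vector, text))

    def search(self, query_vector, k=1):
        """Return the texts of the k stored vectors closest to the query."""
        ranked = sorted(
            self._rows, key=lambda row: cosine(query_vector, row[0]), reverse=True
        )
        return [text for _, text in ranked[:k]]

# Stand-in for an embedding model: hand-written vectors, not real output.
FAKE_EMBEDDINGS = {
    "Refunds are available within 30 days of purchase.": [0.9, 0.1],
    "Our office is closed on public holidays.": [0.1, 0.9],
    "How long do I have to get my money back?": [0.8, 0.2],
}

def embed(text):
    return FAKE_EMBEDDINGS[text]

def build_prompt(question, passages):
    """Combine retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

store = InMemoryVectorStore()
for doc in list(FAKE_EMBEDDINGS)[:2]:
    store.add(embed(doc), doc)

question = "How long do I have to get my money back?"
passages = store.search(embed(question), k=1)
print(build_prompt(question, passages))
```

The question never uses the word "refund", yet the refund passage is the one retrieved. With a weaker embedding model the vectors would be less well separated and the wrong passage could reach the language model.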
Embeddings are also central to how recommendation systems work. When a streaming service recommends something you might like, or an e-commerce platform surfaces related products, there's often an embedding model representing items and users in the same space, with recommendations coming from finding items whose embeddings are close to the user's embedding based on their history. The same mathematical structure that captures semantic similarity in language captures preference similarity in behavior data.
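One simple way to picture this is to build the user's embedding as the average of the items they have already watched, then rank unseen items by closeness to it. That averaging step is an assumption made for this sketch. Production systems typically learn user embeddings in more sophisticated ways, and the titles and vectors below are invented.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Invented item vectors; the axes loosely stand for (sci-fi, comedy).
ITEMS = {
    "Space Saga":     [0.9, 0.1],
    "Robot Uprising": [0.8, 0.2],
    "Galaxy Patrol":  [0.85, 0.15],
    "Office Laughs":  [0.1, 0.9],
    "Stand-Up Night": [0.2, 0.8],
}

def user_embedding(history, items):
    """Average the vectors of the items in the user's history."""
    vectors = [items[title] for title in history]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def recommend(history, items, k=1):
    """Return the k unseen items closest to the user's averaged vector."""
    user = user_embedding(history, items)
    unseen = [title for title in items if title not in history]
    unseen.sort(key=lambda title: cosine(user, items[title]), reverse=True)
    return unseen[:k]

print(recommend(["Space Saga", "Robot Uprising"], ITEMS))  # ['Galaxy Patrol']
```

The ranking logic is the same nearest-neighbor lookup used for semantic search; only the source of the vectors has changed from text to viewing behavior.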
A few practical things are worth knowing if you're working with embeddings rather than just learning about them. Different embedding models produce different vector spaces, so embeddings from one model can't be directly compared to embeddings from another; a query and the documents it is searched against need to be embedded with the same model. The dimensionality of the embedding, meaning how many numbers are in the list representing each piece of text, affects both the richness of the representation and the storage and computational cost of working with it. And embedding models, like all models, reflect their training data, which means they can encode biases present in that data in ways that affect downstream system behavior.
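The cost side of dimensionality is easy to estimate with back-of-the-envelope arithmetic. The sketch below assumes each number is stored as a 4-byte float and ignores index overhead and metadata. The collection size and the two dimension counts are illustrative values, not recommendations.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for dense vectors, ignoring index overhead and metadata."""
    return num_vectors * dimensions * bytes_per_value

# Compare two illustrative embedding sizes for one million stored texts.
for dims in (384, 1536):
    size_gb = raw_vector_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dimensions: {size_gb:.2f} GB")
# 384 dimensions: 1.54 GB
# 1536 dimensions: 6.14 GB
```

Quadrupling the dimensionality quadruples the raw storage, and each similarity comparison also touches four times as many numbers.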
The concept is more approachable than the terminology suggests. If you've ever wondered how a search system finds what you mean rather than just what you typed, or how a recommendation engine seems to understand your taste, or how a RAG system knows which documents are relevant to a question, embeddings are usually a significant part of the answer. Understanding them gives you a clearer picture of what's actually happening inside the AI systems you're building on or building with.