What is semantic search & how to implement it?

April 6, 2025

#vector embeddings
#semantic search
#llm
#db

Last week, I was assigned an interesting task at work: implement semantic search in our API pipeline so that our LLM could answer customer queries more accurately and in real time. I had a rough idea of how semantic search worked (shoutout to my ML professor, that class finally paid off), but I didn't really know how to get started with implementing it. Here's how I did it.

TL;DR

Semantic search matches on meaning rather than just keywords. I implemented it using OpenAI's embeddings model, optimized the vector dimensions for better performance, created a fast indexing layer, and updated our query flow to use semantic similarity instead of parameter-based matching. The result was a smarter search with faster, more relevant results, which improved both the performance and the accuracy of the LLM's responses.

What is semantic search and how is it different?

[Figure: REST vs. semantic search based agents]

Traditional search systems rely on lexical matching: they look for exact or partial keyword matches in the data. If a user searches for “laptop”, the system will only return results that contain the keyword “laptop.” It doesn't understand that “Chromebook” or “MacBook” might mean the same thing in context.

Semantic search works differently. Instead of just matching strings, it tries to match meanings.

So even if the user doesn't use the exact keyword in the dataset, the search engine can still find conceptually relevant results. This is what makes semantic search more intelligent and useful than plain keyword-based search.
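To make "matching meanings" concrete, here is a minimal sketch of the usual approach: represent each piece of text as a vector and compare vectors with cosine similarity. The three-dimensional vectors below are hand-made toy values for illustration only, not real model output, and `rank_by_meaning` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors (assumption: made up by hand so that similar concepts point
# in similar directions; a real embedding model produces these for you).
toy_vectors = {
    "laptop": [0.9, 0.8, 0.1],
    "chromebook": [0.85, 0.75, 0.2],
    "banana": [0.1, 0.05, 0.95],
}


def rank_by_meaning(query, vectors):
    """Return the other words sorted from most to least similar to `query`."""
    query_vector = vectors[query]
    scored = [
        (word, cosine_similarity(query_vector, vector))
        for word, vector in vectors.items()
        if word != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for word, score in rank_by_meaning("laptop", toy_vectors):
        print(f"{word}: {score:.3f}")
```

With these toy values, "chromebook" ranks well above "banana" for the query "laptop", even though the strings share no keyword.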

How to implement semantic search?

1. Choose a model and generate embeddings
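A minimal sketch of this step, assuming the official `openai` Python package and the `text-embedding-3-small` model. The post only says "OpenAI's embeddings model", so the specific model name is an assumption, and `embed_texts` is a hypothetical helper. The client is passed in as a parameter so the function can be exercised without network access.

```python
def embed_texts(texts, client, model="text-embedding-3-small"):
    """Return one embedding (a list of floats) per input text, in input order.

    `client` is expected to be an `openai.OpenAI()` instance; the model name
    is an assumption, swap in whichever embeddings model you actually use.
    """
    response = client.embeddings.create(model=model, input=texts)
    # Each returned item carries the index of the input it belongs to,
    # so sort on it to be explicit about ordering.
    ordered = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]


# Example usage (requires `pip install openai` and OPENAI_API_KEY to be set):
#
#   from openai import OpenAI
#   vectors = embed_texts(["How do I reset my password?"], OpenAI())
```

Embedding the documents once up front and storing the vectors means only the query has to be embedded at request time.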

2. Choose the right dimensionality
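Smaller vectors are cheaper to store and faster to compare, at some cost in accuracy. One common way to reduce dimensionality, sketched here, is to keep the first `dims` values of an embedding and rescale the result to unit length. OpenAI's `text-embedding-3` models also accept a `dimensions` parameter on the embeddings request that returns shortened vectors directly. Whether the setup described in this post used either option is an assumption, and `shorten_embedding` is a hypothetical helper name.

```python
import math


def shorten_embedding(vector, dims):
    """Keep the first `dims` values and rescale the result to unit length.

    Note: truncation only preserves quality for models trained to support
    shortened embeddings; for other models this is an assumption to validate.
    """
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    truncated = list(vector[:dims])
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0:
        return truncated  # nothing to rescale
    return [x / norm for x in truncated]


if __name__ == "__main__":
    print(shorten_embedding([3.0, 4.0, 12.0], 2))
```

It is worth measuring retrieval quality on your own queries at a few candidate sizes before settling on one.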

3. Index your embeddings
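A minimal sketch of an indexing layer, assuming a small in-memory index with exact (brute-force) search. The post does not say which index or vector store was used, so `VectorIndex` is a hypothetical stand-in. For large collections, a dedicated vector database or an approximate nearest neighbor library is the usual choice.

```python
import heapq
import math


class VectorIndex:
    """Tiny in-memory index: stores unit-length vectors, exact top-k search."""

    def __init__(self):
        self._ids = []
        self._vectors = []

    @staticmethod
    def _normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0:
            raise ValueError("cannot index or query with a zero vector")
        return [x / norm for x in vector]

    def add(self, doc_id, vector):
        """Store a document id alongside its normalized embedding."""
        self._ids.append(doc_id)
        self._vectors.append(self._normalize(vector))

    def search(self, query_vector, k=3):
        """Return up to k (doc_id, score) pairs, highest cosine score first."""
        query = self._normalize(query_vector)
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = (
            (doc_id, sum(a * b for a, b in zip(query, vector)))
            for doc_id, vector in zip(self._ids, self._vectors)
        )
        return heapq.nlargest(k, scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    index = VectorIndex()
    index.add("laptop-guide", [1.0, 0.0])
    index.add("fruit-faq", [0.0, 1.0])
    print(index.search([0.9, 0.1], k=1))
```

Normalizing once at insert time keeps each query down to one dot product per stored vector.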

4. Update the query flow
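A minimal sketch of a semantic query flow: embed the incoming query, look up the nearest documents, and hand them to the LLM as context. The function names, the `min_score` cutoff, and the prompt wording are all assumptions for illustration. `embed_fn` and `search_fn` stand in for whatever embedding call and index you use.

```python
def retrieve_context(query, embed_fn, search_fn, documents, k=3, min_score=0.0):
    """Return the texts of the documents most similar to `query`.

    embed_fn(text) -> vector, and search_fn(vector, k) -> [(doc_id, score)],
    are supplied by the caller (hypothetical interfaces for this sketch).
    """
    query_vector = embed_fn(query)
    hits = search_fn(query_vector, k)
    # Drop weak matches so unrelated documents never reach the prompt.
    return [documents[doc_id] for doc_id, score in hits if score >= min_score]


def build_prompt(query, context_docs):
    """Assemble a prompt that grounds the LLM in the retrieved documents."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    docs = {"a": "Laptops ship in 3 days.", "b": "Bananas are yellow."}
    found = retrieve_context(
        "When will my Chromebook arrive?",
        embed_fn=lambda text: [1.0, 0.0],  # stand-in for a real embedding call
        search_fn=lambda vector, k: [("a", 0.92), ("b", 0.10)][:k],
        documents=docs,
        k=2,
        min_score=0.5,
    )
    print(build_prompt("When will my Chromebook arrive?", found))
```

Passing the embedding and search steps in as functions keeps the flow easy to test and lets either piece be swapped out later.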

In case you're wondering whether it's worth the effort, here are some performance benchmarks I ran:

[Figure: semantic search performance benchmarks]

Final Thoughts
