We get this question every week. A founder has a pile of docs and wants a chatbot. Should they do RAG, or is plain keyword search enough?
The honest answer: it depends on what users will ask.
When vanilla search wins
If your users know the exact words in the documents, search wins. It is faster, cheaper, and easier to debug. A help center where people search for "refund policy" does not need embeddings. It needs a good index.
You should reach for plain search when:
- Users ask short, factual queries
- Documents are well-titled and well-tagged
- Cost matters, and latency matters even more
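To make "a good index" concrete, here is a minimal sketch of an inverted index with AND-style matching. The helper names (`tokenize`, `build_index`, `search`) and the two sample docs are ours, made up for illustration. A production setup would more likely lean on an existing search engine with proper ranking.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of doc ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of docs that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set so intersecting does not mutate the index.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample docs, invented for this example.
docs = {
    "refunds": "Refund policy: request a refund within 30 days of purchase.",
    "shipping": "Shipping policy: orders ship within 2 business days.",
}
index = build_index(docs)
print(search(index, "refund policy"))
```

Every step here is a set lookup or a set intersection, which is why this kind of search is fast and easy to debug: when a query misses, you can print the index entry for each token and see exactly which word failed to match.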
When RAG wins
RAG earns its complexity when users ask questions in their own words, and the answers live across multiple documents.
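Say a user asks "what should I do if my payment failed during checkout?" The answer is spread across three docs, and none of them uses the word "failed." This is where embeddings shine. They match on meaning rather than exact wording, so a doc can rank highly without sharing a single keyword with the query.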
You should reach for RAG when:
- Users ask conversational questions
- Answers require synthesis across documents
- The corpus is too large to skim by hand
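Under the hood, embedding retrieval usually comes down to comparing vectors. The sketch below ranks docs by cosine similarity to the query, echoing the payment example above. The three-dimensional vectors and doc ids are toy values we made up. In a real system they would come from an embedding model and have hundreds of dimensions or more.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat an all-zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration only (not real model output).
doc_vecs = {
    "declined-cards": [0.9, 0.1, 0.0],
    "retry-checkout": [0.8, 0.3, 0.1],
    "shipping-times": [0.0, 0.2, 0.9],
}
query_vec = [0.85, 0.2, 0.05]  # Stand-in for the "payment failed" question.

for doc_id, score in rank_by_similarity(query_vec, doc_vecs):
    print(f"{doc_id}: {score:.3f}")
```

Nothing in this ranking looks at words. Two texts score as close whenever their vectors point in a similar direction, which is what lets a question match a doc that never uses its vocabulary.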
The hybrid play
Most production systems we ship are hybrid. Keyword search pulls a candidate set first, embeddings re-rank the top results, and the best few go to the LLM. You get the speed of keyword search and the semantic matching of embeddings.
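The retrieval half of that pipeline can be sketched in a few lines. This is a rough outline under our own assumptions, not a drop-in implementation: `embed` is a hypothetical function you supply that turns text into a vector, keyword scoring is plain token overlap, and the parameter names (`n_candidates`, `n_final`) are ours.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_candidates(query, docs, limit):
    """Stage 1: cheap keyword pass, scored by number of shared tokens."""
    query_tokens = tokenize(query)
    scored = [
        (len(query_tokens & tokenize(text)), doc_id)
        for doc_id, text in docs.items()
    ]
    # Drop docs with no overlap, then keep the best `limit` candidates.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:limit]]


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_retrieve(query, docs, embed, n_candidates=20, n_final=3):
    """Stage 2: re-rank the keyword candidates by embedding similarity.

    `embed` is a caller-supplied (hypothetical) text-to-vector function.
    """
    candidates = keyword_candidates(query, docs, n_candidates)
    query_vec = embed(query)
    reranked = sorted(
        candidates,
        key=lambda doc_id: cosine(query_vec, embed(docs[doc_id])),
        reverse=True,
    )
    return reranked[:n_final]
```

The doc ids this returns are what you would look up and paste into the LLM prompt as context. Because only the keyword candidates get embedded and compared, the expensive step runs on a handful of docs rather than the whole corpus.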
If you can only build one thing this quarter, start with keyword search and a small eval set. Then upgrade to hybrid when the misses pile up.
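A small eval set can be as simple as a list of real user questions, each paired with the doc that should come back. Here is one way to score it, assuming a `retrieve` function that returns ranked doc ids. The function name `hit_rate_at_k`, the stub retriever, and the sample questions are all ours, for illustration.

```python
def hit_rate_at_k(eval_set, retrieve, k=3):
    """Fraction of queries whose expected doc appears in the top k results.

    eval_set: list of (query, expected_doc_id) pairs.
    retrieve: function mapping a query to a ranked list of doc ids.
    """
    if not eval_set:
        return 0.0
    hits = 0
    for query, expected_doc_id in eval_set:
        if expected_doc_id in retrieve(query)[:k]:
            hits += 1
    return hits / len(eval_set)


# Hypothetical eval set and stub retriever, just to show the shape.
eval_set = [
    ("refund policy", "refunds"),
    ("what if my payment failed during checkout", "declined-cards"),
]


def stub_retrieve(query):
    """Stand-in for your real search function."""
    canned = {"refund policy": ["refunds", "shipping"]}
    return canned.get(query, [])


print(hit_rate_at_k(eval_set, stub_retrieve, k=3))
```

Run this against keyword search first and keep the list of misses. When the misses are mostly reworded questions rather than missing content, that is a reasonable signal that it is time to add the embedding step.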