How I Built a Semantic Search Engine in a Weekend and Saved My SaaS from Dying

By Cael Lee | 8 min read

I almost shut down my SaaS six months ago. Not exaggerating: I was staring at a 12% monthly churn rate, I was drowning in support tickets, and I was sleeping maybe 4 hours a night, manually tagging customer queries like some kind of zombie. Then I stumbled onto OpenAI's embeddings API, built a semantic search engine in a weekend, and it basically saved my business. Here's the full build process, the numbers, and the screw-ups I made so you don't have to make them.

Back in March, I was running this customer support tool for e-commerce stores. Think Intercom-lite, but specifically for Shopify merchants. The problem? My search feature was absolute garbage. It relied on exact keyword matching, so if a customer typed "how do I get my money back" and our help docs said "refund policy," they'd get zero results. Zip. Ticket volume kept climbing. I was burning out hard.

Pieter Levels talks a lot about solving your own problems first. This was mine. I needed a search engine that understood intent, not just keywords. Traditional full-text search (I was using PostgreSQL's tsvector) couldn't handle synonyms or paraphrasing at all. Elasticsearch felt like massive overkill for my tiny user base. Then I remembered reading about embeddings on Simon Willison's blog back in like... February? Maybe January. Time's a blur when you're not sleeping.

What Actually Are Embeddings?

Okay so here's the non-technical version—and I'm probably going to butcher this but whatever. Embeddings turn text into a list of 1,536 numbers (a vector) where similar meanings are close together in mathematical space. "Refund my order" and "I want my money back" end up with vectors that are basically neighbors. Exact keywords don't matter.

Actually, wait—I should clarify that the 1,536 dimensions is specific to OpenAI's model. If you use something like all-MiniLM-L6-v2 from sentence-transformers, it's 384 dimensions. Smaller, faster, but less nuanced. Tradeoffs, right?
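To make "close together in mathematical space" concrete, here is a minimal sketch using made-up 3-dimensional vectors. Real embeddings have hundreds or thousands of dimensions, and the numbers below are invented for illustration, not output from any model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, invented for illustration only.
refund_my_order = [0.9, 0.1, 0.2]
want_money_back = [0.8, 0.2, 0.3]
shipping_time = [0.1, 0.9, 0.1]

# The two refund phrasings score much higher than the unrelated pair.
print(cosine_similarity(refund_my_order, want_money_back))
print(cosine_similarity(refund_my_order, shipping_time))
```

The first score comes out far higher than the second, which is all "neighbors" means here.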

OpenAI's text-embedding-3-small model costs $0.02 per 1 million tokens. For my scale, that's pocket change. The real magic is combining embeddings with a vector database. I used pgvector, a PostgreSQL extension, because I was already on Supabase and didn't want to learn yet another tool.

The Build: Weekend Timeline

Friday night, 9 PM: Started reading OpenAI docs. Created an API key. Wrote this janky Python script to generate embeddings for my 2,400 help articles. Processed everything in batches of 100, added a 1-second delay between calls to avoid rate limits. Took about 45 minutes.
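A batch-and-pause loop like that might look roughly like the sketch below. This is not the original script: embed_in_batches, fake_embed, and the sample article strings are hypothetical names, and the OpenAI wiring in the comment assumes the official openai Python package (v1 or later) with an OPENAI_API_KEY environment variable.

```python
import time

def embed_in_batches(texts, embed_batch, batch_size=100, delay_seconds=1.0, sleep=time.sleep):
    """Embed texts batch by batch, pausing between calls to stay under rate limits.

    embed_batch is any callable that takes a list of strings and returns one
    vector per string, in the same order.
    """
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        vectors.extend(embed_batch(batch))
        # Pause only if another batch is still to come.
        if start + batch_size < len(texts):
            sleep(delay_seconds)
    return vectors

# One way to plug in OpenAI (an assumption, not shown in the post):
#
#   from openai import OpenAI
#   client = OpenAI()
#   def openai_embed(batch):
#       resp = client.embeddings.create(model="text-embedding-3-small", input=batch)
#       return [item.embedding for item in resp.data]

# Offline stand-in so the sketch runs without an API key.
def fake_embed(batch):
    return [[float(len(text))] for text in batch]

articles = [f"help article {i}" for i in range(250)]
vectors = embed_in_batches(articles, fake_embed, delay_seconds=0)
print(len(vectors))  # 250
```

Passing the embedding call in as a function keeps the batching logic testable without spending API credits.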

Cost: $0.17 total. I literally laughed.
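A quick back-of-envelope check on that figure, as a sketch. The price of $0.02 per million tokens is the assumed rate for text-embedding-3-small, and the 3,500 tokens per article is a guess back-calculated from the total, not a number from the post.

```python
def embedding_cost_usd(total_tokens, usd_per_million_tokens=0.02):
    """Cost of embedding total_tokens at a flat per-million-token price."""
    return total_tokens / 1_000_000 * usd_per_million_tokens

articles = 2_400
tokens_per_article = 3_500  # assumption: rough average length, not from the post
total_tokens = articles * tokens_per_article

print(f"{total_tokens:,} tokens -> ${embedding_cost_usd(total_tokens):.2f}")
# 8,400,000 tokens -> $0.17
```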

Saturday morning: Installed pgvector on Supabase. One SQL command: CREATE EXTENSION vector; Added an embedding column to my articles table. This was surprisingly painless. I'd expected database hell—migration scripts failing, version conflicts, the usual—but Supabase handled it like a champ. Their pgvector support is solid as of mid-2024.

Saturday afternoon: Built the search function. Here's the core logic:

  1. User types a query
  2. Generate embedding for that query via OpenAI API
  3. Run a cosine similarity search in pgvector: SELECT * FROM articles ORDER BY embedding <=> query_embedding LIMIT 5;
  4. Return results

The <=> operator calculates cosine distance between vectors. Smaller distance = more relevant. Cosine similarity is just 1 minus that distance, and I set a threshold of 0.8 similarity; anything below that gets filtered out. Well... that's what I thought was right at the time. More on that disaster in a minute.
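Steps 2 to 4 above could be sketched as a parameterized query like the one below. The id and title columns, the helper names, and the use of psycopg 3 are assumptions for illustration. The similarity threshold is applied as 1 minus the cosine distance that <=> returns.

```python
def to_pgvector_literal(vector):
    """Format a list of numbers in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in vector) + "]"

# Assumed schema: articles(id, title, embedding vector(1536)).
SEARCH_SQL = """
SELECT id, title, 1 - (embedding <=> %(q)s::vector) AS similarity
FROM articles
WHERE 1 - (embedding <=> %(q)s::vector) >= %(min_similarity)s
ORDER BY embedding <=> %(q)s::vector
LIMIT %(limit)s;
"""

def build_search_params(query_embedding, min_similarity=0.8, limit=5):
    """Bundle the query vector and filters as named SQL parameters."""
    return {
        "q": to_pgvector_literal(query_embedding),
        "min_similarity": min_similarity,
        "limit": limit,
    }

params = build_search_params([0.1, 0.2, 0.3])
print(params["q"])  # [0.1,0.2,0.3]

# With an open psycopg 3 connection `conn` (assumed), this would run as:
#   rows = conn.execute(SEARCH_SQL, params).fetchall()
```

Passing the vector as a parameter instead of pasting it into the SQL string keeps the query safe from injection and easy to reuse.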

Sunday: Built a simple dashboard to monitor search quality. Tracked click-through rates on results, zero-result queries, and average response time. Deployed to production at 11 PM. Couldn't sleep anyway.
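The three dashboard numbers can be computed from a plain log of search events. Here is a minimal sketch, assuming a hypothetical SearchEvent record; the field names and sample events are invented, and click-through rate is measured only over searches that returned something.

```python
from dataclasses import dataclass

@dataclass
class SearchEvent:
    query: str
    result_count: int
    clicked: bool
    response_ms: float

def summarize(events):
    """Return click-through rate, zero-result rate, and average response time."""
    total = len(events)
    if total == 0:
        return {"click_through_rate": 0.0, "zero_result_rate": 0.0, "avg_response_ms": 0.0}
    with_results = [e for e in events if e.result_count > 0]
    clicks = sum(1 for e in with_results if e.clicked)
    return {
        "click_through_rate": clicks / len(with_results) if with_results else 0.0,
        "zero_result_rate": (total - len(with_results)) / total,
        "avg_response_ms": sum(e.response_ms for e in events) / total,
    }

# Invented sample log.
log = [
    SearchEvent("refund status", 5, True, 100.0),
    SearchEvent("shipping time", 3, True, 200.0),
    SearchEvent("cancel order", 4, False, 300.0),
    SearchEvent("cancel subscription", 0, False, 200.0),
]
print(summarize(log))
```

Sorting the zero-result queries by frequency is a cheap way to spot missing help articles.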

The Numbers That Changed Everything

Here's what happened in the first 30 days:

The churn rate that was killing me? It dropped from 12% to 6.8% in 60 days. Turns out, when customers can actually find answers, they stick around. Who knew.

But here's what I didn't expect: the search engine became a feature I could charge for. I added semantic search as a premium tier at $49/month (vs. $29 basic). Twenty-three customers upgraded in the first week. At $20 extra each, that's $460 in additional MRR from something that cost me less than $15 to run. I think that's when it clicked that this wasn't just a fix. It was a differentiator.

The Screw-Ups I Made

Mistake 1: I didn't cache embeddings. For the first two weeks, every single search query generated a fresh embedding. So dumb. 80% of queries were repeats—"refund status," "shipping time," "cancel order." Over and over. Now I cache embeddings in Redis with a 24-hour TTL. API costs dropped 60% overnight. I'm using Upstash's free tier which gives you 10,000 commands per day. More than enough for now.
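The caching idea can be sketched with an in-memory dictionary standing in for Redis. TTLEmbeddingCache and its key normalization are hypothetical, not the production code. With redis-py the same pattern would use r.get(key) and r.setex(key, ttl_seconds, value) in place of the dictionary.

```python
import time

class TTLEmbeddingCache:
    """In-memory stand-in for a Redis cache with per-key expiry."""

    def __init__(self, embed_fn, ttl_seconds=24 * 60 * 60, clock=time.monotonic):
        self._embed_fn = embed_fn
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # normalized query -> (expires_at, vector)
        self.api_calls = 0

    @staticmethod
    def _normalize(query):
        # Treat "Refund  Status" and "refund status" as the same cache key.
        return " ".join(query.lower().split())

    def get(self, query):
        key = self._normalize(query)
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no API call
        vector = self._embed_fn(query)
        self.api_calls += 1
        self._store[key] = (now + self._ttl, vector)
        return vector

# Fake embedding function so the sketch runs offline.
cache = TTLEmbeddingCache(embed_fn=lambda q: [float(len(q))])
for q in ["refund status", "Refund  status", "shipping time", "refund status"]:
    cache.get(q)
print(cache.api_calls)  # 2
```

Normalizing case and whitespace before the lookup is a design choice that raises the hit rate; whether it is safe depends on how much casing matters in your queries.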

Mistake 2: My similarity threshold was too strict. I started at 0.9, which filtered out perfectly good results. A query like "where's my package" wasn't matching "order tracking" because the phrasing was different. Lowering to 0.75 caught more edge cases without introducing noise. Took three weeks of A/B testing to find the sweet spot. Three weeks of manually reviewing search logs every morning with coffee. Not fun.

Actually—I should mention that the threshold depends heavily on your content. What works for my e-commerce help docs might be terrible for technical documentation or legal text. You really have to test it with your own data. I wasted a week trying to copy someone else's threshold from a blog post. Don't do that.
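One way to test a threshold against your own data is to hand-label a sample of (similarity, was it relevant) pairs from the search logs and sweep candidate thresholds. A minimal sketch follows; sweep_thresholds and the labeled sample are invented for illustration and are not the A/B setup described above.

```python
def sweep_thresholds(scored_pairs, thresholds):
    """scored_pairs: (similarity, is_relevant) tuples from hand-reviewed logs.

    Returns (threshold, precision, recall) for each candidate threshold.
    """
    total_relevant = sum(1 for _, relevant in scored_pairs if relevant)
    report = []
    for t in thresholds:
        kept = [(s, r) for s, r in scored_pairs if s >= t]
        true_pos = sum(1 for _, r in kept if r)
        precision = true_pos / len(kept) if kept else 0.0
        recall = true_pos / total_relevant if total_relevant else 0.0
        report.append((t, precision, recall))
    return report

# Invented labels: four relevant results, three irrelevant ones.
labeled = [
    (0.93, True), (0.88, True), (0.81, True), (0.77, True),
    (0.74, False), (0.70, False), (0.62, False),
]
for t, p, r in sweep_thresholds(labeled, [0.9, 0.8, 0.75]):
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

On this made-up sample, recall climbs from 0.25 at a threshold of 0.9 to 1.00 at 0.75 while precision stays at 1.00. Real logs will be messier, which is why the numbers have to come from your own content.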

Mistake 3: I ignored non-English queries. About 15% of my users search in Spanish or German. Embeddings handle multilingual content surprisingly well—I think OpenAI's model was trained on enough multilingual data that it just works—but I should have tested this before launch. Had to scramble to generate embeddings for translated help docs. That was a fun Sunday.

The Stack I'm Using Now

I know some indie hackers are using open-source models like all-MiniLM-L6-v2 to avoid API costs. I tested it on my dataset—accuracy was about 15% worse, and I'd need a GPU instance ($0.50/hour on Hugging Face). For my volume, OpenAI is cheaper and better. But if you're doing 500,000+ searches a day, self-hosting probably makes sense. Different scale, different tradeoffs.

What I'd Do Differently

If I were starting over today, I'd build the semantic search before launching the product. It's not a nice-to-have; it's table stakes. Users expect Google-level search in every app now. I wasted six months patching a broken keyword system when the fix took one weekend. Six months.

I'd also test hybrid search earlier. Pure semantic search sometimes misses exact matches—like product SKUs or error codes. "ERR523" should match "ERR523" exactly, not some semantically similar phrase about server errors. Now I combine pgvector similarity with a traditional full-text search score using a weighted average: 70% semantic, 30% keyword. Best of both worlds. Took me way too long to figure that out.
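The 70/30 blend can be sketched as a plain weighted average. This assumes both scores are already scaled to the 0 to 1 range (raw full-text ranks generally need rescaling first, and the post does not say how that is done). The function names and the scores below are invented for illustration.

```python
def hybrid_score(semantic, keyword, semantic_weight=0.7):
    """Weighted average of two scores that are both already scaled to 0..1."""
    return semantic_weight * semantic + (1 - semantic_weight) * keyword

def rank(candidates, semantic_weight=0.7):
    """candidates: (title, semantic_score, keyword_score) tuples, best first."""
    scored = [(title, hybrid_score(s, k, semantic_weight)) for title, s, k in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Invented scores for the query "ERR523".
candidates = [
    ("Fixing error ERR523", 0.70, 1.00),            # exact keyword hit
    ("Troubleshooting server errors", 0.78, 0.00),  # semantically close, no keyword hit
    ("Refund policy", 0.20, 0.00),
]
for title, score in rank(candidates):
    print(f"{score:.3f}  {title}")
```

With these made-up scores, pure semantic ranking would put the generic server-errors article first, while the blended score lifts the exact ERR523 match to the top.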

And I'd talk to users sooner about what they're actually searching for. My analytics showed the top failed query was "cancel subscription"—which I didn't even have a help article for. Added one in 20 minutes. Problem solved. Felt like an idiot for not checking sooner.

TL;DR / Key Takeaways

Why This Matters for Bootstrappers

The AI hype cycle is exhausting, I get it. Every week there's some new model or framework or whatever. But embeddings aren't hype. They're infrastructure, like databases or CDNs. The cost is negligible, the implementation is straightforward, and the impact on user experience is massive. You don't need a team of ML engineers or a $50k budget.

One founder. One weekend. $15 in API credits.

Pieter Levels built Nomad List with simple tech and relentless focus on user needs. Embeddings let us do the same for search. It's not about building the fanciest AI—it's about making your product actually work for the people paying you.

I'm curious: has anyone else replaced their search stack with embeddings? What threshold values are you using? I'm still tuning mine and would love to compare notes. Drop your setup in the comments. Especially if you're doing hybrid search—I feel like I'm still leaving performance on the table there.

And if you're stuck on keyword search and drowning in support tickets like I was—just build the damn thing this weekend. Your sleep schedule will thank you. Mine finally did.

Product: SupportGPT — Semantic search for e-commerce help desks

Try it: 14-day free trial, no credit card required. Use code INDIEHACKERS for 50% off the first 3 months.

#buildinpublic #semanticsearch #openai #embeddings #bootstrapping #saas #indiehacker

Cael Lee

Full-stack developer with 8+ years of experience. Currently building AI-powered developer tools. I've tested 20+ AI API providers and coding assistants.
