Home / Blog / How I Let an AI Refactor My Entire SaaS Codebase i...

How I Let an AI Refactor My Entire SaaS Codebase in 48 Hours (And It Actually Worked)

By CaelLee | | 10 min read

How I Let an AI Refactor My Entire SaaS Codebase in 48 Hours (And It Actually Worked)

Last Tuesday at 2am, I was staring at 47,000 lines of spaghetti code thinking "this is fine" while my churn rate quietly climbed to 6.8%. By Thursday night, Claude Code had restructured my entire repository, and I'd spent exactly £0 on developer help.

I know how that sounds. Bear with me.

Product: Analytics Dashboard for Shopify Merchants

Revenue: $10,243 MRR (up from $8,900 last month — that's about £8,100 for my fellow Brits)

What happened next: Genuinely surprised me

The Mess I Built Over 18 Months

When I launched my Shopify analytics tool in January 2023, I wrote every single line myself. No co-founders, no employees, no contractors. Just me, VS Code, and the kind of optimism you only have at 1am when you're convinced that "I'll refactor it later" is a legitimate business strategy.

Spoiler: it's not.

By August 2024, the cracks were impossible to ignore:

The real gut punch? A power user named Marcus emailed me: "Love the product, but it's getting slower every month. What happened?"

What happened was 47 features shipped with zero refactoring. Actually, wait — I should clarify that. I did refactor once. Back in March 2023. For about two hours. Then I got distracted building a CSV export feature that three people asked for. Three.

Classic indie hacker stuff.

Enter Claude Code: The Terminal Agent That Doesn't Need Hand-Holding

I'd seen the Anthropic's Claude Code launch floating around but honestly dismissed it as "another AI wrapper." Then I saw Marc Lou tweet about rebuilding his entire backend in a weekend with it.

Marc doesn't hype garbage. If you've been in the indie hacker community for more than five minutes, you know this. So I dug in.

Here's the thing about Claude Code that's different: it's this agentic coding thing that runs in your terminal. Unlike Copilot — which suggests lines like an overeager pair programmer who's had too much coffee — Claude Code operates at the task level. You describe what you want, and it reads your entire codebase, plans changes, and executes them across multiple files without asking you to hold its hand.

Setup took 12 minutes:


npm install -g @anthropic-ai/claude-code
claude

No API keys to juggle. No VS Code extension conflicts where half your plugins suddenly break. Just a terminal prompt: "What would you like to work on?"

I typed "everything" as a joke.

It didn't laugh.

The 48-Hour Refactor Timeline

Hour 0-4: The Audit That Paid for Itself

I started with one command:

"Audit my entire codebase for performance issues, unused code, and architectural problems. Show me the 10 biggest problems."

Claude Code scanned 47,316 lines across 312 files. Took about 4 minutes.

It returned:

  1. N+1 queries in 7 API endpoints (costing 1.2s per request — I'd somehow never caught this)
  2. Unused dependencies: 23 npm packages never imported. Twenty. Three. I'd been shipping those to production for 18 months
  3. Duplicate utility functions: Same date formatting logic copy-pasted across 6 files like I was getting paid per line
  4. Missing indexes: 4 database tables without proper indexing
  5. Client-side rendering: 3 components that should definitely be server-rendered

The audit alone justified the $200 in API credits I burned through. I'd been paying $79/month for a monitoring tool (won't name names, but you probably know the one — rhymes with "DataDog") that never surfaced half of this.

Well... that's complicated. It probably could have surfaced it. I just never configured the damn thing properly. Moving on.

Hour 4-12: The Great Cleanup

I told Claude Code:

"Remove all unused dependencies, consolidate duplicate functions into a shared utils folder, and update all imports."

What happened next was genuinely surreal. My terminal showed Claude Code:

I just watched. Occasionally I'd type "continue" when it asked for confirmation on breaking changes — which happened maybe four times total.

Oh wait, I should back up. It actually asked about a lodash removal that would've broken the billing module. I almost auto-approved it. That would've been bad. More on that later.

First milestone hit at Hour 7: The app still worked. All 142 tests passed. Bundle size dropped from 2.7MB to 1.9MB.

I celebrated with cold pizza. Living the dream over here.

Hour 12-24: Database Revenge

The N+1 queries were absolutely destroying performance. Here's what one endpoint looked like:


// BEFORE: 47 separate queries
const orders = await db.orders.findAll({ userId });
for (let order of orders) {
 order.items = await db.items.findAll({ orderId: order.id });
 for (let item of order.items) {
 item.product = await db.products.findByPk(item.productId);
 }
}

Forty-seven queries. For one page load. I'm genuinely embarrassed looking at this now.

I gave Claude Code the entire /api/ directory and said:

"Fix all N+1 queries using eager loading. Add database indexes where needed. Show me the query count before and after for each endpoint."

It rewrote 7 endpoints using proper JOINs and .include() statements. Then generated migration files for the missing indexes. I've written maybe three database migrations in my life, and this thing spat out four in about 90 seconds.

Second milestone at Hour 18: Dashboard load time dropped from 4.2s to 1.1s. I said "holy shit" out loud. My cat was unimpressed. My cat is never impressed.

Hour 24-48: The Risky Stuff

Emboldened by success — probably too emboldened — I decided to tackle the architecture. My app was a classic monolithic mess with business logic tangled up in route handlers like Christmas lights after a toddler's been at them.

Claude Code proposed:

  1. Extract a services/ layer for all business logic
  2. Create a repositories/ layer for database access
  3. Keep routes/ thin (max 20 lines per file)
  4. Move 3 heavy components to server-side rendering

This was terrifying. I was asking an AI to restructure my entire codebase. Users were actively on the platform. I think I stress-ate an entire bag of tortilla chips during hour 26. The big bag. The one that says "family size" but let's be honest, it's for one person having an existential crisis.

I deployed a maintenance window banner ("Scheduled improvements: Back in 2 hours") and let Claude Code run.

Hour 36: Tests started failing.

The services layer had broken authentication middleware. Before I could even parse the error log, Claude Code identified the issue:

"The auth middleware is being imported before the services layer initialises. I need to refactor the import order in app.js. Fixing now."

It fixed itself.

I was just... there. Being the human who presses "approve." It felt weird, honestly. I've been coding for seven years and suddenly I'm the bottleneck on my own project because I can't read error logs fast enough. That's a strange kind of existential crisis.

Hour 44: All tests passed. 142 passing, 0 failing.

Hour 48: I deployed to production. Sweating. Refreshing the dashboard. Watching logs like a hawk.

Nothing broke.

I stared at the ceiling for a while.

The Numbers Don't Lie

Two weeks post-refactor, here's what changed:

MetricBeforeAfterChange
Dashboard load time4.2s0.9s-78%
API response time (p95)820ms210ms-74%
Monthly churn6.8%4.1%-39%
Day-3 retention59%71%+20%
Bundle size2.7MB1.4MB-48%

Churn improvement alone translates to roughly $1,400 more MRR retained each month. The refactor cost me $380 in Claude Code API credits and one very caffeinated weekend.

My "developer cost": $380.

Market rate for a senior dev to do the same: $15,000-25,000.

Not having to write a job description or explain my spaghetti code to another human: I mean, what do you even call that? Pride? Dignity? Whatever, it's worth something.

What I'd Do Differently

1. Start smaller. I threw my entire codebase at Claude Code immediately like an absolute maniac. Better approach: refactor one module, deploy it, verify stability, then scale up. I got lucky. You might not.

2. Review more changes manually. Claude Code auto-applied about 80% of changes without asking. For mission-critical paths (billing, auth, anything touching money), I should have forced manual review on every single change. Remember that lodash thing I mentioned? Yeah. That would've broken Stripe integration. I caught it by accident. Pure dumb luck.

3. Write better prompts. "Fix performance issues" is vague garbage. "Identify all API endpoints with >200ms response time and reduce them below 100ms by optimising database queries" is specific. Claude Code's output quality scales directly with prompt quality. This sounds obvious now. It wasn't obvious at 3am.

4. Don't skip the test runs. At Hour 36, I almost skipped a failing test because "it was probably fine." It wasn't fine. The auth middleware was completely broken. Always let the test suite complete. I don't care if it takes 8 minutes. Wait.

5. Communicate with users. I should have sent an email before the maintenance window, not just posted a banner. Got 3 cancellation requests from confused users who thought the service was dying. Lost $150 MRR from poor communication, not poor code.

That one stings to write.

What This Means for Indie Hackers

Pieter Levels famously said "you don't need a co-founder." With Claude Code... you might not need a dev team either.

Look, I'm not saying AI replaces developers entirely. I still needed to understand what to refactor and why. That domain knowledge came from 18 months of building, talking to users, and making mistakes. No AI can shortcut that. Not yet, anyway.

But the how? The tedious file-by-file migration? The 89 import statement updates? The "oh god I have to write another database migration" dread?

That's done. It's just done now.

For bootstrappers like me, this changes the maths completely:

The bottleneck shifts from "can I build this?" to "should I build this?" — which is exactly where we all want to be.

The Real Question

Here's what I'm wrestling with now: If Claude Code can refactor my entire codebase in 48 hours, what happens when my competitors figure this out too?

Speed of execution used to be my edge as a solo founder. Ship faster than the funded teams. Iterate while they're still in meetings. That was the whole game.

Now everyone has access to the same AI tools. The differentiator isn't coding speed anymore. It's understanding your users deeply enough to know what to build before anyone else does. It's taste. It's the conversations you have with customers at weird hours. It's replying to Marcus at 11pm because you genuinely care about his dashboard loading fast.

Marcus emailed me yesterday: "Whatever you did, the app is flying now. Thank you."

I replied: "No problem. My unpaid intern handled it."

He definitely thinks I'm joking.

I'm not joking.

TL;DR / Key Takeaways

Have you tried Claude Code or similar AI agents for production work? I'm genuinely curious where people draw the line between "helpful tool" and "existential risk to my codebase." I've heard horror stories about Cursor deleting entire directories. Drop your wins or disasters in the comments — I actually read every single one, usually at 2am when I should be sleeping.

buildinpublic #claudecode #indiehacker #bootstrapping #aiagents #saas

Lines of code47,31638,920-18%
C

Cael Lee

Full-stack developer with 8+ years of experience. Currently building AI-powered developer tools. I've tested 20+ AI API providers and coding assistants.

Ready to get started?

Get your API key and start building with 180+ AI models.

Get API Key Free