I Let Cursor 0.5 Agent Build a Feature Solo at 2 AM — Here's What My $10k MRR Taught Me About AI Con

Last Tuesday at 2 AM, I watched Cursor 0.5 Agent eat my entire codebase and spit out garbage. Two days later, it had helped me ship a working feature that 47 users now love. This is the messy reality of "vibe coding" as an indie hacker who can't afford a senior dev.

Actually, wait—I should clarify that "garbage" part. It wasn't all garbage. The first 20 minutes were beautiful. Then it hallucinated a database schema I've never used, imported a library that doesn't exist in my package.json, and renamed my userMetrics variable to metricsUserData for absolutely no reason. Classic context rot.

I've been building in public for 18 months now. My SaaS hit $10k MRR last month—churn finally dropped to 3.2%, CAC holding at $18. Mental to type that, honestly. But here's the thing: I'm still the only engineer. So when Cursor dropped their 0.5 Agent update promising "improved context understanding" and "long task execution," I cleared my calendar.

Pieter Levels talks about shipping fast. I wanted to see if AI could actually ship competently. Not another demo project. Something real that my paying users would interact with.

The Test: A Real Feature, Not Another Bloody Todo App

I picked a feature from my actual backlog: a user-facing analytics dashboard with drill-down capabilities. This isn't a simple CRUD form. It requires:

Aggregation queries across 3 MongoDB collections (users, events, sessions)
Date range filtering with timezone awareness
Nested chart interactions—click a bar, see the details
Responsive layout (mobile users are 31% of my base, according to my Plausible dashboard)

I estimated this would take me 4-5 days of focused work. Probably closer to 5 if I'm honest—I get distracted easily.

Cursor 0.5 Agent? Let's find out.

Day 1: The Context Window Is Both Magic and a Trap

I started by opening Cursor's Agent panel and typing:

"Build a new analytics dashboard page with date range picker, 3 chart components (line, bar, donut), and drill-down tables. Use our existing API patterns from /src/api/metrics.ts. Follow the design system in /components/ui."

First Attempt (Failed): The Agent scanned my repo, found 127 relevant files, and started generating. But 15 minutes in, it lost the plot entirely. The date picker used a completely different pattern than my codebase—it pulled in react-datepicker when I've been using @radix-ui/react-popover with a custom calendar component I built 6 months ago. The API calls ignored my auth middleware entirely. Just... skipped the withAuth wrapper like it didn't exist.

I sat there staring at the screen. 15 minutes of generation. Completely useless.

The Lesson: Context understanding isn't infinite. Cursor 0.5 claims "deep repository awareness", but in my experience the effective context window feels like roughly 2,000-3,000 lines before it starts hallucinating conventions, maybe 4,000 if your codebase is really consistent. That's a gut estimate, not a measurement.

I've seen @levelsio mention this with Bolt—these tools work best when you break tasks into smaller chunks. So I pivoted hard.

Second Attempt (Working): Instead of one massive prompt, I fed it step-by-step:

"Create the page shell and routing following /src/app/dashboard/patterns"
"Add the date range component, reuse our existing DatePicker from /components/ui/datepicker.tsx"
"Now add the line chart using our existing ChartWrapper component"

Each step took 5-10 minutes, and the Agent stayed coherent. By end of Day 1, I had a working skeleton with real data flowing. 3 hours of work done in 35 minutes.

Game changer.

Day 2: Long Tasks Expose the Edge Cases

This is where Cursor 0.5 Agent's "long task" capability got interesting—and properly frustrating.

I asked it to implement the drill-down logic: click a chart element, see filtered data in a table below. This required:

Managing 3 interdependent state objects
Debounced API calls (don't hammer my server, please)
Preserving filter state across navigation
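For the debouncing requirement, here's a minimal sketch of the idea in TypeScript. The names (debounce, fetchDrillDown) and the 250 ms delay are illustrative assumptions, not the code the Agent produced:

```typescript
// Generic trailing-edge debounce: only the last call within `waitMs` runs.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  waitMs: number,
): (...args: Args) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: Args) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
}

// Hypothetical drill-down fetcher; a real one would call the metrics API.
const requested: string[] = [];
const fetchDrillDown = (metricId: string): void => {
  requested.push(metricId);
};

const debouncedFetch = debounce(fetchDrillDown, 250);

// Three rapid clicks produce one request, for the last bar clicked.
debouncedFetch("signups");
debouncedFetch("sessions");
debouncedFetch("events");
```

Inside a React component the debounced function needs a stable identity across renders (for example via useMemo or useRef), otherwise every render creates a fresh timer and nothing is actually debounced.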

The Agent started strong. It generated the state management in one shot—a clean useReducer pattern with typed actions. I was genuinely impressed. Thought to myself, "This is it. This is the future."

Then it hit the third file of changes.

It forgot what it did in the first file. Variables were renamed. selectedMetricId became activeMetricIdentifier. Types mismatched. The build had 23 errors.

Twenty. Three.

I just laughed. What else can you do at that point?

This is the context boundary in practice. Cursor 0.5 doesn't maintain perfect understanding across multiple files in long tasks. It's like a junior dev who's really smart but has short-term memory issues. You know the type—shipping fast, but you review their PR and find console.log("here") on line 47. We've all worked with that person.

I spent 2 hours fixing type errors and reconnecting the state flow. But here's the thing—a junior dev would've taken 2 days. The Agent produced 80% correct code in 25 minutes. My job shifted from "write everything" to "fix the last mile."

By Day 2's end, the drill-down worked. It wasn't elegant. There's a useEffect chain I'm slightly ashamed of—three effects that trigger each other in sequence. I know. I know it's bad. But 47 users are using it right now and haven't complained. Ship first, refactor later. That's the indie hacker way.
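One way to keep interdependent state from drifting, and eventually to retire an effect chain like that one, is to put the dependencies inside a single typed reducer. This is a sketch under assumptions: the state shape and action names are hypothetical (only selectedMetricId comes from my actual code), and the same function could be handed to React's useReducer.

```typescript
// Hypothetical state shape; only `selectedMetricId` is a real name from the post.
interface DashboardState {
  dateRange: { from: string; to: string }; // ISO dates such as "2024-01-01"
  selectedMetricId: string | null;
  tableFilter: { page: number; search: string };
}

type DashboardAction =
  | { type: "dateRangeChanged"; from: string; to: string }
  | { type: "metricSelected"; metricId: string }
  | { type: "searchChanged"; search: string }
  | { type: "selectionCleared" };

const initialState: DashboardState = {
  dateRange: { from: "2024-01-01", to: "2024-01-31" },
  selectedMetricId: null,
  tableFilter: { page: 1, search: "" },
};

function dashboardReducer(
  state: DashboardState,
  action: DashboardAction,
): DashboardState {
  switch (action.type) {
    case "dateRangeChanged":
      // A new range invalidates the current selection and resets paging.
      return {
        dateRange: { from: action.from, to: action.to },
        selectedMetricId: null,
        tableFilter: { ...state.tableFilter, page: 1 },
      };
    case "metricSelected":
      return {
        ...state,
        selectedMetricId: action.metricId,
        tableFilter: { ...state.tableFilter, page: 1 },
      };
    case "searchChanged":
      return { ...state, tableFilter: { page: 1, search: action.search } };
    case "selectionCleared":
      return { ...state, selectedMetricId: null };
    default: {
      // Exhaustiveness check: a new action type without a case fails to compile.
      const unreachable: never = action;
      return unreachable;
    }
  }
}
```

Because every file imports the same DashboardState and DashboardAction types, a rename like selectedMetricId to activeMetricIdentifier in one place shows up as a compile error everywhere else instead of a silent mismatch.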

The Numbers (Because I'm an Indie Hacker and I Track Everything)

I tracked my time against estimated manual effort:

| Task | Estimated manual | Actual with Cursor 0.5 | Savings |
| --- | --- | --- | --- |
| Page shell + routing | 3 hours | 35 minutes | 81% |
| Date range + charts | 8 hours | 2.5 hours (incl. fixes) | 69% |
| Drill-down logic | 6 hours | 4 hours (lots of debugging) | 33% |
| Responsive polish | 4 hours | 1.5 hours | 63% |
| Total | 21 hours | ~8.6 hours | 59% |
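The savings column is just 1 minus actual over estimated, rounded to a whole percent. A small sketch for recomputing it from the raw hours, with illustrative variable names:

```typescript
// Hours per task as [label, estimated manual, actual with the Agent].
const rows: [string, number, number][] = [
  ["Page shell + routing", 3, 35 / 60],
  ["Date range + charts", 8, 2.5],
  ["Drill-down logic", 6, 4],
  ["Responsive polish", 4, 1.5],
];

// Savings as a whole percentage of the manual estimate.
function savingsPercent(estimated: number, actual: number): number {
  return Math.round((1 - actual / estimated) * 100);
}

let totalEstimated = 0;
let totalActual = 0;
for (const [label, estimated, actual] of rows) {
  totalEstimated += estimated;
  totalActual += actual;
  console.log(`${label}: ${savingsPercent(estimated, actual)}%`);
}

console.log(
  `Total: ${totalEstimated}h estimated, ${totalActual.toFixed(1)}h actual, ` +
    `${savingsPercent(totalEstimated, totalActual)}% saved`,
);
```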

But here's the real metric: time-to-ship dropped from 5 days to 2 days. That matters more than hours saved. I deployed on Wednesday instead of the following Monday. The feature started collecting usage data immediately. Real users. Real feedback.

One concerning number: I introduced 2 minor bugs that my users found within 24 hours. Both were edge cases the Agent missed—empty states when a user had zero events in the selected range, and timezone offsets where UTC+5:30 users saw dates shifted by one day. My QA process needs to evolve when AI writes the code.

Well... that's complicated. I don't really have a QA process. I am the QA process. And I'm rubbish at it at 1 AM.
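Both bugs are easy to reproduce in isolation. A common cause of a one-day shift like the UTC+5:30 one is bucketing events by their UTC date instead of the user's local date; that's an assumption about this class of bug, not a post-mortem of mine. A minimal sketch, with hypothetical names (localDayKey, countByLocalDay, MetricEvent):

```typescript
// Calendar day of an instant as seen in a given IANA time zone, as YYYY-MM-DD.
function localDayKey(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(instant);
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  return `${get("year")}-${get("month")}-${get("day")}`;
}

// Hypothetical event shape; the real documents live in MongoDB.
interface MetricEvent {
  occurredAt: Date;
}

// Count events per local day. An empty input yields an empty Map,
// which the UI should render as an explicit "no events" state.
function countByLocalDay(
  events: MetricEvent[],
  timeZone: string,
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const event of events) {
    const key = localDayKey(event.occurredAt, timeZone);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// 20:00 UTC on 14 Jan is already 01:30 on 15 Jan in India (UTC+5:30).
const lateEvening = new Date("2024-01-14T20:00:00Z");
console.log(lateEvening.toISOString().slice(0, 10));   // "2024-01-14" (UTC day)
console.log(localDayKey(lateEvening, "Asia/Kolkata")); // "2024-01-15" (local day)
```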

What I Learned About Cursor 0.5 Agent's Capability Boundaries

Where It Excels:

Pattern matching: If your codebase has consistent conventions, it replicates them eerily well. I use a specific pattern for API routes—/api/[resource]/[action] with a typed handler wrapper—and it nailed this every single time.
Boilerplate generation: Forms, CRUD operations, standard layouts—it's 3x faster than me. Probably more like 4x if I'm being honest with myself.
Single-file complexity: Give it one file and a clear spec, it writes production-ready code. I had it refactor a 400-line utility function and it produced cleaner code than I would have. Annoying but true.

Where It Fails:

Cross-file state management: After ~2,500 lines of context, coherence degrades. Variables get renamed. Imports go missing. Types drift into nonsense.
Edge case handling: Empty states, error boundaries, loading skeletons—you'll add these manually. Every. Single. Time. The Agent just doesn't think about what happens when things go wrong.
Architectural decisions: It won't question your patterns, even when they're wrong. I asked it to add a feature using a pattern I now realise is terrible, and it just... did it. No pushback. No "hey, maybe we should reconsider this approach." Just obediently built the wrong thing.

The "Junior Dev" Analogy Actually Holds

I've mentored 3 junior developers over the years. Cursor 0.5 Agent is like a talented intern who:

Ships fast on well-defined tasks
Needs explicit instructions for new patterns
Can't see the big picture across files
Requires code review on absolutely everything

The difference? The intern costs £60k/year and needs 6 months to ramp up. Cursor costs $20/month and ramped up in 20 minutes.

I don't know how to feel about that. Genuinely.

What I'd Do Differently Next Time

Looking back at my 2-day build sprint, three things stand out:

1. Start with an architecture doc, not a prompt. Before opening Cursor, I should've written a 200-word spec: state shape, component tree, data flow. The Agent would've made fewer architectural mistakes if I'd front-loaded the context. I actually started doing this for my next feature and it cut the debugging time in half. Funny how basic planning helps even with AI.

2. Commit more granularly. I let the Agent run for 45 minutes on the drill-down feature without committing. When it went sideways, I had no clean rollback point. Rookie mistake. Now I commit after every logical unit, even if it's just a working component shell. git commit -m "wip: chart component renders with mock data" is my new best friend.

3. Write tests first (I know, I know). The 2 bugs my users found? A 10-minute test would've caught both. I'm now using Cursor to generate test skeletons before implementation—describe("drill-down", () => { it("handles empty state") })—and filling them in as I go. It's slower upfront but catches the context-switching errors. My test suite went from 12% coverage to... 14%. Baby steps. Don't judge me.
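Here's roughly what that skeleton looks like once it's filled in. This sketch uses Node's built-in node:test runner so it runs without extra dependencies; the describe/it shape is the same in Jest or Vitest. The helper toDrillDownView and its message text are hypothetical, made up for illustration:

```typescript
import { describe, it } from "node:test";
import { strict as assert } from "node:assert";

// Hypothetical view-model helper: decides what the drill-down table shows.
type Row = { day: string; count: number };
type DrillDownView =
  | { kind: "empty"; message: string }
  | { kind: "rows"; rows: Row[] };

function toDrillDownView(rows: Row[]): DrillDownView {
  if (rows.length === 0) {
    return { kind: "empty", message: "No events in the selected range" };
  }
  return { kind: "rows", rows };
}

describe("drill-down", () => {
  it("handles empty state", () => {
    assert.deepEqual(toDrillDownView([]), {
      kind: "empty",
      message: "No events in the selected range",
    });
  });

  it("passes rows through when there is data", () => {
    const rows = [{ day: "2024-01-15", count: 3 }];
    assert.deepEqual(toDrillDownView(rows), { kind: "rows", rows });
  });
});
```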

The Bigger Picture for Indie Hackers

I've been thinking about a tweet from @damengchen: "AI coding tools make 10x developers out of 1x developers."

I think that's wrong. Completely wrong.

They make 2x developers out of 0.5x developers. Maybe 3x if you really know your stack inside and out.

If you already understand your stack, Cursor 0.5 Agent is a force multiplier. But if you don't know React—if you can't spot a missing dependency in a useEffect or recognise when state should be lifted—you'll ship bugs faster than you can fix them. The tool doesn't replace judgement. It amplifies it. For better and worse.

For bootstrapped founders like me, that's still revolutionary. I shipped a feature in 2 days that would've taken 5. My $10k MRR product just got better while I slept. And I didn't have to hire, raise funding, or sacrifice equity. That's the dream, isn't it?

Saw a thread on Indie Hackers last week where someone asked if AI tools make solo founders obsolete. The opposite. They make solo founders dangerous. One person, $20/month, shipping at the speed of a small team. The leverage is absurd.

The indie hacker advantage isn't that we're better coders. It's that we can ship good enough fast, learn from real users, and iterate. Cursor 0.5 Agent fits that philosophy perfectly—as long as you understand where its context ends and yours begins.

And the context does end. It ends hard. Usually around 2,500 lines. Sometimes less if you're unlucky.

I'm building MicroMetrics in public. $10,240 MRR, 100% bootstrapped. Follow the journey here or catch me in the comments—I respond to every one. Even the mean ones. Especially the mean ones, actually.

What's your experience with AI coding tools? Hit context limits yet? I'm especially curious if anyone's tried Cursor with a really large codebase—50k+ lines. Does it fall apart completely or does it somehow hold together? Drop a comment below. I genuinely want to know.

#ai #webdev #indiehacker #programming #cursor



Cael Lee
