I Let AI Code My Project in 10 Minutes. Then Spent 100 Minutes Cleaning Up Its Mess

Last Wednesday afternoon, I was tweaking a messaging middleware tool.

Claude Code spit out the entire codebase in 10 minutes. Looked legit—interfaces, protocols, message queues, all the parts were there. I ran it. Crashed immediately. Half the protocol fields were missing, interfaces didn't match, message formats were completely incompatible. Whatever, I thought. Everyone online says don't iterate on the original code. Just regenerate the whole project from scratch. One shot.

That advice is pure garbage.

Seriously.

Regenerating made everything worse. The missing details from my original spec meant the middleware was fundamentally mismatched. Every regeneration needed more details filled in. And the more I added, the more gaps appeared. I ended up spending more time—way more time than old-school coding—reviewing and adapting the output. Why? Because you have zero visibility into what the AI changed behind the scenes. You're back to debugging blind.

In traditional development, iterative fixes take minutes. You know exactly what you changed and what's affected. In vibe coding, you're flying blind.

What really broke me was the harness testing framework post version 4.8.

Ten minutes to generate code. Then 100 minutes wrestling with the harness. The AI kept sneakily injecting weird test code, some of it invasive, with a sky-high failure rate. Eighty bucks' worth of tokens, and $64 of it (80%) burned right here.

Ridiculous.

Now when I see Opus running harness tests, I kill the process immediately. Tell it to go cool off. Some people think, "Hey, at least I'm not writing code myself. Worst case I just wait a bit longer." Are you kidding me? Six hours of vibe coding, then you realize the requirements need changing, and you wait another six hours?

Fast iteration is almost always more important than whether something can be built at all.

You Think It's Your Assistant. It's Actually a Mystery Box.

Look, I've been coding for a decade. jQuery to React. Monoliths to distributed systems. I've seen some things. But vibe coding? It's humbled me. Multiple times.

Here's the scenario: You ask AI to generate a project. It hands you a pile of code. Looks complete. Then you run it—parameter types don't match somewhere, a protocol field is AWOL, boundary conditions aren't handled. You tell it to fix the issue. It does. But now the parts that worked before are broken.

Why? Because you have no clue what it actually changed.

In traditional development, you fix a bug, you know the file, the function, the blast radius. In vibe coding, the AI might modify three files, five functions, ten variables. You're none the wiser. You're back to debugging from scratch.

It's like asking someone to tidy your room, and when they're done, you can't find your car keys. You ask where they put them. They shrug. You have to tear the room apart again yourself.

That's the fundamental problem. It's not an assistant. It's a mystery box.
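One way to make the box a little less mysterious is to record what the project looked like before the AI touches it, then compare afterwards. Below is a minimal sketch of that idea, not a definitive tool. The function names `snapshot` and `diff_snapshots` are made up for illustration. If the project is already under version control, `git status` and `git diff` give you the same answer with more detail.

```python
import hashlib
from pathlib import Path


def snapshot(root: str) -> dict[str, str]:
    """Map each file's path (relative to root) to a SHA-256 digest of its contents."""
    base = Path(root)
    digests = {}
    for path in sorted(base.rglob("*")):
        rel = path.relative_to(base)
        # Skip directories and anything inside a .git folder.
        if not path.is_file() or ".git" in rel.parts:
            continue
        digests[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report which files were added, removed, or modified between two snapshots."""
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "modified": sorted(p for p in shared if before[p] != after[p]),
    }
```

Take a snapshot, let the AI apply its "fix", take another, and read the diff before you run anything. It only tells you which files changed, not why, but that alone narrows the blast radius.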

Why Regenerating Is Actually Worse Than Iterating

There's this popular advice floating around: don't iterate on vibe-coded projects, just regenerate the whole thing.

I tested this.

For real.

The result? Regeneration created more problems. The details missing from my previous spec meant the generated middleware came out fundamentally incompatible: interfaces and message formats didn't line up. Every regeneration forced me to fill in more gaps, and the more I regenerated, the more detail problems surfaced.

Let me break this down:

You describe 80% of your requirements the first time. AI generates code. You discover the 20% gap. So you add more detail and regenerate. But the spec is longer now, and this time the AI only absorbs maybe 70% of it. Now you're missing 30%. The more you add, the more it misses.

Why? Because a from-scratch regeneration has no memory. It doesn't remember what you said last time, what got changed, what got fixed. Every regeneration starts from zero.

Then there's the visibility problem again. In traditional development, you know what you changed. After a regeneration, you can't see what the AI touched or why, so you're back to debugging from scratch.

And the testing cost. Post-4.8, the harness framework is a disaster zone. Ten minutes generating code, 100 minutes on testing, 80% of your token budget burned on tests. The AI keeps injecting bizarre test code—sometimes invasive stuff—with a terrifying failure rate.

Stack these three problems together, and vibe coding's efficiency is way below traditional development.
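For the spec-gap problem specifically, one thing that can help is pinning the protocol details in a form a machine can check, so every regeneration gets judged against the same list instead of against your memory. Here's a rough sketch under stated assumptions: the field names in `REQUIRED_FIELDS` and the `check_message` helper are hypothetical, and a real middleware spec would be longer and come from your own protocol document.

```python
# Hypothetical field list for illustration only; replace with your real protocol spec.
REQUIRED_FIELDS = {
    "message_id": str,
    "topic": str,
    "payload": bytes,
    "timestamp_ms": int,
}


def check_message(message: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the message conforms."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in message:
            problems.append(f"missing field: {name}")
        elif not isinstance(message[name], expected_type):
            actual = type(message[name]).__name__
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}, got {actual}"
            )
    return problems
```

Run the generated code's output through a check like this after every generation. It won't close the gaps in the spec for you, but a dropped field shows up immediately instead of after the crash.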

Old-School Coding vs. Vibe Coding

Post-4.8 really burned me. How did this methodology get so hyped? People imagining cloud code and cloud engineering that doesn't exist? Whether it ever becomes useful or not, right now on 4.8, it's a complete mess.

Vibe coding is practically becoming synonymous with programming itself. If you still write code line by line, people call it "ancient method programming." But this "ancient" method isn't actually ancient. Vibe coding only emerged in the last two years, and it only really hit the mainstream in 2025.

I've even seen people claim that Cursor is the last stand of old-school programmers. The implication being that programmers shouldn't look at code at all—just use Codex or Claude Code, express every programming thought through natural language.

As if that's the normal way now.

Nonsense.

Seriously.

Software development's 70-year history is essentially a history of efficiency improvements. Machine code to assembly. C to Python. On-prem to cloud. Every technological leap, someone declares programmers are going extinct. What actually happened? Compilers didn't eliminate programmers. Python didn't eliminate C programmers. Excel didn't eliminate accountants. Cloud didn't eliminate IT.

Efficiency gains never reduce demand. They unlock demand.

Before, the barriers were syntax memorization, algorithm writing, debugging skills. AI lowered those barriers. But the barriers didn't disappear—they moved.

Now the barriers are: Is AI-generated code correct? Is the architecture sound? Where are the performance bottlenecks? What security vulnerabilities exist? These judgments require deeper expertise than writing code itself.

AI excels at rapid code generation, boilerplate, unit tests, refactoring suggestions. But AI is terrible at defining the right problem. It replaces coding, not software engineering.

Architectural Taste: AI Just Doesn't Have It

Speaking of architecture—this gets interesting.

I've been saying vibe-coded architecture is garbage, a Frankenstein's monster of code. But if you'd asked me why, I couldn't have articulated it until recently. Vibe-coded projects don't feel unstructured. Individual sections are sometimes clever, even well-designed. But stitched together, it's still a mess.

Then I saw a comment that made it click.

Engineering projects are like art or novels. Paint long enough and you develop a style. Write enough and you develop a voice. Even chess or esports—do it long enough and you develop a recognizable personal style. You see a piece of work and instinctively know who created it.

Engineering projects have a dominant design sensibility. That sensibility stays remarkably stable across the entire lifecycle. No matter how the project evolves, that core taste remains consistent. A mature engineering taste is incredibly important—incredibly beneficial—for long-term iteration.

Current AI has essentially zero taste. It mashes together a bunch of people's approaches into one incoherent blob. The result is like watching a committee play chess. It can win, but it's ugly, and who wins was never the only reason we watch people play.

Think about it. Isn't that exactly right?

Two things happened in tech recently that collided in my head.

An indie developer wrote an article with a title that really stung: "You Can't Write Unit Tests for Taste." Around the same time, a product that claimed to be vibe-coded from scratch got exposed as a clone.

These seem unrelated. They're actually about the same thing: vibe coding is a year old now, and the bill is coming due.

When AI drops the barrier to making something to near zero, what becomes truly scarce—truly valuable—isn't knowing how to code. It's judgment and taste. Knowing what to build, what's correct, what just looks like it works but is actually a hollow shell.

I went and read that article. The author was building a running app and wanted to add points of interest along the route map. He grabbed a public geographic dataset, pulled in AI, built a data processing pipeline in Python.

Here's what he found. His words: "I thought AI would be the main character in this feature. Turns out it was just a supporting actor."

Why? Because deciding which locations are interesting is pure judgment. A road might have hundreds of place names. AI can list them fast. But which ones would a tourist actually care about? Which is just some tiny nowhere spot? Which sounds like a scenic overlook but is actually a gas station? AI was terrible at these tradeoffs—and occasionally invented fictional places entirely.

He said the whole process was wrestling with taste and bias, fighting an AI prone to hallucination. What actually solved the problem wasn't AI—it was his own judgment about what counts as interesting, plus a ton of boring data cleaning.

That's exactly what "you can't write unit tests for taste" means.

Code correctness? Write tests, run them, red means broken. But taste? Can't automate that verification.
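The split is easy to see in the running-app story. Whether a suggested place exists in the dataset is checkable. Whether it's interesting is not. The sketch below is my own illustration, not the article author's pipeline: the function name, the place names, and the hand-curated `approved` set are all hypothetical.

```python
def pick_points_of_interest(
    suggested: list[str],
    dataset_names: set[str],
    approved: set[str],
) -> tuple[list[str], list[str]]:
    """Split AI suggestions into places to keep and places that were invented.

    Existence in the dataset is checked by code. "Interesting" is not computed
    anywhere: it comes from the human-curated `approved` set.
    """
    # Anything the AI suggested that is not in the dataset counts as invented.
    invented = [name for name in suggested if name not in dataset_names]
    # Of the real places, keep only the ones a person decided are worth showing.
    kept = [name for name in suggested if name in dataset_names and name in approved]
    return kept, invented
```

A test can assert that `invented` comes back empty, and that catches hallucinated places. No assertion can tell you what belongs in `approved`. That part is still a person reading the list and deciding.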

I Was Wrong Before

Speaking of testing—here's something even more absurd.

A few years back, internet companies went through layoffs. The first to go? QA testers. In 2021, I watched a testing center get dismantled at my company. The whole team, basically gone. Except for core consumer-facing products which kept a skeleton crew, everything else shifted to developer self-testing.

Don't ask if it worked out. It just happened.

Nobody—nobody—could have predicted that in the vibe coding era, everyone would become a tester.

For real.

Think about it. Would you ship AI-generated code straight to production? Of course not. You have to test it. But you don't know what the AI changed, so you have to do full regression testing. Before, professional testers had your back. Now? You are the tester.

The irony is brutal.

I used to think vibe coding would lower testing costs. It does the exact opposite. Because AI-generated code has such high uncertainty, you're forced into more comprehensive testing. And the AI keeps sneaking weird stuff into test code too, making the tests themselves something you need to test.

Backfired spectacularly.

If You Still Want to Use Vibe Coding

I've said a lot. Let me summarize—actually, let's call this an anti-pattern checklist. But I'm not numbering these. Just some thoughts:

Don't believe the "regenerate from scratch instead of iterating" advice. It's nonsense. For communication middleware projects, regeneration almost always reveals missing details from the previous spec. The more you regenerate, the more gaps appear.

Post-4.8 harness engineering? Avoid it if you can. Ten minutes of code generation, 100 minutes of testing, 80% of tokens burned on tests with a terrifying failure rate.

AI-generated architecture looks good locally, but stitched together it's a mess. Because AI has no engineering design taste—it's mashing together random people's approaches.

The general-purpose tool market is dead. Don't follow the herd building yet another universal AI tool. Guaranteed losses. I've seen too many people vibe-code generic AI toolboxes. Six months ago you could capture organic traffic and monetize. Now hundreds of newbies clone the same tool daily. You spend 3 days building, someone else clones it in 1 day. To compete, everyone races to the bottom on pricing. What was bringing in thousands a month now can't sell for twenty bucks.

Platform compliance and regulatory barriers keep climbing. AI features require business entity registration, and individual developer accounts get rejected outright. Privacy laws like GDPR mandate privacy policies and data deletion channels, AI-generated content brings its own copyright issues, and fines run into the thousands of euros. Newbies have zero ability to handle this.

AI is an amplifier, not a replacement. It amplifies whatever you already have. Right direction, huge gains. Wrong direction, huge waste.

The hardest part of software development was never writing code. It's taking fuzzy business requirements and turning them into precise technical implementations. AI can't help you with that.

When AI drops the barrier to creating something to near zero, what becomes truly scarce is judgment and taste.

In the vibe coding era, everyone becomes a tester.

Here's something that's been sitting with me. Supabase—the infrastructure layer propping up vibe coding projects—is now valued at $10 billion. Their database deployments spiked 600%. But behind the shiny numbers, nobody knows the ratio of active paying users to zombie projects—applications created in a burst of enthusiasm and never touched again.

GitHub hosts hundreds of millions of repositories. A huge chunk are code graveyards—projects committed once and abandoned forever.

Supabase faces the same question.

Then there's the security issue. Supabase relies heavily on Postgres Row Level Security (RLS), which governs which rows each user can read or write. But in vibe coding, AI agents typically disable RLS on a table just to get the project running smoothly, or write policies with security holes. Users without security awareness can't spot this.

The result? A massive number of AI-generated Supabase backends are essentially running with their data exposed.
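If you want a quick sanity check on your own project, you can scan the SQL the AI wrote for tables that end up without RLS. This is a crude sketch, assuming your schema lives in plain SQL migration text. It's a regex heuristic, not a SQL parser: it treats `public.users` and `users` as different names, ignores comments, and only catches the "RLS is off" case. A policy that is enabled but full of holes will sail right through. The function name `tables_without_rls` is made up for illustration.

```python
import re

# Matches: CREATE TABLE [IF NOT EXISTS] name
CREATE_RE = re.compile(
    r'create\s+table\s+(?:if\s+not\s+exists\s+)?([\w."]+)',
    re.IGNORECASE,
)
# Matches: ALTER TABLE [IF EXISTS] [ONLY] name ENABLE|DISABLE ROW LEVEL SECURITY
RLS_RE = re.compile(
    r'alter\s+table\s+(?:if\s+exists\s+)?(?:only\s+)?([\w."]+)\s+'
    r'(enable|disable)\s+row\s+level\s+security',
    re.IGNORECASE,
)


def _normalize(name: str) -> str:
    """Drop double quotes and lowercase so "Payments" and payments compare equal."""
    return name.replace('"', "").lower()


def tables_without_rls(sql: str) -> list[str]:
    """Return tables created or altered in the SQL text that end with RLS not enabled."""
    events = []
    for m in CREATE_RE.finditer(sql):
        events.append((m.start(), _normalize(m.group(1)), "create"))
    for m in RLS_RE.finditer(sql):
        events.append((m.start(), _normalize(m.group(1)), m.group(2).lower()))

    enabled: dict[str, bool] = {}
    # Replay statements in the order they appear in the text.
    for _, table, action in sorted(events):
        if action == "create":
            enabled.setdefault(table, False)  # Postgres tables start with RLS off
        else:
            enabled[table] = action == "enable"
    return sorted(table for table, on in enabled.items() if not on)
```

Feed it the concatenated migration files. Any table it prints deserves a manual look before the app goes anywhere near real users.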

These risks haven't exploded yet. But if they're not fixed, the blowback is coming.

That's about it, I think.

My principle now: simple tasks, use AI. Complex architecture, do it yourself. Don't expect AI to make architectural decisions. Don't expect AI to handle security. Don't expect AI to understand your business logic.

It's a tool. An amplifier.

If you don't know what you're doing, it'll just amplify that too.

That's not a nice thing to say. But the truth usually isn't.

What's been your experience with vibe coding? Hit the comments—I genuinely want to know if you've found a way to make this work, or if you've hit the same walls.

#vibecoding #ai #softwareengineering #programming #tech


Cael Lee
