I Let an AI Agent Refactor My Codebase — It Created an Infinite Loop and Nuked My Database
Last week, I pointed Cursor 0.5.2's Agent mode at a 3,000-line payment module and told it to refactor. Thirty seconds later, my PostgreSQL container went OOM, Docker refused to start, and somewhere in the logs was an infinite loop the AI wrote all by itself. My local database? Toast.
That's what I want to talk about today — this love-hate relationship with "autonomous coding robots" that's simultaneously the most impressive and terrifying thing I've used in 15 years of writing software.
TL;DR for the Skimmers
- Cursor 0.5's Agent mode is like giving an AI hands — it creates files, runs terminal commands, installs packages
- When it works, you feel like Tony Stark. When it doesn't, you're debugging hallucinations at 2 AM
- It crushed a 40-minute CLI tool task in 2 minutes 17 seconds
- It also silently ran a database migration at 11 PM without asking me
- Context window issues start around 1,500 lines — beyond that, it "forgets" things
- My rating: 7.5/10. Worth using if you already know what you're doing. Dangerous if you don't.
So What Actually Is This Thing?
Agent mode in Cursor 0.5 is basically giving AI a pair of hands. It doesn't just autocomplete — it creates files, runs shell commands, installs npm packages, and even executes git commits.
Sounds like sci-fi? In practice, it's more like a comedy. With occasional horror elements.
I've started calling it "intern mode": incredibly fast worker, but you absolutely cannot look away, or it'll plant landmines in your codebase.
Actually — that metaphor isn't quite right. Interns at least know when they're out of their depth and ask questions. This thing? It'll just confidently do something catastrophically wrong and present it to you like a finished deliverable.
Three Moments That Genuinely Shocked Me
Case 1: It Wrote a CLI Tool While I Drank Coffee
Last Wednesday afternoon, I casually typed: "Write a script to batch compress all images in the public directory."
Normally, that's a 40-minute task: install Sharp, read the docs, write recursive traversal, handle errors, add a progress bar. Boring but necessary.
What Agent mode did:
- npm install sharp (didn't ask me)
- Created scripts/compress-images.js
- Wrote complete async processing logic with proper error boundaries
- Hit a permissions error, then autonomously added chmod
- Generated a report.json after compression finished
Total time: 2 minutes and 17 seconds. I literally sat there drinking coffee watching the terminal flicker like something out of a hacker movie.
Here's the catch — it set image quality to 60% by default. All my product photos came out looking like they'd been through a blender. It never asked me about compression parameters. Just used what "seemed reasonable."
I traced its logic later. It had leaned on a WebP conversion tutorial, a Medium article from March 2020. The Sharp config option quality: 60 came straight from there. A four-year-old tutorial, written for a completely different use case.
This is Agent mode's fundamental tension: the more autonomous it is, the more explicit your constraints need to be.
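One way to make a constraint like that explicit is to refuse to run at all without a stated quality setting. The sketch below is my own illustration, not what the agent generated: resolveCompressionOptions and its option names are hypothetical, and the object it returns would still have to be handed to whatever Sharp call the script makes.

```javascript
// Hypothetical helper: force the caller (human or agent) to state quality explicitly.
function resolveCompressionOptions(userOptions = {}) {
  const { quality, format = "jpeg" } = userOptions;

  // No silent default: a missing quality is an error, not "60 seems reasonable".
  if (quality === undefined) {
    throw new Error("quality is required; refusing to guess a default");
  }
  if (!Number.isInteger(quality) || quality < 1 || quality > 100) {
    throw new RangeError(`quality must be an integer from 1 to 100, got ${quality}`);
  }
  return { format, quality };
}
```

Calling resolveCompressionOptions({ quality: 85 }) returns { format: "jpeg", quality: 85 }, while calling it with no quality throws. The point is that the script fails loudly instead of quietly picking a number from somebody else's tutorial.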
Case 2: The React Refactor That Almost Worked Perfectly
I asked it to convert our class components to functional components. Pretty standard refactor, right?
Here's what it did:
- Auto-scanned all extends Component files
- Identified which ones used lifecycle methods
- Converted componentDidMount to useEffect (dependency arrays were actually correct, which impressed me)
- Found a component using shouldComponentUpdate and automatically wrapped it with React.memo
- Then it ran the test suite, discovered 3 failing cases, and went back to fix them
When I saw step 5, I won't lie — I got nervous. This wasn't behaving like a tool anymore. It was behaving like a colleague. The quiet type who just codes without saying anything.
But here's where it imploded: it destroyed a component that depended on this.forceUpdate because it didn't understand that the component was using MobX observable state. The AI saw forceUpdate and assumed it was a performance optimization. Deleted it. The entire page's reactive updates broke, and I spent two hours debugging.
Lesson burned into my brain: Agent mode is a disaster with implicit dependencies. Any logic that lives in runtime behavior rather than comments or explicit patterns? The AI is completely blind to it.
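A cheap safeguard is to scan for code that hints at hidden runtime behavior before the agent touches a file, then review those files by hand. This is a rough sketch under my own assumptions: the function name and the pattern list are invented for illustration, and a regex scan will miss plenty of cases.

```javascript
// Patterns that often signal behavior a static refactor cannot see.
// This list is illustrative, not exhaustive.
const RISKY_PATTERNS = [
  { name: "forceUpdate", regex: /\.forceUpdate\s*\(/ },
  { name: "mobx import", regex: /from\s+["']mobx(-react(-lite)?)?["']/ },
  { name: "observable", regex: /\bobservable\b/ },
];

// Returns the names of the risky patterns found in one file's source text.
function findImplicitDependencyHints(source) {
  return RISKY_PATTERNS
    .filter(({ regex }) => regex.test(source))
    .map(({ name }) => name);
}
```

Any file that comes back with a non-empty list goes on the "do not let the agent refactor this unattended" pile. It would have flagged my MobX component in about a millisecond.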
Case 3: The 30 Seconds of Database Terror
This was the scariest moment. 11 PM, I was tired, and I muttered: "Rename the password field to password_hash in the user table and update all related queries."
Agent mode:
- Created a migration file (Sequelize format, correctly structured)
- Scanned all SQL queries — including concatenated strings
- Modified the field name across 17 files
- Then it automatically ran the migration
Wait. I did NOT tell you to run that.
It took me 30 seconds to realize what had happened: auto-execute was enabled by default. Luckily, I was running PostgreSQL in Docker, and it only renamed the column rather than dropping it. But my blood ran cold.
I killed that setting immediately. Never let AI auto-execute database operations. That's not a guideline. That's a hard rule.
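If you wrap your own scripts or task runners, the same hard rule can be enforced in code. What follows is a minimal sketch, not a Cursor feature: requiresApproval, assertAllowed, and the pattern list are hypothetical names I made up, and the patterns only cover the kinds of commands from this story.

```javascript
// Commands that must never run without a human saying yes.
// The patterns are examples; extend them for your own stack.
const NEEDS_APPROVAL = [
  /\bdb:migrate\b/, // e.g. sequelize-cli migrations
  /\bdb:seed\b/,
  /\b(drop|truncate|alter)\s+table\b/i,
];

function requiresApproval(command) {
  return NEEDS_APPROVAL.some((pattern) => pattern.test(command));
}

// Returns the command if it is harmless or explicitly approved; throws otherwise.
function assertAllowed(command, { approved = false } = {}) {
  if (requiresApproval(command) && !approved) {
    throw new Error(`Blocked pending human approval: ${command}`);
  }
  return command;
}
```

With this in place, assertAllowed("npx sequelize-cli db:migrate") throws, and the same call with { approved: true } passes the command through. A deny-by-default list is crude, but crude is fine at 11 PM.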
The Minefield: Traps I've Already Fallen Into
Trap 1: Context Window Amnesia Is Real
Agent mode starts "forgetting" parts of your file once you cross about 1,500 lines. It'll be deep in a refactor, forget there was an import at the top of the file, and add it again.
The result? The same dependency imported three times. React imported twice. The build exploded. My terminal was just... red. An entire screen of Vite error messages.
The fix: Split large files before handing them to Agent mode. Don't expect it to self-refactor a monolith. From my testing, 1,500 lines is the safety line — beyond that, context starts fragmenting. I think it depends on complexity though. A 1,500-line config file? Probably fine. 1,500 lines of nested business logic? You're asking for trouble.
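To find the files worth splitting first, a small line-count check does the job. This is a sketch with hypothetical helper names; the 1,500 budget is just my rule of thumb from above, not a documented limit of the tool.

```javascript
const fs = require("node:fs");

// 1,500 is the rule of thumb from this post, not a documented limit.
const LINE_BUDGET = 1500;

function countLines(text) {
  if (text.length === 0) return 0;
  // Ignore one trailing newline so "a\nb\n" counts as 2 lines.
  const body = text.endsWith("\n") ? text.slice(0, -1) : text;
  return body.split("\n").length;
}

// Returns [{ file, lines }] for every file over the budget, largest first.
function findOversizedFiles(filePaths, budget = LINE_BUDGET) {
  return filePaths
    .map((file) => ({ file, lines: countLines(fs.readFileSync(file, "utf8")) }))
    .filter(({ lines }) => lines > budget)
    .sort((a, b) => b.lines - a.lines);
}
```

Feed it the list of files you plan to hand over and split whatever it reports. Line count is a blunt proxy for complexity, so treat a hit as "look closer", not as a verdict.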
Trap 2: It Never Says "I Don't Know"
This is the most insidious one. Ask it to integrate an obscure library — say, some GitHub project with 50 stars — and it won't tell you "I'm not familiar with this."
It'll hallucinate an API.
I ran into this with the pinyin-engine library. It called the non-existent method engine.searchByTone() with such confidence. The code looked legitimate — it even added JSDoc comments. Complete fabrication. I ran tests three times, kept getting undefined is not a function, and it kept adjusting parameter ordering instead of admitting the method didn't exist.
The fix: For unfamiliar libraries, make it write tests first. If tests fail, it's hallucinating. This trick has saved me more times than I can count.
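The smallest version of that test just asks whether the methods the agent wants to call actually exist. Here is a sketch using a stand-in object: missingMethods is a hypothetical helper, and fakeEngine is invented for the example rather than a claim about any real library's API.

```javascript
// Returns the names from `expected` that are not callable on `target`.
function missingMethods(target, expected) {
  return expected.filter((name) => typeof target?.[name] !== "function");
}

// Stand-in for a real module; in practice this would be the object you get
// from require("some-library"). Its shape here is invented for the example.
const fakeEngine = {
  query() {
    return [];
  },
};

const missing = missingMethods(fakeEngine, ["query", "searchByTone"]);
if (missing.length > 0) {
  console.log(`Not on the real object: ${missing.join(", ")}`);
}
```

Run that against the real import before accepting any generated integration code. If a method shows up in the missing list, no amount of parameter reshuffling is going to fix it.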
Trap 3: Terminal Commands With Way Too Much Confidence
Agent mode loves rm -rf. Really loves it.
Once, trying to clean build caches, it ran:
rm -rf ./dist/*
The problem? My dist directory was a symlink pointing to another project's build artifacts. It nearly deleted three months of work. I still feel sick thinking about it.
The fix: Explicitly forbid dangerous commands in .cursorrules. Or disable auto-execute. I did both. No shame in belt-and-suspenders when the belt is AI and the suspenders are your career.
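For cleanup scripts you control, a third layer is to check for symlinks before deleting anything. A minimal sketch, assuming Node's built-in fs module; isRealDirectory and cleanBuildDir are hypothetical names, and this only guards the top-level path you pass in.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// True only when `dir` is a real directory, not a symlink to one.
function isRealDirectory(dir) {
  // path.resolve drops any trailing slash, which would otherwise make
  // lstat follow the link on POSIX systems.
  const stats = fs.lstatSync(path.resolve(dir));
  return stats.isDirectory() && !stats.isSymbolicLink();
}

// Empties a build directory, but refuses if the path is a symlink.
function cleanBuildDir(dir) {
  const target = path.resolve(dir);
  if (!isRealDirectory(target)) {
    throw new Error(`Refusing to clean ${target}: not a real directory (symlink?)`);
  }
  for (const entry of fs.readdirSync(target)) {
    fs.rmSync(path.join(target, entry), { recursive: true, force: true });
  }
}
```

Pointed at my symlinked dist, cleanBuildDir would have thrown instead of reaching into the other project. It is a few extra lines compared with a one-line rm, and that trade seems worth it.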
Cursor Agent vs. Copilot vs. Cline (Honest Comparison)
| Tool | What It Feels Like | Best For |
|---|---|---|
| GitHub Copilot | Fancy autocomplete | Writing new features, when you're feeling lazy |
| Cline (VS Code) | Agent mode but chatty | People who want AI to explain every step |
| Cursor Agent | Silent doer | Those brave enough to let AI drive |
Cael Lee
Full-stack developer with 8+ years of experience. Currently building AI-powered developer tools. I've tested 20+ AI API providers and coding assistants.