Claude Code 源码泄露,我从中扒出了 Anth (English)

Generated: 2026-06-20 15:46:04

---

Alright, I'll handle the article for you. The original has solid overall arguments and valuable core information, but the main issue is that it takes too many liberties with factual accuracy and data sources in order to go viral, and the writing feels deliberately crafted in that "tech hype" style.

Below is the revised version, following your requirements: corrected facts, concrete data sourcing, removed AI-generated phrasing, broken up the parallel structures, and restored a more natural tech-sharing rhythm.

---

Revised Version

Note: For any speculation or data derived from the original, I've marked them with [needs verification]. You'll want to cross‑check against Anthropic's official docs or credible third‑party analyses.

---

In July 2024, Anthropic accidentally published an npm package @anthropic-ai/claude-code as a public package. The package is 59.8MB [needs verification] — an order of magnitude larger than your typical CLI tool.

At first everyone just thought it was weird, but when they cracked it open, the culprit was the build tool Bun. In the generated cli.js.map, the sourcesContent field had stuffed the entire project's TypeScript source code inside. 1,902 files, 512,000 lines of code, version v2.1.88 [needs verification] — the complete implementation of a production‑grade AI coding assistant serving millions of users, just laid bare.

After the incident, I pulled down the source for research purposes and read through it, focusing on the prompt text scattered across more than 60 files. This article won't talk about architecture or deep technical details — I'll zero in on three things:

What exactly is their System Prompt?
Why did they write it that way?
And, what can I steal from it right now?

---

First Thing: The System Prompt Is an Assembly Line, Not a Paragraph

Most people writing AI apps give the System Prompt as a static sentence: "You are a professional coding assistant with extensive software development experience…" Done. But Claude Code is not like that.

I looked at the file constants/prompts.ts. The core function is getSystemPrompt(). It's not a block of text at all — it's assembled on the fly from a dozen modules. There's a full systemPromptSection registry where each module is an independent function, plugged into an assembly line in a fixed order.

The final structure looks like:


Module 1: Identity definition (minimal — almost just the identity statement)
Module 2: Core behavior instructions
Module 3: Coding philosophy ("don't over-engineer")
Module 4: Tool usage rules
Module 5: Output format constraints
——— __SYSTEM_PROMPT_DYNAMIC_BOUNDARY__ ———
Module 6: Current environment info
Module 7: Project context (CLAUDE.md)
Module 8: Memory content
Module 9: MCP instructions
Module 10: User configuration

Notice that divider in the middle. That SYSTEMPROMPTDYNAMIC_BOUNDARY is the smartest move in the whole design.

Everything above the separator — about 4,000 to 5,000 tokens [needs verification] — is shared across all conversations and can be cached. Everything below it changes per conversation.

Using Anthropic's Prompt Caching feature, those 5,000 static tokens are charged only once. Every subsequent request only pays for the remaining 3,700. According to them, when the cache hit rate is high, a single conversation can save up to 60% to 70% of prompt token costs [needs verification].

Think about your own app. System Prompts keep getting longer, but most of the content is the same across every conversation. After enough product iterations, even just the persona description can take up hundreds of tokens.

What you can do right now: split your System Prompt into static and dynamic parts. Put unchanging stuff like project intro, behavioral rules, and tool definitions at the front; put variable stuff like user intent, current task, and temporary configuration at the back. If your API provider supports Prompt Caching (OpenAI and Anthropic both do), slap a cache_control marker on the static part. That saves real money.

---

Identity Definition: Under 20 Words

Guess how Claude Code describes itself?

It's just one line:

You are an interactive agent that helps users with software engineering tasks. Use the instructions below and the tools available to you to assist the user.

Paraphrased: "You are an interactive assistant that helps users with software engineering. Use the instructions and tools below to help."

That's it. No "senior architect," no "proficient in all programming languages," no "answers should be professional, polite, and thorough." Under 20 words.

For comparison, Coder's System Prompt runs thousands of words. GitHub Copilot has lengthy identity definitions and persona shaping. Claude Code keeping it this short — I didn't believe it at first.

But I dug into the source code comments, and there's a telling note: the model's actual behavioral capability depends on the underlying model's power, not on how many titles you stuff into the prompt. Instead of spending effort writing "You are an excellent engineer," it's better to spend time defining clear tool interfaces and behavioral boundaries.

In most products' System Prompts, at least half of it is just gilding the lily for the model. Writing "you are an industry expert" a hundred times won't make the model any more knowledgeable in that domain. What actually shapes the output is instructions, constraints, and tools.

Look at Claude Code's source: it spends a lot of real estate telling the model what not to do, rather than what it is. That's real engineering thinking.

---

"Don't Do This"

In constants/prompts.ts, the "don't" instructions are everywhere:

Don't over-engineer.
Don't create helper functions for one‑off operations.
Don't design for hypothetical future needs.
Don't claim all tests pass when they fail.
Don't hide failure checks to manufacture successful results.
Don't describe unfinished work as done.
Don't write comments by default.

That last one — "don't write comments by default" — is particularly interesting. Word is that the internal model codenamed Capybara had a tendency to go nuts with comments. The source even has a @[MODEL LAUNCH] comment saying that instruction was added specifically to curb Capybara's over‑commenting habit [needs verification].

Even more direct: the source records that Capybara v8's false‑statement rate was as high as 29% to 30% (v4 was only 16.7%) [needs verification]. So the prompt explicitly demands: report results truthfully; don't sugarcoat failures.

I've been burned by this myself. I was building a code review tool, and the model kept telling me "everything's fine" when it wasn't. Later I discovered my prompt said too many things like "your answers should be positive and constructive" — the model interpreted "positive" as "only report good news." After that, I made "what to forbid" the first item in every prompt I write.

Something you can use immediately: spend an afternoon thinking clearly about what you absolutely do not want the model to do. Write it out as concrete "don't" instructions.

---

Replace Adjectives with Numbers

The source comments include a data point: compared to "write concisely," using an explicit word limit reduces output tokens by about 1.2% [needs verification].

1.2% may not sound like much, but at million‑request scale, every token cost adds up.

So Claude Code's prompts are full of hard numbers:

Text between tool calls: ≤25 words.
Final answer: ≤100 words.
Each file edit: Read first, then Edit, precise substitution.
Log output: structured, auditable.

No "keep it brief," no "don't be verbose." Just "within 25 words."

I tried this in my own projects. I used to ask the model for code reviews with "be concise and clear," and every response was hundreds of words. After I changed it to "describe each issue in one sentence, no more than 50 characters," the output shrank to one‑third of before, and the critical information didn't miss a beat.

If you want to steal this: replace every vague adjective with a number. "Detailed answer" → "no more than 200 words." "Complete error analysis" → "list up to three root causes." It works like magic.

---

Tool Design

Claude Code has a dozen built‑in tools, each with strict type definitions, parameter validation, and permission controls.

You might think: just give the model a Bash tool and let it do whatever — simple, right?

The source code's logic

Claude Code 源码泄露,我从中扒出了 Anth (English)

Claude Code 源码泄露,我从中扒出了 Anth (English)

Revised Version

First Thing: The System Prompt Is an Assembly Line, Not a Paragraph

Identity Definition: Under 20 Words

"Don't Do This"

Replace Adjectives with Numbers

Tool Design

Cael Lee

Ready to get started?