I Built an MCP Server From Scratch — Here's What the Docs Won't Tell You

TIL that implementing the Model Context Protocol isn't the "weekend project" every YouTube tutorial makes it out to be. Three all-nighters, two ragequits, and one 3 AM existential crisis (in which I seriously considered becoming a farmer) later, my MCP server finally responds without randomly dropping connections. Here's the real talk.

I've been seeing MCP pop up everywhere since Anthropic released it back in November 2024, and like many of you, I thought "cool, another protocol to add to my CV." Spoiler: it's actually useful, but the learning curve is steeper than the hello-world examples suggest. Way steeper.

What actually is MCP? (The non-marketing version)

For those who haven't fallen down this rabbit hole yet, MCP is basically a standardised way for LLMs to talk to external tools and data sources. Think of it as USB-C for AI integrations—one protocol to rule them all, instead of writing custom integrations for every single tool.

Actually, wait—I should clarify that comparison. It's more like USB-C if USB-C also handled discovery and capability negotiation. Which real USB-C kinda does now I guess? Whatever, you get the point.

"It's like if REST and WebSocket had a baby that speaks JSON-RPC"—some comment I saw on r/MachineLearning that's surprisingly accurate

The architecture is client-server. Your MCP server exposes "tools" (functions the AI can call), "resources" (data the AI can read), and "prompts" (pre-written templates). The client—usually Claude Desktop or some VS Code extension—discovers these and uses them. Simple enough on paper.
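To make the discovery step concrete, here's a rough sketch of the JSON-RPC 2.0 messages involved in a tools/list exchange. The type names, the `handle` function, and the `query_analytics` tool are all made up for illustration, not the SDK's API, and a real session also starts with an initialize handshake that I'm skipping here.

```typescript
// Minimal JSON-RPC 2.0 shapes for an MCP-style tool listing.
// Type names are illustrative assumptions, not the SDK's own types.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface ToolInfo {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>; // JSON Schema describing the arguments
}

interface JsonRpcResponse<T> {
  jsonrpc: "2.0";
  id: number;
  result: T;
}

// Hypothetical in-memory catalogue standing in for a real server.
const tools: ToolInfo[] = [
  {
    name: "query_analytics",
    description: "Execute read-only SQL queries against the analytics database.",
    inputSchema: {
      type: "object",
      properties: { sql: { type: "string" } },
      required: ["sql"],
    },
  },
];

// Answer discovery requests; every other method is out of scope for this sketch.
function handle(req: JsonRpcRequest): JsonRpcResponse<{ tools: ToolInfo[] }> {
  if (req.method !== "tools/list") {
    throw new Error(`Unsupported method: ${req.method}`);
  }
  // The response echoes the request id so the client can pair them up.
  return { jsonrpc: "2.0", id: req.id, result: { tools } };
}

const listRequest: JsonRpcRequest = { jsonrpc: "2.0", id: 1, method: "tools/list" };
const listResponse = handle(listRequest);
console.log(JSON.stringify(listResponse, null, 2));
```

The point is mostly that there's no magic: the client asks, the server answers with names, descriptions, and schemas, and everything the model knows about your tools comes from that payload.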

My implementation journey (the ugly parts)

I decided to build an MCP server in TypeScript. Version 0.5.0 of the SDK, specifically. Why? Because I hate myself apparently. And because all the Python examples I found were using 0.3.2 and half the APIs had changed.

The official SDK helps. It does. But here's where things got real.

1. The transport layer isn't as plug-and-play as advertised

The docs make it seem like you just pick stdio or SSE and go. Reality check: stdio works brilliantly for local development—like, genuinely brilliant, no complaints. But if you're building anything that needs to handle multiple concurrent clients, you'll want SSE.

The problem?

Error handling in SSE is... let's call it "minimal." Charitably.

I spent 4 hours debugging why my server would randomly stop sending events. Not crashing, just... stopping. Silent. Turns out, if your SSE connection drops mid-stream, the default implementation just shrugs and walks away. No reconnection logic, no buffering, nothing. The client sits there waiting for events that aren't coming.


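// What the tutorial shows:
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';

const server = new McpServer({ name: 'my-server', version: '1.0.0' });
await server.connect(transport); // transport: whichever stdio or SSE transport you created

// What you actually need (ideally set before connecting):
// In current SDK versions, McpServer wraps a lower-level Server whose onerror callback receives errors.
server.server.onerror = (err) => {
  // Hope you like debugging race conditions
  console.error('Something broke:', err);
  // Spoiler: this fires approximately never when you need it
};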

I think there's a GitHub issue about this somewhere. Probably has 80+ thumbs up by now.
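For the silent-stall problem, here's a minimal sketch of the kind of thing I mean by "reconnection logic": a watchdog that notices when no event (including heartbeats) has arrived for a while, plus a capped exponential backoff for retries. `StreamWatchdog` and `reconnectDelayMs` are hypothetical names of mine, not SDK features, and the timestamps are passed in by hand so the logic is easy to test. Wiring it to a real SSE connection is left out.

```typescript
// Tracks when the last event arrived so a silent stall can be detected.
// Hypothetical helper, not part of any MCP SDK.
class StreamWatchdog {
  private lastEventAt: number;
  private readonly timeoutMs: number;

  constructor(timeoutMs: number, startedAt: number) {
    this.timeoutMs = timeoutMs;
    this.lastEventAt = startedAt;
  }

  // Call on every received event, including heartbeat/keep-alive events.
  markEvent(now: number): void {
    this.lastEventAt = now;
  }

  // True once nothing has arrived for longer than the timeout.
  isStale(now: number): boolean {
    return now - this.lastEventAt > this.timeoutMs;
  }
}

// Exponential backoff with a cap: 500, 1000, 2000, ... up to maxMs.
function reconnectDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const watchdog = new StreamWatchdog(15_000, 0);
watchdog.markEvent(10_000);
console.log(watchdog.isStale(20_000)); // false: only 10s since the last event
console.log(watchdog.isStale(26_000)); // true: 16s of silence exceeds the 15s timeout
console.log(reconnectDelayMs(3)); // 4000
```

In practice you'd check `isStale(Date.now())` on a timer and, when it trips, tear down the stream and reconnect after `reconnectDelayMs(attempt)`.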

2. Tool definitions are deceptively simple

Defining a tool looks straightforward. Name, description, input schema. Three fields. How hard could it be?

Well.

The description field is doing heavy lifting. Like, all the lifting. The AI uses your description to decide when to call the tool, and vague descriptions lead to the AI either never calling your tool or calling it for literally everything.

Real example from my project: I built a database query tool. First description: "Query the database." Result: Claude called it to answer "what's your name?" I wish I was joking. The logs were just Claude repeatedly trying to SELECT username FROM somewhere.

Updated description: "Execute read-only SQL queries against the PostgreSQL database containing user analytics data. Use for questions about user counts, engagement metrics, or historical trends. Do NOT use for system configuration or metadata queries."

Night and day difference. Like, completely different tool behaviour.

But here's the thing nobody mentions—you've got to iterate on these descriptions. I went through 7 versions before it felt right. And what works for Claude might make GPT-4o call your tool way too aggressively or not at all. There's no standard for this yet.
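Here's roughly what the "three fields" look like as data, using my improved description from above. The tool name, the `ToolDefinition` type, and especially `lintDescription` are my own illustrative assumptions: the lint is a crude heuristic I'd use as a reminder, not anything the protocol or SDK provides.

```typescript
// Shape of a tool definition: name, description, and a JSON Schema for inputs.
// The type name is illustrative; SDKs define their own equivalents.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

const queryTool: ToolDefinition = {
  name: "query_user_analytics", // hypothetical name
  description:
    "Execute read-only SQL queries against the PostgreSQL database containing user analytics data. " +
    "Use for questions about user counts, engagement metrics, or historical trends. " +
    "Do NOT use for system configuration or metadata queries.",
  inputSchema: {
    type: "object",
    properties: {
      sql: { type: "string", description: "A single read-only SELECT statement" },
    },
    required: ["sql"],
  },
};

// Hypothetical lint: flags descriptions that don't say when (and when not) to call the tool.
function lintDescription(tool: ToolDefinition): string[] {
  const problems: string[] = [];
  const text = tool.description.trim();
  if (text.split(/\s+/).length < 8) problems.push("too short to say when the tool applies");
  if (!/\buse (for|when)\b/i.test(text)) problems.push("does not say when to use the tool");
  if (!/\b(do not|don't|never)\b/i.test(text)) problems.push("does not say when NOT to use the tool");
  return problems;
}

const vagueTool: ToolDefinition = { ...queryTool, description: "Query the database." };
console.log(lintDescription(queryTool)); // no problems reported
console.log(lintDescription(vagueTool)); // all three problems reported
```

A regex can't tell you whether a description is actually good, of course. You still have to test against real prompts, but it does catch the "Query the database." tier of mistakes before they ship.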

3. The "tools/list" discovery is cool until it's not

When your MCP server starts, the client calls tools/list to discover what's available. Makes sense. But if you have 50+ tools (I may have gone overboard... okay I definitely went overboard with 73 tools), that initial handshake gets chunky.

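The spec does define cursor-based pagination for list calls (the response carries a nextCursor that the client passes back as cursor), but it's optional, and I wasn't using it. I've seen Claude Desktop time out on the discovery call: it just spins forever, then silently fails. Nothing in the logs. Took me 2 hours to figure out what was happening.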

Pro tip: batch your tool registrations and consider lazy-loading tools that aren't commonly used. Yes, this defeats some of the auto-discovery magic, but your users will thank you when the connection initialises in 2 seconds instead of 15.

I'm currently running about 12 "core" tools at startup and loading the rest on demand. It's hacky. I don't love it.
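For the curious, the core-plus-on-demand split can be sketched like this. It's one reading of "lazy-loading": every tool is registered up front, but the expensive setup only runs the first time a non-core tool is called. `LazyToolRegistry` and the example tools are hypothetical, and the handlers are synchronous to keep the sketch short (real ones would almost certainly be async).

```typescript
type ToolHandler = (args: Record<string, unknown>) => string;

// Hypothetical registry: "core" tools are built at startup, the rest on first call.
class LazyToolRegistry {
  private readonly loaders = new Map<string, () => ToolHandler>();
  private readonly loaded = new Map<string, ToolHandler>();

  // Register a factory; nothing expensive runs until first use unless eager is set.
  register(name: string, loader: () => ToolHandler, eager = false): void {
    this.loaders.set(name, loader);
    if (eager) this.loaded.set(name, loader());
  }

  // Names of tools initialised so far.
  loadedNames(): string[] {
    return Array.from(this.loaded.keys());
  }

  // Build the handler on first call, then reuse it.
  call(name: string, args: Record<string, unknown>): string {
    let handler = this.loaded.get(name);
    if (!handler) {
      const loader = this.loaders.get(name);
      if (!loader) throw new Error(`Unknown tool: ${name}`);
      handler = loader();
      this.loaded.set(name, handler);
    }
    return handler(args);
  }
}

const registry = new LazyToolRegistry();
let reportLoads = 0; // counts how often the expensive setup runs

registry.register("ping", () => () => "pong", true); // core tool, built at startup
registry.register("big_report", () => {
  reportLoads += 1; // stands in for slow setup (DB connections, file reads, ...)
  return (args) => `report for ${String(args["month"])}`;
});

console.log(registry.loadedNames()); // only "ping" so far
console.log(registry.call("big_report", { month: "2025-01" }));
console.log(registry.call("big_report", { month: "2025-02" }));
console.log(reportLoads); // 1: setup ran once, on first call
```

The trade-off is the one I complained about: the first call to a lazy tool pays the setup cost, so pick your "core" set based on what actually gets called.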

What actually worked well

It's not all complaints. Once I got past the initial hurdles:

The JSON-RPC 2.0 foundation means debugging is familiar. You can literally curl your endpoints and see what's happening. curl -X POST localhost:3000 -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"tools/list","id":1}' saved my sanity.
Resource templates are genuinely clever. Instead of hardcoding every possible data path, you define patterns like users/{userId}/profile and let the AI figure out the parameters. Felt like magic when it worked the first time.
The community is active. Found solutions to my SSE issues in a GitHub discussion from about 3 weeks ago with 47 comments of people having the exact same problem. Shoutout to @techiebackpacker who posted the workaround that finally fixed it.
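If resource templates sound abstract, the server-side half is basically pattern matching. Here's a minimal sketch of pulling parameters out of a path using the `users/{userId}/profile` pattern from above. `matchTemplate` is a hypothetical helper of mine that only handles simple `{name}` placeholders, so treat it as an illustration of the idea rather than a full URI template implementation.

```typescript
// Match a path against a template like "users/{userId}/profile".
// Returns the extracted parameters, or null when the path doesn't fit.
// Hypothetical helper: supports only simple {name} placeholders.
function matchTemplate(template: string, path: string): Record<string, string> | null {
  const names: string[] = [];
  const placeholder = /^\{[A-Za-z_][A-Za-z0-9_]*\}$/;

  const pattern = template
    .split(/(\{[A-Za-z_][A-Za-z0-9_]*\})/) // keep placeholders as separate parts
    .map((part) => {
      if (placeholder.test(part)) {
        names.push(part.slice(1, -1));
        return "([^/]+)"; // a placeholder matches exactly one path segment
      }
      return part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // escape literal text
    })
    .join("");

  const match = new RegExp(`^${pattern}$`).exec(path);
  if (!match) return null;

  const params: Record<string, string> = {};
  names.forEach((name, i) => {
    params[name] = match[i + 1] ?? "";
  });
  return params;
}

console.log(matchTemplate("users/{userId}/profile", "users/42/profile")); // { userId: "42" }
console.log(matchTemplate("users/{userId}/profile", "users/42/settings")); // null
```

The "magic" part is the other half: the client sees the template and fills in `userId` itself, so you never enumerate every user.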

My "production-ready" checklist

After this experience, here's what I'd recommend before calling your MCP server done:

Implement proper logging — The protocol has a logging system built in. Use it. Future you debugging at 2 AM will want those timestamps.
Add rate limiting — Your tools will get called in rapid succession. Claude doesn't know about your API limits. Or care, apparently.
Validate inputs with Zod — The AI will send malformed JSON sometimes. Not often, but when it does, you want a proper validation layer, not a cryptic "undefined is not a function" at 3 AM. Ask me how I know.
Test with multiple clients — Claude Desktop, Continue.dev (v0.9.5), and custom clients all behave slightly differently. Claude is the most forgiving, honestly.
Monitor tool call frequency — You'll discover which tools are actually useful vs. which ones seemed like a good idea after your third coffee. Spoiler: half my tools have never been called once.
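A few of these items fit into one small wrapper around your tool handlers. The sketch below combines input validation, a token-bucket rate limiter, and a per-tool call counter. Everything here (`TokenBucket`, `guardToolCall`, `parseQueryArgs`, the limits) is a hypothetical illustration, and the hand-rolled validator is only standing in for a Zod schema so the example has no dependencies.

```typescript
// Token bucket: allows bursts up to `capacity`, refilling at `refillPerSec`.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;

  constructor(capacity: number, refillPerSec: number, now: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Refill based on elapsed time, then try to spend one token.
  tryTake(now: number): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

type GuardResult = { ok: true; sql: string } | { ok: false; error: string };

// Per-tool call counts, for spotting tools nobody ever uses.
const callCounts = new Map<string, number>();

// Hand-rolled stand-in for a Zod schema like z.object({ sql: z.string().min(1) }).
function parseQueryArgs(args: unknown): string | null {
  if (typeof args !== "object" || args === null) return null;
  const sql = (args as Record<string, unknown>)["sql"];
  return typeof sql === "string" && sql.trim().length > 0 ? sql : null;
}

// Count the call, validate the arguments, then check the rate limit.
function guardToolCall(tool: string, args: unknown, bucket: TokenBucket, now: number): GuardResult {
  callCounts.set(tool, (callCounts.get(tool) ?? 0) + 1);
  const sql = parseQueryArgs(args);
  if (sql === null) return { ok: false, error: "invalid arguments: expected { sql: string }" };
  if (!bucket.tryTake(now)) return { ok: false, error: "rate limit exceeded, retry later" };
  return { ok: true, sql };
}

const bucket = new TokenBucket(2, 1, 0); // burst of 2, refills 1 token per second
console.log(guardToolCall("query", { sql: "SELECT 1" }, bucket, 0)); // ok
console.log(guardToolCall("query", { sql: 42 }, bucket, 0)); // invalid arguments
console.log(guardToolCall("query", { sql: "SELECT 2" }, bucket, 0)); // ok
console.log(guardToolCall("query", { sql: "SELECT 3" }, bucket, 0)); // rate limited
console.log(guardToolCall("query", { sql: "SELECT 4" }, bucket, 1000)); // ok after refill
console.log(callCounts.get("query")); // 5
```

Validation runs before the rate limiter on purpose, so malformed calls don't burn tokens. Returning a readable error string instead of throwing also gives the model something it can react to.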

The elephant in the room

Is MCP going to be the standard, or will it get Google Wave'd?

I think—and this is just my read after living in this codebase for two weeks—the protocol is solid enough to survive. But the real test is whether major players beyond Anthropic adopt it. OpenAI has their function calling, Google has their thing, and nobody likes being told their baby is ugly.

The protocol itself is open and well-designed, which gives me hope. But I've been burned by "standards" before. Looking at you, SOAP. And GraphQL kinda-sorta-not-really delivering on the dream.

I will say, I've seen more community tools and servers pop up in the last month than I expected. There's an MCP server for Brave Search now, for GitHub, for PostgreSQL. That ecosystem matters.

TL;DR (Key Takeaways)

Budget 2-3x the time you think it'll take. Maybe 4x if you're doing SSE transport
Tool descriptions are everything — iterate on them like you're explaining to a brilliant but extremely literal-minded intern
Start with stdio transport for development, then graduate to SSE when you need concurrent clients
Add error handling before you need it. Not after. Before
Use the MCP Inspector tool (npx @modelcontextprotocol/inspector). It'll save you hours of debugging
Validate everything — the AI will send malformed data when you least expect it

Edit: Thanks for the gold, kind stranger! Since people are asking—yes, I'll open-source my implementation once I clean up the embarrassing parts. Give me a week. Maybe two. There's a comment block in my SSE handler that just says "// I don't know why this works but don't touch it" and I need to figure that out first.

Edit 2: Several people asked about Python vs TypeScript for MCP servers. I've used both now (Python with the 0.4.1 SDK, TypeScript with 0.5.0), and honestly? Python's async story makes the SSE transport slightly less painful, but TypeScript's type system catches schema validation bugs at compile time. Pick your poison. I went with TypeScript because I'd rather debug types at 2 PM than runtime errors at 2 AM.

Edit 3: Someone in the comments mentioned that MCP Inspector tool. If you're building an MCP server and haven't used it yet, stop reading this and go install it. I can't believe I didn't find it until day 6 of this project.

What's your experience been with MCP? Anyone found a clean way to handle tool versioning, or are we all just YOLOing it with breaking changes and hoping nobody notices?

mcp #ai #typescript #llm #programming #webdev

Cael Lee
