Everyone Thinks Ollama Is Free, but It Actually Costs 10x More Than an API

Generated: 2026-06-22 06:34:32

---

Don't Be Fooled: Ollama Isn't Your Only Free Option — You've Been Screwed by the Word "Free"

Friend, have you ever been through this?

Late at night, you excitedly downloaded Ollama, thinking: finally, a free private AI server! And then what? You spent an entire weekend configuring it, eagerly fired up Llama 3, and the tasks kept failing. After hours of troubleshooting, you found out that Ollama's default context length (the num_ctx setting) was only 2048 tokens, while complex multi-step instructions can easily need 16K-24K. At that moment, didn't you want to smash your computer?

I get it, because that was me early last year.

You think "free" means zero cost? Wrong. Real free is when someone else shoulders the hardware and computing power for you — while you're still stubbornly wrestling with your own machine.
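The context-length trap described above can usually be avoided by setting the context window explicitly on each request. A minimal sketch, assuming a local Ollama server on its default port 11434 and a model pulled under the name llama3; the value 8192 and the helper names are illustrative, not recommendations:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def build_generate_request(prompt, model="llama3", num_ctx=8192):
    """Build the JSON body for Ollama's /api/generate with an explicit context size."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,                  # ask for one complete JSON response
        "options": {"num_ctx": num_ctx},  # override the default context window
    }


def generate(prompt, **kwargs):
    """Send the request to a running Ollama server and return the reply text."""
    body = json.dumps(build_generate_request(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

A larger num_ctx needs more memory, so raising it does not remove the hardware limits of the machine it runs on.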

---

The First Trap: Ollama's "Free" Is a Lie — It Hides the Cost in Your Wallet

Let me break it down for you.

Think about it. What does running a local model require? Hardware!

An 8GB MacBook Air running a 7B model? It stutters like a slideshow — even typing has lag.
A 16GB MacBook Pro barely handles an 8B model; open one more browser tab and it crashes.
A 64GB Mac Studio finally runs smoothly, but guess how much it costs? How many API credits could the price of a fully loaded Mac Studio buy you?

In plain terms, Ollama's "free" is built on your expensive hardware. On an ordinary laptop, the experience is painful. You spend money on a computer, waste time fiddling with configurations, and still can't run even a mid-tier model. How is that free? Those are hidden costs!
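To put a number on that hidden cost, a rough break-even can be computed. This is only a sketch: the hardware price below is a made-up placeholder, the API price is the 0.004 yuan per thousand tokens this article quotes for Tongyi Qianwen Plus, and electricity and your own time are ignored.

```python
def break_even_tokens(hardware_cost_yuan, api_price_per_1k_yuan):
    """Return how many API tokens the hardware money would buy."""
    if api_price_per_1k_yuan <= 0:
        raise ValueError("API price must be positive")
    return hardware_cost_yuan / api_price_per_1k_yuan * 1000


# Hypothetical example: a 30,000-yuan machine vs. 0.004 yuan per 1K tokens.
tokens = break_even_tokens(30_000, 0.004)
print(f"{tokens:,.0f} tokens")
```

With those placeholder numbers the machine's price buys roughly 7.5 billion API tokens, which is the scale of the comparison being made here.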

And let's not even talk about security. This year, Cisco used Shodan to find a large number of publicly exposed Ollama servers, many of them vulnerable. CVE-2024-37032 allows remote code execution on unpatched versions, and attackers can steal your models. The Hacker News has also reported DoS attacks, model poisoning... Would you dare expose your Ollama server to the public internet? I sure wouldn't.
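One way to act on that caution is to check where the server is bound before starting it. A minimal sketch, assuming the bind address comes from the OLLAMA_HOST environment variable in plain host or host:port form; the helper name is hypothetical and the parsing is deliberately simple (no IPv6 brackets, no URL schemes):

```python
import ipaddress
import os


def is_loopback_only(host_setting):
    """Return True if a 'host' or 'host:port' setting binds to loopback only."""
    host = host_setting.rsplit(":", 1)[0] if ":" in host_setting else host_setting
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Unrecognized value: treat it as potentially reachable from outside.
        return False


# Ollama binds to 127.0.0.1:11434 when OLLAMA_HOST is unset.
setting = os.environ.get("OLLAMA_HOST", "127.0.0.1:11434")
if not is_loopback_only(setting):
    print(f"Warning: OLLAMA_HOST={setting} may expose the server beyond this machine")
```

The check errs on the side of warning: anything it cannot parse is treated as exposed.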

So you see, what you thought was "free" is actually "costing you money, time, and risk."

---

The Second Trap: Free Tokens Are the Real Treasure — Haven't You Tasted Them Yet?

In the second half of last year, I completely changed my approach: I stopped fighting with local models and started harvesting free API tokens. The result? It worked so well that I wanted to slap myself — why was I so foolish before?

Let's start with Tencent Hunyuan. Scan a QR code in WeChat and you're in. New users get free credits. For daily tasks like processing files, writing emails, and organizing materials, you won't even use them up in a month. Check the official promotions for exact amounts, but it's enough to keep you going for a long time.

Then there's Tongyi Qianwen. Alibaba Cloud's Plus version costs 0.004 yuan per thousand tokens — ridiculously cheap. Plus, registration gives you free credits that can last a whole year with normal use.

Guess how I combine them now?

Daily chats, file sorting, email handling: Tongyi Qianwen Plus — nearly zero cost, incredibly smooth.
Complex tasks, data analysis: Zhipu GLM-4 at 0.1 yuan per thousand tokens. Pricier, but worth it for these jobs, and still a bargain compared to GPT-4.
Privacy-sensitive tasks: A small local model on Ollama — no data uploaded, but only for the most critical private information.

Let me do the math for you: Running GPT-4 Turbo for a month can easily burn through hundreds of dollars. Switch to Tongyi Qianwen Plus, and the same usage might not even cost 10 yuan. Ten yuan! The price of a bubble tea, enough for a month of AI!
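That arithmetic can be written out as a small calculation. A minimal sketch using only the per-thousand-token prices quoted in this section; the monthly volume of 2.5 million tokens is an assumed figure, chosen because it is what 10 yuan buys at the Tongyi Qianwen Plus price:

```python
# Prices in yuan per 1,000 tokens, as quoted in this article.
PRICE_PER_1K_YUAN = {
    "Tongyi Qianwen Plus": 0.004,
    "Zhipu GLM-4": 0.1,
}


def monthly_cost_yuan(tokens_per_month, price_per_1k_yuan):
    """Cost of a month's usage at a flat per-1K-token price."""
    return tokens_per_month / 1000 * price_per_1k_yuan


tokens = 2_500_000  # assumed monthly usage
for name, price in PRICE_PER_1K_YUAN.items():
    print(f"{name}: {monthly_cost_yuan(tokens, price):.2f} yuan")
```

At that volume the same workload comes to about 10 yuan on Tongyi Qianwen Plus and about 250 yuan on GLM-4, which is why the pricier model is reserved for the complex tasks.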

So, do you still think "free tokens" are just a gimmick?

---

The Third Trap: Choose the Right Model, and Costs Drop by Half — Don't Be Fooled by "Free" Bias

Some might say, "Free tokens are fine, but the model quality isn't good enough!"

I admit it, but it depends on the scenario. Think about it: Do 90% of your daily tasks require top-tier reasoning? Writing emails, summarizing, organizing materials — domestic models are more than enough, and often faster!

I've tested several domestic models:

DeepSeek-V3: Input ¥2 per million tokens, output ¥3 per million, cache hits at just ¥0.2 per million. Good enough for daily use, with insane cost-effectiveness.
Kimi: Strong at long-text processing, with free credits upon registration — a godsend for writing papers or reading reports.
MiniMax: Lightning-fast inference, perfect for quick responses — you type slower than it replies.
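The three DeepSeek-V3 prices listed above combine into a per-request cost like this. A minimal sketch, assuming cached input tokens are billed at the cache-hit rate and the remaining input at the normal input rate; the token counts in the example are made up:

```python
# DeepSeek-V3 prices in yuan per million tokens, as listed above.
INPUT_PRICE = 2.0
OUTPUT_PRICE = 3.0
CACHE_HIT_PRICE = 0.2


def request_cost_yuan(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the cost of one request; cached_tokens is the cached share of the input."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * INPUT_PRICE
        + cached_tokens * CACHE_HIT_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens (8,000 cached) and 1,000 output tokens.
print(f"{request_cost_yuan(10_000, 1_000, cached_tokens=8_000):.4f} yuan")
```

Because the cache-hit rate is a tenth of the normal input rate, prompts that reuse a long shared prefix are where most of the savings come from.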

Only when you need top-level reasoning is it worth using GPT-4. But how often does that happen? Maybe once a week?

Here's my current setup:


{
  "agents": {
    "defaults": {
      "runtime": {
        "provider": "deepseek",
        "model": "deepseek-v3"
      }
    }
  }
}

For complex tasks, I manually switch to GLM-4. For simple queries, I call a small local model. Monthly API cost? Less than 50 yuan.

See? It's not that "free models are bad" — you just haven't learned how to use them. The counterintuitive truth is: Cheap models combined wisely are a thousand times better than blindly chasing "free local runs."

---

The Fourth Trap: Don't Be Fooled by the Word "Free" — Time Is More Expensive Than Money

Many people's eyes light up at the word "free," only to spend more time than the money they save.

I've seen someone spend three days configuring a local model to save a few yuan in API fees, only to find their computer couldn't handle it. I've also seen people register on a dozen platforms to harvest free tokens, never use up the credits on any of them, and find that managing the accounts was more exhausting than a full-time job.

The real way to save money isn't "all free" — it's "pay as you go."

Here's my advice — just copy it:

Daily tasks: Use free tokens (Tencent Hunyuan, Tongyi Qianwen). Zero cost, a no-brainer.
Medium tasks: Use cheap models (DeepSeek, Kimi). A few cents gets it done, cheaper than a sip of water.
Complex tasks: Use flagship models (GPT-4, Claude). Spend when necessary; you'll barely need them in a month.
Privacy-sensitive tasks: Use local models (small models on Ollama). Keep this for the most critical data only; don't push everything onto your local machine.
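The four-tier advice above amounts to a small routing table. A minimal sketch: the tier names, the Route type, and the pick_route helper are hypothetical, and the provider labels simply repeat the names used in this article rather than real API identifiers:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str
    note: str


# Task tier -> where to send it, following the list above.
ROUTES = {
    "daily": Route("Tongyi Qianwen", "free credits"),
    "medium": Route("DeepSeek", "cheap paid model"),
    "complex": Route("GPT-4", "flagship, use sparingly"),
    "private": Route("Ollama (local)", "data never leaves the machine"),
}


def pick_route(tier, contains_private_data=False):
    """Pick a route for a task; private data always stays local."""
    if contains_private_data:
        return ROUTES["private"]
    try:
        return ROUTES[tier]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None
```

Checking the privacy flag first means a sensitive task is never sent to a remote API by accident, whatever tier it was filed under.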

With this combination, costs drop to nearly negligible. And the time you save is enough to read three books, take a course, or spend more time with family.

---

A Final Honest Word

"AI APIs are so expensive" is a complaint that's already outdated.

The free resources out there are enough for smart people to use for a long time. The real barrier has never been money — it's knowing where to get them.

Tencent Hunyuan, Tongyi Qianwen, and DeepSeek all offer free credits. Combined, they'll keep you going for a year.

Stop obsessing over Ollama. Change your mindset, and you'll discover: The free AI world is bigger than you think — and the greatest freedom is no longer paying with your time for the word "free."


Cael Lee
