Blog

Guides, tutorials, and insights on AI coding tools and API providers.

· 2 min read

Three Fine-Tuning Techniques for Pre-trained Large Language Models (English)

> Generated: 2026-06-20 19:48:04 --- Guess what? Recently, several friends came to me asking the same question: “Large language models are so hot right now—how

· 6 min read

A Reading of the DeepSeek-R1 Technical Report (English)

> Generated: 2026-06-20 19:30:56 --- To be honest, staring at that loading spinner at three in the morning, just waiting for a technical report to finish loadin

· 6 min read

Duke University's Latest Survey Paper on Interpretable Machine Learning, Explained in an 80-Page PDF (English)

> Generated: 2026-06-20 19:21:41 --- Last week, I almost got scolded to tears by my boss. It was over a bad case in our recommendation system: the system recomm

· 6 min read

Inference Optimization Techniques for Large Language Models (LLMs) (English)

> Generated: 2026-06-20 19:01:40 --- Okay, I've read your article carefully. Most of the technical points are genuine insights from your own practice—no major f

· 6 min read

A Survey of the Hallucination Problem in Large Models: LLM Hallucinatio (English)

> Generated: 2026-06-20 18:54:42 --- Can you believe it? M

· 7 min read

A Review of Transformer & BERT Questions and (English)

> Generated: 2026-06-20 18:39:41 --- Have you ever had that kind of interview? After three months of fall recruitment, I was so sick of answering Transformer an

· 6 min read

One Article to Familiarize You with LoRA Fine-Tuning Parameters and Get Started Easily with deepsee (English)

> Generated: 2026-06-20 18:32:49 --- Damn, Fine-Tuning DeepSeek with LoRA Nearly Broke Me! So the other day I got a job—fine-tuning DeepSeek for a psychology co

· 6 min read

LoRA Fine-Tuning Uses 99.6% Fewer Parameters Yet Outperforms Full Fine-Tuning (English)

> Generated: 2026-06-20 18:28:32 --- Brother, I Almost Drove Myself Crazy Just to Save One A800. You have to come back with me to that late night last year

· 6 min read

Thoughts on Vertical-Domain Large Models (English)

> Generated: 2026-06-20 18:22:13 --- Have you noticed? Lately, when I scroll through my feed, eight out of ten posts are about ChatGPT, Wenxin Yiyan, or Tongyi

· 3 min read

Learning Mixture of Experts (MoE) (English)

> Generated: 2026-06-20 18:10:50 --- Three months ago, I was hammering away at my keyboard, watching a line of text spin on the screen: **How exactly does MoE s

· 6 min read

Technical Principles of Chain-of-Thought in Large Models (English)

> Generated: 2026-06-20 18:05:02 --- Just now, a friend came running over excitedly and asked me: "Quick, look! This model says it 'thought' for 30 seconds—is t

· 3 min read

SFT, RLHF, DPO, IFT - (English)

> Generated: 2026-06-20 17:56:39 --- To be honest with you, now whenever I hear the phrase "DPO is cheap," I get a headache—really, a headache. Last year I spen
