What Is a LoRA Model, and How Do You Use and Train One? What You’re Looking For (English)

Generated: 2026-06-21 23:08:50

---

You see, the first time I encountered LoRA, I was totally lost.

My mind was stuck on just one question: what’s the real difference between this thing and those gigantic models that take up several gigabytes? I dug through resources, scoured forums, and stumbled into so many pitfalls my knees practically gave out before I finally figured it out. Today, no beating around the bush—from concepts to hands‑on practice, I’m laying out all the hard lessons I’ve learned and the experience I’ve accumulated. Pull up a chair and listen.

What LoRA Actually Is

Don’t let the academic term “Low‑Rank Adaptation” scare you. Put simply, LoRA is a way to patch Stable Diffusion’s big model.

The base model is the skeleton; LoRA is the muscle.

Think about it: during training, you don’t touch the skeleton itself. You just add a few small ropes at the key connection points—low‑rank matrices. When you use it, you hook those ropes up, and the output changes.

This idea was originally created for NLP. Back then, GPT‑3 had 175 billion parameters; retraining it once could cost you an apartment. LoRA’s approach was clever: you freeze the original weights and train only the low‑rank matrices, which are tiny compared with the full model, then plug them into the original model, and they work. Later, the AI art community adapted it for Stable Diffusion with surprisingly good results.
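To make the "small ropes" idea concrete, here is a minimal pure-Python sketch of a low-rank patch. It assumes a single square layer of width 768 and rank 8, numbers I picked only for illustration; the function names are mine, not from any library. The frozen weight W stays untouched, and only the two thin factors A and B would be trained.

```python
# Hypothetical illustration of the low-rank idea; sizes are made-up examples.

def full_params(d_out: int, d_in: int) -> int:
    """Parameters in a full weight matrix W (d_out x d_in)."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Plain-Python matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """Compute W x + scale * B (A x): the frozen weight plus the low-rank patch."""
    base = matvec(W, x)
    patch = matvec(B, matvec(A, x))
    return [b + scale * p for b, p in zip(base, patch)]


if __name__ == "__main__":
    d = 768  # example layer width, chosen only for illustration
    print(full_params(d, d))     # 589824
    print(lora_params(d, d, 8))  # 12288
```

With these example sizes the patch holds about 2% of the parameters of the layer it modifies, which is roughly why a LoRA file ends up so much smaller than a base model.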

Guess what? On Civitai, there are only about 2,000 SD base models, while the other 40,000‑plus are small models such as LoRA!

The ink‑wash painting style and the Yae Miko IP I’ve been using? All cranked out by LoRA. A LoRA file is usually around 144 MB, compared to the 2 GB+ of an SD base model—it’s like a tiny USB drive going up against a massive hard drive.

But! There’s one huge requirement: LoRA must be used with its base model, and the base model version has to match. If you train a LoRA on SD 1.5, you can’t use it on SDXL—otherwise, you get horror movie material. I’ve said this once, twice, three times—really, if the versions don’t match, the image turns into a nightmare.

How to Use LoRA

Most people use stable‑diffusion‑webui. It’s simple: drop the downloaded .safetensors file into the models/Lora folder, and when generating an image, write <lora:filename:weight> in the prompt.

The weight is usually set between 0.6 and 1.0. If you crank it too high, the image falls apart. That hits home for me—the first time I set it to 1.5, the face in the output exploded.
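If you build prompts from a script, a small helper can keep the weight inside the safe range. This is only a sketch: it assumes the webui prompt tag has the form <lora:name:weight>, and the helper name, the file name inkwash_style, and the clamping behavior are my own choices for illustration.

```python
def lora_tag(name: str, weight: float = 0.8,
             low: float = 0.6, high: float = 1.0) -> str:
    """Build a <lora:name:weight> prompt tag, clamping weight into [low, high]."""
    clamped = max(low, min(high, weight))
    # ":g" drops trailing zeros, so 1.0 is written as 1 and 0.70 as 0.7.
    return f"<lora:{name}:{clamped:g}>"


if __name__ == "__main__":
    # "inkwash_style" is a hypothetical file name (without the .safetensors suffix).
    prompt = f"1girl, ink wash painting, {lora_tag('inkwash_style', 0.7)}"
    print(prompt)  # 1girl, ink wash painting, <lora:inkwash_style:0.7>
```

Clamping silently is a design choice; if you would rather know when you asked for 1.5, raise an error instead.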

After Civitai got blocked in China, I started using the mirror site aigccafe.com, which is accessible directly from China and has good download speeds. When downloading a LoRA, always check whether its base model is SD 1.5 or SDXL, and don’t mix them up.

Speaking of which, let me add: have you ever spent a bunch of time downloading a LoRA, imported it, and then the face looks like it’s been pixelated? Nine times out of ten, it’s because the base model version was wrong. It’s like plugging a USB drive into the wrong port—right port, instant recognition; wrong port, nothing reads.
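When the download page doesn't say which base model a LoRA was trained on, the file itself sometimes does. A .safetensors file starts with an 8-byte little-endian length followed by a JSON header, and that header may carry an optional "__metadata__" entry. The sketch below reads it with the standard library only. The key "ss_base_model_version" is an assumption: some training scripts write it, but many files have no metadata at all, so treat "unknown" as a normal answer.

```python
import json
import struct


def read_safetensors_metadata(path: str) -> dict:
    """Return the optional __metadata__ dict from a .safetensors header."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header.get("__metadata__", {})


def base_model_hint(path: str) -> str:
    """Best-effort guess at the base model a LoRA was trained on."""
    meta = read_safetensors_metadata(path)
    # Assumed key: written by some trainers, absent in many files.
    return meta.get("ss_base_model_version", "unknown")
```

Only the header is read, so this is fast even on large files. If the hint comes back "unknown", fall back to whatever the download page says.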

Training Your Own LoRA

Alright, here comes the main event. I spent a whole month wrestling with this.

I tried Qiuye’s SD‑Trainer, Zhunijiang’s Cyborg Furnace, and OneTrainer—every step was a tale of tears. Now I’m pouring all my experience out for you.

Step 1: Collecting Data

At first, I tried to be lazy and just grabbed 20 images from the web to start training.

What happened? The model could only reproduce the angles from those few images. Change the lighting, and it would completely break down. At that moment, I spent a good ten minutes yelling at my screen.

Later, I followed the rules strictly:

At least 15 images, preferably 30 to 50. For a test with a real person’s face, I took 40 photos from different angles and expressions, plus 5 full‑body shots.
Image quality comes first. Blurry, too dark, or low‑contrast images get deleted right away. I use feh on Linux to view and delete images; for Windows, I recommend IrfanView.
If you’re training a character, the subject must be clear, and background clutter should be minimized. When I trained a “cyberpunk female warrior,” I deliberately chose images with simple backgrounds—so the model would only learn the character’s features, not some random trash can in the background.

You see, the logic is simple: you want the model to learn you, not the flowerpot in the background.
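Before training, it helps to check that the folder actually meets the image-count rule. Here is a minimal sketch using only the standard library; the thresholds come from the rules above, while the function name and the list of file suffixes are my own assumptions. It only counts files and cannot judge blur or lighting, so you still need to review the images by eye.

```python
from pathlib import Path

# Suffixes treated as images; adjust to whatever your trainer accepts.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def dataset_report(folder: str, minimum: int = 15, preferred: int = 30) -> str:
    """Count image files in a folder and compare against the size rules."""
    images = [p for p in Path(folder).iterdir()
              if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES]
    n = len(images)
    if n < minimum:
        return f"{n} images: too few, collect at least {minimum}"
    if n < preferred:
        return f"{n} images: usable, but {preferred}+ is better"
    return f"{n} images: good to go"
```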

Step 2: Tagging and Regularization

This is where it’s easiest to step on a mine. Step on it once, and it will teach you a lesson.

Many in the community say fewer tags are better—some even use only a single trigger word. I believed that at first too. Result? The model was severely overfitted; it only recognized those few images. Change the background, and it failed.

Then it clicked: too few tags, and the model is like a student who only memorizes the answer key—if you don’t ask the exact same question, it hands in a blank paper.

So what’s the right approach? First, use the WD14 tagger for automatic tagging, then manually check and remove irrelevant tags. But don’t delete all the environment tags; keep some to help the model understand scene associations.
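The manual cleanup pass can be partly scripted. This sketch assumes the tagger wrote one comma-separated caption per image, which is a common setup but worth checking against your own files. The trigger word "cyberwarrior" and the drop list are hypothetical examples. Junk tags are removed, duplicates are collapsed, the trigger word goes first, and environment tags such as "street" are kept.

```python
def clean_caption(caption: str, trigger: str, drop: set) -> str:
    """Remove unwanted tags and duplicates, then put the trigger word first."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    kept = []
    for tag in tags:
        # Skip junk tags, repeats, and any existing copy of the trigger word.
        if tag in drop or tag == trigger or tag in kept:
            continue
        kept.append(tag)
    return ", ".join([trigger] + kept)


if __name__ == "__main__":
    raw = "1girl, watermark, neon lights, street, text, 1girl"
    print(clean_caption(raw, "cyberwarrior", {"watermark", "text"}))
    # cyberwarrior, 1girl, neon lights, street
```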

As for regularization, I didn’t use it at all at first.

The result? Whenever I used the class word (e.g., “1girl”), the trained LoRA would only generate the face from the training set; even the clothes were fixed. Frustrating, right?

Later, I followed the Dreambooth approach: I used the base model to generate 200 images of various girls with “1girl” and put them into the regularization folder, then trained. And then? The LoRA became much more stable!
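Here is a rough sketch of how the training and regularization folders could be laid out. The "<repeats>_<name>" folder naming is an assumption borrowed from a convention used by some Dreambooth-style training scripts; the tool you use may expect a different layout, so check its documentation first. The trigger word "yae", the repeat count, and the function name are hypothetical.

```python
from pathlib import Path


def make_layout(root: str, trigger: str, class_word: str, repeats: int = 10) -> dict:
    """Create train/ and reg/ folders using an assumed <repeats>_<name> convention."""
    base = Path(root)
    # Training images: your subject, captioned with trigger + class word.
    train_dir = base / "train" / f"{repeats}_{trigger} {class_word}"
    # Regularization images: generic class images made with the base model.
    reg_dir = base / "reg" / f"1_{class_word}"
    for d in (train_dir, reg_dir):
        d.mkdir(parents=True, exist_ok=True)
    return {"train": train_dir, "reg": reg_dir}
```

The roughly 200 base-model images generated with the class word would then go into the reg folder, and your own photos into the train folder.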

Let me tell you: it’s like teaching a child to recognize faces—if you only show him a photo of your whole family, he’ll think everyone in the world looks like that. Show him photos of many different people, and he’ll learn what a real “human face” is.

Step 3: Choosing Tools and Configuring Parameters

There are three mainstream training tools on the market. I’ve used each one, and I have something to say about each.

Qiuye’s SD‑Trainer

Most beginner‑friendly, with a fully Chinese interface and one‑click configuration. I sometimes run into errors, which a restart usually fixes. Most hyperparameters are preset; if you don’t want to bother, the defaults can produce a decent LoRA.

Zhunijiang’s Cyborg Furnace (Saibo Danlu)

My personal recommendation. Cool interface, intuitive operation, and the author updates frequently. It lets you preview the training progress—every 50 steps you get an image, so you can stop and adjust at any time. It has built‑in presets for “character,” “building,” etc., and even the tag editor includes a translation feature. The only pitfall is that the training directory cannot contain Chinese characters, or it will throw a path error.

OneTrainer

If you like to tweak parameters yourself, try this one. It supports scaling from SDXL to SD format, but when I first used it, the scaler didn’t work; it turned out to be a version issue, and it worked after updating. OneTrainer has fine‑grained settings such as r (rank), lora_alpha, and lora_dropout, all of which need manual input. I usually start with r=8, alpha=16, dropout=0.1, and increase r if I have enough VRAM.
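It helps to see how these three numbers relate. In the standard LoRA formulation the low-rank update is multiplied by alpha / r, so r=8 with alpha=16 gives a scale of 2. The sketch below is a plain settings container, not OneTrainer's actual config format; the class name and field names are my own assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the three starting values mentioned above."""
    r: int = 8                  # rank of the low-rank matrices
    lora_alpha: int = 16        # scaling numerator
    lora_dropout: float = 0.1   # dropout applied on the LoRA branch

    def __post_init__(self):
        if self.r <= 0:
            raise ValueError("r must be positive")
        if not 0.0 <= self.lora_dropout < 1.0:
            raise ValueError("lora_dropout must be in [0, 1)")

    @property
    def scale(self) -> float:
        """Multiplier applied to the low-rank update: alpha / r."""
        return self.lora_alpha / self.r


if __name__ == "__main__":
    print(LoraSettings().scale)                       # 2.0
    print(LoraSettings(r=16, lora_alpha=16).scale)    # 1.0
```

One practical consequence: if you raise r but leave alpha alone, the scale shrinks, so the two are usually adjusted together.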

This brings up a counterintuitive truth: you might think more complex means more professional, but in fact, the best tool is the one that suits you. Qiuye is for beginners, Cyborg Furnace for those who like to tinker, OneTrainer for hard‑core experts. Pick according to your own level—it


Cael Lee
