[Tutorial] Pony Diffusion and General Tips (Patreon)
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
Introduction
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
This tutorial covers many things that you may want to know about Pony Diffusion.
What is Pony Diffusion?
Pony Diffusion is a flexible checkpoint based on SDXL.
It was trained differently from the usual SDXL derivatives, so it also behaves differently.
If you are new to anything related to AI you can look for more basic tutorials on the web or try my old tutorials here: Stable Diffusion Tutorials Collection
DISCLAIMER: You should have a good GPU to run it. You can test it for yourself, but if your computer is freezing or lagging too much you should reduce the resolution or stop using it.
Pony Diffusion link: Civitai
When using LoRA you should find models that are compatible with it.
So remember to use the filters.
There are a lot of models that are also derivatives of Pony Diffusion. You can use the same filter with "checkpoint" marked to find other styles.
You can also use LoRA for styles.
An example of styles: Civitai
Why use Pony Diffusion instead of other checkpoints?
It's much more accurate at doing what you want than models based on Stable Diffusion 1.5.
It's flexible, so more people train models for it.
It's free.
Since it's flexible, you can train different styles too.
It produces better hands and feet than the older models.
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
Settings
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
Pony Diffusion uses different prompts than the usual NovelAI derivatives, which are trained on Danbooru tags.
A decent way to work with these tags is to use an extension like this one: https://github.com/64617/pony-diffusion-webui#tag-autocomplete
Follow all the installation steps, because you need to activate a different .csv file.
Pony Diffusion uses quality tags such as score_9, score_8, or score_8_up, and so on. Take a look at the previews on the Pony Diffusion page to see where you should use them.
Pony Diffusion is trained at the same resolutions as SDXL, so sizes from 768px to 1024px are the optimal range. You can try different resolutions between these numbers. I personally use 1152px whenever I can, but it can produce more errors.
Should you use Hires. fix? Most of the time I think the result will be better with it than without it, so I have a default setting that should help.
The more you upscale, the higher the denoising strength should be.
If you upscale by 1.5x, a denoising strength of 0.3 should be enough. If you upscale by 2.0x or a similar size, a denoising strength of 0.4 or more may help.
So a general good setting would be:
Sampling method: Any you prefer; see the Pony Diffusion page for examples.
Sampling Steps: 30-40
Hires. Fix: ✅
Upscaler: R-ESRGAN 4x+ Anime6B or similar.
Hires Steps: 15-20
Denoising Strength: 0.3 (or 0.4)
Upscale by: 1.5 (or 2.0)
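The settings above can be written down as a small Python sketch. The function name, the dictionary keys and the 1.75 cutoff between the two denoising values are assumptions made for illustration; they are not the API of any WebUI.

```python
def hires_settings(width, height, upscale_by=1.5):
    """Return the Hires. fix values suggested above for a given upscale factor."""
    # Assumption: the tutorial only gives 0.3 for 1.5x and 0.4+ for about 2.0x.
    # The 1.75 cutoff between them is an arbitrary choice for this sketch.
    denoising_strength = 0.3 if upscale_by < 1.75 else 0.4
    return {
        "sampling_steps": 30,              # low end of the 30-40 range
        "hires_fix": True,
        "upscaler": "R-ESRGAN 4x+ Anime6B",
        "hires_steps": 15,                 # low end of the 15-20 range
        "denoising_strength": denoising_strength,
        "upscale_by": upscale_by,
        # Image size after the hires pass.
        "final_size": (round(width * upscale_by), round(height * upscale_by)),
    }


settings = hires_settings(1024, 1024, upscale_by=1.5)
print(settings["denoising_strength"], settings["final_size"])
```

Note that a 1024x1024 image upscaled by 1.5 ends up at 1536x1536, so the upscale factor also decides how much GPU memory the second pass needs.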
Should I use Clip Skip 1 or 2? If you aren't using any LoRA, I think 1 is better. If you are using a LoRA, read its description, because it may have been trained with a different Clip Skip.
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
Prompts
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
Prompt Guidance and Help:
Positive:
score_9, score_8_up, score_7_up, source_anime, anime screencap,
red hair, blue eyes, long hair, big breasts,
gala dress, strapless dress, collarbone,
1girl, solo,
waving at you, standing,
open smile, looking at you,
indoors, restaurant,
<lora:modern_anime_screencap_v1_0:1>
Negative:
score_6_down, score_5_down, score_4_down, score_3_down, score_2_down, score_1, extra digits,
NOTE: If you want to make NSFW images, add rating_explicit to the positive prompt. I usually put it with the quality tags.
I split the positive prompt into lines so it's easy to read and change. The lines are, in order:
1. Quality tags.
2. Physical Appearance.
3. Clothing.
4. Characters.
5. Actions.
6. Expressions.
7. Environment.
8. LoRAs
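If you build prompts from a script, the same line order can be kept with a small helper. This is only a sketch: LINE_ORDER, build_prompt and the dictionary keys are hypothetical names picked for this example, and the tags come from the positive prompt above.

```python
# Line order used in this tutorial; the key names are hypothetical.
LINE_ORDER = [
    "quality", "appearance", "clothing", "characters",
    "actions", "expressions", "environment", "loras",
]


def build_prompt(parts):
    """Join the prompt parts, one category per line, in the order above."""
    lines = []
    for key in LINE_ORDER:
        tags = parts.get(key, [])
        if not tags:
            continue  # skip categories that are not used
        if key == "loras":
            # LoRA tags are written as-is, separated by spaces.
            lines.append(" ".join(tags))
        else:
            # Regular tags are comma separated, with a trailing comma per line.
            lines.append(", ".join(tags) + ",")
    return "\n".join(lines)


positive = build_prompt({
    "quality": ["score_9", "score_8_up", "score_7_up", "source_anime", "anime screencap"],
    "appearance": ["red hair", "blue eyes", "long hair"],
    "clothing": ["gala dress", "strapless dress", "collarbone"],
    "characters": ["1girl", "solo"],
    "actions": ["waving at you", "standing"],
    "expressions": ["open smile", "looking at you"],
    "environment": ["indoors", "restaurant"],
    "loras": ["<lora:modern_anime_screencap_v1_0:1>"],
})
print(positive)
```

Changing one thing, such as the clothing, then only means replacing one list and leaving the rest alone.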
I don't do that for negatives, because I only change them depending on what I want to do. For example, if your character keeps getting a mole where you don't want one, you can add "mole" to the negative prompt.
CFG Scale: If you want the style to be stronger, or if the image is not following your prompt, you can increase it. I usually go with 5-7, but you can increase it to 10 if you want.
You can check the images in the post; they are not edited.
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
VAE
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
The VAE I used is different from the usual SDXL_VAE. You can test it if you want by downloading it here.
You can also download the default VAE on the Pony Diffusion page.
There are other VAEs on Civitai if you look for them.
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
Dimensions
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
Another tip I want to point out is that images are generally Square, Portrait or Landscape.
If you want to make a character lying down, seen from the side, you should use Landscape or the AI will have problems making it.
Squares are usually the best for general purposes, so 1024x1024 or 768x768 should be your go-to if you aren't sure.
A standing character fits better, and gets more detail, in Portrait.
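As a rough sketch, picking the size from the orientation could look like this in Python. The function name is hypothetical, and pairing 768 with 1024 for the non-square shapes is my own assumption based on the range given in the Settings section; other sizes in that range work too.

```python
def pick_dimensions(orientation="square", long_side=1024, short_side=768):
    """Return (width, height) for a square, portrait or landscape image."""
    if orientation == "square":
        return (long_side, long_side)
    if orientation == "portrait":
        return (short_side, long_side)   # taller than wide: standing characters
    if orientation == "landscape":
        return (long_side, short_side)   # wider than tall: characters lying down
    raise ValueError(f"unknown orientation: {orientation}")


print(pick_dimensions("portrait"))
print(pick_dimensions("landscape"))
```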
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
Final Section
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
I hope this helps you get started with one of the most promising SDXL checkpoints. We will have SD 3.0 soon, so don't fall behind.
If this tutorial isn't enough, you should dig for more on YouTube or Civitai. (Yes, there are tutorials on Civitai.)