
Hi again!
This is the second part of my basic tutorial series. It follows a request from one of my patrons here, and I'll cover how to make a fan art NSFW scene with specific parameters. (Sorry for any mistakes or confusion again.)
If you want to post your results to my Discord, you're welcome there (it's free for everyone, but there are some Patreon-only channels, sorry): Discord Link

Summary:

1. Explanation on models
2. Prompts Explanation
3. Recommended Negative Embeddings
4. Extra networks
5. Generating the image
6. Final thoughts
7. Troubleshooting

-----

1. Short explanation on types of AI models. (Explanation taken from: Stable Diffusion Art)

1.1. Checkpoints: Models, sometimes called checkpoint files, are pre-trained Stable Diffusion weights intended for generating general or a particular genre of images.

What images a model can generate depends on the data used to train them. A model won’t be able to generate a cat’s image if there’s never a cat in the training data. Likewise, if you only train a model with cat images, it will only generate cats.

These are the real Stable Diffusion models. They contain all you need to generate an image. No additional files are required. They are large, typically 2 – 7 GB.

1.2. Hypernetworks: They are additional network modules added to checkpoint models. They are typically 5 – 300 MB. You must use them with a checkpoint model.

1.3. Textual Inversions: Also called embeddings. They are small files defining new keywords to generate new objects or styles. They are small, typically 10 – 100 KB. You must use them with a checkpoint model.

1.4. LoRA (Low-Rank Adaptation): They are small patch files to checkpoint models for modifying styles. They are typically 10 – 200 MB. You must use them with a checkpoint model.

1.5. Extra personal notes (yes, from me, koikoi):
Hypernetworks are mostly not used anymore; don't mind them.
Textual inversions are very nice, but they're harder to train than LoRA and may lead to worse results if not trained right.
Checkpoints can also be mixed/merged, so you will find a lot of these mixes on civitai.
LoRA are the most used to tackle a specific subject like a character, style, or action.
There are also LyCORIS LoCon models, which are very similar to LoRA but require an additional extension to work.
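
If you're curious how these pieces attach to each other outside the webui, here's a minimal Python sketch using the Hugging Face diffusers library. This is just an illustration of the relationships (not what the webui runs internally), and the file names are placeholders:

import torch
from diffusers import StableDiffusionPipeline

# The checkpoint is self-contained: it alone can generate images.
pipe = StableDiffusionPipeline.from_single_file(
    "some_checkpoint.safetensors", torch_dtype=torch.float16
).to("cuda")

# A textual inversion is a tiny file that just teaches a new keyword.
pipe.load_textual_inversion("EasyNegative.pt", token="EasyNegative")

# A LoRA is a small patch applied on top of the checkpoint's weights.
pipe.load_lora_weights("some_lora.safetensors")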

2. Prompts explanation.

2.1. To explain prompts I must explain that there is basically one big ancestor of almost all checkpoint models, which is Stable Diffusion 1.5 (the most used models are derivatives of this version).
Stable Diffusion 1.5 uses prompts that read like a description. You would type things like: A raw photo of a woman, carrying a bouquet of red flowers, walking on the sidewalk with bushes and trees in the background.

However, there was an anime-based model built on Stable Diffusion 1.5; its creators altered parts of the network to work with Danbooru tags and trained it on many thousands of anime art images. It's named Novel AI (NAI). This one works better with tags like: 1girl, holding bouquet, red flowers, walking, sidewalk, bushes, trees background

Many models now on the internet are derivatives of these two models. (There are rare exceptions).

The example I've used in the first part is AbyssOrangeMix2, which is a merged model that used NAI as part of its mix.
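
To make the difference concrete, here's how the two prompt styles for the same scene would look if you were scripting a generation in Python (illustrative strings only; in the webui you just type them into the prompt box):

# Description style, as base Stable Diffusion 1.5 expects:
natural_prompt = ("A raw photo of a woman, carrying a bouquet of red flowers, "
                  "walking on the sidewalk with bushes and trees in the background")

# Danbooru-tag style, as NAI-derived anime models expect:
tag_prompt = "1girl, holding bouquet, red flowers, walking, sidewalk, bushes, trees background"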

2.2. About negative prompts:

These are very important, but usually you don't need to change them much, so you can keep them saved and alter them a bit depending on your needs.

Many people type prompts like: low quality, worst quality, bad anatomy, lowres, bad painting.
And you are probably wondering: why do I have to type these kinds of prompts in negative? Does SD try to make bad images on purpose?
It happens that SD doesn't know what good and bad are. But it was trained in a way that bad images are tagged with low-quality tags, so if you put these in negative, it will avoid making something similar to what it learned as "low quality".

Anyway, an important secret to getting good quality images is to actually tell Stable Diffusion NOT to make bad quality. There are many Textual Inversion models on the internet, and many of them are trained to be placed in negative prompts, so they will help your image not be bad.

3. Recommended negative prompts for most cases (and their textual inversions):

I use this as negative 90-95% of the time:
ng_deepnegative_v1_75t, (worst quality, low quality:1.3), EasyNegative, bad-hands-5, (bad_prompt_version2:0.7), (painting by bad-artist-anime:0.7), bad-image-v2-39000, extra digits, bad anatomy, text, logo, censored, hair intakes, fewer digits, cropped

I've linked the Textual Inversions in the text above for you to click and download. They're optional, but I recommend them.
Place them in the (stable-diffusion-webui\embeddings) folder.
Remember to reload your UI if you already have it open.
Check that their names are exactly the same as in the prompts, because what I typed in the prompts matches their file names exactly. Example: EasyNegative is EasyNegative.pt.

Note: Spaces between commas don't matter; they just make it visually easier to read.
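
If you ever script generations outside the webui, negative prompts and these embeddings work like this in diffusers (a sketch continuing the example from section 1; note that plain diffusers does not parse the webui's (token:1.3) weight syntax, so I'm leaving the weights out here):

# Load a negative embedding, then pass a negative prompt at generation time.
pipe.load_textual_inversion("embeddings/EasyNegative.pt", token="EasyNegative")

image = pipe(
    prompt="1girl, solo, masterpiece, best quality",
    negative_prompt="EasyNegative, worst quality, low quality, bad anatomy, cropped",
).images[0]
image.save("test.png")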

4. Show extra networks

There should be a button similar to this below your Generate button in your webui.
Click on it.
It will open 4 tabs:
Textual Inversion, Hypernetworks, Checkpoints, Lora.
If your Textual Inversions are not showing there, hit the Refresh button.
If they are still not there after reloading, check whether you downloaded them to the right folder.

Now let's download two LoRAs for this tutorial. (You are free to do it differently; I'm making this to answer a specific request.)

Hange (Attack on Titan) LoRA - By Peithos

Shingeki no Kyojin (Attack on Titan) Anime Style LoRa - by Lykon

Place them in: (\stable-diffusion-webui\models\Lora) folder.

On your "Show extra networks" go to your Lora tab and hit the Refresh button again.
They will show there as 2 buttons.


5. Generating the image

5.1. Before generating images, you must understand that there are 'weights' for your tags/tokens in prompts. You might have noticed that I typed (worst quality, low quality:1.3) this way in the negative prompts.
Parentheses increase the importance of a token. You can also stack multiple parentheses on the same thing, like (((this))), but be careful: it may be too strong and cause weird results.
There is also the numeric value following the token. The ":1.3" works as a strength multiplier that makes it more important. Be mindful, because raising a weight can also increase the quantity of whatever the token describes.
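
For reference, in the webui each plain parenthesis layer multiplies a token's weight by 1.1, while an explicit (token:1.3) sets the multiplier directly. A tiny Python sketch of that arithmetic (simplified; the real webui parser handles nesting and more syntax):

def emphasis_weight(paren_layers=0, explicit=None):
    # An explicit (token:1.3) value overrides plain parentheses.
    if explicit is not None:
        return explicit
    # Each ( ) layer multiplies the weight by 1.1.
    return 1.1 ** paren_layers

print(emphasis_weight(paren_layers=3))   # ((( token ))) -> about 1.331
print(emphasis_weight(explicit=1.3))     # (token:1.3)   -> 1.3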

5.2. I recommend you place my prompts in your webui now and follow the settings I use; don't worry about understanding every detail for now.

Positive: (masterpiece), (best quality), 3D, <lora:aot_style:0.6>, aot style, <lora:HangeAOT:0.8>, HangeAOT, glasses, (eyepatch), (detailed face), 1girl, solo, ponytail, bangs, nose, brown eyes, brown hair, looking at viewer,  NSFW, naked, completely naked, nudity, breasts, (huge breasts:0.75), adult woman, nipples, collarbone, pressing tits, paizuri, (blush), angry, teeth, clenched teeth, tsurime, titjob, 1boy, penis, pov, <lora:povPaizuriLora1MB_povpaizuri:0.7>, roman city, open mouth

Negative: ng_deepnegative_v1_75t, (worst quality, low quality:1.3), EasyNegative, bad-hands-5, (bad_prompt_version2:0.7), (painting by bad-artist-anime:0.7), (normal quality:1.2), bad-image-v2-39000, extra digits, bad anatomy, text, logo, censored, hair intakes, fewer digits, cropped, strabismus, paintings, sketches

Settings:

AbyssOrangemix2_Hard.safetensors as checkpoint.
orangemix.vae.pt as SD VAE
Sampling Method: Euler a
Sampling steps: 28
Hires. fix: Check
Upscaler: R-ESRGAN 4x+Anime6B
Hires steps: 15
Denoising strength: 0.35
Upscale by: 1.5 or 2.0 (Depending on your GPU)
Width: 512
Height: 768 (or 512 if your GPU is weak)
CFG Scale: 7
Seed: 3519665097 (this is to make your image closer to my result; it probably won't be identical).
It should look something like this:

IMPORTANT: the same naming rule applies to LoRA. I'm using <lora:aot_style:0.6> and <lora:HangeAOT:0.8> in the positive prompts. These are used to call the LoRA. If a name is different (even in letter capitalization), you should fix it by clicking the button in your Lora tab and replacing the token, keeping the same weight: 0.6 for aot_style and 0.8 for HangeAOT.
This is probably where you will make mistakes, so pay close attention to this part.
The order of these tokens doesn't matter at all, but they should be between "<>" and have their names exactly as their files.
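
If you'd rather reproduce these settings from a Python script instead of the webui, a rough diffusers equivalent of the first pass looks like this (continuing the earlier sketch; Hires. fix is a webui feature, and diffusers doesn't parse the (x:1.3) or <lora:...> syntax, so LoRAs must be loaded separately as shown in section 1):

import torch
from diffusers import EulerAncestralDiscreteScheduler

# "Euler a" in the webui corresponds to the Euler Ancestral scheduler.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

# Fixing the seed, like the Seed field in the webui.
generator = torch.Generator("cuda").manual_seed(3519665097)

image = pipe(
    prompt="masterpiece, best quality, 1girl, ...",    # plain-text version of the positive prompt
    negative_prompt="worst quality, low quality, ...", # plain-text version of the negative prompt
    num_inference_steps=28,  # Sampling steps
    guidance_scale=7,        # CFG Scale
    width=512,
    height=768,
    generator=generator,
).images[0]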

5.3. Now let's hit the Generate button. It will take some time to generate.

If you've generated something like this (without the censorship) then you're good.

6. Final thoughts

I could explain more about why I chose these settings and prompts, but that's better left for another post. You can try figuring it out on your own. Usually it's better to check the original checkpoint and LoRA pages and read their descriptions, so you'll understand better how they work individually.
For now, understand that order is important for tokens that are not the <lora> tokens.
Please check that you did everything the way I did in this tutorial.

7. Troubleshooting
7.1. Out of memory: reduce the upscaling or test without Hires. fix. The image will look worse, though. (If you ever script outside the webui, see the sketch after this list for two memory savers.)
7.2. My image looks too different from yours: check that you used the same prompts and settings. Also go back to the first part of this tutorial and make sure you followed my settings suggestions, and verify that your LoRA has the same name as mine. A way of testing it: if you click your Lora button, it should erase <lora:aot_style> or <lora:HangeAOT> from your prompts. If instead of erasing it adds another token, then the name is different, so replace my original token with your own and use the same weights.
7.3. Your image is good, but it's different: different GPU cards do their math slightly differently and can lead to different results. Also, models come in different versions with different hash codes, which will also generate different results. (That's actually cool, since you can make more unique images.)
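
As mentioned in 7.1, if you run generations from Python with diffusers instead of the webui, two commonly used VRAM savers are these (a sketch continuing the earlier examples; the webui has its own launch options for low VRAM):

# Compute attention in smaller chunks to lower peak VRAM usage.
pipe.enable_attention_slicing()

# Keep idle parts of the model on the CPU, moving them to the GPU only when
# needed (requires the accelerate package).
pipe.enable_model_cpu_offload()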

Good luck! I'll make a third part if you like this one. (Not today, though.)
