Until recently I've been using Easy Diffusion for AI images, and that's what I covered in my previous AI art posts. That UI is great and does the job quite well, but here's how to get more out of AI images and the recent AI advancements that have been going on.



1. CIVITAI.COM

The best site right now to find AI models & other smaller stuff that I call filters (I'll get into that in a minute). You need a free account to see the NSFW content, which is most of the stuff. The ones with the CHECKPOINT tag are AI models that go in the models/stable-diffusion folder.

2. MAKE YOUR OWN AI MODELS - VERY EASY

You can make your own models by blending/combining other models (checkpoints). In Easy Diffusion there's a Merge Models tab. You pick two models that you like as A and B, go to Make multiple variations, and there you can set the percentages of each, like in a recipe (50-50 if you want, or 5% A and 95% B, etc.).

By making multiple variations, Easy Diffusion will generate several models for you; you can then test them out, keep the one that works best, and delete the rest. In the example below it will generate 10 models named like:
- MyBlendName-0.05 (5% of A blended into B)
- MyBlendName-0.14 (14% of A blended into B), etc.
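
Under the hood, a merge like this is basically a weighted average of every weight tensor in the two checkpoints. Here's a minimal sketch of that idea, just for illustration; the file names are made up and Easy Diffusion's Merge Models tab does all of this for you:

```python
import torch

alpha = 0.05  # 5% of model A blended into model B

# Hypothetical checkpoint files; most .ckpt files keep their weights
# under a "state_dict" key, hence the fallback below.
a = torch.load("ModelA.ckpt", map_location="cpu")
b = torch.load("ModelB.ckpt", map_location="cpu")
a = a.get("state_dict", a)
b = b.get("state_dict", b)

# Weighted average of the float weight tensors present in both models;
# everything else is kept from B unchanged.
merged = {}
for key, value in b.items():
    if key in a and torch.is_tensor(value) and value.is_floating_point():
        merged[key] = alpha * a[key] + (1 - alpha) * value
    else:
        merged[key] = value

torch.save({"state_dict": merged}, "MyBlendName-0.05.ckpt")
```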


3. AUTOMATIC1111

Easy Diffusion is awesome and easy, but for more features you might want to try Automatic1111 too. It also runs locally like Easy Diffusion, but it's a bit more advanced and has more people developing it and adding features.

INSTALL STEP 1: CHECK PYTHON
If you've already played with AI stuff then you already have Python. You might just want to check that you have a recent enough version: open a command prompt and type python --version. If it's lower than 3.10, download a more recent version and the installer will update it: https://www.python.org/downloads/release/python-31010/ (Windows installer, 64-bit). If you don't have Python at all, you need to install it.
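
If you prefer, this can also be checked from inside Python itself; the snippet below just prints the version and warns if it's older than 3.10:

```python
# Same information as "python --version", printed from inside Python.
import sys

print(sys.version)
if sys.version_info < (3, 10):
    print("Python is older than 3.10 - grab a newer installer")
```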

INSTALL STEP 2: DOWNLOAD AUTOMATIC1111
Click the green Code button at the top here and pick Download ZIP, then unzip it. Then you can run webui-user.bat to start AUTOMATIC1111. The first time you run it, it will install everything it needs, which will take a while.


RUNNING IT:
It will not open a browser tab automatically like Easy Diffusion does. Instead, when it finishes loading, webui-user.bat will print something like Running on local URL: http://127.0.0.1:7860 in the command prompt. You have to copy and paste that into a browser to access the web UI. It's always the same URL, so you can bookmark it.

PERFORMANCE:
If you don't have a crazy expensive GPU, you can edit webui-user.bat with Notepad and change the COMMANDLINE_ARGS line to: set COMMANDLINE_ARGS=--xformers --medvram. Restart Automatic1111 if it's already open. This will prevent crashes due to running out of memory.
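
If you're not sure whether you need those flags, here's a rough way to check how much VRAM your GPU has (it assumes a Python environment with PyTorch available, for example the one A1111 sets up on first run):

```python
# Prints the GPU name and total VRAM, to help decide if --medvram is needed.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA GPU detected")
```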


HOW TO USE IT:

I won't do a full guide, there are lots of them already, videos on YouTube, etc. But the gist of it is that there are multiple tools you can work with at the same time; each tab is a separate thing.

WHERE TO ADD STUFF FROM CIVITAI:
- CHECKPOINTS go in /models/Stable-diffusion - these are the big AI models, usually a few GB
- LORAs go in /models/Lora - filters basically, I'll get into that later, usually 50-150 MB
- TEXTUAL INVERSIONS go in /embeddings - filters basically, but smaller in size, a few KB
- HYPERNETWORKS go in /models/hypernetworks - filters basically, around 85 MB
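
If it helps, here's a tiny, purely illustrative helper that prints the same map relative to wherever you unzipped A1111; the root folder name is just a placeholder:

```python
# Prints where each type of Civitai download goes inside the A1111 folder.
from pathlib import Path

root = Path("stable-diffusion-webui")  # adjust to your actual install folder
locations = {
    "CHECKPOINT": "models/Stable-diffusion",
    "LORA": "models/Lora",
    "TEXTUAL INVERSION": "embeddings",
    "HYPERNETWORK": "models/hypernetworks",
}
for kind, sub in locations.items():
    print(f"{kind:17} -> {root / sub}")
```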

You can add notes to them (like extra tags from civitai that some have) by adding a .txt file with the same name as the model, in the same folder as it. I think the txt can only hold a single, short line of text. It will then appear under the model (LORA/hypernetwork/etc.) name.
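
For example (the file name here is hypothetical; any LORA, hypernetwork, or embedding works the same way):

```python
# Creates a one-line note file next to a LORA; A1111 will show the text
# under that LORA's name in the UI.
from pathlib import Path

lora = Path("models/Lora/CostumeBikiniJeans_v1.safetensors")  # example file
note = lora.with_suffix(".txt")  # same name, same folder, .txt extension
note.write_text("trigger words: bikini, jeans")  # keep it to one short line
```

Of course you can just create the .txt by hand in Notepad; the only thing that matters is the naming convention.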

LORAs/textual inversions/hypernetworks are sort of like addons/filters that make the AI model go in more specific directions and help you achieve better results. You can get them from civitai.com. To use them, there's a button under the Generate button that will show them.


Then they appear nicely in tabs, and when you click on them they get added to the text prompt. LORAs & hypernetworks get added like <lora:CostumeBikiniJeans_v1:1>, where :1 means it's at 100%; if you change it to 0.5 it will be at 50% "filter power", so to speak.

For NSFW stuff there are lots of LORAs that help get better and more specific images. You can set preview photos for them by generating an image, selecting it, then hovering over a checkpoint/LORA/etc. name and clicking Replace preview.

If you get blurry images for some reason, like low quality with artifacts, you might want to copy https://huggingface.co/stabilityai/sd-vae-ft-mse-original/blob/main/vae-ft-mse-840000-ema-pruned.ckpt (or copy it from /vae in Easy Diffusion) to models/VAE, and in A1111, in Settings > Stable Diffusion > SD VAE, enable it. If you have models/checkpoints that don't have "-vae" in their name, I think they'll be blurry without it.

4. IMAGES WITH PROMPTS SAVED IN THEM

A cool thing about images made with A1111 is that they keep their generation settings in metadata. So you can drag an image into the PNG Info tab, and it will show all the settings it was generated with, and you can quickly apply those settings again. This also works for the collage image A1111 generates for each batch of images.
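
In case you're curious where those settings live: they're stored in the PNG's text metadata, and the PNG Info tab is just a convenient way of reading them. A minimal sketch with Pillow (the "parameters" key name is how current A1111 builds appear to label it, and the file name is just a placeholder):

```python
# Reads the generation settings that A1111 embeds in its PNG files.
from PIL import Image

img = Image.open("00001-1234567890.png")  # any A1111-generated image
print(img.info.get("parameters", "no generation settings found"))
```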


5. CONTROLNET

This is a very exciting new feature that allows controlling the poses of the generated images. I won't go too in-depth about it, you can look it up on YouTube if you want to see it in action or learn more. It's a must-have though! A few years ago it would have been considered magic.

Very quick guide:
- In A1111, in the Extensions tab > Available, click Load and search for sd-webui-controlnet, click Install, and restart A1111 when finished (close the command prompt and open webui-user.bat again)
- Download these https://huggingface.co/webui/ControlNet-modules-safetensors/tree/main and put them in models/ControlNet

There will now be a ControlNet section in the UI on the left side, under the prompt stuff. In it, pick canny and control_canny-fp16, check Low VRAM if you're making very large images, drop an image there, and also check Enabled. Now the generated images will try to follow the pose in the provided image.

You can lower the Weight to give the AI more creative freedom if you want to. Instead of canny, you can also use depth (in both dropdowns) for a different style of pose detection. I think it's a bit slower but better in some regards. Canny is pretty reliable and fast though.
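
If you're wondering what the canny option actually does: it's classic edge detection. The reference image is turned into an edge map, and the generation is steered to follow those edges. A rough sketch with OpenCV, just to show the idea (file names and thresholds are placeholders; the ControlNet extension runs this preprocessing for you):

```python
# Produces the kind of edge map the canny preprocessor feeds to ControlNet.
import cv2

img = cv2.imread("reference_pose.png")        # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Canny works on grayscale
edges = cv2.Canny(gray, 100, 200)             # low/high edge thresholds
cv2.imwrite("reference_pose_canny.png", edges)
```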

6. VAM PHOTOS

You can use all of the above with vam screenshots to make them more realistic & sexy, and do photo sets or game art. In the img2img tab, you can drag and drop a vam image into both the image prompt and the ControlNet one.

ControlNet will make sure the pose stays perfect and you don't have to write prompts to help it. It will also help with outlines for clothes and other shapes, which will help with consistency across images.

The image prompt will make sure the colors are kept and stay consistent as well.  

Then it's up to the text prompt where you have to add as many details as possible describing the scene, type of clothing, etc. to help the AI do very similar things across different images.

One trick for face consistency across images is to use LORAs (or textual inversions, hypernetworks) of a known character at a lower value like 0.1-0.5. Something like <lora:bimboMakeup_V1:0.5> for example. The generated images will then be much more similar, since they're not as random as they'd normally be but focused on a specific person thanks to the LORA filter.

The attached images are a few examples of that. They're very low-effort though: I did them very quickly, picked the first results, and didn't experiment much with the settings. By doing more variations you can pick the most similar ones, force consistency for smaller details, and get much better results.


Comments

Mark Hunter

Thank you for this. I have been wanting to try Automatic1111 and ControlNet but was not sure of the installation process. I have it now and have been learning it for the past few days. Pretty good stuff!

SPQRAeternum

Hell yeah! As if AI images weren't impressive enough already, now you can even completely control them too 🤯. I can't even process how many things this will disrupt and in what ways