
Quickstart guide if you're new to google colab notebooks: https://docs.google.com/document/d/1gu8qJbRN553SYrYEMScC4i_VA_oThbVZrDVvWkQu3yI

Local+colab install guide: https://github.com/Sxela/WarpFusion/blob/main/README.md


Notebook download: https://www.patreon.com/posts/81167091

Changelog:

New:

  • add masked diffusion callback
  • add masked latent guidance
  • add option to offload model before decoder stage
  • add fix noise option for latent guidance
  • add noise, noise scale, fixed noise to masked diffusion
  • add ControlNet models from https://github.com/lllyasviel/ControlNet
  • add ControlNet downloads from https://colab.research.google.com/drive/1VRrDqT6xeETfMsfqYuCGhwdxcC2kLd2P
  • add settings for ControlNet: canny filter ranges, detection size for depth/norm and other models
  • add vae ckpt load for non-ControlNet models
  • add selection by number to compare settings cell
  • add noise to guiding image (init scale, latent scale)
  • add noise resolution
  • add guidance function for init scale
  • add fixed seed option
  • add separate base model for controlnet support
  • add smaller controlnet support
  • add invert mask for masked guidance
  • add use_scale options to use loss scaler (guidance seems to work faster)
  • add instruct pix2pix from https://github.com/timothybrooks/instruct-pix2pix
  • add image_scale_schedule and template to support instruct pix2pix
  • add frame range to render a selected range of extracted frames only
  • add load settings by run number
  • add model cpu-gpu offload to free some vram

Fixes:

  • fix frame_range not working when not starting from zero, thanks to Oleg#8668
  • add controlnet_preprocessing switch to allow raw input
  • fix sampler being locked to euler
  • fix image_resolution error for controlnet models
  • fix controlnet models not downloading (file not found error)
  • fix settings not loading with -1 and empty batch folder
  • fix prettytable requirement
  • fix BLIP generating captions for every n-th frame even with a different setting
  • fix load settings not working for filepath
  • fix norm colormatch error
  • fix warp latent mode error
  • fix prompts not working for loaded settings thanks to Euclidean Plane#1332
  • fix prompts not being loaded from saved settings
  • fix xformers cell hanging on Overwrite user query
  • fix sampler not being loaded
  • fix description_tooltip=turbo_frame_skips_steps error
  • fix -1 settings not loading in empty folder
  • fix -1 settings error
  • fix colormatch offset mode first frame error


Separate base model for ControlNet / small controlnet support

You can now specify any v1.x model checkpoint with any of the controlnet_v1.5_* model_version options. The checkpoint will be assumed to be valid and loaded as the base of the ControlNet model. The notebook will then look for a small ControlNet and download it if it isn't found.
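
Roughly, the loading logic amounts to the following (a minimal sketch; `download_small_controlnet` is a hypothetical helper, and attribute names are illustrative, not the notebook's actual internals):

```python
import os
import torch

def load_controlnet_base(model, base_ckpt_path, small_cn_path):
    # Assume the user-supplied checkpoint is a valid v1.x base model.
    base_sd = torch.load(base_ckpt_path, map_location='cpu')
    base_sd = base_sd.get('state_dict', base_sd)
    model.load_state_dict(base_sd, strict=False)

    # Look for the small controlnet and download it if it isn't found.
    if not os.path.exists(small_cn_path):
        download_small_controlnet(small_cn_path)  # hypothetical helper
    cn_sd = torch.load(small_cn_path, map_location='cpu')
    cn_sd = cn_sd.get('state_dict', cn_sd)
    model.control_model.load_state_dict(cn_sd, strict=False)
    return model
```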

Masked diffusion

You can now use masked diffusion to stylize masked areas for more steps than the whole image. For now, the mask source is the consistency mask.

Stable-settings -> Non-GUI -> mask_callback

mask_callback:

0 - off. 0.5-0.7 are good values. The value is a fraction of the actual diffusion steps being made: the inconsistent area is diffused on its own for that fraction of the steps, then the whole image is diffused. So with 50 steps, 0.5 strength and 0.7 mask_callback you will diffuse the masked area for 50*0.5*0.7 = 17 steps, and the whole image will be diffused for 50*0.5*(1-0.7) = 8 more steps to smooth the transition.

cb_noise_upscale_ratio - noise upscale in masked diffusion callback

cb_add_noise_to_latent - add noise to latent in masked diffusion callback

cb_use_start_code - fix noise per frame in masked diffusion callback

cb_fixed_code - fix noise across the whole animation in masked diffusion callback (overcooks fast af)

cb_norm_latent - normalize latent stats
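
For intuition, here's the step arithmetic and the callback behavior in a minimal sketch (a simplification; `masked_diffusion_callback` and its signature are illustrative, not the notebook's actual internals):

```python
import torch

steps = 50
strength = 0.5            # fraction of steps actually run
mask_callback = 0.7       # fraction of those steps that are mask-restricted

actual_steps = int(steps * strength)              # 25
masked_steps = int(actual_steps * mask_callback)  # 17: masked area only
blend_steps = actual_steps - masked_steps         # 8: whole image

def masked_diffusion_callback(x, init_latent_noised, mask, step,
                              cb_add_noise_to_latent=True, noise_scale=1.0):
    """During the masked phase, keep consistent areas pinned to the
    (optionally re-noised) init latent and let only the masked
    (inconsistent) areas keep diffusing."""
    if step < masked_steps:
        keep = init_latent_noised
        if cb_add_noise_to_latent:
            keep = keep + noise_scale * torch.randn_like(keep)
        # mask == 1 where the area is inconsistent and should be rediffused
        x = keep * (1 - mask) + x * mask
    return x
```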

Masked guidance

Stable-settings -> Non-GUI -> masked_guidance

Use a mask for init/latent guidance to ignore inconsistencies and guide only based on the consistent areas

guidance_use_start_code - fix noise per frame for guidance (default - True)
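
A minimal sketch of the idea, assuming an MSE-style guidance loss (the function name and signature here are illustrative, not the notebook's actual code):

```python
import torch
import torch.nn.functional as F

def masked_guidance_loss(pred, target, consistency_mask, invert_mask=False):
    """Compute init/latent guidance loss only over consistent areas,
    so inconsistent (e.g. occluded) regions don't pull the generation
    toward stale content."""
    mask = consistency_mask
    if invert_mask:
        mask = 1 - mask
    return F.mse_loss(pred * mask, target * mask)
```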


ControlNet

define SD + K functions, load model -> model_version -> control_sd15_*

You can select one of the ControlNet models. Be aware that warp settings differ vastly across these models.
For example, depth/normal map models work best with style strength 0.9-1.0; lower values tend to just return the input image. The canny model, on the other hand, is closer to v1.5 warp settings (style strength 0.5; 0.2 is okay with it).

download_control_model:
check to download the checkpoint file for the selected ControlNet model together with its conditioning counterpart (like the MiDaS depth estimator for the normal map model)

force_download:
check to force overwrite existing model file

Extra settings for ControlNet were added to the stable-settings cell.

detect_resolution: size of the image being fed into the ControlNet companion models like MiDaS. Keep it the same size as your output image, or as close to it as you can get.

low_threshold and high_threshold: canny filter parameters

You can specify the source for ControlNet guidance just like you do for the depth model: the depth_source parameter in GUI -> diffuse tab.

You can also use a video as the source for ControlNet conditioning.

Just put it here: cond_video_path -> Video Input Settings and select GUI -> diffusion -> depth_source -> cond_video

Models tested: all.
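
For a rough picture of what detect_resolution, low_threshold and high_threshold control, here is a hedged OpenCV sketch (not the notebook's exact preprocessing code):

```python
import cv2
import numpy as np

def canny_condition(image_bgr, detect_resolution=768,
                    low_threshold=100, high_threshold=200):
    # Resize so the short side matches detect_resolution,
    # ideally close to your output resolution.
    h, w = image_bgr.shape[:2]
    scale = detect_resolution / min(h, w)
    resized = cv2.resize(image_bgr, (int(w * scale), int(h * scale)))
    gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, low_threshold, high_threshold)
    return np.stack([edges] * 3, axis=-1)  # 3-channel map for ControlNet
```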

Separate vae support

define SD + K functions, load model -> vae_ckpt

Use this to load a standalone variational autoencoder (VAE) file.
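
Loading a standalone VAE usually boils down to swapping the first-stage weights of the already-loaded model. A rough sketch, assuming LDM-style attribute names (the notebook's actual code may differ):

```python
import torch

def load_vae(sd_model, vae_ckpt_path):
    """Swap the autoencoder weights of a loaded SD model with those
    from a standalone VAE checkpoint."""
    vae_sd = torch.load(vae_ckpt_path, map_location='cpu')
    vae_sd = vae_sd.get('state_dict', vae_sd)
    # In LDM-style models the VAE lives under first_stage_model.*
    vae_sd = {k.replace('first_stage_model.', ''): v for k, v in vae_sd.items()}
    sd_model.first_stage_model.load_state_dict(vae_sd, strict=False)
    return sd_model
```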

Compare settings by number

Propagated the load-settings fix to the compare settings cell. You can now specify just run numbers as well.

Additions to latent conditioning

This one is for people familiar with the diffusion process.

Added some options for init scale (see the sketch below):
- Add noise corresponding to the current diffusion timestep to the "ground truth" image that we compare our generated result against
- Add noise scaling options so that the noise won't add, well, visible noise to the image :D
- Add criterion selection for the init scale loss
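
A sketch of the noised init-scale loss described above, assuming standard DDPM forward noising (function name and signature are illustrative):

```python
import torch
import torch.nn.functional as F

def init_scale_loss(x0_pred, init_latent, alphas_cumprod, t,
                    noise_scale=1.0, criterion=F.mse_loss, fixed_noise=None):
    """Noise the 'ground truth' init latent to the current timestep
    before comparing, so early steps aren't penalized for failing to
    match a clean image."""
    noise = fixed_noise if fixed_noise is not None else torch.randn_like(init_latent)
    a_t = alphas_cumprod[t]
    noised_init = a_t.sqrt() * init_latent + (1 - a_t).sqrt() * noise * noise_scale
    return criterion(x0_pred, noised_init)
```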


Instruct Pix2Pix

Load up a stable -> define SD + K functions, load model -> model_version -> v1_instructpix2pix

gui -> diffusion -> image_scale_schedule
gui -> diffusion -> depth_source

More about it here: https://github.com/timothybrooks/instruct-pix2pix

Firstly, you'll need to download the checkpoint here: http://instruct-pix2pix.eecs.berkeley.edu/instruct-pix2pix-00-22000.ckpt

The settings are different from the usual models. cfg_scale and image_scale are highly interconnected. cfg_scale = 7.5 and image_scale = 1.5 are the defaults. If you wish to change cfg_scale, don't forget to adjust image_scale accordingly, or you will get too close or too far away from your image. Even 0.1 difference matters.

Image conditioning source is defined by the depth_source argument (sic!); the default is init, which means the model looks at the text and the init video frame and tries to combine both. You can try using the stylized frame as the source instead, but this may overcook really fast.
You can use custom image conditioning video with instruct pix2pix as well.

Prompting works a bit differently, negative prompts included. Try using instructions, like "turn her head into a pumpkin", instead of the usual mix of keywords.
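
For orientation, cfg_scale and image_scale combine in the denoising step roughly as in the instruct-pix2pix paper's classifier-free guidance formula (a sketch, not the notebook's code):

```python
def pix2pix_cfg(eps_uncond, eps_img, eps_full, cfg_scale=7.5, image_scale=1.5):
    # eps_uncond: prediction with no text and no image conditioning
    # eps_img:    prediction with the image conditioning only
    # eps_full:   prediction with both text and image conditioning
    # cfg_scale pulls toward the instruction, image_scale toward the image,
    # which is why changing one usually requires adjusting the other.
    return (eps_uncond
            + image_scale * (eps_img - eps_uncond)
            + cfg_scale * (eps_full - eps_img))
```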

Frame range

Diffuse! -> frame_range

Allows you to render only a selected range of frames. For example, if you have extracted 100 frames, then with frame_range = [25,75] you will only render 50 frames, starting with frame 25.
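
In other words, assuming the end of the range is exclusive (as the 50-frame example above suggests), it behaves like a Python slice over the extracted frames:

```python
frames = list(range(100))      # 100 extracted frames: 0..99
frame_range = [25, 75]
to_render = frames[frame_range[0]:frame_range[1]]
print(len(to_render), to_render[0])  # 50 frames, starting with frame 25
```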

Load settings by run number

You can now load default settings, or load settings via the GUI, by the number of the run, if it's in the current batch folder. So if your batch name is stable_warpfusion_0.6.0, you can set default_settings_path to 50 and it will load the settings from run #50 in the stable_warpfusion_0.6.0 batch folder. You can also set it to -1 to load settings from the latest run.
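
Resolving a run number to a settings file presumably works along these lines (a sketch; the `(run)` filename pattern is an assumption based on how settings files are typically saved, not the notebook's actual code):

```python
import glob
import os

def find_settings(batch_folder, run):
    # Assumes settings files are saved as '...({run})_settings.txt'.
    files = sorted(glob.glob(os.path.join(batch_folder, '*settings.txt')))
    if run == -1:
        return files[-1]  # latest run
    return next(f for f in files if f'({run})' in os.path.basename(f))
```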

Model cpu-gpu offload

Automatically offloads the diffusion and text encoder models to CPU RAM before decoding the image from latent space. Should save a bit of VRAM at the cost of some speed. Your feedback is appreciated here.
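
Conceptually the offload is just a device shuffle around the decode call (a sketch with LDM-style attribute names; the actual notebook code may differ):

```python
import torch

def decode_with_offload(sd_model, latent):
    """Move the UNet and text encoder to CPU RAM before decoding,
    so the VAE decode has more VRAM headroom, then move them back."""
    sd_model.model.to('cpu')             # diffusion UNet
    sd_model.cond_stage_model.to('cpu')  # text encoder
    torch.cuda.empty_cache()
    image = sd_model.decode_first_stage(latent)
    sd_model.model.to('cuda')
    sd_model.cond_stage_model.to('cuda')
    return image
```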


Notebook download: https://www.patreon.com/posts/81167091

A reminder:

Changes in GUI will not be saved into the notebook, but if you run it with new settings, they will be saved to a settings.txt file as usual.

You can load settings in misc tab.

You do not need to rerun the GUI cell after changing its settings.

Local install guide:
https://discord.com/channels/973802253204996116/1067887206125015153/1067888756910215278
https://github.com/Sxela/WarpFusion/blob/main/README.md

Youtube playlist with settings:
https://www.youtube.com/watch?v=wvvcWm4Snmc&list=PL2cEnissQhlCUgjnGrdvYMwUaDkGemLGq

For tech support and other questions please join our discord server:
https://discord.gg/sayE6j2sdP

Discord is the preferred method, because it is nearly impossible to provide any decent help or tech support via Patreon due to its limited text formatting and inability to add screenshots or videos to comments or DMs.
Error reports in comments will be deleted and reposted in discord.