Settings and Configs - Token Merging (Patreon)

Published: 2023-04-30 22:18:51

Imported: 2023-05

Token Merging (ToMe) is a setting that merges redundant tokens during generation, giving a large speedup at very little quality loss. These are the settings that I am using for this feature.

Here is a breakdown of what the settings do:
```
- ratio: The ratio of tokens to merge. E.g., 0.4 would reduce the total number of tokens by 40%. The maximum value for this is 1-(1/(sx*sy)), so with the default stride of (2, 2) the max is 0.75 (I recommend <= 0.5 though). Higher values result in more speed-up, but with more visual quality loss.

- max_downsample [1, 2, 4, or 8]: Apply ToMe to layers with at most this amount of downsampling. E.g., 1 only applies to layers with no downsampling (4 of 15 layers) while 8 applies to all layers (15 of 15). I recommend a value of 1 or 2.

- sx, sy: The stride for computing dst sets (see paper). A higher stride means you can merge more tokens, but the default of (2, 2) works well in most cases. Doesn't have to divide image size.

- use_rand: Whether or not to allow random perturbations when computing dst sets (see paper). Usually you'd want to leave this on, but if you're having weird artifacts try turning this off.

- merge_attn: Whether or not to merge tokens for attention (recommended).

- merge_crossattn: Whether or not to merge tokens for cross attention (not recommended).

- merge_mlp: Whether or not to merge tokens for the MLP layers (strongly not recommended).
```
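
To make the `ratio` limit above concrete, here is a minimal Python sketch that checks a set of values against the rules in the breakdown. The `ToMeSettings` class and its method names are hypothetical and are not part of any library or UI. The defaults follow the recommendations in the list, except `ratio=0.5`, which is just the recommended upper bound used as a starting point, and the rounding in `tokens_after_merge` is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ToMeSettings:
    """Hypothetical container mirroring the settings in the breakdown above."""
    ratio: float = 0.5            # assumption: the recommended upper bound
    max_downsample: int = 1       # 1 or 2 is recommended
    sx: int = 2                   # default stride (2, 2)
    sy: int = 2
    use_rand: bool = True         # usually left on
    merge_attn: bool = True       # recommended
    merge_crossattn: bool = False # not recommended
    merge_mlp: bool = False       # not recommended

    def max_ratio(self) -> float:
        # Upper bound from the breakdown: 1 - 1/(sx*sy)
        return 1.0 - 1.0 / (self.sx * self.sy)

    def validate(self) -> None:
        if self.max_downsample not in (1, 2, 4, 8):
            raise ValueError("max_downsample must be 1, 2, 4, or 8")
        if self.sx < 1 or self.sy < 1:
            raise ValueError("sx and sy must be positive integers")
        if not 0.0 <= self.ratio <= self.max_ratio():
            raise ValueError(
                f"ratio must be between 0 and {self.max_ratio()} "
                f"for stride ({self.sx}, {self.sy})"
            )

    def tokens_after_merge(self, num_tokens: int) -> int:
        # A ratio of 0.4 removes about 40% of the tokens.
        # Rounding down the merged count is an assumption of this sketch.
        return num_tokens - int(num_tokens * self.ratio)


settings = ToMeSettings(ratio=0.4)
settings.validate()
print(settings.max_ratio())              # 0.75 for the default stride (2, 2)
print(settings.tokens_after_merge(4096)) # 2458 tokens left out of 4096
```

With a stride of (4, 4) the same formula gives a maximum of 0.9375, which is why a higher stride lets you merge more tokens.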