Converting Models

This guide describes the process for converting models from various formats, including Dreambooth checkpoints and LoRA weights, to the directories used by diffusers and on to the ONNX models used by onnx-web.

Contents

  - Conversion steps for each type of model
  - Converting diffusers models
  - Converting SD and Dreambooth checkpoints
  - Converting LoRA weights
  - Converting Textual Inversion embeddings

Conversion steps for each type of model

You can start from a diffusers directory, HuggingFace Hub repository, or an SD checkpoint in the form of a .ckpt or .safetensors file:

  1. LoRA weights from kohya-ss/sd-scripts to...
  2. SD or Dreambooth checkpoint to...
  3. diffusers directory or LoRA weights from cloneofsimo/lora to...
  4. ONNX models

Textual inversions can be converted directly to ONNX by merging them with the base model.

One current disadvantage of using ONNX is that LoRA weights must be merged with the base model before being converted, so the final output is roughly the size of the base model. Hopefully this can be reduced in the future (https://github.com/ssube/onnx-web/issues/213).

If you are using the Auto1111 web UI or another tool, you may not need to convert the models to ONNX. In that case, you will not have an extras.json file and should skip the last step.

Converting diffusers models

This is the simplest case and is supported by the conversion script in onnx-web with no additional steps. You can also use the script from the diffusers library.

Add an entry to your extras.json file for each model, using the name of the HuggingFace hub repository or a local path:

    {
      "name": "diffusion-knollingcase",
      "source": "Aybeeceedee/knollingcase"
    },
    {
      "name": "diffusion-openjourney",
      "source": "prompthero/openjourney"
    },

To convert the diffusers model using the diffusers script:

> python3 convert_stable_diffusion_checkpoint_to_onnx.py \
  --model_path="runwayml/stable-diffusion-v1-5" \
  --output_path="~/onnx-web/models/stable-diffusion-onnx-v1-5"
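
To verify the converted model before using it with onnx-web, you can load it with the ONNX pipeline from diffusers. This is a minimal sketch, assuming the --output_path used above; the prompt and step count are only examples:

import os

from diffusers import OnnxStableDiffusionPipeline

# Load the converted model; the path matches --output_path above
model_path = os.path.expanduser("~/onnx-web/models/stable-diffusion-onnx-v1-5")
pipeline = OnnxStableDiffusionPipeline.from_pretrained(model_path, provider="CPUExecutionProvider")

# Render a quick test image
image = pipeline("an astronaut riding a horse", num_inference_steps=20).images[0]
image.save("test.png")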

Converting SD and Dreambooth checkpoints

This works for most of the original SD checkpoints and many Dreambooth models, like those found on Civitai, and is supported by the conversion script in onnx-web with no additional steps. You can also use the script from the diffusers library.

Add an entry to your extras.json file for each model:

    {
      "name": "diffusion-stablydiffused-aesthetic-v2-6",
      "source": "civitai://6266?type=Pruned%20Model&format=SafeTensor",
      "format": "safetensors"
    },
    {
      "name": "diffusion-unstable-ink-dream-v6",
      "source": "civitai://5796",
      "format": "safetensors"
    },

For the source, you can use the name of the HuggingFace hub repository, the model's download ID from Civitai (which may not match the display ID), or an HTTPS URL. Make sure to set the format to match the model that you downloaded, usually safetensors. You do not need to download the file ahead of time, but if you have, you can also use a local path.

To convert an SD checkpoint using the diffusers script:

> python3 convert_original_stable_diffusion_to_diffusers.py \
    --checkpoint_path="~/onnx-web/models/.cache/sd-v1-4.ckpt" \
    --dump_path="~/onnx-web/models/stable-diffusion-diffusers-v1-4"
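
Before converting the dumped directory on to ONNX, you can check that it loads as a normal diffusers pipeline. A minimal sketch, assuming the --dump_path used above:

import os

from diffusers import StableDiffusionPipeline

# Load the dumped diffusers directory; the path matches --dump_path above
model_path = os.path.expanduser("~/onnx-web/models/stable-diffusion-diffusers-v1-4")
pipeline = StableDiffusionPipeline.from_pretrained(model_path)

# Print a config value to confirm the model loaded
print(pipeline.unet.config.sample_size)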

Converting LoRA weights

You can merge one or more sets of LoRA weights into their base models, and then use your extras.json file to convert them into usable ONNX models.

LoRA weights produced by the cloneofsimo/lora repository can be converted to a diffusers directory and from there on to ONNX. LoRA weights produced by the kohya-ss/sd-scripts repository must first be merged into an SD checkpoint, which can then be converted into a diffusers directory and finally into ONNX models.

Figuring out which script produced the LoRA weights

Weights exported by the two repositories are not compatible with each other, and you must use the same scripts that originally created a set of weights to merge them.

If you have a .safetensors file, check the metadata keys:

>>> import safetensors
>>> t = safetensors.safe_open("/home/ssube/lora-weights/jack.safetensors", framework="pt")
>>> print(t.metadata())
{'ss_batch_size_per_device': '1', 'ss_bucket_info': 'null', 'ss_cache_latents': 'True', 'ss_clip_skip': '2', ...}

If they start with lora_, it's probably from the cloneofsimo/lora scripts. If they start with ss_, it's probably from the kohya-ss/sd-scripts scripts.
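
To check a file programmatically, here is a small sketch using the same safetensors API; the path is a placeholder, and guess_lora_scripts is a hypothetical helper, not part of either repository:

import safetensors

def guess_lora_scripts(path):
    # Guess which training scripts produced a LoRA file by checking
    # the metadata key prefixes described above
    with safetensors.safe_open(path, framework="pt") as tensors:
        metadata = tensors.metadata() or {}
    if any(key.startswith("ss_") for key in metadata):
        return "kohya-ss/sd-scripts"
    if any(key.startswith("lora_") for key in metadata):
        return "cloneofsimo/lora"
    return "unknown"

print(guess_lora_scripts("/home/ssube/lora-weights/jack.safetensors"))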

If you get an error about missing metadata, try the other repository. For example:

  warnings.warn(
Traceback (most recent call last):
  File "/home/ssube/lora/venv/bin/lora_add", line 33, in <module>
    sys.exit(load_entry_point('lora-diffusion', 'console_scripts', 'lora_add')())
  File "/home/ssube/lora/lora_diffusion/cli_lora_add.py", line 201, in main
    fire.Fire(add)
  File "/home/ssube/lora/venv/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/ssube/lora/venv/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/ssube/lora/venv/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/ssube/lora/lora_diffusion/cli_lora_add.py", line 133, in add
    patch_pipe(loaded_pipeline, path_2)
  File "/home/ssube/lora/lora_diffusion/lora.py", line 1012, in patch_pipe
    monkeypatch_or_replace_safeloras(pipe, safeloras)
  File "/home/ssube/lora/lora_diffusion/lora.py", line 800, in monkeypatch_or_replace_safeloras
    loras = parse_safeloras(safeloras)
  File "/home/ssube/lora/lora_diffusion/lora.py", line 565, in parse_safeloras
    raise ValueError(
ValueError: Tensor lora_te_text_model_encoder_layers_0_mlp_fc1.alpha has no metadata - is this a Lora safetensor?

See https://github.com/cloneofsimo/lora/issues/191 for more information.

LoRA weights from cloneofsimo/lora

Download the lora repo and create a virtual environment for it:

> git clone https://github.com/cloneofsimo/lora.git
> cd lora
> python3 -m venv venv
> source venv/bin/activate
> pip3 install -r requirements.txt
> pip3 install accelerate

Download the base model and LoRA weights that you want to merge first, or provide the names of HuggingFace hub repos when you run the lora_add command:

> python3 -m lora_diffusion.cli_lora_add \
    runwayml/stable-diffusion-v1-5 \
    sayakpaul/sd-model-finetuned-lora-t4 \
    ~/onnx-web/models/.cache/diffusion-sd-v1-5-pokemon \
    0.8 \
    --mode upl

The output is a diffusers directory (step 3) and can be converted to ONNX by adding an entry to your extras.json file that matches the output path:

    {
      "name": "diffusion-sd-v1-5-pokemon",
      "source": ".cache/diffusion-sd-v1-5-pokemon"
    },
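
If you want to preview a set of weights before committing to a merge, the repository also provides a Python API for patching a pipeline in memory. A sketch based on the patch_pipe and tune_lora_scale helpers from the cloneofsimo/lora README, assuming a CUDA device and a placeholder weights path:

import torch

from diffusers import StableDiffusionPipeline
from lora_diffusion import patch_pipe, tune_lora_scale

# Load the base model and patch the LoRA weights into it in memory
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
patch_pipe(pipe, "/home/ssube/lora-weights/jack.safetensors", patch_text=True, patch_ti=True, patch_unet=True)

# Scale the LoRA contribution, similar to the 0.8 ratio used above
tune_lora_scale(pipe.unet, 0.8)
tune_lora_scale(pipe.text_encoder, 0.8)

image = pipe("a test prompt", num_inference_steps=20).images[0]
image.save("lora-test.png")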

LoRA weights from kohya-ss/sd-scripts

Download the sd-scripts repo and create a virtual environment for it:

> git clone https://github.com/kohya-ss/sd-scripts.git
> cd sd-scripts
> python3 -m venv venv
> source venv/bin/activate
> pip3 install -r requirements.txt
> pip3 install torch torchvision

Download the base model and LoRA weights that you want to merge, then run the merge_lora.py script:

> python networks/merge_lora.py \
    --sd_model ~/onnx-web/models/.cache/v1-5-pruned-emaonly.safetensors \
    --save_to ~/onnx-web/models/.cache/v1-5-elldreths-vivid-mix.safetensors \
    --models ~/lora-weights/elldreths-vivid-mix.safetensors \
    --ratios 1.0

The output is an SD checkpoint (step 2) and can be converted to ONNX by adding an entry to your extras.json file that matches the --save_to path:

    {
      "name": "diffusion-lora-elldreths-vivid-mix",
      "source": "../models/.cache/v1-5-elldreths-vivid-mix.safetensors",
      "format": "safetensors"
    },

Make sure to set the format key to match the format you used to export the merged model, usually safetensors.
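
You can also sanity-check the merged checkpoint before conversion: after a successful merge, the file should contain full model tensors rather than separate lora_ tensors. A small sketch, assuming the --save_to path used above:

import os

import safetensors

# Open the merged checkpoint; the path matches --save_to above
path = os.path.expanduser("~/onnx-web/models/.cache/v1-5-elldreths-vivid-mix.safetensors")
with safetensors.safe_open(path, framework="pt") as tensors:
    keys = list(tensors.keys())

print(len(keys), "tensors")
print("separate lora tensors:", any(key.startswith("lora_") for key in keys))  # expect False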

Converting Textual Inversion embeddings

You can convert Textual Inversion embeddings by merging their weights and tokens into a copy of their base model, which is directly supported by the conversion script in onnx-web with no additional steps.

Textual Inversions may have more than one set of weights, which can be used and controlled separately. Some Textual Inversions provide their own token, but you can set a custom token for any of them.

Figuring out what token a Textual Inversion uses

The base token, without any layer numbers, should be printed to the logs with the string found embedding for token:

[2023-03-08 04:54:00,234] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: found embedding for token <concept>: torch.Size([768])
[2023-03-08 04:54:01,624] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: added 1 tokens
[2023-03-08 04:54:01,814] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: saving tokenizer for textual inversion
[2023-03-08 04:54:01,859] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: saving text encoder for textual inversion

If you have set a custom token, that will be shown instead. If more than one token has been added, they will be numbered following the pattern base-N, starting with 0.
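
If the conversion logs are no longer available, you can also list the added tokens by loading the converted model's tokenizer directly. A sketch, assuming the tokenizer was saved to a tokenizer directory inside the converted model; the path is a placeholder:

from transformers import CLIPTokenizer

# Load the tokenizer that was saved during conversion
tokenizer = CLIPTokenizer.from_pretrained("../models/diffusion-goblin/tokenizer")

# Tokens added for the Textual Inversion appear in the added vocab
print(tokenizer.get_added_vocab())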

Figuring out how many layers a Textual Inversion uses

Textual Inversions produced by the Stable Conceptualizer notebook only have a single layer, while many others have more than one.

The number of layers is shown in the server logs when the model is converted:

[2023-03-08 04:54:00,234] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: found embedding for token <concept>: torch.Size([768])
[2023-03-08 04:54:01,624] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: added 1 tokens
[2023-03-08 04:54:01,814] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: saving tokenizer for textual inversion
[2023-03-08 04:54:01,859] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: saving text encoder for textual inversion
...
[2023-03-08 04:58:06,378] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: generating 74 layer tokens
[2023-03-08 04:58:06,379] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: found embedding for token ['goblin-0', 'goblin-1', 'goblin-2', 'goblin-3', 'goblin-4', 'goblin-5', 'goblin-6', 'goblin-7', 'goblin-8', 'goblin-9', 'goblin-10', 'goblin-11', 'goblin-12', 'goblin-13', 'goblin-14', 'goblin-15', 'goblin-16', 'goblin-17', 'goblin-18', 'goblin-19', 'goblin-20', 'goblin-21', 'goblin-22', 'goblin-23', 'goblin-24', 'goblin-25', 'goblin-26', 'goblin-27', 'goblin-28', 'goblin-29', 'goblin-30', 'goblin-31', 'goblin-32', 'goblin-33', 'goblin-34', 'goblin-35', 'goblin-36', 'goblin-37', 'goblin-38', 'goblin-39', 'goblin-40', 'goblin-41', 'goblin-42', 'goblin-43', 'goblin-44', 'goblin-45', 'goblin-46', 'goblin-47', 'goblin-48', 'goblin-49', 'goblin-50', 'goblin-51', 'goblin-52', 'goblin-53', 'goblin-54', 'goblin-55', 'goblin-56', 'goblin-57', 'goblin-58', 'goblin-59', 'goblin-60', 'goblin-61', 'goblin-62', 'goblin-63', 'goblin-64', 'goblin-65', 'goblin-66', 'goblin-67', 'goblin-68', 'goblin-69', 'goblin-70', 'goblin-71', 'goblin-72', 'goblin-73'] (*): torch.Size([74, 768])
[2023-03-08 04:58:07,685] INFO: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: added 74 tokens
[2023-03-08 04:58:07,874] DEBUG: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: embedding torch.Size([768]) vector for layer goblin-0
[2023-03-08 04:58:07,874] DEBUG: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: embedding torch.Size([768]) vector for layer goblin-1
[2023-03-08 04:58:07,874] DEBUG: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: embedding torch.Size([768]) vector for layer goblin-2
[2023-03-08 04:58:07,875] DEBUG: MainProcess MainThread onnx_web.convert.diffusion.textual_inversion: embedding torch.Size([768]) vector for layer goblin-3

Figuring out the number of layers after the model has been converted currently requires the original tensor file (https://github.com/ssube/onnx-web/issues/212).
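
Until then, you can count the layers from the original tensor file with torch. A sketch, assuming either the Auto1111-style format (a dict with a string_to_param key) or the concept-style format (a dict mapping tokens directly to tensors); the file name is a placeholder:

import torch

# Load the original Textual Inversion file
embeds = torch.load("goblin.pt", map_location="cpu")

# Auto1111-style files nest the weights under string_to_param;
# concept-style files map tokens directly to tensors
params = embeds.get("string_to_param", embeds)

for token, weights in params.items():
    # a [768] tensor is a single layer; [N, 768] is N layers
    layers = 1 if weights.dim() == 1 else weights.shape[0]
    print(token, layers)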