Stable Diffusion within Open WebUI (20240730)

This post details the build as a container of Automatic1111 and its integration as an image generator option for the “Ollama with Open WebUI” installation.

Jul 13, 2024
👉 Linux host setup instructions for integrating Automatic1111 Stable Diffusion with Open WebUI (in an installation using Dockge, a self-hosted Docker Compose stacks manager, already running Ollama)
 
Revision: 20240730-0 (init: 20240713)

Stable Diffusion is a text-to-image generative model that leverages deep learning techniques to generate highly detailed images based on textual descriptions. The model operates as a latent diffusion model, which involves iteratively adding and then removing noise from an image to generate a final output that matches the given text prompt. This approach allows Stable Diffusion to produce high-quality, photorealistic images and handle complex prompts effectively.
Automatic1111 (often called A1111) is a WebUI for Stable Diffusion with a user-friendly interface allowing users to interact without writing complex command-line instructions. Its intuitive interface enables exploring Stable Diffusion without requiring deep technical expertise.

Stable Diffusion: Automatic1111

To run Automatic1111 using Docker, we will use one of the supported container setups listed by the project at https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Containers
After reviewing the recent commit history of those projects, we chose https://github.com/AbdBarho/stable-diffusion-webui-docker

Building the container

Since no official release was found on Docker Hub, we will follow the steps provided by the project to build our own image.
Because we want Dockge to recognize the stack, we first need to create it before populating it. From Dockge’s WebUI, we create a new automatic1111 stack and Save it for the time being. We now have an /opt/stacks/automatic1111 directory for our use.
From a terminal:
# /opt might not be readable by our user
# let's grant ourselves access to it
sudo chown `id -u`:`id -g` /opt/stacks/automatic1111
cd /opt/stacks/automatic1111
# Obtain the master branch from the repo
git clone https://github.com/AbdBarho/stable-diffusion-webui-docker.git
cd stable-diffusion-webui-docker
# Download all the dependencies for the tool
# (be aware that re-running this command will perform the downloads again)
docker compose --profile download up --build
# then build the Automatic1111 UI with GPU support (but do not run the container)
docker compose --profile auto build
During the download stage, a webui-docker-download container populates the data and services directories with the sub-components needed for the actual build. Of note, the size of the uncompressed downloads was over 10 GB.
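As a quick sanity check (the exact figure will vary with the upstream repository state), the downloaded content can be measured from within the stable-diffusion-webui-docker directory:
# Measure the downloaded sub-components (expect 10 GB or more in total)
du -sh data services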
The build stage will, in turn, use the downloaded content to build a webui-docker-auto container image. If you are curious about what is happening behind the scenes, you can examine the services/AUTOMATIC1111/Dockerfile file. The finished image in this build was also about 10 GB. The built image is named following the image: entry of the auto section of the docker-compose.yml file present in the stable-diffusion-webui-docker folder. The final image was named sd-auto:78 which, according to the commit history, corresponds to Automatic1111 version 1.9.4.
For use within our stack, we will rename it as automatic1111:built. We rely on the fact that it is the most recently created image to tag it:
docker tag `docker images --format '{{.ID}} {{.CreatedAt}}' | sort -rk 2 | awk 'NR==1{print $1}'` automatic1111:built
We recommend verifying that the latest image ID matches the expected sd-auto one by running docker image ls.
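For example, the following should show the sd-auto image and the new automatic1111:built tag sharing the same IMAGE ID:
# Both tags should point at the same IMAGE ID
docker image ls | grep -E 'sd-auto|automatic1111'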

Adding to Dockge

We will now populate the automatic1111 stack’s compose.yaml file to use our NVIDIA GPU, based on the built container’s docker-compose.yml (understanding that file locations are relative to /opt/stacks/automatic1111), following Docker’s GPU support documentation, and adding some fields to support API calls for integration with Open WebUI.
The compose.yaml to use is as follows:
services:
  stable-diffusion-webui:
    image: automatic1111:built
    ports:
      - 7860:7860
    volumes:
      - ./stable-diffusion-webui-docker/data:/data
      - ./output:/output
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    stop_signal: SIGKILL
    tty: true
    restart: unless-stopped
    environment:
      - CLI_ARGS=--allow-code --medvram --xformers --enable-insecure-extension-access --api --listen --api-auth user:password
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - compute
                - utility
For use with Open WebUI, we added the --api, --listen, and --api-auth entries; specify the user:password values you prefer.
The container’s /data is mapped to the data downloaded during the build’s download stage, providing persistence across container restarts. Generated content will be stored at the base of our stack, in the output directory.
After selecting Start, and once the tool downloads additional content (which might take a few minutes), the Automatic1111 WebUI will be available at http://127.0.0.1:7860/.
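Before integrating it with Open WebUI, it is worth confirming that the container sees the GPU and that the API answers. A minimal sketch, run from /opt/stacks/automatic1111 and assuming the user:password pair set in CLI_ARGS:
# The NVIDIA runtime injects nvidia-smi into the container;
# your GPU should appear in its output
docker compose exec stable-diffusion-webui nvidia-smi
# The API (enabled by --api and protected by --api-auth) can list
# the models known to Automatic1111
curl -s -u user:password http://127.0.0.1:7860/sdapi/v1/sd-models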
We can now test the available model for txt2img (for this use, do not use the model labelled inpainting) with a prompt such as “dog leaping on a sunny beach”, which returned this image:
[Image: txt2img result for the “dog leaping on a sunny beach” prompt]

Adding models

There are multiple options for obtaining models, such as Hugging Face, which offers a vast repository of models, datasets, and tools. We will use CivitAI, as the site often provides example images together with the prompt and parameters used to generate them with the models. CivitAI hosts many models, often in the safetensors format; such files cannot contain executable code, can be shared and loaded across different programming languages and platforms, and support loading only parts of a model.
With our setup, adding a downloaded model to Automatic1111 requires copying the downloaded file(s) into the /opt/stacks/automatic1111/stable-diffusion-webui-docker/data/models/Stable-diffusion/ directory. Given that the directory is likely not writable by non-root users, adapting a sudo cp MODEL.safetensors /opt/stacks/automatic1111/stable-diffusion-webui-docker/data/models/Stable-diffusion/ command will prove useful.
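As an illustration (the download URL below is a placeholder; CivitAI’s model pages provide the actual link, and an API key may be required for some downloads), fetching and installing a checkpoint from the command line could look like:
# Download a checkpoint (replace the URL with the one from the model page)
wget -O MODEL.safetensors 'https://civitai.com/api/download/models/XXXXXX'
# Copy it to where Automatic1111 looks for checkpoints;
# sudo is needed since the data directory is owned by root
sudo cp MODEL.safetensors /opt/stacks/automatic1111/stable-diffusion-webui-docker/data/models/Stable-diffusion/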
After adding some models, select the “refresh” icon next to the “Stable Diffusion Checkpoint” list at the top of the WebUI, and generate new content. It is useful to investigate examples from the CivitAI model page and test re-using their parameters, such as the positive prompt, negative prompt, guidance, steps, and sampler, knowing that:
  • “Positive prompt” is the text description of what we want to see in the generated image.
  • “Negative prompt” is the text describing elements we do not want to appear in the image.
  • “Sampling Method” (or “Sampler”) defines the algorithm that determines how noise is removed during image generation, affecting the resulting image’s quality and style.
  • “Steps” refers to the number of denoising iterations performed to generate the image; each step progressively refines the image, removing noise and adding detail to better match the prompt. While higher step counts generally lead to more detailed and higher-quality images, there are diminishing returns beyond a certain point.
  • The CFG scale (Classifier-Free Guidance scale, also referred to as “guidance scale”) controls how closely the generated image adheres to the input text prompt (higher values = stricter adherence).
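These parameters can also be exercised through the API we enabled earlier. A minimal sketch of a txt2img call from the host (assuming jq is installed and the user:password pair configured in CLI_ARGS; the prompts and values are illustrative):
# Request an image with an explicit negative prompt, sampler, step count
# and CFG scale, then decode the base64-encoded result into a PNG
curl -s -u user:password http://127.0.0.1:7860/sdapi/v1/txt2img \
  -H 'Content-Type: application/json' \
  -d '{
        "prompt": "dog leaping on a sunny beach",
        "negative_prompt": "blurry, low quality",
        "sampler_name": "Euler a",
        "steps": 20,
        "cfg_scale": 7
      }' | jq -r '.images[0]' | base64 -d > output.png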

Integrating Automatic1111 with Open WebUI

We will follow the instructions provided at https://docs.openwebui.com/tutorial/images/#automatic1111 to integrate image generation into our existing Open WebUI frontend (running via Docker Compose).
When using a reverse proxy in front of the Automatic1111 container, for example at https://a1111.example.com, enabling image generation does not require modifying Open WebUI’s compose.yaml file.
On Open WebUI, as an admin user:
  • Click on your username (bottom left of the WebUI) to access a sub-menu
  • Select the “Admin Panel” option
  • Select the “Settings” tab
  • Select the “Images” menu option
  • In “Image Settings”
    • Select Default (Automatic1111) as the “Image Generation Engine”
    • Turn on “Image Generation (Experimental)”, which will enable new configuration options
  • Set the “AUTOMATIC1111 Base URL” to point at your Automatic1111 instance (here, https://a1111.example.com)
  • Set the user:password values for “AUTOMATIC1111 Api Auth String” to match the --api-auth of your Automatic1111 deployment
  • The “Set Default Model” submenu is enabled by turning on “Image Generation (Experimental)”. There, specify the model to use (from the ones added to the Automatic1111 list), the expected image size, and the number of steps for the generation.
  • Make sure to “Save” those settings.
Now, when creating a new chat, a new “Generate Image” option will be available:
[Image: the “Generate Image” option in a new chat]
As an example, when asking llama3 to “generate only the positive and negative prompts for stable diffusion: space whales, sci-fi theme, majestic”, we are able to use that “Generate Image” button with the generated prompt:
[Image: image generated from the llama3-produced prompt]

Integration without a reverse proxy

We will follow some of the steps described in the “Ollama with Open WebUI” setup post.
If the containers are all running on the same host, we can use content from the “Setup in a separate compose.yaml” section, and make use of host.docker.internal.
Edit the Open WebUI compose.yaml and add an entry (if it is not already there) for:
extra_hosts:
  - host.docker.internal:host-gateway
Depending on your existing setup (each service in its own compose.yaml, for example), the content of the file might end up looking like:
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:cuda
    container_name: open-webui
    volumes:
      - ./open-webui:/app/backend/data
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    ports:
      - 3030:8080
    restart: unless-stopped
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
    extra_hosts:
      - host.docker.internal:host-gateway
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu
    labels:
      - "com.centurylinklabs.watchtower.enable=true"
In this case, open-webui runs in its own container on the same host as Ollama, and host.docker.internal:host-gateway enables the Open WebUI container to reach ports exposed on the host’s network interface (such as Ollama’s 11434 and Automatic1111’s 7860).
With this setup, we can now follow the steps presented in the previous section, but set the “AUTOMATIC1111 Base URL” to http://host.docker.internal:7860/ (instead of https://a1111.example.com).
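To confirm that the Automatic1111 port is reachable through the host gateway, a quick check from inside the Open WebUI container can help (a sketch: the container name comes from the compose file above, and python3 is used since the image is not guaranteed to ship curl):
# Query Automatic1111 through the host gateway from within the
# Open WebUI container; a 200 status means the WebUI is reachable
docker exec open-webui python3 -c "import urllib.request; print(urllib.request.urlopen('http://host.docker.internal:7860').status)"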

A note on GPU memory

By default, Ollama keeps a model loaded in memory for five minutes. If your GPU has a limited amount of memory, the still-loaded LLM can leave too little VRAM to generate an image right after generating a prompt for that image.
It is possible to use the OLLAMA_KEEP_ALIVE environment variable (to be added to the Ollama stack) to reduce this value (300s by default) to one more suitable for your setup. For example:
environment:
  - OLLAMA_KEEP_ALIVE=30
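After restarting the Ollama stack, the effect can be observed with a recent Ollama release (assuming it provides the ps command):
# List models currently loaded in memory; the UNTIL column reflects
# the keep-alive deadline (about 30 seconds with the setting above)
ollama ps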


Revision History

  • 20240730-0: Added example for Ollama Stack’s compose.yml
  • 20240721-0: Added a note on GPU memory and the use of OLLAMA_KEEP_ALIVE
  • 20240716-0: Fixed path for models location
  • 20240714-0: Moved Open WebUI container to cuda version
  • 20240713-0: Initial release