Stable Diffusion Web UI on Linux

In this tutorial, I will share my experience installing the web-based user interface for Stable Diffusion (SD) on Linux. Developed by AUTOMATIC1111, this web UI makes it easy for anyone to generate AI images with the Stable Diffusion model without any programming experience.

Stable Diffusion is a deep learning, text-to-image model released in 2022. It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and generating image-to-image translations guided by a text prompt [1].

While Stable Diffusion can technically run on a CPU alone, a suitable GPU significantly speeds up image generation. To use the Stable Diffusion web UI with a GPU, you'll need an AMD or NVIDIA card with at least 4 GB of VRAM.

A more powerful GPU will give you even better performance, but if you don't have a compatible graphics card, don't worry: SD will automatically fall back to "CPU Mode," which is slower but still functional. We will also provide instructions for running the SD web UI on a CPU, so you'll have options regardless of your hardware.
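The GPU-or-CPU decision described above can be sketched as a small launcher script. This is only a sketch: it assumes nvidia-smi is present exactly when an NVIDIA driver is installed, and an AMD setup would need a different check.

```shell
# Pick webui.sh arguments based on whether an NVIDIA driver is available.
# Assumption: nvidia-smi exists only on systems with the NVIDIA driver installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    MODE="gpu"
    WEBUI_ARGS="--xformers"
else
    MODE="cpu"
    WEBUI_ARGS="--skip-torch-cuda-test --precision full --no-half"
fi
echo "Would launch: ./webui.sh $WEBUI_ARGS ($MODE mode)"
```

Both argument sets are the ones used later in this tutorial, so the same script works on either of the two machines described below.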

1. Hardware

1.1. Laptop Legion 5 Pro 16IAH7H (for generating AI images with NVIDIA GPU)

  • CPU: 12th Gen Intel(R) Core(TM) i9-12900H
  • GPU: NVIDIA GeForce RTX 3070 Ti (150 W), 8 GB VRAM
  • RAM: 32 GB

1.2. Desktop Dell OPTIPLEX 7920 (for generating AI images with CPU)

  • CPU: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz (4 cores)
  • GPU: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller
  • RAM: 20 GB

2. Stable Diffusion Web User Interface from AUTOMATIC1111 on Linux

Install dependencies:

$ sudo apt install wget git python3 python3-venv

Clone the repository:

$ git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui

$ cd stable-diffusion-webui/

2.1 Generating 512x512 AI Images with GPU

Run webui.sh to generate AI images with the GPU:

$ ./webui.sh --xformers

The script installs PyTorch and torchvision. PyTorch is an open-source machine learning library used for building deep neural networks; it succeeded the original Torch library, which was written in the Lua scripting language.

The script also downloads the Stable Diffusion 1.5 model file v1-5-pruned-emaonly.safetensors during installation and stores it in the directory ./stable-diffusion-webui/models/Stable-diffusion/.

Note: The xFormers library provides an optional way to accelerate image generation. This enhancement is exclusively available for NVIDIA GPUs, optimizing image generation and reducing VRAM usage. As of January 23, 2023, neither Windows nor Linux users need to build xFormers manually [2].

Once the installation is complete, use a web browser to access http://127.0.0.1:7860. Enter a text prompt describing the image you want and click Generate.

It takes less than 3 seconds to generate a 512x512 image with the NVIDIA GeForce RTX 3070 Ti 8GB using the SD 1.5 model (Figure 1).

Figure 1 - 512x512 "Realistic photo of relaxed young short hair woman with Artificial intelligence chip implemented in the head" Generated with Stable Diffusion 1.5 Model Using GPU

Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 2551780738, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly

Time taken: 2.31s
Torch active/reserved: 2615/2770 MiB, Sys VRAM: 4681/7958 MiB (58.82%)

Note: Run the script ./webui.sh every time you need to generate images.

2.2 Generating 512x512 AI Images with CPU

In case your GPU does not meet the requirements, use the command below to generate images using the CPU:

$ ./webui.sh --skip-torch-cuda-test --precision full --no-half

It takes about 3 minutes and 10 seconds to generate a 512x512 image on the Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz (4 cores) using the SD 1.5 model (Figure 2).

Figure 2 - 512x512 Image "Horrific future of mankind dependent on Artificial intelligence" Generated with Stable Diffusion 1.5 Model Using CPU

Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 2044508491, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Version: v1.4.1

Time taken: 3m 9.61s

3. SD Model 2.1 Download

Download the Stable Diffusion 2.1 model into the directory ./models/Stable-diffusion/:

$ wget https://huggingface.co/stabilityai/stable-diffusion-2-1/resolve/main/v2-1_768-ema-pruned.ckpt -P models/Stable-diffusion/

Download the config file v2-inference-v.yaml into the configs/ directory:

$ wget https://raw.githubusercontent.com/Stability-AI/stablediffusion/main/configs/stable-diffusion/v2-inference-v.yaml -O configs/v2-inference-v.yaml

Use a web browser to access http://127.0.0.1:7860. Select v2-1_768-ema-pruned.ckpt and set the image resolution to 768x768 (Figure 3).

Figure 3 - Selecting Stable Diffusion with 2.1 Model in Web UI

3.1 Generating 768x768 AI Images with GPU

$ ./webui.sh --xformers

It takes less than 6 seconds to generate a 768x768 image with the NVIDIA GeForce RTX 3070 Ti 8GB using the SD 2.1 model (Figure 4).

Figure 4 - 768x768 Image "Realistic photo of relaxed young short hair woman with Artificial intelligence chip implemented in the head" Generated with Stable Diffusion 2.1 Model Using GPU

Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 3252341151, Size: 768x768, Model hash: ad2a33c361, Model: v2-1_768-ema-pruned

Time taken: 5.46s
Torch active/reserved: 3732/5186 MiB, Sys VRAM: 7246/7958 MiB (91.05%)

3.2 Generating 768x768 AI Images with CPU

$ ./webui.sh --skip-torch-cuda-test --precision full --no-half

It takes almost 11 minutes to generate a 768x768 image with the SD 2.1 model (Figure 5).

Figure 5 - 768x768 Image "Horrific future of mankind dependent on Artificial intelligence" Generated with Stable Diffusion 2.1 Model Using CPU

Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 23258687, Size: 768x768, Model hash: ad2a33c361, Model: v2-1_768-ema-pruned, Version: v1.4.1

Time taken: 10m 32.23s

4. Issues

4.1 torch.cuda.OutOfMemoryError: CUDA out of memory

I encountered a torch.cuda.OutOfMemoryError when trying to generate higher-resolution images: the 1.5 model is trained for 512x512 resolution and the 2.1 model for 768x768. I tried the --xformers argument with the webui.sh script, but the SD web UI still complained about the missing xformers module and proceeded without it.

To resolve this issue, the Hugging Face AI community recommends installing xformers for GPU memory-efficient attention [3].

However, I was unsuccessful in installing xformers with Python 3.9.x and discovered that the SD web UI is tested against Python 3.10.6. I therefore compiled and installed Python 3.10.6, but webui.sh then failed with an error message about lzma. I resolved this by installing the liblzma-dev and lzma Debian packages with the apt package manager and recompiling Python 3.10.6, as instructed in a Stack Overflow post [4].

After I started the webui.sh script with the --xformers argument, the SD web UI automatically installed xformers. As a result, I was able to generate 768x768 images with the 2.1 model on my NVIDIA GeForce RTX 3070 Ti 8GB 150 W GPU without encountering the OutOfMemoryError.
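If you run into the same problem, two quick checks can save time before recompiling anything: whether the default python3 matches the version the web UI is tested against, and whether its lzma module (the missing piece in my case) was compiled in.

```shell
# Check the interpreter version the web UI will pick up by default.
PYVER=$(python3 -c 'import sys; print("%d.%d.%d" % sys.version_info[:3])')
echo "python3 is version $PYVER"

# Check that lzma support was compiled into this interpreter; if it is
# missing, install liblzma-dev first and rebuild Python as described above.
python3 -c 'import lzma' 2>/dev/null \
    && echo "lzma support: OK" \
    || echo "lzma support: MISSING (install liblzma-dev and rebuild)"
```

If the version is not 3.10.x, the web UI may still start, but 3.10.6 is the version it was tested with at the time of writing.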

End.
