Date: 25.06.2025
Stable Diffusion v1.5 Bad CUDA PyTorch Test
Requirements
 - NVIDIA RTX 4070 8GB
 - Workstation 64 GB RAM, 200GB SSD
 - Windows 11
 - Python 3.11
 - NVIDIA Driver 577

GPU memory on my gaming laptop is not sufficient for running Stable Diffusion v1.5, so I will try to use
device_map="balanced" and specify offload_folder="offload" as the directory for model offloading.
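Before relying on offloading, it can help to see how much GPU memory is actually free. The following is a minimal sketch, not part of the original test: the helper names bytes_to_gib and report_gpu_memory are made up for illustration, and the function returns None when PyTorch or a CUDA device is not available.

```python
def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 1024**3


def report_gpu_memory():
    """Return (free_gib, total_gib) for the current CUDA device, or None.

    None means PyTorch is not installed or no CUDA device is visible.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # mem_get_info returns (free, total) in bytes for the current device
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    return bytes_to_gib(free_bytes), bytes_to_gib(total_bytes)


if __name__ == "__main__":
    info = report_gpu_memory()
    if info is None:
        print("No CUDA device visible to PyTorch")
    else:
        print(f"GPU memory: {info[0]:.1f} GiB free of {info[1]:.1f} GiB")
```

Running this once before and once while the pipeline is loaded gives a rough idea of how much of the model ends up on the GPU.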
Steps
Get Stable Diffusion v1.5:
git lfs install
git clone https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5 sd1.5
Prepare the Python environment for CUDA:
python -m venv .venv_llm_sd1.5
.\.venv_llm_sd1.5\Scripts\Activate.ps1
python -m pip install --upgrade pip
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install transformers accelerate diffusers safetensors
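Once the packages are installed, a quick sanity check of the virtual environment can save a failed run. This is only a sketch: the helper name missing_packages is hypothetical, and the package list simply mirrors the pip commands above. It uses the standard library to look the packages up, and only imports torch when it is present.

```python
import importlib.util

# Top-level import names matching the pip installs above
REQUIRED = ("torch", "transformers", "accelerate", "diffusers", "safetensors")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        import torch

        print("PyTorch:", torch.__version__)
        # torch.version.cuda is None for CPU-only builds
        print("Built for CUDA:", torch.version.cuda)
        print("CUDA available:", torch.cuda.is_available())
```

If "Built for CUDA" prints None, pip most likely pulled a CPU-only wheel and the torch install line should be repeated with the index URL.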
Save the following as test_cuda_sd1.5.py, then run it with python .\test_cuda_sd1.5.py:
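import torch
from diffusers import StableDiffusionPipeline

# Should print True; False means the CUDA build of PyTorch is not active.
print(torch.cuda.is_available())

# Load the local clone in half precision and let the loader distribute
# the pipeline components across the available devices.
pipe = StableDiffusionPipeline.from_pretrained(
    "C:\\Users\\admin\\llm\\sd1.5",
    torch_dtype=torch.float16,
    device_map="balanced",
    offload_folder="offload",
    safety_checker=None,  # the safety checker is not loaded
    feature_extractor=None,
    use_safetensors=True,
)

# Generate one 512x512 image and save it as PNG.
out = pipe(
    prompt="vodka in the moon",
    height=512,
    width=512,
    guidance_scale=9,
    num_inference_steps=80,
)
image = out.images[0]
image.save("test.png", format="PNG")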
Open test.png and enjoy the result!
Image generation will be very slow because of the limited GPU memory and the offloading.
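To put a number on "slow", the pipe(...) call can be wrapped in a small timer. The sketch below is an illustration only: the timed helper is a made-up name, it measures wall-clock time with the standard library, and the demo uses a stand-in workload instead of the real pipeline.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, steps=None):
    """Measure wall-clock time of the enclosed block.

    If `steps` is given, the time per step is reported as well.
    """
    stats = {}
    start = time.perf_counter()
    try:
        yield stats
    finally:
        stats["seconds"] = time.perf_counter() - start
        message = f"{label}: {stats['seconds']:.1f} s"
        if steps:
            stats["seconds_per_step"] = stats["seconds"] / steps
            message += f" ({stats['seconds_per_step']:.2f} s/step)"
        print(message)


if __name__ == "__main__":
    # Stand-in workload; in test_cuda_sd1.5.py this would wrap the pipe(...) call
    with timed("dummy workload", steps=80):
        sum(range(100_000))
```

In the test script the usage would be with timed("generation", steps=80): around the out = pipe(...) call, so that runs with and without offloading can be compared.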