LLM Laboratory

Date: 26.12.2025

Compiling FlashAttention for CUDA 12.8

Hot links

If a link no longer works, check archive.org

Requirements

Ubuntu 24.04
PyTorch 2.7.1
Python 3.12
NVIDIA driver 570
CUDA toolkit 12.8.0

Test environment

Workstation with 40 GB RAM, 500 GB SSD, 750 W power supply
Ubuntu 24.04 LTS with the HWE kernel
Python 3.12 installed

My test environment: HP Z440 + NVIDIA RTX 3090

Ubuntu preparation

sudo apt-get install --install-recommends linux-generic-hwe-24.04
hwe-support-status --verbose
sudo apt dist-upgrade
sudo reboot

Driver setup

Install the nvidia-driver-570 package together with clinfo, then reboot:

sudo apt install nvidia-driver-570 clinfo
sudo reboot

Install CUDA

wget https://developer.download.nvidia.com/compute/cuda/12.8.0/local_installers/cuda_12.8.0_570.86.10_linux.run
sudo sh cuda_12.8.0_570.86.10_linux.run --toolkit --samples

echo 'export PATH=/usr/local/cuda-12.8/bin:$PATH' | sudo tee /etc/profile.d/cuda.sh
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:$LD_LIBRARY_PATH' | sudo tee -a /etc/profile.d/cuda.sh
echo 'export CUDA_HOME=/usr/local/cuda-12.8' | sudo tee -a /etc/profile.d/cuda.sh
source /etc/profile.d/cuda.sh
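
The three exports above can be sanity-checked from Python before continuing. This is a minimal sketch, assuming the /usr/local/cuda-12.8 prefix used in this guide; the helper name check_cuda_env is hypothetical and not part of any library.

```python
import os

CUDA_PREFIX = "/usr/local/cuda-12.8"  # install prefix used in this guide


def check_cuda_env(env, prefix=CUDA_PREFIX):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    if env.get("CUDA_HOME") != prefix:
        problems.append(f"CUDA_HOME is {env.get('CUDA_HOME')!r}, expected {prefix!r}")
    if f"{prefix}/bin" not in env.get("PATH", "").split(os.pathsep):
        problems.append(f"{prefix}/bin is missing from PATH")
    if f"{prefix}/lib64" not in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        problems.append(f"{prefix}/lib64 is missing from LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    # An empty problem list means all three variables point at the toolkit.
    for line in check_cuda_env(os.environ) or ["CUDA environment looks consistent"]:
        print(line)
```

Run it in a new shell (or after sourcing cuda.sh) so that the variables are actually loaded.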

Check installation

nvidia-smi
clinfo
nvcc --version
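
To confirm that the nvcc on PATH is really the 12.8 toolkit and not an older one, its version banner can be parsed. A minimal sketch, assuming the banner contains a "release X.Y" fragment; the function name parse_nvcc_release is hypothetical.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


if __name__ == "__main__":
    if shutil.which("nvcc") is None:
        print("nvcc not found on PATH; re-check /etc/profile.d/cuda.sh")
    else:
        out = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True
        ).stdout
        # This guide expects the 12.8 toolkit, i.e. (12, 8).
        print("nvcc release:", parse_nvcc_release(out))
```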

Install PyTorch

Prepare a Python environment for CUDA:

mkdir -p ~/llm && cd ~/llm
python3 -m venv .venv_llm
source ./.venv_llm/bin/activate
python -m pip install --upgrade pip
pip install "torch==2.7.1" "torchvision==0.22.1" "torchaudio==2.7.1" --index-url https://download.pytorch.org/whl/cu128
pip install transformers accelerate

Check PyTorch installation

python3 -c "import torch; print(torch.__version__); print(torch.cuda.is_available());print(torch.cuda.get_device_name(0));"
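
FlashAttention is compiled against the CUDA version that PyTorch was built for, so it helps to confirm that the wheel and the toolkit agree. A minimal sketch, assuming the 12.8 toolkit from this guide; the helper names are hypothetical, and the function returns a message instead of failing when torch or a GPU is missing.

```python
def same_major_minor(a, b):
    """Compare two version strings on their first two components."""
    return a.split(".")[:2] == b.split(".")[:2]


def describe_torch_cuda(expected_cuda="12.8"):
    """Summarise the PyTorch build and GPU, or explain what is missing."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if not torch.cuda.is_available():
        return "torch imported, but CUDA is not available"
    built = torch.version.cuda or "unknown"
    major, minor = torch.cuda.get_device_capability(0)
    status = "matches" if same_major_minor(built, expected_cuda) else "differs from"
    return (
        f"torch {torch.__version__}, built for CUDA {built} "
        f"({status} {expected_cuda}), GPU compute capability {major}.{minor}"
    )


if __name__ == "__main__":
    print(describe_torch_cuda())
```

On the RTX 3090 used here the reported compute capability should be 8.6.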

Build FlashAttention

Install build dependencies

pip install setuptools wheel
pip install packaging ninja

Compile FlashAttention and install it into the virtualenv

MAX_JOBS=4 pip install "flash-attn==2.6.3" --no-build-isolation
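
MAX_JOBS limits how many compile jobs ninja runs in parallel; too many jobs can exhaust RAM during the build. The sketch below shows one way to pick a value for another machine. The 10 GB per job figure is an assumption chosen so that the 40 GB test workstation maps to MAX_JOBS=4; it is not a measured number, and suggest_max_jobs is a hypothetical helper.

```python
import os


def suggest_max_jobs(ram_gb, cpu_count, gb_per_job=10):
    """Cap parallel compile jobs by RAM; gb_per_job is a rough assumption."""
    by_ram = int(ram_gb // gb_per_job)
    # Never exceed the CPU count, and always allow at least one job.
    return max(1, min(cpu_count, by_ram))


if __name__ == "__main__":
    # 40 GB is the RAM of the test workstation in this guide.
    print("MAX_JOBS =", suggest_max_jobs(40, os.cpu_count() or 1))
```

If the build is killed by the out-of-memory handler, lower the value and retry.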

Check FlashAttention installation

python -c "import flash_attn; print(flash_attn.__version__)"
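
A successful import does not prove that the compiled kernels run on the GPU. The following sketch runs FlashAttention on random tensors and compares the result with PyTorch's built-in scaled dot-product attention. The tensor sizes and the 1e-2 tolerance are assumptions suited to fp16, and the function name flash_attn_smoke_test is hypothetical.

```python
def flash_attn_smoke_test():
    """Compare flash_attn_func with PyTorch SDPA; return a status string."""
    try:
        import torch
        import torch.nn.functional as F
        from flash_attn import flash_attn_func
    except ImportError as exc:
        return f"skipped: {exc}"
    if not torch.cuda.is_available():
        return "skipped: CUDA is not available"

    torch.manual_seed(0)
    # flash_attn_func expects (batch, seqlen, nheads, headdim) in fp16 or bf16.
    q, k, v = (
        torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
        for _ in range(3)
    )
    out = flash_attn_func(q, k, v, causal=True)
    # SDPA expects (batch, nheads, seqlen, headdim), hence the transposes.
    ref = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2), is_causal=True
    ).transpose(1, 2)
    max_err = (out - ref).abs().max().item()
    if max_err < 1e-2:
        return f"ok: max abs error {max_err:.4f}"
    return f"mismatch: max abs error {max_err:.4f}"


if __name__ == "__main__":
    print(flash_attn_smoke_test())
```

A result starting with "ok" means the freshly built kernels run and agree with the reference implementation within the chosen tolerance.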

It works!