Porting/PyTorch

From RCS Wiki

Binaries

Debian

PyTorch is packaged in Debian Bookworm and works fine, with one exception: Bookworm ships PyTorch 1.13 alongside Python 3.11, and PyTorch 1.13 is not fully compatible with Python 3.11. To make Debian's PyTorch work properly with Python 3.11, apply this patch to /usr/lib/python3/dist-packages/torch/distributed/_shard/sharded_tensor/metadata.py:

-    tensor_properties: TensorProperties = field(default=TensorProperties())
+    tensor_properties: TensorProperties = field(default_factory=TensorProperties)

The above patch is derived from this Lightning comment, with a bugfix applied. Whether the patched code is ever reached depends on the application you're running: Real-ESRGAN doesn't need the patch, while Stable Diffusion WebUI does.
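A likely explanation for why the patch is needed (not stated in the linked comment): starting with Python 3.11, the dataclasses module rejects any unhashable object as a field default, whereas earlier versions only rejected list, dict, and set. The sketch below reproduces this with a hypothetical stand-in class, Props, instead of PyTorch's real TensorProperties, so it runs without PyTorch installed.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class Props:
    """Hypothetical stand-in for torch's TensorProperties (not the real class)."""
    requires_grad: bool = False


def define_with_default():
    # Mirrors the unpatched line: one shared instance used as the default.
    @dataclass
    class Meta:
        props: Props = field(default=Props())
    return Meta


def define_with_factory():
    # Mirrors the patched line: a fresh Props is built for every instance.
    @dataclass
    class Meta:
        props: Props = field(default_factory=Props)
    return Meta


try:
    define_with_default()
    default_ok = True
except ValueError as exc:
    # Expected on Python 3.11 and later.
    default_ok = False
    print("default= rejected:", exc)

Meta = define_with_factory()
a, b = Meta(), Meta()
print("Python", sys.version_info[:2], "default= accepted:", default_ok)
print("separate instances:", a.props is not b.props)
```

Besides satisfying Python 3.11, default_factory gives every instance its own Props object rather than one shared default.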

Example workflow to install PyTorch and Real-ESRGAN on Debian Bookworm ppc64le:

sudo apt install python3-torch python3-torchvision python3-opencv python3-llvmlite python3-grpcio python3-pip python3-skimage python3-numba
GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=true pip3 install --user --break-system-packages basicsr facexlib gfpgan
# Log out and log back in to update $PATH to include ~/.local/bin
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip3 install --user --break-system-packages -r requirements.txt
pip3 install --user --break-system-packages .
# Real-ESRGAN is now installed.
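
As a quick sanity check after the steps above, the following sketch reports which of the expected Python modules can be found. The module list is an assumption based on the packages installed above (cv2 is OpenCV's import name, realesrgan is Real-ESRGAN's); adjust it as needed. It only locates the modules and does not import them, so it says nothing about whether they work at runtime.

```python
import importlib.util

# Assumed import names for the packages installed above.
MODULES = ["torch", "torchvision", "cv2", "basicsr", "facexlib", "gfpgan", "realesrgan"]


def check_modules(names):
    """Return {module name: True/False} depending on whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_modules(MODULES)
for name, found in status.items():
    print(f"{name:12s} {'found' if found else 'MISSING'}")
```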

Stable Diffusion WebUI works fine with Debian's packaged PyTorch and Python, subject to the following caveats:

  • You need an older version of Lightning; checking out this WebUI commit (predates v1.1.0) works fine.
  • You need to patch repositories/CodeFormer/facelib/detection/yolov5face/face_detector.py as follows:
-IS_HIGH_VERSION = tuple(map(int, torch.__version__.split('+')[0].split('.'))) >= (1, 9, 0)
+IS_HIGH_VERSION = tuple(map(int, torch.__version__.split('+')[0].split('a')[0].split('.'))) >= (1, 9, 0)

It looks like a similar patch is already upstreamed to CodeFormer, but simply using that commit via export CODEFORMER_COMMIT_HASH=07c8cc6f6d9b5aee87046177a8d429ec041da54a doesn't work: that commit also changes CodeFormer's vendored BasicSR in ways that aren't compatible with Stable Diffusion WebUI's usage of upstream BasicSR.
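
To illustrate what the face_detector.py patch changes, the sketch below applies the original and patched expressions to a sample version string. The string "1.13.0a0+gitd0f1aa1" is a hypothetical example of a version with an alpha tag, not necessarily what Debian's package reports; check torch.__version__ on your system.

```python
def parse_unpatched(version: str) -> tuple:
    """Original expression: every dot-separated component must be purely numeric."""
    return tuple(map(int, version.split('+')[0].split('.')))


def parse_patched(version: str) -> tuple:
    """Patched expression: additionally drops everything from the first 'a' onward."""
    return tuple(map(int, version.split('+')[0].split('a')[0].split('.')))


sample = "1.13.0a0+gitd0f1aa1"  # hypothetical version string with an alpha tag

try:
    parse_unpatched(sample)
    unpatched_fails = False
except ValueError:
    # int("0a0") is not a valid integer literal.
    unpatched_fails = True

patched = parse_patched(sample)
print("unpatched raises ValueError:", unpatched_fails)
print("patched result:", patched, "high version:", patched >= (1, 9, 0))
```

Note that splitting on 'a' only handles alpha-style tags; a version containing another non-numeric tag such as "rc1" would still fail with the patched expression.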

Once you've done this, install the python3-sentencepiece package, and then the following script will launch the WebUI:

export TORCH_COMMAND=true
export COMMANDLINE_ARGS='--skip-version-check --skip-torch-cuda-test --no-half'
export PIP_USER=1
export PIP_BREAK_SYSTEM_PACKAGES=1
python3 launch.py
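
If you'd rather keep these settings out of your shell session, the same launch can be sketched as a small Python wrapper. This is only an illustration: the function names webui_env and launch are made up here, and the variables and values are copied from the script above. Run it from the WebUI checkout directory.

```python
import os
import subprocess
import sys


def webui_env(base=None) -> dict:
    """Return a copy of the environment with the WebUI settings from the script above."""
    env = dict(os.environ if base is None else base)
    env["TORCH_COMMAND"] = "true"
    env["COMMANDLINE_ARGS"] = "--skip-version-check --skip-torch-cuda-test --no-half"
    # pip reads these two variables as the --user and --break-system-packages options.
    env["PIP_USER"] = "1"
    env["PIP_BREAK_SYSTEM_PACKAGES"] = "1"
    return env


def launch() -> int:
    """Run launch.py with the current interpreter; returns the process exit code."""
    return subprocess.call([sys.executable, "launch.py"], env=webui_env())


# Usage, from inside the WebUI checkout:
#     raise SystemExit(launch())
```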

Conda

Open Cognitive Environment (Open-CE) provides distro-independent ppc64le binaries of PyTorch and related packages. Example workflow to install PyTorch and Real-ESRGAN on Fedora 38 ppc64le:

sudo dnf install python3.10
sudo dnf install conda
conda create --name pytorch python=3.10
# Close and re-open terminal
conda activate pytorch
conda install -c https://ftp.osuosl.org/pub/open-ce/current/ pytorch-cpu torchvision-cpu py-opencv
conda install -c numba llvmlite
conda install -c conda-forge grpcio
conda install -c conda-forge libstdcxx-ng
conda install pip
GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=true pip install basicsr facexlib gfpgan
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install -r requirements.txt
python setup.py develop
# Real-ESRGAN is now installed.

Example script to run Stable Diffusion WebUI (assuming you've cloned from Git) on Fedora 38 using the above installed PyTorch:

export TORCH_COMMAND=true
export COMMANDLINE_ARGS='--skip-version-check --skip-torch-cuda-test --no-half'
python launch.py

Docker

MarcusC maintains a Dockerfile that sets up PyTorch 2.0.1 and Stable Diffusion WebUI 1.5.2 on ppc64le.

Finished

In progress

See Also