Porting/PyTorch

Latest revision as of 03:39, 17 October 2023

Binaries

Debian

PyTorch is packaged in Debian Bookworm and works, with one known bug: Debian ships PyTorch 1.13 alongside Python 3.11, and the two are not fully compatible. To make Debian's PyTorch work with Python 3.11, apply this patch to /usr/lib/python3/dist-packages/torch/distributed/_shard/sharded_tensor/metadata.py:

-    tensor_properties: TensorProperties = field(default=TensorProperties())
+    tensor_properties: TensorProperties = field(default_factory=TensorProperties)

The above patch is derived from a comment on a Lightning issue (https://github.com/Lightning-AI/lightning/issues/15614#issuecomment-1336194917), with a bugfix applied. Whether the patched code is triggered depends on the application you're running: Real-ESRGAN doesn't need the patch, while Stable Diffusion WebUI does.
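
Why the one-line change matters can be sketched with plain dataclasses. The classes below are simplified stand-ins (an assumption for illustration), not the real torch definitions. On Python 3.11 and later, an unhashable dataclass instance is rejected as a field default when the class is defined, whereas default_factory builds a fresh object for each instance:

```python
from dataclasses import dataclass, field


@dataclass
class TensorProperties:
    # Stand-in for torch's TensorProperties (assumption: the real class has more fields).
    requires_grad: bool = False


def define_with_instance_default():
    """Try the unpatched form; return the error message, or None if it was accepted."""
    try:
        @dataclass
        class Metadata:
            tensor_properties: TensorProperties = field(default=TensorProperties())
    except ValueError as exc:
        # Python 3.11+ refuses unhashable defaults and suggests default_factory.
        return str(exc)
    return None  # older Python versions accept the definition


@dataclass
class ShardedTensorMetadata:
    # Patched form: each instance gets its own TensorProperties object.
    tensor_properties: TensorProperties = field(default_factory=TensorProperties)


a = ShardedTensorMetadata()
b = ShardedTensorMetadata()
error = define_with_instance_default()
```

Here a.tensor_properties and b.tensor_properties compare equal but are distinct objects, and error holds the complaint from the unpatched form when the interpreter rejects it.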

Example workflow to install PyTorch and Real-ESRGAN on Debian Bookworm ppc64le:

sudo apt install python3-torch python3-torchvision python3-opencv python3-llvmlite python3-grpcio python3-pip python3-skimage python3-numba
GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=true pip3 install --user --break-system-packages basicsr facexlib gfpgan
# Log out and log back in so that $PATH includes ~/.local/bin
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip3 install --user --break-system-packages -r requirements.txt
pip3 install --user --break-system-packages .
# Real-ESRGAN is now installed.
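
A quick way to confirm the result of the steps above is to check that the expected modules can be found without importing them. This is a minimal sketch; the list of import names is an assumption based on the packages installed in the workflow, and check_modules is a hypothetical helper:

```python
import importlib.util

# Assumed import names for the packages installed above.
MODULES = ["torch", "torchvision", "cv2", "basicsr", "facexlib", "gfpgan", "realesrgan"]


def check_modules(names=MODULES):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any module reported as MISSING points at the install step to revisit.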

Stable Diffusion WebUI works fine with Debian's packaged PyTorch and Python, subject to the following caveats:

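  • You need an older version of Lightning. Checking out WebUI commit 009bc9f534a4f6d19ece5b5dafe3421421085fb1 (https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/009bc9f534a4f6d19ece5b5dafe3421421085fb1), which predates v1.1.0, works.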
  • You need to patch repositories/CodeFormer/facelib/detection/yolov5face/face_detector.py as follows:
-IS_HIGH_VERSION = tuple(map(int, torch.__version__.split('+')[0].split('.'))) >= (1, 9, 0)
+IS_HIGH_VERSION = tuple(map(int, torch.__version__.split('+')[0].split('a')[0].split('.'))) >= (1, 9, 0)

A similar patch appears to have been upstreamed to CodeFormer (https://github.com/sczhou/CodeFormer/commit/07c8cc6f6d9b5aee87046177a8d429ec041da54a), but simply selecting that commit via export CODEFORMER_COMMIT_HASH=07c8cc6f6d9b5aee87046177a8d429ec041da54a doesn't work: the commit also changes CodeFormer's vendored BasicSR in ways that aren't compatible with Stable Diffusion WebUI's use of upstream BasicSR.
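
The effect of the face_detector.py patch can be checked in isolation. In this sketch the version string is a hypothetical example of an alpha-tagged build (the exact string reported by Debian's package is an assumption), and the function names are illustrative:

```python
def parse_torch_version_unpatched(version):
    # Original expression: fails when a component carries an alpha tag such as "0a0".
    return tuple(map(int, version.split('+')[0].split('.')))


def parse_torch_version(version):
    # Patched expression: drop everything from the first 'a' before splitting on dots.
    return tuple(map(int, version.split('+')[0].split('a')[0].split('.')))


EXAMPLE = "1.13.0a0+gitunknown"  # hypothetical version string

try:
    parse_torch_version_unpatched(EXAMPLE)
    unpatched_failed = False
except ValueError:
    unpatched_failed = True  # int("0a0") is not a valid integer

IS_HIGH_VERSION = parse_torch_version(EXAMPLE) >= (1, 9, 0)
```

The patched expression only handles an 'a' suffix; other pre-release tags such as "rc1" would still fail to parse.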

Once you've done this, install the python3-sentencepiece package, and then the following script will launch the WebUI:

export TORCH_COMMAND=true
export COMMANDLINE_ARGS='--skip-version-check --skip-torch-cuda-test --no-half'
export PIP_USER=1
export PIP_BREAK_SYSTEM_PACKAGES=1
python3 launch.py
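
The same launch can be sketched as a small Python wrapper, which may be convenient when the settings need to be shared between setups. The function names webui_env and launch are hypothetical; the variables and flags are the ones from the script above. TORCH_COMMAND=true appears to replace the WebUI's own torch install step with a no-op, so that the packaged PyTorch is used:

```python
import os
import subprocess


def webui_env(base=None, system_pip=True):
    """Return a copy of the environment with the WebUI settings applied."""
    env = dict(os.environ if base is None else base)
    env["TORCH_COMMAND"] = "true"
    env["COMMANDLINE_ARGS"] = "--skip-version-check --skip-torch-cuda-test --no-half"
    if system_pip:
        # Only needed when pip installs next to distro-managed packages (Debian).
        env["PIP_USER"] = "1"
        env["PIP_BREAK_SYSTEM_PACKAGES"] = "1"
    return env


def launch(webui_dir, python="python3", system_pip=True):
    """Run launch.py from the WebUI checkout and return its exit code."""
    result = subprocess.run([python, "launch.py"], cwd=webui_dir, env=webui_env(system_pip=system_pip))
    return result.returncode
```

Calling launch("stable-diffusion-webui") from the parent directory of the checkout would mirror the shell script; passing system_pip=False leaves out the two pip variables.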

Conda

Open Cognitive Environment (Open-CE) provides distro-independent ppc64le binaries of PyTorch and related packages. Example workflow to install PyTorch and Real-ESRGAN on Fedora 38 ppc64le:

sudo dnf install python3.10
sudo dnf install conda
conda create --name pytorch python=3.10
# Close and re-open terminal
conda activate pytorch
conda install -c https://ftp.osuosl.org/pub/open-ce/current/ pytorch-cpu torchvision-cpu py-opencv
conda install -c numba llvmlite
conda install -c conda-forge grpcio
conda install -c conda-forge libstdcxx-ng
conda install pip
GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=true pip install basicsr facexlib gfpgan
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install -r requirements.txt
python setup.py develop
# Real-ESRGAN is now installed.

Example script to run Stable Diffusion WebUI (assuming you've cloned from Git) on Fedora 38 using the above installed PyTorch:

export TORCH_COMMAND=true
export COMMANDLINE_ARGS='--skip-version-check --skip-torch-cuda-test --no-half'
python launch.py

Docker

MarcusC maintains a Dockerfile (https://gist.github.com/zeldin/9f8281632c8792dff84dd5fee6d91ad8) that sets up PyTorch 2.0.1 and Stable Diffusion WebUI 1.5.2 on ppc64le.

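Finished

  • PyTorch Build From Source Error: ppc64le + gcc 7.3.1 + cuda 11.8 + python 3.9 (https://github.com/pytorch/pytorch/issues/97497)
  • fallback to cpu_kernel for VSX (https://github.com/pytorch/pytorch/pull/98511)
  • Add missing conversion functions between half and float for ppc64le (https://github.com/pytorch/pytorch/pull/100168)

In progress

  • PR #88607 breaks build for POWER9 CPU (https://github.com/pytorch/pytorch/issues/94912)
  • PPC64le: vsx_helpers.h errors (https://github.com/pytorch/pytorch/issues/108934)
  • Wrong vector shift results on PowerPC (https://github.com/pytorch/pytorch/issues/109777)
  • Fix CPU bitwise shifts for out-of-limit values in VSX-vec (https://github.com/pytorch/pytorch/pull/109886)

See Also

  • chaiNNer (Porting/chaiNNer)
  • ncnn (Porting/ncnn)
  • ONNX (Porting/ONNX)

Category: Ports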