Difference between revisions of "Porting/PyTorch"

From RCS Wiki

= Binaries =

== Debian ==

PyTorch is packaged in Debian Bookworm and works, with one exception: Debian ships PyTorch 1.13 alongside Python 3.11, and that PyTorch release is not fully compatible with Python 3.11. Python 3.11's <code>dataclasses</code> module rejects any unhashable object as a field default, which the first line of the patch below relies on. To make Debian's PyTorch work properly with Python 3.11, apply this patch to <code>/usr/lib/python3/dist-packages/torch/distributed/_shard/sharded_tensor/metadata.py</code>:

 -    tensor_properties: TensorProperties = field(default=TensorProperties())
 +    tensor_properties: TensorProperties = field(default_factory=TensorProperties)

The above patch is derived from [https://github.com/Lightning-AI/lightning/issues/15614#issuecomment-1336194917 this Lightning comment], with a bugfix applied. Whether the patched code is reached depends on the application: Real-ESRGAN doesn't need the patch, while Stable Diffusion WebUI does.
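
The failure that the patch avoids can be reproduced without PyTorch. The following is a minimal sketch, assuming a stand-in dataclass named <code>Props</code> (hypothetical, not PyTorch's <code>TensorProperties</code>). On Python 3.11 and later, <code>dataclasses</code> refuses an unhashable instance as a field default, while <code>default_factory</code> is accepted and gives each object its own instance.

```python
from dataclasses import dataclass, field


@dataclass
class Props:
    """Hypothetical stand-in for TensorProperties: a plain, unhashable dataclass."""
    requires_grad: bool = False


def define_with_shared_default():
    """Try the unpatched pattern; return the error message, or None if accepted."""
    try:
        @dataclass
        class Meta:
            props: Props = field(default=Props())
    except ValueError as exc:  # raised on Python 3.11+ ("mutable default ...")
        return str(exc)
    return None  # older Pythons accept it, but every Meta then shares one Props


@dataclass
class PatchedMeta:
    # The patched pattern: a fresh Props is built for each PatchedMeta.
    props: Props = field(default_factory=Props)


if __name__ == "__main__":
    print("unpatched pattern:", define_with_shared_default() or "accepted")
    a, b = PatchedMeta(), PatchedMeta()
    print("patched instances share props:", a.props is b.props)
```

Running the script under the system Python shows which behaviour applies to the interpreter in use.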

Example workflow to install PyTorch and [https://github.com/xinntao/Real-ESRGAN Real-ESRGAN] on Debian Bookworm ppc64le:

 sudo apt install python3-torch python3-torchvision python3-opencv python3-llvmlite python3-grpcio python3-pip python3-skimage python3-numba
 GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=true pip3 install --user --break-system-packages basicsr facexlib gfpgan
 # Log out and log back in so that $PATH includes ~/.local/bin
 git clone https://github.com/xinntao/Real-ESRGAN.git
 cd Real-ESRGAN
 pip3 install --user --break-system-packages -r requirements.txt
 pip3 install --user --break-system-packages .
 # Real-ESRGAN is now installed.
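
After the workflow above, it can help to confirm that the expected modules are importable before running anything heavier. A minimal sketch, assuming the import names listed in <code>EXPECTED</code> correspond to the packages installed above; the list and the helper name <code>missing_modules</code> are made up for illustration and may need adjusting:

```python
import importlib.util

# Assumed import names for the packages installed above.
EXPECTED = ["torch", "torchvision", "cv2", "skimage", "numba",
            "llvmlite", "grpc", "basicsr", "facexlib", "gfpgan", "realesrgan"]


def missing_modules(names):
    """Return the top-level module names that cannot be located on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

The check only locates the modules and does not import them, so it will not catch a package that is present but fails at import time.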

[https://github.com/AUTOMATIC1111/stable-diffusion-webui Stable Diffusion WebUI] works with Debian's packaged PyTorch and Python, subject to the following caveats:

* You need an older version of Lightning; checking out [https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/009bc9f534a4f6d19ece5b5dafe3421421085fb1 this WebUI commit] (which predates v1.1.0) works.
* You need to patch <code>repositories/CodeFormer/facelib/detection/yolov5face/face_detector.py</code> as follows:

 -IS_HIGH_VERSION = tuple(map(int, torch.__version__.split('+')[0].split('.'))) >= (1, 9, 0)
 +IS_HIGH_VERSION = tuple(map(int, torch.__version__.split('+')[0].split('a')[0].split('.'))) >= (1, 9, 0)

A similar patch appears to be already [https://github.com/sczhou/CodeFormer/commit/07c8cc6f6d9b5aee87046177a8d429ec041da54a upstreamed to CodeFormer], but simply selecting that commit via <code>export CODEFORMER_COMMIT_HASH=07c8cc6f6d9b5aee87046177a8d429ec041da54a</code> doesn't work: the commit also changes CodeFormer's vendored BasicSR, and those changes aren't compatible with Stable Diffusion WebUI's use of upstream BasicSR.
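
The effect of the CodeFormer patch can be tried in isolation. A minimal sketch, assuming a version string of the form <code>1.13.0a0+gitabc</code>; this is a made-up example of a pre-release version with an <code>a</code> suffix, so check <code>torch.__version__</code> on your system for the real value. The helper name <code>parse_version</code> is hypothetical:

```python
def parse_version(version, strip_alpha):
    """Parse a torch-style version string into a tuple of ints.

    strip_alpha=False mirrors the original expression;
    strip_alpha=True mirrors the patched one.
    """
    core = version.split('+')[0]      # drop the local part, e.g. "+gitabc"
    if strip_alpha:
        core = core.split('a')[0]     # drop an alpha suffix, e.g. "a0"
    return tuple(map(int, core.split('.')))


if __name__ == "__main__":
    example = "1.13.0a0+gitabc"       # assumed example, not read from torch
    try:
        parse_version(example, strip_alpha=False)
    except ValueError as exc:
        print("original expression fails:", exc)
    patched = parse_version(example, strip_alpha=True)
    print("patched expression gives", patched, "->", patched >= (1, 9, 0))
```

The patched expression only handles an <code>a</code> suffix; other pre-release markers would still need their own handling.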

Once you've done this, install the <code>python3-sentencepiece</code> package; the following script will then launch the WebUI:

 export TORCH_COMMAND=true
 export COMMANDLINE_ARGS='--skip-version-check --skip-torch-cuda-test --no-half'
 export PIP_USER=1
 export PIP_BREAK_SYSTEM_PACKAGES=1
 python3 launch.py
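
The same launch can be driven from Python, which makes the environment explicit. A minimal sketch, assuming it is run from the WebUI checkout directory; the helper name <code>webui_env</code> is made up for illustration. <code>PIP_USER</code> and <code>PIP_BREAK_SYSTEM_PACKAGES</code> are pip's environment-variable forms of <code>--user</code> and <code>--break-system-packages</code>.

```python
import os
import subprocess
import sys


def webui_env(base=None):
    """Return a copy of the environment with the variables from the script above."""
    env = dict(os.environ if base is None else base)
    env.update({
        "TORCH_COMMAND": "true",
        "COMMANDLINE_ARGS": "--skip-version-check --skip-torch-cuda-test --no-half",
        "PIP_USER": "1",                    # pip's env form of --user
        "PIP_BREAK_SYSTEM_PACKAGES": "1",   # pip's env form of --break-system-packages
    })
    return env


if __name__ == "__main__":
    # Assumes launch.py is in the current directory (the WebUI checkout).
    if os.path.exists("launch.py"):
        sys.exit(subprocess.call([sys.executable, "launch.py"], env=webui_env()))
    print("launch.py not found; run this from the WebUI checkout")
```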

== Conda ==

[https://github.com/open-ce/open-ce/ Open Cognitive Environment (Open-CE)] provides distro-independent ppc64le binaries of PyTorch and related packages. Example workflow to install PyTorch and [https://github.com/xinntao/Real-ESRGAN Real-ESRGAN] on Fedora 38 ppc64le:

 sudo dnf install python3.10
 sudo dnf install conda
 conda create --name pytorch python=3.10
 # Close and re-open terminal
 conda activate pytorch
 conda install -c https://ftp.osuosl.org/pub/open-ce/current/ pytorch-cpu torchvision-cpu py-opencv
 conda install -c numba llvmlite
 conda install -c conda-forge grpcio
 conda install -c conda-forge libstdcxx-ng
 conda install pip
 GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=true pip install basicsr facexlib gfpgan
 git clone https://github.com/xinntao/Real-ESRGAN.git
 cd Real-ESRGAN
 pip install -r requirements.txt
 python setup.py develop
 # Real-ESRGAN is now installed.
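
With the <code>pytorch</code> environment active, a quick CPU smoke test can confirm that the Open-CE build loads and computes. A minimal sketch; the function name <code>torch_smoke_test</code> is made up for illustration, and the test only exercises a small matrix multiply, so it says nothing about the rest of the stack:

```python
import platform


def torch_smoke_test():
    """Return a short report string, or None when torch is not importable."""
    try:
        import torch
    except ImportError:
        return None
    product = torch.ones(2, 3) @ torch.ones(3, 2)   # small CPU matrix multiply
    ok = bool((product == 3).all())                 # every entry should equal 3
    return f"torch {torch.__version__} on {platform.machine()}: matmul ok={ok}"


if __name__ == "__main__":
    print(torch_smoke_test() or "torch is not importable in this environment")
```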

Example script to run [https://github.com/AUTOMATIC1111/stable-diffusion-webui Stable Diffusion WebUI] (assuming you've cloned it from Git) on Fedora 38, using the PyTorch installed above:

 export TORCH_COMMAND=true
 export COMMANDLINE_ARGS='--skip-version-check --skip-torch-cuda-test --no-half'
 python launch.py

= Docker =

[[User:MarcusC|MarcusC]] maintains a [https://gist.github.com/zeldin/9f8281632c8792dff84dd5fee6d91ad8 Dockerfile] that sets up PyTorch 2.0.1 and Stable Diffusion WebUI 1.5.2 on ppc64le.

= Finished =

* [https://github.com/pytorch/pytorch/issues/97497 PyTorch Build From Source Error: ppc64le + gcc 7.3.1 + cuda 11.8 + python 3.9]
* [https://github.com/pytorch/pytorch/pull/98511 fallback to cpu_kernel for VSX]
* [https://github.com/pytorch/pytorch/pull/100168 Add missing conversion functions between half and float for ppc64le]

= In progress =

* [https://github.com/pytorch/pytorch/issues/94912 PR #88607 breaks build for POWER9 CPU]
* [https://github.com/pytorch/pytorch/issues/108934 PPC64le: vsx_helpers.h errors]
* [https://github.com/pytorch/pytorch/issues/109777 Wrong vector shift results on PowerPC]
* [https://github.com/pytorch/pytorch/pull/109886 Fix CPU bitwise shifts for out-of-limit values in VSX-vec]

= See Also =

* [[Porting/chaiNNer|chaiNNer]]
* [[Porting/ncnn|ncnn]]
* [[Porting/ONNX|ONNX]]
* [[Porting/OpenCV|OpenCV]]

[[Category:Ports/AI]]

Latest revision as of 15:43, 18 May 2025
