Difference between revisions of "Porting/PyTorch"

(Tagged: Add missing conversion functions between half and float for ppc64le)
(Finished: PyTorch Build From Source Error: ppc64le + gcc 7.3.1 + cuda 11.8 + python 3.9)
Line 69:
  * [https://github.com/pytorch/pytorch/pull/100168 Add missing conversion functions between half and float for ppc64le]
+ * [https://github.com/pytorch/pytorch/issues/97497 PyTorch Build From Source Error: ppc64le + gcc 7.3.1 + cuda 11.8 + python 3.9]

  = In progress =

  * [https://github.com/pytorch/pytorch/issues/94912 PR #88607 breaks build for POWER9 CPU]
- * [https://github.com/pytorch/pytorch/issues/97497 PyTorch Build From Source Error: ppc64le + gcc 7.3.1 + cuda 11.8 + python 3.9]
  * [https://github.com/pytorch/pytorch/pull/98511 fallback to cpu_kernel for VSX]

Revision as of 23:02, 14 October 2023

Binaries

Debian

PyTorch is packaged in Debian Bookworm and works well, with one exception: Debian ships PyTorch 1.13 alongside Python 3.11, and the two are not fully compatible. To make Debian's PyTorch work properly with Python 3.11, apply this patch to /usr/lib/python3/dist-packages/torch/distributed/_shard/sharded_tensor/metadata.py:

-    tensor_properties: TensorProperties = field(default=TensorProperties())
+    tensor_properties: TensorProperties = field(default_factory=TensorProperties)

The above patch is derived from this Lightning comment, with a bugfix applied. Whether the patched code is exercised depends on the application you're running: Real-ESRGAN doesn't need the patch, while Stable Diffusion WebUI does.
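
As a rough illustration of why the patch is needed: from Python 3.11 onward, the dataclasses module rejects any unhashable object as a field default, and an ordinary dataclass instance is unhashable. The sketch below reproduces this with a hypothetical Props class standing in for TensorProperties. The names Props, PatchedMeta and default_instance_is_accepted are invented for this example and are not part of PyTorch.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class Props:
    """Hypothetical stand-in for torch's TensorProperties."""
    requires_grad: bool = False


def default_instance_is_accepted() -> bool:
    """Try the pre-patch pattern: a dataclass instance used as a field default."""
    try:
        @dataclass
        class Meta:
            props: Props = field(default=Props())
    except ValueError:
        # Python 3.11+ refuses unhashable defaults when the class is defined.
        return False
    return True


@dataclass
class PatchedMeta:
    # Post-patch pattern: every instance gets its own freshly built Props.
    props: Props = field(default_factory=Props)


if __name__ == "__main__":
    print("Python", sys.version_info[:2],
          "accepts default=Props():", default_instance_is_accepted())
    a, b = PatchedMeta(), PatchedMeta()
    print("equal defaults:", a.props == b.props, "shared object:", a.props is b.props)
```

With default_factory the class definition succeeds on both older and newer Python versions, and instances no longer share a single default object.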

Example workflow to install PyTorch and Real-ESRGAN on Debian Bookworm ppc64le:

sudo apt install python3-torch python3-torchvision python3-opencv python3-llvmlite python3-grpcio python3-pip python3-skimage python3-numba
GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=true pip3 install --user --break-system-packages basicsr facexlib gfpgan
# Log out and log back in to update $PATH to include ~/.local/bin
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip3 install --user --break-system-packages -r requirements.txt
pip3 install --user --break-system-packages .
# Real-ESRGAN is now installed.
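
To check that the packages installed above can actually be imported, a short Python script along these lines may help. It is only a sketch: the module list is an assumption based on the packages named above (cv2 is the import name of OpenCV), and the helper name module_versions is made up for this example.

```python
import importlib
import platform


def module_versions(names):
    """Map each module name to its __version__ string, or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Not every module defines __version__, so fall back to a placeholder.
            found[name] = str(getattr(module, "__version__", "unknown"))
    return found


if __name__ == "__main__":
    print("machine:", platform.machine())
    for name, version in module_versions(["torch", "torchvision", "cv2"]).items():
        print(f"{name}: {version if version is not None else 'not importable'}")
```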

Stable Diffusion WebUI works fine with Debian's packaged PyTorch and Python, subject to the following caveats:

  • You need an older version of Lightning; checking out this WebUI commit works fine.
  • You need to patch repositories/CodeFormer/facelib/detection/yolov5face/face_detector.py as follows:
-IS_HIGH_VERSION = tuple(map(int, torch.__version__.split('+')[0].split('.'))) >= (1, 9, 0)
+IS_HIGH_VERSION = tuple(map(int, torch.__version__.split('+')[0].split('a')[0].split('.'))) >= (1, 9, 0)

A similar patch appears to be upstreamed to CodeFormer already, but simply selecting that commit via export CODEFORMER_COMMIT_HASH=07c8cc6f6d9b5aee87046177a8d429ec041da54a doesn't work: that commit also changes CodeFormer's vendored BasicSR in ways that aren't compatible with Stable Diffusion WebUI's use of upstream BasicSR.
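
The effect of the face_detector.py patch can be seen in isolation with a small sketch. The two functions below wrap the original and patched expressions. The version string 1.13.0a0+gitunknown is an assumed example of a pre-release-style version; the exact string reported by Debian's package is not stated here.

```python
def parse_original(version: str) -> tuple:
    """Version parsing as in the unpatched face_detector.py."""
    return tuple(map(int, version.split('+')[0].split('.')))


def parse_patched(version: str) -> tuple:
    """Version parsing with the extra split('a') from the patch."""
    return tuple(map(int, version.split('+')[0].split('a')[0].split('.')))


if __name__ == "__main__":
    assumed_version = "1.13.0a0+gitunknown"  # assumption, for illustration only
    try:
        parse_original(assumed_version)
    except ValueError as exc:
        # int("0a0") fails, so the unpatched check raises on import.
        print("original parsing fails:", exc)
    print("patched parsing:", parse_patched(assumed_version))
    print("IS_HIGH_VERSION:", parse_patched(assumed_version) >= (1, 9, 0))
```

For ordinary release strings without an "a" suffix, both expressions give the same tuple, so the patch doesn't change behaviour there.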

Once you've done this, install the python3-sentencepiece package, and then the following script will launch the WebUI:

export TORCH_COMMAND=true
export COMMANDLINE_ARGS='--skip-version-check --skip-torch-cuda-test --no-half'
export PIP_USER=1
export PIP_BREAK_SYSTEM_PACKAGES=1
python3 launch.py
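
If you'd rather not export these variables in your shell, the same launch can be sketched as a small Python wrapper. The variable names and values are copied from the script above; the helper names build_env and launch are hypothetical, and the wrapper assumes it is run from the WebUI checkout directory.

```python
import os
import subprocess
import sys

# Values copied verbatim from the shell script above.
WEBUI_ENV = {
    "TORCH_COMMAND": "true",
    "COMMANDLINE_ARGS": "--skip-version-check --skip-torch-cuda-test --no-half",
    "PIP_USER": "1",
    "PIP_BREAK_SYSTEM_PACKAGES": "1",
}


def build_env(base=None):
    """Return a copy of the base environment with the WebUI settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(WEBUI_ENV)
    return env


def launch(script="launch.py"):
    """Run the WebUI launcher with the current interpreter and return its exit code."""
    return subprocess.call([sys.executable, script], env=build_env())


if __name__ == "__main__":
    if os.path.exists("launch.py"):
        sys.exit(launch())
    print("launch.py not found; run this from the WebUI checkout")
```

Because the settings are applied only to the child process, your login shell's environment stays unchanged.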

Conda

Open Cognitive Environment (Open-CE) provides distro-independent ppc64le binaries of PyTorch and related packages. Example workflow to install PyTorch and Real-ESRGAN on Fedora 38 ppc64le:

sudo dnf install python3.10
sudo dnf install conda
conda create --name pytorch python=3.10
# Close and re-open terminal
conda activate pytorch
conda install -c https://ftp.osuosl.org/pub/open-ce/current/ pytorch-cpu torchvision-cpu py-opencv
conda install -c numba llvmlite
conda install -c conda-forge grpcio
conda install -c conda-forge libstdcxx-ng
conda install pip
GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=true pip install basicsr facexlib gfpgan
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install -r requirements.txt
python setup.py develop
# Real-ESRGAN is now installed.

Example script to run Stable Diffusion WebUI (assuming you've cloned from Git) on Fedora 38 using the above installed PyTorch:

export TORCH_COMMAND=true
export COMMANDLINE_ARGS='--skip-version-check --skip-torch-cuda-test --no-half'
python launch.py

Finished

In progress

See Also