Difference between revisions of "Porting/ncnn"
JeremyRand (talk | contribs) (→Finished: python: document CMAKE_TOOLCHAIN_FILE env var)
JeremyRand (talk | contribs) (Move to subcategory)
Latest revision as of 15:40, 18 May 2025
Finished
- Translate x86_64 SSE to ppc64le VSX intrinsics
- VSX toolchains: check for SSE2 support
- Add POWER8 VSX toolchains
- load_param_mem pybind
- test_squeezenet failed under Lubuntu 16.04 PowerPC 32-bit
- support big endian platform, add powerpc ci
- Update POWER Clang version docs
- Update Vulkan dependency docs
- Document libomp-dev dependency
- python: document CMAKE_TOOLCHAIN_FILE env var
In progress
- CI missing for POWER9/Clang
- Replace SSE with native VSX
VSX Targets
When running Real-ESRGAN in ncnn on POWER9, most CPU time (over 81%) is spent inside gemm_transB_packed_tile in convolution_3x3_winograd.h, which uses SSE2. This may be a good target for rewriting in native VSX.
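To illustrate what a native VSX rewrite of such a kernel could look like, here is a minimal sketch of a 4x4 multiply-accumulate micro-tile. It is not ncnn's actual gemm_transB_packed_tile: the function name gemm_4x4_tile, the packed layout (four floats per k step for each operand), and the column-major output are assumptions made for illustration only. The VSX path uses the vec_xl, vec_splats, vec_madd and vec_xst intrinsics from altivec.h directly, with no SSE translation layer, and a scalar fallback keeps the sketch buildable on other architectures.

```cpp
#include <cassert>

#if defined(__VSX__)
#include <altivec.h>
#endif

// Hypothetical micro-kernel (not taken from ncnn):
//   C[j*4 + i] += sum over k of pA[4*k + i] * pB[4*k + j]
// pA and pB each hold K groups of 4 packed floats; C is a 4x4 tile
// stored column by column (16 floats).
static void gemm_4x4_tile(const float* pA, const float* pB, float* C, int K)
{
    assert(K >= 0);

#if defined(__VSX__)
    // Native VSX path: one vector register per output column.
    __vector float c0 = vec_xl(0, C + 0);
    __vector float c1 = vec_xl(0, C + 4);
    __vector float c2 = vec_xl(0, C + 8);
    __vector float c3 = vec_xl(0, C + 12);

    for (int k = 0; k < K; k++)
    {
        // Load 4 values of A, then multiply-add against each broadcast B value.
        __vector float a = vec_xl(0, pA + 4 * k);
        c0 = vec_madd(a, vec_splats(pB[4 * k + 0]), c0);
        c1 = vec_madd(a, vec_splats(pB[4 * k + 1]), c1);
        c2 = vec_madd(a, vec_splats(pB[4 * k + 2]), c2);
        c3 = vec_madd(a, vec_splats(pB[4 * k + 3]), c3);
    }

    vec_xst(c0, 0, C + 0);
    vec_xst(c1, 0, C + 4);
    vec_xst(c2, 0, C + 8);
    vec_xst(c3, 0, C + 12);
#else
    // Portable scalar fallback computing the same result.
    for (int k = 0; k < K; k++)
    {
        for (int j = 0; j < 4; j++)
        {
            for (int i = 0; i < 4; i++)
            {
                C[j * 4 + i] += pA[4 * k + i] * pB[4 * k + j];
            }
        }
    }
#endif
}
```

Calling gemm_4x4_tile(A, B, C, K) with a zero-initialized C accumulates the product of the two packed tiles into C. Whether this shape matches the real kernel's tiling would need to be checked against convolution_3x3_winograd.h before any port is attempted.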