Operating System Specific Workarounds/Debian

From RCS Wiki
Revision as of 09:19, 10 February 2023 by JeremyRand (talk | contribs) (→‎Blank VGA: Link to BMC Client Console docs)

Debian-Specific Issues and Workarounds

Bullseye

Blank VGA

Cause

The required open-source (libre) video driver module (ast) is not included in the Debian installer.

Upstream Bug Report

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990016

Status

Root cause found, untested patch provided.

Workaround

Debian can be installed over the serial console and then switched to VGA after installation (that is, once the ASPEED driver module, ast, is available to the kernel). The following instructions assume you have SSH shell access to the BMC:

  • Install
  1. With system power off, connect to the BMC Client Console.
  2. Attach the Debian install media.
  3. Power on the system.
  4. When the Petitboot screen appears, select the Debian install option you want and press e to edit.
  5. Highlight the "Boot arguments" field and append console=hvc0 after quiet. Ensure there is at least one space between quiet and console=hvc0.
  6. Highlight OK and press Enter.
  7. Ensure the previously-selected Debian install option is still selected, then press Enter.
  8. Install Debian normally.
  • Switch from serial to VGA text console
  1. Edit /etc/initramfs-tools/modules and append ast to it
  2. As root, run update-initramfs -u -k all
  3. Edit /etc/default/grub and change the console=hvc0 string to console=tty0
  4. As root, run update-grub
  5. Reboot
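
The two file edits in the post-install steps above can also be scripted. The following is a minimal sketch rather than a tested procedure: the helper names add_initramfs_module and switch_console_to_vga are hypothetical, and both take the file path as an argument so they can be tried on copies before touching /etc. It assumes GNU sed, as shipped with Debian.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative and not part of Debian.

# Append a module name to an initramfs modules file unless it is already listed.
add_initramfs_module() {
    module=$1
    modules_file=$2
    grep -qxF "$module" "$modules_file" 2>/dev/null ||
        printf '%s\n' "$module" >> "$modules_file"
}

# Replace the serial console argument with the VGA text console
# everywhere it appears in a GRUB defaults file.
switch_console_to_vga() {
    grub_file=$1
    sed -i 's/console=hvc0/console=tty0/g' "$grub_file"
}

# Intended use, as root on the installed system (commands taken from the steps above):
#   add_initramfs_module ast /etc/initramfs-tools/modules
#   update-initramfs -u -k all
#   switch_console_to_vga /etc/default/grub
#   update-grub
#   reboot
```

Because add_initramfs_module checks for an existing entry first, running it more than once does not add duplicate lines.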

Blank output on GPU after blacklisting AST

Cause

Debian appears to have an oversight here: installing the firmware-amd-graphics package does not automatically update the initramfs to include the amdgpu module and the GPU firmware.

Workaround

  • Manually add amdgpu to the initramfs
  1. Append amdgpu to /etc/initramfs-tools/modules
  2. As root, run update-initramfs -u
  3. Edit /etc/default/grub and change GRUB_CMDLINE_LINUX= to GRUB_CMDLINE_LINUX="modprobe.blacklist=ast video=offb:off console=tty0"
  4. Run update-grub as root
  5. Shut down the system
  6. With BMC power disconnected, verify that the J10109 jumper is set to disabled before booting again
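
Step 3 above can be scripted as follows. This is a minimal sketch under stated assumptions: the helper name set_grub_cmdline_linux is hypothetical, it overwrites whatever GRUB_CMDLINE_LINUX currently holds (the step above assumes it starts out empty), and it relies on GNU sed as shipped with Debian.

```shell
#!/bin/sh
# Sketch only: the helper name is illustrative and not part of Debian.

# Set GRUB_CMDLINE_LINUX in a GRUB defaults file to the given arguments.
# Any existing value is replaced; GRUB_CMDLINE_LINUX_DEFAULT is left alone.
# The arguments must not contain '|' or '&', which are special to this sed call.
set_grub_cmdline_linux() {
    grub_file=$1
    new_args=$2
    if grep -q '^GRUB_CMDLINE_LINUX=' "$grub_file"; then
        sed -i "s|^GRUB_CMDLINE_LINUX=.*|GRUB_CMDLINE_LINUX=\"$new_args\"|" "$grub_file"
    else
        printf 'GRUB_CMDLINE_LINUX="%s"\n' "$new_args" >> "$grub_file"
    fi
}

# Intended use, as root (arguments taken from step 3 above):
#   set_grub_cmdline_linux /etc/default/grub \
#       "modprobe.blacklist=ast video=offb:off console=tty0"
#   update-grub
```

If GRUB_CMDLINE_LINUX already contains arguments you need, merge them by hand instead of using this helper.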

AMD GPUs: OpenGL failure / rectangular boxes instead of fonts in GDM3

Cause

Debian ships a patched Linux kernel that introduces Debian-specific bugs in the AMD GPU driver. One of these bugs impacts all 64k page size systems, including the default kernels for ppc64le, and leads to generic failure of most OpenGL applications, including failure of font rendering in Gnome applications and, most notably, GDM3.
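
To check whether a running system uses 64k pages, compare the CPU page size with the 4 KB GPU page size named in the patch description reproduced in the workaround. The sketch below is illustrative only: the function gart_page_size is a hypothetical shell stand-in for the max_t() expression in the fix, not driver code.

```shell
#!/bin/sh
# AMDGPU_GPU_PAGE_SIZE is 4 KB according to the patch description.
AMDGPU_GPU_PAGE_SIZE=4096

# Echo the larger of the given CPU page size and the GPU page size,
# mirroring max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE) in the fix.
gart_page_size() {
    cpu_page_size=$1
    if [ "$cpu_page_size" -gt "$AMDGPU_GPU_PAGE_SIZE" ]; then
        echo "$cpu_page_size"
    else
        echo "$AMDGPU_GPU_PAGE_SIZE"
    fi
}

# Report for the running kernel; getconf PAGESIZE prints the page size in bytes.
cpu_page_size=$(getconf PAGESIZE)
echo "CPU page size: $cpu_page_size bytes"
echo "gart_page_size with the fix applied: $(gart_page_size "$cpu_page_size") bytes"
```

On a 64k page kernel the unpatched driver reports 4096 where the fixed one reports 65536, which is the mismatch behind the failures described above.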

Upstream Bug Report

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990279

Status

Fixed.

Workaround

Apply the patch in the listed bug report (and reproduced below) to the current Debian kernel, then recompile and install:

From 02c987eb2ab0cdfd536d08bf812f4e37d3cc150a Mon Sep 17 00:00:00 2001
From: Huacai Chen <chenhc@lemote.com>
Date: Tue, 30 Mar 2021 23:33:33 +0800
Subject: [PATCH] drm/amdgpu: Set a suitable dev_info.gart_page_size
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

commit f4d3da72a76a9ce5f57bba64788931686a9dc333 upstream.

In Mesa, dev_info.gart_page_size is used for alignment and it was
set to AMDGPU_GPU_PAGE_SIZE(4KB). However, the page table of AMDGPU
driver requires an alignment on CPU pages.  So, for non-4KB page system,
gart_page_size should be max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE).

Signed-off-by: Rui Wang <wangr@lemote.com>
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Link: https://github.com/loongson-community/linux-stable/commit/caa9c0a1
[Xi: rebased for drm-next, use max_t for checkpatch,
     and reworded commit message.]
Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang>
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1549
Tested-by: Dan Horák <dan@danny.cz>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Salvatore Bonaccorso: Backport to 5.10.y which does not contain
a5a52a43eac0 ("drm/amd/amdgpu/amdgpu_kms: Remove 'struct
drm_amdgpu_info_device dev_info' from the stack") which removes dev_info
from the stack and places it on the heap.]
Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index efda38349a03..917b94002f4b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -766,9 +766,9 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 			dev_info.high_va_offset = AMDGPU_GMC_HOLE_END;
 			dev_info.high_va_max = AMDGPU_GMC_HOLE_END | vm_size;
 		}
-		dev_info.virtual_address_alignment = max((int)PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE);
+		dev_info.virtual_address_alignment = max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE);
 		dev_info.pte_fragment_size = (1 << adev->vm_manager.fragment_size) * AMDGPU_GPU_PAGE_SIZE;
-		dev_info.gart_page_size = AMDGPU_GPU_PAGE_SIZE;
+		dev_info.gart_page_size = max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE);
 		dev_info.cu_active_number = adev->gfx.cu_info.number;
 		dev_info.cu_ao_mask = adev->gfx.cu_info.ao_cu_mask;
 		dev_info.ce_ram_size = adev->gfx.ce_ram_size;
-- 
2.33.0

Note that compiling a new Debian kernel as shown requires at least 24 GB of free disk space, and that you will need to repeat this process every time a Debian kernel security update is released. If you are affected, we strongly recommend that you help press the Debian maintainers to merge the tested and working patch that is languishing in the bug report above, which would spare every affected user this rebuild.

On a default Bullseye system:

  1. apt-get source linux-image-5.10.0-10-powerpc64le
  2. cd linux-5.10.*
  3. Apply the patch above to drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
  4. make bindeb-pkg -j32 (adjust -j to the number of CPUs to use for the build)
  5. cd ..
  6. As root, dpkg -i linux-image-5.10.*_5.10.*-1_ppc64el.deb
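
The build steps above can be wrapped in a script that checks free disk space first. This is a rough sketch, not a tested build recipe: the function names and the patch_file argument are hypothetical, the patch is assumed to have been saved to a file with an absolute path, and a suitable kernel .config is assumed to be in place already, since the steps above do not cover creating one.

```shell
#!/bin/sh
# Sketch only: function names and the patch file argument are illustrative.

# Succeed if directory $1 has at least $2 GiB available.
has_free_space() {
    dir=$1
    min_gb=$2
    # df -Pk prints one line per filesystem; column 4 is available space in KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Fetch, patch, and build the kernel packages in the current directory.
build_patched_kernel() {
    patch_file=$1   # absolute path to the patch saved from the bug report
    has_free_space . 24 || { echo "Need at least 24 GB free here" >&2; return 1; }
    apt-get source linux-image-5.10.0-10-powerpc64le || return 1
    cd linux-5.10.* || return 1
    patch -p1 < "$patch_file" || return 1
    # Assumption: a .config already exists in the source tree.
    make bindeb-pkg -j"$(nproc)" || return 1
    cd ..
}

# Intended use (the path is a placeholder), followed by the install step as root:
#   build_patched_kernel /path/to/gart_page_size.patch
#   dpkg -i linux-image-5.10.*_5.10.*-1_ppc64el.deb
```

Using nproc for -j builds with every available CPU; pass a smaller number if the machine needs to stay responsive during the build.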