Difference between revisions of "Checkstop"
JeremyRand (talk | contribs) (→Diagnosing a Checkstop: Client console) |
JeremyRand (talk | contribs) (Known Checkstop Issues) |
Revision as of 13:41, 3 March 2023
Diagnosing a Checkstop
There are a few ways to obtain logs of a checkstop.
nvram
From either the OS or Skiroot, run the following as root (or with sudo) after the machine has force-rebooted following the checkstop, but before it reboots again:
nvram --unzip lnx,oops-log
If you're lucky, it will return a log of the most recent checkstop. If it instead prints "nvram: ERROR: can't decompress text: inflate() returned -3", then the log in NVRAM is corrupted (-3 is zlib's Z_DATA_ERROR, meaning the compressed data is invalid), and you'll need to try a different approach.
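The two outcomes described above can be told apart in a small script. This is a minimal sketch, not an official tool: the function names and the default output file name are made up for illustration, and it assumes that any other failure is also reported with the "nvram: ERROR" prefix seen in the message above.

```shell
#!/bin/sh
# Sketch only: helper names and the output file name are illustrative.

# Classify the captured output of `nvram --unzip lnx,oops-log`.
classify_oops_output() {
    case "$1" in
        "")                     echo empty ;;
        *"inflate() returned"*) echo corrupted ;;
        *"nvram: ERROR"*)       echo error ;;      # assumed prefix for other failures
        *)                      echo ok ;;
    esac
}

# Run nvram (needs root) and keep the log only if it decompressed cleanly.
save_oops_log() {
    outfile=${1:-checkstop-oops.log}
    output=$(nvram --unzip lnx,oops-log 2>&1)
    case "$(classify_oops_output "$output")" in
        ok)
            printf '%s\n' "$output" > "$outfile"
            echo "saved log to $outfile" ;;
        corrupted)
            echo "NVRAM log is corrupted; try opal-prd or the Client Console" >&2
            return 1 ;;
        *)
            echo "nvram did not return a usable log" >&2
            return 1 ;;
    esac
}
```

Source the file and call save_oops_log as root right after the forced reboot; saving the log to a file keeps a copy even if NVRAM is overwritten later.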
opal-prd
Before the checkstop occurs, run the following from the OS (this is for Debian; most other distros package it as well; see your distro's documentation for details):
sudo apt install opal-prd
Once installed, if you're lucky, any subsequent checkstops should show up in journalctl output.
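To avoid reading the whole journal, the output can be narrowed down. The sketch below rests on two assumptions that are not confirmed by this page: that the package runs as a systemd unit named opal-prd, and that relevant entries contain the word "checkstop". The helper name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: keep journal lines that mention a checkstop (case-insensitive).
# The keyword is an assumption about what the log entries contain.
filter_checkstop_lines() {
    grep -i 'checkstop'
}

# Example usage (assumes the systemd unit is called opal-prd):
#   journalctl -u opal-prd --no-pager | filter_checkstop_lines
```

If the filter returns nothing, check the unfiltered journalctl output before concluding that no checkstop was logged.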
Client Console
Stay connected to the BMC Client Console from another machine while the checkstop occurs. During the subsequent forced reboot, Hostboot will print a log of the checkstop to that console.
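Because the log scrolls past during the reboot, it helps to record the console session to a file. The following is a hedged sketch: it assumes the BMC exposes the host console over SSH on port 2200 (the usual OpenBMC default) and that you log in as root; check the BMC Client Console page for the connection method that applies to your machine. The function names and the file name pattern are illustrative.

```shell
#!/bin/sh
# Sketch only: function names and file name pattern are illustrative.

# Build a timestamped log file name so repeated captures don't overwrite each other.
console_log_name() {
    printf 'checkstop-console-%s.log\n' "$(date +%Y%m%d-%H%M%S)"
}

# Attach to the host console through the BMC and keep a copy of everything printed.
# Assumptions: SSH port 2200 and the root user; $1 is your BMC's hostname or IP.
capture_console() {
    bmc_host=$1
    ssh -p 2200 "root@${bmc_host}" | tee "$(console_log_name)"
}
```

Run capture_console with your BMC's address on the second machine and leave it running; after the checkstop, the Hostboot output will be in the newly created log file.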
Known Checkstop Issues
(NCUFIR[11]) NCU no response to snooped TLBIE
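This is a firmware bug that IBM already fixed in PNOR v2.18 (https://delivery04.dhe.ibm.com/sar/CMA/SFA/09zs6/0/9006-12p-22p-OpenPowerReadme.op920.41.xhtml#__RefHeading___Toc5321_1053759979). Unfortunately, Raptor has not yet merged that fix.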