Troubleshooting/Guard Partition
Revision as of 09:39, 3 June 2022
If some components (e.g. a CPU or some cores on a CPU) are not being detected, they may have been guarded out. This is a mechanism used to allow POWER systems to function when broken components are detected, but if a component is incorrectly detected as broken (or if it really is broken but is later fixed), it can prevent the component from working until the spurious guard entry is manually cleared.
To clear the guard partition (and thereby force the system to try those components again on next boot), issue the following command from the BMC shell:

    pflash -P GUARD -c
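Because clearing discards whatever guard records were stored, it can be useful to save a copy of the partition first so the entries can be examined later. The following is a minimal sketch of that idea, not an official procedure. It assumes that the pflash build on the BMC supports reading a partition to a file with -r. The PFLASH and DRY_RUN variables, the run and clear_guard helper names, and the backup path are illustrative assumptions and are not part of pflash.

```shell
#!/bin/sh
# Sketch: back up the GUARD partition, then clear it. Intended for the BMC shell.
# PFLASH, DRY_RUN, run, clear_guard and the backup path are hypothetical helpers.

PFLASH="${PFLASH:-pflash}"

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

clear_guard() {
    backup="${1:-/tmp/guard-backup.bin}"
    # Save the current partition contents; stop if the read fails.
    run "$PFLASH" -P GUARD -r "$backup" || return 1
    # Clear the partition so guarded components are retried on the next boot.
    run "$PFLASH" -P GUARD -c
}
```

Setting DRY_RUN=1 before calling clear_guard prints the two pflash commands without touching the flash, which is a cheap way to check the invocation first. Depending on the pflash version, the clear step may ask for confirmation before it modifies the flash.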
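Note: CPUs being guarded out might not be a rare occurrence. It has been reported, for example, at https://www.talospace.com/2020/05/the-case-of-disappearing-core.html and http://tenfourfox.blogspot.com/2018/05/a-semi-review-of-raptor-talos-ii.html. This could also mean that the guard mechanism is "dialed in" to be very conservative. More insight into the mechanics would be a welcome addition to this wiki.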