Debricking the BMC/Watchdog
Revision as of 13:14, 2 March 2019
Once standby power has been applied to the Talos II (i.e. you have plugged it in), the FPGA waits for the power to stabilize, and then signals the BMC to reset.
At this point, a watchdog counter in the FPGA begins running. This works around a hardware issue in the AST2500, where the chip very occasionally gets stuck when attempting to start.[1] If the BMC boot gets stuck in the U-Boot phase, the FPGA forces a reset after approximately 40 seconds. This interferes with BMC recovery, because it leaves you only a limited window in which to run commands.
U-Boot on the Talos II has been slightly modified to disable this watchdog counter immediately before transferring control to the OpenBMC kernel.[2]
Since we will want to spend more than 40 seconds at the U-Boot shell during recovery, we need to disable this watchdog ourselves.
However, general GPIO support is not built into the Talos II U-Boot loader, so the hardware must be manipulated directly.
Using an ARM cross-compiler, we can build a tiny program that does the same thing as the modified U-Boot code in common/bootm.c.
Payload creation
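/* watchdog.c - minimal code to disable the FPGA watchdog.
 * Derived from Raptor Engineering changes to U-Boot common/bootm.c.
 * SPDX-License-Identifier: GPL-2.0+
 */
#include <stdint.h>

int main(void) {
    /* Memory-mapped GPIO registers; volatile so that every access is
     * performed exactly as written. */
    volatile uint32_t *gpio_ctl_reg = (volatile uint32_t *)0x1e780084;
    volatile uint32_t *gpio_data_reg = (volatile uint32_t *)0x1e780080;

    *gpio_ctl_reg |= 0x00800000u;   /* configure the pin as an output */
    *gpio_data_reg &= ~0x00800000u; /* drive the pin low */

    return 0; /* return cleanly so control goes back to U-Boot */
}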
Compiling this to an object and then converting it to an s-record will give you a file that can be directly loaded into U-Boot without using special tools.
$ arm-linux-gnueabihf-gcc -ffreestanding -march=armv6 -mfloat-abi=soft -marm -Os -c watchdog.c
$ arm-linux-gnueabihf-objcopy -O srec watchdog.o watchdog.srec
$ cat watchdog.srec
S00E0000746F67676C652E7372656394
S11300001C309FE50000A0E3842093E5022582E3F1
S1130010842083E5802093E50225C2E3802083E5E4
S10B00201EFF2FE10000781E11
S9030000FC
Due to compiler differences, your output may differ slightly from the example s-record above. The example was compiled with arm-linux-gnueabihf-gcc (Debian 6.3.0-18) 6.3.0 20170516.
- Copy the srec data to the clipboard, as it will shortly need to be sent to the BMC within a limited time window.
Main Procedure
To load and execute this code, do the following at the ast# shell within the watchdog time window:
ast# loads 83000000
- Paste the contents of the srec file into the terminal when prompted. Some summary data will be shown on screen.
(todo: grab and stick the example output in here.)
Run the code using the go command.
ast# go 83000000
(todo: stick the output in here)
The program will run and then return control to U-Boot. At this point, the watchdog has been disabled, and you can take your time with the rest of the recovery commands. The loaded program code is no longer needed and the memory range can be reused.
Notes
- For reference, the GPIO pin associated with this watchdog is GPIOS7 (GPIO 151, physical pin AA20 on the AST2500). The signal is labelled SEQ_CONT on the schematics.