https://wiki.raptorcs.com/w/api.php?action=feedcontributions&user=Bdragon&feedformat=atomRCS Wiki - User contributions [en]2024-03-29T02:28:46ZUser contributionsMediaWiki 1.33.1https://wiki.raptorcs.com/w/index.php?title=User:Bdragon/BMCNotes&diff=3403User:Bdragon/BMCNotes2020-12-01T23:34:06Z<p>Bdragon: Working on improvised BMC connection documentation.</p>
<hr />
<div>= How to connect to the BMC serial interface in a pinch =<br />
<br />
== Parts needed: ==<br />
* A null modem cable (9-pin DE9)<br />
* A second machine with a serial port or USB to serial adapter (RS232 style, NOT a TTL UART dongle)<br />
* 3 jumper wires with female 2.54mm sockets on one end (Sold variously as "breadboard" or "prototyping" wires)<br />
<br />
== Procedure: ==<br />
<br />
[[File:Jumper wire preparation.jpg|thumb|Jumper wire preparation]]<br />
<br />
# Trim the socketless end of the jumper wires and strip the insulation back by 1cm (TODO verify) or so.<br />
# For each wire, fold the end of the stripped wire over itself slightly as depicted. This will be used as a makeshift spring contact.<br />
# Insert the prepared bare ends of the jumpers into pins 2 (RxD), 3 (TxD), and 5 (GND) of one end of the null modem cable. (The pin numbers are stamped next to each female socket on the plastic shell. Use a magnifying glass and/or bright light to identify.)<br />
# Connect the female ends of the jumpers to the BMC serial socket. (TODO verify table)<br />
{| class="wikitable"<br />
|-<br />
! Board side !! Cable side<br />
|-<br />
| Pin 3 || Pin 2<br />
|-<br />
| Pin 5 || Pin 3<br />
|-<br />
| Pin 9 || Pin 5<br />
|}<br />
<br />
Connect the other end of the null modem cable to your second machine and set your terminal software for 115200 8N1, no flow control.<br />
[[File:Improvised BMC connection.jpg|thumb|Completed improvised BMC connection]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=File:Improvised_BMC_connection.jpg&diff=3402File:Improvised BMC connection.jpg2020-12-01T23:32:10Z<p>Bdragon: </p>
<hr />
<div>Completed setup for improvised BMC connection.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=File:Jumper_wire_preparation.jpg&diff=3401File:Jumper wire preparation.jpg2020-12-01T23:02:49Z<p>Bdragon: </p>
<hr />
<div>Photo of stripped jumper wire with contact fold completed.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC&diff=3081Debricking the BMC2020-03-07T03:39:24Z<p>Bdragon: quick fix for the socflash instructions for Blackbird.</p>
<hr />
<div>==Purpose==<br />
This guide explains how to debrick the BMC when the BMC has been rendered inoperable, for example due to a defective firmware update.<br />
<br />
==Applicability==<br />
All RCS [[OpenPOWER]] systems.<br />
<br />
==Overview==<br />
There are three means of debricking the BMC:<br />
<br />
* Remove the BMC SPI flash chip and reflash it with a flash programmer<br />
** Note: [https://www.flashrom.org/Flashrom flashrom] versions earlier than 1.1 do not support the BMC flash chip<br />
* Flash new BMC firmware via U-Boot TFTP (requires that U-Boot is still intact on the flash)<br />
* Flash new BMC firmware via serial port (requires proprietary BMC chip vendor tool)<br />
<br />
==Reset persistent storage==<br />
This applies when the persistent storage (SSH keys, passwords, IPMI error logs, etc.) has been corrupted but the read-only data (U-Boot, kernel, initramfs) is intact. It is also the easiest and least invasive recovery method if you have forgotten the BMC password.<br />
<br />
From the U-Boot prompt on the BMC serial console, run the following (quickly, to avoid a watchdog timeout):<br />
<br />
<code>printenv</code><br />
<br />
Look at the <code>bootargs</code> variable in the output, then set the same variable again with <code>overlay-filesystem-in-ram</code> inserted before the <code>rw</code> keyword.<br />
<br />
Example for Blackbird HW version 1.01:<br />
<br />
<code>setenv bootargs console=ttyS4,115200n8 root=/dev/ram overlay-filesystem-in-ram rw</code><br />
<br />
Then run <code>boot</code> to continue the boot process.<br />
<br />
This will start the BMC with default settings, but the existing persistent data has not yet been cleared. To clear it, log in as root, then run:<br />
<br />
<code>flash_eraseall /dev/mtd/rwfs</code><br />
<br />
<code>reboot</code><br />
<br />
==Flash new BMC firmware via U-Boot TFTP==<br />
'''Note:''' While these instructions have been successfully applied in practice, they are still preliminary. Ask questions in IRC if you are unclear on what to do!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. <br />
IRC user dragon_pilot was successfully able to recover a nonworking BMC from u-boot, these instructions are the result of that experiment.<br />
Further testing and refinement would be appreciated, preferably by someone who has easy access to an external flasher.<br />
--><br />
<br />
If a BMC update fails but U-Boot still works, you can recover by using U-Boot to bootstrap the BMC manually, loading a boot image over the network or the BMC serial port.<br />
<br />
If your BMC flash is corrupted to the extent that U-Boot does not load properly, these instructions will not work; you will need to remove and reflash the BMC flash chip externally, or flash new firmware [[#Flash new BMC firmware via serial port|via serial port]].<br />
<br />
* Prepare a TFTP server, and place <code>image-bmc</code>, <code>image-rofs</code>, and <code>image-kernel</code> in the root.<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required). The serial port configuration is <tt>115200,8n1</tt>.<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* If you are having trouble with U-Boot resetting while you are trying to run these steps, have a slow network, or you are going to be loading over serial, you can [[Debricking the BMC/Watchdog|disable the FPGA watchdog]].<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set and the default <code>0penBmc</code>; it may be either one, depending on the state of the rwfs partition. If the BMC boots up properly instead of dropping you into the emergency shell, the problem is probably in your kernel partition and you can retry flashing your <code>image-kernel</code> using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)<br />
* <code>mount -t tmpfs none /tmp</code><br />
* Run <code>udhcpc</code> to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)<br />
* <code>cd /tmp</code><br />
* <code>tftp -g -r image-rofs x.x.x.x</code><br />
* <code>tftp -g -r image-kernel x.x.x.x</code><br />
* IMPORTANT: Use <code>md5sum</code>, <code>sha1sum</code>, or <code>sha256sum</code> to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!<br />
* Verify that the output of <code>cat /sys/class/mtd/mtd3/name</code> is <code>kernel</code> and the output of <code>cat /sys/class/mtd/mtd4/name</code> is <code>rofs</code>. We will be flashing mtd partitions directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.<br />
* <code>flashcp -v image-kernel /dev/mtd3</code><br />
* <code>flashcp -v image-rofs /dev/mtd4</code><br />
* (TODO: Describe how to reset rwfs in case it was damaged as well?) Note: the kernel parameter for bypassing rwfs is <code>overlay-filesystem-in-ram</code>. Append it to the existing <code>bootargs</code> before running the <code>bootm</code> command. This can also be used as part of a password reset procedure.<br />
* After the flash is complete, you can restart the BMC and it should boot successfully.<br />
<br />
* (TODO: Discussion of using Kermit to upload the image without network access) note: I (Bdragon) have successfully done a ram-only boot using cu's built in xmodem support (escape sequence ~X) to do an image transfer into RAM over the BMC serial interface.<br />
* (TODO: Discuss using u-boot's built in cmp tool to perform basic validation of the u-boot image against a second copy loaded into RAM.)<br />
* (TODO: Load recovery images over USB?) note: The onboard USB port is connected to the USB switch after all, so this might be problematic.<br />
* (TODO: Discussion of u-boot memory map) Short version: flash lives at 0x20000000 and the default base address for the memory loading tools is 0x83000000. So add 0x63000000 to any flash address to get the equivalent address for an image-bmc file loaded into RAM. For example, the bootable image of a loaded image-bmc is at 0x83080000.<br />
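<br />
The address arithmetic in the memory map note can be sketched in C. This is only an illustration: <code>flash_to_ram</code> and the two macro names are made up for this example, and the base addresses are the ones quoted in the note, not values read from the hardware.<br />

```c
#include <assert.h>
#include <stdint.h>

/* Base addresses as quoted in the note above (assumed, not probed). */
#define FLASH_BASE    0x20000000u /* where the BMC flash is mapped */
#define RAM_LOAD_BASE 0x83000000u /* default load address of the U-Boot loaders */

/* Hypothetical helper: translate an address inside the flash window to the
 * matching address in a copy of image-bmc loaded at RAM_LOAD_BASE. */
uint32_t flash_to_ram(uint32_t flash_addr)
{
    assert(flash_addr >= FLASH_BASE);
    /* Same as adding 0x63000000, written so the origin of the offset is visible. */
    return flash_addr - FLASH_BASE + RAM_LOAD_BASE;
}
```

With these values, flash address 0x20080000 maps to 0x83080000, which is the address passed to <code>bootm</code> earlier in the procedure.<br />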
<br />
==Flash new BMC firmware via serial port==<br />
<br />
''This method was discovered by Centurion Dan as an alternative to pulling and reflashing the BMC SPI chip after a failed update had corrupted/wiped U-Boot.''<br />
<br />
Tools required:<br />
<br />
* [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]]<br />
* An x86 computer with a serial port (a USB-to-serial adapter works fine), preferably running Linux.<br />
<br />
Software:<br />
* Proprietary SOC Flash Utility from [https://www.aspeedtech.com/support.php Aspeed Technology's Support Page]: at least version [http://upload.aspeedtech.com/SOC/v11800.zip 1.18.00]. Since version 1.20.x, Aspeed requires you to be registered as a developer to download this utility:<br />
<br />
ASPEED SOC Flash Utility --- The utility has been moved to Document Download Page for ASPEED registered developers to access.<br />
<br />
* BMC Firmware bundle: [[Talos_II/Firmware|Firmware]] BMC [[:File:Talos_ii_openbmc_v1.07_bundle.tar.bz2| System Package 1.06 2a92dec044239591244b6ed69c3fac162a6b9ea4]]<br />
<br />
Procedure:<br />
<br />
# Unzip the SOC Flash Utility archive on your other computer, then unzip the bundle inside it that matches that computer.<br />
# Extract the BMC firmware bundle.<br />
# Run the following command '''./socflash -s option=u comport="4" cs=0 if=image-u-boot gpio_b=S71 gpio_a=S70 option=f'''<br />
#* You can drop '''option=f''' for a slower but verified write process.<br />
#* If your serial interface can handle a baud rate of 921600, add the parameter '''baudrate=921600'''.<br />
#* If you want to see what is going on, you can strace it by prepending '''strace -e trace=open,close,read,write''' to the command above.<br />
#* NOTE: If you are using updated firmware (Talos II/Lite 2.0 beta firmware or later) or a Blackbird, U-Boot will shut down access to this interface after about 3 seconds of standby power, so you will need to run the command '''immediately''' after plugging in the power supply.<br />
# Be Patient: it took me about 45 minutes to complete the flash process.<br />
<br />
Notes:<br />
* ''gpio_b=S71'' and ''gpio_a=S70'' turn off the FPGA watchdog timer before the flash process and re-enable it after the process has completed.<br />
* On a Blackbird, replace ''gpio_b=S71'' with ''gpio_b=G01'' and ''gpio_a=S70'' with ''gpio_a=G00''. Due to the new HDMI interface, the BMC watchdog GPIO was moved to a different pin on the AST2500.<br />
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC/Watchdog&diff=3080Debricking the BMC/Watchdog2020-03-07T03:32:54Z<p>Bdragon: Updating GPIO pin in notes for Blackbird. I am too lazy to convert it into an ordinal GPIO number currently.</p>
<hr />
<div>Once standby power has been applied to the Talos II (i.e. you have plugged it in), the FPGA waits for the power to stabilize, and then signals the BMC to reset.<br />
<br />
At this point, a watchdog counter in the FPGA begins running. This is done because of a hardware issue in the ast2500, where very occasionally the chip would get stuck when attempting to start.[https://git.raptorcs.com/git/talos-system-fpga/commit/main.v?id=e90ca898402a250e9d2f6e303e25ddaceb0cf8d6] If the BMC boot gets stuck in the U-Boot phase, the FPGA will force a reset after approximately 40 seconds. This interferes with BMC recovery, as that leaves you with a limited window to run commands.<br />
<br />
U-Boot on the Talos II has been slightly modified to disable this watchdog counter immediately before transferring control to the OpenBMC kernel.[https://git.raptorcs.com/git/talos-obmc-uboot/tree/common/bootm.c?id=cfee45a3ef2a592d130573ce3b7d8bcfe056060b#n707]<br />
<br />
Since we will want to spend more than 40 seconds at the U-Boot shell during recovery, we need to disable this watchdog ourselves.<br />
<br />
However, general GPIO support is not built into the Talos II U-Boot loader, so the hardware must be manipulated directly.<br />
<br />
Using an ARM cross-compiler, we can build a tiny program to do the same thing as the bootm.c code.<br />
<br />
== Payload creation ==<br />
<br />
=== For Talos II: ===<br />
<br />
<source lang="c"><br />
/* watchdog.c - minimal code to disable the FPGA watchdog.<br />
* Derived from Raptor Engineering changes to U-Boot common/bootm.c.<br />
* SPDX-License-Identifier: GPL-2.0+<br />
*/<br />
#include <stdint.h><br />
int main() {<br />
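/* Memory-mapped GPIO registers; volatile so the compiler keeps every access. */<br />
volatile uint32_t *gpio_ctl_reg = (volatile uint32_t *)0x1e780084;<br />
volatile uint32_t *gpio_data_reg = (volatile uint32_t *)0x1e780080;<br />
<br />
*gpio_ctl_reg |= 0x00800000; /* configure the watchdog GPIO as an output */<br />
*gpio_data_reg &= ~0x00800000; /* drive it low */<br />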
return 0;<br />
}<br />
</source><br />
<br />
=== For Blackbird: ===<br />
<br />
<source lang="c"><br />
/* watchdog.c - minimal code to disable the FPGA watchdog.<br />
* Derived from Raptor Engineering changes to U-Boot common/bootm.c.<br />
* SPDX-License-Identifier: GPL-2.0+<br />
*/<br />
#include <stdint.h><br />
int main() {<br />
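/* Memory-mapped GPIO registers; volatile so the compiler keeps every access. */<br />
volatile uint32_t *gpio_ctl_reg = (volatile uint32_t *)0x1e780024;<br />
volatile uint32_t *gpio_data_reg = (volatile uint32_t *)0x1e780020;<br />
<br />
*gpio_ctl_reg |= 0x00010000; /* configure the watchdog GPIO as an output */<br />
*gpio_data_reg &= ~0x00010000; /* drive it low */<br />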
return 0;<br />
}<br />
</source><br />
<br />
Compiling this to an object and then converting it to an s-record will give you a file that can be directly loaded into U-Boot without using special tools.<br />
<br />
<code>$ arm-linux-gnueabihf-gcc -ffreestanding -march=armv6 -mfloat-abi=soft -marm -Os -c watchdog.c</code><br />
<br />
<code>$ arm-linux-gnueabihf-objcopy -O srec watchdog.o watchdog.srec</code><br />
<br />
<code>$ cat watchdog.srec</code><br />
<br />
=== For Talos II: ===<br />
<pre>S01000007761746368646F672E73726563C3<br />
S11300001C309FE50000A0E3842093E5022582E3F1<br />
S1130010842083E5802093E50225C2E3802083E5E4<br />
S10B00201EFF2FE10000781E11<br />
S9030000FC</pre><br />
<br />
=== For Blackbird: ===<br />
<pre>S01000007761746368646F672E73726563C3<br />
S11300001C309FE50000A0E3242093E5012882E34F<br />
S1130010242083E5202093E50128C2E3202083E502<br />
S10B00201EFF2FE10000781E11<br />
S9030000FC</pre><br />
<br />
Due to compiler differences, your output may be slightly different than the provided example s-record code. The example was compiled with <code>arm-linux-gnueabihf-gcc (Debian 8.3.0-2) 8.3.0</code>.<br />
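<br />
Because the s-record text is pasted by hand, it can be worth checking each line before sending it. The following is a minimal sketch, assuming the standard Motorola S-record layout (the byte count, address, data and checksum bytes of a valid record sum to 0xFF in the low byte). The function names are invented for this example and are not part of any tool mentioned here.<br />

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (char)toupper((unsigned char)c);
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Return 1 if `line` (no trailing newline) is a well-formed S-record whose
 * checksum matches, else 0. */
int srec_line_ok(const char *line)
{
    size_t len = strlen(line);
    unsigned sum = 0, count = 0;

    /* Shortest record: "S", type digit, count, 2 address bytes, checksum. */
    if (len < 10 || line[0] != 'S' || (len % 2) != 0)
        return 0;

    /* Sum every byte after the two-character record type. */
    for (size_t i = 2; i < len; i += 2) {
        int hi = hex_val(line[i]);
        int lo = hex_val(line[i + 1]);
        if (hi < 0 || lo < 0)
            return 0;
        unsigned byte = (unsigned)(hi * 16 + lo);
        if (i == 2)
            count = byte; /* first byte is the record's byte count */
        sum += byte;
    }

    /* The count covers every byte that follows the count byte itself. */
    if (count != (len - 4) / 2)
        return 0;

    /* Count, address, data and checksum must sum to 0xFF in the low byte. */
    return (sum & 0xFFu) == 0xFFu;
}
```

Calling <code>srec_line_ok</code> on each line of the examples above should return 1; a line damaged in transit would normally return 0.<br />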
<br />
* Copy the srec data to the clipboard, as we will need to send it to the BMC within a limited time window in a bit.<br />
<br />
== Main Procedure ==<br />
<br />
To load and execute this code, do the following at the <code>ast#</code> shell within the watchdog time window:<br />
<br />
* <code>ast# loads 83000000</code><br />
<br />
* Paste the contents of the srec file into the terminal when prompted. Some summary data will be shown on screen.<br />
<br />
(todo: grab and stick the example output in here.)<br />
<br />
Run the code using the <code>go</code> command.<br />
<br />
* <code>ast# go 83000000</code><br />
<br />
(todo: stick the output in here)<br />
<br />
The program will run and then return control to U-Boot. At this point, the watchdog has been disabled, and you can take your time with the rest of the recovery commands. The loaded program code is no longer needed and the memory range can be reused.<br />
<br />
== Notes ==<br />
<br />
* For reference, the GPIO pin associated with this watchdog on Talos II / Lite is GPIOS7 (GPIO 151, physical pin AA20 on the ast2500). The signal is labelled SEQ_CONT on the schematics.<br />
* On Blackbird, this pin is instead GPIOG0 (GPIO ???, physical pin A19 on the ast2500). The signal is labelled BMC_BOOT_PHASE on the schematics.<br />
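<br />
The relationship between a pin name, its ordinal GPIO number and the bit mask used in the payload can be sketched as follows. This assumes eight pins per lettered bank, banks numbered from A = 0, and four banks packed into each 32-bit register. That assumption matches GPIOS7 = 151 and both masks used in the payload code, but it has not been checked against the ast2500 datasheet. The function names are invented for this example.<br />

```c
#include <assert.h>
#include <stdint.h>

/* Ordinal GPIO number for a single-letter bank ('A'..'Z') and pin 0..7,
 * assuming eight pins per bank and banks counted from 'A' = 0. */
unsigned gpio_ordinal(char bank, unsigned pin)
{
    assert(bank >= 'A' && bank <= 'Z' && pin < 8);
    return (unsigned)(bank - 'A') * 8u + pin;
}

/* Bit mask of that pin inside its 32-bit register, assuming four banks
 * share one register (A..D, E..H, ..., Q..T). */
uint32_t gpio_mask(char bank, unsigned pin)
{
    assert(bank >= 'A' && bank <= 'Z' && pin < 8);
    return (uint32_t)1 << (((unsigned)(bank - 'A') % 4u) * 8u + pin);
}
```

For GPIOS7 this gives ordinal 151 and mask 0x00800000, the values used on this page. The same functions can be applied to GPIOG0, but treat the result as unverified until it is checked on hardware.<br />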
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC/Watchdog&diff=3073Debricking the BMC/Watchdog2020-03-01T19:42:35Z<p>Bdragon: Add blackbird program and recompile under Debian Buster.</p>
<hr />
<div>Once standby power has been applied to the Talos II (i.e. you have plugged it in), the FPGA waits for the power to stabilize, and then signals the BMC to reset.<br />
<br />
At this point, a watchdog counter in the FPGA begins running. This is done because of a hardware issue in the ast2500, where very occasionally the chip would get stuck when attempting to start.[https://git.raptorcs.com/git/talos-system-fpga/commit/main.v?id=e90ca898402a250e9d2f6e303e25ddaceb0cf8d6] If the BMC boot gets stuck in the U-Boot phase, the FPGA will force a reset after approximately 40 seconds. This interferes with BMC recovery, as that leaves you with a limited window to run commands.<br />
<br />
U-Boot on the Talos II has been slightly modified to disable this watchdog counter immediately before transferring control to the OpenBMC kernel.[https://git.raptorcs.com/git/talos-obmc-uboot/tree/common/bootm.c?id=cfee45a3ef2a592d130573ce3b7d8bcfe056060b#n707]<br />
<br />
Since we will want to spend more than 40 seconds at the U-Boot shell during recovery, we need to disable this watchdog ourselves.<br />
<br />
However, general GPIO support is not built into the Talos II U-Boot loader, so the hardware must be manipulated directly.<br />
<br />
Using an ARM cross-compiler, we can build a tiny program to do the same thing as the bootm.c code.<br />
<br />
== Payload creation ==<br />
<br />
=== For Talos II: ===<br />
<br />
<source lang="c"><br />
/* watchdog.c - minimal code to disable the FPGA watchdog.<br />
* Derived from Raptor Engineering changes to U-Boot common/bootm.c.<br />
* SPDX-License-Identifier: GPL-2.0+<br />
*/<br />
#include <stdint.h><br />
int main() {<br />
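/* Memory-mapped GPIO registers; volatile so the compiler keeps every access. */<br />
volatile uint32_t *gpio_ctl_reg = (volatile uint32_t *)0x1e780084;<br />
volatile uint32_t *gpio_data_reg = (volatile uint32_t *)0x1e780080;<br />
<br />
*gpio_ctl_reg |= 0x00800000; /* configure the watchdog GPIO as an output */<br />
*gpio_data_reg &= ~0x00800000; /* drive it low */<br />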
return 0;<br />
}<br />
</source><br />
<br />
=== For Blackbird: ===<br />
<br />
<source lang="c"><br />
/* watchdog.c - minimal code to disable the FPGA watchdog.<br />
* Derived from Raptor Engineering changes to U-Boot common/bootm.c.<br />
* SPDX-License-Identifier: GPL-2.0+<br />
*/<br />
#include <stdint.h><br />
int main() {<br />
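/* Memory-mapped GPIO registers; volatile so the compiler keeps every access. */<br />
volatile uint32_t *gpio_ctl_reg = (volatile uint32_t *)0x1e780024;<br />
volatile uint32_t *gpio_data_reg = (volatile uint32_t *)0x1e780020;<br />
<br />
*gpio_ctl_reg |= 0x00010000; /* configure the watchdog GPIO as an output */<br />
*gpio_data_reg &= ~0x00010000; /* drive it low */<br />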
return 0;<br />
}<br />
</source><br />
<br />
Compiling this to an object and then converting it to an s-record will give you a file that can be directly loaded into U-Boot without using special tools.<br />
<br />
<code>$ arm-linux-gnueabihf-gcc -ffreestanding -march=armv6 -mfloat-abi=soft -marm -Os -c watchdog.c</code><br />
<br />
<code>$ arm-linux-gnueabihf-objcopy -O srec watchdog.o watchdog.srec</code><br />
<br />
<code>$ cat watchdog.srec</code><br />
<br />
=== For Talos II: ===<br />
<pre>S01000007761746368646F672E73726563C3<br />
S11300001C309FE50000A0E3842093E5022582E3F1<br />
S1130010842083E5802093E50225C2E3802083E5E4<br />
S10B00201EFF2FE10000781E11<br />
S9030000FC</pre><br />
<br />
=== For Blackbird: ===<br />
<pre>S01000007761746368646F672E73726563C3<br />
S11300001C309FE50000A0E3242093E5012882E34F<br />
S1130010242083E5202093E50128C2E3202083E502<br />
S10B00201EFF2FE10000781E11<br />
S9030000FC</pre><br />
<br />
Due to compiler differences, your output may be slightly different than the provided example s-record code. The example was compiled with <code>arm-linux-gnueabihf-gcc (Debian 8.3.0-2) 8.3.0</code>.<br />
<br />
* Copy the srec data to the clipboard, as we will need to send it to the BMC within a limited time window in a bit.<br />
<br />
== Main Procedure ==<br />
<br />
To load and execute this code, do the following at the <code>ast#</code> shell within the watchdog time window:<br />
<br />
* <code>ast# loads 83000000</code><br />
<br />
* Paste the contents of the srec file into the terminal when prompted. Some summary data will be shown on screen.<br />
<br />
(todo: grab and stick the example output in here.)<br />
<br />
Run the code using the <code>go</code> command.<br />
<br />
* <code>ast# go 83000000</code><br />
<br />
(todo: stick the output in here)<br />
<br />
The program will run and then return control to U-Boot. At this point, the watchdog has been disabled, and you can take your time with the rest of the recovery commands. The loaded program code is no longer needed and the memory range can be reused.<br />
<br />
== Notes ==<br />
<br />
* For reference, the GPIO pin associated with this watchdog is GPIOS7 (GPIO 151, physical pin AA20 on the ast2500). The signal is labelled SEQ_CONT if you are looking at it on the schematics.<br />
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Desktop_Roadmap&diff=2506Desktop Roadmap2019-06-02T05:23:53Z<p>Bdragon: making a note regarding BMC password change</p>
<hr />
<div>This page is currently a very hasty list of the roadmap needed to make the Talos an "everyday common user's" machine.<br />
<br />
For convenience, unfinished tasks have been grouped into three categories: "Urgently Needed", "Somewhat Needed", and "Would Be Nice" in descending order of importance. <br />
<br />
="Urgently Needed"=<br />
* <del>"Safe By Default" Randomly generated BMC Passphrase with password written down on a sheet of cardboard in the package. </del><br />
* <del>''Rationale:'' even some of our users have had trouble with this. The default insecure password with the BMC could result in an instant compromise of the machine and require full flashing of all persistent firmware components in the event the computer is accidentally plugged into the network and the power at the same time. This completely innocent mistake could be fatal and recovering from it difficult. The threat model of a randomly determined BMC Passphrase would be if the user accidentally plugs the computer into the untrusted internet against a passive adversary that will simply try the default passwords, similar to how the Mirai Botnet operated.</del> <br />
** It appears that as of the Blackbird launch, each board gets an individual randomized BMC password! It's printed on a slip of paper in the box!<br />
* "[[Talos II Beginner's Quick Start Guide]]" in Talos User's Manual<br />
''Rationale:'' nontechnical users may have difficulty with the complicated procedure to remotely access and set the BMC password from a trustworthy system.<br />
* "Hole Pattern Template" <br />
''Rationale:'' A reusable cardboard or fold-out paper template in the manual showing which standoffs to install and which to leave out would help builders avoid the "scraped resistor" problem that has plagued a couple of them.<br />
* Firefox Just in Time compiler for Javascript<br />
<br />
="Somewhat Needed"=<br />
* Tor Browser Bundle with safe configuration defaults<br />
<br />
=Would Be Nice=<br />
* "Easy Build" Script for building Unreal Tournament 4 for nontechnical users? <br />
* Android Builder for building smartphone OSes? <br />
* Cryptsetup (dm-crypt) and verity in Petitboot for firmware-based full disk encryption?<br />
* FreeCAD? (May or may not be upstreamed yet?)<br />
* Maybe open up a discussion on the feasibility of allowing the changing of the default BMC password through the petitboot? Is this even possible?<br />
** It might be possible to do it over IPMI from Petitboot or other host OS.<br />
<br />
=Done=<br />
* Chromium With Just In Time JavaScript<br />
* Electron with Just In Time JavaScript <br />
* AMDGPU Kernel DMA Patches (Possibly upstreamed?)<br />
* Firefox Quantum running stably (Not upstreamed yet)<br />
* Office Suite (LibreOffice, TeXStudio)<br />
* Libre Games (SuperTuxKart, Chromium BSU, Super Tux, Tux Racer, Blob Wars, Open Transit Tycoon, Open Roller Coaster Tycoon, etc)<br />
* Unreal Tournament 4 Tested and working and demonstrated. <br />
* OBS (Needs to be upstreamed?)<br />
* Thunderbird Stable (still hasn't made it to some distros yet, stay posted.)</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Desktop_Roadmap&diff=2505Desktop Roadmap2019-06-02T05:16:42Z<p>Bdragon: whoops, edited the wrong bit.</p>
<hr />
<div>This page is currently a very hasty list of the roadmap needed to make the Talos an "everyday common user's" machine.<br />
<br />
For convenience, unfinished tasks have been grouped into three categories: "Urgently Needed", "Somewhat Needed", and "Would Be Nice" in descending order of importance. <br />
<br />
="Urgently Needed"=<br />
* <del>"Safe By Default" Randomly generated BMC Passphrase with password written down on a sheet of cardboard in the package. </del><br />
* <del>''Rationale:'' even some of our users have had trouble with this. The default insecure password with the BMC could result in an instant compromise of the machine and require full flashing of all persistent firmware components in the event the computer is accidentally plugged into the network and the power at the same time. This completely innocent mistake could be fatal and recovering from it difficult. The threat model of a randomly determined BMC Passphrase would be if the user accidentally plugs the computer into the untrusted internet against a passive adversary that will simply try the default passwords, similar to how the Mirai Botnet operated.</del> <br />
** It appears that as of the Blackbird launch, each board gets an individual randomized BMC password! It's printed on a slip of paper in the box!<br />
* "[[Talos II Beginner's Quick Start Guide]]" in Talos User's Manual<br />
''Rationale:'' nontechnical users may have difficulty with the complicated procedure to remotely access and set the BMC password from a trustworthy system.<br />
* "Hole Pattern Template" <br />
''Rationale:'' A reusable cardboard or fold-out paper template in the manual showing which standoffs to install and which to leave out would help builders avoid the "scraped resistor" problem that has plagued a couple of them.<br />
* Firefox Just in Time compiler for Javascript<br />
<br />
="Somewhat Needed"=<br />
* Tor Browser Bundle with safe configuration defaults<br />
<br />
=Would Be Nice=<br />
* "Easy Build" Script for building Unreal Tournament 4 for nontechnical users? <br />
* Android Builder for building smartphone OSes? <br />
* Cryptsetup (dm-crypt) and verity in Petitboot for firmware-based full disk encryption?<br />
* FreeCAD? (May or may not be upstreamed yet?)<br />
* Maybe open up a discussion on the feasibility of allowing the changing of the default BMC password through the petitboot? Is this even possible?<br />
<br />
=Done=<br />
* Chromium With Just In Time JavaScript<br />
* Electron with Just In Time JavaScript <br />
* AMDGPU Kernel DMA Patches (Possibly upstreamed?)<br />
* Firefox Quantum running stably (Not upstreamed yet)<br />
* Office Suite (LibreOffice, TeXStudio)<br />
* Libre Games (SuperTuxKart, Chromium BSU, Super Tux, Tux Racer, Blob Wars, Open Transit Tycoon, Open Roller Coaster Tycoon, etc)<br />
* Unreal Tournament 4 Tested and working and demonstrated. <br />
* OBS (Needs to be upstreamed?)<br />
* Thunderbird Stable (still hasn't made it to some distros yet, stay posted.)</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Desktop_Roadmap&diff=2504Desktop Roadmap2019-06-02T05:14:09Z<p>Bdragon: /* Would Be Nice */ updating for recent change regarding BMC password</p>
<hr />
<div>This page is currently a very hasty list of the roadmap needed to make the Talos an "everyday common user's" machine.<br />
<br />
For convenience, unfinished tasks have been grouped into three categories: "Urgently Needed", "Somewhat Needed", and "Would Be Nice" in descending order of importance. <br />
<br />
="Urgently Needed"=<br />
* "Safe By Default" Randomly generated BMC Passphrase with password written down on a sheet of cardboard in the package. <br />
''Rationale:'' even some of our users have had trouble with this. The default insecure password with the BMC could result in an instant compromise of the machine and require full flashing of all persistent firmware components in the event the computer is accidentally plugged into the network and the power at the same time. This completely innocent mistake could be fatal and recovering from it difficult. The threat model of a randomly determined BMC Passphrase would be if the user accidentally plugs the computer into the untrusted internet against a passive adversary that will simply try the default passwords, similar to how the Mirai Botnet operated. <br />
* "[[Talos II Beginner's Quick Start Guide]]" in Talos User's Manual<br />
''Rationale:'' nontechnical users may have difficulty with the complicated procedure to remotely access and set the BMC password from a trustworthy system.<br />
* "Hole Pattern Template" <br />
''Rationale:'' A reusable cardboard or fold-out paper template in the manual showing which standoffs to install and which to leave out would help builders avoid the "scraped resistor" problem that has plagued a couple of them.<br />
* Firefox Just in Time compiler for Javascript<br />
<br />
="Somewhat Needed"=<br />
* Tor Browser Bundle with safe configuration defaults<br />
<br />
=Would Be Nice=<br />
* "Easy Build" Script for building Unreal Tournament 4 for nontechnical users? <br />
* Android Builder for building smartphone OSes? <br />
* Cryptsetup (dm-crypt) and verity in Petitboot for firmware-based full disk encryption?<br />
* FreeCAD? (May or may not be upstreamed yet?)<br />
* <del>Maybe open up a discussion on the feasibility of allowing the changing of the default BMC password through the petitboot? Is this even possible?</del><br />
** It appears that as of the Blackbird launch, each board gets an individual randomized BMC password! It's printed on a slip of paper in the box!<br />
<br />
=Done=<br />
* Chromium With Just In Time JavaScript<br />
* Electron with Just In Time JavaScript <br />
* AMDGPU Kernel DMA Patches (Possibly upstreamed?)<br />
* Firefox Quantum running stably (Not upstreamed yet)<br />
* Office Suite (LibreOffice, TeXStudio)<br />
* Libre Games (SuperTuxKart, Chromium BSU, Super Tux, Tux Racer, Blob Wars, Open Transit Tycoon, Open Roller Coaster Tycoon, etc)<br />
* Unreal Tournament 4 Tested and working and demonstrated. <br />
* OBS (Needs to be upstreamed?)<br />
* Thunderbird Stable (still hasn't made it to some distros yet, stay posted.)</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Add_GPU_Firmware_To_BOOTKERNFW&diff=2474Add GPU Firmware To BOOTKERNFW2019-05-19T20:05:09Z<p>Bdragon: add a procedure for backing up the BOOTKERNFW.</p>
<hr />
<div><br />
=Purpose=<br />
This guide explains how to add GPU (or other) firmware files to the BOOTKERNFW partition of the boot firmware flash.<br />
This allows Linux kernel drivers which require firmware blobs to function correctly within the Skiroot bootloader environment.<br />
<br />
For example, the <tt>amdgpu</tt> driver requires AMD firmware blobs to bring up AMD GPUs. Copying these files into the firmware partition enables attached displays to be brought up when showing the boot menu.<br />
<br />
=Background=<br />
Many cards require firmware to be loaded at startup time to initialize. On an x86 system, this is usually accomplished by bootstrap code called an "Option ROM", or by extra code in the BIOS.<br />
<br />
On OpenPOWER, this is not a possibility, due to the Option ROM code being platform-specific, as well as the OpenPOWER security stance being that trusting random bootstrap code on cards is a bad idea.<br />
<br />
Therefore, for things like video cards to work, the card's firmware must be loaded before the card can be used. The Linux video drivers in petitboot are capable of doing this loading, but the firmware itself must be available so that the drivers have something to load.<br />
<br />
So, on RCS OpenPOWER systems, there is a special area of flash set aside for user-provided firmware files, called BOOTKERNFW. Due to the limited size of this partition, the user is required to make decisions as to what firmware files to install.<br />
<br />
The exact size of BOOTKERNFW may vary depending on the firmware version you are on, but the standard size currently is 1,966,080 bytes (0x1E0000 hex, or 1.875 MiB).<br />
<br />
=Applicability=<br />
All RCS OpenPOWER systems.<br />
<br />
=Instructions=<br />
<br />
==Step 1. Generate firmware partition image==<br />
Boot into an OS. A Linux environment is assumed. You will need the '''mksquashfs''' tool available.<br />
<br />
For Debian, you can install '''mksquashfs''' using the following command:<br />
 # apt install squashfs-tools<br />
<br />
Create a directory with the required firmware files available:<br />
$ mkdir /tmp/firmware<br />
$ # ... (copy required files into /tmp/firmware) ...<br />
<br />
The directory structure under <tt>/tmp/firmware</tt> gets mounted at <tt>/lib/firmware</tt>. Here is an example of the required directory structure:<br />
/tmp/firmware/radeon/PITCAIRN_pfp.bin<br />
/tmp/firmware/amdgpu/polaris10_mc.bin<br />
<br />
You can obtain these firmware files from most Linux distros (they tend to be installed in <tt>/lib/firmware</tt>), or from the [https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/ linux-firmware repository].<br />
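<br />
As a minimal sketch of the copy step, the helper below copies chosen files from a firmware tree into the staging directory while keeping their relative paths. The function name <tt>stage_firmware</tt> is made up for this example, and it assumes the files exist uncompressed under the source directory.<br />

```shell
# Copy selected firmware files into a staging directory, keeping their
# relative paths so they end up in the right place under /lib/firmware.
# Usage: stage_firmware SRC_ROOT DEST_ROOT REL_PATH...
# (stage_firmware is a hypothetical helper, not part of any package.)
stage_firmware() {
    src_root=$1
    dest_root=$2
    shift 2
    for rel in "$@"; do
        # Create e.g. DEST_ROOT/radeon before copying the file into it.
        mkdir -p "$dest_root/$(dirname "$rel")" || return 1
        cp "$src_root/$rel" "$dest_root/$rel" || return 1
    done
}

# Example call, using the file names from the listing above:
# stage_firmware /lib/firmware /tmp/firmware radeon/PITCAIRN_pfp.bin amdgpu/polaris10_mc.bin
```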
<br />
Having generated the correct directory structure, generate the image:<br />
$ cd /tmp/firmware<br />
$ mksquashfs * /tmp/firmware.bin -all-root -keep-as-directory<br />
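<br />
Because the partition is small, it may be worth checking the image size before flashing. The sketch below compares the file size against a byte limit. The helper name <tt>image_fits</tt> is an assumption of this example, and the default limit is the standard BOOTKERNFW size quoted in the Background section, which may differ on your firmware version.<br />

```shell
# Succeed if IMAGE is no larger than LIMIT bytes.
# Usage: image_fits IMAGE [LIMIT]
# LIMIT defaults to 1966080 (0x1E0000), the standard size mentioned in
# the Background section; pass your own value if your partition differs.
# (image_fits is a hypothetical helper name.)
image_fits() {
    image=$1
    limit=${2:-1966080}
    size=$(wc -c < "$image") || return 1
    [ "$size" -le "$limit" ]
}

# Example:
# image_fits /tmp/firmware.bin && echo "fits" || echo "too large"
```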
<br />
<tt>firmware.bin</tt> is now a SquashFS image which you will copy to a firmware flash partition. You will need to access this image while in the [[Skiroot]] environment, so you may wish to copy it to e.g. <tt>/boot</tt> or another partition which you can easily access from [[Skiroot]].<br />
<br />
==Step 2. Copy firmware partition image to flash==<br />
<br />
=== Method 1: From Petitboot ===<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
Locate the firmware.bin file you prepared earlier. Petitboot mounts all filesystems it can recognize in folders under /var/petitboot/mnt/dev/, named after disk device partitions. In this example, we will assume that the file was at the path <tt>/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin</tt>.<br />
# ls /var/petitboot/mnt/dev/<br />
# ls /var/petitboot/mnt/dev/nvme0n1p2/boot/<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition. If you are on an old PNOR version that does not have pflash available, use one of the other flashing methods below instead.<br />
# pflash -P BOOTKERNFW -e -p /var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 2: From the BMC ===<br />
On your local machine, scp the firmware.bin to /tmp/ on the BMC.<br />
$ scp /tmp/firmware.bin root@bmc-ip-address:/tmp/firmware.bin<br />
<br />
Power off the host. (optional, but recommended)<br />
<br />
Log into the BMC on port 22.<br />
$ ssh root@bmc-ip-address<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition.<br />
# pflash -P BOOTKERNFW -e -p /tmp/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Power the host back up.<br />
<br />
On the host, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 3: Old Petitboot method, DANGEROUS ===<br />
WARNING: This method is error-prone and should only be done from the petitboot shell, NEVER on the BMC.<br />
<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
IMPORTANT: ACCIDENTALLY PERFORMING THESE INSTRUCTIONS ON THE BMC INSTEAD OF THE PETITBOOT CONSOLE MAY DAMAGE YOUR BMC FIRMWARE!<br />
<br />
Make sure you can see the <tt>BOOTKERNFW</tt> partition (should return: <tt>mtd5: 000e0000 00010000 "BOOTKERNFW"</tt>):<br />
# cat /proc/mtd | grep BOOTKERNFW<br />
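<br />
The second column of that line is the partition size in hex. As a rough sketch, the helper below converts it to bytes so it can be compared with the size of firmware.bin. The name <tt>mtd_size_bytes</tt> is invented for this example, and it assumes <tt>awk</tt> is available in your shell environment.<br />

```shell
# Read /proc/mtd-style text on stdin and print the size, in bytes, of
# the partition with the given name.
# Usage: mtd_size_bytes NAME < /proc/mtd
# (mtd_size_bytes is a hypothetical helper name.)
mtd_size_bytes() {
    # Column 4 is the quoted partition name, column 2 the size in hex.
    hex=$(awk -v want="\"$1\"" '$4 == want { print $2 }')
    [ -n "$hex" ] || return 1
    printf '%d\n' "0x$hex"
}

# Example:
# mtd_size_bytes BOOTKERNFW < /proc/mtd
```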
<br />
Find the <tt>flash_erase</tt> tool (mine was in <tt>/var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/</tt>):<br />
# find / -name flash_erase<br />
<br />
Erase <tt>/dev/mtd5</tt>:<br />
# /var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/flash_erase /dev/mtd5 0 0<br />
<br />
Flash <tt>/dev/mtd5</tt>:<br />
# dd if=/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin of=/dev/mtd5 bs=64k<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
==Notes==<br />
If you are still getting an error message about not being able to find VGA BIOS, it may be because your GPU is being initialized before <tt>/lib/firmware</tt> is mounted. <br />
You can check this by running <tt>dmesg</tt> from the [[Skiroot]] shell. For example, my GPU was trying to initialize at around T+6.5 seconds and the <tt>mtd</tt> device wasn't actually mounted until T+7.25.<br />
I was able to load the firmware successfully after <tt>/dev/mtd5</tt> was mounted by running <tt>rmmod amdgpu && modprobe amdgpu</tt> from the shell, although I did experience stability issues in Debian Stretch with <tt>amdgpu</tt>. If I let [[Skiroot]] fail to load the firmware and boot anyway, the <tt>radeon</tt> driver works fine.<br />
<br />
e.g.<br />
<pre><br />
$ dmesg<br />
...<br />
[ 6.412518] [drm] radeon kernel modesetting enabled.<br />
[ 6.412848] pci 0033:00:00.0: enabling device (0105 -> 0107)<br />
[ 6.412867] radeon 0033:01:00.0: enabling device (0140 -> 0142)<br />
[ 6.413077] [drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6819 0x1043:0x0431 0x00).<br />
[ 6.413086] radeon: No suitable DMA available<br />
[ 6.413150] [drm:radeon_device_init] *ERROR* Unable to find PCI I/O BAR<br />
[ 6.533321] [drm:radeon_atombios_init] *ERROR* Unable to find PCI I/O BAR; using MMIO for ATOM IIO<br />
[ 6.533415] ATOM BIOS: 6819.15.17.0.0.AS01<br />
[ 6.533434] [drm] GPU not posted. posting now...<br />
[ 6.541203] radeon 0033:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)<br />
[ 6.541208] radeon 0033:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF<br />
[ 6.541211] [drm] Detected VRAM RAM=2048M, BAR=256M<br />
[ 6.541213] [drm] RAM width 256bits DDR<br />
[ 6.541295] [TTM] Zone kernel: Available graphics memory: 50118240 kiB<br />
[ 6.541297] [TTM] Zone dma32: Available graphics memory: 2097152 kiB<br />
[ 6.541299] [TTM] Initializing pool allocator<br />
[ 6.541371] [drm] radeon: 2048M of VRAM memory ready<br />
[ 6.541374] [drm] radeon: 2048M of GTT memory ready.<br />
[ 6.541390] [drm] Loading pitcairn Microcode<br />
[ 6.541429] radeon 0033:01:00.0: Direct firmware load for radeon/pitcairn_pfp.bin failed with error -2<br />
[ 6.541456] radeon 0033:01:00.0: Direct firmware load for radeon/PITCAIRN_pfp.bin failed with error -2<br />
[ 6.541459] si_cp: Failed to load firmware "radeon/PITCAIRN_pfp.bin"<br />
[ 6.541560] [drm:si_init] *ERROR* Failed to load firmware!<br />
[ 6.541653] radeon 0033:01:00.0: Fatal error during GPU init<br />
[ 6.541772] [drm] radeon: finishing device.<br />
...<br />
[ 7.123614] 6 ofpart partitions found on MTD device flash<br />
[ 7.123617] Creating 6 MTD partitions on "flash":<br />
[ 7.123621] 0x000000000000-0x000004000000 : "PNOR"<br />
[ 7.123773] 0x000001b21000-0x000003a21000 : "BOOTKERNEL"<br />
[ 7.123868] 0x000003b44000-0x000003b68000 : "CAPP"<br />
[ 7.123961] 0x000003b88000-0x000003b89000 : "VERSION"<br />
[ 7.124056] 0x000003b89000-0x000003bc9000 : "IMA_CATALOG"<br />
[ 7.124149] 0x000003f10000-0x000003ff0000 : "BOOTKERNFW"<br />
...<br />
$ rmmod amdgpu && modprobe amdgpu<br />
$ dmesg<br />
...<br />
[ 2727.836343] [drm] amdgpu kernel modesetting enabled.<br />
[ 2727.836900] amdgpu 0033:01:00.0: SI support provided by radeon.<br />
[ 2727.836905] amdgpu 0033:01:00.0: Use radeon.si_support=0 amdgpu.si_support=1 to override.<br />
</pre><br />
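<br />
To pick the relevant lines out of a long log, a filter along these lines may help. It is only a sketch: the function name <tt>firmware_load_failures</tt> is made up here, and the patterns are taken from the messages in the log above, so other drivers may word their errors differently. Error -2 in these messages is ENOENT, meaning the file was not found.<br />

```shell
# Print only the kernel log lines that report a failed firmware load.
# Reads dmesg-style text on stdin; exits non-zero if nothing matches.
# (firmware_load_failures is a hypothetical helper name.)
firmware_load_failures() {
    grep -E 'Direct firmware load for .* failed|Failed to load firmware'
}

# Example:
# dmesg | firmware_load_failures
```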
<br />
==Backing up BOOTKERNFW==<br />
If you are using one of the prebuilt workstations that came with a GPU card preinstalled, you may wish to back up your BOOTKERNFW so that you have a known-good version to go back to. These systems ship with the firmware preloaded into BOOTKERNFW, and removing it may cause the graphical display to stop working until you reload appropriate firmware into BOOTKERNFW.<br />
<br />
This backup can be taken from either the BMC or the Petitboot shell. The procedure is the same on either.<br />
# cd /tmp<br />
# pflash -P BOOTKERNFW -r BOOTKERNFW.bin<br />
<br />
From another machine, copy the backup to a permanent location.<br />
$ scp root@bmc-ip-address:/tmp/BOOTKERNFW.bin .<br />
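<br />
To confirm that the copy matches the original, you can compare checksums on both machines. This is a sketch that assumes <tt>sha256sum</tt> and <tt>awk</tt> are present on both sides; the helper name <tt>file_sha256</tt> is invented for this example.<br />

```shell
# Print the SHA-256 digest of a file, without the file name.
# Usage: file_sha256 FILE
# (file_sha256 is a hypothetical helper name.)
file_sha256() {
    sha256sum "$1" | awk '{ print $1 }'
}

# Run this on the machine that took the backup and on the machine that
# holds the copy, then compare the two digests by eye:
# file_sha256 /tmp/BOOTKERNFW.bin
# file_sha256 ./BOOTKERNFW.bin
```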
<br />
==Erasing BOOTKERNFW==<br />
If you change your mind and decide to stop using BOOTKERNFW, you can erase it from either the BMC or the Petitboot shell. The procedure is the same on either.<br />
# pflash -P BOOTKERNFW -e<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
At this point you are back to an empty BOOTKERNFW partition.<br />
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Add_GPU_Firmware_To_BOOTKERNFW&diff=2473Add GPU Firmware To BOOTKERNFW2019-05-19T19:47:48Z<p>Bdragon: add stats on bootkernfw size</p>
<hr />
<div><br />
=Purpose=<br />
This guide explains how to add GPU (or other) firmware files to the BOOTKERNFW partition of the boot firmware flash.<br />
This allows Linux kernel drivers which require firmware blobs to function correctly within the Skiroot bootloader environment.<br />
<br />
For example, the <tt>amdgpu</tt> driver requires AMD firmware blobs to bring up AMD GPUs. Copying these files into the firmware partition enables attached displays to be brought up when showing the boot menu.<br />
<br />
=Background=<br />
Many cards require firmware to be loaded at startup time to initialize. On an x86 system, this is usually accomplished by bootstrap code called an "Option ROM", or by extra code in the BIOS.<br />
<br />
On OpenPOWER, this is not a possibility, due to the Option ROM code being platform-specific, as well as the OpenPOWER security stance being that trusting random bootstrap code on cards is a bad idea.<br />
<br />
Therefore, for things like video cards to work, the card's firmware must be loaded before the card can be used. The Linux video drivers in petitboot are capable of doing this loading, but the firmware itself must be available so that the drivers have something to load.<br />
<br />
So, on RCS OpenPOWER systems, there is a special area of flash set aside for user-provided firmware files, called BOOTKERNFW. Due to the limited size of this partition, the user is required to make decisions as to what firmware files to install.<br />
<br />
The exact size of BOOTKERNFW may vary depending on the firmware version you are on, but the standard size currently is 1,966,080 bytes (0x1E0000 hex, or 1.875 MiB).<br />
<br />
=Applicability=<br />
All RCS OpenPOWER systems.<br />
<br />
=Instructions=<br />
<br />
==Step 1. Generate firmware partition image==<br />
Boot into an OS. A Linux environment is assumed. You will need the '''mksquashfs''' tool available.<br />
<br />
For Debian, you can install '''mksquashfs''' using the following command:<br />
 # apt install squashfs-tools<br />
<br />
Create a directory with the required firmware files available:<br />
$ mkdir /tmp/firmware<br />
$ # ... (copy required files into /tmp/firmware) ...<br />
<br />
The directory structure under <tt>/tmp/firmware</tt> gets mounted at <tt>/lib/firmware</tt>. Here is an example of the required directory structure:<br />
/tmp/firmware/radeon/PITCAIRN_pfp.bin<br />
/tmp/firmware/amdgpu/polaris10_mc.bin<br />
<br />
You can obtain these firmware files from most Linux distros (they tend to be installed in <tt>/lib/firmware</tt>), or from the [https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/ linux-firmware repository].<br />
<br />
Having generated the correct directory structure, generate the image:<br />
$ cd /tmp/firmware<br />
$ mksquashfs * /tmp/firmware.bin -all-root -keep-as-directory<br />
<br />
<tt>firmware.bin</tt> is now a SquashFS image which you will copy to a firmware flash partition. You will need to access this image while in the [[Skiroot]] environment, so you may wish to copy it to e.g. <tt>/boot</tt> or another partition which you can easily access from [[Skiroot]].<br />
<br />
==Step 2. Copy firmware partition image to flash==<br />
<br />
=== Method 1: From Petitboot ===<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
Locate the firmware.bin file you prepared earlier. Petitboot mounts all filesystems it can recognize in folders under /var/petitboot/mnt/dev/, named after disk device partitions. In this example, we will assume that the file was at the path <tt>/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin</tt>.<br />
# ls /var/petitboot/mnt/dev/<br />
# ls /var/petitboot/mnt/dev/nvme0n1p2/boot/<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition. If you are on an old PNOR version that does not have pflash available, use one of the other flashing methods below instead.<br />
# pflash -P BOOTKERNFW -e -p /var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 2: From the BMC ===<br />
On your local machine, scp the firmware.bin to /tmp/ on the BMC.<br />
$ scp /tmp/firmware.bin root@bmc-ip-address:/tmp/firmware.bin<br />
<br />
Power off the host. (optional, but recommended)<br />
<br />
Log into the BMC on port 22.<br />
$ ssh root@bmc-ip-address<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition.<br />
# pflash -P BOOTKERNFW -e -p /tmp/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Power the host back up.<br />
<br />
On the host, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 3: Old Petitboot method, DANGEROUS ===<br />
WARNING: This method is error-prone and should only be done from the petitboot shell, NEVER on the BMC.<br />
<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
IMPORTANT: ACCIDENTALLY PERFORMING THESE INSTRUCTIONS ON THE BMC INSTEAD OF THE PETITBOOT CONSOLE MAY DAMAGE YOUR BMC FIRMWARE!<br />
<br />
Make sure you can see the <tt>BOOTKERNFW</tt> partition (should return: <tt>mtd5: 000e0000 00010000 "BOOTKERNFW"</tt>):<br />
# cat /proc/mtd | grep BOOTKERNFW<br />
<br />
Find the <tt>flash_erase</tt> tool (mine was in <tt>/var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/</tt>):<br />
# find / -name flash_erase<br />
<br />
Erase <tt>/dev/mtd5</tt>:<br />
# /var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/flash_erase /dev/mtd5 0 0<br />
<br />
Flash <tt>/dev/mtd5</tt>:<br />
# dd if=/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin of=/dev/mtd5 bs=64k<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
==Notes==<br />
If you are still getting an error message about not being able to find VGA BIOS, it may be because your GPU is being initialized before <tt>/lib/firmware</tt> is mounted. <br />
You can check this by running <tt>dmesg</tt> from the [[Skiroot]] shell. For example, my GPU was trying to initialize at around T+6.5 seconds and the <tt>mtd</tt> device wasn't actually mounted until T+7.25.<br />
I was able to load the firmware successfully after <tt>/dev/mtd5</tt> was mounted by running <tt>rmmod amdgpu && modprobe amdgpu</tt> from the shell, although I did experience stability issues in Debian Stretch with <tt>amdgpu</tt>. If I let [[Skiroot]] fail to load the firmware and boot anyway, the <tt>radeon</tt> driver works fine.<br />
<br />
e.g.<br />
<pre><br />
$ dmesg<br />
...<br />
[ 6.412518] [drm] radeon kernel modesetting enabled.<br />
[ 6.412848] pci 0033:00:00.0: enabling device (0105 -> 0107)<br />
[ 6.412867] radeon 0033:01:00.0: enabling device (0140 -> 0142)<br />
[ 6.413077] [drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6819 0x1043:0x0431 0x00).<br />
[ 6.413086] radeon: No suitable DMA available<br />
[ 6.413150] [drm:radeon_device_init] *ERROR* Unable to find PCI I/O BAR<br />
[ 6.533321] [drm:radeon_atombios_init] *ERROR* Unable to find PCI I/O BAR; using MMIO for ATOM IIO<br />
[ 6.533415] ATOM BIOS: 6819.15.17.0.0.AS01<br />
[ 6.533434] [drm] GPU not posted. posting now...<br />
[ 6.541203] radeon 0033:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)<br />
[ 6.541208] radeon 0033:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF<br />
[ 6.541211] [drm] Detected VRAM RAM=2048M, BAR=256M<br />
[ 6.541213] [drm] RAM width 256bits DDR<br />
[ 6.541295] [TTM] Zone kernel: Available graphics memory: 50118240 kiB<br />
[ 6.541297] [TTM] Zone dma32: Available graphics memory: 2097152 kiB<br />
[ 6.541299] [TTM] Initializing pool allocator<br />
[ 6.541371] [drm] radeon: 2048M of VRAM memory ready<br />
[ 6.541374] [drm] radeon: 2048M of GTT memory ready.<br />
[ 6.541390] [drm] Loading pitcairn Microcode<br />
[ 6.541429] radeon 0033:01:00.0: Direct firmware load for radeon/pitcairn_pfp.bin failed with error -2<br />
[ 6.541456] radeon 0033:01:00.0: Direct firmware load for radeon/PITCAIRN_pfp.bin failed with error -2<br />
[ 6.541459] si_cp: Failed to load firmware "radeon/PITCAIRN_pfp.bin"<br />
[ 6.541560] [drm:si_init] *ERROR* Failed to load firmware!<br />
[ 6.541653] radeon 0033:01:00.0: Fatal error during GPU init<br />
[ 6.541772] [drm] radeon: finishing device.<br />
...<br />
[ 7.123614] 6 ofpart partitions found on MTD device flash<br />
[ 7.123617] Creating 6 MTD partitions on "flash":<br />
[ 7.123621] 0x000000000000-0x000004000000 : "PNOR"<br />
[ 7.123773] 0x000001b21000-0x000003a21000 : "BOOTKERNEL"<br />
[ 7.123868] 0x000003b44000-0x000003b68000 : "CAPP"<br />
[ 7.123961] 0x000003b88000-0x000003b89000 : "VERSION"<br />
[ 7.124056] 0x000003b89000-0x000003bc9000 : "IMA_CATALOG"<br />
[ 7.124149] 0x000003f10000-0x000003ff0000 : "BOOTKERNFW"<br />
...<br />
$ rmmod amdgpu && modprobe amdgpu<br />
$ dmesg<br />
...<br />
[ 2727.836343] [drm] amdgpu kernel modesetting enabled.<br />
[ 2727.836900] amdgpu 0033:01:00.0: SI support provided by radeon.<br />
[ 2727.836905] amdgpu 0033:01:00.0: Use radeon.si_support=0 amdgpu.si_support=1 to override.<br />
</pre><br />
<br />
==Erasing BOOTKERNFW==<br />
If you change your mind and decide to stop using BOOTKERNFW, you can erase it from either the BMC or the Petitboot shell. The procedure is the same on either.<br />
# pflash -P BOOTKERNFW -e<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
At this point you are back to an empty BOOTKERNFW partition.<br />
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Add_GPU_Firmware_To_BOOTKERNFW&diff=2472Add GPU Firmware To BOOTKERNFW2019-05-19T19:41:25Z<p>Bdragon: Add section on erasing BOOTKERNFW.</p>
<hr />
<div><br />
=Purpose=<br />
This guide explains how to add GPU (or other) firmware files to the BOOTKERNFW partition of the boot firmware flash.<br />
This allows Linux kernel drivers which require firmware blobs to function correctly within the Skiroot bootloader environment.<br />
<br />
For example, the <tt>amdgpu</tt> driver requires AMD firmware blobs to bring up AMD GPUs. Copying these files into the firmware partition enables attached displays to be brought up when showing the boot menu.<br />
<br />
=Background=<br />
Many cards require firmware to be loaded at startup time to initialize. On an x86 system, this is usually accomplished by bootstrap code called an "Option ROM", or by extra code in the BIOS.<br />
<br />
On OpenPOWER, this is not a possibility, due to the Option ROM code being platform-specific, as well as the OpenPOWER security stance being that trusting random bootstrap code on cards is a bad idea.<br />
<br />
Therefore, for things like video cards to work, the card's firmware must be loaded before the card can be used. The Linux video drivers in petitboot are capable of doing this loading, but the firmware itself must be available so that the drivers have something to load.<br />
<br />
So, on RCS OpenPOWER systems, there is a special area of flash set aside for user-provided firmware files, called BOOTKERNFW. Due to the limited size of this partition, the user is required to make decisions as to what firmware files to install.<br />
<br />
=Applicability=<br />
All RCS OpenPOWER systems.<br />
<br />
=Instructions=<br />
<br />
==Step 1. Generate firmware partition image==<br />
Boot into an OS. A Linux environment is assumed. You will need the '''mksquashfs''' tool available.<br />
<br />
For Debian, you can install '''mksquashfs''' using the following command:<br />
 # apt install squashfs-tools<br />
<br />
Create a directory with the required firmware files available:<br />
$ mkdir /tmp/firmware<br />
$ # ... (copy required files into /tmp/firmware) ...<br />
<br />
The directory structure under <tt>/tmp/firmware</tt> gets mounted at <tt>/lib/firmware</tt>. Here is an example of the required directory structure:<br />
/tmp/firmware/radeon/PITCAIRN_pfp.bin<br />
/tmp/firmware/amdgpu/polaris10_mc.bin<br />
<br />
You can obtain these firmware files from most Linux distros (they tend to be installed in <tt>/lib/firmware</tt>), or from the [https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/ linux-firmware repository].<br />
<br />
Having generated the correct directory structure, generate the image:<br />
$ cd /tmp/firmware<br />
$ mksquashfs * /tmp/firmware.bin -all-root -keep-as-directory<br />
<br />
<tt>firmware.bin</tt> is now a SquashFS image which you will copy to a firmware flash partition. You will need to access this image while in the [[Skiroot]] environment, so you may wish to copy it to e.g. <tt>/boot</tt> or another partition which you can easily access from [[Skiroot]].<br />
<br />
==Step 2. Copy firmware partition image to flash==<br />
<br />
=== Method 1: From Petitboot ===<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
Locate the firmware.bin file you prepared earlier. Petitboot mounts all filesystems it can recognize in folders under /var/petitboot/mnt/dev/, named after disk device partitions. In this example, we will assume that the file was at the path <tt>/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin</tt>.<br />
# ls /var/petitboot/mnt/dev/<br />
# ls /var/petitboot/mnt/dev/nvme0n1p2/boot/<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition. If you are on an old PNOR version that does not have pflash available, use one of the other flashing methods below instead.<br />
# pflash -P BOOTKERNFW -e -p /var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 2: From the BMC ===<br />
On your local machine, scp the firmware.bin to /tmp/ on the BMC.<br />
$ scp /tmp/firmware.bin root@bmc-ip-address:/tmp/firmware.bin<br />
<br />
Power off the host. (optional, but recommended)<br />
<br />
Log into the BMC on port 22.<br />
$ ssh root@bmc-ip-address<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition.<br />
# pflash -P BOOTKERNFW -e -p /tmp/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Power the host back up.<br />
<br />
On the host, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 3: Old Petitboot method, DANGEROUS ===<br />
WARNING: This method is error-prone and should only be done from the petitboot shell, NEVER on the BMC.<br />
<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
IMPORTANT: ACCIDENTALLY PERFORMING THESE INSTRUCTIONS ON THE BMC INSTEAD OF THE PETITBOOT CONSOLE MAY DAMAGE YOUR BMC FIRMWARE!<br />
<br />
Make sure you can see the <tt>BOOTKERNFW</tt> partition (should return: <tt>mtd5: 000e0000 00010000 "BOOTKERNFW"</tt>):<br />
# cat /proc/mtd | grep BOOTKERNFW<br />
<br />
Find the <tt>flash_erase</tt> tool (mine was in <tt>/var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/</tt>):<br />
# find / -name flash_erase<br />
<br />
Erase <tt>/dev/mtd5</tt>:<br />
# /var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/flash_erase /dev/mtd5 0 0<br />
<br />
Flash <tt>/dev/mtd5</tt>:<br />
# dd if=/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin of=/dev/mtd5 bs=64k<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
==Notes==<br />
If you are still getting an error message about not being able to find VGA BIOS, it may be because your GPU is being initialized before <tt>/lib/firmware</tt> is mounted. <br />
You can check this by running <tt>dmesg</tt> from the [[Skiroot]] shell. For example, my GPU was trying to initialize at around T+6.5 seconds and the <tt>mtd</tt> device wasn't actually mounted until T+7.25.<br />
I was able to load the firmware after <tt>/dev/mtd5</tt> was mounted by running <tt>rmmod amdgpu && modprobe amdgpu</tt> from the shell, but I then experienced stability issues with <tt>amdgpu</tt> in Debian Stretch. If I instead let [[Skiroot]] fail to load the firmware and boot anyway, the <tt>radeon</tt> driver works fine.<br />
<br />
e.g.<br />
<pre><br />
$ dmesg<br />
...<br />
[ 6.412518] [drm] radeon kernel modesetting enabled.<br />
[ 6.412848] pci 0033:00:00.0: enabling device (0105 -> 0107)<br />
[ 6.412867] radeon 0033:01:00.0: enabling device (0140 -> 0142)<br />
[ 6.413077] [drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6819 0x1043:0x0431 0x00).<br />
[ 6.413086] radeon: No suitable DMA available<br />
[ 6.413150] [drm:radeon_device_init] *ERROR* Unable to find PCI I/O BAR<br />
[ 6.533321] [drm:radeon_atombios_init] *ERROR* Unable to find PCI I/O BAR; using MMIO for ATOM IIO<br />
[ 6.533415] ATOM BIOS: 6819.15.17.0.0.AS01<br />
[ 6.533434] [drm] GPU not posted. posting now...<br />
[ 6.541203] radeon 0033:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)<br />
[ 6.541208] radeon 0033:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF<br />
[ 6.541211] [drm] Detected VRAM RAM=2048M, BAR=256M<br />
[ 6.541213] [drm] RAM width 256bits DDR<br />
[ 6.541295] [TTM] Zone kernel: Available graphics memory: 50118240 kiB<br />
[ 6.541297] [TTM] Zone dma32: Available graphics memory: 2097152 kiB<br />
[ 6.541299] [TTM] Initializing pool allocator<br />
[ 6.541371] [drm] radeon: 2048M of VRAM memory ready<br />
[ 6.541374] [drm] radeon: 2048M of GTT memory ready.<br />
[ 6.541390] [drm] Loading pitcairn Microcode<br />
[ 6.541429] radeon 0033:01:00.0: Direct firmware load for radeon/pitcairn_pfp.bin failed with error -2<br />
[ 6.541456] radeon 0033:01:00.0: Direct firmware load for radeon/PITCAIRN_pfp.bin failed with error -2<br />
[ 6.541459] si_cp: Failed to load firmware "radeon/PITCAIRN_pfp.bin"<br />
[ 6.541560] [drm:si_init] *ERROR* Failed to load firmware!<br />
[ 6.541653] radeon 0033:01:00.0: Fatal error during GPU init<br />
[ 6.541772] [drm] radeon: finishing device.<br />
...<br />
[ 7.123614] 6 ofpart partitions found on MTD device flash<br />
[ 7.123617] Creating 6 MTD partitions on "flash":<br />
[ 7.123621] 0x000000000000-0x000004000000 : "PNOR"<br />
[ 7.123773] 0x000001b21000-0x000003a21000 : "BOOTKERNEL"<br />
[ 7.123868] 0x000003b44000-0x000003b68000 : "CAPP"<br />
[ 7.123961] 0x000003b88000-0x000003b89000 : "VERSION"<br />
[ 7.124056] 0x000003b89000-0x000003bc9000 : "IMA_CATALOG"<br />
[ 7.124149] 0x000003f10000-0x000003ff0000 : "BOOTKERNFW"<br />
...<br />
$ rmmod amdgpu && modprobe amdgpu<br />
$ dmesg<br />
...<br />
[ 2727.836343] [drm] amdgpu kernel modesetting enabled.<br />
[ 2727.836900] amdgpu 0033:01:00.0: SI support provided by radeon.<br />
[ 2727.836905] amdgpu 0033:01:00.0: Use radeon.si_support=0 amdgpu.si_support=1 to override.<br />
</pre><br />
<br />
==Erasing BOOTKERNFW==<br />
If you change your mind and decide to stop using BOOTKERNFW, you can erase it from either the BMC or the Petitboot shell. The procedure is the same on either.<br />
# pflash -P BOOTKERNFW -e<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
At this point you are back to an empty BOOTKERNFW partition.<br />
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Add_GPU_Firmware_To_BOOTKERNFW&diff=2471Add GPU Firmware To BOOTKERNFW2019-05-19T19:27:59Z<p>Bdragon: /* Step 2. Copy firmware partition image to flash */</p>
<hr />
<div><br />
=Purpose=<br />
This guide explains how to add GPU (or other) firmware files to the BOOTKERNFW partition of the boot firmware flash.<br />
This allows Linux kernel drivers which require firmware blobs to function correctly within the Skiroot bootloader environment.<br />
<br />
For example, the <tt>amdgpu</tt> driver requires AMD firmware blobs to bring up AMD GPUs. Copying these files into the firmware partition enables attached displays to be brought up when showing the boot menu.<br />
<br />
=Background=<br />
Many cards require firmware to be loaded at startup time to initialize. On an x86 system, this is usually accomplished by bootstrap code called an "Option ROM", or by extra code in the BIOS.<br />
<br />
On OpenPOWER, this is not possible: Option ROM code is platform-specific, and the OpenPOWER security stance is that trusting arbitrary bootstrap code on cards is a bad idea.<br />
<br />
Therefore, for things like video cards to work, the card's firmware must be loaded before the card can be used. The Linux video drivers in Petitboot can do this loading, but the firmware itself must be available so that the drivers have something to load.<br />
<br />
So, on RCS OpenPOWER systems, there is a special area of flash set aside for user-provided firmware files, called BOOTKERNFW. Because this partition is small, you must choose which firmware files to install.<br />
<br />
=Applicability=<br />
All RCS OpenPOWER systems.<br />
<br />
=Instructions=<br />
<br />
==Step 1. Generate firmware partition image==<br />
Boot into an OS. A Linux environment is assumed. You will need the '''mksquashfs''' tool available.<br />
<br />
For Debian, you can install '''mksquashfs''' using the following command:<br />
# apt install squashfs-tools<br />
<br />
Create a directory with the required firmware files available:<br />
$ mkdir /tmp/firmware<br />
$ # ... (copy required files into /tmp/firmware) ...<br />
<br />
The directory structure under <tt>/tmp/firmware</tt> gets mounted at <tt>/lib/firmware</tt>. Here is an example of the required directory structure:<br />
/tmp/firmware/radeon/PITCAIRN_pfp.bin<br />
/tmp/firmware/amdgpu/polaris10_mc.bin<br />
<br />
You can obtain these firmware files from most Linux distros (they tend to be installed in <tt>/lib/firmware</tt>), or from the [https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/ linux-firmware repository].<br />
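<br />
The copy step above can be sketched as a small shell helper. This is only an illustration: the function name <tt>stage_firmware</tt> is hypothetical, and it assumes the files you need exist uncompressed under the source tree you point it at.<br />

```sh
# Sketch: copy selected firmware files into a staging directory,
# keeping their paths relative to the source firmware tree.
# stage_firmware is a hypothetical helper name, not an existing tool.
stage_firmware() {
    src_root=$1
    dest_root=$2
    shift 2
    for rel in "$@"; do
        # Recreate the subdirectory (e.g. radeon/) under the staging root.
        mkdir -p "$dest_root/$(dirname "$rel")" || return 1
        cp "$src_root/$rel" "$dest_root/$rel" || return 1
    done
}
```

For the example layout shown earlier, this could be called as <tt>stage_firmware /lib/firmware /tmp/firmware radeon/PITCAIRN_pfp.bin amdgpu/polaris10_mc.bin</tt>.<br />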
<br />
Having generated the correct directory structure, generate the image:<br />
$ cd /tmp/firmware<br />
$ mksquashfs * /tmp/firmware.bin -all-root -keep-as-directory<br />
<br />
<tt>firmware.bin</tt> is now a SquashFS image which you will copy to a firmware flash partition. You will need to access this image while in the [[Skiroot]] environment, so you may wish to copy it to e.g. <tt>/boot</tt> or another partition which you can easily access from [[Skiroot]].<br />
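<br />
Since the partition is small, it may be worth checking the image size before flashing. The following is a minimal sketch, assuming you pass in the hexadecimal size column that <tt>/proc/mtd</tt> reports for BOOTKERNFW (<tt>000e0000</tt> in the example output later in this guide). The function name <tt>fits_in_partition</tt> is hypothetical.<br />

```sh
# Sketch: succeed only if an image file is no larger than a partition.
# $1 = image path, $2 = partition size in hex, as shown in /proc/mtd.
# fits_in_partition is a hypothetical helper name.
fits_in_partition() {
    image=$1
    part_size_hex=$2
    # The outer $(( )) normalizes any padding wc may add around the number.
    image_bytes=$(($(wc -c < "$image")))
    part_bytes=$((0x$part_size_hex))
    [ "$image_bytes" -le "$part_bytes" ]
}
```

For example, <tt>fits_in_partition /tmp/firmware.bin 000e0000 || echo "image too large"</tt>. If the image is too large, remove firmware files you do not need and regenerate it.<br />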
<br />
==Step 2. Copy firmware partition image to flash==<br />
<br />
=== Method 1: From Petitboot ===<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
Locate the firmware.bin file you prepared earlier. Petitboot mounts all filesystems it can recognize in folders under /var/petitboot/mnt/dev/, named after disk device partitions. In this example, we will assume that the file was at the path <tt>/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin</tt>.<br />
# ls /var/petitboot/mnt/dev/<br />
# ls /var/petitboot/mnt/dev/nvme0n1p2/boot/<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition. If you are on an old PNOR version that does not have <tt>pflash</tt> available, use one of the other flashing methods instead.<br />
# pflash -P BOOTKERNFW -e -p /var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 2: From the BMC ===<br />
On your local machine, scp the firmware.bin to /tmp/ on the BMC.<br />
$ scp /tmp/firmware.bin root@bmc-ip-address:/tmp/firmware.bin<br />
<br />
Power off the host. (optional, but recommended)<br />
<br />
Log into the BMC on port 22.<br />
$ ssh root@bmc-ip-address<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition.<br />
# pflash -P BOOTKERNFW -e -p /tmp/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Power the host back up.<br />
<br />
On the host, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 3: Old Petitboot method, DANGEROUS ===<br />
WARNING: This method is error-prone and should only be done from the petitboot shell, NEVER on the BMC.<br />
<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
IMPORTANT: ACCIDENTALLY PERFORMING THESE INSTRUCTIONS ON THE BMC INSTEAD OF THE PETITBOOT CONSOLE MAY DAMAGE YOUR BMC FIRMWARE!<br />
<br />
Make sure you can see the <tt>BOOTKERNFW</tt> partition (should return: <tt>mtd5: 000e0000 00010000 "BOOTKERNFW"</tt>):<br />
# cat /proc/mtd | grep BOOTKERNFW<br />
<br />
Find the <tt>flash_erase</tt> tool (on my system it was in /var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/):<br />
# find / -name flash_erase<br />
<br />
Erase <tt>/dev/mtd5</tt>:<br />
# /var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/flash_erase /dev/mtd5 0 0<br />
<br />
Flash <tt>/dev/mtd5</tt>:<br />
# dd if=/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin of=/dev/mtd5 bs=64k<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
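<br />
The Method 3 commands hardcode <tt>/dev/mtd5</tt>, which is only correct if <tt>/proc/mtd</tt> lists BOOTKERNFW under that number. As a hedged sketch, the device name can be looked up by label instead. The function name <tt>mtd_dev_for_name</tt> is hypothetical, and the sketch assumes an <tt>awk</tt> implementation is available in the shell you are using.<br />

```sh
# Sketch: print the MTD device name (e.g. mtd5) for a partition label.
# $1 = label, $2 = file in /proc/mtd format (defaults to /proc/mtd).
# mtd_dev_for_name is a hypothetical helper name.
mtd_dev_for_name() {
    label=$1
    mtd_file=${2:-/proc/mtd}
    # Lines look like: mtd5: 000e0000 00010000 "BOOTKERNFW"
    awk -v want="\"$label\"" '
        $4 == want { sub(":", "", $1); print $1; found = 1 }
        END { exit !found }
    ' "$mtd_file"
}
```

For example, <tt>mtd_dev_for_name BOOTKERNFW</tt> should print <tt>mtd5</tt> on a system matching the output shown above, and it exits with a nonzero status if no such partition is listed.<br />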
<br />
==Notes==<br />
If you are still getting an error message about not being able to find VGA BIOS, it may be because your GPU is being initialized before <tt>/lib/firmware</tt> is mounted. <br />
You can check this by running <tt>dmesg</tt> from the [[Skiroot]] shell. For example, my GPU was trying to initialize at around T+6.5 seconds and the <tt>mtd</tt> device wasn't actually mounted until T+7.25.<br />
I was able to load the firmware after <tt>/dev/mtd5</tt> was mounted by running <tt>rmmod amdgpu && modprobe amdgpu</tt> from the shell, but I then experienced stability issues with <tt>amdgpu</tt> in Debian Stretch. If I instead let [[Skiroot]] fail to load the firmware and boot anyway, the <tt>radeon</tt> driver works fine.<br />
<br />
e.g.<br />
<pre><br />
$ dmesg<br />
...<br />
[ 6.412518] [drm] radeon kernel modesetting enabled.<br />
[ 6.412848] pci 0033:00:00.0: enabling device (0105 -> 0107)<br />
[ 6.412867] radeon 0033:01:00.0: enabling device (0140 -> 0142)<br />
[ 6.413077] [drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6819 0x1043:0x0431 0x00).<br />
[ 6.413086] radeon: No suitable DMA available<br />
[ 6.413150] [drm:radeon_device_init] *ERROR* Unable to find PCI I/O BAR<br />
[ 6.533321] [drm:radeon_atombios_init] *ERROR* Unable to find PCI I/O BAR; using MMIO for ATOM IIO<br />
[ 6.533415] ATOM BIOS: 6819.15.17.0.0.AS01<br />
[ 6.533434] [drm] GPU not posted. posting now...<br />
[ 6.541203] radeon 0033:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)<br />
[ 6.541208] radeon 0033:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF<br />
[ 6.541211] [drm] Detected VRAM RAM=2048M, BAR=256M<br />
[ 6.541213] [drm] RAM width 256bits DDR<br />
[ 6.541295] [TTM] Zone kernel: Available graphics memory: 50118240 kiB<br />
[ 6.541297] [TTM] Zone dma32: Available graphics memory: 2097152 kiB<br />
[ 6.541299] [TTM] Initializing pool allocator<br />
[ 6.541371] [drm] radeon: 2048M of VRAM memory ready<br />
[ 6.541374] [drm] radeon: 2048M of GTT memory ready.<br />
[ 6.541390] [drm] Loading pitcairn Microcode<br />
[ 6.541429] radeon 0033:01:00.0: Direct firmware load for radeon/pitcairn_pfp.bin failed with error -2<br />
[ 6.541456] radeon 0033:01:00.0: Direct firmware load for radeon/PITCAIRN_pfp.bin failed with error -2<br />
[ 6.541459] si_cp: Failed to load firmware "radeon/PITCAIRN_pfp.bin"<br />
[ 6.541560] [drm:si_init] *ERROR* Failed to load firmware!<br />
[ 6.541653] radeon 0033:01:00.0: Fatal error during GPU init<br />
[ 6.541772] [drm] radeon: finishing device.<br />
...<br />
[ 7.123614] 6 ofpart partitions found on MTD device flash<br />
[ 7.123617] Creating 6 MTD partitions on "flash":<br />
[ 7.123621] 0x000000000000-0x000004000000 : "PNOR"<br />
[ 7.123773] 0x000001b21000-0x000003a21000 : "BOOTKERNEL"<br />
[ 7.123868] 0x000003b44000-0x000003b68000 : "CAPP"<br />
[ 7.123961] 0x000003b88000-0x000003b89000 : "VERSION"<br />
[ 7.124056] 0x000003b89000-0x000003bc9000 : "IMA_CATALOG"<br />
[ 7.124149] 0x000003f10000-0x000003ff0000 : "BOOTKERNFW"<br />
...<br />
$ rmmod amdgpu && modprobe amdgpu<br />
$ dmesg<br />
...<br />
[ 2727.836343] [drm] amdgpu kernel modesetting enabled.<br />
[ 2727.836900] amdgpu 0033:01:00.0: SI support provided by radeon.<br />
[ 2727.836905] amdgpu 0033:01:00.0: Use radeon.si_support=0 amdgpu.si_support=1 to override.<br />
</pre><br />
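<br />
The ordering problem visible in the log above can be checked mechanically. This is a rough sketch, assuming the log messages are worded as in the example and that <tt>awk</tt> is available. It compares line order rather than parsing timestamps, and the function name <tt>firmware_failed_before_mtd</tt> is hypothetical.<br />

```sh
# Sketch: read dmesg output on stdin and exit 0 if a
# "Direct firmware load ... failed" message appears before the line
# that creates the BOOTKERNFW MTD partition.
# firmware_failed_before_mtd is a hypothetical helper name.
firmware_failed_before_mtd() {
    awk '
        /Direct firmware load .* failed/ && !fail_line { fail_line = NR }
        /"BOOTKERNFW"/ && !mtd_line { mtd_line = NR }
        END { exit !(fail_line && mtd_line && fail_line < mtd_line) }
    '
}
```

For example, <tt>dmesg | firmware_failed_before_mtd && echo "driver probed before BOOTKERNFW was available"</tt>.<br />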
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Add_GPU_Firmware_To_BOOTKERNFW&diff=2470Add GPU Firmware To BOOTKERNFW2019-05-19T19:26:56Z<p>Bdragon: /* Background */ s/POWER/OpenPOWER/</p>
<hr />
<div><br />
=Purpose=<br />
This guide explains how to add GPU (or other) firmware files to the BOOTKERNFW partition of the boot firmware flash.<br />
This allows Linux kernel drivers which require firmware blobs to function correctly within the Skiroot bootloader environment.<br />
<br />
For example, the <tt>amdgpu</tt> driver requires AMD firmware blobs to bring up AMD GPUs. Copying these files into the firmware partition enables attached displays to be brought up when showing the boot menu.<br />
<br />
=Background=<br />
Many cards require firmware to be loaded at startup time to initialize. On an x86 system, this is usually accomplished by bootstrap code called an "Option ROM", or by extra code in the BIOS.<br />
<br />
On OpenPOWER, this is not possible: Option ROM code is platform-specific, and the OpenPOWER security stance is that trusting arbitrary bootstrap code on cards is a bad idea.<br />
<br />
Therefore, for things like video cards to work, the card's firmware must be loaded before the card can be used. The Linux video drivers in Petitboot can do this loading, but the firmware itself must be available so that the drivers have something to load.<br />
<br />
So, on RCS OpenPOWER systems, there is a special area of flash set aside for user-provided firmware files, called BOOTKERNFW. Because this partition is small, you must choose which firmware files to install.<br />
<br />
=Applicability=<br />
All RCS OpenPOWER systems.<br />
<br />
=Instructions=<br />
<br />
==Step 1. Generate firmware partition image==<br />
Boot into an OS. A Linux environment is assumed. You will need the '''mksquashfs''' tool available.<br />
<br />
For Debian, you can install '''mksquashfs''' using the following command:<br />
# apt install squashfs-tools<br />
<br />
Create a directory with the required firmware files available:<br />
$ mkdir /tmp/firmware<br />
$ # ... (copy required files into /tmp/firmware) ...<br />
<br />
The directory structure under <tt>/tmp/firmware</tt> gets mounted at <tt>/lib/firmware</tt>. Here is an example of the required directory structure:<br />
/tmp/firmware/radeon/PITCAIRN_pfp.bin<br />
/tmp/firmware/amdgpu/polaris10_mc.bin<br />
<br />
You can obtain these firmware files from most Linux distros (they tend to be installed in <tt>/lib/firmware</tt>), or from the [https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/ linux-firmware repository].<br />
<br />
Having generated the correct directory structure, generate the image:<br />
$ cd /tmp/firmware<br />
$ mksquashfs * /tmp/firmware.bin -all-root -keep-as-directory<br />
<br />
<tt>firmware.bin</tt> is now a SquashFS image which you will copy to a firmware flash partition. You will need to access this image while in the [[Skiroot]] environment, so you may wish to copy it to e.g. <tt>/boot</tt> or another partition which you can easily access from [[Skiroot]].<br />
<br />
==Step 2. Copy firmware partition image to flash==<br />
This step can be done from either Petitboot or the BMC.<br />
<br />
=== Method 1: From Petitboot ===<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
Locate the firmware.bin file you prepared earlier. Petitboot mounts all filesystems it can recognize in folders under /var/petitboot/mnt/dev/, named after disk device partitions. In this example, we will assume that the file was at the path <tt>/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin</tt>.<br />
# ls /var/petitboot/mnt/dev/<br />
# ls /var/petitboot/mnt/dev/nvme0n1p2/boot/<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition. If you are on an old PNOR version that does not have <tt>pflash</tt> available, use one of the other flashing methods instead.<br />
# pflash -P BOOTKERNFW -e -p /var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 2: From the BMC ===<br />
On your local machine, scp the firmware.bin to /tmp/ on the BMC.<br />
$ scp /tmp/firmware.bin root@bmc-ip-address:/tmp/firmware.bin<br />
<br />
Power off the host. (optional, but recommended)<br />
<br />
Log into the BMC on port 22.<br />
$ ssh root@bmc-ip-address<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition.<br />
# pflash -P BOOTKERNFW -e -p /tmp/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Power the host back up.<br />
<br />
On the host, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 3: Old Petitboot method, DANGEROUS ===<br />
WARNING: This method is error-prone and should only be done from the petitboot shell, NEVER on the BMC.<br />
<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
IMPORTANT: ACCIDENTALLY PERFORMING THESE INSTRUCTIONS ON THE BMC INSTEAD OF THE PETITBOOT CONSOLE MAY DAMAGE YOUR BMC FIRMWARE!<br />
<br />
Make sure you can see the <tt>BOOTKERNFW</tt> partition (should return: <tt>mtd5: 000e0000 00010000 "BOOTKERNFW"</tt>):<br />
# cat /proc/mtd | grep BOOTKERNFW<br />
<br />
Find the <tt>flash_erase</tt> tool (on my system it was in /var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/):<br />
# find / -name flash_erase<br />
<br />
Erase <tt>/dev/mtd5</tt>:<br />
# /var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/flash_erase /dev/mtd5 0 0<br />
<br />
Flash <tt>/dev/mtd5</tt>:<br />
# dd if=/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin of=/dev/mtd5 bs=64k<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
==Notes==<br />
If you are still getting an error message about not being able to find VGA BIOS, it may be because your GPU is being initialized before <tt>/lib/firmware</tt> is mounted. <br />
You can check this by running <tt>dmesg</tt> from the [[Skiroot]] shell. For example, my GPU was trying to initialize at around T+6.5 seconds and the <tt>mtd</tt> device wasn't actually mounted until T+7.25.<br />
I was able to load the firmware after <tt>/dev/mtd5</tt> was mounted by running <tt>rmmod amdgpu && modprobe amdgpu</tt> from the shell, but I then experienced stability issues with <tt>amdgpu</tt> in Debian Stretch. If I instead let [[Skiroot]] fail to load the firmware and boot anyway, the <tt>radeon</tt> driver works fine.<br />
<br />
e.g.<br />
<pre><br />
$ dmesg<br />
...<br />
[ 6.412518] [drm] radeon kernel modesetting enabled.<br />
[ 6.412848] pci 0033:00:00.0: enabling device (0105 -> 0107)<br />
[ 6.412867] radeon 0033:01:00.0: enabling device (0140 -> 0142)<br />
[ 6.413077] [drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6819 0x1043:0x0431 0x00).<br />
[ 6.413086] radeon: No suitable DMA available<br />
[ 6.413150] [drm:radeon_device_init] *ERROR* Unable to find PCI I/O BAR<br />
[ 6.533321] [drm:radeon_atombios_init] *ERROR* Unable to find PCI I/O BAR; using MMIO for ATOM IIO<br />
[ 6.533415] ATOM BIOS: 6819.15.17.0.0.AS01<br />
[ 6.533434] [drm] GPU not posted. posting now...<br />
[ 6.541203] radeon 0033:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)<br />
[ 6.541208] radeon 0033:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF<br />
[ 6.541211] [drm] Detected VRAM RAM=2048M, BAR=256M<br />
[ 6.541213] [drm] RAM width 256bits DDR<br />
[ 6.541295] [TTM] Zone kernel: Available graphics memory: 50118240 kiB<br />
[ 6.541297] [TTM] Zone dma32: Available graphics memory: 2097152 kiB<br />
[ 6.541299] [TTM] Initializing pool allocator<br />
[ 6.541371] [drm] radeon: 2048M of VRAM memory ready<br />
[ 6.541374] [drm] radeon: 2048M of GTT memory ready.<br />
[ 6.541390] [drm] Loading pitcairn Microcode<br />
[ 6.541429] radeon 0033:01:00.0: Direct firmware load for radeon/pitcairn_pfp.bin failed with error -2<br />
[ 6.541456] radeon 0033:01:00.0: Direct firmware load for radeon/PITCAIRN_pfp.bin failed with error -2<br />
[ 6.541459] si_cp: Failed to load firmware "radeon/PITCAIRN_pfp.bin"<br />
[ 6.541560] [drm:si_init] *ERROR* Failed to load firmware!<br />
[ 6.541653] radeon 0033:01:00.0: Fatal error during GPU init<br />
[ 6.541772] [drm] radeon: finishing device.<br />
...<br />
[ 7.123614] 6 ofpart partitions found on MTD device flash<br />
[ 7.123617] Creating 6 MTD partitions on "flash":<br />
[ 7.123621] 0x000000000000-0x000004000000 : "PNOR"<br />
[ 7.123773] 0x000001b21000-0x000003a21000 : "BOOTKERNEL"<br />
[ 7.123868] 0x000003b44000-0x000003b68000 : "CAPP"<br />
[ 7.123961] 0x000003b88000-0x000003b89000 : "VERSION"<br />
[ 7.124056] 0x000003b89000-0x000003bc9000 : "IMA_CATALOG"<br />
[ 7.124149] 0x000003f10000-0x000003ff0000 : "BOOTKERNFW"<br />
...<br />
$ rmmod amdgpu && modprobe amdgpu<br />
$ dmesg<br />
...<br />
[ 2727.836343] [drm] amdgpu kernel modesetting enabled.<br />
[ 2727.836900] amdgpu 0033:01:00.0: SI support provided by radeon.<br />
[ 2727.836905] amdgpu 0033:01:00.0: Use radeon.si_support=0 amdgpu.si_support=1 to override.<br />
</pre><br />
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Add_GPU_Firmware_To_BOOTKERNFW&diff=2469Add GPU Firmware To BOOTKERNFW2019-05-19T19:24:49Z<p>Bdragon: Write safer method for flashing.</p>
<hr />
<div><br />
=Purpose=<br />
This guide explains how to add GPU (or other) firmware files to the BOOTKERNFW partition of the boot firmware flash.<br />
This allows Linux kernel drivers which require firmware blobs to function correctly within the Skiroot bootloader environment.<br />
<br />
For example, the <tt>amdgpu</tt> driver requires AMD firmware blobs to bring up AMD GPUs. Copying these files into the firmware partition enables attached displays to be brought up when showing the boot menu.<br />
<br />
=Background=<br />
Many cards require firmware to be loaded at startup time to initialize. On an x86 system, this is usually accomplished by bootstrap code called an "Option ROM", or by extra code in the BIOS.<br />
<br />
On POWER, this is not possible: Option ROM code is platform-specific, and the POWER security stance is that trusting arbitrary bootstrap code on cards is a bad idea.<br />
<br />
Therefore, for things like video cards to work, the card's firmware must be loaded before the card can be used. The Linux video drivers in Petitboot can do this loading, but the firmware itself must be available so that the drivers have something to load.<br />
<br />
So, on RCS OpenPOWER systems, there is a special area of flash set aside for user-provided firmware files, called BOOTKERNFW. Because this partition is small, you must choose which firmware files to install.<br />
<br />
=Applicability=<br />
All RCS OpenPOWER systems.<br />
<br />
=Instructions=<br />
<br />
==Step 1. Generate firmware partition image==<br />
Boot into an OS. A Linux environment is assumed. You will need the '''mksquashfs''' tool available.<br />
<br />
For Debian, you can install '''mksquashfs''' using the following command:<br />
# apt install squashfs-tools<br />
<br />
Create a directory with the required firmware files available:<br />
$ mkdir /tmp/firmware<br />
$ # ... (copy required files into /tmp/firmware) ...<br />
<br />
The directory structure under <tt>/tmp/firmware</tt> gets mounted at <tt>/lib/firmware</tt>. Here is an example of the required directory structure:<br />
/tmp/firmware/radeon/PITCAIRN_pfp.bin<br />
/tmp/firmware/amdgpu/polaris10_mc.bin<br />
<br />
You can obtain these firmware files from most Linux distros (they tend to be installed in <tt>/lib/firmware</tt>), or from the [https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/ linux-firmware repository].<br />
<br />
Having generated the correct directory structure, generate the image:<br />
$ cd /tmp/firmware<br />
$ mksquashfs * /tmp/firmware.bin -all-root -keep-as-directory<br />
<br />
<tt>firmware.bin</tt> is now a SquashFS image which you will copy to a firmware flash partition. You will need to access this image while in the [[Skiroot]] environment, so you may wish to copy it to e.g. <tt>/boot</tt> or another partition which you can easily access from [[Skiroot]].<br />
<br />
==Step 2. Copy firmware partition image to flash==<br />
This step can be done from either Petitboot or the BMC.<br />
<br />
=== Method 1: From Petitboot ===<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
Locate the firmware.bin file you prepared earlier. Petitboot mounts all filesystems it can recognize in folders under /var/petitboot/mnt/dev/, named after disk device partitions. In this example, we will assume that the file was at the path <tt>/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin</tt>.<br />
# ls /var/petitboot/mnt/dev/<br />
# ls /var/petitboot/mnt/dev/nvme0n1p2/boot/<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition. If you are on an old PNOR version that does not have <tt>pflash</tt> available, use one of the other flashing methods instead.<br />
# pflash -P BOOTKERNFW -e -p /var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 2: From the BMC ===<br />
On your local machine, scp the firmware.bin to /tmp/ on the BMC.<br />
$ scp /tmp/firmware.bin root@bmc-ip-address:/tmp/firmware.bin<br />
<br />
Power off the host. (optional, but recommended)<br />
<br />
Log into the BMC on port 22.<br />
$ ssh root@bmc-ip-address<br />
<br />
Use the <tt>pflash</tt> tool to flash the firmware.bin to the BOOTKERNFW partition.<br />
# pflash -P BOOTKERNFW -e -p /tmp/firmware.bin<br />
About to erase 0x03e10000..0x03ff0000 !<br />
WARNING ! This will modify your HOST flash chip content !<br />
Enter "yes" to confirm:<br />
<br />
The <tt>pflash</tt> tool will prompt you to confirm that you are about to modify your HOST flash chip content. Answer "yes" and press Enter.<br />
Erasing...<br />
[==================================================] 100% ETA:0s <br />
About to program "bootkern.img" at 0x03e10000..0x03ff0000 !<br />
Programming & Verifying...<br />
[==================================================] 100% ETA:0s <br />
Updating actual size in partition header...<br />
<br />
Power the host back up.<br />
<br />
On the host, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
=== Method 3: Old Petitboot method, DANGEROUS ===<br />
WARNING: This method is error-prone and should only be done from the petitboot shell, NEVER on the BMC.<br />
<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
IMPORTANT: ACCIDENTALLY PERFORMING THESE INSTRUCTIONS ON THE BMC INSTEAD OF THE PETITBOOT CONSOLE MAY DAMAGE YOUR BMC FIRMWARE!<br />
<br />
Make sure you can see the <tt>BOOTKERNFW</tt> partition (should return: <tt>mtd5: 000e0000 00010000 "BOOTKERNFW"</tt>):<br />
# cat /proc/mtd | grep BOOTKERNFW<br />
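<br />
Because the MTD number may differ between firmware versions, it can be safer to look the device up by name than to assume <tt>mtd5</tt>. The sketch below assumes that <tt>awk</tt> is available in the Petitboot shell; the helper name <tt>mtd_for_label</tt> is made up for illustration.<br />

```shell
#!/bin/sh
# mtd_for_label LABEL [FILE]: print the /dev/mtdN path whose quoted name
# in FILE (default: /proc/mtd) equals LABEL.
# Hypothetical helper; assumes lines of the form
#   mtd5: 000e0000 00010000 "BOOTKERNFW"
mtd_for_label() {
    awk -v want="\"$1\"" \
        '$4 == want { sub(":", "", $1); print "/dev/" $1 }' \
        "${2:-/proc/mtd}"
}

# Typical use from the Petitboot shell:
#   mtd_for_label BOOTKERNFW
```

If the command prints nothing, the partition was not found and the remaining steps should not be attempted.<br />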
<br />
Find the <tt>flash_erase</tt> binary (mine was in <tt>/var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/</tt>):<br />
# find / -name flash_erase<br />
<br />
Erase <tt>/dev/mtd5</tt>:<br />
# /var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/flash_erase /dev/mtd5 0 0<br />
<br />
Flash <tt>/dev/mtd5</tt>:<br />
# dd if=/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin of=/dev/mtd5 bs=64k<br />
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
==Notes==<br />
If you are still getting an error message about not being able to find VGA BIOS, it may be because your GPU is being initialized before <tt>/lib/firmware</tt> is mounted. <br />
You can check this by running <tt>dmesg</tt> from the [[Skiroot]] shell. For example, my GPU was trying to initialize at around T+6.5 seconds and the <tt>mtd</tt> device wasn't actually mounted until T+7.25.<br />
I was able to load the firmware successfully after <tt>/dev/mtd5</tt> was mounted by running <tt>rmmod amdgpu && modprobe amdgpu</tt> from the shell. However, I experienced stability issues in Debian Stretch with <tt>amdgpu</tt>; if I let [[Skiroot]] fail to load the firmware and boot anyway, the <tt>radeon</tt> driver works fine.<br />
<br />
e.g.<br />
<pre><br />
$ dmesg<br />
...<br />
[ 6.412518] [drm] radeon kernel modesetting enabled.<br />
[ 6.412848] pci 0033:00:00.0: enabling device (0105 -> 0107)<br />
[ 6.412867] radeon 0033:01:00.0: enabling device (0140 -> 0142)<br />
[ 6.413077] [drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6819 0x1043:0x0431 0x00).<br />
[ 6.413086] radeon: No suitable DMA available<br />
[ 6.413150] [drm:radeon_device_init] *ERROR* Unable to find PCI I/O BAR<br />
[ 6.533321] [drm:radeon_atombios_init] *ERROR* Unable to find PCI I/O BAR; using MMIO for ATOM IIO<br />
[ 6.533415] ATOM BIOS: 6819.15.17.0.0.AS01<br />
[ 6.533434] [drm] GPU not posted. posting now...<br />
[ 6.541203] radeon 0033:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)<br />
[ 6.541208] radeon 0033:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF<br />
[ 6.541211] [drm] Detected VRAM RAM=2048M, BAR=256M<br />
[ 6.541213] [drm] RAM width 256bits DDR<br />
[ 6.541295] [TTM] Zone kernel: Available graphics memory: 50118240 kiB<br />
[ 6.541297] [TTM] Zone dma32: Available graphics memory: 2097152 kiB<br />
[ 6.541299] [TTM] Initializing pool allocator<br />
[ 6.541371] [drm] radeon: 2048M of VRAM memory ready<br />
[ 6.541374] [drm] radeon: 2048M of GTT memory ready.<br />
[ 6.541390] [drm] Loading pitcairn Microcode<br />
[ 6.541429] radeon 0033:01:00.0: Direct firmware load for radeon/pitcairn_pfp.bin failed with error -2<br />
[ 6.541456] radeon 0033:01:00.0: Direct firmware load for radeon/PITCAIRN_pfp.bin failed with error -2<br />
[ 6.541459] si_cp: Failed to load firmware "radeon/PITCAIRN_pfp.bin"<br />
[ 6.541560] [drm:si_init] *ERROR* Failed to load firmware!<br />
[ 6.541653] radeon 0033:01:00.0: Fatal error during GPU init<br />
[ 6.541772] [drm] radeon: finishing device.<br />
...<br />
[ 7.123614] 6 ofpart partitions found on MTD device flash<br />
[ 7.123617] Creating 6 MTD partitions on "flash":<br />
[ 7.123621] 0x000000000000-0x000004000000 : "PNOR"<br />
[ 7.123773] 0x000001b21000-0x000003a21000 : "BOOTKERNEL"<br />
[ 7.123868] 0x000003b44000-0x000003b68000 : "CAPP"<br />
[ 7.123961] 0x000003b88000-0x000003b89000 : "VERSION"<br />
[ 7.124056] 0x000003b89000-0x000003bc9000 : "IMA_CATALOG"<br />
[ 7.124149] 0x000003f10000-0x000003ff0000 : "BOOTKERNFW"<br />
...<br />
$ rmmod amdgpu && modprobe amdgpu<br />
$ dmesg<br />
...<br />
[ 2727.836343] [drm] amdgpu kernel modesetting enabled.<br />
[ 2727.836900] amdgpu 0033:01:00.0: SI support provided by radeon.<br />
[ 2727.836905] amdgpu 0033:01:00.0: Use radeon.si_support=0 amdgpu.si_support=1 to override.<br />
</pre><br />
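<br />
The "Direct firmware load ... failed with error -2" lines in the log above name the files the driver was looking for (error -2 is ENOENT, "no such file"). A small filter can list them, which helps when deciding what to put into the image. This is a sketch only; <tt>missing_firmware</tt> is a hypothetical helper, and it assumes <tt>sed</tt> and <tt>awk</tt> are available in the shell you run it from.<br />

```shell
#!/bin/sh
# missing_firmware: read kernel log text on stdin and print each firmware
# path that a driver failed to load, once, in order of first appearance.
# Hypothetical helper; it matches the "Direct firmware load for X failed"
# message format shown in the dmesg excerpt above.
missing_firmware() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed.*/\1/p' |
        awk '!seen[$0]++'
}

# Typical use:
#   dmesg | missing_firmware
```

Note that the driver may try several spellings of the same file (lower and upper case in the log above), so not every listed name has to exist.<br />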
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Add_GPU_Firmware_To_BOOTKERNFW&diff=2468Add GPU Firmware To BOOTKERNFW2019-05-19T18:48:27Z<p>Bdragon: add some background information.</p>
<hr />
<div><br />
=Purpose=<br />
This guide explains how to add GPU (or other) firmware files to the BOOTKERNFW partition of the boot firmware flash.<br />
This allows Linux kernel drivers which require firmware blobs to function correctly within the Skiroot bootloader environment.<br />
<br />
For example, the <tt>amdgpu</tt> driver requires AMD firmware blobs to bring up AMD GPUs. Copying these files into the firmware partition enables attached displays to be brought up when showing the boot menu.<br />
<br />
=Background=<br />
Many cards require firmware to be loaded at startup in order to initialize. On an x86 system, this is usually accomplished by bootstrap code called an "Option ROM", or by extra code in the BIOS.<br />
<br />
On POWER, this is not possible: Option ROM code is platform-specific, and the POWER security stance is that trusting random bootstrap code on cards is a bad idea.<br />
<br />
Therefore, for devices such as video cards to work, the card's firmware must be loaded before the card can be used. The Linux video drivers in Petitboot can do this loading, but the firmware itself must be available so that the drivers have something to load.<br />
<br />
So, on RCS OpenPOWER systems, there is a special area of flash set aside for user-provided firmware files, called BOOTKERNFW. Due to the limited size of this partition, the user is required to make decisions as to what firmware files to install.<br />
<br />
=Applicability=<br />
All RCS OpenPOWER systems.<br />
<br />
=Instructions=<br />
<br />
==Step 1. Generate firmware partition image==<br />
Boot into an OS. A Linux environment is assumed. You will need the '''mksquashfs''' tool available.<br />
<br />
For Debian, you can install '''mksquashfs''' using the following command:<br />
$ sudo apt install squashfs-tools<br />
<br />
Create a directory with the required firmware files available:<br />
$ mkdir /tmp/firmware<br />
$ # ... (copy required files into /tmp/firmware) ...<br />
<br />
The directory structure under <tt>/tmp/firmware</tt> gets mounted at <tt>/lib/firmware</tt>. Here is an example of the required directory structure:<br />
/tmp/firmware/radeon/PITCAIRN_pfp.bin<br />
/tmp/firmware/amdgpu/polaris10_mc.bin<br />
<br />
You can obtain these firmware files from most Linux distros (they tend to be installed in <tt>/lib/firmware</tt>), or from the [https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/ linux-firmware repository].<br />
<br />
Having generated the correct directory structure, generate the image:<br />
$ cd /tmp/firmware<br />
$ mksquashfs * /tmp/firmware.bin -all-root -keep-as-directory<br />
<br />
<tt>firmware.bin</tt> is now a SquashFS image which you will copy to a firmware flash partition. You will need to access this image while in the [[Skiroot]] environment, so you may wish to copy it to e.g. <tt>/boot</tt> or another partition which you can easily access from [[Skiroot]].<br />
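<br />
Since the BOOTKERNFW partition is small, it is worth checking the image size before rebooting into [[Skiroot]]. The sketch below is one way to do that; <tt>image_fits</tt> is a hypothetical helper, and the limit <tt>0xe0000</tt> is taken from the size column of the <tt>/proc/mtd</tt> line quoted in Step 2, so it may differ on your firmware version.<br />

```shell
#!/bin/sh
# image_fits IMAGE MAX_BYTES: succeed (exit 0) if IMAGE exists and is no
# larger than MAX_BYTES; MAX_BYTES may be decimal or hex (0x...).
# Hypothetical helper, not part of squashfs-tools.
image_fits() {
    [ -f "$1" ] || return 2
    size=$(( $(wc -c < "$1") ))
    [ "$size" -le "$(( $2 ))" ]
}

# Typical use (0xe0000 is the size shown for BOOTKERNFW in this guide):
#   image_fits /tmp/firmware.bin 0xe0000 || echo "image too large"
```

If the image is too large, remove firmware files you do not need and regenerate it.<br />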
<br />
==Step 2. Copy firmware partition image to flash==<br />
Reboot into the [[Skiroot]] environment, attaching a display using the onboard VGA or HDMI port if necessary. When the [[Petitboot]] bootloader appears, select &#8220;Exit to Shell&#8221;.<br />
<br />
Make sure you can see the <tt>BOOTKERNFW</tt> partition (should return: <tt>mtd5: 000e0000 00010000 "BOOTKERNFW"</tt>):<br />
# cat /proc/mtd | grep BOOTKERNFW<br />
<br />
Find the <tt>flash_erase</tt> binary (mine was in <tt>/var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/</tt>):<br />
# find / -name flash_erase<br />
<br />
Erase <tt>/dev/mtd5</tt>:<br />
# /var/petitboot/mnt/dev/nvme0n1p2/usr/sbin/flash_erase /dev/mtd5 0 0<br />
<br />
Flash <tt>/dev/mtd5</tt>:<br />
# dd if=/var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin of=/dev/mtd5 bs=64k<br />
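<br />
Before rebooting, the write can be checked by reading the device back and comparing it with the image. The following is a sketch under assumptions: it presumes that <tt>head -c</tt> and <tt>cmp</tt> exist in the Petitboot shell and that reading <tt>/dev/mtd5</tt> returns the flash contents. <tt>verify_written</tt> is a made-up helper name.<br />

```shell
#!/bin/sh
# verify_written IMAGE TARGET: succeed if the first bytes of TARGET
# (as many as IMAGE contains) are identical to IMAGE.
# Hypothetical helper; TARGET can be a device node or a regular file.
verify_written() {
    [ -f "$1" ] || return 2
    size=$(( $(wc -c < "$1") ))
    head -c "$size" "$2" | cmp -s - "$1"
}

# Typical use after the dd step:
#   verify_written /var/petitboot/mnt/dev/nvme0n1p2/boot/firmware.bin /dev/mtd5 \
#       && echo "flash contents match"
```

Only the leading part of the partition is compared, because the image is normally smaller than the partition.<br />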
<br />
Reboot the system:<br />
# reboot<br />
<br />
When the system reboots, exit into the shell at the [[Petitboot]] menu again and check to see if the firmware made it as expected:<br />
# ls /lib/firmware/<br />
<br />
==Notes==<br />
If you are still getting an error message about not being able to find VGA BIOS, it may be because your GPU is being initialized before <tt>/lib/firmware</tt> is mounted. <br />
You can check this by running <tt>dmesg</tt> from the [[Skiroot]] shell. For example, my GPU was trying to initialize at around T+6.5 seconds and the <tt>mtd</tt> device wasn't actually mounted until T+7.25.<br />
I was able to load the firmware successfully after <tt>/dev/mtd5</tt> was mounted by running <tt>rmmod amdgpu && modprobe amdgpu</tt> from the shell. However, I experienced stability issues in Debian Stretch with <tt>amdgpu</tt>; if I let [[Skiroot]] fail to load the firmware and boot anyway, the <tt>radeon</tt> driver works fine.<br />
<br />
e.g.<br />
<pre><br />
$ dmesg<br />
...<br />
[ 6.412518] [drm] radeon kernel modesetting enabled.<br />
[ 6.412848] pci 0033:00:00.0: enabling device (0105 -> 0107)<br />
[ 6.412867] radeon 0033:01:00.0: enabling device (0140 -> 0142)<br />
[ 6.413077] [drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6819 0x1043:0x0431 0x00).<br />
[ 6.413086] radeon: No suitable DMA available<br />
[ 6.413150] [drm:radeon_device_init] *ERROR* Unable to find PCI I/O BAR<br />
[ 6.533321] [drm:radeon_atombios_init] *ERROR* Unable to find PCI I/O BAR; using MMIO for ATOM IIO<br />
[ 6.533415] ATOM BIOS: 6819.15.17.0.0.AS01<br />
[ 6.533434] [drm] GPU not posted. posting now...<br />
[ 6.541203] radeon 0033:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)<br />
[ 6.541208] radeon 0033:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF<br />
[ 6.541211] [drm] Detected VRAM RAM=2048M, BAR=256M<br />
[ 6.541213] [drm] RAM width 256bits DDR<br />
[ 6.541295] [TTM] Zone kernel: Available graphics memory: 50118240 kiB<br />
[ 6.541297] [TTM] Zone dma32: Available graphics memory: 2097152 kiB<br />
[ 6.541299] [TTM] Initializing pool allocator<br />
[ 6.541371] [drm] radeon: 2048M of VRAM memory ready<br />
[ 6.541374] [drm] radeon: 2048M of GTT memory ready.<br />
[ 6.541390] [drm] Loading pitcairn Microcode<br />
[ 6.541429] radeon 0033:01:00.0: Direct firmware load for radeon/pitcairn_pfp.bin failed with error -2<br />
[ 6.541456] radeon 0033:01:00.0: Direct firmware load for radeon/PITCAIRN_pfp.bin failed with error -2<br />
[ 6.541459] si_cp: Failed to load firmware "radeon/PITCAIRN_pfp.bin"<br />
[ 6.541560] [drm:si_init] *ERROR* Failed to load firmware!<br />
[ 6.541653] radeon 0033:01:00.0: Fatal error during GPU init<br />
[ 6.541772] [drm] radeon: finishing device.<br />
...<br />
[ 7.123614] 6 ofpart partitions found on MTD device flash<br />
[ 7.123617] Creating 6 MTD partitions on "flash":<br />
[ 7.123621] 0x000000000000-0x000004000000 : "PNOR"<br />
[ 7.123773] 0x000001b21000-0x000003a21000 : "BOOTKERNEL"<br />
[ 7.123868] 0x000003b44000-0x000003b68000 : "CAPP"<br />
[ 7.123961] 0x000003b88000-0x000003b89000 : "VERSION"<br />
[ 7.124056] 0x000003b89000-0x000003bc9000 : "IMA_CATALOG"<br />
[ 7.124149] 0x000003f10000-0x000003ff0000 : "BOOTKERNFW"<br />
...<br />
$ rmmod amdgpu && modprobe amdgpu<br />
$ dmesg<br />
...<br />
[ 2727.836343] [drm] amdgpu kernel modesetting enabled.<br />
[ 2727.836900] amdgpu 0033:01:00.0: SI support provided by radeon.<br />
[ 2727.836905] amdgpu 0033:01:00.0: Use radeon.si_support=0 amdgpu.si_support=1 to override.<br />
</pre><br />
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Compiling_Firmware&diff=2419Compiling Firmware2019-05-04T20:33:34Z<p>Bdragon: /* Building the firmware */ add git submodule update because people tend to follow this list of instructions after switching branches.</p>
<hr />
<div>==Purpose==<br />
The following steps can be used to compile and update the firmware on [[Raptor Computing Systems|Raptor CS]]'s [[OpenPOWER]] systems, such as the [[Talos II]] or [[Blackbird]].<br />
<br />
==Applicability==<br />
All RCS [[OpenPOWER]] systems.<br />
<br />
==Requirements==<br />
* At least 25GB of free hard drive space<br />
* 16GB of free RAM<br />
<br />
===Building on Debian===<br />
The build system (op-build) has been primarily tested using Debian Stretch. Ensure you install the following packages:<br />
<br />
# Packages needed for OpenPOWER Firmware builds<br />
$ sudo apt install cscope ctags libz-dev libexpat-dev python texinfo build-essential g++ git bison flex unzip libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc wget bc rsync<br />
<br />
# Packages needed for OpenBMC builds<br />
$ sudo apt install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
<br />
===Building on other Linux distributions===<br />
If you are on a different distribution, such as Fedora 28, a Debian chroot is recommended:<br />
$ sudo yum install debootstrap dpkg<br />
$ sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian<br />
$ sudo mount -t proc none debian-chroot/proc/<br />
$ sudo mount -o bind /sys/ debian-chroot/sys/<br />
$ sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/<br />
<br />
Enter the chroot and install the needed packages:<br />
$ sudo chroot debian-chroot/<br />
# apt install software-properties-common locales<br />
<br />
# Packages needed for OpenPOWER Firmware builds<br />
# apt install cscope ctags libz-dev libexpat-dev python texinfo build-essential g++ git bison flex unzip libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc wget bc rsync<br />
<br />
# Packages needed for OpenBMC builds<br />
# apt install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
<br />
Also create a user inside the chroot to build under:<br />
$ useradd -m build-user -s /bin/bash<br />
$ su build-user<br />
$ cd<br />
<br />
You can now use the chroot to build the firmware.<br />
<br />
To enter the chroot in the future, you can run the following from any terminal:<br />
sudo chroot debian-chroot/<br />
su build-user<br />
cd<br />
<br />
==Building the OpenPOWER Firmware==<br />
===Downloading the sources===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public Git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
git clone -b raptor-v1.05 --recursive https://scm.raptorcs.com/scm/git/talos-op-build<br />
<br />
'''Note:''' The <tt>master</tt> branch is often in a non-functional state. The latest firmware branch (<tt>raptor-v1.05</tt> at the time of this update) should be used.<br />
<br />
===Building the firmware===<br />
Before building the firmware, check the <tt>README.md</tt> file to ensure that all needed packages are installed.<br />
<br />
The firmware can then be built using the following commands:<br />
$ cd talos-op-build<br />
$ git submodule update<br />
$ . op-build-env<br />
$ op-build talos_defconfig<br />
$ op-build<br />
<br />
You can pass <tt>-j&lt;num-cores&gt;</tt> to perform a parallel build (<tt>op-build</tt> invokes <tt>make</tt>), though this may result in very high memory usage.<br />
<br />
If the build completes successfully, the final firmware image is at <tt>output/images/talos.pnor</tt>.<br />
<br />
===Rebuilding an individual package===<br />
To rebuild an individual package (such as Hostboot) and recreate the <tt>talos.pnor</tt> image, run:<br />
$ op-build <em>pkgname</em>-rebuild openpower-pnor-rebuild<br />
where <tt><em>pkgname</em></tt> is the name of the package to rebuild.<br />
<br />
For example:<br />
$ op-build hostboot-rebuild openpower-pnor-rebuild<br />
<br />
'''Note:''' When recompiling Hostboot into a PNOR image with <tt>openpower-pnor-rebuild</tt>, it is usually advisable to force a machine XML rebuild as well:<br />
<nowiki>$ rm -rf output/build/machine-xml-*<br />
$ rm -rf output/build/hostboot-*<br />
$ ./op-build openpower-pnor-rebuild</nowiki><br />
<br />
==Installing the OpenPOWER firmware==<br />
===Transfer image to BMC===<br />
Copy the firmware to the BMC:<br />
$ scp ./output/images/talos.pnor root@$TALOS_BMC_ADDR:/tmp/<br />
<br />
===Establish BMC sessions===<br />
At this point, you should connect two SSH sessions to OpenBMC.<br />
In the first session, run the following to display the console during boot:<br />
$ ssh -p 2200 root@$TALOS_BMC_ADDR<br />
<br />
The console log will be useful in debugging any issues with the firmware that could occur.<br />
<br />
In the second session, get a shell on the BMC via SSH:<br />
$ ssh root@$TALOS_BMC_ADDR<br />
root@talos:~#<br />
<br />
'''Ensure the system is off''' before proceeding:<br />
root@talos:~# obmcutil state<br />
CurrentBMCState : xyz.openbmc_project.State.BMC.BMCState.Ready<br />
CurrentPowerState : xyz.openbmc_project.State.Chassis.PowerState.Off<br />
CurrentHostState : xyz.openbmc_project.State.Host.HostState.Off<br />
<br />
The CurrentHostState must be Off before continuing with the procedure.<br />
If the CurrentHostState is not Off, please turn off the machine:<br />
root@talos:~# obmcutil chassisoff<br />
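<br />
The state check above can also be scripted, so that later steps refuse to run while the host is up. This is a sketch only: <tt>host_is_off</tt> is a hypothetical helper that parses the <tt>obmcutil state</tt> output format shown above, and it assumes <tt>awk</tt> is present on the BMC.<br />

```shell
#!/bin/sh
# host_is_off: read `obmcutil state` output on stdin and succeed only if
# the CurrentHostState line ends in ".Off".
# Hypothetical helper; relies on the output format shown in this guide.
host_is_off() {
    awk '/CurrentHostState/ { found = ($NF ~ /\.Off$/) }
         END { exit !found }'
}

# Typical use on the BMC:
#   obmcutil state | host_is_off && echo "host is off, safe to proceed"
```

If the <tt>CurrentHostState</tt> line is missing altogether, the helper reports failure, which errs on the side of caution.<br />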
<br />
===Running the firmware temporarily===<br />
You can test the firmware without installing it, though this requires [https://gerrit.openbmc-project.xyz/#/c/openbmc/mboxbridge/+/14384/ rebuilding OpenBMC to use a modified <tt>mboxd</tt> binary].<br />
<br />
First, stop <tt>mboxd</tt>:<br />
root@talos:~# systemctl stop mboxd<br />
<br />
Restart <tt>mboxd</tt> with the additional <tt>-b</tt> argument:<br />
root@talos:~# mboxd -f 64M -w 1M -b /tmp/talos.pnor<br />
<br />
You can now test the new firmware image by starting the machine:<br />
root@talos:~# obmcutil poweron<br />
<br />
When you have finished testing the image, stop the machine:<br />
root@talos:~# obmcutil poweroff<br />
<br />
'''Note:''' Ensure the machine is off before proceeding. Verify this by running <tt>obmcutil state</tt>.<br />
<br />
Finally, terminate <tt>mboxd</tt> and restart the normal <tt>mboxd</tt>:<br />
root@talos:~# systemctl start mboxd<br />
<br />
===Flashing the firmware===<br />
Ensure the system is off.<br />
<br />
Perform the update:<br />
root@talos:~# pflash -E -p /tmp/talos.pnor<br />
<br />
Start the machine:<br />
root@talos:~# obmcutil poweron<br />
<br />
'''Note:''' The machine may reboot multiple times when first booted after a firmware update. This is normal; do not interrupt the process.<br />
<br />
==Troubleshooting the OpenPOWER Firmware==<br />
<br />
===General advice===<br />
<br />
;Always upgrade PNOR and BMC together<br />
:Many mismatched PNOR/BMC version combinations lead to weird failures.<br />
<br />
;Try downgrading the PNOR+BMC firmware<br />
:Firmware package 1.04 seems the most reliable at updating the SBE SEEPROM inside the POWER9 chip package.<br />
<br />
;Always use processor socket 0 for SBE updates<br />
:The BMC firmware and/or FSI driver seem to forget to update the SBE SEEPROM in the second CPU socket, leading to a boot with only CPU 0 active. When you get a brand-new chip, install it in CPU socket 0 and leave socket 1 empty, wait for the double reboot to update the SEEPROM, and then move that chip to socket 1 if you like.<br />
<br />
;Try unplugging the HSF fan power during SBE update<br />
:Not kidding about this. The BMC is insanely complicated &mdash; it's got an entire operating system in there for some reason. It even has systemd. The BMC's systemd often gets into a funky loop restarting <tt>hwmon</tt> over and over and over, interrupting the SBE SEEPROM reflash every time it does this. Unplugging the PROC0 HSF 4-pin connector gets it to fail hard (due to inability to read the tachometer) and stay failed so the SBE update can proceed. Ugly as this is, it's easier than trying to figure out what systemd thinks it's doing.<br />
<br />
===SBE_MASTER_VERSION_DOWNLEVEL===<br />
If you see the following message reported in the console, then the SBE update process did not work as expected:<br />
16.74709|Error reported by sbe (0x2200) PLID 0x90000008<br />
16.74823| SBE Image Version Miscompare with Master Target<br />
16.74824| ModuleId 0x0d SBE_MASTER_VERSION_COMPARE<br />
16.74825| ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL<br />
16.74826| UserData1 Master Target HUID : 0x0000000000050000<br />
16.74826| UserData2 Master Target Loop Index : 0x0000000000000000<br />
<br />
The machine needs to be reset to finish the update procedure:<br />
root@talos:~# obmcutil chassisoff<br />
root@talos:~# systemctl stop xyz.openbmc_project.State.Host.service<br />
root@talos:~# systemctl start xyz.openbmc_project.State.Host.service<br />
root@talos:~# obmcutil poweron<br />
<br />
The update should now complete as expected.<br />
<br />
A [https://github.com/open-power/sbe/issues/7 bug report] is open to track this issue.<br />
<br />
===internal compiler error: Killed===<br />
Building the Hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:<br />
powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)<br />
To continue you have a few options:<br />
* Reduce the number of parallel jobs by appending <tt>-j&lt;num&gt;</tt> to your build command line, for example:<br />
op-build -j4<br />
* Increase the swap space (not recommended)<br />
* Install additional RAM<br />
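<br />
One way to choose the job count is to cap it by available memory as well as by core count. The sketch below illustrates the arithmetic only; <tt>pick_jobs</tt> is a hypothetical helper, and the 4 GiB-per-job figure in the usage comment is a placeholder assumption, not a measured requirement of the Hostboot build.<br />

```shell
#!/bin/sh
# pick_jobs CORES MEM_KIB PER_JOB_KIB: print a parallel job count that is
# at most CORES and at most MEM_KIB / PER_JOB_KIB, but never less than 1.
# Hypothetical helper; the per-job memory budget is the caller's guess.
pick_jobs() {
    n=$(( $2 / $3 ))
    if [ "$n" -gt "$1" ]; then n=$1; fi
    if [ "$n" -lt 1 ]; then n=1; fi
    echo "$n"
}

# Typical use on a Linux build host, budgeting 4 GiB (4194304 KiB) per job
# as a placeholder value:
#   mem=$(awk '/MemAvailable/ { print $2 }' /proc/meminfo)
#   op-build -j"$(pick_jobs "$(nproc)" "$mem" 4194304)"
```

If the build is still killed, lower the per-job budget's implied job count further by raising the third argument.<br />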
<br />
== Building the OpenBMC firmware ==<br />
=== Downloading the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public Git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
$ git clone -b raptor-v{{CURRENT_BMC_VERSION}} https://git.raptorcs.com/git/talos-openbmc<br />
<br />
=== Building the firmware ===<br />
Ensure that all needed support packages are installed. See the <tt>README.md</tt> for information on needed packages.<br />
<br />
The firmware can then be built using the following commands:<br />
$ cd talos-openbmc<br />
$ export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf<br />
$ . openbmc-env<br />
$ bitbake obmc-phosphor-image<br />
<br />
The resulting firmware image can then be found in the <tt>tmp/deploy/images/talos/</tt> directory.<br />
<br />
'''Note:''' If <tt>mboxd</tt> fails to build, you may need to [https://github.com/openbmc/openbmc/issues/2780 patch <tt>mboxd.bb</tt>].<br /><br />
'''Note:''' For newer versions of the firmware, TEMPLATECONF has changed to <tt>TEMPLATECONF=meta-rcs/meta-talos/conf</tt>. Set it before running <code>. openbmc-env</code>. If it was not set, do a git clean and start over with the new TEMPLATECONF.<br />
<br />
===Installing the firmware===<br />
Once firmware has been built, the resulting <tt>image-kernel</tt> and <tt>image-rofs</tt> binaries must be copied to <tt>/run/initramfs/</tt> on the BMC:<br />
$ scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@$TALOS_BMC_ADDR:/run/initramfs/<br />
<br />
Once the images have been transferred, reboot the BMC. The new firmware files will be detected and automatically applied.<br />
root@talos:~# reboot<br />
<br />
The reboot may take some time. Once complete, you will be able to log back in via SSH.<br />
<br />
===Recovering from failed firmware updates===<br />
See [[Debricking the BMC]].<br />
<br />
[[Category:Guides]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=POWER9_Hardware_Compatibility_List/PCIe_Devices&diff=2148POWER9 Hardware Compatibility List/PCIe Devices2019-03-12T05:24:32Z<p>Bdragon: /* Working */ adding SSD7101A-1</p>
<hr />
<div><br />
==Compatibility rules==<br />
In general, any PCIe device will work provided that an open source driver is available for your operating system. There are some exceptions:<br />
<br />
* '''Hardware bugs.''' POWER does not permit errant DMA accesses. If a device tries to access areas of host memory which it is not permitted to access, the device is shut down immediately. This is dissimilar to x86 platforms, which simply silently ignore such attempts. Some badly designed I/O devices have bugs causing them to attempt DMA accesses to random areas of host memory; these devices are unlikely to function correctly on POWER systems unless a workaround is available.<br />
* '''I/O space.''' Starting with [[POWER9]], access to the legacy PCI I/O space is no longer supported; devices or drivers which rely on this will not function. The legacy I/O space has been deprecated for as long as PCIe has existed; generally this will only affect very old PCIe devices which use PCIe to PCI bridge chips to attach old PCI devices to the bus. A small subset of these devices may require legacy I/O space support.<br />
* '''Incomplete memory addressing support.''' The PCIe architecture specifies a 64-bit address space. Some I/O devices try to economize on this by only implementing e.g. 40 bits for their addressing, rendering them incapable of addressing host memory which lies above address 2<sup>40</sup>. (Firmware patches to work around this are pending.)<br />
* '''Bifurcation limits.''' Arbitrary PCIe lane bifurcation is not supported. Devices which split a PCIe slot into multiple connectors (for example, PCIe to M.2 adaptors) will not work unless they have a PCIe switch chip, although the first connector will generally work.<br />
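<br />
To make the addressing limits above concrete, the amount of memory reachable with a given number of address bits can be computed directly. This is a worked arithmetic sketch; <tt>addr_limit_gib</tt> is an illustrative name, and it assumes a shell with 64-bit arithmetic.<br />

```shell
#!/bin/sh
# addr_limit_gib BITS: print how many GiB a device can address when it
# implements only BITS address bits (2^BITS bytes / 2^30).
# Illustrative helper; assumes the shell uses 64-bit arithmetic.
addr_limit_gib() {
    echo $(( (1 << $1) / (1 << 30) ))
}

addr_limit_gib 40   # prints 1024 (GiB, i.e. 1 TiB)
addr_limit_gib 32   # prints 4
```

A device limited to 40 bits therefore cannot reach host memory at or above the 1 TiB mark, and a device limited to 32-bit DMA is confined to the lowest 4 GiB.<br />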
<br />
==SAS/SATA Storage Controllers ==<br />
===Working===<br />
* IOCrest SI-PEX40062 (Chipset: Marvell 88SE9235, PCI id 1B4B:9235)<br />
* Kouwell PE-115H (Chipset: Marvell 88SE9130, PCI id 1b4b:9130)<br />
* LSI 9300/9200 SAS HBAs<br />
** May require updating to IT firmware on an x86 machine<br />
* [[PM8068]]-based SAS HBAs <br />
* Supermicro AOC-SLG3-4E2P 4-port OCuLink adapter<br />
* Jmicron JMB 363 SATA PCIe card. SATA ports work with Petitboot.<br />
<br />
===Non-working===<br />
* AXAGON PCES-SA2 (ASMedia chipset)<br />
* SuperMicro AOC-SASLP-MV8 (mvsas driver)<br />
<br />
==NICs==<br />
===Working===<br />
* Broadcom [[BCM5719]]<br />
* Chelsio T6225-SO-CR<br />
* Mellanox ConnectX-6 EN 200Gb/s Adapter Card ''(supports [[CAPI]])''<br />
<br />
===Non-working===<br />
* Mellanox ConnectX IB QDR (mlx4 driver)<br />
<br />
==NVMe Drives==<br />
* Samsung 950 PRO (with M.2 to PCIe adapter)<br />
* Samsung 960 EVO / PRO (with M.2 to PCIe adapter)<br />
* Samsung 970 PRO (with M.2 to PCIe adapter)<br />
* Intel Optane 900P NVMe XPoint PCIe<br />
* Intel Optane 905P NVMe XPoint PCIe AIC<br />
* WD Black PCIe (with M.2 to PCIe adapter)<br />
* MyDigitalSSD BPX 480GB (with M.2 to PCIe adapter)<br />
<br />
==PCIe to M.2 Adapters==<br />
===Working===<br />
* [http://ableconn.com/products_2.php?gid=62 Ableconn PEXM2-SSD M.2 NGFF PCIe SSD to PCI Express 3.0 x4 Host Adapter Card (M.2 to PCIe adapter)]<br />
* [http://www.delock.com/produkte/G_89370/merkmale.html Delock PCI Express x4 Card > 1 x internal NVMe M.2 Key M 80 mm - Low Profile Form Factor]<br />
* [https://www.newegg.com/Product/Product.aspx?Item=9SIA4RE5AU2769 JEYI SK4 M.2 NVMe(M Key) SSD to PCI-E 3.0 x4 Adapter Converter Card]<br />
* [https://www.newegg.com/Product/Product.aspx?Item=N82E16815124167 SYBA SI-PEX40110 M.2 PCI-e To PCI-e 3.0 x4]<br />
* [http://highpoint-tech.com/USA_new/series-ssd7101a-1-overview.htm HighPoint SSD7101A-1] 4x M.2 PCIe to PCIe 3.0 x16 (based on PLX PEX8747 PCIe switch)<br />
** Works without special drivers as a PCIe switch. NVMEs are detected and work just fine. Petitboot is able to boot attached NVMEs with no problems. Tested in FreeBSD. -- [[User:Bdragon|Bdragon]] ([[User talk:Bdragon|talk]])<br />
* [http://highpoint-tech.com/USA_new/series-ssd7102-overview.htm HighPoint SSD 7102] 4x M.2 PCIe to PCIe 3.0 x16 (with PCIe switch)<br />
<br />
===Partially working===<br />
* [https://www.amazon.com/gp/product/B074WV4ZN4 Aplicata Quad M.2 NVMe SSD PCIe x16 Adapter] (no PCIe switch; only lowest slot works)<br />
<br />
== Graphics Cards ==<br />
<br />
No display? Check out the [[Troubleshooting/GPU|GPU Troubleshooting]] page.<br />
<br />
=== AMD ===<br />
<br />
All AMD GPUs currently have DMA issues (limited to 32-bit, which can cause crashes) with the current Talos II firmware.<br />
This is expected to be fixed in future firmware updates.<br />
<br />
* AMD Radeon HD 5850 - Must disable onboard VGA first. Currently has issues with only using 32-bit DMA.<br />
* AMD Radeon HD 6450 - Works with default settings (kernel: radeon, X: modesetting or radeon), tested in BE mode<br />
* AMD Radeon HD 6850 - Disable AST VGA with jumper. 32 bit.<br />
* AMD Radeon HD 7850 - Disabled onboard VGA. Using amdgpu is highly unstable, radeon driver is usable but has issues with only using 32-bit DMA.<br />
* AMD Radeon HD 7950 - Must disable onboard VGA first. Currently has issues with only using 32-bit DMA.<br />
* AMD Radeon R5 220<br />
* AMD Radeon R5 230 - Works in BE mode (use <code>Option "AccelMethod" "EXA"</code> for Xorg)<br />
* AMD Radeon R7 240<br />
* Radeon R9 290X<br />
* AMD Radeon Pro WX7100 (Polaris10 core) - Available pre-installed on Talos II workstation, server, and desktop configurations.<br />
* AMD Radeon Pro WX5100<br />
* AMD Radeon Pro WX4100 (Polaris11 core) - May need at least linux 4.16 in order to get Xorg to work.<br />
* AMD RX Vega 56 - Works with Debian Buster with amdgpu. Requires patches to work, somewhat unstable but usable. Cannot use AST Integrated VGA and AMDGPU at the same time without causing conflict. Not tested at this moment for use in petitboot or firmware. <br />
<br />
The core name is important when storing the firmware into the BOOTKERNFW partition in PNOR for use by skiroot.<br />
<br />
=== NVIDIA ===<br />
* NVIDIA Corporation G96 [GeForce 9500 GT] (rev a1) - Works in petitboot if onboard VGA is disabled. Currently has issues with only using 32-bit DMA. No firmware needed.<br />
* NVIDIA RTX 2070 - usable for compute, but not 3D acceleration; integrated by Raptor as part of the Talos II PowerAI Development System configuration<br />
<br />
=== Other ===<br />
* EVGA 100-U2-UV12-A1 UV Plus USB VGA Adapter - DisplayLink Based - Petitboot shows up without loading firmware. Not tested in OS.<br />
<br />
== Sound Cards ==<br />
<br />
* Creative Sound Blaster Audigy FX SB1570 PCIe 5.1 Sound Card<br />
* Creative Sound Blaster X-Fi Xtreme Fidelity PCIe Audio Sound Card (SB0880)<br />
* AMD Radeon HD 5850 and 7950 (HDMI audio)<br />
* [http://www.vantecusa.com/products_detail.php?p_id=156&p_name=+USB+Stereo+Audio+Adapter&pc_id=9&pc_name=Adapters&pt_id=3&pt_name=Audio+%2B++Video#tab-1 VANTEC NBA-120U (USB)]<br />
* [http://mackie.com/products/onyx-blackjack Mackie Onyx Blackjack (USB) Recording Interface]<br />
* RME HDSPe AIO (FreeBSD tested)<br />
* Leveraged Sabrent Bluetooth 4.0 USB adapter (model BT-UB40) to connect to wireless Bluetooth headphones, specifically Bose Quiet Comfort 35.<br />
<br />
==USB Host Controllers==<br />
===Working===<br />
* Insignia USB 3.0 PCI-e NS-PCCUP53 V1.0 (Chipset: NEC D720202)<br />
* AGAXO PCEU-23R (Chipset: Renesas uPD720202, PCI id 1912:0015)<br />
* Terminus Technology Inc. FE 2.1 7-port Hub<br />
* [https://www.sonnettech.com/product/allegroprousb3pcie.html Sonnet Allegro Pro USB 3.0 PCIe USB3-PRO-4PM-E] (Chipset: Four [http://www.frescologic.com/product/single/fl1100ex/ Fresco Logic FL1100EX] controllers behind one [https://www.broadcom.com/products/pcie-switches-bridges/pcie-switches/pex8608 PLX PEX 8608] switch)<br />
<br />
===Non-working===<br />
In general, USB3 host controllers based on ASMedia chipsets are known to be problematic, due to ASMedia hardware or firmware bugs causing errant DMA accesses to invalid regions of host memory.<br />
<br />
* AXAGON PCEU-43V - chipset Via VL805 - PCI id 1106:3483<br />
* StarTech PEXUSB314A2V - 2x ASM1142 host controllers and a PCIe switch<br />
* QNINE USB 3.1 Gen2 (Type-A and Type-C) - ASM1142<br />
** It's based on the same reference design as all the other cheap ASM1142 cards, so there's a good chance those won't work either.<br />
* Rosewill RC-509 - ASM1142<br />
* Ableconn PU31-2C-2 - ASM2142<br />
<br />
==TV Tuners==<br />
* Hauppauge WinTV-quadHD<br />
* Hauppauge WinTV HVR-850 (2040:7240) - ATSC - using Kaffeine</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Category:Schematics&diff=1599Category:Schematics2018-09-16T15:41:33Z<p>Bdragon: Add a note about the circuit diagram being found on the Recovery DVD, just in case anyone comes here looking for it.</p>
<hr />
<div>Full mainboard circuit diagram is only available to Talos II or Talos II Lite owners, and may be found on the Recovery DVD included with your system or mainboard.<br />
<br />
[[Category:Documentation]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Talos_II&diff=1598Talos II2018-09-16T15:15:17Z<p>Bdragon: /* Systems */ Add the full production entry-level developer system SKU.</p>
<hr />
<div>'''Talos™ II''' is [[Raptor Computing Systems|Raptor Computing Systems]]' next-generation [[POWER9|POWER9]] platform. Focusing on security and performance, it is a dual socket [[PowerNV|PowerNV]] system that is available in desktop, workstation, and server form factors, or as bare mainboard.<br />
<br />
Talos II is the successor to the proposed [[Talos I|Talos I]] system.<br />
<br />
== Mainboard ==<br />
<br />
=== Specifications ===<br />
<br />
[[File:Talos_ii_rev_3_7_block_diagram.png|thumb|T2P9D01 Block Diagram]]<br />
<br />
{| class="wikitable sortable"<br />
! style="text-align:left;"| Mainboard Part #<br />
! style="text-align:left;"| Form Factor<br />
! style="text-align:left;"| CPU Type<br />
! style="text-align:left;"| Networking<br />
! style="text-align:left;"| Storage Controller<br />
|-<br />
|T2P9D01<br />
|[[EATX|EATX]]<br />
|[[POWER9|POWER9]] [[Sforza|Sforza]]<br />
|2x GbE (Broadcom [[BCM5719|BCM5719]])<br />
|SAS (optional, Microsemi [[PM8068|PM8068]])<br />
|}<br />
<br />
=== T2P9D01 Configuration SKUs ===<br />
<br />
[[File:Talos_ii_rev_1.00_non-sas.png|thumb|TL2MB1 without optional SAS controller]]<br />
<br />
{| class="wikitable sortable"<br />
! style="text-align:left;"| SKU<br />
! style="text-align:left;"| Sockets<br />
! style="text-align:left;"| RAM slots<br />
! style="text-align:left;"| PCIe 4.0 slots<br />
! style="text-align:left;"| SAS Controller<br />
|-<br />
|TL1MB1<br />
|1<br />
|8<br />
|1 x16, 1 x8<br />
|Does not change SKU<br />
|-<br />
|TL2MB1<br />
|2<br />
|16<br />
|3 x16, 2 x8<br />
|Does not change SKU<br />
|}<br />
<br />
[[:File:T2P9D01 users guide version 1 0.pdf|User's Guide for T2P9D01]] is available.<br />
<br />
== Systems ==<br />
<br />
In addition to being available as a mainboard, prebuilt systems are also available.<br />
<br />
{| class="wikitable sortable"<br />
! style="text-align:left;"| SKU<br />
! style="text-align:left;"| Name<br />
|-<br />
|TL2WK2<br />
|Talos II Secure Workstation<br />
|-<br />
|TL2SV1<br />
|Talos II 2U Rack Mount Development Platform<br />
|-<br />
|TL2SV2<br />
|Talos II 4U Rack Mount Development Platform<br />
|-<br />
|TL2DS1<br />
|Talos II Desktop Development System<br />
|-<br />
|TLSDS1<br />
|Talos II Special Developer System (DD2.1 stepping CPU)<br />
|-<br />
|TLSDS3<br />
|Talos II Entry-Level Developer System<br />
|-<br />
|TL1BC1<br />
|Talos II Lite Base Single CPU Board and Chassis<br />
|}<br />
<br />
== Energy consumption ==<br />
<br />
NOTE: numbers are approximate and may vary. System power measured at wall with standard instruments. CPU power reported by [[OCC|OCC]].<br />
<br />
{| class="wikitable sortable"<br />
! style="text-align:left;"| Component<br />
! style="text-align:left;"| Design Power<br />
! style="text-align:left;"| Power Use (Idle)<br />
! style="text-align:left;"| Power Use (Full Load)<br />
! style="text-align:left;"| Additional Information<br />
|-<br />
|4-core CPU (DD2.1, 3.1/3.7GHz)<br />
|90W<br />
|31W<br />
|58W<br />
|CPU held at 50°C die temperature. WoF not yet boosting to full power.<br />
|-<br />
|PM8068 SAS controller<br />
|23W<br />
|<br />
|<br />
|-<br />
|Desktop Development System (TL2DS1)<br />
|<br />
|111W<br />
|192W<br />
|2x 4-core CPUs, 2x 16GB RAM, 1x 1TB NVMe drive, integrated PM8068 SAS. 80% efficiency PSU.<br />
|}<br />
<br />
== How To ==<br />
* [[Talos II/Configure Power Restore States|Configure Power Restore States]]<br />
* [[Talos II/Add GPU Firmware To BOOTKERNFW|Add GPU Firmware To BOOTKERNFW]]<br />
* [[:File:TalosII_SystemAssembly_nashimus_v3.mp4|System Assembly - SC747TG-R1400B-SQ Chassis]]<br />
<br />
== See also ==<br />
<br />
* [[Talos II/Hardware Compatibility List|Talos II Hardware Compatibility List]]<br />
* [[Talos_II/Firmware|Talos II Firmware Updates]]<br />
* [[Troubleshooting/GPU|Troubleshooting/GPU]]<br />
* [[Troubleshooting/BMC Power]]<br />
* [[Talos I|Talos I]]<br />
<br />
== External Links ==<br />
<br />
* [https://raptorcs.com/TALOSII/ Raptor Computing Systems page for Talos II] - information on currently available systems and ordering a Talos™ II machine<br />
<br />
[[Category:Raptor Computing Systems (RCS) Platforms]]</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC/Watchdog&diff=1517Debricking the BMC/Watchdog2018-08-30T21:34:59Z<p>Bdragon: fix bug pointed out by awilfox: make sure and return properly so the stack doesn't get damaged.</p>
<hr />
<div>Once standby power has been applied to the Talos II (i.e. you have plugged it in), the FPGA waits for the power to stabilize, and then signals the BMC to reset.<br />
<br />
At this point, a watchdog counter in the FPGA begins running. This is done because of a hardware issue in the ast2500, where very occasionally the chip would get stuck when attempting to start.[https://git.raptorcs.com/git/talos-system-fpga/commit/main.v?id=e90ca898402a250e9d2f6e303e25ddaceb0cf8d6] If the BMC boot gets stuck in the U-Boot phase, the FPGA will force a reset after approximately 40 seconds. This interferes with BMC recovery, as that leaves you with a limited window to run commands.<br />
<br />
U-Boot on the Talos II has been slightly modified to disable this watchdog counter immediately before transferring control to the OpenBMC kernel.[https://git.raptorcs.com/git/talos-obmc-uboot/tree/common/bootm.c?id=cfee45a3ef2a592d130573ce3b7d8bcfe056060b#n707]<br />
<br />
Since we will want to spend more than 40 seconds at the U-Boot shell during recovery, we need to disable this watchdog ourselves.<br />
<br />
However, general GPIO support is not built into the Talos II U-Boot loader, so the hardware must be manipulated directly.<br />
<br />
Using an ARM cross-compiler, we can build a tiny program to do the same thing as the bootm.c code.<br />
<br />
== Payload creation ==<br />
<br />
<source lang="c"><br />
/* watchdog.c - minimal code to disable the FPGA watchdog.<br />
* Derived from Raptor Engineering changes to U-Boot common/bootm.c.<br />
* SPDX-License-Identifier: GPL-2.0+<br />
*/<br />
#include <stdint.h><br />
int main() {<br />
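/* Memory-mapped registers: cast the addresses explicitly and mark them volatile so the accesses are not optimized away. */<br />
volatile uint32_t* gpio_ctl_reg = (volatile uint32_t*)0x1e780084;<br />
volatile uint32_t* gpio_data_reg = (volatile uint32_t*)0x1e780080;<br />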
<br />
*gpio_ctl_reg |= 0x00800000;<br />
*gpio_data_reg &= ~0x00800000;<br />
return 0;<br />
}<br />
</source><br />
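<br />
The constant <code>0x00800000</code> in the payload corresponds to GPIOS7 (GPIO 151). As a rough sketch of where that number comes from, the helpers below assume that GPIO banks are lettered from A with 8 pins each, and that four consecutive banks (here Q, R, S, T) share one 32-bit register with the lowest letter in the low byte. The function names are made up for illustration and are not part of U-Boot or OpenBMC.<br />

```c
#include <stdint.h>

/* Hypothetical helper: linear GPIO number for a single-letter bank (A..Z)
 * and a pin (0..7), assuming 8 pins per bank. */
int gpio_number(char bank, int pin)
{
    return (bank - 'A') * 8 + pin;
}

/* Hypothetical helper: bit mask of that pin inside its 32-bit register,
 * assuming four banks are packed into each register, lowest letter first. */
uint32_t gpio_bit_mask(char bank, int pin)
{
    int byte_in_register = (bank - 'A') % 4;   /* S is the third bank of Q/R/S/T */
    return (uint32_t)1 << (byte_in_register * 8 + pin);
}
```

Under these assumptions, <code>gpio_number('S', 7)</code> gives 151 and <code>gpio_bit_mask('S', 7)</code> gives <code>0x00800000</code>, matching the values used in <code>watchdog.c</code>.<br />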
<br />
Compiling this to an object and then converting it to an s-record will give you a file that can be directly loaded into U-Boot without using special tools.<br />
<br />
<code>$ arm-linux-gnueabihf-gcc -ffreestanding -march=armv6 -mfloat-abi=soft -marm -Os -c watchdog.c</code><br />
<br />
<code>$ arm-linux-gnueabihf-objcopy -O srec watchdog.o watchdog.srec</code><br />
<br />
<code>$ cat watchdog.srec</code><br />
<br />
<pre>S00E0000746F67676C652E7372656394<br />
S11300001C309FE50000A0E3842093E5022582E3F1<br />
S1130010842083E5802093E50225C2E3802083E5E4<br />
S10B00201EFF2FE10000781E11<br />
S9030000FC</pre><br />
<br />
Due to compiler differences, your output may be slightly different than the provided example s-record code. The example was compiled with <code>arm-linux-gnueabihf-gcc (Debian 6.3.0-18) 6.3.0 20170516</code>.<br />
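<br />
Because the s-record will be pasted into a serial terminal, it can be worth checking each line on the host first. The following is a minimal sketch of such a check, not a tool that ships with U-Boot: it relies on the standard S-record rule that the final byte of a record is the ones' complement of the low byte of the sum of the count, address, and data bytes. The function name <code>srec_line_ok</code> is an illustrative choice.<br />

```c
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Returns 1 if the line is a well-formed S-record whose byte count and
 * checksum are consistent, 0 otherwise. Trailing CR/LF is ignored. */
int srec_line_ok(const char *line)
{
    size_t len = strlen(line);
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        len--;

    /* "S", a type digit, then an even number of hex digits. */
    if (len < 6 || line[0] != 'S' || line[1] < '0' || line[1] > '9')
        return 0;
    if ((len - 2) % 2 != 0)
        return 0;

    size_t nbytes = (len - 2) / 2;   /* count byte + address + data + checksum */
    unsigned sum = 0, count = 0, checksum = 0;

    for (size_t i = 0; i < nbytes; i++) {
        int hi = hex_digit(line[2 + 2 * i]);
        int lo = hex_digit(line[3 + 2 * i]);
        if (hi < 0 || lo < 0)
            return 0;
        unsigned byte = (unsigned)(hi * 16 + lo);
        if (i == 0)
            count = byte;            /* number of bytes that follow the count */
        if (i + 1 < nbytes)
            sum += byte;             /* everything except the checksum itself */
        else
            checksum = byte;
    }

    if (count != nbytes - 1)
        return 0;
    return ((~sum) & 0xFFu) == checksum;
}
```

For example, <code>srec_line_ok("S9030000FC")</code> accepts the terminating record from the listing above, while a copy with a changed digit is rejected.<br />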
<br />
* Copy the srec data to the clipboard, as we will need to send it to the BMC within a limited time window in a bit.<br />
<br />
== Main Procedure ==<br />
<br />
To load and execute this code, do the following at the <code>ast#</code> shell within the watchdog time window:<br />
<br />
* <code>ast# loads 83000000</code><br />
<br />
* Paste the contents of the srec file into the terminal when prompted. Some summary data will be shown on screen.<br />
<br />
(todo: grab and stick the example output in here.)<br />
<br />
Run the code using the <code>go</code> command.<br />
<br />
* <code>ast# go 83000000</code><br />
<br />
(todo: stick the output in here)<br />
<br />
The program will run and then return control to U-Boot. At this point, the watchdog has been disabled, and you can take your time with the rest of the recovery commands. The loaded program code is no longer needed and the memory range can be reused.<br />
<br />
== Notes ==<br />
<br />
* For reference, the GPIO pin associated with this watchdog is GPIOS7 (GPIO 151, physical pin AA20 on the ast2500). The signal is labelled SEQ_CONT if you are looking at it on the schematics.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC&diff=1516Debricking the BMC2018-08-30T20:11:28Z<p>Bdragon: Remove a todo that is now done :)</p>
<hr />
<div>While these instructions have been successfully applied in practice, they are still preliminary. Ask questions in IRC if you are unclear on what to do!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. <br />
IRC user dragon_pilot was successfully able to recover a nonworking BMC from u-boot, these instructions are the result of that experiment.<br />
Further testing and refinement would be appreciated, preferably by someone who has easy access to an external flasher.<br />
--><br />
<br />
If a BMC update fails but U-Boot still works, you can recover by using U-Boot to bootstrap the BMC manually, loading a boot image over the network or the BMC serial line.<br />
<br />
If your BMC flash is corrupted to the extent that U-Boot is not loading properly, you '''WILL''' need to remove and flash the BMC flash chip externally.<br />
<br />
* Prepare a TFTP server, and place <code>image-bmc</code>, <code>image-rofs</code>, and <code>image-kernel</code> in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* If you are having trouble with U-Boot resetting while you are trying to run these steps, have a slow network, or you are going to be loading over serial, you can [[Talos_II/U-Boot_Recovery/Watchdog|disable the FPGA watchdog]].<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set as well as the default <code>0penBmc</code>, it may be one or the other depending on the state of the rwfs partition. If it boots up properly instead of dropping you into the emergency shell, the problem is probably in your kernel partition and you can retry flashing your <code>image-kernel</code> using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)<br />
* <code>mount -t tmpfs none /tmp</code><br />
* run <code>udhcpc</code> to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)<br />
* <code>cd /tmp</code><br />
* <code>tftp -g -r image-rofs x.x.x.x</code><br />
* <code>tftp -g -r image-kernel x.x.x.x</code><br />
* IMPORTANT: Use <code>md5sum</code>, <code>sha1sum</code>, or <code>sha256sum</code> to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!<br />
* Verify that the output of <code>cat /sys/class/mtd/mtd3/name</code> is <code>kernel</code> and the output of <code>cat /sys/class/mtd/mtd4/name</code> is <code>rofs</code>. We will be flashing mtd partitions directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.<br />
* <code>flashcp -v image-kernel /dev/mtd3</code><br />
* <code>flashcp -v image-rofs /dev/mtd4</code><br />
* (TODO: Describe how to reset rwfs in case it was damaged as well?) note: the kernel param for bypassing rwfs is "overlay-filesystem-in-ram". Append it to the existing boot-args before running the bootm command. This can also be used as part of a password reset procedure.<br />
* After the flash is complete, you can restart the BMC and it should boot successfully.<br />
<br />
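The <code>bootm 83080000</code> address used in the steps above follows from the memory map described on this page: flash is mapped at 0x20000000 and the loading tools place <code>image-bmc</code> at 0x83000000 by default. A small sketch of that arithmetic is shown here; the macro and function names are illustrative only, and the flash address 0x20080000 in the usage note is inferred from the RAM address rather than taken from a partition table.<br />

```c
#include <stdint.h>

#define BMC_FLASH_BASE   0x20000000u  /* where the BMC flash is mapped */
#define UBOOT_LOAD_BASE  0x83000000u  /* default load address for image-bmc in RAM */

/* Translate an address inside the flash mapping to the matching address
 * inside a copy of image-bmc loaded at UBOOT_LOAD_BASE. */
uint32_t flash_to_ram(uint32_t flash_addr)
{
    return flash_addr - BMC_FLASH_BASE + UBOOT_LOAD_BASE;
}
```

With these values, <code>flash_to_ram(0x20080000)</code> yields <code>0x83080000</code>, which is the same as adding 0x63000000 to the flash address.<br />
<br />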
* (TODO: Discussion of using Kermit to upload the image without network access) note: I (Bdragon) have successfully done a ram-only boot using cu's built in xmodem support (escape sequence ~X) to do an image transfer into RAM over the BMC serial interface.<br />
* (TODO: Discuss using u-boot's built in cmp tool to perform basic validation of the u-boot image against a second copy loaded into RAM.)<br />
* (TODO: Load recovery images over USB?) note: The onboard USB port is connected to the USB switch after all, so this might be problematic.<br />
* (TODO: Discussion of u-boot memory map) Short version is: flash lives at 0x20000000 and default base address for the memory loading tools is 0x83000000. So add 0x63000000 to any flash address to get the equivalent address for an image-bmc file loaded into RAM. For example, the bootable image of a loaded image-bmc is at 0x83080000.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC&diff=1515Debricking the BMC2018-08-30T20:07:44Z<p>Bdragon: add more reasons to use the watchdog disable procedure.</p>
<hr />
<div>While these instructions have been successfully applied in practice, they are still preliminary. Ask questions in IRC if you are unclear on what to do!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. <br />
IRC user dragon_pilot was successfully able to recover a nonworking BMC from u-boot, these instructions are the result of that experiment.<br />
Further testing and refinement would be appreciated, preferably by someone who has easy access to an external flasher.<br />
--><br />
<br />
If a BMC update fails but U-Boot still works, you can recover by using U-Boot to bootstrap the BMC manually, loading a boot image over the network or the BMC serial line.<br />
<br />
If your BMC flash is corrupted to the extent that U-Boot is not loading properly, you '''WILL''' need to remove and flash the BMC flash chip externally.<br />
<br />
* Prepare a TFTP server, and place <code>image-bmc</code>, <code>image-rofs</code>, and <code>image-kernel</code> in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* If you are having trouble with U-Boot resetting while you are trying to run these steps, have a slow network, or you are going to be loading over serial, you can [[Talos_II/U-Boot_Recovery/Watchdog|disable the FPGA watchdog]].<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set as well as the default <code>0penBmc</code>, it may be one or the other depending on the state of the rwfs partition. If it boots up properly instead of dropping you into the emergency shell, the problem is probably in your kernel partition and you can retry flashing your <code>image-kernel</code> using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)<br />
* <code>mount -t tmpfs none /tmp</code><br />
* run <code>udhcpc</code> to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)<br />
* <code>cd /tmp</code><br />
* <code>tftp -g -r image-rofs x.x.x.x</code><br />
* <code>tftp -g -r image-kernel x.x.x.x</code><br />
* IMPORTANT: Use <code>md5sum</code>, <code>sha1sum</code>, or <code>sha256sum</code> to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!<br />
* Verify that the output of <code>cat /sys/class/mtd/mtd3/name</code> is <code>kernel</code> and the output of <code>cat /sys/class/mtd/mtd4/name</code> is <code>rofs</code>. We will be flashing mtd partitions directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.<br />
* <code>flashcp -v image-kernel /dev/mtd3</code><br />
* <code>flashcp -v image-rofs /dev/mtd4</code><br />
* (TODO: Describe how to reset rwfs in case it was damaged as well?) note: the kernel param for bypassing rwfs is "overlay-filesystem-in-ram". Append it to the existing boot-args before running the bootm command. This can also be used as part of a password reset procedure.<br />
* After the flash is complete, you can restart the BMC and it should boot successfully.<br />
<br />
* (TODO: Discussion of using Kermit to upload the image without network access) note: I (Bdragon) have successfully done a ram-only boot using cu's built in xmodem support (escape sequence ~X) to do an image transfer into RAM over the BMC serial interface.<br />
* (TODO: Discuss using u-boot's built in cmp tool to perform basic validation of the u-boot image against a second copy loaded into RAM.)<br />
* (TODO: Write a u-boot standalone application to disable the AST watchdog, and write instructions for loading and executing it from the u-boot shell (the "go" command), to work around the cold-boot watchdog issue.)<br />
* (TODO: Load recovery images over USB?) note: The onboard USB port is connected to the USB switch after all, so this might be problematic.<br />
* (TODO: Discussion of u-boot memory map) Short version is: flash lives at 0x20000000 and default base address for the memory loading tools is 0x83000000. So add 0x63000000 to any flash address to get the equivalent address for an image-bmc file loaded into RAM. For example, the bootable image of a loaded image-bmc is at 0x83080000.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC/Watchdog&diff=1514Debricking the BMC/Watchdog2018-08-30T20:05:23Z<p>Bdragon: /* Notes */ note the signal name too.</p>
<hr />
<div>Once standby power has been applied to the Talos II (i.e. you have plugged it in), the FPGA waits for the power to stabilize, and then signals the BMC to reset.<br />
<br />
At this point, a watchdog counter in the FPGA begins running. This is done because of a hardware issue in the ast2500, where very occasionally the chip would get stuck when attempting to start.[https://git.raptorcs.com/git/talos-system-fpga/commit/main.v?id=e90ca898402a250e9d2f6e303e25ddaceb0cf8d6] If the BMC boot gets stuck in the U-Boot phase, the FPGA will force a reset after approximately 40 seconds. This interferes with BMC recovery, as that leaves you with a limited window to run commands.<br />
<br />
U-Boot on the Talos II has been slightly modified to disable this watchdog counter immediately before transferring control to the OpenBMC kernel.[https://git.raptorcs.com/git/talos-obmc-uboot/tree/common/bootm.c?id=cfee45a3ef2a592d130573ce3b7d8bcfe056060b#n707]<br />
<br />
Since we will want to spend more than 40 seconds at the U-Boot shell during recovery, we need to disable this watchdog ourselves.<br />
<br />
However, general GPIO support is not built into the Talos II U-Boot loader, so the hardware must be manipulated directly.<br />
<br />
Using an ARM cross-compiler, we can build a tiny program to do the same thing as the bootm.c code.<br />
<br />
== Payload creation ==<br />
<br />
<source lang="c"><br />
/* watchdog.c - minimal code to disable the FPGA watchdog.<br />
* Derived from Raptor Engineering changes to U-Boot common/bootm.c.<br />
* SPDX-License-Identifier: GPL-2.0+<br />
*/<br />
#include <stdint.h><br />
int main() {<br />
volatile uint32_t* gpio_ctl_reg = (volatile uint32_t*)0x1e780084;<br />
volatile uint32_t* gpio_data_reg = (volatile uint32_t*)0x1e780080;<br />
<br />
*gpio_ctl_reg |= 0x00800000;<br />
*gpio_data_reg &= ~0x00800000;<br />
}<br />
</source><br />
<br />
Compiling this to an object and then converting it to an s-record will give you a file that can be directly loaded into U-Boot without using special tools.<br />
<br />
<code>$ arm-linux-gnueabihf-gcc -ffreestanding -march=armv6 -mfloat-abi=soft -marm -Os -c watchdog.c</code><br />
<br />
<code>$ arm-linux-gnueabihf-objcopy -O srec watchdog.o watchdog.srec</code><br />
<br />
<code>$ cat watchdog.srec</code><br />
<br />
<pre>S00E0000746F67676C652E7372656394<br />
S113000018309FE5842093E5022582E3842083E56C<br />
S1130010802093E50225C2E3802083E51EFF2FE1C3<br />
S10700200000781E42<br />
S9030000FC</pre><br />
<br />
Due to compiler differences, your output may be slightly different than the provided example s-record code. The example was compiled with <code>arm-linux-gnueabihf-gcc (Debian 6.3.0-18) 6.3.0 20170516</code>.<br />
<br />
* Copy the srec data to the clipboard, as we will need to send it to the BMC within a limited time window in a bit.<br />
<br />
== Main Procedure ==<br />
<br />
To load and execute this code, do the following at the <code>ast#</code> shell within the watchdog time window:<br />
<br />
* <code>ast# loads 83000000</code><br />
<br />
* Paste the contents of the srec file into the terminal when prompted. Some summary data will be shown on screen.<br />
<br />
(todo: grab and stick the example output in here.)<br />
<br />
Run the code using the <code>go</code> command.<br />
<br />
* <code>ast# go 83000000</code><br />
<br />
(todo: stick the output in here)<br />
<br />
The program will run and then return control to U-Boot. At this point, the watchdog has been disabled, and you can take your time with the rest of the recovery commands. The loaded program code is no longer needed and the memory range can be reused.<br />
<br />
== Notes ==<br />
<br />
* For reference, the GPIO pin associated with this watchdog is GPIOS7 (GPIO 151, physical pin AA20 on the ast2500). The signal is labelled SEQ_CONT if you are looking at it on the schematics.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC/Watchdog&diff=1513Debricking the BMC/Watchdog2018-08-30T20:03:44Z<p>Bdragon: Add a bunch of misc notes, define sections.</p>
<hr />
<div>Once standby power has been applied to the Talos II (i.e. you have plugged it in), the FPGA waits for the power to stabilize, and then signals the BMC to reset.<br />
<br />
At this point, a watchdog counter in the FPGA begins running. This is done because of a hardware issue in the ast2500, where very occasionally the chip would get stuck when attempting to start.[https://git.raptorcs.com/git/talos-system-fpga/commit/main.v?id=e90ca898402a250e9d2f6e303e25ddaceb0cf8d6] If the BMC boot gets stuck in the U-Boot phase, the FPGA will force a reset after approximately 40 seconds. This interferes with BMC recovery, as that leaves you with a limited window to run commands.<br />
<br />
U-Boot on the Talos II has been slightly modified to disable this watchdog counter immediately before transferring control to the OpenBMC kernel.[https://git.raptorcs.com/git/talos-obmc-uboot/tree/common/bootm.c?id=cfee45a3ef2a592d130573ce3b7d8bcfe056060b#n707]<br />
<br />
Since we will want to spend more than 40 seconds at the U-Boot shell during recovery, we need to disable this watchdog ourselves.<br />
<br />
However, general GPIO support is not built into the Talos II U-Boot loader, so the hardware must be manipulated directly.<br />
<br />
Using an ARM cross-compiler, we can build a tiny program to do the same thing as the bootm.c code.<br />
<br />
== Payload creation ==<br />
<br />
<source lang="c"><br />
/* watchdog.c - minimal code to disable the FPGA watchdog.<br />
* Derived from Raptor Engineering changes to U-Boot common/bootm.c.<br />
* SPDX-License-Identifier: GPL-2.0+<br />
*/<br />
#include <stdint.h><br />
int main() {<br />
volatile uint32_t* gpio_ctl_reg = (volatile uint32_t*)0x1e780084;<br />
volatile uint32_t* gpio_data_reg = (volatile uint32_t*)0x1e780080;<br />
<br />
*gpio_ctl_reg |= 0x00800000;<br />
*gpio_data_reg &= ~0x00800000;<br />
}<br />
</source><br />
<br />
Compiling this to an object and then converting it to an s-record will give you a file that can be directly loaded into U-Boot without using special tools.<br />
<br />
<code>$ arm-linux-gnueabihf-gcc -ffreestanding -march=armv6 -mfloat-abi=soft -marm -Os -c watchdog.c</code><br />
<br />
<code>$ arm-linux-gnueabihf-objcopy -O srec watchdog.o watchdog.srec</code><br />
<br />
<code>$ cat watchdog.srec</code><br />
<br />
<pre>S00E0000746F67676C652E7372656394<br />
S113000018309FE5842093E5022582E3842083E56C<br />
S1130010802093E50225C2E3802083E51EFF2FE1C3<br />
S10700200000781E42<br />
S9030000FC</pre><br />
<br />
Due to compiler differences, your output may be slightly different than the provided example s-record code. The example was compiled with <code>arm-linux-gnueabihf-gcc (Debian 6.3.0-18) 6.3.0 20170516</code>.<br />
<br />
* Copy the srec data to the clipboard, as we will need to send it to the BMC within a limited time window in a bit.<br />
<br />
== Main Procedure ==<br />
<br />
To load and execute this code, do the following at the <code>ast#</code> shell within the watchdog time window:<br />
<br />
* <code>ast# loads 83000000</code><br />
<br />
* Paste the contents of the srec file into the terminal when prompted. Some summary data will be shown on screen.<br />
<br />
(todo: grab and stick the example output in here.)<br />
<br />
Run the code using the <code>go</code> command.<br />
<br />
* <code>ast# go 83000000</code><br />
<br />
(todo: stick the output in here)<br />
<br />
The program will run and then return control to U-Boot. At this point, the watchdog has been disabled, and you can take your time with the rest of the recovery commands. The loaded program code is no longer needed and the memory range can be reused.<br />
<br />
== Notes ==<br />
<br />
* For reference, the GPIO pin associated with this watchdog is GPIOS7 (GPIO 151, physical pin AA20 on the ast2500).</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC&diff=1512Debricking the BMC2018-08-30T16:48:49Z<p>Bdragon: Link to the new steps describing how to disable the FPGA watchdog.</p>
<hr />
<div>While these instructions have been successfully applied in practice, they are still preliminary. Ask questions in IRC if you are unclear on what to do!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. <br />
IRC user dragon_pilot was successfully able to recover a nonworking BMC from u-boot, these instructions are the result of that experiment.<br />
Further testing and refinement would be appreciated, preferably by someone who has easy access to an external flasher.<br />
--><br />
<br />
If a BMC update fails but U-Boot still works, you can recover by using U-Boot to bootstrap the BMC manually, loading a boot image over the network or the BMC serial line.<br />
<br />
If your BMC flash is corrupted to the extent that U-Boot is not loading properly, you '''WILL''' need to remove and flash the BMC flash chip externally.<br />
<br />
* Prepare a TFTP server, and place <code>image-bmc</code>, <code>image-rofs</code>, and <code>image-kernel</code> in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* If you are having trouble with U-Boot resetting while you are trying to run these steps, you can [[Talos_II/U-Boot_Recovery/Watchdog|disable the FPGA watchdog]].<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set as well as the default <code>0penBmc</code>, it may be one or the other depending on the state of the rwfs partition. If it boots up properly instead of dropping you into the emergency shell, the problem is probably in your kernel partition and you can retry flashing your <code>image-kernel</code> using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)<br />
* <code>mount -t tmpfs none /tmp</code><br />
* run <code>udhcpc</code> to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)<br />
* <code>cd /tmp</code><br />
* <code>tftp -g -r image-rofs x.x.x.x</code><br />
* <code>tftp -g -r image-kernel x.x.x.x</code><br />
* IMPORTANT: Use <code>md5sum</code>, <code>sha1sum</code>, or <code>sha256sum</code> to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!<br />
* Verify that the output of <code>cat /sys/class/mtd/mtd3/name</code> is <code>kernel</code> and the output of <code>cat /sys/class/mtd/mtd4/name</code> is <code>rofs</code>. We will be flashing mtd partitions directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.<br />
* <code>flashcp -v image-kernel /dev/mtd3</code><br />
* <code>flashcp -v image-rofs /dev/mtd4</code><br />
* (TODO: Describe how to reset rwfs in case it was damaged as well?) note: the kernel param for bypassing rwfs is "overlay-filesystem-in-ram". Append it to the existing boot-args before running the bootm command. This can also be used as part of a password reset procedure.<br />
* After the flash is complete, you can restart the BMC and it should boot successfully.<br />
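
The checksum step above can be sketched as a small shell helper. This is a minimal sketch, not part of the original procedure: the function name <code>verify_image</code> is hypothetical, and it assumes you have already computed the expected SHA-256 digest of each image on the TFTP server and that <code>sha256sum</code> and <code>cut</code> are available in the emergency shell.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# $1 = path to the transferred image
# $2 = expected hex digest, computed on the TFTP server beforehand
verify_image() {
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)" >&2
        return 1
    fi
}

# Example usage (the digest is a placeholder, not a real value):
#   verify_image /tmp/image-rofs <digest-from-server> || echo "re-transfer before flashing"
```

Only continue to the <code>flashcp</code> steps if both images report OK; on a mismatch, repeat the <code>tftp</code> transfer.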
<br />
* (TODO: Discussion of using Kermit to upload the image without network access) note: I (Bdragon) have successfully done a ram-only boot using cu's built in xmodem support (escape sequence ~X) to do an image transfer into RAM over the BMC serial interface.<br />
* (TODO: Discuss using u-boot's built in cmp tool to perform basic validation of the u-boot image against a second copy loaded into RAM.)<br />
* (TODO: Write a u-boot standalone application to disable the AST watchdog, and write instructions for loading and executing it from the u-boot shell (the "go" command), to work around the cold-boot watchdog issue.)<br />
* (TODO: Load recovery images over USB?) note: The onboard USB port is connected to the USB switch after all, so this might be problematic.<br />
* (TODO: Discussion of u-boot memory map) Short version is: flash lives at 0x20000000 and default base address for the memory loading tools is 0x83000000. So add 0x63000000 to any flash address to get the equivalent address for an image-bmc file loaded into RAM. For example, the bootable image of a loaded image-bmc is at 0x83080000.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC/Watchdog&diff=1511Debricking the BMC/Watchdog2018-08-30T16:44:44Z<p>Bdragon: Describe a method of disabling the early FPGA watchdog on the BMC, to make doing U-Boot recovery tasks much easier.</p>
<hr />
<div>Once standby power has been applied to the Talos II (i.e. you have plugged it in), the FPGA waits for the power to stabilize, and then signals the BMC to reset.<br />
<br />
At this point, a watchdog counter in the FPGA begins running. This is done in an attempt to ensure the BMC does not get stuck in early bootup, which would prevent the Talos II from working. If the BMC boot gets stuck in the U-Boot phase, the FPGA will force a reset after approximately 40 seconds. This interferes with BMC recovery, as that leaves you with a limited window to run commands.<br />
<br />
U-Boot on the Talos II has been slightly modified to disable this watchdog counter immediately before transferring control to the OpenBMC kernel.[https://git.raptorcs.com/git/talos-obmc-uboot/tree/common/bootm.c?id=cfee45a3ef2a592d130573ce3b7d8bcfe056060b#n707]<br />
<br />
Since we will want to spend more than 40 seconds at the U-Boot shell during recovery, we need to disable this watchdog ourselves.<br />
<br />
(Todo: figure out if the gpio command can be used instead. The target is the SEQ_CONT signal connected to BMC pin GPIOS7.)<br />
<br />
Using an ARM cross-compiler, we can build a tiny program to do the same thing.<br />
<br />
<source lang="c"><br />
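/* watchdog.c - minimal code to disable the FPGA watchdog.<br />
 * Derived from Raptor Engineering changes to U-Boot common/bootm.c.<br />
 * SPDX-License-Identifier: GPL-2.0+<br />
 */<br />
#include <stdint.h><br />
int main(void) {<br />
    /* Memory-mapped GPIO registers. volatile stops the compiler from<br />
     * dropping or reordering the accesses. */<br />
    volatile uint32_t *gpio_ctl_reg = (volatile uint32_t *)0x1e780084;<br />
    volatile uint32_t *gpio_data_reg = (volatile uint32_t *)0x1e780080;<br />
<br />
    *gpio_ctl_reg |= 0x00800000;   /* set bit 23 in the control register */<br />
    *gpio_data_reg &= ~0x00800000; /* clear bit 23 in the data register */<br />
}<br />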
</source><br />
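
The mask <code>0x00800000</code> used in the program is bit 23. As a rough sketch of where that number comes from, the helper below assumes (this is not stated on this page) that the 32-bit register bank packs eight pins per port in the order Q, R, S, T, so that GPIOS7 is pin 7 of the third port. The function name <code>gpio_mask</code> is made up for illustration.

```shell
# Hypothetical helper: compute the bit mask for one pin in a 32-bit GPIO bank.
# Assumption: 8 pins per port, ports packed from bit 0 upward (Q=0, R=1, S=2, T=3).
# $1 = port index within the bank, $2 = pin number (0-7)
gpio_mask() {
    printf '0x%08x\n' $(( 1 << ($1 * 8 + $2) ))
}

# GPIOS7: port index 2, pin 7, i.e. bit 2*8+7 = 23
gpio_mask 2 7
```

Under that assumption the result matches the constant in <code>watchdog.c</code>.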
<br />
Compiling this to an object and then converting it to an s-record will give you a file that can be directly loaded into U-Boot without using special tools.<br />
<br />
<code>$ arm-linux-gnueabihf-gcc -ffreestanding -march=armv6 -mfloat-abi=soft -marm -Os -c watchdog.c</code><br />
<br />
<code>$ arm-linux-gnueabihf-objcopy -O srec watchdog.o watchdog.srec</code><br />
<br />
<code>$ cat watchdog.srec</code><br />
<br />
<pre>S00E0000746F67676C652E7372656394<br />
S113000018309FE5842093E5022582E3842083E56C<br />
S1130010802093E50225C2E3802083E51EFF2FE1C3<br />
S10700200000781E42<br />
S9030000FC</pre><br />
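
Because the s-record is pasted over a serial line, it can be worth checking each record before sending it. The sketch below assumes the standard Motorola S-record rule: the last byte of a record is the one's complement of the low byte of the sum of all preceding bytes after the type field. The function name <code>srec_ok</code> is hypothetical.

```shell
# Hypothetical helper: validate the checksum of a single S-record line.
# $1 = one record, e.g. S9030000FC
srec_ok() {
    hex=${1#S?}                      # drop the "S" and the record type digit
    sum=0
    while [ ${#hex} -gt 2 ]; do
        byte=${hex%"${hex#??}"}      # first two hex digits
        sum=$(( sum + 0x$byte ))
        hex=${hex#??}                # advance by one byte
    done
    # The remaining two digits are the checksum byte
    [ $(( (~sum) & 0xFF )) -eq $(( 0x$hex )) ]
}

# Example usage: check every line of the file before pasting it
#   while read -r line; do srec_ok "$line" || echo "bad record: $line"; done < watchdog.srec
```

This only guards the copy on your workstation; it does not replace any checking that U-Boot itself performs while loading.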
<br />
Due to compiler differences, your output may be slightly different than the provided example s-record code. The example was compiled with <code>arm-linux-gnueabihf-gcc (Debian 6.3.0-18) 6.3.0 20170516</code>.<br />
<br />
Copy this data to the clipboard; you will need to paste it into the BMC console within the watchdog time window.<br />
<br />
To load and execute this code, do the following at the <code>ast#</code> shell within the watchdog time window:<br />
<br />
* <code>ast# loads 83000000</code><br />
<br />
* Paste the contents of the srec file into the terminal when prompted. Some summary data will be shown on screen.<br />
<br />
(todo: grab and stick the example output in here.)<br />
<br />
Run the code using the <code>go</code> command.<br />
<br />
* <code>ast# go 83000000</code><br />
<br />
(todo: stick the output in here)<br />
<br />
The program will run and then return control to U-Boot. At this point, the watchdog has been disabled, and you can take your time with the rest of the recovery commands.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Compiling_Firmware&diff=1510Compiling Firmware2018-08-30T15:40:43Z<p>Bdragon: /* BMC Recovery procedure via U-Boot */ Use transclusion instead so this section can be treated separately.</p>
<hr />
<div>The following steps can be used to compile and update the firmware on [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.<br />
<br />
== Requirements ==<br />
* At least 25GB of free hard drive space<br />
* 16GB of free RAM<br />
<br />
=== Operating System ===<br />
The build system (op-build) has been primarily tested using Debian stretch. If you are on a different operating system such as Fedora 28, a Debian chroot should be used:<br />
<pre><br />
sudo yum install debootstrap dpkg<br />
sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian<br />
sudo mount -t proc none debian-chroot/proc/<br />
sudo mount -o bind /sys/ debian-chroot/sys/<br />
sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/<br />
</pre><br />
<br />
Enter the chroot and install the needed packages:<br />
<pre><br />
sudo chroot debian-chroot/<br />
apt-get install software-properties-common locales<br />
# Packages needed for PNOR builds<br />
apt-get install cscope ctags libz-dev libexpat-dev \<br />
python texinfo \<br />
build-essential g++ git bison flex unzip \<br />
libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc \<br />
wget bc rsync<br />
# Packages needed for OpenBMC builds<br />
apt-get install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
</pre><br />
<br />
Create a chroot user:<br />
<pre><br />
useradd -m build-user -s /bin/bash<br />
su build-user<br />
cd<br />
</pre><br />
<br />
You can now use the chroot to build the firmware.<br />
<br />
To enter the chroot in the future, you can run the following from a regular terminal:<br />
<pre><br />
sudo chroot debian-chroot/<br />
su build-user<br />
cd<br />
</pre><br />
<br />
== Building the PNOR Firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
<pre>git clone -b raptor-v1.05 --recursive https://scm.raptorcs.com/scm/git/talos-op-build</pre><br />
<br />
'''Note: The master branch is often in a non-functional state. The latest firmware branch (raptor-v1.05 at the time of this update) should be used instead.'''<br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-op-build<br />
. op-build-env<br />
op-build talos_defconfig<br />
op-build</pre><br />
<br />
To rebuild an individual package (such as hostboot) and recreate the pnor image, the following can be run:<br />
<pre><br />
op-build hostboot-rebuild openpower-pnor-rebuild<br />
</pre><br />
<br />
=== Updating the firmware ===<br />
Copy the firmware to the BMC<br />
<pre>scp ./output/images/talos.pnor root@<talos-openbmc>:/tmp/</pre><br />
<br />
<br />
At this point, you should connect two SSH sessions to OpenBMC.<br />
In the first session, run the following to display the console during bootup:<br />
<pre>ssh -p 2200 root@<talos-openbmc></pre><br />
The console log will be useful in debugging any issues with the firmware that could occur.<br />
<br />
In the second BMC session, ensure the system is off by running obmcutil. You should see the following:<br />
<pre>ssh root@<talos-openbmc><br />
root@talos:~# obmcutil state<br />
CurrentBMCState : xyz.openbmc_project.State.BMC.BMCState.Ready<br />
CurrentPowerState : xyz.openbmc_project.State.Chassis.PowerState.Off<br />
CurrentHostState : xyz.openbmc_project.State.Host.HostState.Off<br />
</pre><br />
The CurrentHostState must be Off before continuing with the procedure.<br />
If the CurrentHostState is not Off, please turn off the machine:<br />
<pre>obmcutil chassisoff</pre><br />
<br />
Once off, perform the update:<br />
<pre>pflash -E -p /tmp/talos.pnor</pre><br />
<br />
Start the machine:<br />
<pre>obmcutil poweron</pre><br />
<br />
Note: the machine may reboot multiple times after the initial flash.<br />
<br />
=== Troubleshooting ===<br />
==== SBE_MASTER_VERSION_DOWNLEVEL ====<br />
If you see the following message reported in the console, then the SBE update process did not work as expected:<br />
<pre> 16.74709|Error reported by sbe (0x2200) PLID 0x90000008<br />
16.74823| SBE Image Version Miscompare with Master Target<br />
16.74824| ModuleId 0x0d SBE_MASTER_VERSION_COMPARE<br />
16.74825| ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL<br />
16.74826| UserData1 Master Target HUID : 0x0000000000050000<br />
16.74826| UserData2 Master Target Loop Index : 0x0000000000000000</pre><br />
<br />
The machine needs to be reset to finish the update procedure using the following:<br />
<pre>obmcutil chassisoff<br />
systemctl stop xyz.openbmc_project.State.Host.service<br />
systemctl start xyz.openbmc_project.State.Host.service<br />
obmcutil poweron</pre><br />
The update should now complete as expected.<br />
<br />
A bug report is open[https://github.com/open-power/sbe/issues/7] to track this issue.<br />
<br />
==== internal compiler error: Killed ====<br />
Building the hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:<br />
<pre>powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)</pre><br />
To continue you have a few options:<br />
* Reduce the number of parallel jobs being run by appending -j<num> to your build command line<br />
<pre>op-build -j4</pre><br />
* Increase the swap space<br />
* Install additional RAM<br />
<br />
== Building the OpenBMC firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code and check out the tag:<br />
<br />
git clone https://git.raptorcs.com/git/talos-openbmc<br />
cd talos-openbmc<br />
git checkout raptor-v{{CURRENT_BMC_VERSION}}<br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-openbmc<br />
export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf<br />
. openbmc-env<br />
bitbake obmc-phosphor-image<br />
</pre><br />
<br />
The resulting firmware can be found in the tmp/deploy/images/talos/ directory.<br />
<br />
=== Updating the firmware ===<br />
Once the firmware has been built, the resulting kernel and rofs binaries need to be copied over to the /run/initramfs/ directory on the BMC:<br />
<pre><br />
scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@<talos-openbmc>:/run/initramfs/<br />
</pre><br />
<br />
Once the images have been transferred, reboot the BMC:<br />
<pre>root@<talos-openbmc> reboot</pre><br />
<br />
OpenBMC may take a while to reboot. Once complete, you will be able to log back in via ssh.<br />
<br />
=== BMC Recovery procedure via U-Boot ===<br />
{{:Talos_II/U-Boot_Recovery}}<br />
<br />
=== Troubleshooting ===<br />
TODO</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Debricking_the_BMC&diff=1509Debricking the BMC2018-08-30T15:39:55Z<p>Bdragon: Move my u-boot recovery instructions to a separate page.</p>
<hr />
<div>While these instructions have been successfully applied in practice, they are still preliminary. Ask questions in IRC if you are unclear on what to do!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. <br />
IRC user dragon_pilot was successfully able to recover a nonworking BMC from u-boot, these instructions are the result of that experiment.<br />
Further testing and refinement would be appreciated, preferably by someone who has easy access to an external flasher.<br />
--><br />
<br />
In the event of a failed BMC update where U-Boot still functions, you can recover by using U-Boot to bootstrap the BMC manually, loading a boot image over the network or the BMC serial line.<br />
<br />
If your BMC flash is corrupted to the extent that U-Boot is not loading properly, you '''WILL''' need to remove and flash the BMC flash chip externally.<br />
<br />
* Prepare a TFTP server, and place <code>image-bmc</code>, <code>image-rofs</code>, and <code>image-kernel</code> in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set and the default <code>0penBmc</code>; either one may work, depending on the state of the rwfs partition. If it boots up properly instead of dropping you into the emergency shell, the problem is probably in your kernel partition and you can retry flashing your <code>image-kernel</code> using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)<br />
* <code>mount -t tmpfs none /tmp</code><br />
* run <code>udhcpc</code> to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)<br />
* <code>cd /tmp</code><br />
* <code>tftp -g -r image-rofs x.x.x.x</code><br />
* <code>tftp -g -r image-kernel x.x.x.x</code><br />
* IMPORTANT: Use <code>md5sum</code>, <code>sha1sum</code>, or <code>sha256sum</code> to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!<br />
* Verify that the output of <code>cat /sys/class/mtd/mtd3/name</code> is <code>kernel</code> and the output of <code>cat /sys/class/mtd/mtd4/name</code> is <code>rofs</code>. We will be flashing mtd partitions directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.<br />
* <code>flashcp -v image-kernel /dev/mtd3</code><br />
* <code>flashcp -v image-rofs /dev/mtd4</code><br />
* (TODO: Describe how to reset rwfs in case it was damaged as well?) note: the kernel param for bypassing rwfs is "overlay-filesystem-in-ram". Append it to the existing boot-args before running the bootm command. This can also be used as part of a password reset procedure.<br />
* After the flash is complete, you can restart the BMC and it should boot successfully.<br />
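
The partition-name check that precedes the <code>flashcp</code> steps can be wrapped in a helper so that a wrong partition stops the procedure. This is only a sketch: the function name <code>check_mtd</code> and its optional third argument (an alternate sysfs root, useful for trying the helper on another machine) are made up for illustration.

```shell
# Hypothetical helper: confirm that an mtd partition has the expected name.
# $1 = mtd index, $2 = expected name, $3 = optional sysfs root (default /sys/class/mtd)
check_mtd() {
    sysfs_root=${3:-/sys/class/mtd}
    actual=$(cat "$sysfs_root/mtd$1/name" 2>/dev/null) || return 1
    if [ "$actual" = "$2" ]; then
        echo "mtd$1 is '$actual' as expected"
    else
        echo "mtd$1 is '$actual', expected '$2'; do not flash" >&2
        return 1
    fi
}

# Example usage, flashing only when both names match:
#   check_mtd 3 kernel && check_mtd 4 rofs \
#       && flashcp -v image-kernel /dev/mtd3 \
#       && flashcp -v image-rofs /dev/mtd4
```

Chaining the commands with <code>&&</code> means a failed check leaves the flash untouched.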
<br />
* (TODO: Discussion of using Kermit to upload the image without network access) note: I (Bdragon) have successfully done a ram-only boot using cu's built in xmodem support (escape sequence ~X) to do an image transfer into RAM over the BMC serial interface.<br />
* (TODO: Discuss using u-boot's built in cmp tool to perform basic validation of the u-boot image against a second copy loaded into RAM.)<br />
* (TODO: Write a u-boot standalone application to disable the AST watchdog, and write instructions for loading and executing it from the u-boot shell (the "go" command), to work around the cold-boot watchdog issue.)<br />
* (TODO: Load recovery images over USB?) note: The onboard USB port is connected to the USB switch after all, so this might be problematic.<br />
* (TODO: Discussion of u-boot memory map) Short version is: flash lives at 0x20000000 and default base address for the memory loading tools is 0x83000000. So add 0x63000000 to any flash address to get the equivalent address for an image-bmc file loaded into RAM. For example, the bootable image of a loaded image-bmc is at 0x83080000.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=File:Talos_II_Serial.png&diff=1496File:Talos II Serial.png2018-08-27T05:40:02Z<p>Bdragon: Picture guide to the BMC Serial bracket pinout.</p>
<hr />
<div>Picture guide to the BMC Serial bracket pinout.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Compiling_Firmware&diff=1366Compiling Firmware2018-08-16T16:06:12Z<p>Bdragon: /* BMC Recovery procedure via U-Boot */ add some quick notes from memory and experiments to the todo section.</p>
<hr />
<div>The following steps can be used to compile and update the firmware on [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.<br />
<br />
== Requirements ==<br />
* At least 25GB of free hard drive space<br />
* 16GB of free RAM<br />
<br />
=== Operating System ===<br />
The build system (op-build) has been primarily tested using Debian stretch. If you are on a different operating system such as Fedora 28, a Debian chroot should be used:<br />
<pre><br />
sudo yum install debootstrap dpkg<br />
sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian<br />
sudo mount -t proc none debian-chroot/proc/<br />
sudo mount -o bind /sys/ debian-chroot/sys/<br />
sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/<br />
</pre><br />
<br />
Enter the chroot and install the needed packages:<br />
<pre><br />
sudo chroot debian-chroot/<br />
apt-get install software-properties-common locales<br />
# Packages needed for PNOR builds<br />
apt-get install cscope ctags libz-dev libexpat-dev \<br />
python texinfo \<br />
build-essential g++ git bison flex unzip \<br />
libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc \<br />
wget bc rsync<br />
# Packages needed for OpenBMC builds<br />
apt-get install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
</pre><br />
<br />
Create a chroot user:<br />
<pre><br />
useradd -m build-user -s /bin/bash<br />
su build-user<br />
cd<br />
</pre><br />
<br />
You can now use the chroot to build the firmware.<br />
<br />
To enter the chroot in the future, you can run the following from a regular terminal:<br />
<pre><br />
sudo chroot debian-chroot/<br />
su build-user<br />
cd<br />
</pre><br />
<br />
== Building the PNOR Firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
<pre>git clone --recursive https://scm.raptorcs.com/scm/git/talos-op-build</pre><br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-op-build<br />
. op-build-env<br />
op-build talos_defconfig<br />
op-build</pre><br />
<br />
To rebuild an individual package (such as hostboot) and recreate the pnor image, the following can be run:<br />
<pre><br />
op-build hostboot-rebuild openpower-pnor-rebuild<br />
</pre><br />
<br />
=== Updating the firmware ===<br />
Copy the firmware to the BMC<br />
<pre>scp ./output/images/talos.pnor root@<talos-openbmc>:/tmp/</pre><br />
<br />
<br />
At this point, you should connect two SSH sessions to OpenBMC.<br />
In the first session, run the following to display the console during bootup:<br />
<pre>ssh -p 2200 root@<talos-openbmc></pre><br />
The console log will be useful in debugging any issues with the firmware that could occur.<br />
<br />
In the second BMC session, ensure the system is off by running obmcutil. You should see the following:<br />
<pre>ssh root@<talos-openbmc><br />
root@talos:~# obmcutil state<br />
CurrentBMCState : xyz.openbmc_project.State.BMC.BMCState.Ready<br />
CurrentPowerState : xyz.openbmc_project.State.Chassis.PowerState.Off<br />
CurrentHostState : xyz.openbmc_project.State.Host.HostState.Off<br />
</pre><br />
The CurrentHostState must be Off before continuing with the procedure.<br />
If the CurrentHostState is not Off, please turn off the machine:<br />
<pre>obmcutil chassisoff</pre><br />
<br />
Once off, perform the update:<br />
<pre>pflash -E -p /tmp/talos.pnor</pre><br />
<br />
Start the machine:<br />
<pre>obmcutil poweron</pre><br />
<br />
Note: the machine may reboot multiple times after the initial flash.<br />
<br />
=== Troubleshooting ===<br />
==== SBE_MASTER_VERSION_DOWNLEVEL ====<br />
If you see the following message reported in the console, then the SBE update process did not work as expected:<br />
<pre> 16.74709|Error reported by sbe (0x2200) PLID 0x90000008<br />
16.74823| SBE Image Version Miscompare with Master Target<br />
16.74824| ModuleId 0x0d SBE_MASTER_VERSION_COMPARE<br />
16.74825| ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL<br />
16.74826| UserData1 Master Target HUID : 0x0000000000050000<br />
16.74826| UserData2 Master Target Loop Index : 0x0000000000000000</pre><br />
<br />
The machine needs to be reset to finish the update procedure using the following:<br />
<pre>obmcutil chassisoff<br />
systemctl stop xyz.openbmc_project.State.Host.service<br />
systemctl start xyz.openbmc_project.State.Host.service<br />
obmcutil poweron</pre><br />
The update should now complete as expected.<br />
<br />
A bug report is open[https://github.com/open-power/sbe/issues/7] to track this issue.<br />
<br />
==== internal compiler error: Killed ====<br />
Building the hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:<br />
<pre>powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)</pre><br />
To continue you have a few options:<br />
* Reduce the number of parallel jobs being run by appending -j<num> to your build command line<br />
<pre>op-build -j4</pre><br />
* Increase the swap space<br />
* Install additional RAM<br />
<br />
== Building the OpenBMC firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code and check out the tag:<br />
<br />
git clone https://git.raptorcs.com/git/talos-openbmc<br />
cd talos-openbmc<br />
git checkout raptor-v{{CURRENT_BMC_VERSION}}<br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-openbmc<br />
export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf<br />
. openbmc-env<br />
bitbake obmc-phosphor-image<br />
</pre><br />
<br />
The resulting firmware can be found in the tmp/deploy/images/talos/ directory.<br />
<br />
=== Updating the firmware ===<br />
Once the firmware has been built, the resulting kernel and rofs binaries need to be copied over to the /run/initramfs/ directory on the BMC:<br />
<pre><br />
scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@<talos-openbmc>:/run/initramfs/<br />
</pre><br />
<br />
Once the images have been transferred, reboot the BMC:<br />
<pre>root@<talos-openbmc> reboot</pre><br />
<br />
OpenBMC may take a while to reboot. Once complete, you will be able to log back in via ssh.<br />
<br />
=== BMC Recovery procedure via U-Boot ===<br />
While these instructions have been successfully applied in practice, they are still preliminary. Ask questions in IRC if you are unclear on what to do!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. <br />
IRC user dragon_pilot was successfully able to recover a nonworking BMC from u-boot, these instructions are the result of that experiment.<br />
Further testing and refinement would be appreciated, preferably by someone who has easy access to an external flasher.<br />
--><br />
<br />
In the event of a failed BMC update where U-Boot still functions, you can recover by using U-Boot to bootstrap the BMC manually, loading a boot image over the network or the BMC serial line.<br />
<br />
If your BMC flash is corrupted to the extent that U-Boot is not loading properly, you '''WILL''' need to remove and flash the BMC flash chip externally.<br />
<br />
* Prepare a TFTP server, and place <code>image-bmc</code>, <code>image-rofs</code>, and <code>image-kernel</code> in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set and the default <code>0penBmc</code>; either one may work, depending on the state of the rwfs partition. If it boots up properly instead of dropping you into the emergency shell, the problem is probably in your kernel partition and you can retry flashing your <code>image-kernel</code> using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)<br />
* <code>mount -t tmpfs none /tmp</code><br />
* run <code>udhcpc</code> to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)<br />
* <code>cd /tmp</code><br />
* <code>tftp -g -r image-rofs x.x.x.x</code><br />
* <code>tftp -g -r image-kernel x.x.x.x</code><br />
* IMPORTANT: Use <code>md5sum</code>, <code>sha1sum</code>, or <code>sha256sum</code> to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!<br />
* Verify that the output of <code>cat /sys/class/mtd/mtd3/name</code> is <code>kernel</code> and the output of <code>cat /sys/class/mtd/mtd4/name</code> is <code>rofs</code>. We will be flashing mtd partitions directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.<br />
* <code>flashcp -v image-kernel /dev/mtd3</code><br />
* <code>flashcp -v image-rofs /dev/mtd4</code><br />
* (TODO: Describe how to reset rwfs in case it was damaged as well?) note: the kernel param for bypassing rwfs is "overlay-filesystem-in-ram". Append it to the existing boot-args before running the bootm command. This can also be used as part of a password reset procedure.<br />
* After the flash is complete, you can restart the BMC and it should boot successfully.<br />
<br />
* (TODO: Discussion of using Kermit to upload the image without network access) note: I (Bdragon) have successfully done a ram-only boot using cu's built in xmodem support (escape sequence ~X) to do an image transfer into RAM over the BMC serial interface.<br />
* (TODO: Discuss using u-boot's built in cmp tool to perform basic validation of the u-boot image against a second copy loaded into RAM.)<br />
* (TODO: Write a u-boot standalone application to disable the AST watchdog, and write instructions for loading and executing it from the u-boot shell (the "go" command), to work around the cold-boot watchdog issue.)<br />
* (TODO: Load recovery images over USB?) note: The onboard USB port is connected to the USB switch after all, so this might be problematic.<br />
* (TODO: Discussion of u-boot memory map) Short version is: flash lives at 0x20000000 and default base address for the memory loading tools is 0x83000000. So add 0x63000000 to any flash address to get the equivalent address for an image-bmc file loaded into RAM. For example, the bootable image of a loaded image-bmc is at 0x83080000.<br />
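
The address arithmetic in the memory map note can be sketched in shell as follows. The two base addresses are taken from the note itself; the function name <code>flash_to_ram</code> is made up for illustration, and the sketch assumes <code>image-bmc</code> was loaded at the default base address.

```shell
# Base addresses as given in the memory map note
FLASH_BASE=0x20000000   # where the BMC flash is mapped
RAM_BASE=0x83000000     # default load address for the memory loading tools

# Hypothetical helper: translate a flash address into the matching address
# inside an image-bmc file loaded at RAM_BASE.
# $1 = flash address (hex with 0x prefix)
flash_to_ram() {
    printf '0x%08x\n' $(( $1 - $FLASH_BASE + $RAM_BASE ))
}

# The bootable image sits 0x80000 into the flash:
flash_to_ram 0x20080000
```

The difference between the two bases is 0x63000000, so the example reproduces the <code>bootm 83080000</code> address used earlier in the procedure.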
<br />
=== Troubleshooting ===<br />
TODO</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=File:IMG_0937.JPG&diff=1187File:IMG 0937.JPG2018-08-03T02:43:57Z<p>Bdragon: Photo of the "correct" BMC serial header for identification purposes.</p>
<hr />
<div>Photo of the "correct" BMC serial header for identification purposes.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Configuring_Power_Restore_States&diff=1110Configuring Power Restore States2018-07-02T07:14:08Z<p>Bdragon: wikignoming.</p>
<hr />
<div>== Power Restore Configuration ==<br />
<br />
The BMC controls the system power state. As such, the power restore state is configured on the BMC itself. After gaining SSH access to the BMC, you can configure the system power restore state as follows.<br />
<br />
=== Keep chassis power off after power application ===<br />
<br />
<nowiki>busctl set-property xyz.openbmc_project.Settings /xyz/openbmc_project/control/host0/power_restore_policy xyz.openbmc_project.Control.Power.RestorePolicy PowerRestorePolicy s xyz.openbmc_project.Control.Power.RestorePolicy.Policy.AlwaysOff</nowiki><br />
<br />
=== Always turn chassis power on after power application ===<br />
<nowiki>busctl set-property xyz.openbmc_project.Settings /xyz/openbmc_project/control/host0/power_restore_policy xyz.openbmc_project.Control.Power.RestorePolicy PowerRestorePolicy s xyz.openbmc_project.Control.Power.RestorePolicy.Policy.AlwaysOn</nowiki><br />
<br />
=== Restore last chassis power state after power application ===<br />
<nowiki>busctl set-property xyz.openbmc_project.Settings /xyz/openbmc_project/control/host0/power_restore_policy xyz.openbmc_project.Control.Power.RestorePolicy PowerRestorePolicy s xyz.openbmc_project.Control.Power.RestorePolicy.Policy.Restore</nowiki></div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Talos_II/Hardware_Compatibility_List&diff=1086Talos II/Hardware Compatibility List2018-06-16T04:05:47Z<p>Bdragon: /* DTK/INTEL (compatible) */ adding the one I got off of Amazon for testing purposes.</p>
<hr />
<div>This is a collection of components known to work with the [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.<br />
<br />
== Cases ==<br />
<br />
=== Good Cases ===<br />
<br />
These cases were successfully used by someone.<br />
<br />
* '''SuperMicro SC732i-500B'''<br />
** Not recommended for 12 core and higher CPUs<br />
<br />
* '''SuperMicro SC732D4-903B'''<br />
** Add-on sound card recommended<br />
** Add-on USB 2.0 card or USB 3.0 hub recommended<br />
<br />
* '''SuperMicro SC747TQ-R1400B or SC747TG-R1400B-SQ'''<br />
** Hot swap drive capable; SAS recommended<br />
** Recommended for use with one or more high-end GPUs<br />
** Listed as EoL by Supermicro, replaced with 1620 versions. Same fan modules and PDU used in newer, higher watt, version. ([[User:Robbieab|Robbieab]] ([[User talk:Robbieab|talk]]))<br />
** [[:File:TalosII_SystemAssembly_nashimus_v3.mp4|System Assembly Video - SC747TG-R1400B-SQ]]<br />
<br />
* '''Rosewill RSV-L4500'''<br />
** Fans are two wire and use molex connectors<br />
<br />
<br />
<br />
=== Problematic Cases ===<br />
<br />
* '''BeQuiet Dark Base 900''' ([[User:Robbieab|Robbieab]] ([[User talk:Robbieab|talk]]))<br />
** Claims to support E-ATX on the BeQuiet website<br />
** Infographic shows the motherboard space to be only 322mm deep, which is 8.2mm short of full-size E-ATX.<br />
** Emailed them for clarification, but no response. Can't confirm either way.<br />
<br />
* '''SuperMicro SC822'''<br />
** Low speed fans provide insufficient airflow over CPU0, leading to overheating if more than one 4-core CPU is installed.<br />
<br />
* '''Athena Power RM-3U8G1043'''<br />
** Some motherboard standoffs needed to be removed, and others needed additional height.<br />
** There was no standoff hole for the top right.<br />
** The support beam across the top of the case interferes with the CPU2 heatsink, but can be easily removed.<br />
<br />
==== Standoff Issues ====<br />
<br />
Standoff issues appear to be a very common problem. In many cases mitigation may be possible.<br />
<br />
* '''Fractal Design Define XL R2'''<br />
** Missing standoff holes for the top-left and top-middle positions.<br />
** Some alternative standoff in at least the top-middle position may be required to prevent too much bending of the motherboard while inserting RAM.<br />
<br />
* '''BitFenix Aurora'''<br />
** [[User:MarcusC/BitFenix_Aurora|Multiple missing standoff holes]], some mitigation possible.<br />
<br />
* '''Thermaltake Core W200'''<br />
** Heavy, expensive, massive.<br />
** Compatible ''with caveats''<br />
*** Talos™ II mainboard will fit in the E-ATX compatible side only (when viewed from rear of case, the right side) of the dual system case.<br />
*** Missing standoff holes for the top-left and top-middle positions. (non-essential but ensure proper support when inserting and removing RAM to avoid bending mainboard)<br />
*** Must remove wire-hole rubber grommets present under Talos™ II mainboard on right lower side for proper fit<br />
<br />
* '''Nanoxia Deep Silence2''' ([[User:Sharkcz|Sharkcz]])<br />
** missing top-middle standoff hole, but I've used a plastic "flat" standoff instead<br />
** Power LED - red goes to pin 15, black to pin 16<br />
<br />
* '''RAIJINTEK ASTERION PLUS (Model 0R200049)''' ([[User:cyrozap|cyrozap]])<br />
** Missing standoff holes for the top-left and top-middle positions.<br />
*** As a workaround the standoffs can be unscrewed and placed upside-down (screw threads facing up) under the motherboard holes.<br />
*** This actually works surprisingly well, and thanks to the other screw points the motherboard is rigid enough that I don't worry too much about the weight of the HSFs flexing it.<br />
*** That said, it's probably a good idea to always transport the system on its side and avoid bumping it if possible.<br />
** The hinged panels that open with handles are much nicer than fiddling with thumb screws, but annoying since it makes it slightly trickier to do things that involve both the inside and back panel of the case (e.g., inserting PCI-e cards).<br />
** The PSU is at the very bottom of the case, while all the motherboard power connectors are at the very top of the case, so this can cause some issues if your PSU's cables aren't long enough.<br />
*** The EPS12V cables on my power supply had a few inches left over, but the main motherboard power cable was just barely able to reach from the other side of the case to the power connector.<br />
** The front of the case is sheet metal glued to plastic, and it's starting to peel off a little at the top and bottom.<br />
*** It sticks back in place when I press on it, but I may need to get some better adhesive and re-glue it later.<br />
** For $170, I was hoping for something a little more robust, but at least it's pretty.<br />
<br />
* possible mitigation is plastic standoff like [https://www.kangyang-europe.com/product/pc-board-hardware/ass-10/ ASS-10]<br />
<br />
<br />
=== Candidate Cases ===<br />
<br />
These cases claim E-ATX support and are planned to be used, or were considered, by someone.<br />
<br />
* '''Lian Li PC-V1000L''' ([[User:Robbieab|Robbieab]] ([[User talk:Robbieab|talk]]))<br />
** Similar price point to the Supermicros with high power PSU. <br />
** Very "Apple" brushed aluminium aesthetic. <br />
** Couldn't confirm E-ATX was fullsize.<br />
** Passed over in favour of the SuperMicro SC747TQ-R1400B<br />
<br />
* '''Corsair 760T''' ([[User:mosst|mosst]])<br />
** Reasonably cheap.<br />
** Unusually tasteful aesthetics for a consumer/gaming case. Looks like something Aperture Science would come up with.<br />
** Yet to confirm if E-ATX is fullsize.<br />
<br />
== Power Supplies ==<br />
When planning to run with both CPU sockets populated, keep in mind that the power supply should also provide two 8-pin EPS connectors.<br />
<br />
* Seasonic PRIME 1300W<br />
* Seasonic PRIME Ultra 850W Gold<br />
* Seasonic PRIME Ultra 650W<br />
* Seasonic PRIME Ultra Titanium 1000W (SSR-1000TR)<br />
* FSP Group Twins ATX 1+1 Dual Module 700W 80 PLUS GOLD Hot Swappable Redundant Digital Power Supply ([[User:ebrasca|ebrasca]])<br />
** Customer reported good build quality and proper functionality<br />
<br />
== Memory ==<br />
<br />
The criteria are basically "is it ECC, is it registered, is it NOT LRDIMM"<br />
<br />
From the manual:<br />
<br />
{| class="wikitable"<br />
|-<br />
|Total Slots<br />
|16 (4 channels per CPU)<br />
|-<br />
|Capacity<br />
|2TB maximum<br />
|-<br />
|Memory Type<br />
|DDR4 1600/1866/2133/2400/2666<br />
|-<br />
|Memory Features<br />
|ECC<br />
|-<br />
|Module Sizes<br />
|8GB, 16GB, 32GB, 64GB, 128GB (RDIMM)<br />
|-<br />
|}<br />
<br />
=== Tested Memory ===<br />
<br />
==== Good Memory ====<br />
<br />
{| class="wikitable sortable"<br />
!colspan="6"|Module<br />
!colspan="4"|Validation<br />
|-<br />
!Manufacturer<br />
!Model<br />
!Size<br />
!Speed<br />
!Type<br />
!ECC<br />
!Stepping<br />
!Firmware<br />
!Source<br />
!Notes<br />
|-<br />
|Pacific Sun<br />
|X10723042S<br />
|8GB<br />
|PC4-19200<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.1<br />
|Hostboot cc2d45a<br />
|Official<br />
|<br />
|-<br />
|Samsung<br />
|M393A1G40DB0-CPB<br />
|8GB<br />
|PC4-17000<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.2<br />
|Hostboot 30dfd3b<br />
|meklort<br />
|Requires [[Talos_II/Firmware|System Package v1.02]]<br />
|-<br />
|Kingston<br />
|KTH-PL424/16G<br />
|16GB<br />
|PC4-19200<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.1<br />
|Hostboot cc2d45a<br />
|Official<br />
|<br />
|-<br />
|Micron<br />
|MTA18ASF2G72PZ-2G3B1<br />
|16GB<br />
|PC4-19200<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.2<br />
|Hostboot 28927a7<br />
|Official<br />
|<br />
|-<br />
|Micron<br />
|MTA18ASF2G72PDZ-2G3D1<br />
|16GB<br />
|PC4-19200<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.1<br />
|Hostboot cc2d45a<br />
|Official<br />
|<br />
|-<br />
|Micron<br />
|MTA36ASF4G72PZ-2G6D1<br />
|32GB<br />
|PC4-21333<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.2<br />
|Hostboot 6ffaeb4<br />
|cyrozap<br />
|<br />
|-<br />
|Samsung<br />
|M393A4K40BB1-CRC<br />
|32GB<br />
|PC4-19200<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.1<br />
|Hostboot 1e2221d<br />
|Official<br />
|<br />
|-<br />
|Samsung<br />
|M393A2K40BB2-CTD<br />
|16GB<br />
|PC4-21300<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.2<br />
|Hostboot 0c8fa110<br />
|meklort<br />
|Will run at 2400MT/s with [[Talos_II/Firmware|System Package v1.00]]<br />
|-<br />
|Samsung<br />
|M393A4K40BB2-CTD8Q<br />
|32GB<br />
|PC4-21333<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.2<br />
|Hostboot 28927a7<br />
|luke-jr<br />
|<br />
|-<br />
|Samsung<br />
|M393A2G40EB2-CTD<br />
|16GB <br />
|PC4-21300V-R<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.2<br />
|Hostboot 30dfd3b<br />
|JSharp<br />
|Tested extensively with [[Talos_II/Firmware|System Package v1.02]] but does boot on v1.00, Dual 8-Core POWER9, x8 DIMM Modules (RCS Recommended Slot Configuration)<br />
|-<br />
|Kingston<br />
|KVR24R17S8K4/32<br />
|8GB<br />
|PC4-19200<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.2<br />
|1.04, PNOR d286337d<br />
|sharkcz<br />
|Kit of 4x 8GB; one stick was faulty, but the remaining 3x 8GB worked OK<br />
|-<br />
|Kingston<br />
|KVR24R17D8/16MA<br />
|16GB<br />
|PC4-19200<br />
|Registered<br />
|Yes<br />
|POWER9 DD2.2<br />
|1.04, PNOR d286337d<br />
|sharkcz<br />
|<br />
|-<br />
|}<br />
<br />
==== Incompatible Memory ====<br />
<br />
NOTE: Memory may be removed from this table after firmware support has been added, or there may be a fundamental hardware incompatibility. If you have incompatible memory listed in the table below, you may want to bookmark and check this page from time to time to see if a firmware update has resolved the issue.<br />
<br />
{| class="wikitable sortable"<br />
!colspan="6"|Module<br />
!colspan="4"|Test Conditions<br />
|-<br />
!Manufacturer<br />
!Model<br />
!Size<br />
!Speed<br />
!Type<br />
!ECC<br />
!Stepping<br />
!Firmware<br />
!Last Test<br />
|-<br />
|Samsung<br />
|M386A8K40BMB-CRC<br />
|64GB<br />
|PC4-19200<br />
|Registered LRDIMM<br />
|Yes<br />
|POWER9 DD2.1<br />
|Hostboot 1e2221d<br />
|02/14/2018<br />
|-<br />
|}<br />
<br />
== SAS/SATA Storage Drives ==<br />
<br />
Connected via optional on-board [[PM8068]] controller, or via PCIe controller. NVMe cards are also [[#NVMe Storage Drives|supported]].<br />
<br />
Boards with onboard SAS have 1x MiniSAS (or something like that? confirm!) port, and 4x standard SATA-III ports.<br />
<br />
== PCIe Devices ==<br />
<br />
=== Storage Controllers ===<br />
<br />
* IOCrest SI-PEX40062 (Chipset: Marvell 88SE9235)<br />
* LSI 9300 SAS HBAs<br />
* [[PM8068]]-based SAS HBAs <br />
* Supermicro AOC-SLG3-4E2P 4-port OCuLink adapter<br />
<br />
=== NVMe M.2 Adapters ===<br />
* [http://ableconn.com/products_2.php?gid=62 Ableconn PEXM2-SSD M.2 NGFF PCIe SSD to PCI Express 3.0 x4 Host Adapter Card (M.2 to PCIe adapter)]<br />
* [http://www.delock.com/produkte/G_89370/merkmale.html Delock PCI Express x4 Card > 1 x internal NVMe M.2 Key M 80 mm - Low Profile Form Factor]<br />
<br />
=== NVMe Storage Drives ===<br />
* Samsung 950 PRO (with M.2 to PCIe adapter)<br />
* Samsung 960 EVO / PRO (with M.2 to PCIe adapter)<br />
* Intel Optane 900P NVMe XPoint PCIe<br />
* WD Black PCIe (with M.2 to PCIe adapter)<br />
<br />
=== Graphics Cards ===<br />
<br />
No display? Check out the [[Troubleshooting/GPU|GPU Troubleshooting]] page.<br />
<br />
==== AMD ====<br />
<br />
All AMD GPUs currently have DMA issues (limited to 32-bit, which can cause crashes) with the current Talos II firmware.<br />
This is expected to be fixed in future firmware updates.<br />
<br />
* AMD Radeon HD 5850 - Must disable onboard VGA first. Currently has issues with only using 32-bit DMA.<br />
* AMD Radeon HD 7850 - Disabled onboard VGA. Using amdgpu is highly unstable, radeon driver is usable but has issues with only using 32-bit DMA.<br />
* AMD Radeon HD 7950 - Must disable onboard VGA first. Currently has issues with only using 32-bit DMA.<br />
* Radeon R9 290X<br />
* AMD Radeon Pro WX7100 - Available pre-installed on Talos II workstation, server, and desktop configurations.<br />
* AMD Radeon Pro WX5100<br />
* AMD Radeon Pro WX4100 - May need at least Linux 4.16 in order to get Xorg to work.<br />
<br />
=== Sound Cards ===<br />
<br />
* Creative Sound Blaster Audigy FX SB1570 PCIe 5.1 Sound Card<br />
* Creative Sound Blaster X-Fi Xtreme Fidelity PCIe Audio Sound Card (SB0880)<br />
* AMD Radeon HD 5850 and 7950 (HDMI audio)<br />
* [http://www.vantecusa.com/products_detail.php?p_id=156&p_name=+USB+Stereo+Audio+Adapter&pc_id=9&pc_name=Adapters&pt_id=3&pt_name=Audio+%2B++Video#tab-1 VANTEC NBA-120U (USB)]<br />
* [http://mackie.com/products/onyx-blackjack Mackie Onyx Blackjack (USB) Recording Interface]<br />
<br />
=== ... ===<br />
<br />
== CAPI Devices ==<br />
<br />
* Mellanox ConnectX-6 EN 200Gb/s Adapter Card<br />
<br />
== Serial Adapters for J7701 Header ==<br />
* [http://pinoutguide.com/Motherboard/rs232_header_pinout.shtml Pinout Details]<br />
=== DTK/INTEL (compatible) ===<br />
* CablesToGo 09480 (unverified)<br />
* StarTech PLATE9MLP (unverified)<br />
* Assmann Serial Slot Bracket AK-610300-003-E, sold under PremiumCord brand (used by [[User:Sharkcz|Sharkcz]])<br />
* E-ITX ACC3100[https://www.amazon.com/dp/B00DSTTDQW/] (tested by [[User:Bdragon|Bdragon]])<br />
<br />
=== AT/EVEREX (not compatible) ===<br />
* StarTech PLATE9M16<br />
* Gigabyte COM port</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Compiling_Firmware&diff=1081Compiling Firmware2018-06-12T06:04:40Z<p>Bdragon: /* BMC Recovery procedure via U-Boot */ Elaborate on the status of these instructions</p>
<hr />
<div>The following steps can be used to compile and update the firmware on [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.<br />
<br />
== Requirements ==<br />
* At least 25GB of free hard drive space<br />
* 16GB of free RAM<br />
<br />
=== Operating System ===<br />
The build system (op-build) has been primarily tested using Debian stretch. If you are on a different operating system such as Fedora 28, a Debian chroot should be used:<br />
<pre><br />
sudo yum install debootstrap dpkg<br />
sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian<br />
sudo mount -t proc none debian-chroot/proc/<br />
sudo mount -o bind /sys/ debian-chroot/sys/<br />
sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/<br />
</pre><br />
<br />
Enter the chroot and install the needed packages:<br />
<pre><br />
sudo chroot debian-chroot/<br />
apt-get install software-properties-common locales<br />
# Packages needed for PNOR builds<br />
apt-get install cscope ctags libz-dev libexpat-dev \<br />
python texinfo \<br />
build-essential g++ git bison flex unzip \<br />
libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc \<br />
wget bc<br />
# Packages needed for OpenBMC builds<br />
apt-get install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
</pre><br />
<br />
Create a chroot user:<br />
<pre><br />
useradd -m build-user -s /bin/bash<br />
su build-user<br />
cd<br />
</pre><br />
<br />
You can now use the chroot to build the firmware.<br />
<br />
To enter the chroot in the future, you can run the following from a regular terminal:<br />
<pre><br />
sudo chroot debian-chroot/<br />
su build-user<br />
cd<br />
</pre><br />
<br />
== Building the PNOR Firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
<pre>git clone --recursive https://scm.raptorcs.com/scm/git/talos-op-build</pre><br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-op-build<br />
. op-build-env<br />
op-build talos_defconfig<br />
op-build</pre><br />
<br />
To rebuild an individual package (such as hostboot) and recreate the pnor image, the following can be run:<br />
<pre><br />
op-build hostboot-rebuild openpower-pnor-rebuild<br />
</pre><br />
<br />
=== Updating the firmware ===<br />
Copy the firmware to the BMC:<br />
<pre>scp ./output/images/talos.pnor root@<talos-openbmc>:/tmp/</pre><br />
<br />
<br />
At this point, you should connect two SSH sessions to OpenBMC.<br />
In the first session, run the following to display the console during bootup:<br />
<pre>ssh -p 2200 root@<talos-openbmc></pre><br />
The console log will be useful in debugging any issues with the firmware that could occur.<br />
<br />
In the second BMC session, ensure the system is off by running obmcutil. You should see the following:<br />
<pre>ssh root@<talos-openbmc><br />
root@talos:~# obmcutil state<br />
CurrentBMCState : xyz.openbmc_project.State.BMC.BMCState.Ready<br />
CurrentPowerState : xyz.openbmc_project.State.Chassis.PowerState.Off<br />
CurrentHostState : xyz.openbmc_project.State.Host.HostState.Off<br />
</pre><br />
The CurrentHostState must be Off before continuing with the procedure.<br />
If the CurrentHostState is not Off, please turn off the machine:<br />
<pre>obmcutil chassisoff</pre><br />
<br />
Once off, perform the update:<br />
<pre>pflash -E -p /tmp/talos.pnor</pre><br />
<br />
Start the machine:<br />
<pre>obmcutil poweron</pre><br />
<br />
Note: the machine may reboot multiple times after the initial flash.<br />
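<br />
The check-then-flash sequence in this section could be scripted so that <code>pflash</code> only runs when the host is reported as off. The sketch below is an assumption-laden illustration: the function name <code>host_is_off</code> is invented here, and it relies on <code>obmcutil state</code> printing a <code>CurrentHostState</code> line in the format shown above.<br />

```shell
# Hypothetical helper: reads `obmcutil state` output on stdin and succeeds
# only if the CurrentHostState line ends in HostState.Off.
host_is_off() {
    grep -q 'CurrentHostState.*HostState\.Off[[:space:]]*$'
}

# Intended use on the BMC (shown as a comment, not executed here):
#   obmcutil state | host_is_off && pflash -E -p /tmp/talos.pnor
```

Because the match is anchored on the <code>CurrentHostState</code> line, a chassis power state of Off alone does not satisfy the check.<br />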
<br />
=== Troubleshooting ===<br />
==== SBE_MASTER_VERSION_DOWNLEVEL ====<br />
If you see the following message reported in the console, then the SBE update process did not work as expected:<br />
<pre> 16.74709|Error reported by sbe (0x2200) PLID 0x90000008<br />
16.74823| SBE Image Version Miscompare with Master Target<br />
16.74824| ModuleId 0x0d SBE_MASTER_VERSION_COMPARE<br />
16.74825| ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL<br />
16.74826| UserData1 Master Target HUID : 0x0000000000050000<br />
16.74826| UserData2 Master Target Loop Index : 0x0000000000000000</pre><br />
<br />
The machine needs to be reset to finish the update procedure using the following:<br />
<pre>obmcutil chassisoff<br />
systemctl stop xyz.openbmc_project.State.Host.service<br />
systemctl start xyz.openbmc_project.State.Host.service<br />
obmcutil poweron</pre><br />
The update should now complete as expected.<br />
<br />
A bug report is open[https://github.com/open-power/sbe/issues/7] to track this issue.<br />
<br />
==== internal compiler error: Killed ====<br />
Building the hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:<br />
<pre>powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)</pre><br />
To continue you have a few options:<br />
* Reduce the number of parallel jobs being run by appending -j<num> to your build command line<br />
<pre>op-build -j4</pre><br />
* Increase the swap space<br />
* Install additional RAM<br />
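<br />
For the first option, a job count can be derived from the RAM you are willing to give the build. The following is a rough sketch only: the function name <code>jobs_for_ram</code> is invented for illustration, and the default of 4 GB per compile job is an assumption rather than a measured hostboot requirement, so adjust it if the compiler is still killed.<br />

```shell
# Hypothetical helper: jobs_for_ram RAM_GB CORES [GB_PER_JOB]
# Prints a -j value capped by both the RAM budget and the core count.
# GB_PER_JOB defaults to 4, which is an assumed figure, not a documented one.
jobs_for_ram() {
    ram_gb=$1 cores=$2 per_job=${3:-4}
    jobs=$(( ram_gb / per_job ))
    [ "$jobs" -lt 1 ] && jobs=1
    [ "$jobs" -gt "$cores" ] && jobs=$cores
    echo "$jobs"
}

jobs_for_ram 16 8   # prints 4
```

On a build host this could be used as <code>op-build -j"$(jobs_for_ram 16 "$(nproc)")"</code>.<br />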
<br />
== Building the OpenBMC firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code and check out the tag:<br />
<br />
git clone https://git.raptorcs.com/git/talos-openbmc<br />
cd talos-openbmc<br />
git checkout raptor-v{{CURRENT_BMC_VERSION}}<br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be build using the following commands:<br />
<pre>cd talos-openbmc<br />
export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf<br />
. openbmc-env<br />
bitbake obmc-phosphor-image<br />
</pre><br />
<br />
The resulting firmware can be found in the tmp/deploy/images/talos/ directory.<br />
<br />
=== Updating the firmware ===<br />
Once the firmware has been built, copy the resulting kernel and rofs binaries to the /run/initramfs/ directory on the BMC:<br />
<pre><br />
scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@<talos-openbmc>:/run/initramfs/<br />
</pre><br />
<br />
Once the images have been transferred, reboot the BMC:<br />
<pre>root@<talos-openbmc> reboot</pre><br />
<br />
OpenBMC may take a while to reboot. Once complete, you will be able to log back in via ssh.<br />
<br />
=== BMC Recovery procedure via U-Boot ===<br />
While these instructions have been successfully applied in practice, they are still preliminary. Ask questions in IRC if you are unclear on what to do!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. <br />
IRC user dragon_pilot was successfully able to recover a nonworking BMC from u-boot, these instructions are the result of that experiment.<br />
Further testing and refinement would be appreciated, preferably by someone who has easy access to an external flasher.<br />
--><br />
<br />
If a BMC update fails but U-Boot still works, you can recover by using U-Boot to bootstrap the BMC manually, loading a boot image over the network or the BMC serial line.<br />
<br />
If your BMC flash is corrupted to the extent that U-Boot is not loading properly, you '''WILL''' need to remove and flash the BMC flash chip externally.<br />
<br />
* Prepare a TFTP server, and place <code>image-bmc</code>, <code>image-rofs</code>, and <code>image-kernel</code> in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing x.x.x.x with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set as well as the default <code>0penBmc</code>, it may be one or the other depending on the state of the rwfs partition. If it boots up properly instead of dropping you into the emergency shell, the problem is probably in your kernel partition and you can retry flashing your <code>image-kernel</code> using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)<br />
* <code>mount -t tmpfs none /tmp</code><br />
* run <code>udhcpc</code> to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)<br />
* <code>cd /tmp</code><br />
* <code>tftp -g -r image-rofs x.x.x.x</code><br />
* <code>tftp -g -r image-kernel x.x.x.x</code><br />
* IMPORTANT: Use <code>md5sum</code>, <code>sha1sum</code>, or <code>sha256sum</code> to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!<br />
* Verify that the output of <code>cat /sys/class/mtd/mtd3/name</code> is <code>kernel</code> and the output of <code>cat /sys/class/mtd/mtd4/name</code> is <code>rofs</code>. We will be flashing mtd partitions directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.<br />
* <code>flashcp -v image-kernel /dev/mtd3</code><br />
* <code>flashcp -v image-rofs /dev/mtd4</code><br />
* (TODO: Describe how to reset rwfs in case it was damaged as well?)<br />
* After the flash is complete, you can restart the BMC and it should boot successfully.<br />
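<br />
The two verification steps in the list above (checksum, then partition name) could be combined into guards that must pass before <code>flashcp</code> runs. This is a sketch under stated assumptions: the function names <code>verify_image</code> and <code>mtd_has_name</code> are invented here, the <code>MTD_SYSFS</code> override exists only so the check can be tried away from the BMC, and it presumes that <code>sha256sum</code> and <code>cut</code> are available in the emergency shell.<br />

```shell
# Hypothetical helpers; none of these names are part of OpenBMC.

# verify_image FILE EXPECTED_SHA256
# Succeeds only if the file's SHA-256 digest equals the expected value
# (computed beforehand on the TFTP server).
verify_image() {
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# mtd_has_name INDEX NAME
# Succeeds only if /sys/class/mtd/mtdINDEX/name contains NAME.
# MTD_SYSFS is an assumed override used for testing off-target.
mtd_has_name() {
    [ "$(cat "${MTD_SYSFS:-/sys/class/mtd}/mtd$1/name" 2>/dev/null)" = "$2" ]
}

# Intended use in the emergency shell (shown as comments, not executed here):
#   verify_image image-kernel <digest> && mtd_has_name 3 kernel && flashcp -v image-kernel /dev/mtd3
#   verify_image image-rofs <digest> && mtd_has_name 4 rofs && flashcp -v image-rofs /dev/mtd4
```

Chaining with <code>&&</code> means a bad transfer or an unexpected partition layout stops the sequence before anything is written to flash.<br />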
<br />
* (TODO: Discussion of using Kermit to upload the image without network access)<br />
* (TODO: Load recovery images over USB?)<br />
* (TODO: Discussion of u-boot memory map)<br />
<br />
=== Troubleshooting ===<br />
TODO</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Compiling_Firmware&diff=1078Compiling Firmware2018-06-12T05:50:50Z<p>Bdragon: /* BMC Recovery procedure via U-Boot */ more wiki formatting, slight rewording</p>
<hr />
<div>The following steps can be used to compile and update the firmware on [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.<br />
<br />
== Requirements ==<br />
* At least 25GB of free hard drive space<br />
* 16GB of free RAM<br />
<br />
=== Operating System ===<br />
The build system (op-build) has been primarily tested using Debian stretch. If you are on a different operating system such as Fedora 28, a Debian chroot should be used:<br />
<pre><br />
sudo yum install debootstrap dpkg<br />
sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian<br />
sudo mount -t proc none debian-chroot/proc/<br />
sudo mount -o bind /sys/ debian-chroot/sys/<br />
sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/<br />
</pre><br />
<br />
Enter the chroot and install the needed packages:<br />
<pre><br />
sudo chroot debian-chroot/<br />
apt-get install software-properties-common locales<br />
# Packages needed for PNOR builds<br />
apt-get install cscope ctags libz-dev libexpat-dev \<br />
python texinfo \<br />
build-essential g++ git bison flex unzip \<br />
libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc \<br />
wget bc<br />
# Packages needed for OpenBMC builds<br />
apt-get install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
</pre><br />
<br />
Create a chroot user:<br />
<pre><br />
useradd -m build-user -s /bin/bash<br />
su build-user<br />
cd<br />
</pre><br />
<br />
You can now use the chroot to build the firmware.<br />
<br />
To enter the chroot in the future, you can run the following from a regular terminal:<br />
<pre><br />
sudo chroot debian-chroot/<br />
su build-user<br />
cd<br />
</pre><br />
<br />
== Building the PNOR Firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
<pre>git clone --recursive https://scm.raptorcs.com/scm/git/talos-op-build</pre><br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-op-build<br />
. op-build-env<br />
op-build talos_defconfig<br />
op-build</pre><br />
<br />
To rebuild an individual package (such as hostboot) and recreate the pnor image, the following can be run:<br />
<pre><br />
op-build hostboot-rebuild openpower-pnor-rebuild<br />
</pre><br />
<br />
=== Updating the firmware ===<br />
Copy the firmware to the BMC:<br />
<pre>scp ./output/images/talos.pnor root@<talos-openbmc>:/tmp/</pre><br />
<br />
<br />
At this point, you should connect two SSH sessions to OpenBMC.<br />
In the first session, run the following to display the console during bootup:<br />
<pre>ssh -p 2200 root@<talos-openbmc></pre><br />
The console log will be useful in debugging any issues with the firmware that could occur.<br />
<br />
In the second BMC session, ensure the system is off by running obmcutil. You should see the following:<br />
<pre>ssh root@<talos-openbmc><br />
root@talos:~# obmcutil state<br />
CurrentBMCState : xyz.openbmc_project.State.BMC.BMCState.Ready<br />
CurrentPowerState : xyz.openbmc_project.State.Chassis.PowerState.Off<br />
CurrentHostState : xyz.openbmc_project.State.Host.HostState.Off<br />
</pre><br />
The CurrentHostState must be Off before continuing with the procedure.<br />
If the CurrentHostState is not Off, please turn off the machine:<br />
<pre>obmcutil chassisoff</pre><br />
<br />
Once off, perform the update:<br />
<pre>pflash -E -p /tmp/talos.pnor</pre><br />
<br />
Start the machine:<br />
<pre>obmcutil poweron</pre><br />
<br />
Note: the machine may reboot multiple times after the initial flash.<br />
<br />
=== Troubleshooting ===<br />
==== SBE_MASTER_VERSION_DOWNLEVEL ====<br />
If you see the following message reported in the console, then the SBE update process did not work as expected:<br />
<pre> 16.74709|Error reported by sbe (0x2200) PLID 0x90000008<br />
16.74823| SBE Image Version Miscompare with Master Target<br />
16.74824| ModuleId 0x0d SBE_MASTER_VERSION_COMPARE<br />
16.74825| ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL<br />
16.74826| UserData1 Master Target HUID : 0x0000000000050000<br />
16.74826| UserData2 Master Target Loop Index : 0x0000000000000000</pre><br />
<br />
The machine needs to be reset to finish the update procedure using the following:<br />
<pre>obmcutil chassisoff<br />
systemctl stop xyz.openbmc_project.State.Host.service<br />
systemctl start xyz.openbmc_project.State.Host.service<br />
obmcutil poweron</pre><br />
The update should now complete as expected.<br />
<br />
A bug report is open[https://github.com/open-power/sbe/issues/7] to track this issue.<br />
<br />
==== internal compiler error: Killed ====<br />
Building the hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:<br />
<pre>powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)</pre><br />
To continue you have a few options:<br />
* Reduce the number of parallel jobs being run by appending -j<num> to your build command line<br />
<pre>op-build -j4</pre><br />
* Increase the swap space<br />
* Install additional RAM<br />
<br />
== Building the OpenBMC firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code and check out the tag:<br />
<br />
git clone https://git.raptorcs.com/git/talos-openbmc<br />
cd talos-openbmc<br />
git checkout raptor-v{{CURRENT_BMC_VERSION}}<br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be build using the following commands:<br />
<pre>cd talos-openbmc<br />
export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf<br />
. openbmc-env<br />
bitbake obmc-phosphor-image<br />
</pre><br />
<br />
The resulting firmware can be found in the tmp/deploy/images/talos/ directory.<br />
<br />
=== Updating the firmware ===<br />
Once the firmware has been built, the resulting kernel and rofs binaries need to be copied over to the /run/initramfs/ directory on the BMC:<br />
<pre><br />
scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@<talos-openbmc>:/run/initramfs/<br />
</pre><br />
<br />
Once the images have been transferred, reboot the BMC:<br />
<pre>root@<talos-openbmc> reboot</pre><br />
<br />
OpenBMC may take a while to reboot. Once complete, you will be able to log back in via ssh.<br />
<br />
=== BMC Recovery procedure via U-Boot ===<br />
WIP -- these instructions are still being tested, use at your own risk!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. --><br />
<br />
In the event of a failure updating the BMC, but with a functioning U-Boot, you can still recover by using U-Boot to bootstrap the BMC manually, loading a boot image over the network or the BMC serial line.<br />
<br />
If your BMC flash is corrupted to the extent that U-Boot is not loading properly, you '''WILL''' need to remove and flash the BMC flash chip externally.<br />
<br />
* Prepare a TFTP server, and place <code>image-bmc</code>, <code>image-rofs</code>, and <code>image-kernel</code> in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set and the default <code>0penBmc</code>; it may be one or the other depending on the state of the rwfs partition. If it boots up properly instead of dropping you into the emergency shell, the problem is probably in your kernel partition and you can retry flashing your <code>image-kernel</code> using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)<br />
* <code>mount -t tmpfs none /tmp</code><br />
* run <code>udhcpc</code> to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)<br />
* <code>cd /tmp</code><br />
* <code>tftp -g -r image-rofs x.x.x.x</code><br />
* <code>tftp -g -r image-kernel x.x.x.x</code><br />
* IMPORTANT: Use <code>md5sum</code>, <code>sha1sum</code>, or <code>sha256sum</code> to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!<br />
* Verify that the output of <code>cat /sys/class/mtd/mtd3/name</code> is <code>kernel</code> and the output of <code>cat /sys/class/mtd/mtd4/name</code> is <code>rofs</code>. We will be flashing mtd partitions directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.<br />
* <code>flashcp -v image-kernel /dev/mtd3</code><br />
* <code>flashcp -v image-rofs /dev/mtd4</code><br />
* (TODO: Describe how to reset rwfs in case it was damaged as well?)<br />
* After the flash is complete, you can restart the BMC and it should boot successfully.<br />
<br />
* (TODO: Discussion of using Kermit to upload the image without network access)<br />
* (TODO: Load recovery images over USB?)<br />
* (TODO: Discussion of u-boot memory map)<br />
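The checksum step in the procedure above can be sketched as a small shell function. The helper name <code>verify_image</code> is hypothetical, and the sketch assumes <code>sha256sum</code> and <code>cut</code> are available in the emergency shell; the expected digest would be computed on the build host beforehand.<br />

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
verify_image() {
    file="$1"
    expected="$2"
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}

# Example use (the digest is whatever sha256sum reported on the build host):
#   verify_image /tmp/image-rofs <digest> && verify_image /tmp/image-kernel <digest>
```

Only continue to the flashing steps if both images report OK.<br />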
<br />
=== Troubleshooting ===<br />
TODO</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Compiling_Firmware&diff=1075Compiling Firmware2018-06-12T05:35:04Z<p>Bdragon: /* BMC Recovery procedure via U-Boot */</p>
<hr />
<div>The following steps can be used to compile and update the firmware on [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.<br />
<br />
== Requirements ==<br />
* At least 25GB of free hard drive space<br />
* 16GB of free RAM<br />
<br />
=== Operating System ===<br />
The build system (op-build) has been primarily tested using Debian stretch. If you are on a different operating system such as Fedora 28, a Debian chroot should be used:<br />
<pre><br />
sudo yum install debootstrap dpkg<br />
sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian<br />
sudo mount -t proc none debian-chroot/proc/<br />
sudo mount -o bind /sys/ debian-chroot/sys/<br />
sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/<br />
</pre><br />
<br />
Enter the chroot and install the needed packages:<br />
<pre><br />
sudo chroot debian-chroot/<br />
apt-get install software-properties-common locales<br />
# Packages needed for PNOR builds<br />
apt-get install cscope ctags libz-dev libexpat-dev \<br />
python texinfo \<br />
build-essential g++ git bison flex unzip \<br />
libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc \<br />
wget bc<br />
# Packages needed for OpenBMC builds<br />
apt-get install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
</pre><br />
<br />
Create a chroot user:<br />
<pre><br />
useradd -m build-user -s /bin/bash<br />
su build-user<br />
cd<br />
</pre><br />
<br />
You can now use the chroot to build the firmware.<br />
<br />
To enter the chroot in the future, you can run the following from a regular terminal:<br />
<pre><br />
sudo chroot debian-chroot/<br />
su build-user<br />
cd<br />
</pre><br />
<br />
== Building the PNOR Firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
<pre>git clone --recursive https://scm.raptorcs.com/scm/git/talos-op-build</pre><br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-op-build<br />
. op-build-env<br />
op-build talos_defconfig<br />
op-build</pre><br />
<br />
To rebuild an individual package (such as hostboot) and recreate the pnor image, the following can be run:<br />
<pre><br />
op-build hostboot-rebuild openpower-pnor-rebuild<br />
</pre><br />
<br />
=== Updating the firmware ===<br />
Copy the firmware to the BMC:<br />
<pre>scp ./output/images/talos.pnor root@<talos-openbmc>:/tmp/</pre><br />
<br />
<br />
At this point, you should connect two SSH sessions to OpenBMC.<br />
In the first session, run the following to display the console during bootup:<br />
<pre>ssh -p 2200 root@<talos-openbmc></pre><br />
The console log will be useful in debugging any issues with the firmware that could occur.<br />
<br />
In the second BMC session, ensure the system is off by running obmcutil. You should see the following:<br />
<pre>ssh root@<talos-openbmc><br />
root@talos:~# obmcutil state<br />
CurrentBMCState : xyz.openbmc_project.State.BMC.BMCState.Ready<br />
CurrentPowerState : xyz.openbmc_project.State.Chassis.PowerState.Off<br />
CurrentHostState : xyz.openbmc_project.State.Host.HostState.Off<br />
</pre><br />
The CurrentHostState must be Off before continuing with the procedure.<br />
If the CurrentHostState is not Off, please turn off the machine:<br />
<pre>obmcutil chassisoff</pre><br />
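The host-state check above can also be scripted so that the flash only proceeds when the host is off. This is a sketch: the helper name <code>host_is_off</code> is hypothetical, and it assumes the output format of <code>obmcutil state</code> matches the sample shown above.<br />

```shell
# Hypothetical helper: succeed only if `obmcutil state` output (read from
# stdin) reports the host state as Off.
host_is_off() {
    grep -q 'CurrentHostState.*HostState\.Off[[:space:]]*$'
}

# Example use on the BMC:
#   obmcutil state | host_is_off && echo "Host is off, safe to flash"
```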
<br />
Once off, perform the update:<br />
<pre>pflash -E -p /tmp/talos.pnor</pre><br />
<br />
Start the machine:<br />
<pre>obmcutil poweron</pre><br />
<br />
Note: the machine may reboot multiple times after the initial flash.<br />
<br />
=== Troubleshooting ===<br />
==== SBE_MASTER_VERSION_DOWNLEVEL ====<br />
If you see the following message reported in the console, then the SBE update process did not work as expected:<br />
<pre> 16.74709|Error reported by sbe (0x2200) PLID 0x90000008<br />
16.74823| SBE Image Version Miscompare with Master Target<br />
16.74824| ModuleId 0x0d SBE_MASTER_VERSION_COMPARE<br />
16.74825| ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL<br />
16.74826| UserData1 Master Target HUID : 0x0000000000050000<br />
16.74826| UserData2 Master Target Loop Index : 0x0000000000000000</pre><br />
<br />
The machine needs to be reset to finish the update procedure, using the following:<br />
<pre>obmcutil chassisoff<br />
systemctl stop xyz.openbmc_project.State.Host.service<br />
systemctl start xyz.openbmc_project.State.Host.service<br />
obmcutil poweron</pre><br />
The update should now complete as expected.<br />
<br />
A bug report is open[https://github.com/open-power/sbe/issues/7] to track this issue.<br />
<br />
==== internal compiler error: Killed ====<br />
Building the hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:<br />
<pre>powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)</pre><br />
To continue you have a few options:<br />
* Reduce the number of parallel jobs being run by appending -j<num> to your build command line<br />
<pre>op-build -j4</pre><br />
* Increase the swap space<br />
* Install additional RAM<br />
<br />
== Building the OpenBMC firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code and check out the tag:<br />
<br />
git clone https://git.raptorcs.com/git/talos-openbmc<br />
cd talos-openbmc<br />
git checkout raptor-v{{CURRENT_BMC_VERSION}}<br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-openbmc<br />
export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf<br />
. openbmc-env<br />
bitbake obmc-phosphor-image<br />
</pre><br />
<br />
The resulting firmware can be found in the tmp/deploy/images/talos/ directory.<br />
<br />
=== Updating the firmware ===<br />
Once the firmware has been built, the resulting kernel and rofs binaries need to be copied over to the /run/initramfs/ directory on the BMC:<br />
<pre><br />
scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@<talos-openbmc>:/run/initramfs/<br />
</pre><br />
<br />
Once the images have been transferred, reboot the BMC:<br />
<pre>root@<talos-openbmc> reboot</pre><br />
<br />
OpenBMC may take a while to reboot. Once complete, you will be able to log back in via ssh.<br />
<br />
=== BMC Recovery procedure via U-Boot ===<br />
WIP -- these instructions are still being tested, use at your own risk!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. --><br />
<br />
In the event of a failure updating the BMC, but with a functioning u-boot, you can still recover by using u-boot to manually bootstrap the BMC over the network or serial line.<br />
<br />
* Prepare a TFTP server, and place image-bmc, image-rofs, and image-kernel in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set and the default 0penBmc; it may be one or the other depending on the state of the rwfs partition. If it boots up properly, the problem is probably in your kernel partition and you can retry flashing your image-kernel using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)<br />
* <code>mount -t tmpfs none /tmp</code><br />
* run <code>udhcpc</code> to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)<br />
* <code>cd /tmp</code><br />
* <code>tftp -g -r image-rofs x.x.x.x</code><br />
* <code>tftp -g -r image-kernel x.x.x.x</code><br />
* IMPORTANT: Use <code>md5sum</code>, <code>sha1sum</code>, or <code>sha256sum</code> to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!<br />
* Verify that the output of 'cat /sys/class/mtd/mtd3/name' is "kernel" and the output of 'cat /sys/class/mtd/mtd4/name' is "rofs". We will be flashing mtd devices directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.<br />
* <code>flashcp -v image-kernel /dev/mtd3</code><br />
* <code>flashcp -v image-rofs /dev/mtd4</code><br />
* (TODO: Describe how to reset rwfs in case it was damaged as well?)<br />
* After the flash is complete, you can restart the BMC and it should boot successfully.<br />
<br />
* (TODO: Discussion of using Kermit to upload the image without network access)<br />
* (TODO: Load recovery images over USB?)<br />
* (TODO: Discussion of u-boot memory map)<br />
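The partition-name check that precedes flashing in the procedure above can be sketched as a guard function. The helper name <code>check_mtd_name</code> is hypothetical; the optional third argument exists only so the function can be tried against a scratch directory instead of the real sysfs tree.<br />

```shell
# Hypothetical helper: confirm an MTD device carries the expected partition
# name before writing to it. Defaults to the sysfs path used in the steps above.
check_mtd_name() {
    dev="$1"
    want="$2"
    base="${3:-/sys/class/mtd}"
    have=$(cat "$base/$dev/name" 2>/dev/null)
    if [ "$have" = "$want" ]; then
        echo "$dev is $want"
    else
        echo "$dev is '$have', expected '$want'; not flashing" >&2
        return 1
    fi
}

# Example use in the emergency shell:
#   check_mtd_name mtd3 kernel && flashcp -v image-kernel /dev/mtd3
#   check_mtd_name mtd4 rofs && flashcp -v image-rofs /dev/mtd4
```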
<br />
=== Troubleshooting ===<br />
TODO</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Compiling_Firmware&diff=1074Compiling Firmware2018-06-12T05:32:54Z<p>Bdragon: /* BMC Recovery procedure via U-Boot */ Now that dragon_pilot has successfully done a recovery, add the missing steps that were figured out today.</p>
<hr />
<div>The following steps can be used to compile and update the firmware on [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.<br />
<br />
== Requirements ==<br />
* At least 25GB of free hard drive space<br />
* 16GB of free RAM<br />
<br />
=== Operating System ===<br />
The build system (op-build) has been primarily tested using Debian stretch. If you are on a different operating system such as Fedora 28, a Debian chroot should be used:<br />
<pre><br />
sudo yum install debootstrap dpkg<br />
sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian<br />
sudo mount -t proc none debian-chroot/proc/<br />
sudo mount -o bind /sys/ debian-chroot/sys/<br />
sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/<br />
</pre><br />
<br />
Enter the chroot and install the needed packages:<br />
<pre><br />
sudo chroot debian-chroot/<br />
apt-get install software-properties-common locales<br />
# Packages needed for PNOR builds<br />
apt-get install cscope ctags libz-dev libexpat-dev \<br />
python texinfo \<br />
build-essential g++ git bison flex unzip \<br />
libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc \<br />
wget bc<br />
# Packages needed for OpenBMC builds<br />
apt-get install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
</pre><br />
<br />
Create a chroot user:<br />
<pre><br />
useradd -m build-user -s /bin/bash<br />
su build-user<br />
cd<br />
</pre><br />
<br />
You can now use the chroot to build the firmware.<br />
<br />
To enter the chroot in the future, you can run the following from a regular terminal:<br />
<pre><br />
sudo chroot debian-chroot/<br />
su build-user<br />
cd<br />
</pre><br />
<br />
== Building the PNOR Firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
<pre>git clone --recursive https://scm.raptorcs.com/scm/git/talos-op-build</pre><br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-op-build<br />
. op-build-env<br />
op-build talos_defconfig<br />
op-build</pre><br />
<br />
To rebuild an individual package (such as hostboot) and recreate the pnor image, the following can be run:<br />
<pre><br />
op-build hostboot-rebuild openpower-pnor-rebuild<br />
</pre><br />
<br />
=== Updating the firmware ===<br />
Copy the firmware to the BMC:<br />
<pre>scp ./output/images/talos.pnor root@<talos-openbmc>:/tmp/</pre><br />
<br />
<br />
At this point, you should connect two SSH sessions to OpenBMC.<br />
In the first session, run the following to display the console during bootup:<br />
<pre>ssh -p 2200 root@<talos-openbmc></pre><br />
The console log will be useful in debugging any issues with the firmware that could occur.<br />
<br />
In the second BMC session, ensure the system is off by running obmcutil. You should see the following:<br />
<pre>ssh root@<talos-openbmc><br />
root@talos:~# obmcutil state<br />
CurrentBMCState : xyz.openbmc_project.State.BMC.BMCState.Ready<br />
CurrentPowerState : xyz.openbmc_project.State.Chassis.PowerState.Off<br />
CurrentHostState : xyz.openbmc_project.State.Host.HostState.Off<br />
</pre><br />
The CurrentHostState must be Off before continuing with the procedure.<br />
If the CurrentHostState is not Off, please turn off the machine:<br />
<pre>obmcutil chassisoff</pre><br />
<br />
Once off, perform the update:<br />
<pre>pflash -E -p /tmp/talos.pnor</pre><br />
<br />
Start the machine:<br />
<pre>obmcutil poweron</pre><br />
<br />
Note: the machine may reboot multiple times after the initial flash.<br />
<br />
=== Troubleshooting ===<br />
==== SBE_MASTER_VERSION_DOWNLEVEL ====<br />
If you see the following message reported in the console, then the SBE update process did not work as expected:<br />
<pre> 16.74709|Error reported by sbe (0x2200) PLID 0x90000008<br />
16.74823| SBE Image Version Miscompare with Master Target<br />
16.74824| ModuleId 0x0d SBE_MASTER_VERSION_COMPARE<br />
16.74825| ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL<br />
16.74826| UserData1 Master Target HUID : 0x0000000000050000<br />
16.74826| UserData2 Master Target Loop Index : 0x0000000000000000</pre><br />
<br />
The machine needs to be reset to finish the update procedure, using the following:<br />
<pre>obmcutil chassisoff<br />
systemctl stop xyz.openbmc_project.State.Host.service<br />
systemctl start xyz.openbmc_project.State.Host.service<br />
obmcutil poweron</pre><br />
The update should now complete as expected.<br />
<br />
A bug report is open[https://github.com/open-power/sbe/issues/7] to track this issue.<br />
<br />
==== internal compiler error: Killed ====<br />
Building the hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:<br />
<pre>powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)</pre><br />
To continue you have a few options:<br />
* Reduce the number of parallel jobs being run by appending -j<num> to your build command line<br />
<pre>op-build -j4</pre><br />
* Increase the swap space<br />
* Install additional RAM<br />
<br />
== Building the OpenBMC firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code and check out the tag:<br />
<br />
git clone https://git.raptorcs.com/git/talos-openbmc<br />
cd talos-openbmc<br />
git checkout raptor-v{{CURRENT_BMC_VERSION}}<br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-openbmc<br />
export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf<br />
. openbmc-env<br />
bitbake obmc-phosphor-image<br />
</pre><br />
<br />
The resulting firmware can be found in the tmp/deploy/images/talos/ directory.<br />
<br />
=== Updating the firmware ===<br />
Once the firmware has been built, the resulting kernel and rofs binaries need to be copied over to the /run/initramfs/ directory on the BMC:<br />
<pre><br />
scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@<talos-openbmc>:/run/initramfs/<br />
</pre><br />
<br />
Once the images have been transferred, reboot the BMC:<br />
<pre>root@<talos-openbmc> reboot</pre><br />
<br />
OpenBMC may take a while to reboot. Once complete, you will be able to log back in via ssh.<br />
<br />
=== BMC Recovery procedure via U-Boot ===<br />
WIP -- these instructions are still being tested, use at your own risk!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. --><br />
<br />
In the event of a failure updating the BMC, but with a functioning u-boot, you can still recover by using u-boot to manually bootstrap the BMC over the network or serial line.<br />
<br />
* Prepare a TFTP server, and place image-bmc, image-rofs, and image-kernel in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set and the default 0penBmc; it may be one or the other depending on the state of the rwfs partition. If it boots up properly, the problem is probably in your kernel partition and you can retry flashing your image-kernel using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)<br />
* mount -t tmpfs none /tmp<br />
* run "udhcpc" to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)<br />
* cd /tmp<br />
* tftp -g -r image-rofs x.x.x.x<br />
* tftp -g -r image-kernel x.x.x.x<br />
* IMPORTANT: Use md5sum, sha1sum, or sha256sum to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!<br />
* Verify that the output of 'cat /sys/class/mtd/mtd3/name' is "kernel" and the output of 'cat /sys/class/mtd/mtd4/name' is "rofs". We will be flashing mtd devices directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.<br />
* flashcp -v image-kernel /dev/mtd3<br />
* flashcp -v image-rofs /dev/mtd4<br />
* (TODO: Describe how to reset rwfs in case it was damaged as well?)<br />
* After the flash is complete, you can restart the BMC and it should boot successfully.<br />
<br />
* (TODO: Discussion of using Kermit to upload the image without network access)<br />
* (TODO: Load recovery images over USB?)<br />
* (TODO: Discussion of u-boot memory map)<br />
<br />
=== Troubleshooting ===<br />
TODO</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Talos_II/Building_FAQ&diff=1072Talos II/Building FAQ2018-06-12T04:11:00Z<p>Bdragon: /* BMC serial port J7701 */ I'm mostly seeing "DTK" on the internet, was this a typo originally in the wiki?</p>
<hr />
<div>== Where is the installation manual online? ==<br />
<br />
[[File:T2P9D01_users_guide_version_1_0.pdf]]<br />
<br />
== My motherboard bag's seal/labels are broken! Has it been compromised? ==<br />
<br />
This is normal for now. (It may have been compromised still, but the broken labels don't indicate that.)<br />
<br />
== Mounting in case ==<br />
<br />
=== Where do I get the stand-offs and screws? ===<br />
<br />
They should come with your case. (Check inside drive bays and such.)<br />
<br />
=== Should I use rubber spacers with the stand-offs? ===<br />
<br />
Stand-offs are supposed to help ground the motherboard, so it's better not to.<br />
<br />
=== My case doesn't have holes for some stand-offs! ===<br />
<br />
Not necessarily a big deal, especially for the top-left where the I/O plate helps hold it in place.<br />
<br />
However, note that without stand-offs, you may accidentally bend the board when inserting CPUs, RAM, or other components.<br />
Such bending may damage the board!<br />
<br />
== CPU/HSF installation ==<br />
<br />
=== What is an indium pad? Does the stock HSF include it? ===<br />
<br />
Indium pads help heat transfer from the CPU to the HSF.<br />
4-core and 8-core CPUs do not require them (and do not ship with them).<br />
More powerful CPUs should ship with them if required (TBD whether pre-applied to the HSF, or separately).<br />
<br />
=== Should I remove the label/sticker from the HSF? ===<br />
<br />
No.<br />
Do not remove the label/sticker, or you will void the warranty of the HSF.<br />
<br />
=== Can I use 4mm hex driver? ===<br />
Yes. A 5/32" driver measures 3.97 mm, so a 4 mm driver is a near-exact match.<br />
<br />
== Front panel I/O ==<br />
<br />
=== Which is the other side of the buttons? ===<br />
<br />
Typically ground, though there is nothing mandating this in the general case. ATX case switches normally short out two adjacent pins when depressed.<br />
<br />
FIXME: Confirm this is the case for Talos specifically.<br />
<br />
=== Are the LED "cathode" pins the plus or minus side? ===<br />
<br />
Minus.<br />
<br />
=== What should the plus side of the LED be connected to? ===<br />
<br />
The associated Anode pin.<br />
<br />
{| class="wikitable"<br />
! Purpose || - || +<br />
|-<br />
| Fan fail || 6 || 8<br />
|-<br />
| NIC 2 || 10 || 9<br />
|-<br />
| NIC 1 || 12 || 11<br />
|-<br />
| HD* || 14 || 15<br />
|-<br />
| Power || 16 || 15<br />
|}<br />
<br />
=== What does the Identify button do? ===<br />
<br />
Turns on and off the Identify LEDs. This is mainly useful in server farms, as the ID LED status can be both read and set via software (IPMI). The main use is making sure that the correct server is unplugged, restarted, upgraded, etc. by datacenter staff.<br />
<br />
=== What does the NMI button do? ===<br />
<br />
As of this writing the NMI button is ignored by the BMC. It may be used to generate an NMI in future firmware revisions, or serve another purpose entirely.<br />
<br />
=== The HD activity LED doesn't work! ===<br />
<br />
The integrated Microsemi controller does not report activity (yet). A much-belated SAS controller firmware update from Microsemi is expected by 04/20/2018 to enable this functionality.<br />
<br />
In the interim, J10115 can be connected to other hardware to control the HD activity LED.<br />
<br />
=== What is J10115? ===<br />
<br />
Something related to HD activity LED. :)<br />
<br />
== BMC serial port J7701 ==<br />
When buying the "serial port bracket", you will need one wired in the Intel/TDK (DTK)/Tyan style, not the AT/Everex/Gigabyte style. See http://pinoutguide.com/Motherboard/rs232_header_pinout.shtml for the differences.<br />
Intel is https://iczc.cz/8fi3g7r5amg33a1pjn9tl4v9r8_7/obrazek while the other one is https://iczc.cz/5dessg9ns0ht49fed64jmrsita_7/obrazek.<br />
The proof is on page 77 of the schematics.<br />
<br />
Be careful when looking at specification pages on item listings, some of the wrong ones are sold as "Intel" compatible despite being the other style.<br />
<br />
== What is OCC mode? ==<br />
<br />
The On Chip Controller (OCC) is a clock / thermal management engine.<br />
<br />
The OCC can enter a safe mode if external hardware detects a condition that would require power throttling. This feature is not active in firmware on Talos II, but the wiring required to support it is present for future expansion.<br />
<br />
== What are the effects of the "CPU secure mode disable" jumpers? ==<br />
<br />
When secure mode is disabled, the on-board SBE will not halt IPL if the next stage (hostboot) fails security verification. When secure mode is enabled, each step of the IPL process verifies the next, and will halt IPL if a discrepancy (hash difference, invalid signature, etc.) is found. Talos II ships with secure mode disabled as of this writing.<br />
<br />
== How do I verify the PGP key that signed the DVD? ==<br />
<br />
(Unknown; while the process to verify the DVD is signed by a given key is [[Verifying_DVDs|documented]], there is no documented process at this time to verify which key is the correct Raptor Sales Team key)<br />
<br />
== What is micro PCI-e? ==<br />
<br />
Unknown.<br />
<br />
== How to get versions of firmware components? ==<br />
* run <code>lsprop</code> under <code>/proc/device-tree/ibm,firmware-versions</code><br />
* run <code>lsmcode</code> (available in <code>lsvpd</code> package)<br />
* run <code>ipmitool fru print 47</code></div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Talos_II/Building_FAQ&diff=1071Talos II/Building FAQ2018-06-12T03:57:28Z<p>Bdragon: /* BMC serial port J7701 */ welp, I was linking the wrong one after all.</p>
<hr />
<div>== Where is the installation manual online? ==<br />
<br />
[[File:T2P9D01_users_guide_version_1_0.pdf]]<br />
<br />
== My motherboard bag's seal/labels are broken! Has it been compromised? ==<br />
<br />
This is normal for now. (It may have been compromised still, but the broken labels don't indicate that.)<br />
<br />
== Mounting in case ==<br />
<br />
=== Where do I get the stand-offs and screws? ===<br />
<br />
They should come with your case. (Check inside drive bays and such.)<br />
<br />
=== Should I use rubber spacers with the stand-offs? ===<br />
<br />
Stand-offs are supposed to help ground the motherboard, so it's better not to.<br />
<br />
=== My case doesn't have holes for some stand-offs! ===<br />
<br />
Not necessarily a big deal, especially for the top-left where the I/O plate helps hold it in place.<br />
<br />
However, note that without stand-offs, you may accidentally bend the board when inserting CPUs, RAM, or other components.<br />
Such bending may damage the board!<br />
<br />
== CPU/HSF installation ==<br />
<br />
=== What is an indium pad? Does the stock HSF include it? ===<br />
<br />
Indium pads help heat transfer from the CPU to the HSF.<br />
4-core and 8-core CPUs do not require them (and do not ship with them).<br />
More powerful CPUs should ship with them if required (TBD whether pre-applied to the HSF, or separately).<br />
<br />
=== Should I remove the label/sticker from the HSF? ===<br />
<br />
No.<br />
Do not remove the label/sticker, or you will void the warranty of the HSF.<br />
<br />
=== Can I use 4mm hex driver? ===<br />
Yes; 5/32" = 3.97 mm.<br />
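<br />
The conversion can be checked with a short calculation. This is only a sketch; the helper name <code>inch_fraction_to_mm</code> is made up for illustration, and the only fact it relies on is 1 inch = 25.4 mm.<br />

```shell
# Hypothetical helper: convert an inch fraction (numerator, denominator)
# to millimetres, rounded to two decimals. 1 in = 25.4 mm.
inch_fraction_to_mm() {
    awk -v num="$1" -v den="$2" 'BEGIN { printf "%.2f\n", num / den * 25.4 }'
}

inch_fraction_to_mm 5 32
```

The result differs from 4 mm by about 0.03 mm.<br />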
<br />
== Front panel I/O ==<br />
<br />
=== Which is the other side of the buttons? ===<br />
<br />
Typically ground, though there is nothing mandating this in the general case. ATX case switches normally short out two adjacent pins when depressed.<br />
<br />
FIXME: Confirm this is the case for Talos specifically.<br />
<br />
=== Are the LED "cathode" pins the plus or minus side? ===<br />
<br />
Minus.<br />
<br />
=== What should the plus side of the LED be connected to? ===<br />
<br />
The associated Anode pin.<br />
<br />
{| class="wikitable"<br />
! Purpose || - || +<br />
|-<br />
| Fan fail || 6 || 8<br />
|-<br />
| NIC 2 || 10 || 9<br />
|-<br />
| NIC 1 || 12 || 11<br />
|-<br />
| HD* || 14 || 15<br />
|-<br />
| Power || 16 || 15<br />
|}<br />
<br />
=== What does the Identify button do? ===<br />
<br />
Turns on and off the Identify LEDs. This is mainly useful in server farms, as the ID LED status can be both read and set via software (IPMI). The main use is making sure that the correct server is unplugged, restarted, upgraded, etc. by datacenter staff.<br />
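<br />
As a rough illustration of setting the ID LED from software, the sketch below prints an <code>ipmitool chassis identify</code> command. It assumes <code>ipmitool</code> is installed and can reach the BMC; the wrapper <code>identify_cmd</code> is a hypothetical helper, and it only prints the command rather than running it.<br />

```shell
# Hypothetical helper: print the ipmitool command that lights the
# Identify LED for the given number of seconds (default 15).
# An interval of 0 turns the LED off again.
identify_cmd() {
    seconds="${1:-15}"
    echo "ipmitool chassis identify ${seconds}"
}

# Prints the command; run the printed line to apply it.
identify_cmd 30
```

Printing the command first makes it easy to confirm which machine is being addressed before the LED state is changed.<br />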
<br />
=== What does the NMI button do? ===<br />
<br />
As of this writing the NMI button is ignored by the BMC. It may be used to generate an NMI in future firmware revisions, or serve another purpose entirely.<br />
<br />
=== The HD activity LED doesn't work! ===<br />
<br />
The integrated Microsemi controller does not report activity (yet). A much-belated SAS controller firmware update from Microsemi is expected by 04/20/2018 to enable this functionality.<br />
<br />
In the interim, J10115 can be connected to other hardware to control the HD activity LED.<br />
<br />
=== What is J10115? ===<br />
<br />
Something related to HD activity LED. :)<br />
<br />
== BMC serial port J7701 ==<br />
When buying the "serial port bracket", you will need one in the Intel/TDK/Tyan style, not the AT/Everex/Gigabyte style; see http://pinoutguide.com/Motherboard/rs232_header_pinout.shtml for the differences.<br />
Intel is https://iczc.cz/8fi3g7r5amg33a1pjn9tl4v9r8_7/obrazek while the other one is https://iczc.cz/5dessg9ns0ht49fed64jmrsita_7/obrazek.<br />
The proof is on page 77 of the schematics.<br />
<br />
Be careful when looking at specification pages on item listings; some of the wrong ones are sold as "Intel" compatible despite being the other style.<br />
<br />
== What is OCC mode? ==<br />
<br />
The On Chip Controller (OCC) is a clock / thermal management engine.<br />
<br />
The OCC can enter a safe mode if external hardware detects a condition that would require power throttling. This feature is not active in firmware on Talos II, but the wiring required to support it is present for future expansion.<br />
<br />
== What are the effects of the "CPU secure mode disable" jumpers? ==<br />
<br />
When secure mode is disabled, the on-board SBE will not halt IPL if the next stage (hostboot) fails security verification. When secure mode is enabled, each step of the IPL process verifies the next, and will halt IPL if a discrepancy (hash difference, invalid signature, etc.) is found. Talos II ships with secure mode disabled as of this writing.<br />
<br />
== How do I verify the PGP key that signed the DVD? ==<br />
<br />
(Unknown; while the process to verify the DVD is signed by a given key is [[Verifying_DVDs|documented]], there is no documented process at this time to verify which key is the correct Raptor Sales Team key)<br />
<br />
== What is micro PCI-e? ==<br />
<br />
Unknown.<br />
<br />
== How to get versions of firmware components? ==<br />
* run <code>lsprop</code> under <code>/proc/device-tree/ibm,firmware-versions</code><br />
* run <code>lsmcode</code> (available in <code>lsvpd</code> package)<br />
* run <code>ipmitool fru print 47</code></div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Talos_II/Building_FAQ&diff=1070Talos II/Building FAQ2018-06-11T17:22:11Z<p>Bdragon: /* How do I verify the PGP key that signed the DVD? */ adding wikilink</p>
<hr />
<div>== Where is the installation manual online? ==<br />
<br />
[[File:T2P9D01_users_guide_version_1_0.pdf]]<br />
<br />
== My motherboard bag's seal/labels are broken! Has it been compromised? ==<br />
<br />
This is normal for now. (It may have been compromised still, but the broken labels don't indicate that.)<br />
<br />
== Mounting in case ==<br />
<br />
=== Where do I get the stand-offs and screws? ===<br />
<br />
They should come with your case. (Check inside drive bays and such.)<br />
<br />
=== Should I use rubber spacers with the stand-offs? ===<br />
<br />
Stand-offs are supposed to help ground the motherboard, so it's better not to.<br />
<br />
=== My case doesn't have holes for some stand-offs! ===<br />
<br />
Not necessarily a big deal, especially for the top-left where the I/O plate helps hold it in place.<br />
<br />
However, note that without stand-offs, you may accidentally bend the board when inserting CPUs, RAM, or other components.<br />
Such bending may damage the board!<br />
<br />
== CPU/HSF installation ==<br />
<br />
=== What is an indium pad? Does the stock HSF include it? ===<br />
<br />
Indium pads help heat transfer from the CPU to the HSF.<br />
4-core and 8-core CPUs do not require them (and do not ship with them).<br />
More powerful CPUs should ship with them if required (TBD whether pre-applied to the HSF, or separately).<br />
<br />
=== Should I remove the label/sticker from the HSF? ===<br />
<br />
No.<br />
Do not remove the label/sticker, or you will void the warranty of the HSF.<br />
<br />
=== Can I use 4mm hex driver? ===<br />
Yes; 5/32" = 3.97 mm.<br />
<br />
== Front panel I/O ==<br />
<br />
=== Which is the other side of the buttons? ===<br />
<br />
Typically ground, though there is nothing mandating this in the general case. ATX case switches normally short out two adjacent pins when depressed.<br />
<br />
FIXME: Confirm this is the case for Talos specifically.<br />
<br />
=== Are the LED "cathode" pins the plus or minus side? ===<br />
<br />
Minus.<br />
<br />
=== What should the plus side of the LED be connected to? ===<br />
<br />
The associated Anode pin.<br />
<br />
{| class="wikitable"<br />
! Purpose || - || +<br />
|-<br />
| Fan fail || 6 || 8<br />
|-<br />
| NIC 2 || 10 || 9<br />
|-<br />
| NIC 1 || 12 || 11<br />
|-<br />
| HD* || 14 || 15<br />
|-<br />
| Power || 16 || 15<br />
|}<br />
<br />
=== What does the Identify button do? ===<br />
<br />
Turns on and off the Identify LEDs. This is mainly useful in server farms, as the ID LED status can be both read and set via software (IPMI). The main use is making sure that the correct server is unplugged, restarted, upgraded, etc. by datacenter staff.<br />
<br />
=== What does the NMI button do? ===<br />
<br />
As of this writing the NMI button is ignored by the BMC. It may be used to generate an NMI in future firmware revisions, or serve another purpose entirely.<br />
<br />
=== The HD activity LED doesn't work! ===<br />
<br />
The integrated Microsemi controller does not report activity (yet). A much-belated SAS controller firmware update from Microsemi is expected by 04/20/2018 to enable this functionality.<br />
<br />
In the interim, J10115 can be connected to other hardware to control the HD activity LED.<br />
<br />
=== What is J10115? ===<br />
<br />
Something related to HD activity LED. :)<br />
<br />
== BMC serial port J7701 ==<br />
When buying the "serial port bracket", you will need one in the Intel/TDK/Tyan style, not the AT/Everex/Gigabyte style; see http://pinoutguide.com/Motherboard/rs232_header_pinout.shtml for the differences.<br />
Intel is https://iczc.cz/8fi3g7r5amg33a1pjn9tl4v9r8_7/obrazek while the other one is https://iczc.cz/5dessg9ns0ht49fed64jmrsita_7/obrazek.<br />
The proof is on page 77 of the schematics.<br />
<br />
== What is OCC mode? ==<br />
<br />
The On Chip Controller (OCC) is a clock / thermal management engine.<br />
<br />
The OCC can enter a safe mode if external hardware detects a condition that would require power throttling. This feature is not active in firmware on Talos II, but the wiring required to support it is present for future expansion.<br />
<br />
== What are the effects of the "CPU secure mode disable" jumpers? ==<br />
<br />
When secure mode is disabled, the on-board SBE will not halt IPL if the next stage (hostboot) fails security verification. When secure mode is enabled, each step of the IPL process verifies the next, and will halt IPL if a discrepancy (hash difference, invalid signature, etc.) is found. Talos II ships with secure mode disabled as of this writing.<br />
<br />
== How do I verify the PGP key that signed the DVD? ==<br />
<br />
(Unknown; while the process to verify the DVD is signed by a given key is [[Verifying_DVDs|documented]], there is no documented process at this time to verify which key is the correct Raptor Sales Team key)<br />
<br />
== What is micro PCI-e? ==<br />
<br />
Unknown.<br />
<br />
== How to get versions of firmware components? ==<br />
* run <code>lsprop</code> under <code>/proc/device-tree/ibm,firmware-versions</code><br />
* run <code>lsmcode</code> (available in <code>lsvpd</code> package)<br />
* run <code>ipmitool fru print 47</code></div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=User_talk:JSharp&diff=1069User talk:JSharp2018-06-11T17:18:06Z<p>Bdragon: /* IRC channel */ new section</p>
<hr />
<div>== Ultravisor state ==<br />
<br />
I am assuming you are justinrwlynn because of [https://twitter.com/justinrwlynn/status/953221978659278849 this tweet].<br />
<br />
I made a page about the [[Machine State Register|Machine State Register]] and [[Privilege States|Privilege States]] that might be helpful to narrow down the search for a possible UV bit.<br />
<br />
== IRC channel ==<br />
<br />
The Talos IRC channel is #talos-workstation on freenode.net. See you there!</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Template:CURRENT_BMC_VERSION&diff=1066Template:CURRENT BMC VERSION2018-06-11T03:48:23Z<p>Bdragon: defining the current bmc version</p>
<hr />
<div>1.05</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Compiling_Firmware&diff=1065Compiling Firmware2018-06-11T03:47:47Z<p>Bdragon: /* Grabbing the sources */ Convert to wiki syntax and templatize the version</p>
<hr />
<div>The following steps can be used to compile and update the firmware on [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.<br />
<br />
== Requirements ==<br />
* At least 25GB of free hard drive space<br />
* 16GB of free RAM<br />
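<br />
A quick pre-flight check of these requirements might look like the sketch below. It is not part of op-build; <code>meets_minimum</code> is a hypothetical helper, and the script assumes a Linux host with <code>/proc/meminfo</code> and a POSIX <code>df</code>.<br />

```shell
# Hypothetical pre-flight check for the requirements listed above.
# Thresholds come from the list: 25 GB free disk, 16 GB free RAM.
meets_minimum() {
    # usage: meets_minimum <available> <required>   (same unit for both)
    [ "$1" -ge "$2" ]
}

# Free space (in GB) on the filesystem holding the current directory.
free_disk_gb=$(df -Pk . | awk 'NR == 2 { print int($4 / 1024 / 1024) }')
# Available memory (in GB) as reported by the Linux kernel.
free_ram_gb=$(awk '/^MemAvailable:/ { print int($2 / 1024 / 1024) }' /proc/meminfo)

meets_minimum "${free_disk_gb:-0}" 25 || echo "warning: less than 25 GB of free disk space"
meets_minimum "${free_ram_gb:-0}" 16 || echo "warning: less than 16 GB of free RAM"
```

Run it from the directory where the sources will be checked out, so that <code>df</code> measures the right filesystem.<br />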
<br />
=== Operating System ===<br />
The build system (op-build) has been primarily tested using Debian stretch. If you are on a different operating system such as Fedora 28, a Debian chroot should be used:<br />
<pre><br />
sudo yum install debootstrap dpkg<br />
sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian<br />
sudo mount -t proc none debian-chroot/proc/<br />
sudo mount -o bind /sys/ debian-chroot/sys/<br />
sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/<br />
</pre><br />
<br />
Enter the chroot and install the needed packages:<br />
<pre><br />
sudo chroot debian-chroot/<br />
apt-get install software-properties-common locales<br />
# Packages needed for PNOR builds<br />
apt-get install cscope ctags libz-dev libexpat-dev \<br />
python texinfo \<br />
build-essential g++ git bison flex unzip \<br />
libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc \<br />
wget bc<br />
# Packages needed for OpenBMC builds<br />
apt-get install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
</pre><br />
<br />
Create a chroot user:<br />
<pre><br />
useradd -m build-user -s /bin/bash<br />
su build-user<br />
cd<br />
</pre><br />
<br />
You can now use the chroot to build the firmware.<br />
<br />
To enter the chroot in the future, you can run the following from a regular terminal:<br />
<pre><br />
sudo chroot debian-chroot/<br />
su build-user<br />
cd<br />
</pre><br />
<br />
== Building the PNOR Firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
<pre>git clone --recursive https://scm.raptorcs.com/scm/git/talos-op-build</pre><br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-op-build<br />
. op-build-env<br />
op-build talos_defconfig<br />
op-build</pre><br />
<br />
To rebuild an individual package (such as hostboot) and recreate the pnor image, the following can be run:<br />
<pre><br />
op-build hostboot-rebuild openpower-pnor-rebuild<br />
</pre><br />
<br />
=== Updating the firmware ===<br />
Copy the firmware to the BMC:<br />
<pre>scp ./output/images/talos.pnor root@<talos-openbmc>:/tmp/</pre><br />
<br />
<br />
At this point, you should connect two SSH sessions to OpenBMC.<br />
In the first session, run the following to display the console during bootup:<br />
<pre>ssh -p 2200 root@<talos-openbmc></pre><br />
The console log will be useful in debugging any issues with the firmware that could occur.<br />
<br />
In the second BMC session, ensure the system is off by running obmcutil. You should see the following:<br />
<pre>ssh root@<talos-openbmc><br />
root@talos:~# obmcutil state<br />
CurrentBMCState : xyz.openbmc_project.State.BMC.BMCState.Ready<br />
CurrentPowerState : xyz.openbmc_project.State.Chassis.PowerState.Off<br />
CurrentHostState : xyz.openbmc_project.State.Host.HostState.Off<br />
</pre><br />
The CurrentHostState must be Off before continuing with the procedure.<br />
If the CurrentHostState is not Off, please turn off the machine:<br />
<pre>obmcutil chassisoff</pre><br />
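<br />
The "host must be Off" check can also be scripted. The sketch below assumes the <code>obmcutil state</code> output format shown above; <code>host_is_off</code> is a hypothetical helper, not an OpenBMC command.<br />

```shell
# Hypothetical helper: succeeds only if `obmcutil state` output (on stdin)
# reports the host as Off.
host_is_off() {
    grep -q 'CurrentHostState.*HostState\.Off[[:space:]]*$'
}

# On the BMC, refuse to flash unless the host is off:
#   obmcutil state | host_is_off && pflash -E -p /tmp/talos.pnor
```

Guarding the flash command this way avoids starting an update while the host is still running.<br />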
<br />
Once off, perform the update:<br />
<pre>pflash -E -p /tmp/talos.pnor</pre><br />
<br />
Start the machine:<br />
<pre>obmcutil poweron</pre><br />
<br />
Note: the machine may reboot multiple times after the initial flash.<br />
<br />
=== Troubleshooting ===<br />
==== SBE_MASTER_VERSION_DOWNLEVEL ====<br />
If you see the following message reported in the console, then the SBE update process did not work as expected:<br />
<pre> 16.74709|Error reported by sbe (0x2200) PLID 0x90000008<br />
16.74823| SBE Image Version Miscompare with Master Target<br />
16.74824| ModuleId 0x0d SBE_MASTER_VERSION_COMPARE<br />
16.74825| ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL<br />
16.74826| UserData1 Master Target HUID : 0x0000000000050000<br />
16.74826| UserData2 Master Target Loop Index : 0x0000000000000000</pre><br />
<br />
The machine needs to be reset to finish the update procedure using the following:<br />
<pre>obmcutil chassisoff<br />
systemctl stop xyz.openbmc_project.State.Host.service<br />
systemctl start xyz.openbmc_project.State.Host.service<br />
obmcutil poweron</pre><br />
The update should now complete as expected.<br />
<br />
A bug report is open[https://github.com/open-power/sbe/issues/7] to track this issue.<br />
<br />
==== internal compiler error: Killed ====<br />
Building the hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:<br />
<pre>powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)</pre><br />
To continue you have a few options:<br />
* Reduce the number of parallel jobs being run by appending -j<num> to your build command line<br />
<pre>op-build -j4</pre><br />
* Increase the swap space<br />
* Install additional RAM<br />
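<br />
One way to choose the job count is to budget a fixed amount of RAM per compiler job. The sketch below does that; <code>pick_jobs</code> is a hypothetical helper, and the 4 GB per job figure in the example is an assumption for illustration, not a measured requirement of hostboot.<br />

```shell
# Hypothetical helper: choose a -j value so that each parallel job gets at
# least <gb_per_job> GB of RAM, capped at the number of CPU cores.
pick_jobs() {
    # usage: pick_jobs <ram_gb> <cores> <gb_per_job>
    n=$(( $1 / $3 ))
    [ "$n" -gt "$2" ] && n=$2
    [ "$n" -lt 1 ] && n=1
    echo "$n"
}

# Example: 16 GB of RAM, 8 cores, and an assumed 4 GB per job.
echo "op-build -j$(pick_jobs 16 8 4)"
```

If the build is still killed, lower the job count further or raise the per-job budget.<br />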
<br />
== Building the OpenBMC firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code and check out the tag:<br />
<br />
git clone https://git.raptorcs.com/git/talos-openbmc<br />
cd talos-openbmc<br />
git checkout raptor-v{{CURRENT_BMC_VERSION}}<br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-openbmc<br />
export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf<br />
. openbmc-env<br />
bitbake obmc-phosphor-image<br />
</pre><br />
<br />
The resulting firmware can be found in the tmp/deploy/images/talos/ directory.<br />
<br />
=== Updating the firmware ===<br />
Once the firmware has been built, the resulting kernel and rofs binaries need to be copied to the /run/initramfs/ directory on the BMC:<br />
<pre><br />
scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@<talos-openbmc>:/run/initramfs/<br />
</pre><br />
<br />
Once the images have been transferred, reboot the BMC:<br />
<pre>root@<talos-openbmc> reboot</pre><br />
<br />
OpenBMC may take a while to reboot. Once complete, you will be able to log back in via ssh.<br />
<br />
=== BMC Recovery procedure via U-Boot ===<br />
WIP -- these instructions are still being tested, use at your own risk!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. --><br />
<br />
In the event of a failure updating the BMC, but with a functioning u-boot, you can still recover by using u-boot to manually bootstrap the BMC over the network or serial line.<br />
<br />
* Prepare a TFTP server, and place image-bmc in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* After the BMC has booted, attempt the normal firmware update procedure again.<br />
<br />
* (TODO: Figure out a safe way to wipe rwfs using the shipped u-boot, in the case that the reason for boot failure was rwfs corruption. <!-- Going by the environment variables, an earlier u-boot build must have had the ubi tools, but they appear to be missing in the production builds. Not sure what happened here. - Bdragon -->)<br />
* (TODO: Come up with a procedure for making a minimal recovery image that gets us into a better environment and is quickly loadable via serial? secondary u-boot image with more tools enabled?)<br />
<br />
* (TODO: Discussion of using Kermit to upload the image without network access)<br />
* (TODO: Discussion of u-boot memory map)<br />
<br />
<br />
=== Troubleshooting ===<br />
TODO</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=User:Bdragon/Template:CURRENT_BMC_RELEASE&diff=1064User:Bdragon/Template:CURRENT BMC RELEASE2018-06-11T03:33:37Z<p>Bdragon: testing variables in my user namespace.</p>
<hr />
<div>1.05</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=User:Bdragon&diff=1063User:Bdragon2018-06-11T03:29:01Z<p>Bdragon: Created page with "I help out in IRC."</p>
<hr />
<div>I help out in IRC.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Compiling_Firmware&diff=1062Compiling Firmware2018-06-11T03:22:17Z<p>Bdragon: tweaking wording</p>
<hr />
<div>The following steps can be used to compile and update the firmware on [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.<br />
<br />
== Requirements ==<br />
* At least 25GB of free hard drive space<br />
* 16GB of free RAM<br />
<br />
=== Operating System ===<br />
The build system (op-build) has been primarily tested using Debian stretch. If you are on a different operating system such as Fedora 28, a Debian chroot should be used:<br />
<pre><br />
sudo yum install debootstrap dpkg<br />
sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian<br />
sudo mount -t proc none debian-chroot/proc/<br />
sudo mount -o bind /sys/ debian-chroot/sys/<br />
sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/<br />
</pre><br />
<br />
Enter the chroot and install the needed packages:<br />
<pre><br />
sudo chroot debian-chroot/<br />
apt-get install software-properties-common locales<br />
# Packages needed for PNOR builds<br />
apt-get install cscope ctags libz-dev libexpat-dev \<br />
python texinfo \<br />
build-essential g++ git bison flex unzip \<br />
libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc \<br />
wget bc<br />
# Packages needed for OpenBMC builds<br />
apt-get install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
</pre><br />
<br />
Create a chroot user:<br />
<pre><br />
useradd -m build-user -s /bin/bash<br />
su build-user<br />
cd<br />
</pre><br />
<br />
You can now use the chroot to build the firmware.<br />
<br />
To enter the chroot in the future, you can run the following from a regular terminal:<br />
<pre><br />
sudo chroot debian-chroot/<br />
su build-user<br />
cd<br />
</pre><br />
<br />
== Building the PNOR Firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
<pre>git clone --recursive https://scm.raptorcs.com/scm/git/talos-op-build</pre><br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-op-build<br />
. op-build-env<br />
op-build talos_defconfig<br />
op-build</pre><br />
<br />
To rebuild an individual package (such as hostboot) and recreate the pnor image, the following can be run:<br />
<pre><br />
op-build hostboot-rebuild openpower-pnor-rebuild<br />
</pre><br />
<br />
=== Updating the firmware ===<br />
Copy the firmware to the BMC:<br />
<pre>scp ./output/images/talos.pnor root@<talos-openbmc>:/tmp/</pre><br />
<br />
<br />
At this point, you should connect two SSH sessions to OpenBMC.<br />
In the first session, run the following to display the console during bootup:<br />
<pre>ssh -p 2200 root@<talos-openbmc></pre><br />
The console log will be useful in debugging any issues with the firmware that could occur.<br />
<br />
In the second BMC session, ensure the system is off by running obmcutil. You should see the following:<br />
<pre>ssh root@<talos-openbmc><br />
root@talos:~# obmcutil state<br />
CurrentBMCState : xyz.openbmc_project.State.BMC.BMCState.Ready<br />
CurrentPowerState : xyz.openbmc_project.State.Chassis.PowerState.Off<br />
CurrentHostState : xyz.openbmc_project.State.Host.HostState.Off<br />
</pre><br />
The CurrentHostState must be Off before continuing with the procedure.<br />
If the CurrentHostState is not Off, please turn off the machine:<br />
<pre>obmcutil chassisoff</pre><br />
<br />
Once off, perform the update:<br />
<pre>pflash -E -p /tmp/talos.pnor</pre><br />
<br />
Start the machine:<br />
<pre>obmcutil poweron</pre><br />
<br />
Note: the machine may reboot multiple times after the initial flash.<br />
<br />
=== Troubleshooting ===<br />
==== SBE_MASTER_VERSION_DOWNLEVEL ====<br />
If you see the following message reported in the console, then the SBE update process did not work as expected:<br />
<pre> 16.74709|Error reported by sbe (0x2200) PLID 0x90000008<br />
16.74823| SBE Image Version Miscompare with Master Target<br />
16.74824| ModuleId 0x0d SBE_MASTER_VERSION_COMPARE<br />
16.74825| ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL<br />
16.74826| UserData1 Master Target HUID : 0x0000000000050000<br />
16.74826| UserData2 Master Target Loop Index : 0x0000000000000000</pre><br />
<br />
The machine needs to be reset to finish the update procedure using the following:<br />
<pre>obmcutil chassisoff<br />
systemctl stop xyz.openbmc_project.State.Host.service<br />
systemctl start xyz.openbmc_project.State.Host.service<br />
obmcutil poweron</pre><br />
The update should now complete as expected.<br />
<br />
A bug report is open[https://github.com/open-power/sbe/issues/7] to track this issue.<br />
<br />
==== internal compiler error: Killed ====<br />
Building the hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:<br />
<pre>powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)</pre><br />
To continue you have a few options:<br />
* Reduce the number of parallel jobs being run by appending -j<num> to your build command line<br />
<pre>op-build -j4</pre><br />
* Increase the swap space<br />
* Install additional RAM<br />
<br />
== Building the OpenBMC firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code and check out the tag:<br />
<pre>git clone https://git.raptorcs.com/git/talos-openbmc<br />
cd talos-openbmc<br />
git checkout raptor-v1.02</pre><br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-openbmc<br />
export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf<br />
. openbmc-env<br />
bitbake obmc-phosphor-image<br />
</pre><br />
<br />
The resulting firmware can be found in the tmp/deploy/images/talos/ directory.<br />
<br />
=== Updating the firmware ===<br />
Once the firmware has been built, the resulting kernel and rofs binaries need to be copied to the /run/initramfs/ directory on the BMC:<br />
<pre><br />
scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@<talos-openbmc>:/run/initramfs/<br />
</pre><br />
<br />
Once the images have been transferred, reboot the BMC:<br />
<pre>root@<talos-openbmc> reboot</pre><br />
<br />
OpenBMC may take a while to reboot. Once complete, you will be able to log back in via ssh.<br />
<br />
=== BMC Recovery procedure via U-Boot ===<br />
WIP -- these instructions are still being tested, use at your own risk!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. --><br />
<br />
In the event of a failure updating the BMC, but with a functioning u-boot, you can still recover by using u-boot to manually bootstrap the BMC over the network or serial line.<br />
<br />
* Prepare a TFTP server, and place image-bmc in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing <code>x.x.x.x</code> with the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* After the BMC has booted, attempt the normal firmware update procedure again.<br />
<br />
* (TODO: Figure out a safe way to wipe rwfs using the shipped u-boot, in the case that the reason for boot failure was rwfs corruption. <!-- Going by the environment variables, an earlier u-boot build must have had the ubi tools, but they appear to be missing in the production builds. Not sure what happened here. - Bdragon -->)<br />
* (TODO: Come up with a procedure for making a minimal recovery image that gets us into a better environment and is quickly loadable via serial? secondary u-boot image with more tools enabled?)<br />
<br />
* (TODO: Discussion of using Kermit to upload the image without network access)<br />
* (TODO: Discussion of u-boot memory map)<br />
<br />
<br />
=== Troubleshooting ===<br />
TODO</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Compiling_Firmware&diff=1061Compiling Firmware2018-06-11T03:20:39Z<p>Bdragon: /* Building the OpenBMC firmware */ Starting on emercency recovery procedure</p>
<hr />
<div>The following steps can be used to compile and update the firmware on [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.<br />
<br />
== Requirements ==<br />
* At least 25GB of free hard drive space<br />
* 16GB of free RAM<br />
<br />
=== Operating System ===<br />
The build system (op-build) has been primarily tested using Debian stretch. If you are on a different operating system such as Fedora 28, a Debian chroot should be used:<br />
<pre><br />
sudo yum install debootstrap dpkg<br />
sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian<br />
sudo mount -t proc none debian-chroot/proc/<br />
sudo mount -o bind /sys/ debian-chroot/sys/<br />
sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/<br />
</pre><br />
<br />
Enter the chroot and install the needed packages:<br />
<pre><br />
sudo chroot debian-chroot/<br />
apt-get install software-properties-common locales<br />
# Packages needed for PNOR builds<br />
apt-get install cscope ctags libz-dev libexpat-dev \<br />
python texinfo \<br />
build-essential g++ git bison flex unzip \<br />
libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc \<br />
wget bc<br />
# Packages needed for OpenBMC builds<br />
apt-get install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat<br />
</pre><br />
<br />
Create a chroot user:<br />
<pre><br />
useradd -m build-user -s /bin/bash<br />
su build-user<br />
cd<br />
</pre><br />
<br />
You can now use the chroot to build the firmware.<br />
<br />
To enter the chroot in the future, you can run the following from a regular terminal:<br />
<pre><br />
sudo chroot debian-chroot/<br />
su build-user<br />
cd<br />
</pre><br />
<br />
== Building the PNOR Firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code:<br />
<pre>git clone --recursive https://scm.raptorcs.com/scm/git/talos-op-build</pre><br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-op-build<br />
. op-build-env<br />
op-build talos_defconfig<br />
op-build</pre><br />
<br />
To rebuild an individual package (such as hostboot) and recreate the pnor image, the following can be run:<br />
<pre><br />
op-build hostboot-rebuild openpower-pnor-rebuild<br />
</pre><br />
<br />
=== Updating the firmware ===<br />
Copy the firmware to the BMC:<br />
<pre>scp ./output/images/talos.pnor root@<talos-openbmc>:/tmp/</pre><br />
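<br />
Before flashing, it can be worth confirming that the image arrived intact. The sketch below is an optional extra step, not part of the original procedure; <code>sha256_of</code> is a hypothetical helper name, and it assumes <code>sha256sum</code> is available on the build machine:<br />

```shell
# Hypothetical helper: print only the hex SHA-256 digest of stdin.
sha256_of() {
    sha256sum | cut -d' ' -f1
}
```

One way to use it is to compare <code>sha256_of < ./output/images/talos.pnor</code> against <code>ssh root@<talos-openbmc> cat /tmp/talos.pnor | sha256_of</code>. Streaming the file back over SSH avoids relying on any checksum tool on the BMC itself; the two digests should be identical.<br />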
<br />
<br />
At this point, you should connect two SSH sessions to OpenBMC.<br />
In the first session, run the following to display the console during bootup:<br />
<pre>ssh -p 2200 root@<talos-openbmc></pre><br />
The console log will be useful in debugging any issues with the firmware that could occur.<br />
<br />
In the second BMC session, ensure the system is off by running obmcutil. You should see the following:<br />
<pre>ssh root@<talos-openbmc><br />
root@talos:~# obmcutil state<br />
CurrentBMCState : xyz.openbmc_project.State.BMC.BMCState.Ready<br />
CurrentPowerState : xyz.openbmc_project.State.Chassis.PowerState.Off<br />
CurrentHostState : xyz.openbmc_project.State.Host.HostState.Off<br />
</pre><br />
The CurrentHostState must be Off before continuing with the procedure.<br />
If the CurrentHostState is not Off, please turn off the machine:<br />
<pre>obmcutil chassisoff</pre><br />
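<br />
The state check can also be scripted. The following is a minimal sketch, assuming the <code>obmcutil state</code> output has the format shown above; the function name <code>host_is_off</code> is hypothetical:<br />

```shell
# Hypothetical helper: reads `obmcutil state` output on stdin and
# succeeds only if CurrentHostState is reported as Off.
host_is_off() {
    grep -Eq '^CurrentHostState[[:space:]]*:[[:space:]]*xyz\.openbmc_project\.State\.Host\.HostState\.Off[[:space:]]*$'
}
```

For example, <code>ssh root@<talos-openbmc> obmcutil state | host_is_off && echo "safe to flash"</code> prints the message only when the host is off.<br />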
<br />
Once off, perform the update:<br />
<pre>pflash -E -p /tmp/talos.pnor</pre><br />
<br />
Start the machine:<br />
<pre>obmcutil poweron</pre><br />
<br />
Note: the machine may reboot multiple times after the initial flash.<br />
<br />
=== Troubleshooting ===<br />
==== SBE_MASTER_VERSION_DOWNLEVEL ====<br />
If you see the following message reported in the console, then the SBE update process did not work as expected:<br />
<pre> 16.74709|Error reported by sbe (0x2200) PLID 0x90000008<br />
16.74823| SBE Image Version Miscompare with Master Target<br />
16.74824| ModuleId 0x0d SBE_MASTER_VERSION_COMPARE<br />
16.74825| ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL<br />
16.74826| UserData1 Master Target HUID : 0x0000000000050000<br />
16.74826| UserData2 Master Target Loop Index : 0x0000000000000000</pre><br />
<br />
The machine needs to be reset to finish the update procedure using the following:<br />
<pre>obmcutil chassisoff<br />
systemctl stop xyz.openbmc_project.State.Host.service<br />
systemctl start xyz.openbmc_project.State.Host.service<br />
obmcutil poweron</pre><br />
The update should now complete as expected.<br />
<br />
A bug report is open[https://github.com/open-power/sbe/issues/7] to track this issue.<br />
<br />
==== internal compiler error: Killed ====<br />
Building the hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:<br />
<pre>powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)</pre><br />
To continue you have a few options:<br />
* Reduce the number of parallel jobs being run by appending -j<num> to your build command line<br />
<pre>op-build -j4</pre><br />
* Increase the swap space<br />
* Install additional RAM<br />
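<br />
As a rough way to pick a job count for the first option, the sketch below derives one from available memory. The figure of about 2 GiB per compiler job is an assumption for illustration, not a measured requirement of hostboot, and <code>jobs_for_memory</code> is a hypothetical helper name:<br />

```shell
# Hypothetical helper: suggest a -j value.
# Usage: jobs_for_memory <available memory in KiB> <CPU count> [KiB per job]
jobs_for_memory() {
    mem_kib=$1
    cpus=$2
    per_job_kib=${3:-2097152}   # assumption: roughly 2 GiB per compiler job
    jobs=$(( mem_kib / per_job_kib ))
    # Always allow at least one job, and never more than one per CPU.
    if [ "$jobs" -lt 1 ]; then
        jobs=1
    fi
    if [ "$jobs" -gt "$cpus" ]; then
        jobs=$cpus
    fi
    echo "$jobs"
}
```

On Linux, the inputs can be taken from the MemAvailable line of /proc/meminfo and from <code>nproc</code>, and the result passed to op-build as the -j value. If the build is still killed, lower the number further or raise the per-job estimate.<br />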
<br />
== Building the OpenBMC firmware ==<br />
=== Grabbing the sources ===<br />
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.<br />
To download the source code and check out the tag:<br />
<pre>git clone https://git.raptorcs.com/git/talos-openbmc<br />
cd talos-openbmc<br />
git checkout raptor-v1.02</pre><br />
<br />
=== Building the firmware ===<br />
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.<br />
<br />
Once the packages are installed, the firmware can be built using the following commands:<br />
<pre>cd talos-openbmc<br />
export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf<br />
. openbmc-env<br />
bitbake obmc-phosphor-image<br />
</pre><br />
<br />
The resulting firmware can be found in the tmp/deploy/images/talos/ directory.<br />
<br />
=== Updating the firmware ===<br />
Once the firmware has been built, the resulting kernel and rofs binaries need to be copied to the /run/initramfs/ directory on the BMC:<br />
<pre><br />
scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@<talos-openbmc>:/run/initramfs/<br />
</pre><br />
<br />
Once the images have been transferred, reboot the BMC:<br />
<pre>ssh root@<talos-openbmc> reboot</pre><br />
<br />
OpenBMC may take a while to reboot. Once complete, you will be able to log back in via ssh.<br />
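<br />
Rather than retrying the login by hand, a small retry loop can wait for the BMC to come back. This is a generic sketch; <code>retry_until</code> is a hypothetical helper name, and the attempt count and delay are arbitrary choices:<br />

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry_until <attempts> <delay in seconds> <command...>
retry_until() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        n=$(( n + 1 ))
        # Sleep between attempts only, not after the final failure.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, <code>retry_until 60 10 ssh -o ConnectTimeout=5 root@<talos-openbmc> true</code> polls every 10 seconds for up to 60 attempts and returns as soon as an SSH login succeeds.<br />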
<br />
=== BMC Recovery procedure via U-Boot ===<br />
WIP -- these instructions are still being tested, use at your own risk!<br />
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure. --><br />
<br />
In the event of a failure updating the BMC, but with a functioning u-boot, you can still recover by using u-boot to manually bootstrap the BMC over the network or serial line.<br />
<br />
* Prepare a TFTP server, and place image-bmc in the root. (TODO: elaborate on how to set this up)<br />
<br />
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set it to 115200 8N1 with RTS/CTS (hardware flow control) disabled.<br />
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.<br />
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing x.x.x.x with the IP address of your TFTP server. This will load a virtual image of the BMC contents into RAM.<br />
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.<br />
* After the BMC has booted, attempt the normal firmware update procedure again.<br />
<br />
* (TODO: Figure out a safe way to wipe rwfs using the shipped u-boot, in the case that the reason for boot failure was rwfs corruption. <!-- Going by the environment variables, an earlier u-boot build must have had the ubi tools, but they appear to be missing in the production builds. Not sure what happened here. - Bdragon -->)<br />
* (TODO: Come up with a procedure for making a minimal recovery image that gets us into a better environment and is quickly loadable via serial? secondary u-boot image with more tools enabled?)<br />
<br />
* (TODO: Discussion of using Kermit to upload the image without network access)<br />
* (TODO: Discussion of u-boot memory map)<br />
<br />
<br />
=== Troubleshooting ===<br />
TODO</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Updating_Firmware&diff=950Updating Firmware2018-05-05T00:29:37Z<p>Bdragon: Mention the necessary patch to use flashrom to flash the FPGA.</p>
<hr />
<div>= Firmware Upgrade Quick-Start =<br />
<br />
Your Raptor Computing Systems POWER9 mainboard contains three primary firmware components -- a system control FPGA bitstream, the BMC software stack, and the host PNOR. The BMC and host PNOR are easily upgradeable over a network connection.<br />
<br />
In contrast, the FPGA is rarely changed. Should an FPGA upgrade be desired, a direct SPI programming connection to a Flashrom compatible system is required via the provided FPGA programming header.<br />
<br />
A list of current firmware versions for each supported product is available on the pages shown below:<br />
<br />
* [[Talos_II/Firmware|Talos II]]<br />
<br />
While we strongly encourage you to compile your own firmware components from the provided source, we also provide prebuilt firmware images for download. Please note that, in general, the only way to guarantee that your or your organization's security requirements are met is to download and audit the provided source code, then compile from that audited codebase. This is not unique to our systems; the nature of all software is that the binary form may be very difficult to analyze for undesired or unintended operation.<br />
<br />
== BMC ==<br />
'''NOTE:''' The BMC should never need to be fully reprogrammed. Erasing the entire BMC Flash device will also erase U-Boot and its associated environment variables, requiring that they be reloaded from information printed on the mainboard itself. In particular, from the factory, the IPMI MAC address is stored in both the U-Boot loader via the default environment variable string, and the currently active U-Boot environment variables. This IPMI MAC address may also be found on the mainboard below CPU0 should you need to reprogram it for any reason.<br />
<br />
The preferred method of BMC update is to take the BMC update files that you have either compiled or obtained from official Raptor Computing Systems sources, and to upload them to the BMC. Once uploaded, the BMC is able to self-update.<br />
<br />
Upgrade by transferring the following two files to /run/initramfs/ on the BMC:<br />
<br />
* image-kernel<br />
* image-rofs<br />
<br />
After transfer, reboot the BMC via the 'reboot' command over SSH or the local BMC serial console.<br />
<br />
Default BMC login information is contained in the [[:File:T2P9D01 users guide version 1 0.pdf|User's Guide]]. scp or any similar utility is capable of transferring the upgrade files.<br />
<br />
== Host PNOR ==<br />
The host PNOR device, which contains hostboot, skiboot, and other host-level firmware components required to [[IPL|IPL]] your POWER9 system, is able to be modified in its entirety via the BMC.<br />
<br />
With chassis power off, but standby power on, transfer the upgrade .pnor file to the BMC's /tmp/ directory. Once the transfer is complete, log in to the BMC via SSH.<br />
<br />
Execute the following command:<br />
pflash -E -p /tmp/<your PNOR file><br />
<br />
If no errors occur, the upgrade is complete. You may now power on and use your updated system.<br />
<br />
== FPGA ==<br />
Referencing the schematics provided on the included DVD, carefully connect your SPI programmer to the FPGA programming header. Apply standby power to your system but do not turn it on.<br />
<br />
Using a current version of [https://www.flashrom.org/Flashrom Flashrom] with the [[:File:Atmel_enablement.diff|Atmel enablement patch]] applied, flash the new FPGA firmware binary to the on-board storage device. Complete the upgrade by removing all power.<br />
<br />
Wait for all LEDs on your system to extinguish, then reapply power. The upgrade is now complete and you may use your system normally.</div>Bdragonhttps://wiki.raptorcs.com/w/index.php?title=Power_ISA/Privilege_States&diff=918Power ISA/Privilege States2018-04-25T23:38:26Z<p>Bdragon: stewart mentioned that a paper had been published explaining a lot more about the Ultravisor State.</p>
<hr />
<div>== States ==<br />
<br />
=== Ultravisor State ===<br />
<br />
At the moment very little information exists about the Ultravisor State. It is not mentioned in Power ISA version 2.07 documents at all, and version 3.0B only mentions it as a possible privilege of instructions. There is no official documentation of a ''UV'' - ''Ultravisor State'' [[Power ISA/Machine State Register|Machine State Register]] bit, although some source code does reference its existence.<ref>https://patchwork.ozlabs.org/patch/719952/</ref><br />
<br />
Skiboot documentation mentions this as one of "the four rings".<ref>[https://open-power.github.io/skiboot/doc/xive.html#i-device-tree-updates P9 XIVE Exploitation > I - Device-tree updates] "reg property contains the addresses & sizes for the register ranges corresponding respectively to the 4 rings: Ultravisor level, Hypervisor level, Guest OS level, User level"</ref><br />
<br />
A report from IBM for the Air Force Research Laboratory indicates that the Ultravisor State was tested in a modified [[POWER8|POWER8]] processor simulation.<ref>[[File:AFRL-RI-RS-TR-2017-021.pdf|HARDWARE SUPPORT FOR MALWARE DEFENSE AND END-TO-END TRUST]]. IBM. 2017-02</ref><br />
<br />
==== POWER9 ====<br />
<br />
IBM has confirmed to Raptor in direct messaging that the ultravisor state does not exist in POWER9, despite some material continuing to reference it. This information was also made public on Twitter.<ref>Lynn, Justin. [https://twitter.com/justinrwlynn/status/956772078702571520 tweet]</ref><br />
<br />
On March 22, 2018, a paper<ref>Hunt, Guerney D. H.; Boivie, Richard (Rick) H.; Palmer, Elaine Rivette; Pendarakis, Dimitrios. [https://www.ibm.com/developerworks/library/l-support-protected-computing/ Supporting protected computing on IBM Power Architecture]. IBM developerWorks. 2018-03-22</ref> was published on IBM developerWorks, explaining the reasoning for and the future use of the Ultravisor State.<br />
<br />
=== Hypervisor State ===<br />
<br />
Hypervisor State is indicated by the [[Machine State Register#Bit_3_-_Hypervisor_State_.28HV.29|HV (bit 3)]] of the Machine State Register, and is normally used by a hypervisor. An operating system running without a hypervisor can run in Hypervisor State, with its userland in Problem State and avoid using Privileged State altogether.<br />
<br />
Hypervisor State was introduced in POWER4, although for some time it was not included in documentation, appearing only as a ''reserved'' bit in the Machine State Register.<ref>Kerr, Jeremy. [https://www.youtube.com/watch?v=DigNr08GVss OpenPOWER: building an open-source software stack from bare metal] (video). Linux.conf.au 2015</ref><br />
<br />
=== Privileged State ===<br />
<br />
Privileged State, also called Supervisor Mode, is normally used by an operating system running on top of a hypervisor.<br />
<br />
=== Problem State ===<br />
<br />
Problem State, also called User Mode, is indicated by the [[Machine State Register#Bit_49_-_Problem_State_.28PR.29|PR (bit 49)]] of the Machine State Register.<br />
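<br />
Power ISA numbers register bits from the most significant end, so in the 64-bit Machine State Register bit 0 is the highest bit and bit 63 the lowest. The sketch below decodes the HV and PR bits from a hexadecimal MSR value under that convention. It is an illustration only: the function name <code>msr_state</code> is hypothetical, input is assumed to be at most 16 hex digits, and the Ultravisor State is not handled.<br />

```shell
# Hypothetical helper: classify an MSR value given as hex (with or without 0x).
msr_state() {
    hex=${1#0x}
    # Left-pad to 16 hex digits so digit N covers bits 4*(N-1) .. 4*(N-1)+3.
    hex=$(printf '%16s' "$hex" | tr ' ' '0')
    hv_nibble=$(printf '%s' "$hex" | cut -c1)    # bits 0-3
    pr_nibble=$(printf '%s' "$hex" | cut -c13)   # bits 48-51
    hv=$(( 0x$hv_nibble & 1 ))          # bit 3 is the lowest bit of its digit
    pr=$(( (0x$pr_nibble >> 2) & 1 ))   # bit 49 is the second-highest bit
    if [ "$pr" -eq 1 ]; then
        echo problem        # PR set: Problem State, whatever HV is
    elif [ "$hv" -eq 1 ]; then
        echo hypervisor     # HV set, PR clear
    else
        echo privileged     # both clear
    fi
}
```

Working on hex digits rather than shell arithmetic on the whole value avoids overflow when bit 0 of the register is set.<br />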
<br />
== Instruction Classification ==<br />
<br />
{| class="wikitable sortable"<br />
|+Privilege Classification of Instructions in Power ISA<br />
!Code<br />
!2.07<br />
!3.0B<br />
!Description<br />
|-<br />
|P<br />
|style="background:#9F9;vertical-align:middle;text-align:center;"|Yes<br />
|style="background:#9F9;vertical-align:middle;text-align:center;"|Yes<br />
|a privileged instruction.<br />
|-<br />
|O<br />
|style="background:#9F9;vertical-align:middle;text-align:center;"|Yes<br />
|style="background:#9F9;vertical-align:middle;text-align:center;"|Yes<br />
|an instruction that is treated as privileged or nonprivileged (or hypervisor, for mtspr), depending on the SPR or PMR number.<br />
|-<br />
|PI<br />
|style="background:#F99;vertical-align:middle;text-align:center;"|No<br />
|style="background:#9F9;vertical-align:middle;text-align:center;"|Yes<br />
|an instruction that is illegal in privileged state.<br />
|-<br />
|H<br />
|style="background:#9F9;vertical-align:middle;text-align:center;"|Yes<br />
|style="background:#9F9;vertical-align:middle;text-align:center;"|Yes<br />
|an instruction that can be executed only in hypervisor state<br />
|-<br />
|PH<br />
|style="background:#9F9;vertical-align:middle;text-align:center;"|Yes<br />
|style="background:#F99;vertical-align:middle;text-align:center;"|No<br />
|a hypervisor privileged instruction if Category Embedded.Hypervisor is implemented; otherwise denotes a privileged instruction.<br />
|-<br />
|M<br />
|style="background:#9F9;vertical-align:middle;text-align:center;"|Yes<br />
|style="background:#F99;vertical-align:middle;text-align:center;"|No<br />
|an instruction that is treated as privileged or nonprivileged, depending on the value of the UCLE bit in the [[Machine State Register|MSR]]<br />
|-<br />
|U<br />
|style="background:#F99;vertical-align:middle;text-align:center;"|No<br />
|style="background:#9F9;vertical-align:middle;text-align:center;"|Yes<br />
|an instruction that can be executed only in ultravisor state<br />
|}<br />
<br />
== References ==<br />
<br />
<references/><br />
<br />
[[Category:Power Architecture]]</div>Bdragon