Difference between revisions of "Compiling Firmware"

From RCS Wiki
(→‎BMC Recovery procedure via U-Boot: add some quick notes from memory and experiments to the todo section.)
(Updated links to use gitlab.raptorengineering.com instead of scm.raptorcs.com)
 
(29 intermediate revisions by 13 users not shown)
Line 1: Line 1:
The following steps can be used to compile and update the firmware on [[Talos_II|Talos™ II]]-based solutions. It's maintained by both [[Raptor Computing Systems|Raptor CS]] and community members.
+
==Purpose==
 +
The following steps can be used to compile and update the firmware on [[Raptor Computing Systems|Raptor CS]]'s [[OpenPOWER]] systems, such as the [[Talos II]] or [[Blackbird]].
  
== Requirements ==
+
==Applicability==
 +
These specific instructions are for the [[Talos II]], though the process for compiling firmware for other [[OpenPOWER]] systems like [[Blackbird]] should be very similar.
 +
 
 +
==Requirements==
 
* At least 25GB of free hard drive space
 
* At least 25GB of free hard drive space
 
* 16GB of free RAM
 
* 16GB of free RAM
  
=== Operating System ===
+
===Building on Debian===
The build system (op-build) has been primarily tested using Debian stretch. If you are on a different operating system such as Fedora 28, a Debian chroot should be used:
+
The build system (op-build) has been primarily tested using Debian Stretch. Ensure you install the following packages:
<pre>
+
 
sudo yum install debootstrap dpkg
+
# Packages needed for OpenPOWER Firmware builds
sudo debootstrap stretch debian-chroot http://httpredir.debian.org/debian
+
$ sudo apt install cscope ctags libz-dev libexpat-dev python texinfo build-essential g++ git bison flex unzip libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc wget bc rsync
sudo mount -t proc none debian-chroot/proc/
+
sudo mount -o bind /sys/ debian-chroot/sys/
+
# Packages needed for OpenBMC builds
sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/
+
$ sudo apt install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat
</pre>
+
 
 +
===Building on other Linux distributions===
 +
If you are on a different distribution, such as Fedora 28, a Debian chroot is recommended:
 +
$ sudo yum install debootstrap dpkg
 +
$ sudo debootstrap stretch debian-chroot http://archive.debian.org/debian
 +
$ sudo mount -t proc none debian-chroot/proc/
 +
$ sudo mount -o bind /sys/ debian-chroot/sys/
 +
$ sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/
  
 
Enter the chroot and install the needed packages:
 
Enter the chroot and install the needed packages:
<pre>
+
$ sudo chroot debian-chroot/
sudo chroot debian-chroot/
+
# apt install software-properties-common locales
apt-get install software-properties-common locales
+
# Packages needed for PNOR builds
+
# Packages needed for OpenPOWER Firmware builds
apt-get install cscope ctags libz-dev libexpat-dev \
+
$ sudo apt install cscope ctags libz-dev libexpat-dev python texinfo build-essential g++ git bison flex unzip libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc wget bc rsync
          python texinfo \
+
          build-essential g++ git bison flex unzip \
+
# Packages needed for OpenBMC builds
          libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc \
+
$ sudo apt install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat
          wget bc rsync
 
# Packages needed for OpenBMC builds
 
apt-get install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat
 
</pre>
 
  
Create a chroot user:
+
Also create a user inside the chroot to build under:
<pre>
+
$ useradd -m build-user -s /bin/bash
useradd -m build-user -s /bin/bash
+
$ su build-user
su build-user
+
$ cd
cd
 
</pre>
 
  
 
You can now use the chroot to build the firmware.
 
You can now use the chroot to build the firmware.
  
To enter the chroot in the future, you can run the following from a regular terminal:
+
To enter the chroot in the future, you can run the following from any terminal:
<pre>
+
sudo chroot debian-chroot/
sudo chroot debian-chroot/
+
su build-user
su build-user
+
cd
cd
 
</pre>
 
  
== Building the PNOR Firmware ==
+
==Building the OpenPOWER Firmware==
=== Grabbing the sources ===
+
===Downloading the sources===
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.
+
[[Raptor Computing Systems|Raptor CS]] maintains a public Git repository containing the complete source code for the firmware.
To download the source code:
+
To download the source code for Talos systems:
<pre>git clone --recursive https://scm.raptorcs.com/scm/git/talos-op-build</pre>
+
git clone -b raptor-v2.10 --recursive https://gitlab.raptorengineering.com/openpower-firmware/machine-talos-ii/op-build.git
 +
 
 +
To download the source code for Blackbird systems:
 +
git clone -b raptor-v2.10 --recursive https://gitlab.raptorengineering.com/openpower-firmware/machine-blackbird/op-build.git
 +
 
 +
'''Note:''' The <tt>master</tt> branch is often in a non-functional state. The latest firmware branch (either <tt>raptor-v2.10</tt>, <tt>raptor-v2.00</tt> or <tt>raptor-v1.05</tt> at the time of this update) should be used.
 +
 
 +
===Building the firmware===
 +
Before building the firmware, check the <tt>README.md</tt> file to ensure that all needed packages are installed.
 +
 
 +
The firmware can then be built using the following commands:
 +
$ cd talos-op-build
 +
$ git submodule update
 +
$ . op-build-env
 +
$ export KERNEL_BITS=64 # needed when building on ppc64, or libopenssl will try to build in 32 bit mode
 +
$ op-build talos_defconfig
 +
$ op-build
 +
 
 +
You can pass <tt>-j&lt;num-cores&gt;</tt> to perform a parallel build (<tt>op-build</tt> invokes <tt>make</tt>), though this may result in very high memory usage.
  
=== Building the firmware ===
+
If the build completes successfully, the final firmware image is at <tt>output/images/talos.pnor</tt>.
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.
 
  
Once the packages are installed, the firmware can be build using the following commands:
+
===Rebuilding an individual package===
<pre>cd talos-op-build
+
To rebuild an individual package (such as Hostboot) and recreate the <tt>talos.pnor</tt> image, run:
. op-build-env
+
$ op-build <em>pkgname</em>-rebuild openpower-pnor-rebuild
op-build talos_defconfig
+
where <tt><em>pkgname</em></tt> is the name of the package to rebuild.
op-build</pre>
 
  
To rebuild an individual package (such as hostboot) and recreate the pnor image, the following can be run:
+
For example:
<pre>
+
$ op-build hostboot-rebuild openpower-pnor-rebuild
op-build hostboot-rebuild openpower-pnor-rebuild
 
</pre>
 
  
=== Updating the firmware ===
+
Note when recompiling hostboot into a PNOR image with openpower-pnor-rebuild, it is usually recommended to force a machine XML rebuild as well:
Copy the firmware to the BMC
+
<nowiki>$ rm -rf output/build/machine-xml-*
<pre>scp ./output/images/talos.pnor root@<talos-openbmc>:/tmp/</pre>
+
$ rm -rf output/build/hostboot-*
 +
$ ./op-build openpower-pnor-rebuild</nowiki>
  
 +
==Installing the OpenPOWER firmware==
 +
===Transfer image to BMC===
 +
Copy the firmware to the BMC:
 +
$ scp ./output/images/talos.pnor root@$TALOS_BMC_ADDR:/tmp/
  
 +
===Establish BMC sessions===
 
At this point, you should connect two SSH sessions to OpenBMC.
 
At this point, you should connect two SSH sessions to OpenBMC.
In the first session, run the following to display the console during bootup:
+
In the first session, run the following to display the console during boot:
<pre>ssh -p 2200 root@<talos-openbmc></pre>
+
$ ssh -p 2200 root@$TALOS_BMC_ADDR
 +
 
 
The console log will be useful in debugging any issues with the firmware that could occur.
 
The console log will be useful in debugging any issues with the firmware that could occur.
  
In the second BMC session, ensure the system is off by running obmcutil. You should see the following:
+
In the second session, get a shell on the BMC via SSH:
<pre>ssh root@<talos-openbmc>
+
$ ssh root@$TALOS_BMC_ADDR
root@talos:~# obmcutil state
+
root@talos:~#
CurrentBMCState    : xyz.openbmc_project.State.BMC.BMCState.Ready
+
 
CurrentPowerState  : xyz.openbmc_project.State.Chassis.PowerState.Off
+
'''Ensure the system is off''' before proceeding:
CurrentHostState    : xyz.openbmc_project.State.Host.HostState.Off
+
root@talos:~# obmcutil state
</pre>
+
CurrentBMCState    : xyz.openbmc_project.State.BMC.BMCState.Ready
 +
CurrentPowerState  : xyz.openbmc_project.State.Chassis.PowerState.Off
 +
CurrentHostState    : xyz.openbmc_project.State.Host.HostState.Off
 +
 
 
The CurrentHostState must be Off before continuing with the procedure.
 
The CurrentHostState must be Off before continuing with the procedure.
 
If the CurrentHostState is not Off, please turn off the machine:
 
If the CurrentHostState is not Off, please turn off the machine:
<pre>obmcutil chassisoff</pre>
+
root@talos:~# obmcutil chassisoff
  
Once off, perform the update:
+
===Running the firmware temporarily===
<pre>pflash -E -p /tmp/talos.pnor</pre>
+
You can test the firmware without installing it, though this requires using the v2.00+ BMC firmware which contains [https://gerrit.openbmc-project.xyz/plugins/gitiles/openbmc/hiomapd/+/a042978b03c91ca3a716e99f313ef5cda42820ba an updated <tt>mboxd</tt> with support for file-backed pnor images].
 +
 
 +
First, stop <tt>mboxd</tt>:
 +
root@talos:~# systemctl stop mboxd
 +
 
 +
Restart <tt>mboxd</tt> with the additional <tt>-b</tt> argument:
 +
root@talos:~# mboxd -f 64M -w 1M -b file:/tmp/talos.pnor
 +
 
 +
You can now test the new firmware image by starting the machine:
 +
root@talos:~# obmcutil poweron
 +
 
 +
When you have finished testing the image, stop the machine:
 +
root@talos:~# obmcutil poweroff
 +
 
 +
'''Note:''' Ensure the machine is off before proceeding. Verify this by running <tt>obmcutil state</tt>.
 +
 
 +
Finally, terminate <tt>mboxd</tt> and restart the normal <tt>mboxd</tt>:
 +
root@talos:~# systemctl start mboxd
 +
 
 +
===Flashing the firmware===
 +
<span style="color:#FF0000">Warning: Some PNOR firmware updates may require a BMC update to function. Before flashing ensure that your installed BMC firmware is capable of booting the image. It is also possible to downgrade using these instructions in the event of a non-functioning firmware image.</span>
 +
 
 +
Ensure the system is off.
 +
 
 +
Perform the update:
 +
root@talos:~# pflash -E -p /tmp/talos.pnor
  
 
Start the machine:
 
Start the machine:
<pre>obmcutil poweron</pre>
+
root@talos:~# obmcutil poweron
  
Note: the machine may reboot multiple times after the initial flash.
+
'''Note:''' The machine may reboot multiple times when first booted after a firmware update. This is normal; do not interrupt the process.
  
=== Troubleshooting ===
+
==Troubleshooting the OpenPOWER Firmware==
==== SBE_MASTER_VERSION_DOWNLEVEL ====
+
 
If you see the following message reported in the console, then the SBE update process did not work as expected:
+
===General advice===
<pre> 16.74709|Error reported by sbe (0x2200) PLID 0x90000008
+
 
 +
;Always upgrade PNOR and BMC together
 +
:Many mismatched PNOR/BMC version combinations lead to weird failures.
 +
 
 +
;Try downgrading the PNOR+BMC firmware
 +
:Firmware package 1.04 seems the most reliable at updating the [[Self-Boot_Engine|SBE SEEPROM]] inside the POWER9 chip package.
 +
 
 +
;Always use processor socket 0 for SBE updates
 +
:The BMC firmware and/or FSI driver seem to either forget to update the SBE SEEPROM in the second CPU socket, leading to a boot with only CPU 0 active.  When you get a brand new chip you need to install it in CPU socket 0 leaving socket 1 empty, wait for the double-reboot to update the SEEPROM, and then you can move that chip to socket 1 if you like.
 +
 
 +
;Try unplugging the HSF fan power during [[Self-Boot_Engine|SBE]] update
 +
:Not kidding about this.  The BMC is insanely complicated &mdash; it's got an entire operating system in there for some reason.  It even has systemd.  The BMC's systemd often gets into a funky loop restarting <tt>hwmon</tt> over and over and over, interrupting the SBE SEEPROM reflash every time it does this.  Unplugging the PROC0 HSF 4-pin connector gets it to fail hard (due to inability to read the tachometer) and stay failed so the SBE update can proceed.  Ugly as this is, it's easier than trying to figure out what systemd thinks it's doing.
 +
 
 +
===SBE_MASTER_VERSION_DOWNLEVEL===
 +
If you see the following message reported in the console, then the [[Self-Boot_Engine|SBE]] update process did not work as expected:
 +
16.74709|Error reported by sbe (0x2200) PLID 0x90000008
 
  16.74823|  SBE Image Version Miscompare with Master Target
 
  16.74823|  SBE Image Version Miscompare with Master Target
 
  16.74824|  ModuleId  0x0d SBE_MASTER_VERSION_COMPARE
 
  16.74824|  ModuleId  0x0d SBE_MASTER_VERSION_COMPARE
 
  16.74825|  ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL
 
  16.74825|  ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL
 
  16.74826|  UserData1  Master Target HUID : 0x0000000000050000
 
  16.74826|  UserData1  Master Target HUID : 0x0000000000050000
  16.74826|  UserData2  Master Target Loop Index : 0x0000000000000000</pre>
+
  16.74826|  UserData2  Master Target Loop Index : 0x0000000000000000
 +
 
 +
The machine needs to be reset to finish the update procedure:
 +
root@talos:~# obmcutil chassisoff
 +
root@talos:~# systemctl stop xyz.openbmc_project.State.Host.service
 +
root@talos:~# systemctl start xyz.openbmc_project.State.Host.service
 +
root@talos:~# obmcutil poweron
  
The machine needs to be reset to finish the update proceedure using the following:
 
<pre>obmcutil chassisoff
 
systemctl stop xyz.openbmc_project.State.Host.service
 
systemctl start xyz.openbmc_project.State.Host.service
 
obmcutil poweron</pre>
 
 
The update should now complete as expected.
 
The update should now complete as expected.
  
A bug report is open[https://github.com/open-power/sbe/issues/7] to track this issue.
+
A [https://github.com/open-power/sbe/issues/7 bug report] is open to track this issue.
  
==== internal compiler error: Killed ====
+
===internal compiler error: Killed===
Building the hostboot source code requires a large amount of ram. If your machine runs out, you may see an error similar ot the following:
+
Building the Hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:
<pre>powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)</pre>
+
powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)
 
To continue you have a few options:
 
To continue you have a few options:
 
 * Reduce the number of parallel jobs being run by appending -j<num> to your build command line
 
 * Reduce the number of parallel jobs being run by appending -j<num> to your build command line
<pre>op-build -j4</pre>
+
op-build -j4
* Increase the swap space
+
* Increase the swap space (not recommended)
 
* Install additional RAM
 
* Install additional RAM
 +
 +
===Bugs===
 +
Firmware issues should be reported preferably upstream.
 +
If they are specific to [[Raptor Computing Systems|Raptor CS]] products, please report them [https://bugs.raptorengineering.com/ on their bug tracker].
  
 
== Building the OpenBMC firmware ==
 
== Building the OpenBMC firmware ==
=== Grabbing the sources ===
+
=== Downloading the sources ===
[[Raptor Computing Systems|Raptor CS]] maintains a public git repository containing the complete source code for the firmware.
+
[[Raptor Computing Systems|Raptor CS]] maintains a public Git repository containing the complete source code for the firmware.
To download the source code and check out the tag:
+
To download the source code:
 
+
$ git clone -b raptor-v{{CURRENT_BMC_VERSION}} https://git.raptorcs.com/git/talos-openbmc
  git clone https://git.raptorcs.com/git/talos-openbmc
 
  cd talos-openbmc
 
  git checkout raptor-v{{CURRENT_BMC_VERSION}}
 
  
 
=== Building the firmware ===
 
=== Building the firmware ===
Before building the firmware, all needed support packages must be installed. Please see the README.md file for directions on installing the needed packages.
+
Ensure that all needed support packages are installed. See the <tt>README.md</tt> for information on needed packages.
 
 
Once the packages are installed, the firmware can be build using the following commands:
 
<pre>cd talos-openbmc
 
export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf
 
. openbmc-env
 
bitbake obmc-phosphor-image
 
</pre>
 
 
 
The resulting firmware can be found in the tmp/deploy/images/talos/ directory.
 
 
 
=== Updating the firmware ===
 
Once firmware has been built, the resulting kernel and rofs binaries need to be copied over to the /run/initramfs/
 
<pre>
 
scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@<talos-openbmc>:/run/initramfs/
 
</pre>
 
  
Once the images have been transferred, reboot the BMC:
+
The firmware can then be built using the following commands:
<pre>root@<talos-openbmc> reboot</pre>
+
$ cd talos-openbmc
 +
$ export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf
 +
$ . openbmc-env
 +
$ bitbake obmc-phosphor-image
  
OpenBMC may take a while to reboot. Once complete, you will be able to log back in via ssh.
+
The resulting firmware image can then be found in the <tt>tmp/deploy/images/talos/</tt> directory.
  
=== BMC Recovery procedure via U-Boot ===
+
'''Note:''' If <tt>mboxd</tt> fails to build, you may need to [https://github.com/openbmc/openbmc/issues/2780 patch <tt>mboxd.bb</tt>].<br />
While these instructions have been successfully applied in practice, they are still preliminary. Ask questions in IRC if you are unclear on what to do!
+
'''Note:''' If building newer versions of the firmware, TEMPLATECONF has changed to TEMPLATECONF=meta-rcs/meta-talos/conf. This should be set before running <code>. open-env</code>. If not, do a git clean and start over with the new TEMPLATECONF.
<!-- Hi fellow wiki people! Ask Bdragon in IRC if you have questions about this procedure.  
 
    IRC user dragon_pilot was successfully able to recover a nonworking BMC from u-boot, these instructions are the result of that experiment.
 
    Further testing and refinement would be appreciated, preferably by someone who has easy access to an external flasher.
 
-->
 
  
In the event of a failure updating the BMC, but with a functioning u-boot, you can still recover by using U-Boot to manually bootstrap the BMC by manually loading a boot image over the network or BMC serial line.
+
===Installing the firmware===
 +
<span style="color:#FF0000">Warning: If you are attempting to upgrade a Talos system from firmware 1.06 or earlier to the 04-16-2019 beta branch, you must follow the directions at [[Talos_II/Firmware/Public_Beta]]. Failure to do so may result in a non-booting BMC.</span>
  
If your BMC flash is corrupted to the extent that U-Boot is not loading properly, you '''WILL''' need to remove and flash the BMC flash chip externally.
+
Once firmware has been built, the resulting <tt>image-kernel</tt> and <tt>image-rofs</tt> binaries must be copied to <tt>/run/initramfs/</tt> on the BMC:
 +
$ scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@$TALOS_BMC_ADDR:/run/initramfs/
  
* Prepare a TFTP server, and place <code>image-bmc</code>, <code>image-rofs</code>, and <code>image-kernel</code> in the root. (TODO: elaborate on how to set this up)
+
Once the images have been transferred, reboot the BMC. The new firmware files will be detected and automatically applied.
 +
root@talos:~# reboot
  
* Connect a serial console to the [[Talos_II/Building_FAQ#BMC_serial_port_J7701|BMC serial port]] (J7701, serial port bracket required) and set to 115200 8n1, disable RTS/CTS (hardware flow control).
+
The reboot may take some time. Once complete, you will be able to log back in via SSH.
* Disconnect and reconnect power to the machine to force a BMC restart. Press a key to interrupt auto-boot when prompted.
 
* Run <code>dhcp x.x.x.x:image-bmc</code>, replacing the IP address of your TFTP server. This will load a copy of the stock boot image into RAM.
 
* Run <code>bootm 83080000</code>. This will prepare and boot off of the loaded virtual image.
 
* If your rofs partition is not functional, you will be dropped into the systemd emergency shell at this point. Try both the password you set as well as the default <code>0penBmc</code>, it may be one or the other depending on the state of the rwfs partition. If it boots up properly instead of dropping you into the emergency shell, the problem is probably in your kernel partition and you can retry flashing your <code>image-kernel</code> using the normal procedure. (The rest of these instructions are for the systemd emergency shell.)
 
* <code>mount -t tmpfs none /tmp</code>
 
* run <code>udhcpc</code> to get an IP address. (TODO: verify that this is the actual command that you run. Do you have to specify the network interface too?)
 
* <code>cd /tmp</code>
 
* <code>tftp -g -r image-rofs x.x.x.x</code>
 
* <code>tftp -g -r image-kernel x.x.x.x</code>
 
* IMPORTANT: Use <code>md5sum</code>, <code>sha1sum</code>, or <code>sha256sum</code> to verify successful transfer of image-rofs and image-kernel! tftp is a very barebones protocol and relies on transport layer checksumming, which is optional and not always available in UDP!
 
* Verify that the output of <code>cat /sys/class/mtd/mtd3/name</code> is <code>kernel</code> and the output of <code>cat /sys/class/mtd/mtd4/name</code> is <code>rofs</code>. We will be flashing mtd partitions directly in the next step and this is the last chance to verify that they will be flashed to the correct partition.
 
* <code>flashcp -v image-kernel /dev/mtd3</code>
 
* <code>flashcp -v image-rofs /dev/mtd4</code>
 
* (TODO: Describe how to reset rwfs in case it was damaged as well?) note: the kernel param for bypassing rwfs is "overlay-filesystem-in-ram". Append it to the existing boot-args before running the bootm command. This can also be used as part of a password reset procedure.
 
 * After the flash is complete, restart the BMC; it should boot successfully.
 
  
* (TODO: Discussion of using Kermit to upload the image without network access) note: I (Bdragon) have successfully done a ram-only boot using cu's built in xmodem support (escape sequence ~X) to do an image transfer into RAM over the BMC serial interface.
+
===Recovering from failed firmware updates===
* (TODO: Discuss using u-boot's built in cmp tool to perform basic validation of the u-boot image against a second copy loaded into RAM.)
+
See [[Debricking the BMC]].
* (TODO: Write a u-boot standalone application to disable the AST watchdog, and write instructions for loading and executing it from the u-boot shell (the "go" command), to work around the cold-boot watchdog issue.)
 
* (TODO: Load recovery images over USB?) note: The onboard USB port is connected to the USB switch after all, so this might be problematic.
 
 * (TODO: Discussion of u-boot memory map) Short version is: flash lives at 0x20000000 and default base address for the memory loading tools is 0x83000000. So add 0x63000000 to any flash address to get the equivalent address for an image-bmc file loaded into RAM. For example, the bootable image of a loaded image-bmc is at 0x83080000.
 
  
=== Troubleshooting ===
+
[[Category:Guides]]
TODO
 

Latest revision as of 07:50, 9 April 2024

Purpose

The following steps can be used to compile and update the firmware on Raptor CS's OpenPOWER systems, such as the Talos II or Blackbird.

Applicability

These specific instructions are for the Talos II, though the process for compiling firmware for other OpenPOWER systems like Blackbird should be very similar.

Requirements

  • At least 25GB of free hard drive space
  • 16GB of free RAM
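
The requirements above can be checked before starting a long build. The following is a minimal sketch, not part of the official tooling: the function names are hypothetical, and it assumes a Linux host where /proc/meminfo reports MemAvailable.

```shell
#!/bin/sh
# Pre-flight check for a build host (a sketch; helper names are hypothetical).
REQUIRED_DISK_GB=25   # from the requirements list
REQUIRED_RAM_GB=16    # from the requirements list

# Succeed if the first number is at least the second.
at_least() {
    [ "$1" -ge "$2" ]
}

# Whole GB free on the filesystem that holds directory $1.
free_disk_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Whole GB the kernel reports as available (reads /proc/meminfo, Linux only).
available_ram_gb() {
    awk '/^MemAvailable:/ { print int($2 / 1024 / 1024) }' /proc/meminfo
}

# Print a warning for each requirement the host does not meet.
check_build_host() {
    disk=$(free_disk_gb "${1:-.}")
    ram=$(available_ram_gb)
    at_least "${disk:-0}" "$REQUIRED_DISK_GB" ||
        echo "warning: ${disk:-0} GB of disk free, ${REQUIRED_DISK_GB} GB recommended"
    at_least "${ram:-0}" "$REQUIRED_RAM_GB" ||
        echo "warning: ${ram:-0} GB of RAM available, ${REQUIRED_RAM_GB} GB recommended"
}
```

Running check_build_host with the intended build directory as its argument prints nothing when both requirements are met. Values are rounded down to whole GB, so a host sitting exactly at a limit may still produce a warning.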

Building on Debian

The build system (op-build) has been primarily tested using Debian Stretch. Ensure you install the following packages:

# Packages needed for OpenPOWER Firmware builds
$ sudo apt install cscope ctags libz-dev libexpat-dev python texinfo build-essential g++ git bison flex unzip libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc wget bc rsync

# Packages needed for OpenBMC builds
$ sudo apt install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat

Building on other Linux distributions

If you are on a different distribution, such as Fedora 28, a Debian chroot is recommended:

$ sudo yum install debootstrap dpkg
$ sudo debootstrap stretch debian-chroot http://archive.debian.org/debian
$ sudo mount -t proc none debian-chroot/proc/
$ sudo mount -o bind /sys/ debian-chroot/sys/
$ sudo mount -o bind /dev/shm/ debian-chroot/dev/shm/

Enter the chroot and install the needed packages:

$ sudo chroot debian-chroot/
# apt install software-properties-common locales

# Packages needed for OpenPOWER Firmware builds
# apt install cscope ctags libz-dev libexpat-dev python texinfo build-essential g++ git bison flex unzip libssl-dev libxml-simple-perl libxml-sax-perl libxml2-dev libxml2-utils xsltproc wget bc rsync

# Packages needed for OpenBMC builds
# apt install git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat

Also create a user inside the chroot to build under:

# useradd -m build-user -s /bin/bash
# su build-user
$ cd

You can now use the chroot to build the firmware.

To enter the chroot in the future, you can run the following from any terminal:

sudo chroot debian-chroot/
su build-user
cd

Building the OpenPOWER Firmware

Downloading the sources

Raptor CS maintains a public Git repository containing the complete source code for the firmware. To download the source code for Talos systems:

git clone -b raptor-v2.10 --recursive https://gitlab.raptorengineering.com/openpower-firmware/machine-talos-ii/op-build.git

To download the source code for Blackbird systems:

git clone -b raptor-v2.10 --recursive https://gitlab.raptorengineering.com/openpower-firmware/machine-blackbird/op-build.git

Note: The master branch is often in a non-functional state. The latest firmware branch (either raptor-v2.10, raptor-v2.00 or raptor-v1.05 at the time of this update) should be used.
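
Since the recommended branch changes over time, it may help to list what the repository currently offers instead of relying on the names in the note above. This sketch assumes release branches keep the raptor-vX.YY naming pattern and that sort supports -V (version sort), as GNU coreutils does; the function names are hypothetical.

```shell
#!/bin/sh
# Pick the highest raptor-vX.YY branch from a list of branch names on stdin.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
latest_raptor_branch() {
    grep '^raptor-v[0-9]' | sort -V | tail -n 1
}

# List the branch names of a remote repository, one per line.
remote_branches() {
    git ls-remote --heads "$1" | sed 's|.*refs/heads/||'
}

# Example (needs network access):
#   remote_branches https://gitlab.raptorengineering.com/openpower-firmware/machine-talos-ii/op-build.git \
#       | latest_raptor_branch
```

The highest version number is not guaranteed to be the branch Raptor CS recommends, so treat the result as a starting point and check it against the release notes.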

Building the firmware

Before building the firmware, check the README.md file to ensure that all needed packages are installed.

The firmware can then be built using the following commands:

$ cd op-build
$ git submodule update
$ . op-build-env
$ export KERNEL_BITS=64 # needed when building on ppc64; otherwise libopenssl will try to build in 32-bit mode
$ op-build talos_defconfig
$ op-build

You can pass -j<num-cores> to perform a parallel build (op-build invokes make), though this may result in very high memory usage.

If the build completes successfully, the final firmware image is at output/images/talos.pnor.

Rebuilding an individual package

To rebuild an individual package (such as Hostboot) and recreate the talos.pnor image, run:

$ op-build pkgname-rebuild openpower-pnor-rebuild

where pkgname is the name of the package to rebuild.

For example:

$ op-build hostboot-rebuild openpower-pnor-rebuild

Note: When recompiling Hostboot into a PNOR image with openpower-pnor-rebuild, it is usually recommended to force a machine XML rebuild as well:

$ rm -rf output/build/machine-xml-*
$ rm -rf output/build/hostboot-*
$ op-build openpower-pnor-rebuild
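
The clean-rebuild steps above can be wrapped in a small helper so they are always run together. This is only a sketch: the function name and the OP_BUILD override variable are hypothetical, and it assumes it is run from the top of the source tree after op-build-env has been sourced.

```shell
#!/bin/sh
# Clean-rebuild helper (hypothetical name). Run from the top of the source
# tree after sourcing op-build-env.
# OP_BUILD lets the command be overridden, for example with `echo` for a dry run.
OP_BUILD=${OP_BUILD:-op-build}

rebuild_hostboot_clean() {
    # Discard the unpacked build directories for the machine XML and Hostboot.
    rm -rf output/build/machine-xml-* output/build/hostboot-*
    # Regenerate the PNOR image, as in the commands above.
    $OP_BUILD openpower-pnor-rebuild
}
```

Setting OP_BUILD=echo before calling rebuild_hostboot_clean prints the build target instead of starting a build. The rm step still runs in that case, so this is not a fully side-effect-free dry run.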

Installing the OpenPOWER firmware

Transfer image to BMC

Copy the firmware to the BMC:

$ scp ./output/images/talos.pnor root@$TALOS_BMC_ADDR:/tmp/
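
Before flashing, it may be worth confirming that the image arrived intact. The sketch below compares SHA-256 digests; the function names are hypothetical, and it assumes sha256sum is available both on the build host and on the BMC.

```shell
#!/bin/sh
# Print only the SHA-256 digest of the file named by $1.
sha256_of() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Succeed if the two digests given as arguments are non-empty and identical.
digests_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example (assumes sha256sum exists on the BMC):
#   local_sum=$(sha256_of output/images/talos.pnor)
#   remote_sum=$(ssh root@$TALOS_BMC_ADDR sha256sum /tmp/talos.pnor | cut -d ' ' -f 1)
#   digests_match "$local_sum" "$remote_sum" || echo "digest mismatch: copy the image again"
```

The empty-string check in digests_match prevents a failed ssh command, which yields an empty digest, from being mistaken for a match.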

Establish BMC sessions

At this point, you should connect two SSH sessions to OpenBMC. In the first session, run the following to display the console during boot:

$ ssh -p 2200 root@$TALOS_BMC_ADDR

The console log is useful for debugging any firmware issues that occur.

In the second session, get a shell on the BMC via SSH:

$ ssh root@$TALOS_BMC_ADDR
root@talos:~#

Ensure the system is off before proceeding:

root@talos:~# obmcutil state
CurrentBMCState     : xyz.openbmc_project.State.BMC.BMCState.Ready
CurrentPowerState   : xyz.openbmc_project.State.Chassis.PowerState.Off
CurrentHostState    : xyz.openbmc_project.State.Host.HostState.Off

The CurrentHostState must be Off before continuing with the procedure. If the CurrentHostState is not Off, please turn off the machine:

root@talos:~# obmcutil chassisoff
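
If the update is scripted, the host state check can be automated by parsing the output of obmcutil state. This is a minimal sketch with a hypothetical function name; it assumes the output format shown above, where the CurrentHostState line ends in ".Off" when the host is off.

```shell
#!/bin/sh
# Read `obmcutil state` output on stdin; succeed only if the
# CurrentHostState line ends in ".Off".
host_is_off() {
    awk '$1 == "CurrentHostState" { state = $NF }
         END { exit !(state ~ /\.Off$/) }'
}

# Example, run from the build host:
#   ssh root@$TALOS_BMC_ADDR obmcutil state | host_is_off && echo "host is off"
```

The function fails when the CurrentHostState line is missing altogether, so an unreachable BMC is never reported as "off".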

Running the firmware temporarily

You can test the firmware without installing it, though this requires BMC firmware v2.00 or later, which contains an updated mboxd with support for file-backed PNOR images.

First, stop mboxd:

root@talos:~# systemctl stop mboxd

Restart mboxd with the additional -b argument:

root@talos:~# mboxd -f 64M -w 1M -b file:/tmp/talos.pnor

You can now test the new firmware image by starting the machine:

root@talos:~# obmcutil poweron

When you have finished testing the image, stop the machine:

root@talos:~# obmcutil poweroff

Note: Ensure the machine is off before proceeding. Verify this by running obmcutil state.

Finally, terminate the manually started mboxd and restart the normal mboxd service:

root@talos:~# systemctl start mboxd

Flashing the firmware

Warning: Some PNOR firmware updates may require a BMC update to function. Before flashing, ensure that your installed BMC firmware is capable of booting the image. If a firmware image turns out to be non-functional, these instructions can also be used to downgrade.

Ensure the system is off.

Perform the update:

root@talos:~# pflash -E -p /tmp/talos.pnor

Start the machine:

root@talos:~# obmcutil poweron

Note: The machine may reboot multiple times when first booted after a firmware update. This is normal; do not interrupt the process.

Troubleshooting the OpenPOWER Firmware

General advice

Always upgrade PNOR and BMC together
Many mismatched PNOR/BMC version combinations lead to weird failures.
Try downgrading the PNOR+BMC firmware
Firmware package 1.04 seems the most reliable at updating the SBE SEEPROM inside the POWER9 chip package.
Always use processor socket 0 for SBE updates
The BMC firmware and/or FSI driver seems to forget to update the SBE SEEPROM in the second CPU socket, leading to a boot with only CPU 0 active. When you get a brand-new chip, install it in CPU socket 0 and leave socket 1 empty, wait for the double reboot to update the SEEPROM, and then move the chip to socket 1 if you like.
Try unplugging the HSF fan power during SBE update
Not kidding about this. The BMC is insanely complicated — it's got an entire operating system in there for some reason. It even has systemd. The BMC's systemd often gets into a funky loop restarting hwmon over and over and over, interrupting the SBE SEEPROM reflash every time it does this. Unplugging the PROC0 HSF 4-pin connector gets it to fail hard (due to inability to read the tachometer) and stay failed so the SBE update can proceed. Ugly as this is, it's easier than trying to figure out what systemd thinks it's doing.

SBE_MASTER_VERSION_DOWNLEVEL

If you see the following message reported in the console, then the SBE update process did not work as expected:

16.74709|Error reported by sbe (0x2200) PLID 0x90000008
16.74823|  SBE Image Version Miscompare with Master Target
16.74824|  ModuleId   0x0d SBE_MASTER_VERSION_COMPARE
16.74825|  ReasonCode 0x2215 SBE_MASTER_VERSION_DOWNLEVEL
16.74826|  UserData1  Master Target HUID : 0x0000000000050000
16.74826|  UserData2  Master Target Loop Index : 0x0000000000000000

The machine needs to be reset to finish the update procedure:

root@talos:~# obmcutil chassisoff
root@talos:~# systemctl stop xyz.openbmc_project.State.Host.service
root@talos:~# systemctl start xyz.openbmc_project.State.Host.service
root@talos:~# obmcutil poweron

The update should now complete as expected.

A bug report is open to track this issue.
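
When console output is being captured to a file, the error above can be detected automatically. The sketch below assumes the log was saved to a file such as console.log (a hypothetical name) and matches only the reason code line shown in the example output; the function name is also hypothetical.

```shell
#!/bin/sh
# Succeed if the console log on stdin contains the SBE downlevel reason code.
sbe_update_incomplete() {
    grep -q 'ReasonCode *0x2215 *SBE_MASTER_VERSION_DOWNLEVEL'
}

# Example:
#   sbe_update_incomplete < console.log && echo "SBE update incomplete: reset the host"
```

The related SBE_MASTER_VERSION_COMPARE line is deliberately not matched, since it names the module rather than the failure.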

internal compiler error: Killed

Building the Hostboot source code requires a large amount of RAM. If your machine runs out, you may see an error similar to the following:

powerpc64le-buildroot-linux-gnu-g++.br_real: internal compiler error: Killed (program cc1plus)

To continue you have a few options:

  • Reduce the number of parallel jobs being run by appending -j<num> to your build command line
op-build -j4
  • Increase the swap space (not recommended)
  • Install additional RAM
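
One way to choose the job count is to cap it by available memory as well as by core count. The sketch below is illustrative only: the function name is hypothetical, and the memory budget per job is an assumption you supply, not a measured figure for Hostboot.

```shell
#!/bin/sh
# Print a job count: the smaller of the core count ($1) and the number of
# jobs that fit in RAM ($2 GB in total at $3 GB per job), but never less than 1.
pick_jobs() {
    n=$1
    by_ram=$(( $2 / $3 ))
    if [ "$by_ram" -lt "$n" ]; then
        n=$by_ram
    fi
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Example: 16 GB of RAM and an assumed budget of 2 GB per compiler job.
#   op-build -j"$(pick_jobs "$(nproc)" 16 2)"
```

If the compiler is still killed, raise the per-job budget so that fewer jobs run at once.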

Bugs

Firmware issues should preferably be reported upstream. If they are specific to Raptor CS products, please report them on the Raptor CS bug tracker.

Building the OpenBMC firmware

Downloading the sources

Raptor CS maintains a public Git repository containing the complete source code for the firmware. To download the source code:

$ git clone -b raptor-v1.07 https://git.raptorcs.com/git/talos-openbmc

Building the firmware

Ensure that all needed support packages are installed. See the README.md for information on needed packages.

The firmware can then be built using the following commands:

$ cd talos-openbmc
$ export TEMPLATECONF=meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf
$ . openbmc-env
$ bitbake obmc-phosphor-image

The resulting firmware image can then be found in the tmp/deploy/images/talos/ directory.

Note: If mboxd fails to build, you may need to patch mboxd.bb.
Note: If building newer versions of the firmware, TEMPLATECONF has changed to TEMPLATECONF=meta-rcs/meta-talos/conf. Set it before running . openbmc-env. If you have already run that step with the old value, do a git clean and start over with the new TEMPLATECONF.
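
Because the correct TEMPLATECONF depends on the version of the source tree, it can be picked by checking which of the two directories named above actually exists. This is a sketch with a hypothetical function name; it assumes no layouts other than those two.

```shell
#!/bin/sh
# Print the TEMPLATECONF path that exists in the source tree at $1,
# preferring the newer layout. Fails if neither directory is present.
select_templateconf() {
    for conf in \
        meta-rcs/meta-talos/conf \
        meta-openbmc-machines/meta-openpower/meta-rcs/meta-talos/conf
    do
        if [ -d "$1/$conf" ]; then
            echo "$conf"
            return 0
        fi
    done
    return 1
}

# Example, run from inside the source tree before sourcing openbmc-env:
#   export TEMPLATECONF=$(select_templateconf .)
```

A non-zero exit status means neither layout was found, which usually indicates that the function was pointed at the wrong directory.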

Installing the firmware

Warning: If you are attempting to upgrade a Talos system from firmware 1.06 or earlier to the 04-16-2019 beta branch, you must follow the directions at Talos_II/Firmware/Public_Beta. Failure to do so may result in a non-booting BMC.

Once firmware has been built, the resulting image-kernel and image-rofs binaries must be copied to /run/initramfs/ on the BMC:

$ scp tmp/deploy/images/talos/image-rofs tmp/deploy/images/talos/image-kernel root@$TALOS_BMC_ADDR:/run/initramfs/

Once the images have been transferred, reboot the BMC. The new firmware files will be detected and automatically applied.

root@talos:~# reboot

The reboot may take some time. Once complete, you will be able to log back in via SSH.
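
Rather than retrying the SSH login by hand, a small retry loop can wait for the BMC to come back. The sketch below uses a hypothetical helper name; the attempt count and delay in the example are arbitrary, not measured reboot times.

```shell
#!/bin/sh
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Succeeds as soon as the command succeeds; fails if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: try for up to 10 minutes (60 attempts, 10 seconds apart).
#   retry 60 10 ssh -o BatchMode=yes -o ConnectTimeout=5 root@$TALOS_BMC_ADDR true
```

BatchMode=yes stops ssh from prompting for a password, so the example only succeeds when key-based login is configured. Without it, drop that option and expect a prompt once the BMC is reachable.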

Recovering from failed firmware updates

See Debricking the BMC.