Difference between revisions of "Troubleshooting/BMC Power"

From RCS Wiki
(Document register map)
On the BMC:
+
Mainboard power sequencing is controlled by an iCE40 FPGA, which is itself controlled by the BMC. This page explains how to query the FPGA for status information when low-level power faults occur.
  
  i2cget -y 12 0x31 0x03  # <-- resets power sequencing error flags
+
=Starting points=
  i2cget -y 12 0x31 0x07  # ATX Power State
+
When troubleshooting power sequencing, start by reading the key registers from the BMC shell:
    # 0x20  --
 
    # 0x22  -- ATX PSU Start Request Sent
 
    # 0x23  -- ATX PSU Signalling Power Good
 
    # 0x2C  -- Timeout, previous or current operational fault detected
 
    # 0x2E  -- General Fault Detected in Power Sequencing
 
    # 0xyy
 
        ||
 
        |-------------- System status flags
 
        --------------- System configuration flags
 
  
   i2cget -y 12 0x31 0x18  # Rail fault indicator low bits
+
   $ i2cget -y 12 0x31 0x07  # Read ATX Power State
    # 0x01   -- ATX PSU power good not asserted in timeframe required
+
  $ i2cget -y 12 0x31 0x18  # Read Power Sequencer Fail Status 1
 +
   $ i2cget -y 12 0x31 0x19  # Read Power Sequencer Fail Status 2
  
   i2cget -y 12 0x31 0x19   # Rail fault indicator high bits
+
* If either of the Power Sequencer Fail Status registers reads as nonzero, a power sequencing error or voltage regulator failure has occurred.
 +
 
 +
* The last hexadecimal digit (low nibble) of the ATX Power State register decodes as follows:
 +
{| class="wikitable sortable"
 +
! Value
 +
! Meaning
 +
|-
 +
| 0x?0
 +
| System off
 +
|-
 +
| 0x?2
 +
| ATX power requested but not yet provided
 +
|-
 +
| 0x?3
 +
| ATX power good
 +
|-
 +
| 0x?C
 +
| Timeout, previous or current operational fault detected
 +
|-
 +
| 0x?E
 +
| General fault detected in power sequencing
 +
|}
 +
 
 +
* You can clear power sequencer error flags (enabling you to try powering on again) by running:
 +
 
 +
   $ i2cget -y 12 0x31 0x03   # Clear errors
 +
 
 +
=FPGA Register Map=
 +
 
 +
The FPGA on the mainboard is used for power sequencing. It exposes a register map to the BMC over I2C (bus 12, address 0x31).
 +
(Bit 0 below refers to the least significant bit.) The information below was derived from the [https://git.raptorcs.com/git/talos-system-fpga/tree/main.v Verilog source code for the FPGA].
 +
 
 +
To read the FPGA registers, use this command at the BMC shell:
 +
 
 +
  $ i2cget -y 12 0x31 0xREGNO
 +
 
 +
{| class="wikitable sortable"
 +
! Address
 +
! Type
 +
! Description
 +
|-
 +
| 0x00
 +
| RO
 +
| '''FPGA Bitstream Version'''.
 +
|-
 +
| 0x01
 +
| &mdash;
 +
| (not used)
 +
|-
 +
| 0x02
 +
| &mdash;
 +
| (not used)
 +
|-
 +
| 0x03
 +
| R2C
 +
| '''Clear Error'''. Reading this register clears power sequencer errors. The value read is meaningless.
 +
|-
 +
| 0x04
 +
| &mdash;
 +
| (not used)
 +
|-
 +
| 0x05
 +
| RO
 +
| '''Power Good Status 1'''. This register reports the 'power good' signals of various mainboard voltage regulators.
 +
{| class="wikitable"
 +
! Bit
 +
! Description
 +
|-
 +
| 0
 +
|  '''ATX Power Good.''' The ATX power supply is indicating that primary voltage rails are now usable.
 +
|-
 +
| 1
 +
| '''Miscellaneous I/O Power Good.'''
 +
|-
 +
| 2
 +
| '''CPU1 Vdn Power Good.'''
 +
|-
 +
| 3
 +
| '''CPU2 Vdn Power Good.'''
 +
|-
 +
| 4
 +
| '''AVdd Power Good.'''
 +
|-
 +
| 5
 +
| '''CPU1 Vio Power Good.'''
 +
|-
 +
| 6
 +
| '''CPU2 Vio Power Good.'''
 +
|-
 +
| 7
 +
| '''CPU1 Vdd Power Good.'''
 +
|}
 +
|-
 +
| 0x06
 +
| RO
 +
| '''Power Good Status 2'''.
 +
{| class="wikitable"
 +
! Bit
 +
! Description
 +
|-
 +
| 0
 +
| '''CPU2 Vdd Power Good.'''
 +
|-
 +
| 1
 +
| '''CPU1 Vcs Power Good.'''
 +
|-
 +
| 2
 +
| '''CPU2 Vcs Power Good.'''
 +
|-
 +
| 3
 +
| '''CPU1 Vpp Power Good.'''
 +
|-
 +
| 4
 +
| '''CPU2 Vpp Power Good.'''
 +
|-
 +
| 5
 +
| '''CPU1 Vddr/Vtt Power Good.'''
 +
|-
 +
| 6
 +
| '''CPU2 Vddr/Vtt Power Good.'''
 +
|-
 +
| 7
 +
| &mdash;
 +
|}
 +
 
 +
|-
 +
| 0x07
 +
| RO
 +
| '''ATX Power State.''' Bits are as follows:
 +
{| class="wikitable"
 +
! Bit
 +
! Description
 +
|-
 +
| 0
 +
|  '''ATX Power Good.''' The ATX power supply is indicating that primary voltage rails are now usable.
 +
|-
 +
| 1
 +
| '''ATX Power Requested.''' The mainboard is requesting that the ATX power supply bring primary voltage rails online.
 +
|-
 +
| 2
 +
| '''Error Found.''' Either a wait error or operation error occurred.
 +
|-
 +
| 3
 +
| '''Operation Error.'''
 +
|-
 +
| 4
 +
| '''Wait Error.''' A timeout occurred in a power sequencing step during power-on. This can occur, for example, if the ATX power supply never asserts power good.
 +
|-
 +
| 5
 +
| '''CPU2 Present.''' A CPU module is installed in the second CPU socket.
 +
|-
 +
| 6
 +
| '''AST VGA Disabled.''' The disable BMC VGA jumper is set.
 +
|-
 +
| 7
 +
| '''Mode Set.''' ??? Set from BMC.
 +
|}
 +
|-
 +
| 0x08
 +
| RO
 +
| '''Power Enable Status 1'''. Indicates which power rails are currently requested to be on. The bits correspond to those of the 'Power Good Status 1' register.
 +
|-
 +
| 0x09
 +
| RO
 +
| '''Power Enable Status 2'''. Indicates which power rails are currently requested to be on. The bits correspond to those of the 'Power Good Status 2' register.
 +
|-
 +
| 0x0A
 +
| &mdash;
 +
| (not used)
 +
|-
 +
| 0x0B
 +
| &mdash;
 +
| (not used)
 +
|-
 +
| 0x0C
 +
| RO
 +
| '''Vendor ID Byte 1'''.
 +
|-
 +
| 0x0D
 +
| RO
 +
| '''Vendor ID Byte 2'''.
 +
|-
 +
| 0x0E
 +
| RO
 +
| '''Vendor ID Byte 3'''.
 +
|-
 +
| 0x0F
 +
| RO
 +
| '''Vendor ID Byte 4'''.
 +
|-
 +
| 0x10
 +
| RW
 +
| '''LED Override'''.
 +
|-
 +
| 0x18
 +
| RO
 +
| '''Power Sequencer Fail Status 1'''. Indicates which power rails have failed. The bits correspond to those of the 'Power Good Status 1' register. If this value is nonzero, at least one power rail failure has occurred. (The value is the XOR of the 'Power Enable Status 1' register and the 'Power Good Status 1' register.)
 +
|-
 +
| 0x19
 +
| RO
 +
| '''Power Sequencer Fail Status 2'''. Indicates which power rails have failed. The bits correspond to those of the 'Power Good Status 2' register. If this value is nonzero, at least one power rail failure has occurred. (The value is the XOR of the 'Power Enable Status 2' register and the 'Power Good Status 2' register.)
 +
|-
 +
| 0x33
 +
| RW
 +
| '''System Override'''.
 +
{| class="wikitable"
 +
! Bit
 +
! Description
 +
|-
 +
| 0
 +
|  '''ATX Force Enable.'''
 +
|-
 +
| 1
 +
| '''MFR Force Enable CPU2 Voltage Regulators.''' Causes the CPU2 voltage regulators to be enabled even if the CPU2 socket is detected as empty.
 +
|-
 +
| 2
 +
| '''MFR Force CPU2 Present.''' Causes power sequencing to fail if the CPU2 voltage regulators do not come online, even if the CPU2 socket is detected as empty.
 +
|}
 +
|}

Revision as of 15:05, 17 August 2018

Mainboard power sequencing is controlled by an iCE40 FPGA, which is itself controlled by the BMC. This page explains how to query the FPGA for status information when low-level power faults occur.

Starting points

When troubleshooting power sequencing, start by reading the key registers from the BMC shell:

 $ i2cget -y 12 0x31 0x07   # Read ATX Power State
 $ i2cget -y 12 0x31 0x18   # Read Power Sequencer Fail Status 1
 $ i2cget -y 12 0x31 0x19   # Read Power Sequencer Fail Status 2
  • If either of the Power Sequencer Fail Status registers reads as nonzero, a power sequencing error or voltage regulator failure has occurred.
  • The last hexadecimal digit (low nibble) of the ATX Power State register decodes as follows:
      Value  Meaning
      0x?0   System off
      0x?2   ATX power requested but not yet provided
      0x?3   ATX power good
      0x?C   Timeout, previous or current operational fault detected
      0x?E   General fault detected in power sequencing
  • You can clear power sequencer error flags (enabling you to try powering on again) by running:
 $ i2cget -y 12 0x31 0x03   # Clear errors
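
The low-nibble table above can be turned into a small lookup for use at the BMC shell. This is a minimal sketch, not part of any shipped tooling: `decode_atx_state` is a hypothetical helper name, and the sketch assumes a shell whose arithmetic expansion accepts `0x` hex literals, as POSIX-style shells generally do.

```shell
# Hypothetical helper: map the last hex digit of ATX Power State
# (register 0x07) to the meanings listed in the table above.
decode_atx_state() {
    nibble=$(( $1 & 0x0F ))      # keep only the low nibble
    case "$nibble" in
        0)  echo "System off" ;;
        2)  echo "ATX power requested but not yet provided" ;;
        3)  echo "ATX power good" ;;
        12) echo "Timeout, previous or current operational fault detected" ;;
        14) echo "General fault detected in power sequencing" ;;
        *)  printf 'Value 0x%X is not listed in the table\n' "$nibble" ;;
    esac
}
```

On the BMC it could be called as `decode_atx_state "$(i2cget -y 12 0x31 0x07)"`. The upper nibble carries other flags (see the ATX Power State bit table), so the helper masks it off rather than matching the whole byte.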

FPGA Register Map

The FPGA on the mainboard is used for power sequencing. It exposes a register map to the BMC over I2C (bus 12, address 0x31). (Bit 0 below refers to the least significant bit.) The information below was derived from the Verilog source code for the FPGA.

To read the FPGA registers, use this command at the BMC shell:

 $ i2cget -y 12 0x31 0xREGNO
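
When walking the whole map in a loop, note that the register table on this page marks 0x03 as read-to-clear, so reading it discards the error flags you may be trying to inspect. The sketch below skips it. `dump_fpga_regs` is a hypothetical helper name, the address list is copied from the table on this page, and the sketch assumes that plain reads of the remaining registers have no side effects, which has not been verified on hardware here.

```shell
# Hypothetical helper: print every documented FPGA register except 0x03,
# which is read-to-clear and would wipe the power sequencer error flags.
dump_fpga_regs() {
    for reg in 0x00 0x05 0x06 0x07 0x08 0x09 \
               0x0C 0x0D 0x0E 0x0F 0x10 0x18 0x19 0x33; do
        # Bus 12 and device address 0x31 are the values given on this page.
        printf '%s: %s\n' "$reg" "$(i2cget -y 12 0x31 "$reg")"
    done
}
```

Running `dump_fpga_regs` before clearing errors gives a snapshot that can be compared against a second run after the next power-on attempt.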
Address  Type  Description
0x00     RO    FPGA Bitstream Version.
0x01     -     (not used)
0x02     -     (not used)
0x03     R2C   Clear Error. Reading this register clears power sequencer errors. The value read is meaningless.
0x04     -     (not used)
0x05     RO    Power Good Status 1. This register reports the 'power good' signals of various mainboard voltage regulators.
               Bit  Description
               0    ATX Power Good. The ATX power supply is indicating that primary voltage rails are now usable.
               1    Miscellaneous I/O Power Good.
               2    CPU1 Vdn Power Good.
               3    CPU2 Vdn Power Good.
               4    AVdd Power Good.
               5    CPU1 Vio Power Good.
               6    CPU2 Vio Power Good.
               7    CPU1 Vdd Power Good.
0x06     RO    Power Good Status 2.
               Bit  Description
               0    CPU2 Vdd Power Good.
               1    CPU1 Vcs Power Good.
               2    CPU2 Vcs Power Good.
               3    CPU1 Vpp Power Good.
               4    CPU2 Vpp Power Good.
               5    CPU1 Vddr/Vtt Power Good.
               6    CPU2 Vddr/Vtt Power Good.
               7    -
0x07     RO    ATX Power State. Bits are as follows:
               Bit  Description
               0    ATX Power Good. The ATX power supply is indicating that primary voltage rails are now usable.
               1    ATX Power Requested. The mainboard is requesting that the ATX power supply bring primary voltage rails online.
               2    Error Found. Either a wait error or operation error occurred.
               3    Operation Error.
               4    Wait Error. A timeout occurred in a power sequencing step during power-on. This can occur, for example, if the ATX power supply never asserts power good.
               5    CPU2 Present. A CPU module is installed in the second CPU socket.
               6    AST VGA Disabled. The disable BMC VGA jumper is set.
               7    Mode Set. ??? Set from BMC.
0x08     RO    Power Enable Status 1. Indicates which power rails are currently requested to be on. The bits correspond to those of the 'Power Good Status 1' register.
0x09     RO    Power Enable Status 2. Indicates which power rails are currently requested to be on. The bits correspond to those of the 'Power Good Status 2' register.
0x0A     -     (not used)
0x0B     -     (not used)
0x0C     RO    Vendor ID Byte 1.
0x0D     RO    Vendor ID Byte 2.
0x0E     RO    Vendor ID Byte 3.
0x0F     RO    Vendor ID Byte 4.
0x10     RW    LED Override.
0x18     RO    Power Sequencer Fail Status 1. Indicates which power rails have failed. The bits correspond to those of the 'Power Good Status 1' register. If this value is nonzero, at least one power rail failure has occurred. (The value is the XOR of the 'Power Enable Status 1' register and the 'Power Good Status 1' register.)
0x19     RO    Power Sequencer Fail Status 2. Indicates which power rails have failed. The bits correspond to those of the 'Power Good Status 2' register. If this value is nonzero, at least one power rail failure has occurred. (The value is the XOR of the 'Power Enable Status 2' register and the 'Power Good Status 2' register.)
0x33     RW    System Override.
               Bit  Description
               0    ATX Force Enable.
               1    MFR Force Enable CPU2 Voltage Regulators. Causes the CPU2 voltage regulators to be enabled even if the CPU2 socket is detected as empty.
               2    MFR Force CPU2 Present. Causes power sequencing to fail if the CPU2 voltage regulators do not come online, even if the CPU2 socket is detected as empty.
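
Since the Fail Status registers share their bit layout with the Power Good Status registers, a nonzero reading can be translated into rail names. The following is a minimal sketch under that assumption. `RAILS1`, `RAILS2`, `list_set_rails` and `rail_mismatch` are hypothetical names, and the rail labels are shortened from the bit tables above.

```shell
# Rail names in bit order (bit 0 first), taken from the Power Good Status
# 1 and 2 bit tables. Names are '|'-separated; bit 7 of register 2 is unused.
RAILS1="ATX|Miscellaneous I/O|CPU1 Vdn|CPU2 Vdn|AVdd|CPU1 Vio|CPU2 Vio|CPU1 Vdd"
RAILS2="CPU2 Vdd|CPU1 Vcs|CPU2 Vcs|CPU1 Vpp|CPU2 Vpp|CPU1 Vddr/Vtt|CPU2 Vddr/Vtt|(unused)"

# Hypothetical helper: print the name of each rail whose bit is set.
# $1 = register value (e.g. 0x41), $2 = '|'-separated names for bits 0..7.
# The body runs in a subshell so the IFS and globbing changes do not leak.
list_set_rails() (
    value=$(( $1 ))
    bit=0
    set -f          # no pathname expansion while splitting the name list
    IFS='|'
    for name in $2; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            echo "$name"
        fi
        bit=$(( bit + 1 ))
    done
)

# Hypothetical helper: XOR an enable value with a power-good value, which
# is how the table describes the Fail Status registers being derived.
rail_mismatch() {
    printf '0x%02X\n' $(( $1 ^ $2 ))
}
```

For example, `list_set_rails "$(i2cget -y 12 0x31 0x18)" "$RAILS1"` would list the failed rails from Fail Status 1, and the same call with register 0x19 and `"$RAILS2"` covers Fail Status 2. `rail_mismatch` can be fed the values read from 0x08 and 0x05 to cross-check what 0x18 reports.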