Firmware Dump

Amit Cohen edited this page Aug 20, 2020 · 3 revisions
Table of Contents
  1. Dumping Firmware Using mstflint
    1. Dumping Firmware State
  2. Automatic Firmware Dumping
  3. Further Resources

Dumping Firmware Using mstflint

Firmware dumps are taken with the tools in the mstflint package, so install it first. On Fedora it can be installed using dnf:

# dnf install mstflint

Dumping Firmware State

Firmware state is dumped with the mstregdump tool. The dump file is used by Mellanox Support for hardware troubleshooting. A single dump is hard to interpret in isolation, so run the tool three times to produce three consecutive register dumps:

# for i in {1..3}; do mstregdump 01:00.0 > mstregdump$i; done

Where 01:00.0 is the PCI address of the device.
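
If the PCI address is not known, it can be looked up in sysfs. The following is a minimal sketch, assuming the device reports the Mellanox PCI vendor ID 0x15b3; the function name and its optional directory argument are hypothetical and exist only for illustration and dry runs.

```shell
#!/bin/sh
# Sketch: print the PCI address of every device whose vendor ID is 0x15b3
# (Mellanox). The optional argument is a hypothetical override of the sysfs
# directory, useful for dry runs; it defaults to the real location.
list_mlx_devices() (
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        # Skip entries without a readable vendor attribute.
        [ -r "$dev/vendor" ] || continue
        if [ "$(cat "$dev/vendor")" = "0x15b3" ]; then
            basename "$dev"
        fi
    done
)
```

Calling list_mlx_devices with no arguments prints one address per line. sysfs names include the PCI domain prefix (for example 0000:01:00.0), whereas the command above uses the short form 01:00.0.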

The dump files can then be packed and compressed, and the archive handed over to the party that requested it.

# tar cvJf mstregdump.tar.xz mstregdump[123]
mstregdump1
mstregdump2
mstregdump3
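
The one-line loop above keeps going even if a dump fails, which can leave empty files in the archive. A slightly more defensive variant might look like the sketch below. The function name and the MSTREGDUMP override are hypothetical, introduced here only for illustration and dry runs; the output file names match the ones used above.

```shell
#!/bin/sh
# Sketch: take COUNT consecutive register dumps of DEVICE and stop at the
# first failure, so that empty or partial files are not archived by mistake.
# MSTREGDUMP is a hypothetical override of the tool name, for dry runs only.
take_dumps() (
    device="$1"
    count="${2:-3}"
    tool="${MSTREGDUMP:-mstregdump}"
    i=1
    while [ "$i" -le "$count" ]; do
        # Write dump number i to mstregdump<i> in the current directory.
        if ! "$tool" "$device" > "mstregdump$i"; then
            echo "dump $i of $device failed" >&2
            rm -f "mstregdump$i"
            exit 1
        fi
        i=$((i + 1))
    done
)
```

As root, take_dumps 01:00.0 3 && tar cvJf mstregdump.tar.xz mstregdump[123] then creates the archive only if all three dumps succeeded.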

Automatic Firmware Dumping

The previous section described how a firmware dump can be triggered manually from user space. This section describes how a firmware dump can be taken automatically in response to firmware fatal events. These events are supported in kernel 5.10 and later.

Firmware fatal events are reported to user space via netlink notifications as part of the devlink health mechanism. The fw_dump.py script listens to these netlink notifications and triggers multiple firmware dumps using the previously mentioned mstregdump utility.
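
To illustrate the idea, the same flow can be approximated in shell using devlink monitor from iproute2. This is a rough sketch and not the actual fw_dump.py. The function names are hypothetical, the loop reacts to every health notification rather than only firmware fatal ones, and it assumes that mstregdump accepts the address with its PCI domain prefix.

```shell
#!/bin/sh
# Sketch of the idea only, not the actual fw_dump.py.

# Extract a "pci/<address>" handle from a line of text and print the address
# part. No particular notification layout is assumed beyond that handle.
pci_addr_from_line() {
    printf '%s\n' "$1" |
        sed -n 's|.*pci/\([0-9a-fA-F:.]*[0-9a-fA-F]\).*|\1|p'
}

# Read devlink health notifications and take three register dumps for each
# notification that names a PCI device. Must be run as root.
watch_health() {
    devlink monitor health | while read -r line; do
        addr=$(pci_addr_from_line "$line")
        [ -n "$addr" ] || continue
        echo "health notification for $addr, taking register dumps"
        for i in 1 2 3; do
            # Assumption: the domain-prefixed address is accepted as is.
            mstregdump "$addr" > "mstregdump-$(date +%s)-$i"
        done
    done
}
```

Running watch_health as root blocks and writes timestamped dump files to the current directory as notifications arrive.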

The script can be started automatically using a systemd service unit file. For example:

# /etc/systemd/system/fw_dump.service
[Unit]
Description=Firmware dumps trigger

[Service]
Type=simple
ExecStart=/usr/local/bin/fw_dump.py

[Install]
WantedBy=multi-user.target

To start the service, run:

# systemctl start fw_dump.service

To have the service start automatically at boot, run:

# systemctl enable fw_dump.service
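
The installation steps can also be scripted. The sketch below checks that the script is executable, since systemd cannot start it otherwise, and then writes the unit file shown above. The function name and its two optional arguments are hypothetical and exist only for illustration and dry runs; the defaults are the paths used in this section.

```shell
#!/bin/sh
# Sketch: verify the script is executable and write the unit file.
# Both arguments are hypothetical overrides for dry runs; the defaults
# match the paths used in this section.
install_fw_dump_unit() (
    script="${1:-/usr/local/bin/fw_dump.py}"
    unit_dir="${2:-/etc/systemd/system}"
    if [ ! -x "$script" ]; then
        echo "$script is missing or not executable" >&2
        exit 1
    fi
    # Same unit as above, with ExecStart pointing at the checked script.
    cat > "$unit_dir/fw_dump.service" <<EOF
[Unit]
Description=Firmware dumps trigger

[Service]
Type=simple
ExecStart=$script

[Install]
WantedBy=multi-user.target
EOF
)
```

After running install_fw_dump_unit as root, systemctl daemon-reload makes systemd pick up the file if an earlier version was already loaded, systemctl enable --now fw_dump.service enables and starts the service in one step, and systemctl status fw_dump.service shows whether it is running.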

Further Resources

  1. man mstregdump
  2. man devlink-health