Mellanox ConnectX-6 Lx EN
Source: Imre kasutab arvutit
Introduction
Mellanox hardware
Claims
- Mellanox's so-called higher-end network devices fall into two larger groups: 1. SmartNIC, 2. SuperNIC
- SmartNIC - NVIDIA ConnectX devices
- SuperNIC - NVIDIA BlueField devices
ConnectX devices
- 'connectx-6 lx' and 'connectx-6 dx' devices are all Ethernet devices (i.e. not InfiniBand)
- the 'connectx-6' device is physically a universal Ethernet/InfiniBand device, i.e. the card can be switched in software to run in either Ethernet or InfiniBand mode
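The software switch between Ethernet and InfiniBand can be sketched roughly as follows, assuming MFT is installed and the card is a VPI model (the ConnectX-6 Lx described on this page is Ethernet-only). mlxconfig exposes the per-port parameters LINK_TYPE_P1 and LINK_TYPE_P2, where 1 selects InfiniBand and 2 selects Ethernet. The helper names and the PCI address 0000:5e:00.0 are hypothetical; the helper only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Map a human-readable mode to the numeric LINK_TYPE value used by mlxconfig
# (1 = InfiniBand, 2 = Ethernet). Helper names are illustrative only.
link_type_value() {
    case "$1" in
        ib|infiniband) echo 1 ;;
        eth|ethernet)  echo 2 ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the mlxconfig command for a device, port and mode.
# $1 = PCI address (hypothetical here), $2 = port number (1 or 2), $3 = mode
link_type_cmd() {
    value=$(link_type_value "$3") || return 1
    echo "mlxconfig -d $1 set LINK_TYPE_P$2=$value"
}

# Example: show the command that would switch port 1 to Ethernet.
link_type_cmd 0000:5e:00.0 1 eth
```

A changed link type is stored in the card's configuration and normally takes effect only after a reboot or firmware reset.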
Mellanox integrations
dpdk
- https://enterprise-support.nvidia.com/s/article/mellanox-dpdk
- https://doc.dpdk.org/guides/nics/mlx5.html
kernel tls
TODO
Inbox driver and utilities
Misc
# lspci | grep 3d:00
3d:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
3d:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
devlink dev show output
# devlink dev show
pci/0000:3d:00.0
pci/0000:3d:00.1
and devlink dev info
# devlink dev info
pci/0000:3d:00.0:
  driver mlx5_core
  versions:
      fixed:
        fw.psid SM_1281000001000
      running:
        fw.version 26.35.2000
        fw 26.35.2000
      stored:
        fw.version 26.35.2000
        fw 26.35.2000
pci/0000:3d:00.1:
  driver mlx5_core
  versions:
      fixed:
        fw.psid SM_1281000001000
      running:
        fw.version 26.35.2000
        fw 26.35.2000
      stored:
        fw.version 26.35.2000
        fw 26.35.2000
ethtool output
# ethtool -i ens7f0np0
driver: mlx5_core
version: 5.15.0-92-generic
firmware-version: 26.35.2000 (SM_1281000001000)
expansion-rom-version:
bus-info: 0000:3d:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
where
- the firmware version on the card is the same in the ethtool and 'devlink dev info' output: 26.35.2000
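The comparison of the two firmware versions can also be scripted. The sketch below assumes the text output formats shown above; the function names are made up for illustration, and both functions read the tool output on stdin so they can be tried without a card.

```shell
#!/bin/sh
# Extract the firmware version from `ethtool -i <iface>` output read on stdin.
ethtool_fw_version() {
    awk '/^firmware-version:/ { print $2; exit }'
}

# Extract the running fw.version from `devlink dev info <dev>` output on stdin.
devlink_running_fw() {
    awk '$1 == "running:" { in_running = 1; next }
         $1 == "stored:"  { in_running = 0 }
         in_running && $1 == "fw.version" { print $2; exit }'
}

# Usage on a live system (interface and device names as in the output above):
#   a=$(ethtool -i ens7f0np0 | ethtool_fw_version)
#   b=$(devlink dev info pci/0000:3d:00.0 | devlink_running_fw)
#   [ "$a" = "$b" ] && echo "firmware versions match: $a"
```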
Using the MLNX EN software
Terms
- MFT - NVIDIA Firmware Tools, probably originally Mellanox Firmware Tools
Using the whole physical device in passthrough mode
Claims
- in general, in a Proxmox v. 8 environment a Mellanox device can be handed over to a virtual machine in the usual way (in the PVE web GUI choose 'Add -> PCI device' and select the first MLNX device; the second is added automatically)
- it seems that a physical Mellanox device cannot be passed through to a virtual machine completely, so to speak; for one thing, the SR-IOV capability is missing
- inside the virtual machine the network device can be used as a PF; VFs are not accessible
- a vIOMMU can be added to the virtual machine in the usual way in the PVE web GUI; the virtual machine's devices, including the different physical ports of the MLNX adapter, are then placed in separate IOMMU groups
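Both the SR-IOV claim and the IOMMU grouping can be checked from inside the virtual machine through sysfs. This is a minimal sketch: the function names are hypothetical, and the optional second argument (an alternative sysfs root) exists only so the helpers can be tried against a fake directory tree. A missing sriov_totalvfs file suggests that the device does not expose the SR-IOV capability to this system.

```shell
#!/bin/sh
# Print how many VFs an interface could expose, or "none" if the PCI device
# behind it has no sriov_totalvfs attribute.
# $1 = interface name, $2 = sysfs root (default /sys; test hook, an assumption)
sriov_total_vfs() {
    f="${2:-/sys}/class/net/$1/device/sriov_totalvfs"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "none"
    fi
}

# Print the IOMMU group number of a PCI device, or "none" if it has no group.
# $1 = full PCI address (e.g. 0000:01:00.0), $2 = sysfs root (default /sys)
pci_iommu_group() {
    link="${2:-/sys}/bus/pci/devices/$1/iommu_group"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Usage inside the VM (names taken from the outputs on this page):
#   sriov_total_vfs enp1s0f0np0
#   pci_iommu_group 0000:01:00.0
#   pci_iommu_group 0000:01:00.1
```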
With this arrangement the vfio driver typically keeps handling the device on the host:
root@pve-moraal-x570:~# lspci -vvv | grep vfio
	Kernel driver in use: vfio-pci
	Kernel driver in use: vfio-pci
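The same information is available per device from sysfs, which avoids grepping the full lspci listing. A small sketch, with a hypothetical function name and an optional sysfs-root argument that exists only for testing:

```shell
#!/bin/sh
# Print the kernel driver currently bound to a PCI device, or "unbound".
# $1 = full PCI address, $2 = sysfs root (default /sys; test hook, an assumption)
pci_bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "unbound"
    fi
}

# Usage on the Proxmox host; the PCI address is a placeholder for the
# address of the passed-through adapter:
#   pci_bound_driver 0000:01:00.0    # expected to print vfio-pci while the VM runs
```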
Software installation
TODO
root@debian-mlnx-01:~# mount /root/mlnx-en-24.04-0.6.6.0-debian12.1-x86_64.iso /mnt/mlnx
root@debian-mlnx-01:~# find /lib/modules/6.1.0-22-amd64/ -type f -mmin -20 -ls | grep dkms
   661960   4312 -rw-r--r--   1 root     root      4415205 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx5_core.ko
   661959     28 -rw-r--r--   1 root     root        25237 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx_compat.ko
   661961      8 -rw-r--r--   1 root     root         5565 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx5_ib.ko
   661962     52 -rw-r--r--   1 root     root        49445 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlxfw.ko
   661963    208 -rw-r--r--   1 root     root       210797 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlxdevm.ko
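To see whether the kernel would load the DKMS-built module or the inbox one, the path printed by 'modinfo -n' can be classified. This is only a sketch: the function name is hypothetical, and the path patterns assume the Debian layout visible in the find output above (DKMS modules under updates/dkms, in-tree modules under kernel/).

```shell
#!/bin/sh
# Classify a kernel module file path as DKMS-built or in-tree ("inbox").
module_origin() {
    case "$1" in
        */updates/dkms/*) echo "dkms" ;;
        */kernel/*)       echo "inbox" ;;
        *)                echo "unknown" ;;
    esac
}

# Usage on a live system:
#   module_origin "$(modinfo -n mlx5_core)"
```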
Firmware update
root@debian-mlnx-01:~# apt-get install mlnx-fw-updater
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following package was automatically installed and is no longer required:
  linux-image-6.1.0-15-amd64
Use 'apt autoremove' to remove it.
The following NEW packages will be installed:
  mlnx-fw-updater
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/50.2 MB of archives.
After this operation, 87.9 MB of additional disk space will be used.
Get:1 file:/mnt/mlnx/DEBS_ETH ./ mlnx-fw-updater 24.04-0.6.6.0 [50.2 MB]
Selecting previously unselected package mlnx-fw-updater.
(Reading database ... 63847 files and directories currently installed.)
Preparing to unpack .../mlnx-fw-updater_24.04-0.6.6.0_amd64.deb ...
Unpacking mlnx-fw-updater (24.04-0.6.6.0) ...
Setting up mlnx-fw-updater (24.04-0.6.6.0) ...
Initializing...
Attempting to perform Firmware update...
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX6LX
  Part Number:      MCX631102AN-ADA_Ax
  Description:      ConnectX-6 Lx EN adapter card; 25GbE ; Dual-port SFP28; PCIe 4.0 x8; No Crypto
  PSID:             MT_0000000531
  PCI Device Name:  01:00.0
  Base GUID:        e8ebd303000b7874
  Base MAC:         e8ebd30b7874
  Versions:         Current        Available
     FW             26.32.2004     26.41.1000
     PXE            3.6.0502       3.7.0400
     UEFI           14.25.0018     14.34.0012

  Status:           Update required

---------
Found 1 device(s) requiring firmware update...

Device #1: Updating FW ...
FSMST_INITIALIZE -   OK
Writing Boot image component -   OK
Done

Restart needed for updates to take effect.
Log File: /tmp/oaFVUkaJsl
Real log file: /tmp/mlnx_fw_update.log

root@debian-mlnx-01:~# less /tmp/mlnx_fw_update.log
CMD: mlxup -u --log-on-update --ssl-certificate /tmp/OloIGrYWuz/mlxfwmanager_sriov_dis_x86_64_4127-dir/ca-bundle.crt --current-dir /opt/mellanox/mlnx-fw-updater/ -L /tmp/oaFVUkaJsl -y -d 01:00.0
Querying Mellanox devices firmware ...

Device #1:
----------
...
As a result the card appears to hold two firmware versions: the old one is still running, and the new one is stored in flash until the restart requested by the updater
root@debian-mlnx-01:~# devlink dev info
pci/0000:01:00.0:
  driver mlx5_core
  versions:
      fixed:
        fw.psid MT_0000000531
      running:
        fw.version 26.32.2004
        fw 26.32.2004
      stored:
        fw.version 26.41.1000
        fw 26.41.1000
pci/0000:01:00.1:
  driver mlx5_core
  versions:
      fixed:
        fw.psid MT_0000000531
      running:
        fw.version 26.32.2004
        fw 26.32.2004
      stored:
        fw.version 26.41.1000
        fw 26.41.1000
where
- running version - 26.32.2004
- stored version - 26.41.1000
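A pending firmware activation can be detected by comparing the running and stored fw.version entries for each device. The sketch below assumes the 'devlink dev info' text format shown above; the function name is hypothetical and the input is read on stdin.

```shell
#!/bin/sh
# Print "<device> <running> <stored>" for every device whose stored fw.version
# differs from the running one. Reads `devlink dev info` output on stdin.
fw_pending() {
    awk '
        /^pci\// { dev = $1; sub(/:$/, "", dev); section = "" }
        $1 == "fixed:" || $1 == "running:" || $1 == "stored:" { section = $1 }
        $1 == "fw.version" && section == "running:" { running[dev] = $2 }
        $1 == "fw.version" && section == "stored:" {
            if (running[dev] != $2) print dev, running[dev], $2
        }'
}

# Usage on a live system:
#   devlink dev info | fw_pending
```

No output means that every device already runs the firmware stored in flash. In the situation above a restart is what makes the stored version the running one, as the updater log states.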
systemd unit mlnx-en.d
It seems that a systemd unit takes care of the mlx drivers
root@debian-mlnx-01:~# dpkg -S /etc/mlnx-en.conf
mlnx-en-utils: /etc/mlnx-en.conf

root@debian-mlnx-01:~# cat /etc/mlnx-en.conf
# Allow calling the service script with the option 'stop' for unloading the driver stack.
# This flag should be disabled when the OS root file system is on remote storage.
ALLOW_STOP=yes

# Run sysctl performance tuning script
RUN_SYSCTL=no

# Run /usr/sbin/mlnx_tune
RUN_MLNX_TUNE=no

# Load MLX4 modules
MLX4_LOAD=no

# Load MLX5 modules
MLX5_LOAD=yes

root@debian-mlnx-01:~# systemctl start mlnx-en.d
root@debian-mlnx-01:~# systemctl status mlnx-en.d
● mlnx-en.d.service - mlnx-en.d - configure Mellanox devices
     Loaded: loaded (/lib/systemd/system/mlnx-en.d.service; enabled; preset: enabled)
     Active: active (exited) since Sun 2024-06-30 00:49:29 EEST; 6s ago
       Docs: file:/etc/mlnx-en.conf
    Process: 1505 ExecStart=/etc/init.d/mlnx-en.d start (code=exited, status=0/SUCCESS)
   Main PID: 1505 (code=exited, status=0/SUCCESS)
        CPU: 385ms

Jun 30 00:49:26 debian-mlnx-01 systemd[1]: Starting mlnx-en.d.service - mlnx-en.d - configure Mellanox devices...
Jun 30 00:49:28 debian-mlnx-01 mlnx-en.d[1505]: [32B blob data]
Jun 30 00:49:29 debian-mlnx-01 systemd[1]: Finished mlnx-en.d.service - mlnx-en.d - configure Mellanox devices.
meanwhile in the dmesg output
# dmesg -T -w
[Sun Jun 30 00:49:26 2024] Compat-mlnx-ofed backport release: 7037b8d
[Sun Jun 30 00:49:26 2024] Backport based on https://:@git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git 7037b8d
[Sun Jun 30 00:49:26 2024] compat.git: https://:@git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: firmware version: 26.32.2004
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: 126.024 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x8 link)
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: Port module event: module 0, Cable plugged
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: mlx5_pcie_event:304:(pid 1398): PCIe slot advertised sufficient power (75W).
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.0 enp1s0f0np0: renamed from eth0
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: firmware version: 26.32.2004
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: 126.024 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x8 link)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: Port module event: module 1, Cable plugged
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: mlx5_pcie_event:304:(pid 1391): PCIe slot advertised sufficient power (75W).
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1 enp1s0f1np1: renamed from eth0
Note that 'systemctl stop mlnx-en.d' unloads the mlx modules from memory
root@debian-mlnx-01:~# lsmod | grep mlx
mlx5_core            2269184  0
mlxfw                  36864  1 mlx5_core
mlxdevm               180224  1 mlx5_core
mlx_compat             20480  2 mlxdevm,mlx5_core
psample                20480  1 mlx5_core
tls                   135168  1 mlx5_core
pci_hyperv_intf        16384  1 mlx5_core
root@debian-mlnx-01:~# systemctl stop mlnx-en.d
root@debian-mlnx-01:~# lsmod | grep mlx
root@debian-mlnx-01:~#
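The unload check can be made a little stricter than 'lsmod | grep mlx', which also matches modules such as psample and tls only because mlx5_core appears in their "used by" column. The sketch below reads /proc/modules and matches on the module name alone; the function name and the optional file argument (a test hook) are assumptions of this example.

```shell
#!/bin/sh
# List loaded kernel modules whose own name starts with "mlx".
# $1 = modules file (default /proc/modules; the parameter exists for testing)
loaded_mlx_modules() {
    awk '$1 ~ /^mlx/ { print $1 }' "${1:-/proc/modules}"
}

# Usage on a live system:
#   systemctl stop mlnx-en.d
#   [ -z "$(loaded_mlx_modules)" ] && echo "mlx modules unloaded"
```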
Misc
- https://www.youtube.com/watch?v=XLPgDEbUMgk - 'How to set Mellanox ConnectX VPI to Ethernet or Infiniband in Linux'