Mellanox ConnectX-6 Lx EN: difference between revisions

From: Imre kasutab arvutit
m (Imre moved page Mellanox ConnectX-6 Lx to Mellanox ConnectX-6 Lx EN)

(68 intermediate revisions by the same user not shown)
 
===Introduction===

===Mellanox hardware===

Claims

* Mellanox's higher-end network devices fall into two broad families: 1. SmartNIC, 2. SuperNIC
* SmartNIC: NVIDIA ConnectX devices
* SuperNIC: NVIDIA BlueField devices

ConnectX devices

* the 'connectx-6 lx' and 'connectx-6 dx' devices are all Ethernet devices (i.e. not InfiniBand)
* the 'connectx-6' device is physically a universal Ethernet/InfiniBand device, i.e. the card can be switched in software to run in either Ethernet or InfiniBand mode

===Mellanox integrations===

The following are used

* Ubuntu v. 24.04 platform
* PVE v. 8.2.2
* QEMU v. 9.0.0
* OVS v. 3.1.0

Ways of using the physical Mellanox card

* the dual-port adapter can be passed through as a whole to a virtual machine in the PVE web GUI (tick 'All Functions'): selecting one PCIe device hands both devices, i.e. .0 and .1, to the virtual machine; they are then used from inside the virtual machine with the mlx5_core driver etc.; OVS is not involved at all
* a single physical port of the dual-port adapter can be passed through to a virtual machine in the PVE web GUI (leave 'All Functions' unticked): selecting one PCIe device hands over only that device, i.e. either .0 or .1; it is then used from inside the virtual machine with the mlx5_core driver etc.; OVS is not involved at all
* the ordinary way via OVS: plain OVS is involved
* with the vhost-user driver: OVS-DPDK is involved
* with a representor: OVS-DPDK is involved
* via vDPA: OVS-DPDK is involved

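The two passthrough variants in the first two bullets above differ only in whether the function suffix is part of the PCI address stored in the VM's hostpci entry. A minimal sketch, assuming VM ID 123 as a placeholder and the hypothetical helper name hostpci_value; the qm commands are printed rather than executed:

```shell
#!/bin/sh
# Hypothetical helper: build the hostpci value for whole-card vs single-port passthrough.
#   $1 = PCI address of one function, e.g. 0000:0f:00.0
#   $2 = "all" to pass every function of the device, anything else for just this one
hostpci_value() {
    if [ "$2" = "all" ]; then
        echo "${1%.*}"      # drop the ".0" function suffix
    else
        echo "$1"
    fi
}

# Print (do not run) the corresponding qm commands; 123 is a placeholder VM ID.
echo "qm set 123 --hostpci0 $(hostpci_value 0000:0f:00.0 all)"
echo "qm set 123 --hostpci0 $(hostpci_value 0000:0f:00.1 single)"
```

Without the function suffix the entry covers both ports; with it, only the named port is handed over.
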
====Bringing traffic to the OVS switch with DPDK====

Claims

* the goal is to bring as much traffic as possible from the physical network to the OVS switch, using DPDK on the physical network card
* forwarding the traffic on to a virtual machine attached to the OVS switch is not covered here
* the work is done on Ubuntu 24.04 (because that is probably clearer to start with than working on PVE right away)
* nothing is compiled, i.e. everything is installed from the standard Ubuntu apt repository

The resulting OVS switch looks like this

<pre>
root@dpdp-u2404:~# ovs-vsctl show
09d915bd-744b-4ff1-a223-983c02f05f3b
    Bridge br0
        datapath_type: netdev
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0"}
        Port br0
            Interface br0
                type: internal
        Port vlan11
            tag: 11
            Interface vlan11
                type: internal
    ovs_version: "3.3.0"
</pre>

And the network works like this

<pre>
root@dpdp-u2404:~# ping -c 4 192.168.1.254
PING 192.168.1.254 (192.168.1.254) 56(84) bytes of data.
64 bytes from 192.168.1.254: icmp_seq=1 ttl=255 time=0.765 ms
64 bytes from 192.168.1.254: icmp_seq=2 ttl=255 time=0.344 ms
64 bytes from 192.168.1.254: icmp_seq=3 ttl=255 time=0.285 ms
64 bytes from 192.168.1.254: icmp_seq=4 ttl=255 time=0.312 ms

--- 192.168.1.254 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3100ms
rtt min/avg/max/mdev = 0.285/0.426/0.765/0.196 ms
</pre>

Meanwhile nothing is seen on the ordinary kernel network interface, because the kernel does not handle these packets in the usual sense

<pre>
root@dpdp-u2404:~# tcpdump -ni enp15s0f0np0
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp15s0f0np0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
</pre>

To reach this state on Ubuntu v. 24.04, the OVS-DPDK software is installed; the following guides are useful background

* 'How to use DPDK with Open vSwitch' - https://ubuntu.com/server/docs/how-to-use-dpdk-with-open-vswitch
* 'OVS Offload Using ASAP² Direct' - https://docs.nvidia.com/networking/display/mlnxofedv590590/ovs+offload+using+asap%C2%B2+direct#src-2408744435_safe-id-T1ZTT2ZmbG9hZFVzaW5nQVNBUMKyRGlyZWN0LU9WUy1LZXJuZWxIYXJkd2FyZU9mZmxvYWRz
* 'Using Open vSwitch with DPDK' - https://docs.openvswitch.org/en/latest/howto/dpdk/

 root@dpdp-u2404:~# apt-get install openvswitch-switch-dpdk

Among others, the following packages are installed as dependencies

* dpdk
* openvswitch-switch

And the OVS processes are started

<pre>
root@dpdp-u2404:~# systemctl | grep ovs | grep runni
ovs-vswitchd.service loaded active running Open vSwitch Forwarding Unit
ovsdb-server.service loaded active running Open vSwitch Database Unit
</pre>

OVS is governed by the file /var/lib/openvswitch/conf.db (viewable with e.g. 'root@dpdp-u2404:~# less /var/lib/openvswitch/conf.db'), i.e. if something goes wrong while configuring OVS, a clean start can be had by stopping the processes, deleting the files and starting the processes again

 root@dpdp-u2404:~# systemctl stop ovs-vswitchd
 root@dpdp-u2404:~# systemctl stop ovsdb-server
 root@dpdp-u2404:~# rm /var/lib/openvswitch/.conf.db.~lock~
 root@dpdp-u2404:~# rm /var/lib/openvswitch/conf.db
 root@dpdp-u2404:~# systemctl start ovsdb-server
 root@dpdp-u2404:~# systemctl start ovs-vswitchd

After the software is installed OVS is running and can be configured further

 root@dpdp-u2404:~# echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
 root@dpdp-u2404:~# update-alternatives --get-selections
 root@dpdp-u2404:~# update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk
 root@dpdp-u2404:~# update-alternatives --get-selections
 root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
 root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-lcore-mask=0x1"
 root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-alloc-mem=2048"
 root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-extra=--allow=0000:0f:00.0"
 root@dpdp-u2404:~# systemctl restart openvswitch-switch

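The 1024 hugepages written to nr_hugepages line up with dpdk-alloc-mem=2048: 1024 pages of 2048 kB are exactly 2048 MiB. A minimal sketch of that arithmetic, assuming the memory figure is in MiB and the page size in kB; hugepages_needed is a hypothetical helper name, not an OVS or DPDK tool:

```shell
#!/bin/sh
# Hypothetical helper: number of hugepages needed to back a given amount of memory.
#   $1 = memory in MiB (e.g. the dpdk-alloc-mem value)
#   $2 = hugepage size in kB (e.g. 2048 for 2 MiB pages)
hugepages_needed() {
    mem_kb=$(( $1 * 1024 ))
    # Round up so that a partial page still gets a whole hugepage.
    echo $(( (mem_kb + $2 - 1) / $2 ))
}

hugepages_needed 2048 2048   # pages for 2048 MiB on 2 MiB hugepages
```

Any other consumer of hugepages on the same host (for example a guest configured with hugepages) needs its own pages on top of this figure.
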
To remove an other_config element one can say e.g.

 ovs-vsctl remove Open_vSwitch . other_config dpdk-socket-mem

To make the settings inside OVS

 root@dpdp-u2404:~# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
 root@dpdp-u2404:~# ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0
 root@dpdp-u2404:~# ovs-vsctl add-port br0 vlan11 tag=11 -- set interface vlan11 type=internal
 root@dpdp-u2404:~# ifconfig vlan11 192.168.1.57/24

Notes

* characteristic of this solution is that one CPU is 100% loaded ('us' user load in the top output)

Performance assessment

<pre>
root@dpdp-u2404:~# timeout 30 hping3 -S -c 400000000 --flood -p 53 192.168.1.254

root@dpdp-u2404:~# vnstat -l vlan11
 vlan11  /  traffic statistics

                           rx         |       tx
--------------------------------------+------------------
  bytes                     8.06 KiB  |      456.04 MiB
--------------------------------------+------------------
          max            3.32 kbit/s  |   133.65 Mbit/s
      average            1.83 kbit/s  |   106.27 Mbit/s
          min              944 bit/s  |         0 bit/s
--------------------------------------+------------------
  packets                        138  |         8855498
--------------------------------------+------------------
          max                  7 p/s  |      309383 p/s
      average                  3 p/s  |      245986 p/s
          min                  2 p/s  |           0 p/s
--------------------------------------+------------------
  time                    36 seconds
</pre>

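The vnstat averages imply the size of the generated packets: dividing the average bit rate by the average packet rate gives bits per packet. A rough sketch using the tx figures from the output (106.27 Mbit/s, 245986 p/s); bytes_per_packet is a hypothetical helper name, and reading vnstat's Mbit/s as 10^6 bit/s is an assumption:

```shell
#!/bin/sh
# Hypothetical helper: average packet size from a bit rate and a packet rate.
#   $1 = bit rate in bit/s
#   $2 = packet rate in packets/s
bytes_per_packet() {
    # Integer arithmetic: bits per packet first, then bytes.
    echo $(( $1 / $2 / 8 ))
}

bytes_per_packet 106270000 245986
```

About 54 bytes per packet is what a bare TCP SYN occupies on Ethernet (14-byte Ethernet header, 20-byte IPv4 header, 20-byte TCP header), which is consistent with the hping3 -S flood.
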
For tuning, levers such as these can be used

 root@dpdp-u2404:~# ovs-vsctl set Interface dpdk-p0 "options:n_rxq=4"
 root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x00f0

As a result

<pre>
root@dpdp-u2404:~# ovs-vsctl show
09d915bd-744b-4ff1-a223-983c02f05f3b
    Bridge br0
        datapath_type: netdev
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0", n_rxq="2"}
...
</pre>

and in 'top -> 1' it can be seen that 0x00f0 specifies that all processors in the CPU set 4-7 are to be used. As a result the volume of incoming traffic, i.e. traffic that arrives and is received, is several times larger, e.g.

<pre>
root@dpdp-u2404:~# vnstat -l -i vlan11
Monitoring vlan11... (press CTRL-C to stop)

rx: 382.62 Mbit/s 854067 p/s tx: 198.13 Mbit/s 427007 p/s^C
...
</pre>

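The pmd-cpu-mask and dpdk-lcore-mask values are bitmasks in which bit N selects CPU N, so 0x00f0 (binary 1111 0000) selects CPUs 4 to 7 and 0x1 selects CPU 0. A small sketch for converting in both directions; the helper names mask_to_cores and cores_to_mask are hypothetical:

```shell
#!/bin/sh
# Hypothetical helpers: translate between a hex CPU mask and CPU numbers.

# Print the CPU numbers whose bits are set in the mask given as $1.
mask_to_cores() {
    mask=$(( $1 ))          # accepts hex such as 0x00f0
    core=0
    cores=""
    while [ "$mask" -gt 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            cores="$cores $core"
        fi
        mask=$(( mask >> 1 ))
        core=$(( core + 1 ))
    done
    echo "${cores# }"
}

# Print the hex mask that selects the CPU numbers given as arguments.
cores_to_mask() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))
    done
    printf '0x%x\n' "$mask"
}

mask_to_cores 0x00f0     # CPUs selected by pmd-cpu-mask
cores_to_mask 4 5 6 7    # the mask for the same CPUs
```
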
Useful additional materials

* https://enterprise-support.nvidia.com/s/article/mellanox-dpdk
* https://doc.dpdk.org/guides/nics/mlx5.html
* http://www.virtualopensystems.com/en/solutions/guides/snabbswitch-qemu/

====Bringing traffic via the OVS switch to a QEMU virtual machine with DPDK - vhost-user protocol====

Claims

* vhost-user-client
* the work is done in a PVE v. 8.2.2 environment
* as a result the virtual machine can be operated in the natural way from the PVE web GUI etc. (incl. start and stop), while e.g. its networking works together with the DPDK solution
* nothing is compiled, i.e. everything is installed from the standard Debian v. 12 and PVE apt repositories
* it remains open how the PVE guest's networking should be prepared so that it handles packets delivered via DPDK efficiently (currently it is an ordinary Debian v. 12 machine with virtio networking etc.)

This solution continues and extends the solution of the previous section. Preparing the PVE host

* installing the DPDK software: see the previous Ubuntu v. 24.04 section
* configuring the OVS DPDK properties

<pre>
root@pve-moraal-x570:~/20240706# echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
root@pve-moraal-x570:~/20240706# grep -i hugepages /proc/meminfo
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 2048
HugePages_Free: 2048
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB

root@pve-moraal-x570:~/20240706# findmnt | grep huge
│ ├─/dev/hugepages hugetlbfs hugetlbfs rw,relatime,pagesize=2M
│ ├─/run/hugepages/kvm/2048kB hugetlbfs hugetlbfs rw,relatime,pagesize=2M
│ └─/run/hugepages/kvm/1048576kB hugetlbfs hugetlbfs rw,relatime,pagesize=1024M

root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-lcore-mask=0x1"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-alloc-mem=2048"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-extra=--allow=0000:0f:00.0"
root@pve-moraal-x570:~/20240706# systemctl restart openvswitch-switch

root@pve-moraal-x570:~/20240706# ovs-vsctl add-br vmbr0 -- set bridge vmbr0 datapath_type=netdev
root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0

root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 vlan14 tag=14 -- set interface vlan14 type=internal
root@pve-moraal-x570:~/20240706# ifconfig vlan14 192.168.112.169/24

root@pve-moraal-x570:~/20240706# mkdir /var/run/vhostuserclient
root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuserclient "options:vhost-server-path=/var/run/vhostuserclient/vhost-user-client-1"
root@pve-moraal-x570:~/20240706# ovs-vsctl set port vhost-user-1 tag=10
(this is probably not relevant: root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . other_config:hw-offload=true )

root@pve-moraal-x570:~/20240706# ovs-vsctl set Interface dpdk-p0 "options:n_rxq=4"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x00f0
</pre>

PVE guest configuration file

<pre>
root@pve-moraal-x570:~/20240706# cat /etc/pve/qemu-server/123.conf
agent: 1
bios: ovmf
boot: order=virtio0;ide2;net0
cores: 4
cpu: host
efidisk0: sn_srv_btrfs:123/vm-123-disk-5.raw,efitype=4m,pre-enrolled-keys=1,size=528K
ide2: none,media=cdrom
machine: q35
memory: 1024
name: deb11-tm-tartu-btrfs
# net0: virtio=12:4A:8D:1E:33:3D,bridge=vmbr0,firewall=1,tag=10
numa: 1
onboot: 0
ostype: l26
rng0: source=/dev/urandom
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=0d52990c-3194-4c80-942f-d14621a6e940
sockets: 1
virtio0: sn_srv_btrfs:123/vm-123-disk-0.raw,size=16G
virtio1: sn_srv_btrfs:123/vm-123-disk-1.raw,size=32G
virtio2: sn_srv_btrfs:123/vm-123-disk-2.raw,size=24G
vmgenid: fb558ebc-7d59-4c13-8eae-505349adf9a1
hugepages: 1024
args: -machine q35+pve0,kernel_irqchip=split \
-device intel-iommu,intremap=on,caching-mode=on \
-chardev socket,id=char1,path=/var/run/vhostuserclient/vhost-user-client-1,server=on \
-netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce=on,queues=4 \
-device virtio-net-pci,mac=12:4A:8D:1E:33:3D,netdev=mynet1,mq=on,vectors=10,rx_queue_size=1024,tx_queue_size=256
</pre>

where

* no network interface is configured in the PVE web GUI; instead the network interface is built by hand via 'args'
* 'args' also adds other elements the QEMU guest needs (possibly not all of this is necessary as of 2024, e.g. 'kernel_irqchip=split' seems to be the default anyway)
* the chardev/netdev/device set following 'args' assumes that the OVS vhostuserclient preparations described above have been made

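In the args line, queues=4 on the netdev goes together with vectors=10 on the virtio-net-pci device. The usual rule for multiqueue virtio-net is two MSI-X vectors per queue pair (one RX, one TX) plus two more (configuration changes and the control queue). A small sketch of that rule; virtio_net_vectors is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: MSI-X vector count for a multiqueue virtio-net device.
#   $1 = number of queue pairs (the queues= value on the netdev)
virtio_net_vectors() {
    echo $(( 2 * $1 + 2 ))
}

virtio_net_vectors 4    # matches vectors=10 used with queues=4
```

If the queue count is changed, the vectors value on the device line should be recomputed the same way.
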
Result

* TODO

Useful additional materials

* https://stackoverflow.com/questions/69710907/connect-qemu-kvm-vms-using-vhost-user-client-and-ovs-dpdk
* https://docs.redhat.com/en/documentation/red_hat_openstack_platform/10/html/network_functions_virtualization_planning_guide/ch-vhost-user-ports
* https://forum.proxmox.com/threads/tutorial-run-open-vswitch-ovs-dpdk-on-pve-7-0.97116/
* https://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/
* https://docs.nvidia.com/networking/display/mlnxenv24040700/ovs+offload+using+asap%C2%B2+direct#src-2958624697_safe-id-T1ZTT2ZmbG9hZFVzaW5nQVNBUMKyRGlyZWN0LWh3dmRwYQ
* https://www.youtube.com/watch?v=y0ASTg3VCCc
* https://www.intel.com/content/www/us/en/developer/articles/technical/data-plane-development-kit-vhost-user-client-mode-with-open-vswitch.html
* https://www.redhat.com/en/blog/journey-vhost-users-realm
* https://www.redhat.com/en/virtio-networking-series

====Bringing traffic via the OVS switch to a QEMU virtual machine with DPDK - VF representor====

First verify that the physical Mellanox network card is present and that the eswitch is in legacy mode

<pre>
root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]

root@pve-moraal-x570:~/20240706# devlink dev eswitch show pci/0000:0f:00.1
pci/0000:0f:00.1: mode legacy inline-mode none encap-mode basic

root@pve-moraal-x570:~/20240706# devlink dev eswitch show pci/0000:0f:00.0
pci/0000:0f:00.0: mode legacy inline-mode none encap-mode basic
</pre>

Two VFs are enabled

<pre>
root@pve-moraal-x570:~# echo 2 > /sys/class/net/enp15s0f0np0/device/sriov_numvfs

root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
</pre>

MAC addresses are assigned to the VFs

<pre>
root@pve-moraal-x570:~# ip link set enp15s0f0np0 vf 0 mac e4:11:22:33:46:50
root@pve-moraal-x570:~# ip link set enp15s0f0np0 vf 1 mac e4:11:22:33:46:51

root@pve-moraal-x570:~# ip link show dev enp15s0f0np0
3: enp15s0f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether e8:eb:d3:0b:78:74 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether e4:11:22:33:46:50 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1 link/ether e4:11:22:33:46:51 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
</pre>

The mlx5_core driver is unbound from the VF devices

<pre>
root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use
Kernel driver in use: mlx5_core
Kernel driver in use: mlx5_core
Kernel driver in use: mlx5_core
Kernel driver in use: mlx5_core

root@pve-moraal-x570:~# echo 0000:0f:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
root@pve-moraal-x570:~# echo 0000:0f:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind

root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use
Kernel driver in use: mlx5_core
Kernel driver in use: mlx5_core
</pre>

The eswitch is switched from legacy to switchdev mode

 root@pve-moraal-x570:~# devlink dev eswitch set pci/0000:0f:00.0 mode switchdev

The mlx5_core driver is bound to the VF devices again

<pre>
root@pve-moraal-x570:~# echo 0000:0f:00.2 > /sys/bus/pci/drivers/mlx5_core/bind
root@pve-moraal-x570:~# echo 0000:0f:00.3 > /sys/bus/pci/drivers/mlx5_core/bind

root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use
Kernel driver in use: mlx5_core
Kernel driver in use: mlx5_core
Kernel driver in use: mlx5_core
Kernel driver in use: mlx5_core
</pre>

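The unbind, switchdev and rebind steps form one fixed sequence per physical function. A dry-run sketch that prints that sequence for a given PF and its VFs instead of executing it; switchdev_plan is a hypothetical helper name, and the PCI addresses in the example are the ones used on this page:

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the unbind / switchdev / rebind sequence
# for one PF and its VFs. Nothing is executed.
#   $1 = PF PCI address, remaining arguments = VF PCI addresses
switchdev_plan() {
    pf=$1
    shift
    for vf in "$@"; do
        echo "echo $vf > /sys/bus/pci/drivers/mlx5_core/unbind"
    done
    echo "devlink dev eswitch set pci/$pf mode switchdev"
    for vf in "$@"; do
        echo "echo $vf > /sys/bus/pci/drivers/mlx5_core/bind"
    done
}

switchdev_plan 0000:0f:00.0 0000:0f:00.2 0000:0f:00.3
```

Printing the plan first makes it easy to check the addresses before running anything as root.
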
The OVS processes are restarted

<pre>
root@pve-moraal-x570:~# ovs-vsctl show
9ffe33a7-4486-4701-b4c0-880c61f407b7
    ovs_version: "3.1.0"

root@pve-moraal-x570:~# systemctl restart openvswitch-switch

root@pve-moraal-x570:~# ovs-vsctl show
9ffe33a7-4486-4701-b4c0-880c61f407b7
    ovs_version: "3.1.0"
</pre>

The hugepage resource is taken into use

<pre>
# echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
</pre>

The general OVS settings are applied

<pre>
root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . other_config:dpdk-extra="-a 0000:0f:00.0,representor=0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1"
root@pve-moraal-x570:~# systemctl restart openvswitch-switch
</pre>

The OVS contents are created

<pre>
root@pve-moraal-x570:~# ovs-vsctl add-br vmbr0 -- set bridge vmbr0 datapath_type=netdev
root@pve-moraal-x570:~# ovs-vsctl add-port vmbr0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0
root@pve-moraal-x570:~# ovs-vsctl add-port vmbr0 representor -- set Interface representor type=dpdk options:dpdk-devargs=0000:0f:00.0,representor=0
root@pve-moraal-x570:~# ovs-vsctl set port representor tag=10
</pre>

As a result

<pre>
root@pve-moraal-x570:~# ovs-vsctl list open_vSwitch
_uuid               : 9ffe33a7-4486-4701-b4c0-880c61f407b7
bridges             : [1d7619bb-7b33-4959-9266-325808d72c13]
cur_cfg             : 7
datapath_types      : [netdev, system]
datapaths           : {}
db_version          : "8.3.1"
dpdk_initialized    : true
dpdk_version        : "DPDK 22.11.5"
external_ids        : {hostname=pve-moraal-x570.sise.moraal.ee, rundir="/var/run/openvswitch", system-id="9e871242-44eb-499f-a049-32f089e65f68"}
iface_types         : [afxdp, afxdp-nonpmd, bareudp, dpdk, dpdkvhostuser, dpdkvhostuserclient, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options     : []
next_cfg            : 7
other_config        : {dpdk-extra="-a 0000:0f:00.0,representor=0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1", dpdk-init="true", hw-offload="true"}
ovs_version         : "3.1.0"
ssl                 : []
statistics          : {}
system_type         : debian
system_version      : "12"
</pre>

and

<pre>
root@pve-moraal-x570:~# ovs-vsctl show
9ffe33a7-4486-4701-b4c0-880c61f407b7
    Bridge vmbr0
        datapath_type: netdev
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0"}
        Port vmbr0
            Interface vmbr0
                type: internal
        Port representor
            tag: 10
            Interface representor
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0,representor=0"}
    ovs_version: "3.1.0"
</pre>

PVE virtual machine configuration; the representor solution works such that the VF device has to be handed to the virtual machine in the ordinary PCI passthrough manner

<pre>
root@pve-moraal-x570:~# cat /etc/pve/qemu-server/123.conf
agent: 1
bios: ovmf
boot: order=virtio0;ide2;net0
cores: 4
cpu: host
efidisk0: sn_srv_btrfs:123/vm-123-disk-5.raw,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: 0000:0f:00.2
hugepages: 1024
ide2: none,media=cdrom
machine: q35,viommu=intel
memory: 1024
name: deb11-tm-tartu-btrfs
numa: 1
onboot: 0
ostype: l26
rng0: source=/dev/urandom
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=0d52990c-3194-4c80-942f-d14621a6e940
sockets: 1
vga: virtio
virtio0: sn_srv_btrfs:123/vm-123-disk-0.raw,size=16G
virtio1: sn_srv_btrfs:123/vm-123-disk-1.raw,size=32G
virtio2: sn_srv_btrfs:123/vm-123-disk-2.raw,size=24G
vmgenid: fb558ebc-7d59-4c13-8eae-505349adf9a1
</pre>

As a result

* the network works
* performance is moderate
* in the virtual machine the ordinary mlx5_core driver is used for the network device

<pre>
root@tm-tartu-x570:~# lspci | grep Mella
06:10.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
</pre>

On the PVE host the characteristic 100% use of one CPU is visible

[[Fail:20240708-mlnx-pmd-01.png]]

Useful additional materials

* https://docs.openvswitch.org/en/latest/topics/dpdk/phy/#
* https://github.com/Mellanox/scalablefunctions/wiki/Upstream-step-by-step-guide
* https://www.youtube.com/watch?v=37MN8C_MNuQ

====Bringing traffic via the OVS switch to a QEMU virtual machine with DPDK - vDPA====

TODO

Useful additional materials

* https://metonymical.hatenablog.com/entry/2021/04/14/002638

====Kernel TLS====

<pre>
TODO
</pre>

====Inbox driver and utilities====

 # apt-get install mstflint

To change the card's SR-IOV setting one can say

<pre>
root@pve-moraal-x570:~# mstconfig -d 0000:0f:00.0 set SRIOV_EN=False

Device #1:
----------

Device type:    ConnectX6LX
Name:           MCX631102AN-ADA_Ax
Description:    ConnectX-6 Lx EN adapter card; 25GbE ; Dual-port SFP28; PCIe 4.0 x8; No Crypto
Device:         0000:0f:00.0

Configurations:                 Next Boot       New
         SRIOV_EN               True(1)         False(0)

Apply new Configuration? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
</pre>

As a result the VF capability is gone

<pre>
root@pve-moraal-x570:~# ls -ld /sys/class/net/enp15s0f0np0/device/s*
lrwxrwxrwx 1 root root 0 Jun 30 03:32 /sys/class/net/enp15s0f0np0/device/subsystem -> ../../../../bus/pci
-r--r--r-- 1 root root 4096 Jun 30 03:35 /sys/class/net/enp15s0f0np0/device/subsystem_device
-r--r--r-- 1 root root 4096 Jun 30 03:35 /sys/class/net/enp15s0f0np0/device/subsystem_vendor
</pre>

After switching it back on the VF capability is back

<pre>
root@pve-moraal-x570:~# mstconfig -d 0000:0f:00.0 set SRIOV_EN=True

root@pve-moraal-x570:~# ls -ld /sys/class/net/enp15s0f0np0/device/s*
-rw-r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_drivers_autoprobe
-rw-r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_numvfs
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_offset
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_stride
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_totalvfs
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_vf_device
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_vf_total_msix
lrwxrwxrwx 1 root root 0 Jun 30 03:39 /sys/class/net/enp15s0f0np0/device/subsystem -> ../../../../bus/pci
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/subsystem_device
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/subsystem_vendor

root@pve-moraal-x570:~# cat /sys/class/net/enp15s0f0np0/device/sriov_totalvfs
8
</pre>

Secure boot must be switched off to make these changes.

The mstlink utility

<pre>
root@pve-moraal-x570:~# mstlink -d 0000:0f:00.0 --show_device

Operational Info
----------------
State                           : Polling
Physical state                  : ETH_AN_FSM_ENABLE
Speed                           : N/A
Width                           : N/A
FEC                             : N/A
Loopback Mode                   : No Loopback
Auto Negotiation                : FORCE - 25G,10G,1G

Supported Info
--------------
Enabled Link Speed (Ext.)       : 0x00000052 (25G,10G,1G)
Supported Cable Speed (Ext.)    : 0x00000003 (1G,100M)

Troubleshooting Info
--------------------
Status Opcode                   : 36
Group Opcode                    : PHY FW
Recommendation                  : Force Mode no partner detected.

Tool Information
----------------
Firmware Version                : 26.41.1000
amBER Version                   : 2.05
MSTFLINT Version                : mstflint 4.21.0

Device Info
-----------
Part Number                     : N/A
Part Name                       : N/A
Serial Number                   : N/A
Revision                        : N/A
FW Version                      : 26.41.1000

Note: P/N, Product Name, S/N and Revision are supported only in switches
</pre>

To use Virtual Functions, the starting point is the ordinary state

<pre>
root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
</pre>

With 4 virtual functions enabled (by writing 4 to sriov_numvfs, in the same way as in the two-VF example in the representor section) the device list looks like this

<pre>
root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.4 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.5 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
</pre>

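The VF count is controlled through the sriov_numvfs file and is capped by sriov_totalvfs (8 on this card according to the output earlier on this page). A minimal sketch of a guarded setter, assuming the device directory is passed as an argument; set_numvfs is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: set the number of VFs for a PF, checking the limit first.
#   $1 = device directory, e.g. /sys/class/net/enp15s0f0np0/device
#   $2 = requested number of VFs
set_numvfs() {
    dev_dir=$1
    want=$2
    total=$(cat "$dev_dir/sriov_totalvfs") || return 1
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, device supports at most $total" >&2
        return 1
    fi
    # The kernel refuses to go from one non-zero VF count straight to another,
    # so reset to 0 first.
    echo 0 > "$dev_dir/sriov_numvfs" || return 1
    echo "$want" > "$dev_dir/sriov_numvfs"
}

# Example (as root, with the interface name used on this page):
# set_numvfs /sys/class/net/enp15s0f0np0/device 4
```

Taking the directory as a parameter keeps the function usable for either port and lets it be tried against a scratch directory before touching the real device.
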
The following devices are created

<pre>
root@pve-moraal-x570:~# ip link show dev enp15s0f0np0
50: enp15s0f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether e8:eb:d3:0b:78:74 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 2 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 3 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
root@pve-moraal-x570:~# ip link show dev enp15s0f0v0
56: enp15s0f0v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 3a:48:d1:d6:57:dd brd ff:ff:ff:ff:ff:ff
</pre>

A VF device can be used on the host practically like an ordinary network device. Another option is to hand it on to a PVE virtual machine via PCIe passthrough. Inside the virtual machine the card looks like this

<pre>
root@pve-sdn-01:~# lspci | grep Mell
01:00.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
root@pve-sdn-01:~# ethtool -i enp1s0
driver: mlx5_core
version: 6.8.8-1-pve
firmware-version: 26.41.1000 (MT_0000000531)
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
</pre>

where

* in the virtual machine the same mlx5 driver can be used as on the PVE host

===Using DPDK===

The dpdk and dpdk-dev packages are installed

 # apt-get install dpdk dpdk-dev

As a result the system has, among others, the utilities

* dpdk-devbind.py
* dpdk-hugepages.py
* dpdk-testpmd

Assessing the situation, see https://doc.dpdk.org/guides/nics/mlx5.html -> 'Usage example'

<pre>
root@pve-moraal-x570:~# ls -d /sys/class/net/*/device/infiniband_verbs/uverbs* | cut -d / -f 5
enp15s0f0np0
enp15s0f1np1
</pre>

Starting testpmd; it is not entirely clear whether this counts as a successful start

<pre>
root@pve-moraal-x570:~# dpdk-testpmd -l 8-15 -n 4 -a 0f:00.0 -a 0f:00.1 -- --rxq=2 --txq=2 -i
EAL: Detected CPU lcores: 24
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: Probe PCI driver: mlx5_pci (15b3:101f) device: 0000:0f:00.0 (socket -1)
EAL: Probe PCI driver: mlx5_pci (15b3:101f) device: 0000:0f:00.1 (socket -1)
Interactive-mode selected
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=203456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: E8:EB:D3:0B:78:74
Configuring Port 1 (socket 0)
Port 1: E8:EB:D3:0B:78:75
Checking link statuses...
Done
testpmd>
</pre>

For example

<pre>
root@pve-moraal-x570:~# dpdk-testpmd -l 6-9 -n 4 -a 0f:00.0 -- --rxq=4 --txq=4 -i
testpmd> set fwd txonly

testpmd> show port stats all

  ######################## NIC statistics for port 0  ########################
  RX-packets: 0          RX-missed: 0          RX-bytes:  0
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 5153664    TX-errors: 0          TX-bytes:  329834496

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:      1420582          Tx-bps:    727338208
  ############################################################################
</pre>

and listening on the network on the second port shows

<pre>
root@pve-moraal-x570:~# tcpdump -c 4 -nei enp15s0f1np1
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp15s0f1np1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
09:07:53.700765 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
</pre>

Useful additional materials
  +
  +
* https://www.youtube.com/watch?v=KX1QOqMtchg
  +
* https://www.youtube.com/watch?v=0yDdMWQPCOI
  +
* https://www.youtube.com/watch?v=Un5-AN4nb9s
   
 
===Misc===
Line 64: Line 843:
   
 
* the firmware version on the card matches between the ethtool and devlink dev info output: 26.35.2000
  +
  +
===Using the vendor's MLNX EN software===
  +
  +
====Terms====
  +
  +
* MFT - NVIDIA Firmware Tools, probably originally Mellanox Firmware Tools
  +
  +
====Using the whole physical device in passthrough mode====
  +
  +
Notes
  +
  +
* in general, in a Proxmox v. 8 environment a Mellanox device can be handed over to a virtual machine in the usual way (in the PVE web GUI choose 'Add -> PCI device' and select the first MLNX device; the second is added automatically)
  +
* it seems the physical Mellanox device cannot be passed to a virtual machine completely; for one thing, the SR-IOV capability is missing
  +
* in the virtual machine the network device can be used through its PF; VFs are not accessible
  +
* a vIOMMU can be added to the virtual machine in the usual PVE web GUI, after which the virtual machine's devices, including the individual physical ports of the MLNX adapter, are placed in separate IOMMU groups
  +
  +
With this arrangement the vfio driver typically keeps handling the device on the host
  +
  +
<pre>
  +
root@pve-moraal-x570:~# lspci -vvv | grep vfio
  +
Kernel driver in use: vfio-pci
  +
Kernel driver in use: vfio-pci
  +
</pre>
  +
  +
To get the device back on the host under the mlx driver without rebooting the host:
  +
  +
* stop the virtual machine
  +
* run on the host
  +
  +
<pre>
  +
echo 1 > /sys/bus/pci/devices/0000\:0f\:00.0/remove
  +
echo 1 > /sys/bus/pci/devices/0000\:0f\:00.1/remove
  +
echo 1 > /sys/bus/pci/rescan
  +
</pre>
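Whether the ports really came back under mlx5_core after the remove and rescan above can be read from sysfs, where each PCI device has a `driver` symlink while a driver is bound. A minimal sketch: the function name `pci_driver_of` and its optional sysfs-root argument are assumptions made for this illustration, not part of any Mellanox tooling.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel driver currently bound to a PCI device.
# The optional second argument (sysfs root) exists only so the function can be
# exercised against a fake directory tree.
pci_driver_of() {
    addr=$1
    root=${2:-/sys}
    link="$root/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        # 'driver' is a symlink to the driver directory; its last component is the name.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Report the driver for both ports of the adapter used above
# (prints 'none' when no driver is bound or the device does not exist).
for addr in 0000:0f:00.0 0000:0f:00.1; do
    echo "$addr: $(pci_driver_of "$addr")"
done
```

While the virtual machine holds the device this is expected to print vfio-pci, and mlx5_core once the host has taken it back.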
  +
  +
====Installing the software====
  +
  +
TODO
  +
  +
<pre>
  +
root@debian-mlnx-01:~# mount /root/mlnx-en-24.04-0.6.6.0-debian12.1-x86_64.iso /mnt/mlnx
  +
  +
root@debian-mlnx-01:~# find /lib/modules/6.1.0-22-amd64/ -type f -mmin -20 -ls | grep dkms
  +
661960 4312 -rw-r--r-- 1 root root 4415205 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx5_core.ko
  +
661959 28 -rw-r--r-- 1 root root 25237 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx_compat.ko
  +
661961 8 -rw-r--r-- 1 root root 5565 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx5_ib.ko
  +
661962 52 -rw-r--r-- 1 root root 49445 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlxfw.ko
  +
661963 208 -rw-r--r-- 1 root root 210797 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlxdevm.ko
  +
</pre>
  +
  +
====Updating the firmware====
  +
  +
<pre>
  +
root@debian-mlnx-01:~# apt-get install mlnx-fw-updater
  +
Reading package lists... Done
  +
Building dependency tree... Done
  +
Reading state information... Done
  +
The following package was automatically installed and is no longer required:
  +
linux-image-6.1.0-15-amd64
  +
Use 'apt autoremove' to remove it.
  +
The following NEW packages will be installed:
  +
mlnx-fw-updater
  +
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
  +
Need to get 0 B/50.2 MB of archives.
  +
After this operation, 87.9 MB of additional disk space will be used.
  +
Get:1 file:/mnt/mlnx/DEBS_ETH ./ mlnx-fw-updater 24.04-0.6.6.0 [50.2 MB]
  +
Selecting previously unselected package mlnx-fw-updater.
  +
(Reading database ... 63847 files and directories currently installed.)
  +
Preparing to unpack .../mlnx-fw-updater_24.04-0.6.6.0_amd64.deb ...
  +
Unpacking mlnx-fw-updater (24.04-0.6.6.0) ...
  +
Setting up mlnx-fw-updater (24.04-0.6.6.0) ...
  +
Initializing...
  +
Attempting to perform Firmware update...
  +
Querying Mellanox devices firmware ...
  +
  +
Device #1:
  +
----------
  +
  +
Device Type: ConnectX6LX
  +
Part Number: MCX631102AN-ADA_Ax
  +
Description: ConnectX-6 Lx EN adapter card; 25GbE ; Dual-port SFP28; PCIe 4.0 x8; No Crypto
  +
PSID: MT_0000000531
  +
PCI Device Name: 01:00.0
  +
Base GUID: e8ebd303000b7874
  +
Base MAC: e8ebd30b7874
  +
Versions: Current Available
  +
FW 26.32.2004 26.41.1000
  +
PXE 3.6.0502 3.7.0400
  +
UEFI 14.25.0018 14.34.0012
  +
  +
Status: Update required
  +
  +
---------
  +
Found 1 device(s) requiring firmware update...
  +
  +
Device #1: Updating FW ...
  +
FSMST_INITIALIZE - OK
  +
Writing Boot image component - OK
  +
Done
  +
  +
Restart needed for updates to take effect.
  +
Log File: /tmp/oaFVUkaJsl
  +
Real log file: /tmp/mlnx_fw_update.log
  +
  +
root@debian-mlnx-01:~# less /tmp/mlnx_fw_update.log
  +
CMD: mlxup -u --log-on-update --ssl-certificate /tmp/OloIGrYWuz/mlxfwmanager_sriov_dis_x86_64_4127-dir/ca-bundle.crt --current-dir /opt/mellanox/mlnx-fw-updater/ -L /tmp/oaFVUkaJsl -y -d 01:00.0
  +
Querying Mellanox devices firmware ...
  +
  +
Device #1:
  +
----------
  +
  +
...
  +
</pre>
  +
  +
As a result, the card appears to hold two firmware versions
  +
  +
<pre>
  +
root@debian-mlnx-01:~# devlink dev info
  +
pci/0000:01:00.0:
  +
driver mlx5_core
  +
versions:
  +
fixed:
  +
fw.psid MT_0000000531
  +
running:
  +
fw.version 26.32.2004
  +
fw 26.32.2004
  +
stored:
  +
fw.version 26.41.1000
  +
fw 26.41.1000
  +
pci/0000:01:00.1:
  +
driver mlx5_core
  +
versions:
  +
fixed:
  +
fw.psid MT_0000000531
  +
running:
  +
fw.version 26.32.2004
  +
fw 26.32.2004
  +
stored:
  +
fw.version 26.41.1000
  +
fw 26.41.1000
  +
</pre>
  +
  +
where
  +
  +
* running version - 26.32.2004
  +
* stored version - 26.41.1000
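The running/stored comparison above can be automated by parsing the `devlink dev info` text. A minimal sketch, assuming the layout shown above (a 'running:' section followed by a 'stored:' section, each with an 'fw.version' line); the function name `fw_pending` and the 'in-sync' / 'reboot-pending' labels are invented for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: read 'devlink dev info' text on stdin and report, per device,
# whether the running firmware differs from the stored one.
fw_pending() {
    awk '
        # A new device block starts with its handle, e.g. "pci/0000:01:00.0:".
        /^pci\// { dev = $1; sub(/:$/, "", dev); section = ""; running = ""; next }
        $1 == "running:" { section = "running"; next }
        $1 == "stored:"  { section = "stored"; next }
        $1 == "fw.version" && section == "running" { running = $2 }
        $1 == "fw.version" && section == "stored" {
            # Appending "" forces a string comparison of the version numbers.
            state = (running "" == $2 "") ? "in-sync" : "reboot-pending"
            print dev, "running=" running, "stored=" $2, state
        }
    '
}

# Sample input copied from the output above (first device only).
fw_pending <<'EOF'
pci/0000:01:00.0:
driver mlx5_core
versions:
fixed:
fw.psid MT_0000000531
running:
fw.version 26.32.2004
fw 26.32.2004
stored:
fw.version 26.41.1000
fw 26.41.1000
EOF
```

On a live system the same function would be fed with `devlink dev info | fw_pending`.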
  +
  +
====systemd unit mlnx-en.d====
  +
  +
It seems that a systemd unit manages the mlx drivers
  +
  +
<pre>
  +
root@debian-mlnx-01:~# dpkg -S /etc/mlnx-en.conf
  +
mlnx-en-utils: /etc/mlnx-en.conf
  +
  +
root@debian-mlnx-01:~# cat /etc/mlnx-en.conf
  +
# Allow calling the service script with the option 'stop' for unloading the driver stack.
  +
# This flag should be disabled when the OS root file system is on remote storage.
  +
ALLOW_STOP=yes
  +
  +
# Run sysctl performance tuning script
  +
RUN_SYSCTL=no
  +
  +
# Run /usr/sbin/mlnx_tune
  +
RUN_MLNX_TUNE=no
  +
  +
# Load MLX4 modules
  +
MLX4_LOAD=no
  +
  +
# Load MLX5 modules
  +
MLX5_LOAD=yes
  +
  +
root@debian-mlnx-01:~# systemctl start mlnx-en.d
  +
root@debian-mlnx-01:~# systemctl status mlnx-en.d
  +
● mlnx-en.d.service - mlnx-en.d - configure Mellanox devices
  +
Loaded: loaded (/lib/systemd/system/mlnx-en.d.service; enabled; preset: enabled)
  +
Active: active (exited) since Sun 2024-06-30 00:49:29 EEST; 6s ago
  +
Docs: file:/etc/mlnx-en.conf
  +
Process: 1505 ExecStart=/etc/init.d/mlnx-en.d start (code=exited, status=0/SUCCESS)
  +
Main PID: 1505 (code=exited, status=0/SUCCESS)
  +
CPU: 385ms
  +
  +
Jun 30 00:49:26 debian-mlnx-01 systemd[1]: Starting mlnx-en.d.service - mlnx-en.d - configure Mellanox devices...
  +
Jun 30 00:49:28 debian-mlnx-01 mlnx-en.d[1505]: [32B blob data]
  +
Jun 30 00:49:29 debian-mlnx-01 systemd[1]: Finished mlnx-en.d.service - mlnx-en.d - configure Mellanox devices.
  +
</pre>
  +
  +
meanwhile, in the dmesg output
  +
  +
<pre>
  +
# dmesg -T -w
  +
  +
[Sun Jun 30 00:49:26 2024] Compat-mlnx-ofed backport release: 7037b8d
  +
[Sun Jun 30 00:49:26 2024] Backport based on https://:@git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git 7037b8d
  +
[Sun Jun 30 00:49:26 2024] compat.git: https://:@git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git
  +
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: firmware version: 26.32.2004
  +
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: 126.024 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x8 link)
  +
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
  +
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
  +
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: Port module event: module 0, Cable plugged
  +
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: mlx5_pcie_event:304:(pid 1398): PCIe slot advertised sufficient power (75W).
  +
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
  +
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.0 enp1s0f0np0: renamed from eth0
  +
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: firmware version: 26.32.2004
  +
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: 126.024 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x8 link)
  +
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
  +
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
  +
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: Port module event: module 1, Cable plugged
  +
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: mlx5_pcie_event:304:(pid 1391): PCIe slot advertised sufficient power (75W).
  +
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
  +
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1 enp1s0f1np1: renamed from eth0
  +
</pre>
  +
  +
Note that 'systemctl stop mlnx-en.d' unloads the mlx modules from memory
  +
  +
<pre>
  +
root@debian-mlnx-01:~# lsmod | grep mlx
  +
mlx5_core 2269184 0
  +
mlxfw 36864 1 mlx5_core
  +
mlxdevm 180224 1 mlx5_core
  +
mlx_compat 20480 2 mlxdevm,mlx5_core
  +
psample 20480 1 mlx5_core
  +
tls 135168 1 mlx5_core
  +
pci_hyperv_intf 16384 1 mlx5_core
  +
  +
root@debian-mlnx-01:~# systemctl stop mlnx-en.d
  +
root@debian-mlnx-01:~# lsmod | grep mlx
  +
root@debian-mlnx-01:~#
  +
</pre>
  +
  +
====Using devlink====
  +
  +
<pre>
  +
root@debian-mlnx-01:~# devlink dev param show pci/0000:01:00.0 name enable_roce
  +
pci/0000:01:00.0:
  +
name enable_roce type generic
  +
values:
  +
cmode driverinit value true
  +
</pre>
  +
  +
====Monitoring====
  +
  +
To a human, the physical heatsink on the card feels quite hot (too hot to keep a finger on), but this appears to be fine in principle: 'The adapter card incorporates the ConnectX IC, which operates in the range of temperatures between 0°C and 105°C.', https://docs.nvidia.com/networking/display/connectx6lxen/monitoring
  +
  +
<pre>
  +
root@debian-mlnx-01:~# devlink dev
  +
pci/0000:01:00.0
  +
pci/0000:01:00.1
  +
root@debian-mlnx-01:~# mget_temp -d 0000:01:00.0
  +
82
  +
root@debian-mlnx-01:~# mget_temp -d 0000:01:00.1
  +
83
  +
</pre>
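The readings above can be compared against the 105°C upper limit quoted from the NVIDIA documentation. A minimal sketch: the function name `temp_headroom` and its output wording are invented for this illustration, and it assumes the value passed in is a whole number of degrees Celsius, as in the mget_temp output above.

```shell
#!/bin/sh
# Hypothetical helper: compare a temperature reading (degrees Celsius) with a limit.
# The default limit of 105 is the upper bound of the operating range quoted above.
temp_headroom() {
    temp=$1
    limit=${2:-105}
    if [ "$temp" -lt "$limit" ]; then
        echo "ok: ${temp}C, $(( limit - temp ))C below the ${limit}C limit"
    else
        echo "too hot: ${temp}C, limit ${limit}C"
    fi
}

# Readings copied from the mget_temp output above.
temp_headroom 82
temp_headroom 83
```

On a live system the reading could be passed in directly, for example `temp_headroom "$(mget_temp -d 0000:01:00.0)"`.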
  +
  +
Useful additional materials
  +
  +
* https://docs.nvidia.com/networking/display/connectx6lxen/monitoring
   
 
===Misc===
   
 
* https://www.youtube.com/watch?v=XLPgDEbUMgk - 'How to set Mellanox ConnectX VPI to Ethernet or Infiniband in Linux'
  +
  +
===Using the E810===
  +
  +
Useful additional materials
  +
  +
* https://bugzilla.redhat.com/show_bug.cgi?id=2082528
  +
* https://bugzilla.redhat.com/show_bug.cgi?id=1878026
   
 
===Useful additional materials===

Latest revision: 15 July 2024, 01:48

Introduction

Mellanox hardware

Notes

  • Mellanox's higher-end network devices fall into two broad groups: 1. SmartNIC, 2. SuperNIC
  • SmartNIC - nvidia connectx devices
  • SuperNIC - nvidia bluefield devices

ConnectX devices

  • 'connectx-6 lx' and 'connectx-6 dx' devices are all ethernet devices (i.e. not infiniband)
  • the 'connectx-6' device is physically a universal ethernet/infiniband device, i.e. it can be switched in software to run in either ethernet or infiniband mode

Mellanox integrations

Used here

  • Ubuntu v. 24.04 platform
  • PVE v. 8.2.2
  • QEMU v. 9.0.0
  • OVS v. 3.1.0

Ways of using the physical Mellanox card

  • in the PVE web GUI a dual-port adapter can be passed through to a virtual machine as a whole (tick 'All Functions') - selecting one PCIe device then hands both the .0 and .1 devices to the virtual machine; the virtual machine then uses them with the mlx5_core driver etc.; ovs is not involved at all
  • in the PVE web GUI a single physical port of a dual-port adapter can be passed through to a virtual machine (leave 'All Functions' unticked) - selecting one PCIe device then hands over only that device, i.e. either .0 or .1; the virtual machine then uses it with the mlx5_core driver etc.; ovs is not involved at all
  • the usual way via ovs: regular (non-DPDK) ovs is involved
  • with the vhost-user driver; ovs-dpdk is involved
  • with a representor; ovs-dpdk is involved
  • via vdpa; ovs-dpdk is involved

Bringing traffic to the ovs switch with dpdk

Notes

  • the goal is to bring as much traffic as possible from the physical network onto the ovs switch, using dpdk at the physical network card
  • delivering the traffic onward to a virtual machine attached to the ovs switch is not covered here
  • the work is done on Ubuntu 24.04 (because that is probably clearer to start with than working directly on PVE)
  • nothing is compiled, i.e. everything is installed from the standard Ubuntu apt repository

The resulting ovs switch looks like this

root@dpdp-u2404:~# ovs-vsctl show
09d915bd-744b-4ff1-a223-983c02f05f3b
    Bridge br0
        datapath_type: netdev
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0"}
        Port br0
            Interface br0
                type: internal
        Port vlan11
            tag: 11
            Interface vlan11
                type: internal
    ovs_version: "3.3.0"

And the network works like this

root@dpdp-u2404:~# ping -c 4 192.168.1.254
PING 192.168.1.254 (192.168.1.254) 56(84) bytes of data.
64 bytes from 192.168.1.254: icmp_seq=1 ttl=255 time=0.765 ms
64 bytes from 192.168.1.254: icmp_seq=2 ttl=255 time=0.344 ms
64 bytes from 192.168.1.254: icmp_seq=3 ttl=255 time=0.285 ms
64 bytes from 192.168.1.254: icmp_seq=4 ttl=255 time=0.312 ms

--- 192.168.1.254 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3100ms
rtt min/avg/max/mdev = 0.285/0.426/0.765/0.196 ms

Meanwhile nothing can be seen on the regular network interface, because the kernel does not handle these packets in the usual sense

root@dpdp-u2404:~# tcpdump -ni enp15s0f0np0
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp15s0f0np0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel

To reach this state on Ubuntu v. 24.04, install the OVS-DPDK software; the guides are worth reading as background

root@dpdp-u2404:~# apt-get install openvswitch-switch-dpdk

Among others, these packages are installed as dependencies

  • dpdk
  • openvswitch-switch

And the ovs processes are started

root@dpdp-u2404:~# systemctl | grep ovs | grep runni
  ovs-vswitchd.service                loaded active running   Open vSwitch Forwarding Unit
  ovsdb-server.service                loaded active running   Open vSwitch Database Unit

Note that ovs operation is driven by the file shown by 'root@dpdp-u2404:~# less /var/lib/openvswitch/conf.db', i.e. if something goes wrong while configuring ovs, a fresh start is possible by stopping the processes, deleting the files and starting the processes again

root@dpdp-u2404:~# systemctl stop ovs-vswitchd
root@dpdp-u2404:~# systemctl stop ovsdb-server
root@dpdp-u2404:~# rm /var/lib/openvswitch/.conf.db.~lock~
root@dpdp-u2404:~# rm /var/lib/openvswitch/conf.db
root@dpdp-u2404:~# systemctl start ovs-vswitchd

After the software is installed, ovs is running and can be configured further

root@dpdp-u2404:~# echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
root@dpdp-u2404:~# update-alternatives --get-selections
root@dpdp-u2404:~# update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk
root@dpdp-u2404:~# update-alternatives --get-selections
root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-lcore-mask=0x1"
root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-alloc-mem=2048"
root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-extra=--allow=0000:0f:00.0"
root@dpdp-u2404:~# systemctl restart openvswitch-switch

To remove an other_config element, run e.g.

ovs-vsctl remove Open_vSwitch . other_config dpdk-socket-mem

To configure the internals of OVS

root@dpdp-u2404:~# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
root@dpdp-u2404:~# ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0
root@dpdp-u2404:~# ovs-vsctl add-port br0 vlan11 tag=11 -- set interface vlan11 type=internal
root@dpdp-u2404:~# ifconfig vlan11 192.168.1.57/24

Remarks

  • characteristically for this solution, one CPU is 100% loaded ('us' user load in the top output)

Performance evaluation

root@dpdp-u2404:~# timeout 30 hping3 -S -c 400000000 --flood -p 53 192.168.1.254

root@dpdp-u2404:~# vnstat -l vlan11
vlan11  /  traffic statistics

                           rx         |       tx
--------------------------------------+------------------
  bytes                     8.06 KiB  |      456.04 MiB
--------------------------------------+------------------
          max            3.32 kbit/s  |   133.65 Mbit/s
      average            1.83 kbit/s  |   106.27 Mbit/s
          min              944 bit/s  |         0 bit/s
--------------------------------------+------------------
  packets                        138  |         8855498
--------------------------------------+------------------
          max                  7 p/s  |      309383 p/s
      average                  3 p/s  |      245986 p/s
          min                  2 p/s  |           0 p/s
--------------------------------------+------------------
  time                    36 seconds

For tuning, knobs such as these can be used

root@dpdp-u2404:~# ovs-vsctl set Interface dpdk-p0 "options:n_rxq=4"
root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x00f0

As a result

root@dpdp-u2404:~# ovs-vsctl show
09d915bd-744b-4ff1-a223-983c02f05f3b
    Bridge br0
        datapath_type: netdev
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0", n_rxq="2"}
...

and 'top -> 1' shows that 0x00f0 specifies that all processors in the CPU set 4-7 are to be used. As a result the volume of incoming traffic, i.e. traffic that arrives and is received, is several times larger, e.g.

root@dpdp-u2404:~# vnstat -l -i vlan11
Monitoring vlan11...    (press CTRL-C to stop)

      rx:  382.62 Mbit/s 854067 p/s       tx:  198.13 Mbit/s 427007 p/s^C
...
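The relation between the pmd-cpu-mask value 0x00f0 and CPUs 4-7 follows from reading the mask bit by bit, where bit N stands for CPU N. A minimal sketch of that decoding; the function name `mask_to_cpus` is invented for this illustration and is not an ovs command.

```shell
#!/bin/sh
# Hypothetical helper: list the CPU numbers selected by a CPU bit mask.
# Bit 0 is CPU 0, bit 1 is CPU 1, and so on.
mask_to_cpus() {
    mask=$(( $1 ))
    cpu=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            # Append the CPU number, space-separated.
            out="${out:+$out }$cpu"
        fi
        mask=$(( mask >> 1 ))
        cpu=$(( cpu + 1 ))
    done
    echo "$out"
}

mask_to_cpus 0x00f0   # the pmd-cpu-mask used above
mask_to_cpus 0x1      # the dpdk-lcore-mask used above
```

The first call prints 4 5 6 7 and the second prints 0, which matches the single busy CPU seen before tuning and the four busy CPUs afterwards.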

Useful additional materials

Bringing traffic via the ovs switch to a qemu virtual machine with dpdk - vhost-user protocol

Notes

  • vhost-user-client
  • the work is done on PVE v. 8.2.2
  • as a result the virtual machine can be operated naturally from the PVE web GUI etc. (including start and stop), while e.g. its networking works together with dpdk
  • nothing is compiled, i.e. everything is installed from the standard Debian v. 12 and PVE apt repositories
  • it remains open how the networking of the PVE guest should be prepared so that it handles packets delivered via dpdk efficiently (currently it is an ordinary Debian v. 12 machine with virtio networking etc.)

This solution continues and extends the previous section. Preparing the PVE host

  • installing the dpdk software - see the previous Ubuntu v. 24.04 section
  • configuring the ovs dpdk properties
root@pve-moraal-x570:~/20240706# echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
root@pve-moraal-x570:~/20240706# grep -i hugepages /proc/meminfo
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    2048
HugePages_Free:     2048
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
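The size of the reserved pool follows from the two /proc/meminfo fields above: the number of pages times the page size. A minimal sketch of that calculation; the function name `hugepage_pool_mib` is invented for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/meminfo-style text on stdin and print the
# size of the default hugepage pool in MiB (HugePages_Total * Hugepagesize).
hugepage_pool_mib() {
    awk '
        $1 == "HugePages_Total:" { pages = $2 }
        $1 == "Hugepagesize:"    { size_kb = $2 }
        END { print pages * size_kb / 1024 }
    '
}

# Values copied from the grep output above.
hugepage_pool_mib <<'EOF'
HugePages_Total:    2048
HugePages_Free:     2048
Hugepagesize:       2048 kB
EOF
```

For the values above this prints 4096, i.e. a 4 GiB pool of 2 MiB pages. On a live system the input would be `grep -i hugepages /proc/meminfo | hugepage_pool_mib`.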

root@pve-moraal-x570:~/20240706# findmnt | grep huge
│ ├─/dev/hugepages                                      hugetlbfs                  hugetlbfs   rw,relatime,pagesize=2M
│ ├─/run/hugepages/kvm/2048kB                           hugetlbfs                  hugetlbfs   rw,relatime,pagesize=2M
│ └─/run/hugepages/kvm/1048576kB                        hugetlbfs                  hugetlbfs   rw,relatime,pagesize=1024M

root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-lcore-mask=0x1"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-alloc-mem=2048"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-extra=--allow=0000:0f:00.0"
root@pve-moraal-x570:~/20240706# systemctl restart openvswitch-switch

root@pve-moraal-x570:~/20240706# ovs-vsctl add-br vmbr0 -- set bridge vmbr0 datapath_type=netdev
root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0

root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 vlan14 tag=14 -- set interface vlan14 type=internal
root@pve-moraal-x570:~/20240706# ifconfig vlan14 192.168.112.169/24

root@pve-moraal-x570:~/20240706# mkdir /var/run/vhostuserclient
root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuserclient "options:vhost-server-path=/var/run/vhostuserclient/vhost-user-client-1"
root@pve-moraal-x570:~/20240706# ovs-vsctl set port vhost-user-1 tag=10
(this is probably not relevant: root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . other_config:hw-offload=true )

root@pve-moraal-x570:~/20240706# ovs-vsctl set Interface dpdk-p0 "options:n_rxq=4"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x00f0

PVE guest configuration file

root@pve-moraal-x570:~/20240706# cat /etc/pve/qemu-server/123.conf 
agent: 1
bios: ovmf
boot: order=virtio0;ide2;net0
cores: 4
cpu: host
efidisk0: sn_srv_btrfs:123/vm-123-disk-5.raw,efitype=4m,pre-enrolled-keys=1,size=528K
ide2: none,media=cdrom
machine: q35
memory: 1024
name: deb11-tm-tartu-btrfs
# net0: virtio=12:4A:8D:1E:33:3D,bridge=vmbr0,firewall=1,tag=10
numa: 1
onboot: 0
ostype: l26
rng0: source=/dev/urandom
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=0d52990c-3194-4c80-942f-d14621a6e940
sockets: 1
virtio0: sn_srv_btrfs:123/vm-123-disk-0.raw,size=16G
virtio1: sn_srv_btrfs:123/vm-123-disk-1.raw,size=32G
virtio2: sn_srv_btrfs:123/vm-123-disk-2.raw,size=24G
vmgenid: fb558ebc-7d59-4c13-8eae-505349adf9a1
hugepages: 1024
args: -machine q35+pve0,kernel_irqchip=split \
  -device intel-iommu,intremap=on,caching-mode=on \
  -chardev socket,id=char1,path=/var/run/vhostuserclient/vhost-user-client-1,server=on \
  -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce=on,queues=4 \
  -device virtio-net-pci,mac=12:4A:8D:1E:33:3D,netdev=mynet1,mq=on,vectors=10,rx_queue_size=1024,tx_queue_size=256

where

  • no network interface is configured in the PVE web GUI; instead the network interface is built by hand via 'args'
  • 'args' also adds other elements the qemu guest needs (as of 2024 not all of this may be necessary, e.g. 'kernel_irqchip=split' seems to be the default anyway)
  • the chardev/netdev/device set in 'args' assumes that the ovs vhostuserclient preparations described above have been made

Result

  • TODO

Useful additional materials

Bringing traffic via the ovs switch to a qemu virtual machine with dpdk - VF representor

First, verify that the physical Mellanox network card is present and that the eswitch is in legacy mode

root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]

root@pve-moraal-x570:~/20240706# devlink dev eswitch show pci/0000:0f:00.1
pci/0000:0f:00.1: mode legacy inline-mode none encap-mode basic

root@pve-moraal-x570:~/20240706# devlink dev eswitch show pci/0000:0f:00.0
pci/0000:0f:00.0: mode legacy inline-mode none encap-mode basic

Enable two VFs

root@pve-moraal-x570:~# echo 2 > /sys/class/net/enp15s0f0np0/device/sriov_numvfs

root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
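The write to sriov_numvfs above can be wrapped in a small check against the limit the device advertises in sriov_totalvfs. A minimal sketch: the function name `set_numvfs` and its optional sysfs-root argument are assumptions made for this illustration. It resets the count to 0 first, because the kernel refuses to change a non-zero VF count directly.

```shell
#!/bin/sh
# Hypothetical helper: enable a number of VFs on an interface after checking
# the request against sriov_totalvfs. The optional third argument (sysfs root)
# exists only so the function can be exercised against a fake directory tree.
set_numvfs() {
    iface=$1
    want=$2
    root=${3:-/sys}
    dev="$root/class/net/$iface/device"
    if [ ! -r "$dev/sriov_totalvfs" ]; then
        echo "no SR-IOV support visible for $iface" >&2
        return 1
    fi
    total=$(cat "$dev/sriov_totalvfs")
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, device supports at most $total" >&2
        return 1
    fi
    # Disable any existing VFs first, then enable the requested number.
    echo 0 > "$dev/sriov_numvfs"
    echo "$want" > "$dev/sriov_numvfs"
}
```

With this in place the step above would be `set_numvfs enp15s0f0np0 2`, run as root.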

Assign MAC addresses to the VFs

root@pve-moraal-x570:~# ip link set enp15s0f0np0 vf 0 mac e4:11:22:33:46:50
root@pve-moraal-x570:~# ip link set enp15s0f0np0 vf 1 mac e4:11:22:33:46:51

root@pve-moraal-x570:~# ip link show dev enp15s0f0np0
3: enp15s0f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether e8:eb:d3:0b:78:74 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether e4:11:22:33:46:50 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether e4:11:22:33:46:51 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off

Unbind the mlx5_core driver from the VF devices

root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use
	Kernel driver in use: mlx5_core
	Kernel driver in use: mlx5_core
	Kernel driver in use: mlx5_core
	Kernel driver in use: mlx5_core

root@pve-moraal-x570:~# echo 0000:0f:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind 
root@pve-moraal-x570:~# echo 0000:0f:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind 

root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use
	Kernel driver in use: mlx5_core
	Kernel driver in use: mlx5_core

Switch from legacy to switchdev mode

root@pve-moraal-x570:~# devlink dev eswitch set pci/0000:0f:00.0 mode switchdev

Bind the mlx5_core driver to the VF devices again

root@pve-moraal-x570:~# echo 0000:0f:00.2 > /sys/bus/pci/drivers/mlx5_core/bind 
root@pve-moraal-x570:~# echo 0000:0f:00.3 > /sys/bus/pci/drivers/mlx5_core/bind
 
root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use
	Kernel driver in use: mlx5_core
	Kernel driver in use: mlx5_core
	Kernel driver in use: mlx5_core
	Kernel driver in use: mlx5_core

Restart the ovs processes

root@pve-moraal-x570:~# ovs-vsctl show
9ffe33a7-4486-4701-b4c0-880c61f407b7
    ovs_version: "3.1.0"

root@pve-moraal-x570:~# systemctl restart openvswitch-switch

root@pve-moraal-x570:~# ovs-vsctl show
9ffe33a7-4486-4701-b4c0-880c61f407b7
    ovs_version: "3.1.0"

Reserve hugepages

# echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

Apply the general ovs settings

root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . other_config:dpdk-extra="-a 0000:0f:00.0,representor=0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1"
root@pve-moraal-x570:~# systemctl restart openvswitch-switch

Create the ovs bridge and ports

root@pve-moraal-x570:~# ovs-vsctl add-br vmbr0 -- set bridge vmbr0 datapath_type=netdev
root@pve-moraal-x570:~# ovs-vsctl add-port vmbr0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0
root@pve-moraal-x570:~# ovs-vsctl add-port vmbr0 representor -- set Interface representor type=dpdk options:dpdk-devargs=0000:0f:00.0,representor=0
root@pve-moraal-x570:~# ovs-vsctl set port representor tag=10

As a result

root@pve-moraal-x570:~# ovs-vsctl list open_vSwitch
_uuid               : 9ffe33a7-4486-4701-b4c0-880c61f407b7
bridges             : [1d7619bb-7b33-4959-9266-325808d72c13]
cur_cfg             : 7
datapath_types      : [netdev, system]
datapaths           : {}
db_version          : "8.3.1"
dpdk_initialized    : true
dpdk_version        : "DPDK 22.11.5"
external_ids        : {hostname=pve-moraal-x570.sise.moraal.ee, rundir="/var/run/openvswitch", system-id="9e871242-44eb-499f-a049-32f089e65f68"}
iface_types         : [afxdp, afxdp-nonpmd, bareudp, dpdk, dpdkvhostuser, dpdkvhostuserclient, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options     : []
next_cfg            : 7
other_config        : {dpdk-extra="-a 0000:0f:00.0,representor=0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1", dpdk-init="true", hw-offload="true"}
ovs_version         : "3.1.0"
ssl                 : []
statistics          : {}
system_type         : debian
system_version      : "12"

and

root@pve-moraal-x570:~# ovs-vsctl show
9ffe33a7-4486-4701-b4c0-880c61f407b7
    Bridge vmbr0
        datapath_type: netdev
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0"}
        Port vmbr0
            Interface vmbr0
                type: internal
        Port representor
            tag: 10
            Interface representor
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0,representor=0"}
    ovs_version: "3.1.0"

PVE virtual machine configuration; the representor solution works such that the VF device itself is passed on to the virtual machine in the usual way

root@pve-moraal-x570:~# cat /etc/pve/qemu-server/123.conf 
agent: 1
bios: ovmf
boot: order=virtio0;ide2;net0
cores: 4
cpu: host
efidisk0: sn_srv_btrfs:123/vm-123-disk-5.raw,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: 0000:0f:00.2
hugepages: 1024
ide2: none,media=cdrom
machine: q35,viommu=intel
memory: 1024
name: deb11-tm-tartu-btrfs
numa: 1
onboot: 0
ostype: l26
rng0: source=/dev/urandom
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=0d52990c-3194-4c80-942f-d14621a6e940
sockets: 1
vga: virtio
virtio0: sn_srv_btrfs:123/vm-123-disk-0.raw,size=16G
virtio1: sn_srv_btrfs:123/vm-123-disk-1.raw,size=32G
virtio2: sn_srv_btrfs:123/vm-123-disk-2.raw,size=24G
vmgenid: fb558ebc-7d59-4c13-8eae-505349adf9a1

As a result

  • the network works
  • performance is moderate
  • in the virtual machine the regular mlx5_core driver is used for the network device
root@tm-tartu-x570:~# lspci | grep Mella
06:10.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function

On the PVE host the characteristic 100% usage of one CPU can be seen

20240708-mlnx-pmd-01.png

Useful additional materials

Bringing traffic via the ovs switch to a qemu virtual machine with dpdk - vdpa

TODO

Useful additional materials

kernel tls

TODO

inbox driver and utilities

# apt-get install mstflint

To change the SR-IOV setting of the card, run

root@pve-moraal-x570:~# mstconfig -d 0000:0f:00.0 set SRIOV_EN=False

Device #1:
----------

Device type:    ConnectX6LX     
Name:           MCX631102AN-ADA_Ax
Description:    ConnectX-6 Lx EN adapter card; 25GbE ; Dual-port SFP28; PCIe 4.0 x8; No Crypto
Device:         0000:0f:00.0    

Configurations:                              Next Boot       New
         SRIOV_EN                            True(1)         False(0)        

 Apply new Configuration? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.

As a result the VF capability is gone

root@pve-moraal-x570:~# ls -ld /sys/class/net/enp15s0f0np0/device/s*
lrwxrwxrwx 1 root root    0 Jun 30 03:32 /sys/class/net/enp15s0f0np0/device/subsystem -> ../../../../bus/pci
-r--r--r-- 1 root root 4096 Jun 30 03:35 /sys/class/net/enp15s0f0np0/device/subsystem_device
-r--r--r-- 1 root root 4096 Jun 30 03:35 /sys/class/net/enp15s0f0np0/device/subsystem_vendor

After switching it back on, the VF capability returns

root@pve-moraal-x570:~# mstconfig -d 0000:0f:00.0 set SRIOV_EN=True

root@pve-moraal-x570:~# ls -ld /sys/class/net/enp15s0f0np0/device/s*
-rw-r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_drivers_autoprobe
-rw-r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_numvfs
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_offset
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_stride
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_totalvfs
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_vf_device
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_vf_total_msix
lrwxrwxrwx 1 root root    0 Jun 30 03:39 /sys/class/net/enp15s0f0np0/device/subsystem -> ../../../../bus/pci
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/subsystem_device
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/subsystem_vendor

root@pve-moraal-x570:~# cat /sys/class/net/enp15s0f0np0/device/sriov_totalvfs
8

Secure Boot must be disabled to make these changes.

The mstlink utility
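The presence of the sriov_* files can also be checked from a script. The following is a minimal sketch, assuming the sysfs layout shown in the listings above. The function name sriov_status and the SYSFS_ROOT override (used only so the function can be tried without the card) are hypothetical and not part of any tool.

```shell
#!/bin/sh
# Report whether a network interface exposes SR-IOV through sysfs.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
sriov_status() {
    dev_dir="${SYSFS_ROOT:-/sys}/class/net/$1/device"
    if [ -r "$dev_dir/sriov_totalvfs" ]; then
        total=$(cat "$dev_dir/sriov_totalvfs")
        active=$(cat "$dev_dir/sriov_numvfs")
        echo "sriov: total=$total active=$active"
    else
        echo "sriov: not available"
    fi
}

# Example on the host from this page:
# sriov_status enp15s0f0np0
```

With SRIOV_EN=False the function is expected to print "sriov: not available", because the sriov_* files are missing.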

root@pve-moraal-x570:~# mstlink -d 0000:0f:00.0 --show_device

Operational Info
----------------
State                           : Polling
Physical state                  : ETH_AN_FSM_ENABLE
Speed                           : N/A
Width                           : N/A
FEC                             : N/A
Loopback Mode                   : No Loopback
Auto Negotiation                : FORCE - 25G,10G,1G

Supported Info
--------------
Enabled Link Speed (Ext.)       : 0x00000052 (25G,10G,1G)
Supported Cable Speed (Ext.)    : 0x00000003 (1G,100M)

Troubleshooting Info
--------------------
Status Opcode                   : 36
Group Opcode                    : PHY FW
Recommendation                  : Force Mode no partner detected.

Tool Information
----------------
Firmware Version                : 26.41.1000
amBER Version                   : 2.05
MSTFLINT Version                : mstflint 4.21.0

Device Info
-----------
Part Number                     : N/A
Part Name                       : N/A
Serial Number                   : N/A
Revision                        : N/A
FW Version                      : 26.41.1000
 
Note: P/N, Product Name, S/N and Revision are supported only in switches

Using Virtual Functions; the starting point is the default state

root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]

With 4 virtual functions enabled, lspci shows

root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.4 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.5 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
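The listing above does not include the command that created the VFs. A common way, assumed here rather than taken from this page, is to write the desired count to the PF's sriov_numvfs file in sysfs. A minimal sketch follows; set_numvfs and the SYSFS_ROOT override are hypothetical names.

```shell
#!/bin/sh
# Set the number of VFs on a PF through sysfs.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
set_numvfs() {
    pf=$1
    want=$2
    dev_dir="${SYSFS_ROOT:-/sys}/class/net/$pf/device"
    total=$(cat "$dev_dir/sriov_totalvfs") || return 1
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, device supports at most $total" >&2
        return 1
    fi
    # The kernel refuses to change a non-zero VF count directly, so reset to 0 first.
    echo 0 > "$dev_dir/sriov_numvfs" || return 1
    echo "$want" > "$dev_dir/sriov_numvfs"
}

# Example on the host from this page (run as root):
# set_numvfs enp15s0f0np0 4
```

A count set this way does not survive a reboot, so it has to be reapplied at boot if the VFs are needed permanently.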

The following devices are created

root@pve-moraal-x570:~# ip link show dev enp15s0f0np0
50: enp15s0f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether e8:eb:d3:0b:78:74 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 2     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 3     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
root@pve-moraal-x570:~# ip link show dev enp15s0f0v0
56: enp15s0f0v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 3a:48:d1:d6:57:dd brd ff:ff:ff:ff:ff:ff
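In the output above the PF reports every VF MAC address as 00:00:00:00:00:00. If fixed addresses are wanted, they can be assigned from the PF with "ip link set dev ... vf ... mac ...". The sketch below only prints the commands and does not change anything; vf_mac_cmds and the 02:00:00:00:01:xx address range are hypothetical choices made for illustration.

```shell
#!/bin/sh
# Print the ip(8) commands that would give each VF a fixed MAC address.
# The 02:00:00:00:01:xx prefix is a hypothetical, locally administered range.
vf_mac_cmds() {
    pf=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'ip link set dev %s vf %d mac 02:00:00:00:01:%02x\n' "$pf" "$i" "$i"
        i=$((i + 1))
    done
}

# Review the output first; to apply it, pipe it to sh as root:
# vf_mac_cmds enp15s0f0np0 4
```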

A VF device can be used on the host practically like an ordinary network device. Another option is to pass it on to a PVE virtual machine using PCIe passthrough. In the virtual machine the card looks like this

root@pve-sdn-01:~# lspci | grep Mell
01:00.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
root@pve-sdn-01:~# ethtool -i enp1s0
driver: mlx5_core
version: 6.8.8-1-pve
firmware-version: 26.41.1000 (MT_0000000531)
expansion-rom-version: 
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

where

  • the virtual machine can use the same mlx5 driver that is used on the PVE host

Using DPDK

Install the dpdk and dpdk-dev packages

# apt-get install dpdk dpdk-dev

As a result, the system contains, among others, these utilities

  • dpdk-devbind.py
  • dpdk-hugepages.py
  • dpdk-testpmd

Assessing the situation, see https://doc.dpdk.org/guides/nics/mlx5.html -> 'Usage example'

root@pve-moraal-x570:~# ls -d /sys/class/net/*/device/infiniband_verbs/uverbs* | cut -d / -f 5
enp15s0f0np0
enp15s0f1np1

Starting testpmd; it is not entirely clear whether this counts as a successful start

root@pve-moraal-x570:~# dpdk-testpmd -l 8-15 -n 4 -a 0f:00.0 -a 0f:00.1 -- --rxq=2 --txq=2 -i
EAL: Detected CPU lcores: 24
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: Probe PCI driver: mlx5_pci (15b3:101f) device: 0000:0f:00.0 (socket -1)
EAL: Probe PCI driver: mlx5_pci (15b3:101f) device: 0000:0f:00.1 (socket -1)
Interactive-mode selected
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=203456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: E8:EB:D3:0B:78:74
Configuring Port 1 (socket 0)
Port 1: E8:EB:D3:0B:78:75
Checking link statuses...
Done
testpmd> 

For example

root@pve-moraal-x570:~# dpdk-testpmd -l 6-9 -n 4 -a 0f:00.0 -- --rxq=4 --txq=4 -i
testpmd> set fwd txonly

testpmd> show port stats all

  ######################## NIC statistics for port 0  ########################
  RX-packets: 0          RX-missed: 0          RX-bytes:  0
  RX-errors: 0
  RX-nombuf:  0         
  TX-packets: 5153664    TX-errors: 0          TX-bytes:  329834496

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:      1420582          Tx-bps:    727338208
  ############################################################################

and capturing traffic on the second port shows

root@pve-moraal-x570:~# tcpdump -c 4 -nei enp15s0f1np1
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp15s0f1np1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
09:07:53.700765 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
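As a cross-check, the counters printed by "show port stats all" can be turned into an average packet size, which should agree with the frame length reported by tcpdump. A minimal sketch; avg_tx_packet_size is a hypothetical helper, not part of DPDK.

```shell
#!/bin/sh
# Compute the average transmitted packet size from "show port stats" text on stdin.
avg_tx_packet_size() {
    awk '
        {
            # Each counter name is followed by its value in the next field.
            for (i = 1; i < NF; i++) {
                if ($i == "TX-packets:") packets = $(i + 1)
                if ($i == "TX-bytes:")   bytes   = $(i + 1)
            }
        }
        END {
            if (packets > 0) printf "%d\n", bytes / packets
            else print "no TX packets"
        }
    '
}

# Counters copied from the testpmd output above.
printf '  TX-packets: 5153664    TX-errors: 0          TX-bytes:  329834496\n' | avg_tx_packet_size
```

For the counters above this gives 64, which matches the "length 64" frames in the capture.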

Useful additional materials

Misc

# lspci | grep 3d:00
3d:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
3d:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]

devlink dev show output

# devlink dev show
pci/0000:3d:00.0
pci/0000:3d:00.1

and devlink dev info output

# devlink dev info
pci/0000:3d:00.0:
  driver mlx5_core
  versions:
      fixed:
        fw.psid SM_1281000001000
      running:
        fw.version 26.35.2000
        fw 26.35.2000
      stored:
        fw.version 26.35.2000
        fw 26.35.2000
pci/0000:3d:00.1:
  driver mlx5_core
  versions:
      fixed:
        fw.psid SM_1281000001000
      running:
        fw.version 26.35.2000
        fw 26.35.2000
      stored:
        fw.version 26.35.2000
        fw 26.35.2000

ethtool output

# ethtool -i ens7f0np0
driver: mlx5_core
version: 5.15.0-92-generic
firmware-version: 26.35.2000 (SM_1281000001000)
expansion-rom-version: 
bus-info: 0000:3d:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

where

  • the firmware version of the card reported by ethtool and by devlink dev info is the same - 26.35.2000
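The comparison above can be scripted. The following is a minimal sketch that assumes the output formats shown in the two listings; fw_from_ethtool and fw_from_devlink are hypothetical helper names, and both read the tool output from stdin.

```shell
#!/bin/sh
# Extract the firmware version from "ethtool -i" text on stdin.
fw_from_ethtool() {
    awk '$1 == "firmware-version:" { print $2; exit }'
}

# Extract the first running fw.version from "devlink dev info" text on stdin.
fw_from_devlink() {
    awk '$1 == "running:" { in_running = 1; next }
         $1 == "stored:"  { in_running = 0 }
         in_running && $1 == "fw.version" { print $2; exit }'
}

# On the host from this page:
# a=$(ethtool -i ens7f0np0 | fw_from_ethtool)
# b=$(devlink dev info pci/0000:3d:00.0 | fw_from_devlink)
# [ "$a" = "$b" ] && echo "firmware versions match: $a"
```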

Using the vendor's MLNX EN software

Terms

  • MFT - NVIDIA Firmware Tools, probably originally Mellanox Firmware Tools

Using the whole physical device in passthrough mode

Claims

  • in general, in a Proxmox v. 8 environment a Mellanox device can be handed over to a virtual machine in the usual way (in the PVE web GUI choose 'Add -> PCI device' and select the first MLNX device; the second one is added automatically)
  • it seems that a physical Mellanox device cannot be passed to a virtual machine as a whole in a fully complete way; for one thing, the SR-IOV capability is missing
  • in the virtual machine the network device can be used through its PF; VFs are not accessible
  • a vIOMMU can be added to the virtual machine in the usual PVE web GUI, after which the devices of the virtual machine, including the different physical ports of the MLNX adapter, are placed in different IOMMU groups

It is characteristic of this arrangement that on the host the vfio driver keeps handling the device

root@pve-moraal-x570:~# lspci -vvv |  grep vfio
	Kernel driver in use: vfio-pci
	Kernel driver in use: vfio-pci

To get the device back on the host under the mlx driver without rebooting, it is enough to

  • stop the virtual machine
  • run on the host
echo 1 > /sys/bus/pci/devices/0000\:0f\:00.0/remove
echo 1 > /sys/bus/pci/devices/0000\:0f\:00.1/remove
echo 1 > /sys/bus/pci/rescan
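Whether the remove/rescan sequence had the intended effect can be read from sysfs, where each PCI device has a "driver" symlink pointing at the bound driver. A minimal sketch; pci_driver and the SYSFS_ROOT override are hypothetical names.

```shell
#!/bin/sh
# Print the kernel driver currently bound to a PCI device, e.g. vfio-pci or mlx5_core.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
pci_driver() {
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# After the remove/rescan sequence both ports are expected to report mlx5_core:
# pci_driver 0000:0f:00.0
# pci_driver 0000:0f:00.1
```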

Installing the software

TODO

root@debian-mlnx-01:~# mount /root/mlnx-en-24.04-0.6.6.0-debian12.1-x86_64.iso /mnt/mlnx
root@debian-mlnx-01:~# find /lib/modules/6.1.0-22-amd64/ -type f -mmin -20 -ls | grep dkms
   661960   4312 -rw-r--r--   1 root     root      4415205 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx5_core.ko
   661959     28 -rw-r--r--   1 root     root        25237 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx_compat.ko
   661961      8 -rw-r--r--   1 root     root         5565 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx5_ib.ko
   661962     52 -rw-r--r--   1 root     root        49445 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlxfw.ko
   661963    208 -rw-r--r--   1 root     root       210797 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlxdevm.ko

Updating the firmware

root@debian-mlnx-01:~# apt-get install mlnx-fw-updater
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following package was automatically installed and is no longer required:
  linux-image-6.1.0-15-amd64
Use 'apt autoremove' to remove it.
The following NEW packages will be installed:
  mlnx-fw-updater
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/50.2 MB of archives.
After this operation, 87.9 MB of additional disk space will be used.
Get:1 file:/mnt/mlnx/DEBS_ETH ./ mlnx-fw-updater 24.04-0.6.6.0 [50.2 MB]
Selecting previously unselected package mlnx-fw-updater.
(Reading database ... 63847 files and directories currently installed.)
Preparing to unpack .../mlnx-fw-updater_24.04-0.6.6.0_amd64.deb ...
Unpacking mlnx-fw-updater (24.04-0.6.6.0) ...
Setting up mlnx-fw-updater (24.04-0.6.6.0) ...
Initializing...
Attempting to perform Firmware update...
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX6LX
  Part Number:      MCX631102AN-ADA_Ax
  Description:      ConnectX-6 Lx EN adapter card; 25GbE ; Dual-port SFP28; PCIe 4.0 x8; No Crypto
  PSID:             MT_0000000531
  PCI Device Name:  01:00.0
  Base GUID:        e8ebd303000b7874
  Base MAC:         e8ebd30b7874
  Versions:         Current        Available     
     FW             26.32.2004     26.41.1000    
     PXE            3.6.0502       3.7.0400      
     UEFI           14.25.0018     14.34.0012    

  Status:           Update required

---------
Found 1 device(s) requiring firmware update...

Device #1: Updating FW ...     
FSMST_INITIALIZE -   OK          
Writing Boot image component -   OK          
Done

Restart needed for updates to take effect.
Log File: /tmp/oaFVUkaJsl
Real log file: /tmp/mlnx_fw_update.log

root@debian-mlnx-01:~# less /tmp/mlnx_fw_update.log
CMD: mlxup -u --log-on-update --ssl-certificate /tmp/OloIGrYWuz/mlxfwmanager_sriov_dis_x86_64_4127-dir/ca-bundle.crt --current-dir /opt/mellanox/mlnx-fw-updater/  -L /tmp/oaFVUkaJsl -y -d 01:00.0 
Querying Mellanox devices firmware ...

Device #1:
----------

...

It appears that as a result the card holds two firmware versions

root@debian-mlnx-01:~# devlink dev info
pci/0000:01:00.0:
  driver mlx5_core
  versions:
      fixed:
        fw.psid MT_0000000531
      running:
        fw.version 26.32.2004
        fw 26.32.2004
      stored:
        fw.version 26.41.1000
        fw 26.41.1000
pci/0000:01:00.1:
  driver mlx5_core
  versions:
      fixed:
        fw.psid MT_0000000531
      running:
        fw.version 26.32.2004
        fw 26.32.2004
      stored:
        fw.version 26.41.1000
        fw 26.41.1000

where

  • running version - 26.32.2004
  • stored version - 26.41.1000
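A difference between the running and the stored version, as seen above, can be detected automatically. The following is a minimal sketch that assumes the "devlink dev info" output format shown above; fw_pending is a hypothetical helper name.

```shell
#!/bin/sh
# Read "devlink dev info" text on stdin and list the devices whose running
# and stored firmware versions differ.
fw_pending() {
    awk '
        /^pci\// { dev = $1; sub(/:$/, "", dev); section = "" }
        $1 == "fixed:"   { section = "fixed" }
        $1 == "running:" { section = "running" }
        $1 == "stored:"  { section = "stored" }
        $1 == "fw.version" && section == "running" { running[dev] = $2 }
        $1 == "fw.version" && section == "stored" {
            if (running[dev] != $2)
                printf "%s: running %s, stored %s\n", dev, running[dev], $2
        }
    '
}

# On the machine from this page:
# devlink dev info | fw_pending
```

No output means that running and stored versions are the same on every device.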

systemd unit mlnx-en.d

It seems that a systemd unit takes care of the mlx drivers

root@debian-mlnx-01:~# dpkg -S /etc/mlnx-en.conf 
mlnx-en-utils: /etc/mlnx-en.conf

root@debian-mlnx-01:~# cat /etc/mlnx-en.conf 
# Allow calling the service script with the option 'stop' for unloading the driver stack.
# This flag should be disabled when the OS root file system is on remote storage.
ALLOW_STOP=yes

# Run sysctl performance tuning script
RUN_SYSCTL=no

# Run /usr/sbin/mlnx_tune
RUN_MLNX_TUNE=no

# Load MLX4 modules
MLX4_LOAD=no

# Load MLX5 modules
MLX5_LOAD=yes

root@debian-mlnx-01:~# systemctl start mlnx-en.d
root@debian-mlnx-01:~# systemctl status mlnx-en.d
● mlnx-en.d.service - mlnx-en.d - configure Mellanox devices
     Loaded: loaded (/lib/systemd/system/mlnx-en.d.service; enabled; preset: enabled)
     Active: active (exited) since Sun 2024-06-30 00:49:29 EEST; 6s ago
       Docs: file:/etc/mlnx-en.conf
    Process: 1505 ExecStart=/etc/init.d/mlnx-en.d start (code=exited, status=0/SUCCESS)
   Main PID: 1505 (code=exited, status=0/SUCCESS)
        CPU: 385ms

Jun 30 00:49:26 debian-mlnx-01 systemd[1]: Starting mlnx-en.d.service - mlnx-en.d - configure Mellanox devices...
Jun 30 00:49:28 debian-mlnx-01 mlnx-en.d[1505]: [32B blob data]
Jun 30 00:49:29 debian-mlnx-01 systemd[1]: Finished mlnx-en.d.service - mlnx-en.d - configure Mellanox devices.

at the same time, the dmesg output shows

# dmesg -T -w

[Sun Jun 30 00:49:26 2024] Compat-mlnx-ofed backport release: 7037b8d
[Sun Jun 30 00:49:26 2024] Backport based on https://:@git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git 7037b8d
[Sun Jun 30 00:49:26 2024] compat.git: https://:@git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: firmware version: 26.32.2004
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: 126.024 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x8 link)
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: Port module event: module 0, Cable plugged
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: mlx5_pcie_event:304:(pid 1398): PCIe slot advertised sufficient power (75W).
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.0 enp1s0f0np0: renamed from eth0
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: firmware version: 26.32.2004
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: 126.024 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x8 link)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: Port module event: module 1, Cable plugged
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: mlx5_pcie_event:304:(pid 1391): PCIe slot advertised sufficient power (75W).
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1 enp1s0f1np1: renamed from eth0

Note that 'systemctl stop mlnx-en.d' unloads the mlx modules from memory

root@debian-mlnx-01:~# lsmod | grep mlx
mlx5_core            2269184  0
mlxfw                  36864  1 mlx5_core
mlxdevm               180224  1 mlx5_core
mlx_compat             20480  2 mlxdevm,mlx5_core
psample                20480  1 mlx5_core
tls                   135168  1 mlx5_core
pci_hyperv_intf        16384  1 mlx5_core

root@debian-mlnx-01:~# systemctl stop mlnx-en.d
root@debian-mlnx-01:~# lsmod | grep mlx
root@debian-mlnx-01:~#

Using devlink

root@debian-mlnx-01:~# devlink dev param show pci/0000:01:00.0 name enable_roce
pci/0000:01:00.0:
  name enable_roce type generic
    values:
      cmode driverinit value true

Monitoring

The physical heatsink on the card feels quite hot to the touch (too hot to keep a finger on it), but this appears to be acceptable: 'The adapter card incorporates the ConnectX IC, which operates in the range of temperatures between 0°C and 105°C.', https://docs.nvidia.com/networking/display/connectx6lxen/monitoring

root@debian-mlnx-01:~# devlink dev
pci/0000:01:00.0
pci/0000:01:00.1
root@debian-mlnx-01:~# mget_temp -d 0000:01:00.0
82             
root@debian-mlnx-01:~# mget_temp -d 0000:01:00.1
83       
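The readings can be compared with the 105°C upper limit quoted above. A minimal sketch; temp_check is a hypothetical helper, and the 10-degree warning margin is an arbitrary choice that does not come from the vendor documentation.

```shell
#!/bin/sh
# Compare a temperature reading (degrees Celsius) with the 105 C upper limit quoted above.
# The 10-degree warning margin is an arbitrary, hypothetical choice.
temp_check() {
    # mget_temp pads its output with spaces, so strip all whitespace first.
    temp=$(printf '%s' "$1" | tr -d '[:space:]')
    limit=105
    margin=10
    if [ "$temp" -ge "$limit" ]; then
        echo "critical: ${temp}C"
    elif [ "$temp" -ge $((limit - margin)) ]; then
        echo "warning: ${temp}C"
    else
        echo "ok: ${temp}C, headroom $((limit - temp))C"
    fi
}

# On the machine from this page:
# temp_check "$(mget_temp -d 0000:01:00.0)"
```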

Useful additional materials

Misc

Using E810

Useful additional materials

Useful additional materials