Mellanox ConnectX-6 Lx EN: difference between revisions
m (Imre moved page Mellanox ConnectX-6 Lx to Mellanox ConnectX-6 Lx EN) |
|||
(68 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
===Introduction=== |
===Introduction=== |
||
+ | |||
+ | ===Mellanox hardware=== |
||
+ | |||
+ | Claims |
||
+ | |||
+ | * Mellanox's higher-end network devices fall into two main groups: 1. SmartNIC, 2. SuperNIC |
||
+ | * SmartNIC - NVIDIA ConnectX devices |
||
+ | * SuperNIC - NVIDIA BlueField devices |
||
+ | |||
+ | ConnectX devices |
||
+ | |||
+ | * 'connectx-6 lx' and 'connectx-6 dx' devices are all Ethernet devices (i.e. not InfiniBand) |
||
+ | * the 'connectx-6' device is physically a universal Ethernet/InfiniBand device, i.e. the card can be switched in software to run in either Ethernet or InfiniBand mode |
||
+ | |||
+ | ===Mellanox integrations=== |
||
+ | |||
+ | Software used |
||
+ | |||
+ | * Ubuntu v. 24.04 as the platform |
||
+ | * PVE v. 8.2.2 |
||
+ | * QEMU v. 9.0.0 |
||
+ | * OVS v. 3.1.0 |
||
+ | |||
+ | Ways of using the physical Mellanox card |
||
+ | |||
+ | * the whole dual-port adapter can be handed to a virtual machine as PCIe pass-through from the PVE web GUI (tick 'All Functions'): selecting one PCIe device passes both functions, .0 and .1, to the virtual machine; the guest then drives them with the mlx5_core driver etc.; OVS is not involved at all |
||
+ | * a single physical port of the dual-port adapter can be handed to a virtual machine as PCIe pass-through from the PVE web GUI (leave 'All Functions' unticked): selecting one PCIe device passes only that device, i.e. either .0 or .1, to the virtual machine; the guest then drives it with the mlx5_core driver etc.; OVS is not involved at all |
||
+ | * the ordinary way through OVS: regular (non-DPDK) OVS is involved |
||
+ | * with the vhost-user driver; ovs-dpdk is involved |
||
+ | * with a representor; ovs-dpdk is involved |
||
+ | * with vdpa; ovs-dpdk is involved |
||
+ | |||
+ | ====Bringing traffic to the OVS switch with DPDK==== |
||
+ | |||
+ | Claims |
||
+ | |||
+ | * the goal is to bring as much traffic as possible from the physical network onto the OVS switch, using DPDK on the physical network card |
||
+ | * delivering the traffic onward to a virtual machine attached to the OVS switch is out of scope here |
||
+ | * the work is done on Ubuntu 24.04 (to begin with this is probably clearer than starting directly on PVE) |
||
+ | * nothing is compiled, i.e. everything is installed from the standard Ubuntu apt repository |
||
+ | |||
+ | The resulting OVS switch looks like this |
||
+ | |||
+ | <pre> |
||
+ | root@dpdp-u2404:~# ovs-vsctl show |
||
+ | 09d915bd-744b-4ff1-a223-983c02f05f3b |
||
+ | Bridge br0 |
||
+ | datapath_type: netdev |
||
+ | Port dpdk-p0 |
||
+ | Interface dpdk-p0 |
||
+ | type: dpdk |
||
+ | options: {dpdk-devargs="0000:0f:00.0"} |
||
+ | Port br0 |
||
+ | Interface br0 |
||
+ | type: internal |
||
+ | Port vlan11 |
||
+ | tag: 11 |
||
+ | Interface vlan11 |
||
+ | type: internal |
||
+ | ovs_version: "3.3.0" |
||
+ | </pre> |
||
+ | |||
+ | And the network works like this |
||
+ | |||
+ | <pre> |
||
+ | root@dpdp-u2404:~# ping -c 4 192.168.1.254 |
||
+ | PING 192.168.1.254 (192.168.1.254) 56(84) bytes of data. |
||
+ | 64 bytes from 192.168.1.254: icmp_seq=1 ttl=255 time=0.765 ms |
||
+ | 64 bytes from 192.168.1.254: icmp_seq=2 ttl=255 time=0.344 ms |
||
+ | 64 bytes from 192.168.1.254: icmp_seq=3 ttl=255 time=0.285 ms |
||
+ | 64 bytes from 192.168.1.254: icmp_seq=4 ttl=255 time=0.312 ms |
||
+ | |||
+ | --- 192.168.1.254 ping statistics --- |
||
+ | 4 packets transmitted, 4 received, 0% packet loss, time 3100ms |
||
+ | rtt min/avg/max/mdev = 0.285/0.426/0.765/0.196 ms |
||
+ | </pre> |
||
+ | |||
+ | At the same time nothing is visible on the ordinary network interface, because the kernel does not handle these packets in the usual way |
||
+ | |||
+ | <pre> |
||
+ | root@dpdp-u2404:~# tcpdump -ni enp15s0f0np0 |
||
+ | libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'. |
||
+ | tcpdump: verbose output suppressed, use -v[v]... for full protocol decode |
||
+ | listening on enp15s0f0np0, link-type EN10MB (Ethernet), snapshot length 262144 bytes |
||
+ | ^C |
||
+ | 0 packets captured |
||
+ | 0 packets received by filter |
||
+ | 0 packets dropped by kernel |
||
+ | </pre> |
||
+ | |||
+ | To reach this state on Ubuntu v. 24.04 the OVS-DPDK software is installed; the following guides are useful background |
||
+ | |||
+ | * 'How to use DPDK with Open vSwitch' - https://ubuntu.com/server/docs/how-to-use-dpdk-with-open-vswitch |
||
+ | * 'OVS Offload Using ASAP² Direct' - https://docs.nvidia.com/networking/display/mlnxofedv590590/ovs+offload+using+asap%C2%B2+direct#src-2408744435_safe-id-T1ZTT2ZmbG9hZFVzaW5nQVNBUMKyRGlyZWN0LU9WUy1LZXJuZWxIYXJkd2FyZU9mZmxvYWRz |
||
+ | * 'Using Open vSwitch with DPDK' - https://docs.openvswitch.org/en/latest/howto/dpdk/ |
||
+ | |||
+ | root@dpdp-u2404:~# apt-get install openvswitch-switch-dpdk |
||
+ | |||
+ | Among others, these packages are installed as dependencies |
||
+ | |||
+ | * dpdk |
||
+ | * openvswitch-switch |
||
+ | |||
+ | And the OVS processes are started |
||
+ | |||
+ | <pre> |
||
+ | root@dpdp-u2404:~# systemctl | grep ovs | grep runni |
||
+ | ovs-vswitchd.service loaded active running Open vSwitch Forwarding Unit |
||
+ | ovsdb-server.service loaded active running Open vSwitch Database Unit |
||
+ | </pre> |
||
+ | |||
+ | OVS is driven by the file /var/lib/openvswitch/conf.db (it can be inspected with e.g. 'root@dpdp-u2404:~# less /var/lib/openvswitch/conf.db'), so if something goes wrong while configuring OVS, a clean start is possible: stop the processes, delete the files and start the processes again |
||
+ | |||
+ | root@dpdp-u2404:~# systemctl stop ovs-vswitchd |
||
+ | root@dpdp-u2404:~# systemctl stop ovsdb-server |
||
+ | root@dpdp-u2404:~# rm /var/lib/openvswitch/.conf.db.~lock~ |
||
+ | root@dpdp-u2404:~# rm /var/lib/openvswitch/conf.db |
||
+ | root@dpdp-u2404:~# systemctl start ovsdb-server |
||
+ | root@dpdp-u2404:~# systemctl start ovs-vswitchd |
||
+ | |||
+ | After the software is installed OVS is running and can be configured further |
||
+ | |||
+ | root@dpdp-u2404:~# echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages |
||
+ | root@dpdp-u2404:~# update-alternatives --get-selections |
||
+ | root@dpdp-u2404:~# update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk |
||
+ | root@dpdp-u2404:~# update-alternatives --get-selections |
||
+ | root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true" |
||
+ | root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-lcore-mask=0x1" |
||
+ | root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-alloc-mem=2048" |
||
+ | root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-extra=--allow=0000:0f:00.0" |
||
+ | root@dpdp-u2404:~# systemctl restart openvswitch-switch |
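The hugepage reservation and the 'dpdk-alloc-mem' value above are related by simple arithmetic: 1024 pages of 2048 kB give 2048 MB. A minimal sketch of that calculation, assuming a hypothetical helper name `hugepages_needed` (not part of OVS or DPDK):

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of hugepages needed to back a given
# amount of memory (in MB) with a given hugepage size (in kB).
hugepages_needed() {
    local mem_mb=$1
    local page_kb=${2:-2048}
    # Round up so the reservation is never smaller than requested.
    echo $(( (mem_mb * 1024 + page_kb - 1) / page_kb ))
}

# 2048 MB backed by 2048 kB pages, as in the commands above.
hugepages_needed 2048 2048   # prints 1024
```

The value printed is what would be written to nr_hugepages; other hugepage users on the same host (for example virtual machines) would need pages on top of that.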
||
+ | |||
+ | To remove an other_config element, use e.g. |
||
+ | |||
+ | ovs-vsctl remove Open_vSwitch . other_config dpdk-socket-mem |
||
+ | |||
+ | The configuration inside OVS is done as follows |
||
+ | |||
+ | root@dpdp-u2404:~# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev |
||
+ | root@dpdp-u2404:~# ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0 |
||
+ | root@dpdp-u2404:~# ovs-vsctl add-port br0 vlan11 tag=11 -- set interface vlan11 type=internal |
||
+ | root@dpdp-u2404:~# ifconfig vlan11 192.168.1.57/24 |
||
+ | |||
+ | Notes |
||
+ | |||
+ | * it is characteristic of this solution that one CPU is 100% loaded ('us', user load, in the top output), because the DPDK poll-mode driver (PMD) thread busy-polls the NIC |
||
+ | |||
+ | Performance evaluation |
||
+ | |||
+ | <pre> |
||
+ | root@dpdp-u2404:~# timeout 30 hping3 -S -c 400000000 --flood -p 53 192.168.1.254 |
||
+ | |||
+ | root@dpdp-u2404:~# vnstat -l vlan11 |
||
+ | vlan11 / traffic statistics |
||
+ | |||
+ | rx | tx |
||
+ | --------------------------------------+------------------ |
||
+ | bytes 8.06 KiB | 456.04 MiB |
||
+ | --------------------------------------+------------------ |
||
+ | max 3.32 kbit/s | 133.65 Mbit/s |
||
+ | average 1.83 kbit/s | 106.27 Mbit/s |
||
+ | min 944 bit/s | 0 bit/s |
||
+ | --------------------------------------+------------------ |
||
+ | packets 138 | 8855498 |
||
+ | --------------------------------------+------------------ |
||
+ | max 7 p/s | 309383 p/s |
||
+ | average 3 p/s | 245986 p/s |
||
+ | min 2 p/s | 0 p/s |
||
+ | --------------------------------------+------------------ |
||
+ | time 36 seconds |
||
+ | </pre> |
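The tx figures in the vnstat output above can be cross-checked with a little arithmetic. A minimal sketch, assuming hypothetical helper names `avg_packet_bytes` and `mbit_per_s`, and assuming vnstat reports MiB as 1048576 bytes and Mbit as 10^6 bits:

```shell
#!/usr/bin/env bash
# Hypothetical helpers for sanity-checking vnstat figures.

# $1 = transferred volume in MiB, $2 = packet count
avg_packet_bytes() {
    awk -v mib="$1" -v pkts="$2" 'BEGIN { printf "%.1f\n", mib * 1048576 / pkts }'
}

# $1 = packets per second, $2 = bytes per packet
mbit_per_s() {
    awk -v pps="$1" -v bytes="$2" 'BEGIN { printf "%.2f\n", pps * bytes * 8 / 1000000 }'
}

avg_packet_bytes 456.04 8855498   # prints 54.0
mbit_per_s 245986 54              # prints 106.27
```

About 54 bytes per packet is consistent with a bare TCP SYN frame (14 bytes Ethernet + 20 bytes IPv4 + 20 bytes TCP), which is what 'hping3 -S' sends, and the average packet rate multiplied by that size reproduces the reported average bit rate.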
||
+ | |||
+ | For tuning, knobs such as these can be used |
||
+ | |||
+ | root@dpdp-u2404:~# ovs-vsctl set Interface dpdk-p0 "options:n_rxq=4" |
||
+ | root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x00f0 |
||
+ | |||
+ | Result (this capture was taken with n_rxq=2) |
||
+ | |||
+ | <pre> |
||
+ | root@dpdp-u2404:~# ovs-vsctl show |
||
+ | 09d915bd-744b-4ff1-a223-983c02f05f3b |
||
+ | Bridge br0 |
||
+ | datapath_type: netdev |
||
+ | Port dpdk-p0 |
||
+ | Interface dpdk-p0 |
||
+ | type: dpdk |
||
+ | options: {dpdk-devargs="0000:0f:00.0", n_rxq="2"} |
||
+ | ... |
||
+ | </pre> |
||
+ | |||
+ | and 'top -> 1' shows that 0x00f0 selects all processors in the CPU 4-7 set (bits 4-7 of the mask are set). As a result the volume of incoming traffic, i.e. traffic that arrives and is received, is several times larger, e.g. |
||
+ | |||
+ | <pre> |
||
+ | root@dpdp-u2404:~# vnstat -l -i vlan11 |
||
+ | Monitoring vlan11... (press CTRL-C to stop) |
||
+ | |||
+ | rx: 382.62 Mbit/s 854067 p/s tx: 198.13 Mbit/s 427007 p/s^C |
||
+ | ... |
||
+ | </pre> |
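The 'pmd-cpu-mask' value is a bitmask in which bit N stands for CPU N. A minimal sketch for converting between a mask and a CPU list, assuming hypothetical helper names `mask_to_cpus` and `cpus_to_mask` (they are not OVS commands):

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading and building CPU bitmasks.

# Print the CPU numbers whose bits are set in the given mask.
mask_to_cpus() {
    local mask=$(( $1 ))
    local cpu=0
    local out=""
    while [ "$mask" -gt 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            out="${out:+$out }$cpu"
        fi
        mask=$(( mask >> 1 ))
        cpu=$(( cpu + 1 ))
    done
    echo "$out"
}

# Build a hexadecimal mask from a list of CPU numbers.
cpus_to_mask() {
    local mask=0
    local cpu
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '0x%x\n' "$mask"
}

mask_to_cpus 0x00f0    # prints 4 5 6 7
cpus_to_mask 4 5 6 7   # prints 0xf0
```

The same reading applies to 'dpdk-lcore-mask=0x1' used earlier, which selects CPU 0 only.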
||
+ | |||
+ | Useful additional material |
||
+ | |||
+ | * https://enterprise-support.nvidia.com/s/article/mellanox-dpdk |
||
+ | * https://doc.dpdk.org/guides/nics/mlx5.html |
||
+ | * http://www.virtualopensystems.com/en/solutions/guides/snabbswitch-qemu/ |
||
+ | |||
+ | ====Bringing traffic through the OVS switch to a QEMU virtual machine with DPDK - vhost-user protocol==== |
||
+ | |||
+ | Claims |
||
+ | |||
+ | * vhost-user-client |
||
+ | * the work is done on PVE v. 8.2.2 |
||
+ | * as a result the virtual machine can be operated in the natural way from the PVE web GUI etc. (including start and stop), while e.g. its networking works together with the DPDK solution |
||
+ | * nothing is compiled, i.e. everything is installed from the standard Debian v. 12 and PVE apt repositories |
||
+ | * it remains open how the PVE guest's networking should be prepared so that it handles packets delivered via DPDK efficiently (currently it is an ordinary Debian v. 12 machine with virtio networking etc.) |
||
+ | |||
+ | This solution continues and extends the previous section. Preparing the PVE host |
||
+ | |||
+ | * installing the DPDK software - see the previous Ubuntu v. 24.04 section |
||
+ | * configuring the OVS DPDK properties |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~/20240706# echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages |
||
+ | root@pve-moraal-x570:~/20240706# grep -i hugepages /proc/meminfo |
||
+ | AnonHugePages: 0 kB |
||
+ | ShmemHugePages: 0 kB |
||
+ | FileHugePages: 0 kB |
||
+ | HugePages_Total: 2048 |
||
+ | HugePages_Free: 2048 |
||
+ | HugePages_Rsvd: 0 |
||
+ | HugePages_Surp: 0 |
||
+ | Hugepagesize: 2048 kB |
||
+ | |||
+ | root@pve-moraal-x570:~/20240706# findmnt | grep huge |
||
+ | │ ├─/dev/hugepages hugetlbfs hugetlbfs rw,relatime,pagesize=2M |
||
+ | │ ├─/run/hugepages/kvm/2048kB hugetlbfs hugetlbfs rw,relatime,pagesize=2M |
||
+ | │ └─/run/hugepages/kvm/1048576kB hugetlbfs hugetlbfs rw,relatime,pagesize=1024M |
||
+ | |||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true" |
||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-lcore-mask=0x1" |
||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-alloc-mem=2048" |
||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-extra=--allow=0000:0f:00.0" |
||
+ | root@pve-moraal-x570:~/20240706# systemctl restart openvswitch-switch |
||
+ | |||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl add-br vmbr0 -- set bridge vmbr0 datapath_type=netdev |
||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0 |
||
+ | |||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 vlan14 tag=14 -- set interface vlan14 type=internal |
||
+ | root@pve-moraal-x570:~/20240706# ifconfig vlan14 192.168.112.169/24 |
||
+ | |||
+ | root@pve-moraal-x570:~/20240706# mkdir /var/run/vhostuserclient |
||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuserclient "options:vhost-server-path=/var/run/vhostuserclient/vhost-user-client-1" |
||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl set port vhost-user-1 tag=10 |
||
+ | (this is probably not relevant: root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . other_config:hw-offload=true ) |
||
+ | |||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl set Interface dpdk-p0 "options:n_rxq=4" |
||
+ | root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x00f0 |
||
+ | </pre> |
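The /proc/meminfo output above can be turned into a total amount of hugepage memory. A minimal sketch, assuming a hypothetical helper name `hugepage_total_mb` that reads meminfo-style text on standard input:

```shell
#!/usr/bin/env bash
# Hypothetical helper: total hugepage memory in MB, computed from
# the HugePages_Total and Hugepagesize lines of /proc/meminfo-style text.
hugepage_total_mb() {
    awk '/^HugePages_Total:/ { n = $2 }
         /^Hugepagesize:/    { kb = $2 }
         END { print n * kb / 1024 }'
}

# On a Linux host, report the current reservation.
if [ -r /proc/meminfo ]; then
    hugepage_total_mb < /proc/meminfo
fi
```

With the values shown above (2048 pages of 2048 kB) this gives 4096 MB, which would cover 'dpdk-alloc-mem=2048' plus a guest using 1024 MB of hugepage memory, assuming nothing else on the host consumes hugepages.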
||
+ | |||
+ | PVE guest configuration file |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~/20240706# cat /etc/pve/qemu-server/123.conf |
||
+ | agent: 1 |
||
+ | bios: ovmf |
||
+ | boot: order=virtio0;ide2;net0 |
||
+ | cores: 4 |
||
+ | cpu: host |
||
+ | efidisk0: sn_srv_btrfs:123/vm-123-disk-5.raw,efitype=4m,pre-enrolled-keys=1,size=528K |
||
+ | ide2: none,media=cdrom |
||
+ | machine: q35 |
||
+ | memory: 1024 |
||
+ | name: deb11-tm-tartu-btrfs |
||
+ | # net0: virtio=12:4A:8D:1E:33:3D,bridge=vmbr0,firewall=1,tag=10 |
||
+ | numa: 1 |
||
+ | onboot: 0 |
||
+ | ostype: l26 |
||
+ | rng0: source=/dev/urandom |
||
+ | scsihw: virtio-scsi-pci |
||
+ | serial0: socket |
||
+ | smbios1: uuid=0d52990c-3194-4c80-942f-d14621a6e940 |
||
+ | sockets: 1 |
||
+ | virtio0: sn_srv_btrfs:123/vm-123-disk-0.raw,size=16G |
||
+ | virtio1: sn_srv_btrfs:123/vm-123-disk-1.raw,size=32G |
||
+ | virtio2: sn_srv_btrfs:123/vm-123-disk-2.raw,size=24G |
||
+ | vmgenid: fb558ebc-7d59-4c13-8eae-505349adf9a1 |
||
+ | hugepages: 1024 |
||
+ | args: -machine q35+pve0,kernel_irqchip=split \ |
||
+ | -device intel-iommu,intremap=on,caching-mode=on \ |
||
+ | -chardev socket,id=char1,path=/var/run/vhostuserclient/vhost-user-client-1,server=on \ |
||
+ | -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce=on,queues=4 \ |
||
+ | -device virtio-net-pci,mac=12:4A:8D:1E:33:3D,netdev=mynet1,mq=on,vectors=10,rx_queue_size=1024,tx_queue_size=256 |
||
+ | </pre> |
||
+ | |||
+ | where |
||
+ | |||
+ | * no network interface is configured in the PVE web GUI; instead the interface is built by hand using 'args' |
||
+ | * 'args' also adds other elements the QEMU guest needs (as of 2024 perhaps not all of this is necessary, e.g. 'kernel_irqchip=split' appears to be the default anyway) |
||
+ | * the chardev/netdev/device set following 'args' assumes that the OVS vhostuserclient preparations described above have been done |
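The 'queues=4' and 'vectors=10' values in the 'args' line fit the rule commonly given for multiqueue virtio-net, vectors = 2 * queues + 2 (one RX and one TX vector per queue pair, plus two for configuration and control). A minimal sketch, assuming a hypothetical helper name `virtio_net_vectors`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: MSI-X vector count for a multiqueue virtio-net
# device, using the commonly cited rule 2 * queues + 2.
virtio_net_vectors() {
    local queues=$1
    echo $(( 2 * queues + 2 ))
}

virtio_net_vectors 4   # prints 10
```

If the number of queues in '-netdev ... queues=' is changed, the 'vectors=' value on the '-device virtio-net-pci' line would be adjusted accordingly.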
||
+ | |||
+ | Result |
||
+ | |||
+ | * TODO |
||
+ | |||
+ | Useful additional material |
||
+ | |||
+ | * https://stackoverflow.com/questions/69710907/connect-qemu-kvm-vms-using-vhost-user-client-and-ovs-dpdk |
||
+ | * https://docs.redhat.com/en/documentation/red_hat_openstack_platform/10/html/network_functions_virtualization_planning_guide/ch-vhost-user-ports |
||
+ | * https://forum.proxmox.com/threads/tutorial-run-open-vswitch-ovs-dpdk-on-pve-7-0.97116/ |
||
+ | * https://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/ |
||
+ | * https://docs.nvidia.com/networking/display/mlnxenv24040700/ovs+offload+using+asap%C2%B2+direct#src-2958624697_safe-id-T1ZTT2ZmbG9hZFVzaW5nQVNBUMKyRGlyZWN0LWh3dmRwYQ |
||
+ | * https://www.youtube.com/watch?v=y0ASTg3VCCc |
||
+ | * https://www.intel.com/content/www/us/en/developer/articles/technical/data-plane-development-kit-vhost-user-client-mode-with-open-vswitch.html |
||
+ | * https://www.redhat.com/en/blog/journey-vhost-users-realm |
||
+ | * https://www.redhat.com/en/virtio-networking-series |
||
+ | |||
+ | ====Bringing traffic through the OVS switch to a QEMU virtual machine with DPDK - VF representor==== |
||
+ | |||
+ | First, check that the physical Mellanox network card is present and that the eswitch is in legacy mode |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# lspci | grep Mellanox |
||
+ | 0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx] |
||
+ | 0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx] |
||
+ | |||
+ | root@pve-moraal-x570:~/20240706# devlink dev eswitch show pci/0000:0f:00.1 |
||
+ | pci/0000:0f:00.1: mode legacy inline-mode none encap-mode basic |
||
+ | |||
+ | root@pve-moraal-x570:~/20240706# devlink dev eswitch show pci/0000:0f:00.0 |
||
+ | pci/0000:0f:00.0: mode legacy inline-mode none encap-mode basic |
||
+ | </pre> |
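In a script the eswitch mode can be picked out of the 'devlink dev eswitch show' output shown above. A minimal sketch, assuming a hypothetical helper name `eswitch_mode` and the single-line output format seen in this capture:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `devlink dev eswitch show` output on stdin
# and print the word that follows the standalone field "mode".
eswitch_mode() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mode") { print $(i + 1); exit } }'
}

echo 'pci/0000:0f:00.0: mode legacy inline-mode none encap-mode basic' | eswitch_mode   # prints legacy
```

On the host this would be used as 'devlink dev eswitch show pci/0000:0f:00.0 | eswitch_mode'; the fields 'inline-mode' and 'encap-mode' are not matched because the comparison is on the whole field.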
||
+ | |||
+ | Two VFs are enabled |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# echo 2 > /sys/class/net/enp15s0f0np0/device/sriov_numvfs |
||
+ | |||
+ | root@pve-moraal-x570:~# lspci | grep Mellanox |
||
+ | 0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx] |
||
+ | 0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx] |
||
+ | 0f:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function |
||
+ | 0f:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function |
||
+ | </pre> |
||
+ | |||
+ | MAC addresses are assigned to the VFs |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# ip link set enp15s0f0np0 vf 0 mac e4:11:22:33:46:50 |
||
+ | root@pve-moraal-x570:~# ip link set enp15s0f0np0 vf 1 mac e4:11:22:33:46:51 |
||
+ | |||
+ | root@pve-moraal-x570:~# ip link show dev enp15s0f0np0 |
||
+ | 3: enp15s0f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 |
||
+ | link/ether e8:eb:d3:0b:78:74 brd ff:ff:ff:ff:ff:ff |
||
+ | vf 0 link/ether e4:11:22:33:46:50 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off |
||
+ | vf 1 link/ether e4:11:22:33:46:51 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off |
||
+ | </pre> |
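With more VFs the MAC assignment becomes repetitive. A minimal sketch that only prints the 'ip link' commands (a dry run), assuming a hypothetical helper name `vf_mac_commands` and that consecutive addresses are obtained by incrementing the last octet of a base address:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not execute) the commands that assign
# consecutive MAC addresses to VFs 0..count-1 of a physical function.
# Only the last octet is incremented; it wraps at 0xff without carrying.
vf_mac_commands() {
    local pf=$1
    local base=$2
    local count=$3
    local prefix=${base%:*}
    local last=$(( 0x${base##*:} ))
    local i=0
    while [ "$i" -lt "$count" ]; do
        printf 'ip link set %s vf %d mac %s:%02x\n' \
            "$pf" "$i" "$prefix" $(( (last + i) & 0xff ))
        i=$(( i + 1 ))
    done
}

vf_mac_commands enp15s0f0np0 e4:11:22:33:46:50 2
```

For the values used above this prints the same two commands that were entered by hand; the output can be reviewed before being run as root.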
||
+ | |||
+ | The mlx5_core driver is unbound from the VF devices |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use |
||
+ | Kernel driver in use: mlx5_core |
||
+ | Kernel driver in use: mlx5_core |
||
+ | Kernel driver in use: mlx5_core |
||
+ | Kernel driver in use: mlx5_core |
||
+ | |||
+ | root@pve-moraal-x570:~# echo 0000:0f:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind |
||
+ | root@pve-moraal-x570:~# echo 0000:0f:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind |
||
+ | |||
+ | root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use |
||
+ | Kernel driver in use: mlx5_core |
||
+ | Kernel driver in use: mlx5_core |
||
+ | </pre> |
||
+ | |||
+ | The eswitch is switched from legacy to switchdev mode |
||
+ | |||
+ | root@pve-moraal-x570:~# devlink dev eswitch set pci/0000:0f:00.0 mode switchdev |
||
+ | |||
+ | The mlx5_core driver is bound to the VF devices again |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# echo 0000:0f:00.2 > /sys/bus/pci/drivers/mlx5_core/bind |
||
+ | root@pve-moraal-x570:~# echo 0000:0f:00.3 > /sys/bus/pci/drivers/mlx5_core/bind |
||
+ | |||
+ | root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use |
||
+ | Kernel driver in use: mlx5_core |
||
+ | Kernel driver in use: mlx5_core |
||
+ | Kernel driver in use: mlx5_core |
||
+ | Kernel driver in use: mlx5_core |
||
+ | </pre> |
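The order of the last three steps matters: unbind the VFs, change the eswitch mode, bind the VFs again. A minimal sketch that prints this sequence as a dry run, assuming a hypothetical helper name `switchdev_plan`; the printed commands are the ones used above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not execute) the unbind / switchdev / bind
# sequence for one physical function and its VF PCI addresses.
switchdev_plan() {
    local pf=$1
    shift
    local vf
    for vf in "$@"; do
        echo "echo $vf > /sys/bus/pci/drivers/mlx5_core/unbind"
    done
    echo "devlink dev eswitch set pci/$pf mode switchdev"
    for vf in "$@"; do
        echo "echo $vf > /sys/bus/pci/drivers/mlx5_core/bind"
    done
}

switchdev_plan 0000:0f:00.0 0000:0f:00.2 0000:0f:00.3
```

Printing the plan first makes it easy to check the PCI addresses before anything is changed on the host.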
||
+ | |||
+ | The OVS processes are restarted |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# ovs-vsctl show |
||
+ | 9ffe33a7-4486-4701-b4c0-880c61f407b7 |
||
+ | ovs_version: "3.1.0" |
||
+ | |||
+ | root@pve-moraal-x570:~# systemctl restart openvswitch-switch |
||
+ | |||
+ | root@pve-moraal-x570:~# ovs-vsctl show |
||
+ | 9ffe33a7-4486-4701-b4c0-880c61f407b7 |
||
+ | ovs_version: "3.1.0" |
||
+ | </pre> |
||
+ | |||
+ | Hugepages are reserved |
||
+ | |||
+ | <pre> |
||
+ | # echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages |
||
+ | </pre> |
||
+ | |||
+ | The general OVS settings are applied |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true" |
||
+ | root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . other_config:hw-offload=true |
||
+ | root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . other_config:dpdk-extra="-a 0000:0f:00.0,representor=0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1" |
||
+ | root@pve-moraal-x570:~# systemctl restart openvswitch-switch |
||
+ | </pre> |
||
+ | |||
+ | The OVS bridge and ports are created |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# ovs-vsctl add-br vmbr0 -- set bridge vmbr0 datapath_type=netdev |
||
+ | root@pve-moraal-x570:~# ovs-vsctl add-port vmbr0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0 |
||
+ | root@pve-moraal-x570:~# ovs-vsctl add-port vmbr0 representor -- set Interface representor type=dpdk options:dpdk-devargs=0000:0f:00.0,representor=0 |
||
+ | root@pve-moraal-x570:~# ovs-vsctl set port representor tag=10 |
||
+ | </pre> |
||
+ | |||
+ | Result |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# ovs-vsctl list open_vSwitch |
||
+ | _uuid : 9ffe33a7-4486-4701-b4c0-880c61f407b7 |
||
+ | bridges : [1d7619bb-7b33-4959-9266-325808d72c13] |
||
+ | cur_cfg : 7 |
||
+ | datapath_types : [netdev, system] |
||
+ | datapaths : {} |
||
+ | db_version : "8.3.1" |
||
+ | dpdk_initialized : true |
||
+ | dpdk_version : "DPDK 22.11.5" |
||
+ | external_ids : {hostname=pve-moraal-x570.sise.moraal.ee, rundir="/var/run/openvswitch", system-id="9e871242-44eb-499f-a049-32f089e65f68"} |
||
+ | iface_types : [afxdp, afxdp-nonpmd, bareudp, dpdk, dpdkvhostuser, dpdkvhostuserclient, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan] |
||
+ | manager_options : [] |
||
+ | next_cfg : 7 |
||
+ | other_config : {dpdk-extra="-a 0000:0f:00.0,representor=0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1", dpdk-init="true", hw-offload="true"} |
||
+ | ovs_version : "3.1.0" |
||
+ | ssl : [] |
||
+ | statistics : {} |
||
+ | system_type : debian |
||
+ | system_version : "12" |
||
+ | </pre> |
||
+ | |||
+ | and |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# ovs-vsctl show |
||
+ | 9ffe33a7-4486-4701-b4c0-880c61f407b7 |
||
+ | Bridge vmbr0 |
||
+ | datapath_type: netdev |
||
+ | Port dpdk-p0 |
||
+ | Interface dpdk-p0 |
||
+ | type: dpdk |
||
+ | options: {dpdk-devargs="0000:0f:00.0"} |
||
+ | Port vmbr0 |
||
+ | Interface vmbr0 |
||
+ | type: internal |
||
+ | Port representor |
||
+ | tag: 10 |
||
+ | Interface representor |
||
+ | type: dpdk |
||
+ | options: {dpdk-devargs="0000:0f:00.0,representor=0"} |
||
+ | ovs_version: "3.1.0" |
||
+ | </pre> |
||
+ | |||
+ | PVE virtual machine configuration; with the representor solution the VF device itself has to be passed on to the virtual machine in the ordinary way |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# cat /etc/pve/qemu-server/123.conf |
||
+ | agent: 1 |
||
+ | bios: ovmf |
||
+ | boot: order=virtio0;ide2;net0 |
||
+ | cores: 4 |
||
+ | cpu: host |
||
+ | efidisk0: sn_srv_btrfs:123/vm-123-disk-5.raw,efitype=4m,pre-enrolled-keys=1,size=528K |
||
+ | hostpci0: 0000:0f:00.2 |
||
+ | hugepages: 1024 |
||
+ | ide2: none,media=cdrom |
||
+ | machine: q35,viommu=intel |
||
+ | memory: 1024 |
||
+ | name: deb11-tm-tartu-btrfs |
||
+ | numa: 1 |
||
+ | onboot: 0 |
||
+ | ostype: l26 |
||
+ | rng0: source=/dev/urandom |
||
+ | scsihw: virtio-scsi-pci |
||
+ | serial0: socket |
||
+ | smbios1: uuid=0d52990c-3194-4c80-942f-d14621a6e940 |
||
+ | sockets: 1 |
||
+ | vga: virtio |
||
+ | virtio0: sn_srv_btrfs:123/vm-123-disk-0.raw,size=16G |
||
+ | virtio1: sn_srv_btrfs:123/vm-123-disk-1.raw,size=32G |
||
+ | virtio2: sn_srv_btrfs:123/vm-123-disk-2.raw,size=24G |
||
+ | vmgenid: fb558ebc-7d59-4c13-8eae-505349adf9a1 |
||
+ | </pre> |
||
+ | |||
+ | Result |
||
+ | |||
+ | * the network works |
||
+ | * performance is moderate |
||
+ | * inside the virtual machine the network device uses the ordinary mlx5_core driver |
||
+ | |||
+ | <pre> |
||
+ | root@tm-tartu-x570:~# lspci | grep Mella |
||
+ | 06:10.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function |
||
+ | </pre> |
||
+ | |||
+ | On the PVE host the characteristic 100% usage of one CPU is visible |
||
+ | |||
+ | [[Fail:20240708-mlnx-pmd-01.png]] |
||
+ | |||
+ | Useful additional material |
||
+ | |||
+ | * https://docs.openvswitch.org/en/latest/topics/dpdk/phy/# |
||
+ | * https://github.com/Mellanox/scalablefunctions/wiki/Upstream-step-by-step-guide |
||
+ | * https://www.youtube.com/watch?v=37MN8C_MNuQ |
||
+ | |||
+ | ====Bringing traffic through the OVS switch to a QEMU virtual machine with DPDK - vdpa==== |
||
+ | |||
+ | TODO |
||
+ | |||
+ | Useful additional material |
||
+ | |||
+ | * https://metonymical.hatenablog.com/entry/2021/04/14/002638 |
||
+ | |||
+ | ====kernel tls==== |
||
+ | |||
+ | <pre> |
||
+ | TODO |
||
+ | </pre> |
||
+ | |||
+ | ====inbox driver and utilities==== |
||
+ | |||
+ | # apt-get install mstflint |
||
+ | |||
+ | To change the card's SR-IOV setting, use |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# mstconfig -d 0000:0f:00.0 set SRIOV_EN=False |
||
+ | |||
+ | Device #1: |
||
+ | ---------- |
||
+ | |||
+ | Device type: ConnectX6LX |
||
+ | Name: MCX631102AN-ADA_Ax |
||
+ | Description: ConnectX-6 Lx EN adapter card; 25GbE ; Dual-port SFP28; PCIe 4.0 x8; No Crypto |
||
+ | Device: 0000:0f:00.0 |
||
+ | |||
+ | Configurations: Next Boot New |
||
+ | SRIOV_EN True(1) False(0) |
||
+ | |||
+ | Apply new Configuration? (y/n) [n] : y |
||
+ | Applying... Done! |
||
+ | -I- Please reboot machine to load new configurations. |
||
+ | </pre> |
||
+ | |||
+ | As a result the VF capability is gone |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# ls -ld /sys/class/net/enp15s0f0np0/device/s* |
||
+ | lrwxrwxrwx 1 root root 0 Jun 30 03:32 /sys/class/net/enp15s0f0np0/device/subsystem -> ../../../../bus/pci |
||
+ | -r--r--r-- 1 root root 4096 Jun 30 03:35 /sys/class/net/enp15s0f0np0/device/subsystem_device |
||
+ | -r--r--r-- 1 root root 4096 Jun 30 03:35 /sys/class/net/enp15s0f0np0/device/subsystem_vendor |
||
+ | </pre> |
||
+ | |||
+ | After it is switched back on, the VF capability returns |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# mstconfig -d 0000:0f:00.0 set SRIOV_EN=True |
||
+ | |||
+ | root@pve-moraal-x570:~# ls -ld /sys/class/net/enp15s0f0np0/device/s* |
||
+ | -rw-r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_drivers_autoprobe |
||
+ | -rw-r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_numvfs |
||
+ | -r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_offset |
||
+ | -r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_stride |
||
+ | -r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_totalvfs |
||
+ | -r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_vf_device |
||
+ | -r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_vf_total_msix |
||
+ | lrwxrwxrwx 1 root root 0 Jun 30 03:39 /sys/class/net/enp15s0f0np0/device/subsystem -> ../../../../bus/pci |
||
+ | -r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/subsystem_device |
||
+ | -r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/subsystem_vendor |
||
+ | |||
+ | root@pve-moraal-x570:~# cat /sys/class/net/enp15s0f0np0/device/sriov_totalvfs |
||
+ | 8 |
||
+ | </pre> |
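The sysfs files listed above are enough to check whether SR-IOV is available and how many VFs can be requested. A minimal sketch, assuming a hypothetical helper name `set_numvfs` that takes the device directory as a parameter (on this host /sys/class/net/enp15s0f0np0/device):

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the number of VFs through sysfs after checking
# that SR-IOV is available and that the request fits sriov_totalvfs.
set_numvfs() {
    local devdir=$1
    local want=$2
    local total
    if [ ! -r "$devdir/sriov_totalvfs" ]; then
        echo "SR-IOV is not available under $devdir" >&2
        return 1
    fi
    total=$(cat "$devdir/sriov_totalvfs")
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, device supports at most $total" >&2
        return 1
    fi
    # The kernel refuses to change one non-zero VF count directly to
    # another, so the count is reset to 0 first.
    echo 0 > "$devdir/sriov_numvfs"
    echo "$want" > "$devdir/sriov_numvfs"
}

# Example (as root): set_numvfs /sys/class/net/enp15s0f0np0/device 4
```

Taking the directory as an argument keeps the function usable for either port of the card and allows it to be tried against a scratch directory first.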
||
+ | |||
+ | Secure boot must be disabled to make these changes. |
||
+ | |||
+ | The mstlink utility |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# mstlink -d 0000:0f:00.0 --show_device |
||
+ | |||
+ | Operational Info |
||
+ | ---------------- |
||
+ | State : Polling |
||
+ | Physical state : ETH_AN_FSM_ENABLE |
||
+ | Speed : N/A |
||
+ | Width : N/A |
||
+ | FEC : N/A |
||
+ | Loopback Mode : No Loopback |
||
+ | Auto Negotiation : FORCE - 25G,10G,1G |
||
+ | |||
+ | Supported Info |
||
+ | -------------- |
||
+ | Enabled Link Speed (Ext.) : 0x00000052 (25G,10G,1G) |
||
+ | Supported Cable Speed (Ext.) : 0x00000003 (1G,100M) |
||
+ | |||
+ | Troubleshooting Info |
||
+ | -------------------- |
||
+ | Status Opcode : 36 |
||
+ | Group Opcode : PHY FW |
||
+ | Recommendation : Force Mode no partner detected. |
||
+ | |||
+ | Tool Information |
||
+ | ---------------- |
||
+ | Firmware Version : 26.41.1000 |
||
+ | amBER Version : 2.05 |
||
+ | MSTFLINT Version : mstflint 4.21.0 |
||
+ | |||
+ | Device Info |
||
+ | ----------- |
||
+ | Part Number : N/A |
||
+ | Part Name : N/A |
||
+ | Serial Number : N/A |
||
+ | Revision : N/A |
||
+ | FW Version : 26.41.1000 |
||
+ | |||
+ | Note: P/N, Product Name, S/N and Revision are supported only in switches |
||
+ | </pre> |
||
+ | |||
+ | Using Virtual Functions; the starting point is the default state |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# lspci | grep Mellanox |
||
+ | 0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx] |
||
+ | 0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx] |
||
+ | </pre> |
||
+ | |||
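+ | To use four virtual functions, write 4 to /sys/class/net/enp15s0f0np0/device/sriov_numvfs (the same way 2 was written in the VF representor section above); the result is |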
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# lspci | grep Mellanox |
||
+ | 0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx] |
||
+ | 0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx] |
||
+ | 0f:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function |
||
+ | 0f:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function |
||
+ | 0f:00.4 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function |
||
+ | 0f:00.5 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function |
||
+ | </pre> |
||
+ | |||
+ | The following devices are created |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# ip link show dev enp15s0f0np0 |
||
+ | 50: enp15s0f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 |
||
+ | link/ether e8:eb:d3:0b:78:74 brd ff:ff:ff:ff:ff:ff |
||
+ | vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off |
||
+ | vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off |
||
+ | vf 2 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off |
||
+ | vf 3 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off |
||
+ | root@pve-moraal-x570:~# ip link show dev enp15s0f0v0 |
||
+ | 56: enp15s0f0v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 |
||
+ | link/ether 3a:48:d1:d6:57:dd brd ff:ff:ff:ff:ff:ff |
||
+ | </pre> |
||
+ | |||
+ | On the host a VF device can be used practically like an ordinary network device. The other option is to pass it on to a PVE virtual machine with PCIe passthrough. In the virtual machine the card looks like this |
||
+ | |||
+ | <pre> |
||
+ | root@pve-sdn-01:~# lspci | grep Mell |
||
+ | 01:00.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function |
||
+ | root@pve-sdn-01:~# ethtool -i enp1s0 |
||
+ | driver: mlx5_core |
||
+ | version: 6.8.8-1-pve |
||
+ | firmware-version: 26.41.1000 (MT_0000000531) |
||
+ | expansion-rom-version: |
||
+ | bus-info: 0000:01:00.0 |
||
+ | supports-statistics: yes |
||
+ | supports-test: yes |
||
+ | supports-eeprom-access: no |
||
+ | supports-register-dump: no |
||
+ | supports-priv-flags: yes |
||
+ | </pre> |
||
+ | |||
+ | where |
||
+ | |||
+ | * the virtual machine can use the same mlx5 driver that is used on the PVE host |
||
+ | |||
+ | ===Using dpdk=== |
||
+ | |||
+ | Install the dpdk and dpdk-dev packages |
||
+ | |||
+ | # apt-get install dpdk dpdk-dev |
||
+ | |||
+ | As a result the system has, among others, these utilities |
||
+ | |||
+ | * dpdk-devbind.py |
||
+ | * dpdk-hugepages.py |
||
+ | * dpdk-testpmd |
||
+ | |||
+ | Assessing the situation, see https://doc.dpdk.org/guides/nics/mlx5.html -> 'Usage example' |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# ls -d /sys/class/net/*/device/infiniband_verbs/uverbs* | cut -d / -f 5 |
||
+ | enp15s0f0np0 |
||
+ | enp15s0f1np1 |
||
+ | </pre> |
||
+ | |||
+ | Starting testpmd; it is not entirely clear whether this counts as a successful start |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# dpdk-testpmd -l 8-15 -n 4 -a 0f:00.0 -a 0f:00.1 -- --rxq=2 --txq=2 -i |
||
+ | EAL: Detected CPU lcores: 24 |
||
+ | EAL: Detected NUMA nodes: 1 |
||
+ | EAL: Detected shared linkage of DPDK |
||
+ | EAL: Multi-process socket /var/run/dpdk/rte/mp_socket |
||
+ | EAL: Selected IOVA mode 'VA' |
||
+ | EAL: Probe PCI driver: mlx5_pci (15b3:101f) device: 0000:0f:00.0 (socket -1) |
||
+ | EAL: Probe PCI driver: mlx5_pci (15b3:101f) device: 0000:0f:00.1 (socket -1) |
||
+ | Interactive-mode selected |
||
+ | Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa. |
||
+ | testpmd: create a new mbuf pool <mb_pool_0>: n=203456, size=2176, socket=0 |
||
+ | testpmd: preferred mempool ops selected: ring_mp_mc |
||
+ | Configuring Port 0 (socket 0) |
||
+ | Port 0: E8:EB:D3:0B:78:74 |
||
+ | Configuring Port 1 (socket 0) |
||
+ | Port 1: E8:EB:D3:0B:78:75 |
||
+ | Checking link statuses... |
||
+ | Done |
||
+ | testpmd> |
||
+ | </pre> |
||
+ | |||
+ | For example |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# dpdk-testpmd -l 6-9 -n 4 -a 0f:00.0 -- --rxq=4 --txq=4 -i |
||
+ | testpmd> set fwd txonly |
||
+ | |||
+ | testpmd> show port stats all |
||
+ | |||
+ | ######################## NIC statistics for port 0 ######################## |
||
+ | RX-packets: 0 RX-missed: 0 RX-bytes: 0 |
||
+ | RX-errors: 0 |
||
+ | RX-nombuf: 0 |
||
+ | TX-packets: 5153664 TX-errors: 0 TX-bytes: 329834496 |
||
+ | |||
+ | Throughput (since last show) |
||
+ | Rx-pps: 0 Rx-bps: 0 |
||
+ | Tx-pps: 1420582 Tx-bps: 727338208 |
||
+ | ############################################################################ |
||
+ | </pre> |
||
+ | |||
+ | and capturing traffic on the other port shows |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# tcpdump -c 4 -nei enp15s0f1np1 |
||
+ | tcpdump: verbose output suppressed, use -v[v]... for full protocol decode |
||
+ | listening on enp15s0f1np1, link-type EN10MB (Ethernet), snapshot length 262144 bytes |
||
+ | 09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22 |
||
+ | 09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22 |
||
+ | 09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22 |
||
+ | 09:07:53.700765 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22 |
||
+ | </pre> |
||
+ | |||
+ | Useful additional materials |
||
+ | |||
+ | * https://www.youtube.com/watch?v=KX1QOqMtchg |
||
+ | * https://www.youtube.com/watch?v=0yDdMWQPCOI |
||
+ | * https://www.youtube.com/watch?v=Un5-AN4nb9s |
||
===Misc=== |
===Misc=== |
||
Line 64: | Line 843: | ||
* the firmware version on the card matches in the ethtool and devlink-dev-info output - 26.35.2000 |
* the firmware version on the card matches in the ethtool and devlink-dev-info output - 26.35.2000 |
||
+ | |||
+ | ===Using the vendor's MLNX EN software=== |
||
+ | |||
+ | ====Terms==== |
||
+ | |||
+ | * MFT - NVIDIA Firmware Tools, probably originally Mellanox Firmware Tools |
||
+ | |||
+ | ====Using the whole physical device in passthrough mode==== |
||
+ | |||
+ | Statements |
||
+ | |||
+ | * in general, in a Proxmox v. 8 environment a Mellanox device can be handed to a virtual machine in the usual way (choose 'Add -> PCI device' in the PVE web GUI and select the first MLNX device; the second is added automatically) |
||
+ | * it seems that a physical Mellanox device cannot be passed to a virtual machine completely, so to speak; for one thing, the SR-IOV capability is missing |
||
+ | * in the virtual machine the network device can be used as a PF; VFs are not accessible |
||
+ | * a vIOMMU can be added to the virtual machine in the ordinary PVE web GUI; the virtual machine's devices, including the different physical ports of the MLNX adapter, are then placed in different IOMMU groups |
||
+ | |||
+ | Typically with this arrangement, the vfio driver keeps handling the device on the host |
||
+ | |||
+ | <pre> |
||
+ | root@pve-moraal-x570:~# lspci -vvv | grep vfio |
||
+ | Kernel driver in use: vfio-pci |
||
+ | Kernel driver in use: vfio-pci |
||
+ | </pre> |
||
+ | |||
+ | To get the device back on the host under the mlx driver without rebooting, it is enough to |
||
+ | |||
+ | * stop the virtual machine |
||
+ | * run on the host |
||
+ | |||
+ | <pre> |
||
+ | echo 1 > /sys/bus/pci/devices/0000\:0f\:00.0/remove |
||
+ | echo 1 > /sys/bus/pci/devices/0000\:0f\:00.1/remove |
||
+ | echo 1 > /sys/bus/pci/rescan |
||
+ | </pre> |
||
+ | |||
+ | ====Installing the software==== |
||
+ | |||
+ | TODO |
||
+ | |||
+ | root@debian-mlnx-01:~# mount /root/mlnx-en-24.04-0.6.6.0-debian12.1-x86_64.iso /mnt/mlnx |
||
+ | |||
+ | <pre> |
||
+ | root@debian-mlnx-01:~# find /lib/modules/6.1.0-22-amd64/ -type f -mmin -20 -ls | grep dkms |
||
+ | 661960 4312 -rw-r--r-- 1 root root 4415205 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx5_core.ko |
||
+ | 661959 28 -rw-r--r-- 1 root root 25237 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx_compat.ko |
||
+ | 661961 8 -rw-r--r-- 1 root root 5565 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx5_ib.ko |
||
+ | 661962 52 -rw-r--r-- 1 root root 49445 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlxfw.ko |
||
+ | 661963 208 -rw-r--r-- 1 root root 210797 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlxdevm.ko |
||
+ | </pre> |
||
+ | |||
+ | ====Updating the firmware==== |
||
+ | |||
+ | <pre> |
||
+ | root@debian-mlnx-01:~# apt-get install mlnx-fw-updater |
||
+ | Reading package lists... Done |
||
+ | Building dependency tree... Done |
||
+ | Reading state information... Done |
||
+ | The following package was automatically installed and is no longer required: |
||
+ | linux-image-6.1.0-15-amd64 |
||
+ | Use 'apt autoremove' to remove it. |
||
+ | The following NEW packages will be installed: |
||
+ | mlnx-fw-updater |
||
+ | 0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded. |
||
+ | Need to get 0 B/50.2 MB of archives. |
||
+ | After this operation, 87.9 MB of additional disk space will be used. |
||
+ | Get:1 file:/mnt/mlnx/DEBS_ETH ./ mlnx-fw-updater 24.04-0.6.6.0 [50.2 MB] |
||
+ | Selecting previously unselected package mlnx-fw-updater. |
||
+ | (Reading database ... 63847 files and directories currently installed.) |
||
+ | Preparing to unpack .../mlnx-fw-updater_24.04-0.6.6.0_amd64.deb ... |
||
+ | Unpacking mlnx-fw-updater (24.04-0.6.6.0) ... |
||
+ | Setting up mlnx-fw-updater (24.04-0.6.6.0) ... |
||
+ | Initializing... |
||
+ | Attempting to perform Firmware update... |
||
+ | Querying Mellanox devices firmware ... |
||
+ | |||
+ | Device #1: |
||
+ | ---------- |
||
+ | |||
+ | Device Type: ConnectX6LX |
||
+ | Part Number: MCX631102AN-ADA_Ax |
||
+ | Description: ConnectX-6 Lx EN adapter card; 25GbE ; Dual-port SFP28; PCIe 4.0 x8; No Crypto |
||
+ | PSID: MT_0000000531 |
||
+ | PCI Device Name: 01:00.0 |
||
+ | Base GUID: e8ebd303000b7874 |
||
+ | Base MAC: e8ebd30b7874 |
||
+ | Versions: Current Available |
||
+ | FW 26.32.2004 26.41.1000 |
||
+ | PXE 3.6.0502 3.7.0400 |
||
+ | UEFI 14.25.0018 14.34.0012 |
||
+ | |||
+ | Status: Update required |
||
+ | |||
+ | --------- |
||
+ | Found 1 device(s) requiring firmware update... |
||
+ | |||
+ | Device #1: Updating FW ... |
||
+ | FSMST_INITIALIZE - OK |
||
+ | Writing Boot image component - OK |
||
+ | Done |
||
+ | |||
+ | Restart needed for updates to take effect. |
||
+ | Log File: /tmp/oaFVUkaJsl |
||
+ | Real log file: /tmp/mlnx_fw_update.log |
||
+ | |||
+ | root@debian-mlnx-01:~# less /tmp/mlnx_fw_update.log |
||
+ | CMD: mlxup -u --log-on-update --ssl-certificate /tmp/OloIGrYWuz/mlxfwmanager_sriov_dis_x86_64_4127-dir/ca-bundle.crt --current-dir /opt/mellanox/mlnx-fw-updater/ -L /tmp/oaFVUkaJsl -y -d 01:00.0 |
||
+ | Querying Mellanox devices firmware ... |
||
+ | |||
+ | Device #1: |
||
+ | ---------- |
||
+ | |||
+ | ... |
||
+ | </pre> |
||
+ | |||
+ | It appears that as a result the card holds two firmware versions |
||
+ | |||
+ | <pre> |
||
+ | root@debian-mlnx-01:~# devlink dev info |
||
+ | pci/0000:01:00.0: |
||
+ | driver mlx5_core |
||
+ | versions: |
||
+ | fixed: |
||
+ | fw.psid MT_0000000531 |
||
+ | running: |
||
+ | fw.version 26.32.2004 |
||
+ | fw 26.32.2004 |
||
+ | stored: |
||
+ | fw.version 26.41.1000 |
||
+ | fw 26.41.1000 |
||
+ | pci/0000:01:00.1: |
||
+ | driver mlx5_core |
||
+ | versions: |
||
+ | fixed: |
||
+ | fw.psid MT_0000000531 |
||
+ | running: |
||
+ | fw.version 26.32.2004 |
||
+ | fw 26.32.2004 |
||
+ | stored: |
||
+ | fw.version 26.41.1000 |
||
+ | fw 26.41.1000 |
||
+ | </pre> |
||
+ | |||
+ | where |
||
+ | |||
+ | * running version - 26.32.2004 |
||
+ | * stored version - 26.41.1000 |
||
+ | |||
+ | ====systemd unit mlnx-en.d==== |
||
+ | |||
+ | It seems that a systemd unit takes care of the mlx drivers |
||
+ | |||
+ | <pre> |
||
+ | root@debian-mlnx-01:~# dpkg -S /etc/mlnx-en.conf |
||
+ | mlnx-en-utils: /etc/mlnx-en.conf |
||
+ | |||
+ | root@debian-mlnx-01:~# cat /etc/mlnx-en.conf |
||
+ | # Allow calling the service script with the option 'stop' for unloading the driver stack. |
||
+ | # This flag should be disabled when the OS root file system is on remote storage. |
||
+ | ALLOW_STOP=yes |
||
+ | |||
+ | # Run sysctl performance tuning script |
||
+ | RUN_SYSCTL=no |
||
+ | |||
+ | # Run /usr/sbin/mlnx_tune |
||
+ | RUN_MLNX_TUNE=no |
||
+ | |||
+ | # Load MLX4 modules |
||
+ | MLX4_LOAD=no |
||
+ | |||
+ | # Load MLX5 modules |
||
+ | MLX5_LOAD=yes |
||
+ | |||
+ | root@debian-mlnx-01:~# systemctl start mlnx-en.d |
||
+ | root@debian-mlnx-01:~# systemctl status mlnx-en.d |
||
+ | ● mlnx-en.d.service - mlnx-en.d - configure Mellanox devices |
||
+ | Loaded: loaded (/lib/systemd/system/mlnx-en.d.service; enabled; preset: enabled) |
||
+ | Active: active (exited) since Sun 2024-06-30 00:49:29 EEST; 6s ago |
||
+ | Docs: file:/etc/mlnx-en.conf |
||
+ | Process: 1505 ExecStart=/etc/init.d/mlnx-en.d start (code=exited, status=0/SUCCESS) |
||
+ | Main PID: 1505 (code=exited, status=0/SUCCESS) |
||
+ | CPU: 385ms |
||
+ | |||
+ | Jun 30 00:49:26 debian-mlnx-01 systemd[1]: Starting mlnx-en.d.service - mlnx-en.d - configure Mellanox devices... |
||
+ | Jun 30 00:49:28 debian-mlnx-01 mlnx-en.d[1505]: [32B blob data] |
||
+ | Jun 30 00:49:29 debian-mlnx-01 systemd[1]: Finished mlnx-en.d.service - mlnx-en.d - configure Mellanox devices. |
||
+ | </pre> |
||
+ | |||
+ | meanwhile in the dmesg output |
||
+ | |||
+ | <pre> |
||
+ | # dmesg -T -w |
||
+ | |||
+ | [Sun Jun 30 00:49:26 2024] Compat-mlnx-ofed backport release: 7037b8d |
||
+ | [Sun Jun 30 00:49:26 2024] Backport based on https://:@git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git 7037b8d |
||
+ | [Sun Jun 30 00:49:26 2024] compat.git: https://:@git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git |
||
+ | [Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: firmware version: 26.32.2004 |
||
+ | [Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: 126.024 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x8 link) |
||
+ | [Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps |
||
+ | [Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048) |
||
+ | [Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: Port module event: module 0, Cable plugged |
||
+ | [Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: mlx5_pcie_event:304:(pid 1398): PCIe slot advertised sufficient power (75W). |
||
+ | [Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic) |
||
+ | [Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.0 enp1s0f0np0: renamed from eth0 |
||
+ | [Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: firmware version: 26.32.2004 |
||
+ | [Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: 126.024 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x8 link) |
||
+ | [Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps |
||
+ | [Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048) |
||
+ | [Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: Port module event: module 1, Cable plugged |
||
+ | [Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: mlx5_pcie_event:304:(pid 1391): PCIe slot advertised sufficient power (75W). |
||
+ | [Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic) |
||
+ | [Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1 enp1s0f1np1: renamed from eth0 |
||
+ | </pre> |
||
+ | |||
+ | Note that 'systemctl stop mlnx-en.d' unloads the mlx modules from memory |
||
+ | |||
+ | <pre> |
||
+ | root@debian-mlnx-01:~# lsmod | grep mlx |
||
+ | mlx5_core 2269184 0 |
||
+ | mlxfw 36864 1 mlx5_core |
||
+ | mlxdevm 180224 1 mlx5_core |
||
+ | mlx_compat 20480 2 mlxdevm,mlx5_core |
||
+ | psample 20480 1 mlx5_core |
||
+ | tls 135168 1 mlx5_core |
||
+ | pci_hyperv_intf 16384 1 mlx5_core |
||
+ | |||
+ | root@debian-mlnx-01:~# systemctl stop mlnx-en.d |
||
+ | root@debian-mlnx-01:~# lsmod | grep mlx |
||
+ | root@debian-mlnx-01:~# |
||
+ | </pre> |
||
+ | |||
+ | ====Using devlink==== |
||
+ | |||
+ | <pre> |
||
+ | root@debian-mlnx-01:~# devlink dev param show pci/0000:01:00.0 name enable_roce |
||
+ | pci/0000:01:00.0: |
||
+ | name enable_roce type generic |
||
+ | values: |
||
+ | cmode driverinit value true |
||
+ | </pre> |
||
+ | |||
+ | ====Monitoring==== |
||
+ | |||
+ | To the touch, the physical heatsink on the card feels quite hot (you cannot keep a finger on it), but this appears to be fine: 'The adapter card incorporates the ConnectX IC, which operates in the range of temperatures between 0°C and 105°C.', https://docs.nvidia.com/networking/display/connectx6lxen/monitoring |
||
+ | |||
+ | <pre> |
||
+ | root@debian-mlnx-01:~# devlink dev |
||
+ | pci/0000:01:00.0 |
||
+ | pci/0000:01:00.1 |
||
+ | root@debian-mlnx-01:~# mget_temp -d 0000:01:00.0 |
||
+ | 82 |
||
+ | root@debian-mlnx-01:~# mget_temp -d 0000:01:00.1 |
||
+ | 83 |
||
+ | </pre> |
||
+ | |||
+ | Useful additional materials |
||
+ | |||
+ | * https://docs.nvidia.com/networking/display/connectx6lxen/monitoring |
||
===Misc=== |
===Misc=== |
||
* https://www.youtube.com/watch?v=XLPgDEbUMgk - 'How to set Mellanox ConnectX VPI to Ethernet or Infiniband in Linux' |
* https://www.youtube.com/watch?v=XLPgDEbUMgk - 'How to set Mellanox ConnectX VPI to Ethernet or Infiniband in Linux' |
||
+ | |||
+ | ===Using E810=== |
||
+ | |||
+ | Useful additional materials |
||
+ | |||
+ | * https://bugzilla.redhat.com/show_bug.cgi?id=2082528 |
||
+ | * https://bugzilla.redhat.com/show_bug.cgi?id=1878026 |
||
===Useful additional materials=== |
===Useful additional materials=== |
Latest revision: 15 July 2024, 01:48
Introduction
Mellanox hardware
Statements
- Mellanox's higher-end network devices fall into two broad categories: 1. SmartNIC, 2. SuperNIC
- SmartNIC - nvidia connectx devices
- SuperNIC - nvidia bluefield devices
ConnectX devices
- 'connectx-6 lx' and 'connectx-6 dx' devices are all ethernet devices (i.e. not infiniband)
- the 'connectx-6' device is physically a universal ethernet/infiniband device, i.e. it can be switched by software to run in ethernet or infiniband mode
Mellanox integrations
In use
- on the Ubuntu v. 24.04 platform
- PVE v. 8.2.2
- QEMU v. 9.0.0
- OVS v. 3.1.0
Ways of using the physical Mellanox card
- the dual-port adapter can be passed through as a whole to a virtual machine in the PVE web GUI (tick 'All Functions'); selecting one PCIe device then hands both the .0 and the .1 device to the virtual machine, where they are used with the mlx5_core driver etc.; ovs is not involved at all
- a single physical port of the dual-port adapter can be passed through to a virtual machine in the PVE web GUI (leave 'All Functions' unticked); selecting one PCIe device then hands only that device, i.e. either the .0 or the .1 device, to the virtual machine, where it is used with the mlx5_core driver etc.; ovs is not involved at all
- the ordinary way via ovs; regular (non-DPDK) ovs is involved
- with the vhost-user driver; ovs-dpdk is involved
- with a representor; ovs-dpdk is involved
- via vdpa; ovs-dpdk is involved
Bringing traffic to the ovs switch with dpdk
Statements
- the goal is to bring as much traffic as possible from the physical network to the ovs switch, using dpdk on the physical network card
- getting the traffic onward to a virtual machine attached to the ovs switch is not covered
- the work is done in an Ubuntu 24.04 environment (because this is perhaps clearer to begin with, compared with starting directly on PVE)
- nothing is compiled, i.e. everything is installed from the standard Ubuntu apt repository
The resulting ovs switch looks like this
<pre>
root@dpdp-u2404:~# ovs-vsctl show
09d915bd-744b-4ff1-a223-983c02f05f3b
    Bridge br0
        datapath_type: netdev
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0"}
        Port br0
            Interface br0
                type: internal
        Port vlan11
            tag: 11
            Interface vlan11
                type: internal
    ovs_version: "3.3.0"
</pre>
And the network works like this
<pre>
root@dpdp-u2404:~# ping -c 4 192.168.1.254
PING 192.168.1.254 (192.168.1.254) 56(84) bytes of data.
64 bytes from 192.168.1.254: icmp_seq=1 ttl=255 time=0.765 ms
64 bytes from 192.168.1.254: icmp_seq=2 ttl=255 time=0.344 ms
64 bytes from 192.168.1.254: icmp_seq=3 ttl=255 time=0.285 ms
64 bytes from 192.168.1.254: icmp_seq=4 ttl=255 time=0.312 ms

--- 192.168.1.254 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3100ms
rtt min/avg/max/mdev = 0.285/0.426/0.765/0.196 ms
</pre>
At the same time nothing is seen on the ordinary network interface, because the kernel does not handle these packets in the usual sense
<pre>
root@dpdp-u2404:~# tcpdump -ni enp15s0f0np0
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp15s0f0np0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
</pre>
To reach this state on Ubuntu v. 24.04, install the OVS-DPDK software; the following guides are useful background
- 'How to use DPDK with Open vSwitch' - https://ubuntu.com/server/docs/how-to-use-dpdk-with-open-vswitch
- 'OVS Offload Using ASAP² Direct' - https://docs.nvidia.com/networking/display/mlnxofedv590590/ovs+offload+using+asap%C2%B2+direct#src-2408744435_safe-id-T1ZTT2ZmbG9hZFVzaW5nQVNBUMKyRGlyZWN0LU9WUy1LZXJuZWxIYXJkd2FyZU9mZmxvYWRz
- 'Using Open vSwitch with DPDK' - https://docs.openvswitch.org/en/latest/howto/dpdk/
root@dpdp-u2404:~# apt-get install openvswitch-switch-dpdk
Among others, these packages are installed as dependencies
- dpdk
- openvswitch-switch
And the ovs processes are started
<pre>
root@dpdp-u2404:~# systemctl | grep ovs | grep runni
ovs-vswitchd.service    loaded active running   Open vSwitch Forwarding Unit
ovsdb-server.service    loaded active running   Open vSwitch Database Unit
</pre>
Note that ovs is driven by the file /var/lib/openvswitch/conf.db ('root@dpdp-u2404:~# less /var/lib/openvswitch/conf.db'), i.e. if something goes wrong while configuring ovs, you can start over by stopping the processes, deleting the files and starting the processes again
<pre>
root@dpdp-u2404:~# systemctl stop ovs-vswitchd
root@dpdp-u2404:~# systemctl stop ovsdb-server
root@dpdp-u2404:~# rm /var/lib/openvswitch/.conf.db.~lock~
root@dpdp-u2404:~# rm /var/lib/openvswitch/conf.db
root@dpdp-u2404:~# systemctl start ovsdb-server
root@dpdp-u2404:~# systemctl start ovs-vswitchd
</pre>
After the software is installed, ovs is running and can be configured further
<pre>
root@dpdp-u2404:~# echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
root@dpdp-u2404:~# update-alternatives --get-selections
root@dpdp-u2404:~# update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk
root@dpdp-u2404:~# update-alternatives --get-selections
root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-lcore-mask=0x1"
root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-alloc-mem=2048"
root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-extra=--allow=0000:0f:00.0"
root@dpdp-u2404:~# systemctl restart openvswitch-switch
</pre>
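The hugepage count written to nr_hugepages is the memory to reserve divided by the page size: 1024 pages of 2048 kB make 2048 MB, the same figure that is passed as dpdk-alloc-mem. A minimal sketch of that arithmetic follows; the function name hugepages_needed is a hypothetical helper, not part of OVS or DPDK.

```shell
#!/bin/sh
# hugepages_needed MEM_MB [PAGE_KB]
# Hypothetical helper: prints how many hugepages of PAGE_KB kilobytes
# (default 2048, i.e. 2 MB pages) cover MEM_MB megabytes, rounding up.
hugepages_needed() {
    mem_mb=$1
    page_kb=${2:-2048}
    echo $(( (mem_mb * 1024 + page_kb - 1) / page_kb ))
}

hugepages_needed 2048            # 2 MB pages
hugepages_needed 2048 1048576    # 1 GB pages
```

The printed value can then be written to the matching /sys/kernel/mm/hugepages/hugepages-<size>kB/nr_hugepages file, as done above.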
To remove an other_config element, run e.g.
ovs-vsctl remove Open_vSwitch . other_config dpdk-socket-mem
To set up the internals of OVS
<pre>
root@dpdp-u2404:~# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
root@dpdp-u2404:~# ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0
root@dpdp-u2404:~# ovs-vsctl add-port br0 vlan11 tag=11 -- set interface vlan11 type=internal
root@dpdp-u2404:~# ifconfig vlan11 192.168.1.57/24
</pre>
Remarks
- characteristically for this solution, one CPU is 100% loaded ('us' user load in top output)
Assessing performance
<pre>
root@dpdp-u2404:~# timeout 30 hping3 -S -c 400000000 --flood -p 53 192.168.1.254

root@dpdp-u2404:~# vnstat -l vlan11

 vlan11  /  traffic statistics

                           rx         |       tx
--------------------------------------+------------------
  bytes                     8.06 KiB  |      456.04 MiB
--------------------------------------+------------------
          max            3.32 kbit/s  |   133.65 Mbit/s
      average            1.83 kbit/s  |   106.27 Mbit/s
          min              944 bit/s  |        0 bit/s
--------------------------------------+------------------
  packets                        138  |         8855498
--------------------------------------+------------------
          max                  7 p/s  |      309383 p/s
      average                  3 p/s  |      245986 p/s
          min                  2 p/s  |           0 p/s
--------------------------------------+------------------
  time                    36 seconds
</pre>
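As a sanity check on the vnstat figures, the transmitted byte count can be divided by the transmitted packet count to get the average size of one frame. The sketch below does this with awk; the function name avg_frame_bytes is a hypothetical helper, and the inputs are the tx values from the table above.

```shell
#!/bin/sh
# avg_frame_bytes MIB PACKETS
# Hypothetical helper: average bytes per packet, given a volume in MiB
# and a packet count, printed with one decimal place.
avg_frame_bytes() {
    awk -v mib="$1" -v pkts="$2" 'BEGIN { printf "%.1f\n", mib * 1048576 / pkts }'
}

avg_frame_bytes 456.04 8855498
```

For these inputs the result is about 54 bytes, which would be consistent with a bare TCP SYN as sent by hping3 -S: 14 bytes of Ethernet header, 20 bytes of IPv4 header and 20 bytes of TCP header.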
For tuning, levers such as these can be used
<pre>
root@dpdp-u2404:~# ovs-vsctl set Interface dpdk-p0 "options:n_rxq=4"
root@dpdp-u2404:~# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x00f0
</pre>
As a result
<pre>
root@dpdp-u2404:~# ovs-vsctl show
09d915bd-744b-4ff1-a223-983c02f05f3b
    Bridge br0
        datapath_type: netdev
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0", n_rxq="2"}
...
</pre>
and 'top -> 1' shows that 0x00f0 selects all of CPUs 4-7. As a result the volume of incoming traffic, i.e. traffic that arrives and is received, is several times larger, e.g.
<pre>
root@dpdp-u2404:~# vnstat -l -i vlan11
Monitoring vlan11...    (press CTRL-C to stop)

   rx:   382.62 Mbit/s 854067 p/s          tx:   198.13 Mbit/s 427007 p/s^C
...
</pre>
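The masks used above (pmd-cpu-mask=0x00f0, dpdk-lcore-mask=0x1) are bitmaps in which bit N stands for CPU N. A small sketch that decodes such a mask into a CPU list may help when choosing values; mask_to_cpus is a hypothetical helper name, not an OVS tool.

```shell
#!/bin/sh
# mask_to_cpus HEXMASK
# Hypothetical helper: prints the CPU numbers whose bits are set in
# HEXMASK (e.g. 0x00f0), separated by spaces.
mask_to_cpus() {
    mask=$(( $1 ))
    cpu=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            out="${out:+$out }$cpu"
        fi
        mask=$(( mask >> 1 ))
        cpu=$(( cpu + 1 ))
    done
    echo "$out"
}

mask_to_cpus 0x00f0   # prints: 4 5 6 7
mask_to_cpus 0x1      # prints: 0
```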
Useful additional materials
- https://enterprise-support.nvidia.com/s/article/mellanox-dpdk
- https://doc.dpdk.org/guides/nics/mlx5.html
- http://www.virtualopensystems.com/en/solutions/guides/snabbswitch-qemu/
Bringing traffic via the ovs switch to a qemu virtual machine with dpdk - vhost-user protocol
Statements
- vhost-user-client
- the work is done in a PVE v. 8.2.2 environment
- as a result the virtual machine can be operated natively from the PVE web GUI etc. (including start and stop), while e.g. its networking works together with the dpdk solution
- nothing is compiled, i.e. everything is installed from the standard Debian v. 12 and PVE apt repositories
- it remains open how the PVE guest's networking should be prepared so that it handles packets delivered via dpdk efficiently (at present it is an ordinary Debian v. 12 machine with virtio networking etc.)
This solution continues and extends the one from the previous section. Preparing the PVE host
- installing the dpdk software - see the previous Ubuntu v. 24.04 section
- configuring the ovs dpdk properties
<pre>
root@pve-moraal-x570:~/20240706# echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
root@pve-moraal-x570:~/20240706# grep -i hugepages /proc/meminfo
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    2048
HugePages_Free:     2048
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

root@pve-moraal-x570:~/20240706# findmnt | grep huge
│ ├─/dev/hugepages                hugetlbfs  hugetlbfs  rw,relatime,pagesize=2M
│ ├─/run/hugepages/kvm/2048kB     hugetlbfs  hugetlbfs  rw,relatime,pagesize=2M
│ └─/run/hugepages/kvm/1048576kB  hugetlbfs  hugetlbfs  rw,relatime,pagesize=1024M

root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-lcore-mask=0x1"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-alloc-mem=2048"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . "other_config:dpdk-extra=--allow=0000:0f:00.0"
root@pve-moraal-x570:~/20240706# systemctl restart openvswitch-switch

root@pve-moraal-x570:~/20240706# ovs-vsctl add-br vmbr0 -- set bridge vmbr0 datapath_type=netdev
root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0
root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 vlan14 tag=14 -- set interface vlan14 type=internal
root@pve-moraal-x570:~/20240706# ifconfig vlan14 192.168.112.169/24

root@pve-moraal-x570:~/20240706# mkdir /var/run/vhostuserclient
root@pve-moraal-x570:~/20240706# ovs-vsctl add-port vmbr0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuserclient "options:vhost-server-path=/var/run/vhostuserclient/vhost-user-client-1"
root@pve-moraal-x570:~/20240706# ovs-vsctl set port vhost-user-1 tag=10

(this is probably not relevant: root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . other_config:hw-offload=true )

root@pve-moraal-x570:~/20240706# ovs-vsctl set Interface dpdk-p0 "options:n_rxq=4"
root@pve-moraal-x570:~/20240706# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x00f0
</pre>
PVE guest configuration file
<pre>
root@pve-moraal-x570:~/20240706# cat /etc/pve/qemu-server/123.conf
agent: 1
bios: ovmf
boot: order=virtio0;ide2;net0
cores: 4
cpu: host
efidisk0: sn_srv_btrfs:123/vm-123-disk-5.raw,efitype=4m,pre-enrolled-keys=1,size=528K
ide2: none,media=cdrom
machine: q35
memory: 1024
name: deb11-tm-tartu-btrfs
# net0: virtio=12:4A:8D:1E:33:3D,bridge=vmbr0,firewall=1,tag=10
numa: 1
onboot: 0
ostype: l26
rng0: source=/dev/urandom
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=0d52990c-3194-4c80-942f-d14621a6e940
sockets: 1
virtio0: sn_srv_btrfs:123/vm-123-disk-0.raw,size=16G
virtio1: sn_srv_btrfs:123/vm-123-disk-1.raw,size=32G
virtio2: sn_srv_btrfs:123/vm-123-disk-2.raw,size=24G
vmgenid: fb558ebc-7d59-4c13-8eae-505349adf9a1
hugepages: 1024
args: -machine q35+pve0,kernel_irqchip=split \
  -device intel-iommu,intremap=on,caching-mode=on \
  -chardev socket,id=char1,path=/var/run/vhostuserclient/vhost-user-client-1,server=on \
  -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce=on,queues=4 \
  -device virtio-net-pci,mac=12:4A:8D:1E:33:3D,netdev=mynet1,mq=on,vectors=10,rx_queue_size=1024,tx_queue_size=256
</pre>
where
- no network interface is configured in the PVE web GUI; instead the network interface is created by hand via 'args'
- 'args' also adds other elements the qemu guest needs (perhaps not all of this is necessary as of 2024, e.g. 'kernel_irqchip=split' appears to be the default anyway)
- the chardev/netdev/device set in 'args' assumes that the ovs preparations for vhostuserclient described above have been made
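The value vectors=10 in the 'args' line is consistent with the rule given in the Open vSwitch vhost-user documentation for multiqueue, vectors = 2 * queues + 2, with queues=4. The sketch below derives the value and prints the three related qemu options; virtio_vectors and vhost_user_args are hypothetical helper names, and the option set is trimmed to the parts that depend on the queue count (rx_queue_size and tx_queue_size are left out).

```shell
#!/bin/sh
# virtio_vectors QUEUES
# Hypothetical helper: MSI-X vector count for a multiqueue virtio-net
# device, following the 2 * queues + 2 rule.
virtio_vectors() {
    echo $(( $1 * 2 + 2 ))
}

# vhost_user_args SOCKET_PATH MAC QUEUES
# Hypothetical helper: prints the chardev/netdev/device options, one per
# line, using the same ids (char1, mynet1) as the configuration above.
vhost_user_args() {
    sock=$1
    mac=$2
    queues=$3
    vectors=$(virtio_vectors "$queues")
    printf '%s\n' \
        "-chardev socket,id=char1,path=${sock},server=on" \
        "-netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce=on,queues=${queues}" \
        "-device virtio-net-pci,mac=${mac},netdev=mynet1,mq=on,vectors=${vectors}"
}

vhost_user_args /var/run/vhostuserclient/vhost-user-client-1 12:4A:8D:1E:33:3D 4
```

If the queue count is changed, the vectors value and the queues option need to change together, which is the reason for deriving one from the other.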
Result
- TODO
Useful additional materials
- https://stackoverflow.com/questions/69710907/connect-qemu-kvm-vms-using-vhost-user-client-and-ovs-dpdk
- https://docs.redhat.com/en/documentation/red_hat_openstack_platform/10/html/network_functions_virtualization_planning_guide/ch-vhost-user-ports
- https://forum.proxmox.com/threads/tutorial-run-open-vswitch-ovs-dpdk-on-pve-7-0.97116/
- https://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/
- https://docs.nvidia.com/networking/display/mlnxenv24040700/ovs+offload+using+asap%C2%B2+direct#src-2958624697_safe-id-T1ZTT2ZmbG9hZFVzaW5nQVNBUMKyRGlyZWN0LWh3dmRwYQ
- https://www.youtube.com/watch?v=y0ASTg3VCCc
- https://www.intel.com/content/www/us/en/developer/articles/technical/data-plane-development-kit-vhost-user-client-mode-with-open-vswitch.html
- https://www.redhat.com/en/blog/journey-vhost-users-realm
- https://www.redhat.com/en/virtio-networking-series
Bringing traffic via the ovs switch to a qemu virtual machine with dpdk - vf representor
First make sure that the physical Mellanox network card is present and that the eswitch is in legacy mode
<pre>
root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]

root@pve-moraal-x570:~/20240706# devlink dev eswitch show pci/0000:0f:00.1
pci/0000:0f:00.1: mode legacy inline-mode none encap-mode basic
root@pve-moraal-x570:~/20240706# devlink dev eswitch show pci/0000:0f:00.0
pci/0000:0f:00.0: mode legacy inline-mode none encap-mode basic
</pre>
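When these steps are scripted, it can be handy to read the eswitch mode out of the devlink output before and after switching. A minimal sketch, assuming the one-line output format shown above; eswitch_mode is a hypothetical helper name.

```shell
#!/bin/sh
# eswitch_mode
# Hypothetical helper: reads `devlink dev eswitch show` output on stdin
# and prints the word that follows the "mode" field (legacy or switchdev).
eswitch_mode() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mode") { print $(i + 1); exit } }'
}

# Sample line copied from the output above.
echo 'pci/0000:0f:00.0: mode legacy inline-mode none encap-mode basic' | eswitch_mode
```

On the host this would be used as `devlink dev eswitch show pci/0000:0f:00.0 | eswitch_mode`. The exact match on the field "mode" keeps "inline-mode" and "encap-mode" from being picked up.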
Enable two VFs
<pre>
root@pve-moraal-x570:~# echo 2 > /sys/class/net/enp15s0f0np0/device/sriov_numvfs
root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
</pre>
Assign MAC addresses to the VFs
<pre>
root@pve-moraal-x570:~# ip link set enp15s0f0np0 vf 0 mac e4:11:22:33:46:50
root@pve-moraal-x570:~# ip link set enp15s0f0np0 vf 1 mac e4:11:22:33:46:51
root@pve-moraal-x570:~# ip link show dev enp15s0f0np0
3: enp15s0f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether e8:eb:d3:0b:78:74 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether e4:11:22:33:46:50 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether e4:11:22:33:46:51 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
</pre>
Unbind the mlx5_core driver from the VF devices
<pre>
root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use
        Kernel driver in use: mlx5_core
        Kernel driver in use: mlx5_core
        Kernel driver in use: mlx5_core
        Kernel driver in use: mlx5_core
root@pve-moraal-x570:~# echo 0000:0f:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
root@pve-moraal-x570:~# echo 0000:0f:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind
root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use
        Kernel driver in use: mlx5_core
        Kernel driver in use: mlx5_core
</pre>
Switch from legacy to switchdev mode
root@pve-moraal-x570:~# devlink dev eswitch set pci/0000:0f:00.0 mode switchdev
Bind the mlx5_core driver to the VF devices again
root@pve-moraal-x570:~# echo 0000:0f:00.2 > /sys/bus/pci/drivers/mlx5_core/bind
root@pve-moraal-x570:~# echo 0000:0f:00.3 > /sys/bus/pci/drivers/mlx5_core/bind
root@pve-moraal-x570:~# lspci -vvv | grep mlx5_ | grep use
Kernel driver in use: mlx5_core
Kernel driver in use: mlx5_core
Kernel driver in use: mlx5_core
Kernel driver in use: mlx5_core
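The order of the three steps (unbind the VFs, change the eswitch mode, bind the VFs again) can be written down as a dry-run helper. This is a sketch under the assumption that all VFs use the mlx5_core driver. The function name switchdev_plan is hypothetical, and it only prints the commands used in these notes without executing them.

```shell
# Print the commands that move a PF into switchdev mode.
# $1: PF PCI address; remaining arguments: VF PCI addresses
switchdev_plan() {
    local pf="$1" vf
    shift
    # 1. detach every VF from the driver
    for vf in "$@"; do
        echo "echo $vf > /sys/bus/pci/drivers/mlx5_core/unbind"
    done
    # 2. change the eswitch mode on the PF
    echo "devlink dev eswitch set pci/$pf mode switchdev"
    # 3. attach the VFs again
    for vf in "$@"; do
        echo "echo $vf > /sys/bus/pci/drivers/mlx5_core/bind"
    done
}
```

`switchdev_plan 0000:0f:00.0 0000:0f:00.2 0000:0f:00.3` prints the five commands in the order used above.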
Restart the OVS processes
root@pve-moraal-x570:~# ovs-vsctl show
9ffe33a7-4486-4701-b4c0-880c61f407b7
    ovs_version: "3.1.0"
root@pve-moraal-x570:~# systemctl restart openvswitch-switch
root@pve-moraal-x570:~# ovs-vsctl show
9ffe33a7-4486-4701-b4c0-880c61f407b7
    ovs_version: "3.1.0"
Reserve hugepages
# echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
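As a quick check of what this reservation costs: 2048 pages of 2048 kB each add up to 4096 MiB (4 GiB) of RAM taken out of general use. The helper below is a small sketch of that arithmetic. The name hugepage_pool_mib is hypothetical.

```shell
# Memory reserved by a hugepage pool, in MiB.
# $1: number of pages
# $2: page size in kB (2048 for the pool configured above)
hugepage_pool_mib() {
    echo $(( $1 * $2 / 1024 ))
}
```

`hugepage_pool_mib 2048 2048` prints 4096.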
Apply the general OVS settings
root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
root@pve-moraal-x570:~# ovs-vsctl set Open_vSwitch . other_config:dpdk-extra="-a 0000:0f:00.0,representor=0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1"
root@pve-moraal-x570:~# systemctl restart openvswitch-switch
Create the OVS bridge and ports
root@pve-moraal-x570:~# ovs-vsctl add-br vmbr0 -- set bridge vmbr0 datapath_type=netdev
root@pve-moraal-x570:~# ovs-vsctl add-port vmbr0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:0f:00.0
root@pve-moraal-x570:~# ovs-vsctl add-port vmbr0 representor -- set Interface representor type=dpdk options:dpdk-devargs=0000:0f:00.0,representor=0
root@pve-moraal-x570:~# ovs-vsctl set port representor tag=10
As a result
root@pve-moraal-x570:~# ovs-vsctl list open_vSwitch
_uuid               : 9ffe33a7-4486-4701-b4c0-880c61f407b7
bridges             : [1d7619bb-7b33-4959-9266-325808d72c13]
cur_cfg             : 7
datapath_types      : [netdev, system]
datapaths           : {}
db_version          : "8.3.1"
dpdk_initialized    : true
dpdk_version        : "DPDK 22.11.5"
external_ids        : {hostname=pve-moraal-x570.sise.moraal.ee, rundir="/var/run/openvswitch", system-id="9e871242-44eb-499f-a049-32f089e65f68"}
iface_types         : [afxdp, afxdp-nonpmd, bareudp, dpdk, dpdkvhostuser, dpdkvhostuserclient, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options     : []
next_cfg            : 7
other_config        : {dpdk-extra="-a 0000:0f:00.0,representor=0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1", dpdk-init="true", hw-offload="true"}
ovs_version         : "3.1.0"
ssl                 : []
statistics          : {}
system_type         : debian
system_version      : "12"
and
root@pve-moraal-x570:~# ovs-vsctl show
9ffe33a7-4486-4701-b4c0-880c61f407b7
    Bridge vmbr0
        datapath_type: netdev
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0"}
        Port vmbr0
            Interface vmbr0
                type: internal
        Port representor
            tag: 10
            Interface representor
                type: dpdk
                options: {dpdk-devargs="0000:0f:00.0,representor=0"}
    ovs_version: "3.1.0"
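Only the representor of VF 0 is attached in these notes. A representor for another VF would presumably follow the same pattern with a different index. The sketch below prints such a command for review. The function name representor_port_cmd and the port name rep-vfN are hypothetical, and the command has not been tested on this host for any VF other than 0.

```shell
# Print the ovs-vsctl command that attaches the representor of one VF.
# $1: bridge name
# $2: PF PCI address
# $3: VF index
# $4: VLAN tag for the port
representor_port_cmd() {
    local br="$1" pf="$2" vf="$3" tag="$4"
    printf 'ovs-vsctl add-port %s rep-vf%s tag=%s -- set Interface rep-vf%s type=dpdk options:dpdk-devargs=%s,representor=%s\n' \
        "$br" "$vf" "$tag" "$vf" "$pf" "$vf"
}
```

`representor_port_cmd vmbr0 0000:0f:00.0 1 10` prints the command for VF 1 on VLAN 10.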
PVE virtual machine configuration. The representor solution works such that the VF device itself has to be handed to the virtual machine in the usual way (PCI passthrough, see hostpci0)
root@pve-moraal-x570:~# cat /etc/pve/qemu-server/123.conf
agent: 1
bios: ovmf
boot: order=virtio0;ide2;net0
cores: 4
cpu: host
efidisk0: sn_srv_btrfs:123/vm-123-disk-5.raw,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: 0000:0f:00.2
hugepages: 1024
ide2: none,media=cdrom
machine: q35,viommu=intel
memory: 1024
name: deb11-tm-tartu-btrfs
numa: 1
onboot: 0
ostype: l26
rng0: source=/dev/urandom
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=0d52990c-3194-4c80-942f-d14621a6e940
sockets: 1
vga: virtio
virtio0: sn_srv_btrfs:123/vm-123-disk-0.raw,size=16G
virtio1: sn_srv_btrfs:123/vm-123-disk-1.raw,size=32G
virtio2: sn_srv_btrfs:123/vm-123-disk-2.raw,size=24G
vmgenid: fb558ebc-7d59-4c13-8eae-505349adf9a1
As a result
- the network works
- performance is moderate
- the virtual machine uses the regular mlx5_core driver for the network device
root@tm-tartu-x570:~# lspci | grep Mella
06:10.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
On the PVE host the characteristic 100% utilisation of a single CPU is visible
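The single fully loaded CPU is most likely the OVS-DPDK poll mode (PMD) thread, which busy-polls its ports instead of waiting for interrupts. These notes do not confirm this for this host. Assuming that is the cause, the cores used for polling can be chosen with other_config:pmd-cpu-mask, which takes a hexadecimal bit mask. The helper below is a sketch for building such a mask. The name cpu_mask is hypothetical.

```shell
# Build a hexadecimal CPU mask from a list of core numbers.
# Each argument is a core index; bit N of the mask selects core N.
cpu_mask() {
    local mask=0 core
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))
    done
    printf '0x%x\n' "$mask"
}
```

Cores 2 and 3 are only example values: `ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask="$(cpu_mask 2 3)"` would place the PMD threads there.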
Useful additional material
- https://docs.openvswitch.org/en/latest/topics/dpdk/phy/#
- https://github.com/Mellanox/scalablefunctions/wiki/Upstream-step-by-step-guide
- https://www.youtube.com/watch?v=37MN8C_MNuQ
Delivering traffic with DPDK through the OVS switch to a QEMU virtual machine - vDPA
TODO
Useful additional material
Kernel TLS
TODO
Inbox driver and utilities
# apt-get install mstflint
To change the SR-IOV setting of the card, run
root@pve-moraal-x570:~# mstconfig -d 0000:0f:00.0 set SRIOV_EN=False
Device #1:
----------
Device type:    ConnectX6LX
Name:           MCX631102AN-ADA_Ax
Description:    ConnectX-6 Lx EN adapter card; 25GbE ; Dual-port SFP28; PCIe 4.0 x8; No Crypto
Device:         0000:0f:00.0
Configurations:          Next Boot       New
         SRIOV_EN        True(1)         False(0)
 Apply new Configuration? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
As a result the VF capability is gone
root@pve-moraal-x570:~# ls -ld /sys/class/net/enp15s0f0np0/device/s*
lrwxrwxrwx 1 root root    0 Jun 30 03:32 /sys/class/net/enp15s0f0np0/device/subsystem -> ../../../../bus/pci
-r--r--r-- 1 root root 4096 Jun 30 03:35 /sys/class/net/enp15s0f0np0/device/subsystem_device
-r--r--r-- 1 root root 4096 Jun 30 03:35 /sys/class/net/enp15s0f0np0/device/subsystem_vendor
After switching it back on, the VF capability is available again
root@pve-moraal-x570:~# mstconfig -d 0000:0f:00.0 set SRIOV_EN=True
root@pve-moraal-x570:~# ls -ld /sys/class/net/enp15s0f0np0/device/s*
-rw-r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_drivers_autoprobe
-rw-r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_numvfs
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_offset
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_stride
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_totalvfs
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_vf_device
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/sriov_vf_total_msix
lrwxrwxrwx 1 root root    0 Jun 30 03:39 /sys/class/net/enp15s0f0np0/device/subsystem -> ../../../../bus/pci
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/subsystem_device
-r--r--r-- 1 root root 4096 Jun 30 03:43 /sys/class/net/enp15s0f0np0/device/subsystem_vendor
root@pve-moraal-x570:~# cat /sys/class/net/enp15s0f0np0/device/sriov_totalvfs
8
Secure Boot must be disabled in order to make these changes.
mstlink utility
root@pve-moraal-x570:~# mstlink -d 0000:0f:00.0 --show_device
Operational Info
----------------
State                           : Polling
Physical state                  : ETH_AN_FSM_ENABLE
Speed                           : N/A
Width                           : N/A
FEC                             : N/A
Loopback Mode                   : No Loopback
Auto Negotiation                : FORCE - 25G,10G,1G
Supported Info
--------------
Enabled Link Speed (Ext.)       : 0x00000052 (25G,10G,1G)
Supported Cable Speed (Ext.)    : 0x00000003 (1G,100M)
Troubleshooting Info
--------------------
Status Opcode                   : 36
Group Opcode                    : PHY FW
Recommendation                  : Force Mode no partner detected.
Tool Information
----------------
Firmware Version                : 26.41.1000
amBER Version                   : 2.05
MSTFLINT Version                : mstflint 4.21.0
Device Info
-----------
Part Number                     : N/A
Part Name                       : N/A
Serial Number                   : N/A
Revision                        : N/A
FW Version                      : 26.41.1000
Note: P/N, Product Name, S/N and Revision are supported only in switches
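For scripted checks a single field can be pulled out of this output. This is a sketch assuming the "Name : value" layout shown above. The function name mstlink_field is hypothetical, and values that themselves contain a colon would be cut short.

```shell
# Read "mstlink --show_device" style output on stdin and print one field.
# $1: field name exactly as printed, e.g. "State" or "Physical state"
mstlink_field() {
    # Split each line on the colon and the spaces around it,
    # then print the value of the first line whose name matches.
    awk -F ' *: *' -v key="$1" '$1 == key { print $2; exit }'
}
```

For example, `mstlink -d 0000:0f:00.0 --show_device | mstlink_field State` would print Polling for the link state captured above.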
Using Virtual Functions. The starting point is the default state
root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
With 4 virtual functions enabled, lspci shows
root@pve-moraal-x570:~# lspci | grep Mellanox
0f:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
0f:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.4 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0f:00.5 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
The following devices are created
root@pve-moraal-x570:~# ip link show dev enp15s0f0np0
50: enp15s0f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether e8:eb:d3:0b:78:74 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 2     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 3     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
root@pve-moraal-x570:~# ip link show dev enp15s0f0v0
56: enp15s0f0v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 3a:48:d1:d6:57:dd brd ff:ff:ff:ff:ff:ff
On the host a VF device can be used practically like an ordinary network device. The other option is to pass it on to a PVE virtual machine with PCIe passthrough. In the virtual machine the card looks like this
root@pve-sdn-01:~# lspci | grep Mell
01:00.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
root@pve-sdn-01:~# ethtool -i enp1s0
driver: mlx5_core
version: 6.8.8-1-pve
firmware-version: 26.41.1000 (MT_0000000531)
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
where
- the virtual machine can use the same mlx5 driver that is used on the PVE host
Using DPDK
Install the dpdk and dpdk-dev packages
# apt-get install dpdk dpdk-dev
As a result the system contains, among others, the following utilities
- dpdk-devbind.py
- dpdk-hugepages.py
- dpdk-testpmd
Assessing the situation, see https://doc.dpdk.org/guides/nics/mlx5.html -> 'Usage example'
root@pve-moraal-x570:~# ls -d /sys/class/net/*/device/infiniband_verbs/uverbs* | cut -d / -f 5
enp15s0f0np0
enp15s0f1np1
Starting testpmd. It is not entirely clear from the output whether this counts as a successful start
root@pve-moraal-x570:~# dpdk-testpmd -l 8-15 -n 4 -a 0f:00.0 -a 0f:00.1 -- --rxq=2 --txq=2 -i
EAL: Detected CPU lcores: 24
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: Probe PCI driver: mlx5_pci (15b3:101f) device: 0000:0f:00.0 (socket -1)
EAL: Probe PCI driver: mlx5_pci (15b3:101f) device: 0000:0f:00.1 (socket -1)
Interactive-mode selected
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=203456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: E8:EB:D3:0B:78:74
Configuring Port 1 (socket 0)
Port 1: E8:EB:D3:0B:78:75
Checking link statuses...
Done
testpmd>
For example
root@pve-moraal-x570:~# dpdk-testpmd -l 6-9 -n 4 -a 0f:00.0 -- --rxq=4 --txq=4 -i
testpmd> set fwd txonly
testpmd> show port stats all
  ######################## NIC statistics for port 0  ########################
  RX-packets: 0          RX-missed: 0          RX-bytes:  0
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 5153664    TX-errors: 0          TX-bytes:  329834496
  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:      1420582          Tx-bps:    727338208
  ############################################################################
and listening on the network on the other port shows
root@pve-moraal-x570:~# tcpdump -c 4 -nei enp15s0f1np1
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp15s0f1np1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
09:07:53.700764 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
09:07:53.700765 e8:eb:d3:0b:78:74 > 02:00:00:00:00:00, ethertype IPv4 (0x0800), length 64: 198.18.0.1.9 > 198.18.0.2.9: UDP, length 22
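The testpmd counters and the tcpdump capture can be cross-checked with simple arithmetic: TX-bytes divided by TX-packets gives 64 bytes per frame, the same length tcpdump reports, and 1420582 packets per second at 64 bytes is about 727 Mbit/s, in line with Tx-bps. The helpers below are a sketch of that calculation. Their names are hypothetical and they use integer arithmetic only.

```shell
# Average frame size in bytes from cumulative counters.
# $1: byte counter, $2: packet counter
avg_frame_bytes() {
    echo $(( $1 / $2 ))
}

# Throughput in Mbit/s (rounded down) for a packet rate and frame size.
# $1: packets per second, $2: frame size in bytes
rate_mbps() {
    echo $(( $1 * $2 * 8 / 1000000 ))
}
```

`avg_frame_bytes 329834496 5153664` prints 64 and `rate_mbps 1420582 64` prints 727.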
Useful additional material
- https://www.youtube.com/watch?v=KX1QOqMtchg
- https://www.youtube.com/watch?v=0yDdMWQPCOI
- https://www.youtube.com/watch?v=Un5-AN4nb9s
Misc
# lspci | grep 3d:00
3d:00.0 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
3d:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
devlink show data
# devlink dev show
pci/0000:3d:00.0
pci/0000:3d:00.1
and devlink info
# devlink dev info
pci/0000:3d:00.0:
  driver mlx5_core
  versions:
      fixed:
        fw.psid SM_1281000001000
      running:
        fw.version 26.35.2000
        fw 26.35.2000
      stored:
        fw.version 26.35.2000
        fw 26.35.2000
pci/0000:3d:00.1:
  driver mlx5_core
  versions:
      fixed:
        fw.psid SM_1281000001000
      running:
        fw.version 26.35.2000
        fw 26.35.2000
      stored:
        fw.version 26.35.2000
        fw 26.35.2000
ethtool data
# ethtool -i ens7f0np0
driver: mlx5_core
version: 5.15.0-92-generic
firmware-version: 26.35.2000 (SM_1281000001000)
expansion-rom-version:
bus-info: 0000:3d:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
where
- the firmware version on the card is the same in the ethtool and devlink dev info output - 26.35.2000
Using the vendor's MLNX EN software
Terms
- MFT - NVIDIA Firmware Tools, probably originally Mellanox Firmware Tools
Using the physical device as a whole in passthrough mode
Claims
- in general, in a Proxmox v. 8 environment a Mellanox device can be handed over to a virtual machine in the usual way (choose 'Add -> PCI device' in the PVE web GUI and point at the first MLNX device; the second one is added automatically)
- it seems that a physical Mellanox device cannot be passed to a virtual machine as a whole in a fully complete way; for one thing, the SR-IOV capability is lost
- in the virtual machine the network device can be used as a PF; VFs are not accessible
- a vIOMMU can be added to the virtual machine in the regular PVE web GUI, after which the devices of the virtual machine, including the different physical ports of the MLNX adapter, are placed in different IOMMU groups
With this arrangement the vfio driver typically keeps handling the device on the host
root@pve-moraal-x570:~# lspci -vvv | grep vfio
Kernel driver in use: vfio-pci
Kernel driver in use: vfio-pci
To get the device back on the host under the mlx driver without rebooting, it is enough to
- stop the virtual machine
- run on the host
echo 1 > /sys/bus/pci/devices/0000\:0f\:00.0/remove
echo 1 > /sys/bus/pci/devices/0000\:0f\:00.1/remove
echo 1 > /sys/bus/pci/rescan
Installing the software
TODO
root@debian-mlnx-01:~# mount /root/mlnx-en-24.04-0.6.6.0-debian12.1-x86_64.iso /mnt/mlnx
root@debian-mlnx-01:~# find /lib/modules/6.1.0-22-amd64/ -type f -mmin -20 -ls | grep dkms
661960 4312 -rw-r--r-- 1 root root 4415205 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx5_core.ko
661959   28 -rw-r--r-- 1 root root   25237 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx_compat.ko
661961    8 -rw-r--r-- 1 root root    5565 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlx5_ib.ko
661962   52 -rw-r--r-- 1 root root   49445 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlxfw.ko
661963  208 -rw-r--r-- 1 root root  210797 Jun 29 22:07 /lib/modules/6.1.0-22-amd64/updates/dkms/mlxdevm.ko
Updating the firmware
root@debian-mlnx-01:~# apt-get install mlnx-fw-updater
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following package was automatically installed and is no longer required:
  linux-image-6.1.0-15-amd64
Use 'apt autoremove' to remove it.
The following NEW packages will be installed:
  mlnx-fw-updater
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/50.2 MB of archives.
After this operation, 87.9 MB of additional disk space will be used.
Get:1 file:/mnt/mlnx/DEBS_ETH ./ mlnx-fw-updater 24.04-0.6.6.0 [50.2 MB]
Selecting previously unselected package mlnx-fw-updater.
(Reading database ... 63847 files and directories currently installed.)
Preparing to unpack .../mlnx-fw-updater_24.04-0.6.6.0_amd64.deb ...
Unpacking mlnx-fw-updater (24.04-0.6.6.0) ...
Setting up mlnx-fw-updater (24.04-0.6.6.0) ...
Initializing...
Attempting to perform Firmware update...
Querying Mellanox devices firmware ...
Device #1:
----------
  Device Type:      ConnectX6LX
  Part Number:      MCX631102AN-ADA_Ax
  Description:      ConnectX-6 Lx EN adapter card; 25GbE ; Dual-port SFP28; PCIe 4.0 x8; No Crypto
  PSID:             MT_0000000531
  PCI Device Name:  01:00.0
  Base GUID:        e8ebd303000b7874
  Base MAC:         e8ebd30b7874
  Versions:         Current        Available
     FW             26.32.2004     26.41.1000
     PXE            3.6.0502       3.7.0400
     UEFI           14.25.0018     14.34.0012
  Status:           Update required
---------
Found 1 device(s) requiring firmware update...
Device #1: Updating FW ...
FSMST_INITIALIZE - OK
Writing Boot image component - OK
Done
Restart needed for updates to take effect.
Log File: /tmp/oaFVUkaJsl
Real log file: /tmp/mlnx_fw_update.log
root@debian-mlnx-01:~# less /tmp/mlnx_fw_update.log
CMD: mlxup -u --log-on-update --ssl-certificate /tmp/OloIGrYWuz/mlxfwmanager_sriov_dis_x86_64_4127-dir/ca-bundle.crt --current-dir /opt/mellanox/mlnx-fw-updater/ -L /tmp/oaFVUkaJsl -y -d 01:00.0
Querying Mellanox devices firmware ...
Device #1:
----------
...
It appears that as a result the card holds two versions of the firmware
root@debian-mlnx-01:~# devlink dev info
pci/0000:01:00.0:
  driver mlx5_core
  versions:
      fixed:
        fw.psid MT_0000000531
      running:
        fw.version 26.32.2004
        fw 26.32.2004
      stored:
        fw.version 26.41.1000
        fw 26.41.1000
pci/0000:01:00.1:
  driver mlx5_core
  versions:
      fixed:
        fw.psid MT_0000000531
      running:
        fw.version 26.32.2004
        fw 26.32.2004
      stored:
        fw.version 26.41.1000
        fw 26.41.1000
where
- running version - 26.32.2004
- stored version - 26.41.1000
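A difference between the running and the stored version means the new firmware is waiting for a restart. The sketch below lists such devices from `devlink dev info` output. It assumes the layout shown in these notes (a `pci/...:` line per device, then `running:` and `stored:` sections each containing an `fw.version` line). The function name pending_fw is hypothetical.

```shell
# Read "devlink dev info" output on stdin and list devices whose
# running firmware version differs from the stored one.
pending_fw() {
    awk '
        /^pci\// { dev = $1; sub(/:$/, "", dev); section = "" }
        $1 == "running:" { section = "running" }
        $1 == "stored:"  { section = "stored" }
        $1 == "fw.version" {
            if (section == "running") run[dev] = $2
            if (section == "stored" && run[dev] != $2)
                print dev, "running", run[dev], "stored", $2
        }
    '
}
```

Used as `devlink dev info | pending_fw`; no output means running and stored versions agree on every device.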
systemd unit mlnx-en.d
It seems that the mlx drivers are handled by a systemd unit
root@debian-mlnx-01:~# dpkg -S /etc/mlnx-en.conf
mlnx-en-utils: /etc/mlnx-en.conf
root@debian-mlnx-01:~# cat /etc/mlnx-en.conf
# Allow calling the service script with the option 'stop' for unloading the driver stack.
# This flag should be disabled when the OS root file system is on remote storage.
ALLOW_STOP=yes
# Run sysctl performance tuning script
RUN_SYSCTL=no
# Run /usr/sbin/mlnx_tune
RUN_MLNX_TUNE=no
# Load MLX4 modules
MLX4_LOAD=no
# Load MLX5 modules
MLX5_LOAD=yes
root@debian-mlnx-01:~# systemctl start mlnx-en.d
root@debian-mlnx-01:~# systemctl status mlnx-en.d
● mlnx-en.d.service - mlnx-en.d - configure Mellanox devices
     Loaded: loaded (/lib/systemd/system/mlnx-en.d.service; enabled; preset: enabled)
     Active: active (exited) since Sun 2024-06-30 00:49:29 EEST; 6s ago
       Docs: file:/etc/mlnx-en.conf
    Process: 1505 ExecStart=/etc/init.d/mlnx-en.d start (code=exited, status=0/SUCCESS)
   Main PID: 1505 (code=exited, status=0/SUCCESS)
        CPU: 385ms
Jun 30 00:49:26 debian-mlnx-01 systemd[1]: Starting mlnx-en.d.service - mlnx-en.d - configure Mellanox devices...
Jun 30 00:49:28 debian-mlnx-01 mlnx-en.d[1505]: [32B blob data]
Jun 30 00:49:29 debian-mlnx-01 systemd[1]: Finished mlnx-en.d.service - mlnx-en.d - configure Mellanox devices.
at the same time in the dmesg output
# dmesg -T -w
[Sun Jun 30 00:49:26 2024] Compat-mlnx-ofed backport release: 7037b8d
[Sun Jun 30 00:49:26 2024] Backport based on https://:@git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git 7037b8d
[Sun Jun 30 00:49:26 2024] compat.git: https://:@git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: firmware version: 26.32.2004
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: 126.024 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x8 link)
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: Port module event: module 0, Cable plugged
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: mlx5_pcie_event:304:(pid 1398): PCIe slot advertised sufficient power (75W).
[Sun Jun 30 00:49:26 2024] mlx5_core 0000:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.0 enp1s0f0np0: renamed from eth0
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: firmware version: 26.32.2004
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: 126.024 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x8 link)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: Port module event: module 1, Cable plugged
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: mlx5_pcie_event:304:(pid 1391): PCIe slot advertised sufficient power (75W).
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 basic)
[Sun Jun 30 00:49:27 2024] mlx5_core 0000:01:00.1 enp1s0f1np1: renamed from eth0
Note that 'systemctl stop mlnx-en.d' unloads the mlx modules from memory
root@debian-mlnx-01:~# lsmod | grep mlx
mlx5_core            2269184  0
mlxfw                  36864  1 mlx5_core
mlxdevm               180224  1 mlx5_core
mlx_compat             20480  2 mlxdevm,mlx5_core
psample                20480  1 mlx5_core
tls                   135168  1 mlx5_core
pci_hyperv_intf        16384  1 mlx5_core
root@debian-mlnx-01:~# systemctl stop mlnx-en.d
root@debian-mlnx-01:~# lsmod | grep mlx
root@debian-mlnx-01:~#
Using devlink
root@debian-mlnx-01:~# devlink dev param show pci/0000:01:00.0 name enable_roce
pci/0000:01:00.0:
  name enable_roce type generic
    values:
      cmode driverinit value true
Monitoring
To a human the physical heatsink attached to the card feels quite hot (too hot to keep a finger on it), and this appears to be essentially fine: 'The adapter card incorporates the ConnectX IC, which operates in the range of temperatures between 0°C and 105°C.', https://docs.nvidia.com/networking/display/connectx6lxen/monitoring
root@debian-mlnx-01:~# devlink dev
pci/0000:01:00.0
pci/0000:01:00.1
root@debian-mlnx-01:~# mget_temp -d 0000:01:00.0
82
root@debian-mlnx-01:~# mget_temp -d 0000:01:00.1
83
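The readings can be compared against the 105°C upper limit quoted above in a monitoring script. This is a minimal sketch: the function name temp_status and the default 10°C warning margin are assumptions made for illustration, not vendor recommendations.

```shell
# Classify a temperature reading against the 105 degree upper limit.
# $1: reading in degrees Celsius, as printed by mget_temp
# $2: warning margin below the limit (hypothetical default: 10)
temp_status() {
    local temp="$1" margin="${2:-10}" limit=105
    if [ "$temp" -ge "$limit" ]; then
        echo critical
    elif [ "$temp" -ge $(( limit - margin )) ]; then
        echo warning
    else
        echo ok
    fi
}
```

With the values measured here, `temp_status 82` and `temp_status 83` both print ok.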
Useful additional material
Misc
- https://www.youtube.com/watch?v=XLPgDEbUMgk - 'How to set Mellanox ConnectX VPI to Ethernet or Infiniband in Linux'
Using E810
Useful additional material
- https://bugzilla.redhat.com/show_bug.cgi?id=2082528
- https://bugzilla.redhat.com/show_bug.cgi?id=1878026