Intel E810
Introduction
Terms
- epct - Ethernet Port Configuration Tool
Hardware configuration
Computer BIOS
TODO
HII
TODO
The devlink utility
TODO
The epct utility - OS command line
epct could be a very relevant tool, but as of spring 2024 it essentially does not work, neither on Linux kernel v6.x nor v5.x.
With the ice driver loaded:
# ./epct64e -devices
Ethernet Port Configuration Tool
EPCT version: v1.40.05.05
Copyright 2019 - 2023 Intel Corporation.

Cannot initialize port: [00:129:00:00] Intel(R) Ethernet Controller E810-XXV for SFP
Cannot initialize port: [00:129:00:01] Intel(R) Ethernet Controller E810-XXV for SFP
Error: Cannot initialize adapter.
With the ice driver not loaded:
# ./epct64e -devices
Ethernet Port Configuration Tool
EPCT version: v1.40.05.05
Copyright 2019 - 2023 Intel Corporation.

Base driver not supported or not present: [00:129:00:00] Intel(R) Ethernet Controller E810-XXV for SFP

NIC Seg:Bus:Fun Ven-Dev Connector Ports Speed    Quads  Lanes per PF
=== ============= ========= ========= ===== ======== ====== ============
 1) 000:129:00-01 8086-159B SFP       2     - Gbps   N/A    N/A

Error: Base driver is not available for one or more adapters. Please ensure the driver is correctly attached to the device.
The epct utility - EFI application
The epct EFI application works to some extent, but it does not allow doing 'svio enable', https://www.intel.com/content/www/us/en/download/19440/ethernet-port-configuration-tool-efi.html
The nvmupdate EFI application itself reports no error, but it also fails to recognize that an E810 card is present in the system, https://www.intel.com/content/www/us/en/download/19629/non-volatile-memory-nvm-update-utility-for-intel-ethernet-network-adapters-e810-series-efi.html
Misc
One physical dual-port network card is in use
# lspci | grep -i net
81:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
81:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
The regular network devices in the operating system look like this
root@pve-02:~# ethtool enp129s0f1np1
Settings for enp129s0f1np1:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseT/Full
25000baseCR/Full
25000baseSR/Full
1000baseX/Full
10000baseCR/Full
10000baseSR/Full
10000baseLR/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: Yes
Supported FEC modes: None RS BASER
Advertised link modes: 25000baseCR/Full
10000baseCR/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Advertised FEC modes: None RS BASER
Link partner advertised link modes: Not reported
Link partner advertised pause frame use: No
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 25000Mb/s
Duplex: Full
Auto-negotiation: on
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Supports Wake-on: g
Wake-on: d
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
root@pve-02:~# ethtool -i enp129s0f1np1
driver: ice
version: 6.5.13-5-pve
firmware-version: 4.40 0x8001c7d4 1.3534.0
expansion-rom-version:
bus-info: 0000:81:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
From the devlink view it looks like this
# devlink dev info
pci/0000:81:00.0:
driver ice
serial_number 00-01-00-ff-ff-00-00-00
versions:
fixed:
board.id K58132-000
running:
fw.mgmt 7.4.13
fw.mgmt.api 1.7.11
fw.mgmt.build 0xded4446f
fw.undi 1.3534.0
fw.psid.api 4.40
fw.bundle_id 0x8001c7d4
fw.app.name ICE OS Default Package
fw.app 1.3.36.0
fw.app.bundle_id 0xc0000001
fw.netlist 4.4.5000-2.15.0
fw.netlist.build 0x0ba411b9
stored:
fw.undi 1.3534.0
fw.psid.api 4.40
fw.bundle_id 0x8001c7d4
fw.netlist 4.4.5000-2.15.0
fw.netlist.build 0x0ba411b9
...
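Right after an NVM update the stored image can differ from the running one until the next reload. A small sketch, assuming a trimmed sample of the `devlink dev info` output above; on a live system feed it the real output of `devlink dev info pci/0000:81:00.0` instead:

```shell
#!/bin/sh
# Compare the running vs stored fw.bundle_id from `devlink dev info` output.
# The heredoc is a trimmed sample copied from this document.
info=$(cat <<'EOF'
pci/0000:81:00.0:
  driver ice
  versions:
    running:
      fw.bundle_id 0x8001c7d4
    stored:
      fw.bundle_id 0x8001c7d4
EOF
)
# print fw.bundle_id only while inside the "running:" block
running=$(printf '%s\n' "$info" | awk '/running:/{r=1} /stored:/{r=0} r && /fw.bundle_id/{print $2}')
# print fw.bundle_id from the "stored:" block onward
stored=$(printf '%s\n' "$info" | awk '/stored:/{s=1} s && /fw.bundle_id/{print $2}')
if [ "$running" = "$stored" ]; then
    echo "firmware in sync: $running"
else
    echo "stored image $stored differs from running $running (reload/reboot pending)"
fi
```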
Using RoCEv2
It is important that the IP configuration used below is attached directly to the physical device. A setup where the host runs e.g. an OVS virtual switch, the IP sits on vlan47 and the physical enp129s0f1np1 hangs off the same OVS bridge does not work.
root@pve-02:~# ip addr show dev enp129s0f1np1
3: enp129s0f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 3c:ec:ef:e6:69:b9 brd ff:ff:ff:ff:ff:ff
inet 10.47.218.226/24 scope global enp129s0f1np1
valid_lft forever preferred_lft forever
inet6 fe80::3eec:efff:fee6:69b9/64 scope link
valid_lft forever preferred_lft forever
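The constraint above can be checked mechanically: parse `ip -o addr show` and verify which device actually carries the address. A sketch, where the heredoc holds a sample line in the one-line `-o` format and `want_ip`/`want_dev` are this document's example values:

```shell
#!/bin/sh
# Check that the RDMA-facing IP sits directly on the physical port,
# not on a bridge or VLAN device. Sample line mimics `ip -o addr show`.
want_ip=10.47.218.226
want_dev=enp129s0f1np1
addrs=$(cat <<'EOF'
3: enp129s0f1np1 inet 10.47.218.226/24 scope global enp129s0f1np1
EOF
)
# field 2 is the device, field 4 the address/prefix
dev=$(printf '%s\n' "$addrs" | awk -v ip="$want_ip" '$4 ~ ip"/" {print $2}')
if [ "$dev" = "$want_dev" ]; then
    echo "OK: $want_ip is directly on $want_dev"
else
    echo "WARNING: $want_ip is on '$dev', not on $want_dev (RoCE will not work)"
fi
```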
Among other things, this device supports the iWARP and RoCEv2 protocols; from the devlink view it looks like this
# devlink dev param
pci/0000:81:00.0:
name enable_roce type generic
values:
cmode runtime value true
name enable_iwarp type generic
values:
cmode runtime value false
pci/0000:81:00.1:
name enable_roce type generic
values:
cmode runtime value false
name enable_iwarp type generic
values:
cmode runtime value true
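The `devlink dev param` output above can be summarized per PCI function with a little awk. A sketch, with the sample output repeated in a heredoc; on a live system pipe `devlink dev param` in instead:

```shell
#!/bin/sh
# Report which RDMA protocol (enable_roce / enable_iwarp) is active per PF,
# parsed from `devlink dev param` output (sample copied from this document).
params=$(cat <<'EOF'
pci/0000:81:00.0:
  name enable_roce type generic
    values:
      cmode runtime value true
  name enable_iwarp type generic
    values:
      cmode runtime value false
pci/0000:81:00.1:
  name enable_roce type generic
    values:
      cmode runtime value false
  name enable_iwarp type generic
    values:
      cmode runtime value true
EOF
)
printf '%s\n' "$params" | awk '
  /^pci\//            { dev = $1; sub(":$", "", dev) }  # remember current PF
  /name enable_/      { param = $2 }                    # remember current param
  /cmode runtime value true/ { print dev, param, "enabled" }'
```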
Only one of the two can be active at a time; switching between them is done with devlink, e.g.
# devlink dev param set pci/0000:81:00.1 name enable_iwarp value false cmode runtime
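Since only one of enable_roce/enable_iwarp may be true at a time, a mode switch is a disable-then-enable pair. A hedged helper sketch; the device name is the example from above, and with DRY_RUN=1 (the default here) the script only prints the devlink commands instead of running them:

```shell
#!/bin/sh
# Switch one PCI function between RoCEv2 and iWARP via devlink runtime params.
# DRY_RUN=1 prints the commands; set DRY_RUN=0 to actually apply them.
DRY_RUN=${DRY_RUN:-1}
dev=pci/0000:81:00.1          # example PF from this document
mode=${1:-roce}               # "roce" or "iwarp"
case $mode in
    roce)  off=enable_iwarp; on=enable_roce ;;
    iwarp) off=enable_roce;  on=enable_iwarp ;;
    *) echo "usage: $0 roce|iwarp" >&2; exit 2 ;;
esac
run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }
# disable the currently conflicting protocol first, then enable the wanted one
run devlink dev param set $dev name $off value false cmode runtime
run devlink dev param set $dev name $on value true cmode runtime
```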
To use RoCEv2 it makes sense to install a fair amount of software from the RDMA and InfiniBand tradition, e.g.
# apt-get install rdma-core
# apt-get install perftest
# apt-get install ibverbs-utils
# apt-get install infiniband-diags
# apt-get install ibverbs-providers
The relevant drivers:
- ice - the base driver
- irdma - 'Intel RDMA', for Intel NICs (as of 2024, the E810 and some older ones)
As RDMA devices they appear like this
root@pve-01:~# rdma link
link rocep129s0f0/1 state ACTIVE physical_state LINK_UP netdev enp129s0f0np0
link rocep129s0f1/1 state ACTIVE physical_state LINK_UP netdev enp129s0f1np1
To exercise two directly cabled systems, run the following on each host; pve-02 acts as the rping server and pve-01 as the rping client. Characteristically, sniffing the regular network interface in the usual way shows no traffic at all.
root@pve-02:~# rping -s -a 10.47.218.226 -v
server ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
server ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
server ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
server ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
server ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
server DISCONNECT EVENT...
wait for RDMA_READ_ADV state 10

root@pve-01:~# rping -c -a 10.47.218.226 -v -C 5
ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
client DISCONNECT EVENT...
Some further interesting utilities; note that the first device has been configured into RoCEv2 mode via 'devlink dev param ...' and the second into iWARP mode
# ibv_devices
device node GUID
------ ----------------
rocep129s0f0 3eeceffffee667b2
iwp129s0f1 3eeceffffee667b3
# ibv_devinfo
hca_id: rocep129s0f0
transport: InfiniBand (0)
fw_ver: 1.71
node_guid: 3eec:efff:fee6:67b2
sys_image_guid: 3eec:efff:fee6:67b2
vendor_id: 0x8086
vendor_part_id: 5531
hw_ver: 0x2
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 1
port_lmc: 0x00
link_layer: Ethernet
hca_id: iwp129s0f1
transport: iWARP (1)
fw_ver: 1.71
node_guid: 3eec:efff:fee6:67b3
sys_image_guid: 3eec:efff:fee6:67b3
vendor_id: 0x8086
vendor_part_id: 5531
hw_ver: 0x2
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 1
port_lmc: 0x00
link_layer: Ethernet
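The transport per device can also be pulled out of `ibv_devinfo` output mechanically. A sketch; the heredoc carries trimmed sample lines from the output above:

```shell
#!/bin/sh
# List each RDMA device and its transport, parsed from `ibv_devinfo` output.
# On a live system pipe `ibv_devinfo` in instead of the heredoc sample.
ibv_devinfo_sample=$(cat <<'EOF'
hca_id: rocep129s0f0
        transport:                      InfiniBand (0)
hca_id: iwp129s0f1
        transport:                      iWARP (1)
EOF
)
printf '%s\n' "$ibv_devinfo_sample" | awk '
  /^hca_id:/   { hca = $2 }            # remember current device
  /transport:/ { print hca, "->", $2 }'
```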
A working case, using the RoCE device
root@pve-02:~# ib_send_bw -i 1 -d rocep129s0f1
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : rocep129s0f1
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : OFF
RX depth : 512
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 1
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x01 QPN 0x0004 PSN 0x807cd5
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:47:218:226
remote address: LID 0x01 QPN 0x0004 PSN 0xba4fee
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:47:218:225
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
Conflicting CPU frequency values detected: 3696.643000 != 3000.000000. CPU Frequency is not max.
65536 1000 0.00 2762.05 0.044193
---------------------------------------------------------------------------------------
and the first computer as the client
root@pve-01:~# ib_send_bw -i 1 10.47.218.226 -d rocep129s0f1
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : rocep129s0f1
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : OFF
TX depth : 128
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 1
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x01 QPN 0x0004 PSN 0xba4fee
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:47:218:225
remote address: LID 0x01 QPN 0x0004 PSN 0x807cd5
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:47:218:226
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
Conflicting CPU frequency values detected: 3698.021000 != 3000.000000. CPU Frequency is not max.
65536 1000 2757.88 2757.84 0.044126
---------------------------------------------------------------------------------------
where
- ib_send_bw appears to be a utility of InfiniBand ancestry
- the Transport type is IB, i.e. InfiniBand, in line with how the RoCE protocol works
- the Link type is Ethernet
A non-working case, because an iWARP-mode device is used
root@pve-02:~# ib_send_bw -i 1 -d iwp129s0f1
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : iwp129s0f1
Number of qps : 1 Transport type : IW
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : OFF
RX depth : 512
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 0
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
ethernet_read_keys: Couldn't read remote address
Unable to read to socket/rdma_cm
Failed to exchange data between server and clients
Failed to deallocate PD - Device or resource busy
Failed to destroy resources
A non-working case: the device exists in the system, but the IP address in question is not attached to it
root@pve-02:~# ib_send_bw -i 1 -d rocep129s0f0
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : rocep129s0f0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : OFF
RX depth : 512
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 0
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
Failed to modify QP 5 to RTR
Unable to Connect the HCA's through the link
The most practical use case is iSCSI over iSER + RoCEv2. The resulting throughput is about 2x higher than with plain iSCSI. The target looks like this
/> ls /
o- / ......................................................................................................................... [...]
  o- backstores .............................................................................................................. [...]
  | o- block .................................................................................................. [Storage Objects: 1]
  | | o- iscsi_block_md127 ............................................................. [/dev/md127 (27.9TiB) write-thru activated]
  | |   o- alua ................................................................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | o- fileio ................................................................................................. [Storage Objects: 0]
  | o- pscsi .................................................................................................. [Storage Objects: 0]
  | o- ramdisk ................................................................................................ [Storage Objects: 0]
  o- iscsi ............................................................................................................ [Targets: 1]
  | o- iqn.2003-01.org.setup.lun.test .................................................................................... [TPGs: 1]
  |   o- tpg1 .......................................................................................... [no-gen-acls, auth per-acl]
  |     o- acls .......................................................................................................... [ACLs: 1]
  |     | o- iqn.1993-08.org.debian:01:b65e1ba35869 ................................................... [1-way auth, Mapped LUNs: 1]
  |     |   o- mapped_lun0 ..................................................................... [lun0 block/iscsi_block_md127 (rw)]
  |     o- luns .......................................................................................................... [LUNs: 1]
  |     | o- lun0 ........................................................ [block/iscsi_block_md127 (/dev/md127) (default_tg_pt_gp)]
  |     o- portals .................................................................................................... [Portals: 1]
  |       o- 10.47.218.226:3261 ............................................................................................. [iser]
  o- loopback ......................................................................................................... [Targets: 0]
  o- srpt ............................................................................................................. [Targets: 0]
  o- vhost ............................................................................................................ [Targets: 0]
  o- xen-pvscsi ....................................................................................................... [Targets: 0]
and the initiator
root@pve-01:~# iscsiadm -m discovery -t st -p 10.47.218.226:3261
root@pve-01:~# iscsiadm -m node -T iqn.2003-01.org.setup.lun.test -p 10.47.218.226:3261 -o update -n iface.transport_name -v iser
root@pve-01:~# iscsiadm -m node -T iqn.2003-01.org.setup.lun.test -p 10.47.218.226:3261 -l
root@pve-01:~# lsscsi -s
...
root@pve-01:~# iscsiadm -m node -T iqn.2003-01.org.setup.lun.test -p 10.47.218.226:3261 -u
root@pve-01:~# iscsiadm -m discoverydb -t sendtargets -p 10.47.218.226:3261 -o delete
root@pve-01:~# iscsiadm -m discoverydb
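The initiator-side login sequence above, collected into one sketch. The portal and IQN are this document's example values; with DRY_RUN=1 (the default here) the script only prints the iscsiadm commands instead of running them:

```shell
#!/bin/sh
# Discover the target, force the node record to the iser transport, log in.
# DRY_RUN=1 prints the commands; set DRY_RUN=0 on a real initiator host.
DRY_RUN=${DRY_RUN:-1}
portal=10.47.218.226:3261
target=iqn.2003-01.org.setup.lun.test
run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }
run iscsiadm -m discovery -t st -p "$portal"
# switch the node record from the default tcp transport to iser
run iscsiadm -m node -T "$target" -p "$portal" -o update -n iface.transport_name -v iser
run iscsiadm -m node -T "$target" -p "$portal" -l
```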
Useful additional material
- 'Intel E800 devices now support iWARP and RoCE protocols' - https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/8.7_release_notes/new-features
- https://forum.proxmox.com/threads/mellanox-connectx-4-lx-and-brigde-vlan-aware-on-proxmox-8-0-1.130902/
- https://docs.nvidia.com/networking/display/mftv4270/updating+firmware+using+ethtool/devlink+and+-mfa2+file
- https://edc.intel.com/content/www/ca/fr/design/products/ethernet/appnote-e810-eswitch-switchdev-mode-config-guide/eswitch-mode-switchdev-and-legacy/
- nvidia-mlnx-ofed-documentation-v23-07.pdf
- https://enterprise-support.nvidia.com/s/article/How-To-Enable-Verify-and-Troubleshoot-RDMA
- https://enterprise-support.nvidia.com/s/article/howto-configure-lio-enabled-with-iser-for-rhel7-inbox-driver
- https://dri.freedesktop.org/docs/drm/networking/devlink/ice.html

