Hi,
According to Microsoft, in Windows Server 2016 RTM (Hyper-V), when I create a vSwitch with Switch Embedded Teaming I should be able to use SR-IOV for VMs.
Is this currently supported with ConnectX-4 and WinOF-2? (v1.50 at this time.)
Hi Viki,
Thanks for sharing the useful tip. Much obliged. Learned some new things in a New Year
Best,
Chin
The solution was to remove MLNX_OFED and use the distribution's drivers/kernel modules.
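On CentOS/RHEL that amounted to roughly the following (a sketch; the group name is the CentOS/RHEL one and may differ on other distributions):
/usr/sbin/ofed_uninstall.sh               # remove the MLNX_OFED stack
yum -y groupinstall "Infiniband Support"  # reinstall the distribution RDMA stack
reboot                                    # load the in-tree mlx4/mlx5 modules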
On my machine, perfquery -x returns 64-bit values for the port counters, but I am unable to determine where these counters are exposed in sysfs. For example, /sys/class/infiniband/mlx4_0/ports/1/counters/port_rcv_data is only a 32-bit value and is maxed out at 4294967295. According to the mlx5 docs there should be a counters_ext directory, but it is not present on my system. Is there a way to enable that with mlx4, or how else can I get the correct value?
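For reference, this is how I am comparing the two values:
# 32-bit sysfs counter, pegged at 2^32 - 1:
cat /sys/class/infiniband/mlx4_0/ports/1/counters/port_rcv_data
# 64-bit extended counters read straight from the PMA
# (perfquery is part of infiniband-diags and defaults to the local port):
perfquery -x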
I have a dual-port ConnectX-3 (HP branded).
When I open Device Manager > System Devices > Mellanox NIC > Port Protocol, only Port 1 is available to change between IB, ETH and AUTO; Port 2 is greyed out.
When we installed the NIC in the server, both ports were set to IB, and we then changed both to ETH. Now I need to change them back, with no luck.
I have tried reinstalling the driver, changing the settings with MLXTOOL, and restoring the NIC to defaults with PowerShell.
Does anyone know what to do?
Thanks for the response,
What kind of performance hit should I expect on an SX1024 due to the packet fragmentation that happens during inter-VLAN routing?
This is a diagram of my current set up.
                +----------------+
                |  Linux Router  |
                |   ConnectX-3   |
                | port 1  port 2 |
                +----------------+
                   /          \
+---------------+ / A        A \ +---------------+
|    Host 1     |/              \|    Host 2     |
| ConnectX-4-LX |                | ConnectX-4-LX |
|    Port 1     |-              -|    Port 1     |
|    Port 2     |----------------|    Port 2     |
+---------------+        B       +---------------+
The Linux router has the ConnectX-3 (not Pro) card in Ethernet mode and uses a breakout cable (port 1 only) to connect to the ConnectX-4-LX cards at 10 Gb as path 'A'. The second ports of the ConnectX-4-LX cards are connected directly at 25 Gb as path 'B'. Hosts 1 & 2 are running CentOS 7.2 with kernel 3.10.0-327.36.3.el7.x86_64 and OFED 3.4; the Linux router is running CentOS 7.2 with a 4.9.0 kernel.
iSER and RDMA work fine over path 'B' and path 'A' (in either bridge or router mode), and now I want to add latency and drop packets to understand the effects. I'm using tc and netem to add the latency into the path. When I add 0.5 ms of latency in each direction, iSER slows to a crawl, throws errors in dmesg, and sometimes even causes the file system to go read-only. If I set the latency back to zero, things clear up and the full 10 Gb is achieved. iperf performs the same whether the latency is set to 0 or 0.5 ms in each direction. We would like to get RoCE to work over high-latency, high-bandwidth links. If someone has ideas on how to resolve this issue, I'd love to hear them.
Commands run on the router server:
for i in 2 3; do tc qdisc change dev eth${i} root netem delay .5ms; done
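(Note: tc qdisc change assumes a netem qdisc is already attached to the interface; on a fresh boot the root qdisc has to be added first:)
for i in 2 3; do tc qdisc add dev eth${i} root netem delay 0.5ms; done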
# brctl show
bridge name     bridge id               STP enabled     interfaces
rleblanc        8000.f452147ce541       no              eth2
                                                        eth3
The iSER target is a 100 GB RAM disk exported via iSER. I format the disk on the initiator with ext4 and then run this fio command:
echo "3" > /proc/sys/vm/drop_caches; fio --rw=read --bs=4K --size=1G --numjobs=40 --name=worker.matt --group_reporting
I see these messages on the initiator:
[25863.623453] 00000000 00000000 00000000 00000000
[25863.628564] 00000000 00000000 00000000 00000000
[25863.633634] 00000000 00000000 00000000 00000000
[25863.638619] 00000000 08007806 250003c7 0b0190d3
[25863.643593] iser: iser_handle_wc: wr id ffffffffffffffff status 6 vend_err 78
[25863.651180] connection40:0: detected conn error (1011)
[25874.368881] mlx5_warn:mlx5_1:dump_cqe:257:(pid 0): dump error cqe
[25874.375619] 00000000 00000000 00000000 00000000
[25874.380690] 00000000 00000000 00000000 00000000
[25874.385712] 00000000 00000000 00000000 00000000
[25874.390693] 00000000 08007806 250003c8 0501ddd3
[25874.395681] iser: iser_handle_wc: wr id ffffffffffffffff status 6 vend_err 78
[25874.403283] connection40:0: detected conn error (1011)
[25923.829903] mlx5_warn:mlx5_1:dump_cqe:257:(pid 0): dump error cqe
[25923.836663] 00000000 00000000 00000000 00000000
[25923.841724] 00000000 00000000 00000000 00000000
[25923.846752] 00000000 00000000 00000000 00000000
[25923.851733] 00000000 08007806 250003c9 510134d3
[25923.856709] iser: iser_handle_wc: wr id ffffffffffffffff status 6 vend_err 78
[25923.864308] connection40:0: detected conn error (1011)
[25943.184313] mlx5_warn:mlx5_1:dump_cqe:257:(pid 0): dump error cqe
[25943.191079] 00000000 00000000 00000000 00000000
[25943.196208] 00000000 00000000 00000000 00000000
[25943.201287] 00000000 00000000 00000000 00000000
[25943.206281] 00000000 08007806 250003ca 1afdbdd3
[25943.211272] iser: iser_handle_wc: wr id ffffffffffffffff status 6 vend_err 78
[25943.218901] connection40:0: detected conn error (1011)
[25962.538633] mlx5_warn:mlx5_1:dump_cqe:257:(pid 0): dump error cqe
[25962.545396] 00000000 00000000 00000000 00000000
[25962.550475] 00000000 00000000 00000000 00000000
[25962.555551] 00000000 00000000 00000000 00000000
[25962.560533] 00000000 08007806 250003cb 21012ed3
[25962.565526] iser: iser_handle_wc: wr id ffffffffffffffff status 6 vend_err 78
[25962.573155] connection40:0: detected conn error (1011)
[25973.291038] mlx5_warn:mlx5_1:dump_cqe:257:(pid 0): dump error cqe
[25973.297861] 00000000 00000000 00000000 00000000
[25973.302978] 00000000 00000000 00000000 00000000
[25973.308025] 00000000 00000000 00000000 00000000
[25973.313014] 00000000 08007806 250003cc 1901d2d3
[25973.318004] iser: iser_handle_wc: wr id ffffffffffffffff status 6 vend_err 78
[25973.325601] connection40:0: detected conn error (1011)
[26039.955899] mlx5_warn:mlx5_1:dump_cqe:257:(pid 0): dump error cqe
[26039.962690] 00000000 00000000 00000000 00000000
[26039.967825] 00000000 00000000 00000000 00000000
[26039.972894] 00000000 00000000 00000000 00000000
[26039.977891] 00000000 08007806 250003cd 850172d3
[26039.982905] iser: iser_handle_wc: wr id ffffffffffffffff status 6 vend_err 78
[26039.990512] connection40:0: detected conn error (1011)
[26067.411753] mlx5_warn:mlx5_1:dump_cqe:257:(pid 0): dump error cqe
[26067.418598] 00000000 00000000 00000000 00000000
[26067.423733] 00000000 00000000 00000000 00000000
[26067.428832] 00000000 00000000 00000000 00000000
[26067.433826] 00000000 08007806 250003ce 092977d3
[26067.438818] iser: iser_handle_wc: wr id ffffffffffffffff status 6 vend_err 78
[26067.446462] connection40:0: detected conn error (1011)
There are no messages on the target server.
Hello,
I have two Windows servers, directly connected with Mellanox ConnectX-3, with no switch infrastructure between them.
Is this a supported scenario with functioning RDMA/RoCE?
If so, do I still need to implement Data Center Bridging in Windows, or is that only required when switches are in play?
Starting from Windows 2012 and later (including Windows 2016, of course), all teaming drivers and support are within the Microsoft native OS NetLBFO.
Mellanox is not involved in providing the module, packets, etc., so it is up to Microsoft whether ConnectX-4 or any other adapter is in their support compatibility matrix.
See https://technet.microsoft.com/en-us/library/jj130849.aspx
See also the relevant MSDN documentation: Learn to Develop with Microsoft Developer Network | MSDN
Can anyone help with the above issue?
Hi Oskar,
Can you provide the contents of the logfile which is mentioned in the log /tmp/MLNX_OFED_LINUX-3.4-2.0.0.0-3.10.0-514.2.2.rt56.424.el7.x86_64/mlnx_ofed_iso.21610.log?
The logfile name should contain *.rpmbuild.log
Thanks and regards,
~Martijn
Aviap,
SR-IOV is not supported on a SET team.
If a device's PMA supports the extended port counters (which is the case for yours), it depends on which kernel is being used. There were recent kernel changes to utilize the optional PortCountersExtended attribute rather than the mandatory PortCounters, so you either need a recent kernel with these changes, or the relevant changes backported to an older kernel.
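A quick way to check on your system (a sketch; the device name is taken from your earlier message): on a kernel with these changes, the sysfs counter keeps growing past 4294967295 instead of pegging there.
uname -r   # needs a kernel recent enough to include the PortCountersExtended changes
cat /sys/class/infiniband/mlx4_0/ports/1/counters/port_rcv_data   # no longer saturates once supported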
Dear All,
Are you aware of any compatibility issues between Veritas LLT on Red Hat 6.8 and Mellanox? I see suspicious messages during system boot (no issues with the functioning of LLT have been noticed).
The LLT package is VRTSllt-6.2.1.500-RHEL6.x86_64 on Red Hat 6.8, kernel 2.6.32-642.4.2.el6.x86_64 (mlx4_en.ko 2.2-1 came with the kernel).
[nep179@prdctlscthdb01-20161206]$ sudo egrep "Nov 23(.)*kernel: llt" messages-20161127
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol ib_create_cq
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol ib_create_cq
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_resolve_addr
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_resolve_addr
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol ib_dereg_mr
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol ib_dereg_mr
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_reject
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_reject
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_disconnect
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_disconnect
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_resolve_route
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_resolve_route
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_bind_addr
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_bind_addr
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_create_qp
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_create_qp
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol ib_destroy_cq
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol ib_destroy_cq
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_create_id
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_create_id
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_listen
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_listen
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_destroy_qp
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_destroy_qp
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol ib_get_dma_mr
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol ib_get_dma_mr
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol ib_alloc_pd
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol ib_alloc_pd
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_connect
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_connect
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_destroy_id
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_destroy_id
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol ib_resize_cq
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol ib_resize_cq
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol rdma_accept
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol rdma_accept
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: disagrees about version of symbol ib_dealloc_pd
Nov 23 15:09:36 prdctlscthdb01 kernel: llt: Unknown symbol ib_dealloc_pd
[nep179@prdctlscthdb01-20161206]$ sudo egrep "Nov 23(.)*kernel: llt" messages-20161127 | wc -l
38
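If it helps the analysis, this is how I compared the symbol CRCs that llt was built against with what the running kernel exports (a sketch; the llt.ko path below is a guess based on where VRTSllt installs):
find /lib/modules/$(uname -r) -name 'llt.ko*'
# CRC llt was linked against for one of the failing symbols:
modprobe --dump-modversions /lib/modules/$(uname -r)/veritas/vcs/llt.ko | grep -w rdma_connect
# CRC the running kernel exports for the same symbol:
zgrep -w rdma_connect /boot/symvers-$(uname -r).gz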
Thank you,
Aleksandr
Hi,
After rechecking this issue, I found that we do not support fragmentation on the switches; it should be done on the adapter.
Packets that arrive with a larger MTU and are directed to a smaller-MTU port will be dropped.
There is a DF (don't fragment) flag in the IP header that allows or forbids fragmenting a packet. Your problem occurs when running 9K packets from the storage to some of the 1500-MTU servers.
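You can see this from a host with a jumbo-size ping and the DF bit set (a sketch; 8972 = 9000 minus 28 bytes of IP/ICMP headers, and the address is just an example):
# DF set: a 9000-byte frame toward a 1500-MTU port cannot be fragmented and should be dropped:
ping -M do -s 8972 192.168.1.20
# A payload that fits in a 1500-byte MTU goes through:
ping -M do -s 1472 192.168.1.20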
See also:IPv4 - Wikipedia
Ophir.
I deleted the old reply to avoid confusion.
Fragmentation is not supported on the switch, so there is no issue with latency.
We are using CentOS 6.8 with kernel 2.6.32-642.11.1.el6.x86_64 (the latest available) and the CentOS mlx4 kernel modules (I tried using OFED, but it does not support NFSoRDMA).
Hi Oskar,
Are you able to provide us with the requested log?
Thanks and regards,
~Martijn
The changes for this are relatively recent and went into some 4.x kernel.
Hello Mads,
Please try the following:
1. Connect a cable to the second port and then try to change the port type (if you don't have a spare cable, you can connect a cable in loopback from port 1 to port 2).
2. Make sure that the other end is also configured to work in IB.
3. You can try changing the port type from PowerShell using our MFT tool:
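# LINK_TYPE values for mlxconfig: 1 = IB, 2 = ETH, 3 = AUTO;
# reboot the server afterwards so the new configuration is loaded.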
C:\Program Files\Mellanox\WinMFT>.\mst status
C:\Program Files\Mellanox\WinMFT> .\mlxconfig.exe -d <device id> set LINK_TYPE_P1=1
C:\Program Files\Mellanox\WinMFT> .\mlxconfig.exe -d <device id> set LINK_TYPE_P2=1
Thanks,
Karen.