Re: "Priority trust-mode is not supported on your system"?
Re: "Priority trust-mode is not supported on your system"?
Hi
Can you show me the ibdev2netdev output?
Can you also try:
mlnx_qos -i <interface>
Thanks
Marc
Re: "Priority trust-mode is not supported on your system"?
Re: "Priority trust-mode is not supported on your system"?
Hi,
Can you try to modify the buffer size and send me the output?
Please send the ibdev2netdev output as well.
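Something along these lines (just a sketch; the --buffer_size values are placeholders and depend on whether your mlnx_qos build supports buffer commands at all):
ibdev2netdev
mlnx_qos -i <interface>
mlnx_qos -i <interface> --buffer_size=32768,32768,0,0,0,0,0,0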
Marc
Re: "Priority trust-mode is not supported on your system"?
Re: "Priority trust-mode is not supported on your system"?
Hi,
After a first check on my ConnectX-3 card, I get the same behavior.
It seems to be supported only on ConnectX-4 and above.
If you want me to investigate it more, please open a ticket.
# mlnx_qos -i ens6
Buffers commands are not supported on your system
Marc
Re: rx_fifo_errors and rx_dropped errors using VMA where CPU user less than 40%
If you are seeing the same behaviour without VMA, why complicate the problem? Start by tuning the system and see if it helps; adding more components will not help with troubleshooting. After tuning, I would suggest checking netstat -s/nstat and 'netstat -unp' to see the receive queue sizes.
The tuning guides are available from Mellanox site - Performance Tuning for Mellanox Adapters
You might also check the current number of send/receive queues configured on the interface and try to limit it to 16:
ethtool -L <IFS> rx 16 tx 16
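To see what is currently configured before changing anything (lowercase -l prints the channel counts, uppercase -L sets them; <IFS> is a placeholder for your interface name):
ethtool -l <IFS>
ethtool -g <IFS>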
Re: rx_fifo_errors and rx_dropped errors using VMA where CPU user less than 40%
Hi Alkx,
Thanks for your reply. I've done all the performance tuning steps from the site you recommend. I tried VMA because I was expecting someone to say "Have you tried VMA?"; also, vma_stats seems to give more visibility into the various buffer sizes (and errors) than is available via the kernel.
I monitor /proc/net/udp. With VMA off, it shows no drops and rarely more than a few MB in the UDP buffer (I think this is equivalent to netstat -unp).
Thanks for the tip on ethtool -L. Below are my current settings. I'll have a play with it and see if things improve. I hadn't seen that before. I wonder why it isn't in the tuning guides?
Also:
- What's the difference between the 'rings' (ethtool -g) and 'channels' (ethtool -L)?
- Why does making the channels smaller help?
ban115@tethys:~$ /sbin/ethtool -g enp132s0
Ring parameters for enp132s0:
Pre-set maximums:
RX: 8192
RX Mini: 0
RX Jumbo: 0
TX: 8192
Current hardware settings:
RX: 8192
RX Mini: 0
RX Jumbo: 0
TX: 512
ban115@tethys:~$ /sbin/ethtool -L enp132s0
no channel parameters changed, aborting
current values: tx 8 rx 32 other 0 combined 0
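For completeness, based on your suggestion and the current values above, I assume the command to try would be something like this (not yet tested on my side):
sudo /sbin/ethtool -L enp132s0 rx 16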
Re: MLNX+NVIDIA ASYNC GPUDirect - Segmentation fault: invalid permissions for mapped object running mpi with CUDA
Hi Jainkun yang,
Sorry for very late reply.
I am getting 7 microseconds of latency for the smallest message sizes.
When I run the osu_bw test, I see that system memory is also being used along with GPU memory. This seems strange, right? With GPUDirect RDMA, we should not see any system memory usage, right? Am I missing something?
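For reference, I invoke the benchmark roughly like this (treat it as an example only; it assumes an Open MPI build of the OSU benchmarks with CUDA support, where 'D D' selects device buffers on both sides):
mpirun -np 2 -host host1,host2 ./osu_bw D D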
lspci -tv output for both systems:
+-[0000:80]-+-00.0-[81]--
| +-01.0-[82]--
| +-01.1-[83]--
| +-02.0-[84]--
| +-02.2-[85]----00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
| +-03.0-[86]----00.0 NVIDIA Corporation Device 15f8
On Host Systems:
80:02.2 PCI bridge: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 PCI Express Root Port 2 (rev 02) (prog-if 00 [Normal decode])
80:03.0 PCI bridge: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 PCI Express Root Port 3 (rev 02) (prog-if 00 [Normal decode])
On Peer System:
80:02.2 PCI bridge: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 2 (rev 01) (prog-if 00 [Normal decode])
80:03.0 PCI bridge: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 3 (rev 01) (prog-if 00 [Normal decode])
Host CPU:
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 72
On-line CPU(s) list: 0-71
Thread(s) per core: 2
Core(s) per socket: 18
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
Stepping: 2
CPU MHz: 1202.199
CPU max MHz: 3600.0000
CPU min MHz: 1200.0000
BogoMIPS: 4590.86
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 46080K
NUMA node0 CPU(s): 0-71
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb invpcid_single retpoline kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts
Peer CPU:
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
Stepping: 1
CPU MHz: 1201.019
CPU max MHz: 3000.0000
CPU min MHz: 1200.0000
BogoMIPS: 4191.23
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb invpcid_single intel_pt retpoline kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
RoCEv2 PFC/ECN Issues
We have two servers with ConnectX-4 100GbE cards and two Cisco C3232C switches with routing between them, and we are trying to get RoCEv2 routed across them with PFC/ECN so that performance holds up during periods of congestion.
The funny thing is that with the base configuration and no other servers on the switches, we get terrible performance (1.6 Gbps) across the routed link using iSER, even though we are only pushing about 20 Gbps (one iSER connection and our test workload configuration). By using multiple iSER connections and PFC, we can get about 95 Gbps, so we know the hardware is capable of this performance in routing mode. We can't understand why the performance is so bad in the default case. The fio test shows that a lot of IO happens, then there is none, and it just cycles back and forth.
We would like to use both PFC and ECN in our configuration, but we are trying to validate that ECN will work without PFC; when we disable PFC, we can't test ECN, most likely because of the issue above.
On the Cisco switches, we have policy maps that place our traffic, based on its DSCP markings, into a group that has ECN enabled (I'm not a Cisco person, so I may not be getting the terminology quite right), and we can see the group counters on the Cisco incrementing. We never see any packets marked with congestion, probably because the switch never sees any congestion due to the above problem.
When we have the client set to 40 Gbps and do a read test with PFC, we get pause frames and great performance. We have the Cisco switches match the DSCP value and remark the CoS for packets that traverse the router. Interestingly, Cisco sends PFC pause frames on the routed link even though there are no VLANs configured; we captured this in Wireshark. With the adapters set to --trust=pcp the performance is terrible, but --trust=dscp works well. The Cisco switches also show pause frame counters incrementing when we are 100G end to end, and I'm not sure why they would be incrementing when there is no congestion.
We have done so many permutations of tests that I may be getting fuzzy on some details. Here is a matrix of the tests I can be sure of; this is all 100G end to end.
switch PFC mode (ports) | trust mode | pfc prio 3 enabled | skprio -> cos mapping | Result |
static on/off | mlnx_qos --trust=X | mlnx_qos --pfc=0,0,0,X,0,0,0,0 | ip link set rsY.Z type vlan egress 2:3 | |
on | pcp | yes | yes | Good |
on | pcp | yes | no | Good |
on | pcp | no | yes | Bad |
on | pcp | no | no | Bad |
on | dscp | yes | yes | Good |
on | dscp | yes | no | Good |
on | dscp | no | yes | Bad |
on | dscp | no | no | Bad |
off | pcp | yes | yes | Bad |
off | pcp | yes | no | Bad |
off | pcp | no | yes | Bad |
off | pcp | no | no | Bad |
off | dscp | yes | yes | Bad |
off | dscp | yes | no | Bad |
off | dscp | no | yes | Bad |
off | dscp | no | no | Bad |
We are using OFED 4.4-1.0.0.0 on both nodes (one CentOS 7.3, the other CentOS 7.4), both running kernel 4.9.116; the firmware is 12.23.1000 on one card and 12.23.1020 on the other. In addition to the settings in the above matrix, we have only changed:
echo 26 > /sys/class/net/rs8bp2/ecn/roce_np/cnp_dscp
echo 106 > /sys/kernel/config/rdma_cm/mlx5_3/ports/1/default_roce_tos
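Putting it together, the per-host configuration referenced above is essentially the following (interface/device names are from our setup; the VLAN mapping line is only applied in the matrix rows where the skprio -> cos mapping is enabled):
mlnx_qos -i <interface> --trust=dscp
mlnx_qos -i <interface> --pfc=0,0,0,1,0,0,0,0
echo 26 > /sys/class/net/rs8bp2/ecn/roce_np/cnp_dscp
echo 106 > /sys/kernel/config/rdma_cm/mlx5_3/ports/1/default_roce_tos
ip link set rsY.Z type vlan egress 2:3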
If you have any ideas that we can try, we would appreciate it.
Thank you.
Re: "Priority trust-mode is not supported on your system"?
Thanks for your help!
Problem installing MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-x86_64
Hi,
I am running Linux Mint 19, which is basically Ubuntu 18.04. I recently bought a ConnectX-3 CX311A and am trying to get it running.
I downloaded the MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-x86_64 and tried to run it:
sudo ./mlnxofedinstall --add-kernel-support --distro ubuntu18.04
Result:
Note: This program will create MLNX_OFED_LINUX TGZ for ubuntu18.04 under /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic directory.
See log file /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496_logs/mlnx_ofed_iso.5496.log
Checking if all needed packages are installed...
Building MLNX_OFED_LINUX RPMS . Please wait...
find: 'MLNX_OFED_SRC-4.4-1.0.0.0/RPMS': No such file or directory
Creating metadata-rpms for 4.15.0-29-generic ...
ERROR: Failed executing "/usr/bin/perl /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496/MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-ext/create_mlnx_ofed_installers.pl --with-hpc --tmpdir /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496_logs --mofed /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496/MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-ext --rpms-tdir /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496/MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-ext/RPMS --output /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496/MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-ext --kernel 4.15.0-29-generic --ignore-groups eth-only"
ERROR: See /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496_logs/mlnx_ofed_iso.5496.log
Failed to build MLNX_OFED_LINUX for 4.15.0-29-generic
Once I check this log it says:
Unsupported package: kmp
Logs dir: /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496_logs/OFED.5926.logs
General log file: /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0-4.15.0-29-generic/mlnx_iso.5496_logs/OFED.5926.logs/general.log
Below is the list of OFED packages that you have chosen
(some may have been added by the installer due to package dependencies):
ofed-scripts
mlnx-ofed-kernel-utils
mlnx-ofed-kernel-dkms
iser-dkms
isert-dkms
srp-dkms
mlnx-nfsrdma-dkms
mlnx-nvme-dkms
mlnx-rdma-rxe-dkms
kernel-mft-dkms
knem-dkms
knem
Checking SW Requirements...
This program will install the OFED package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with OFED, do not reinstall them.
Installing new packages
Building DEB for ofed-scripts-4.4 (ofed-scripts)...
Running /usr/bin/dpkg-buildpackage -us -uc
Building DEB for mlnx-ofed-kernel-utils-4.4 (mlnx-ofed-kernel)...
Running /usr/bin/dpkg-buildpackage -us -uc
Building DEB for iser-dkms-4.0 (iser)...
Running /usr/bin/dpkg-buildpackage -us -uc
Building DEB for isert-dkms-4.0 (isert)...
Running /usr/bin/dpkg-buildpackage -us -uc
Building DEB for srp-dkms-4.0 (srp)...
Running /usr/bin/dpkg-buildpackage -us -uc
Building DEB for mlnx-nfsrdma-dkms-3.4 (mlnx-nfsrdma)...
Running /usr/bin/dpkg-buildpackage -us -uc
Building DEB for mlnx-nvme-dkms-4.0 (mlnx-nvme)...
Running /usr/bin/dpkg-buildpackage -us -uc
Building DEB for mlnx-rdma-rxe-dkms-4.0 (mlnx-rdma-rxe)...
Running /usr/bin/dpkg-buildpackage -us -uc
Building DEB for kernel-mft-dkms-4.10.0 (kernel-mft)...
Running /usr/bin/dpkg-buildpackage -us -uc
Building DEB for knem-dkms-1.1.3.90mlnx1 (knem)...
Running /usr/bin/dpkg-buildpackage -us -uc
Build passed successfully
-E- '' dir does not exist!
Strange!
Then I tried
sudo ./mlnxofedinstall --distro ubuntu18.04
which gives:
Logs dir: /tmp/MLNX_OFED_LINUX.12121.logs
General log file: /tmp/MLNX_OFED_LINUX.12121.logs/general.log
Below is the list of MLNX_OFED_LINUX packages that you have chosen
(some may have been added by the installer due to package dependencies):
ofed-scripts
mlnx-ofed-kernel-utils
mlnx-ofed-kernel-dkms
iser-dkms
isert-dkms
srp-dkms
mlnx-nfsrdma-dkms
mlnx-rdma-rxe-dkms
libibverbs1
ibverbs-utils
libibverbs-dev
libibverbs1-dbg
libmlx4-1
libmlx4-dev
libmlx4-1-dbg
libmlx5-1
libmlx5-dev
libmlx5-1-dbg
librxe-1
librxe-dev
librxe-1-dbg
libibumad
libibumad-static
libibumad-devel
ibacm
ibacm-dev
librdmacm1
librdmacm-utils
librdmacm-dev
mstflint
ibdump
libibmad
libibmad-static
libibmad-devel
libopensm
opensm
opensm-doc
libopensm-devel
infiniband-diags
infiniband-diags-compat
mft
kernel-mft-dkms
libibcm1
libibcm-dev
perftest
ibutils2
libibdm1
cc-mgr
ar-mgr
dump-pr
ibsim
ibsim-doc
knem-dkms
mxm
ucx
sharp
hcoll
openmpi
mpitests
knem
libdapl2
dapl2-utils
libdapl-dev
srptools
mlnx-ethtool
mlnx-iproute2
This program will install the MLNX_OFED_LINUX package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.
Do you want to continue?[y/N]:y
Checking SW Requirements...
Removing old packages...
Installing new packages
Installing ofed-scripts-4.4...
Installing mlnx-ofed-kernel-utils-4.4...
Installing mlnx-ofed-kernel-dkms-4.4...
Error: mlnx-ofed-kernel-dkms installation failed!
Collecting debug info...
See:
/tmp/MLNX_OFED_LINUX.12121.logs/mlnx-ofed-kernel-dkms.debinstall.log
Removing newly installed packages...
How can I install the drivers? Thank you for your help.
Need help updating firmware/speed for MNPA19-XTR adapters
I need faster peer-to-peer access between a server and a desktop computer, so I installed an MNPA19-XTR 10Gb adapter in each machine in a peer-to-peer configuration with an SFP+ copper cable. The problem is that they are not performing as they should. When a large transfer starts, the speed begins at just under 700Mb/s (which is as expected, or even better than expected, with SATA HDs in use). But after 5-6 seconds, the speed drops to 100-150Mb/s. Intermittently, the speed jumps up to 400-500Mb/s for a second or two, then drops down again. Both systems have an SSD, a single SATA disk, and a SATA disk array, so I have tried the tests from SSD to SSD, SATA to SATA, etc., and the results are pretty much the same. All offloading is enabled, Jumbo Packet is at 9000, the send/receive buffers are at maximum, and I have tuned as many settings as I can find, including specifying both 10Gb adapters in the Hosts file. It almost seems like a heat issue, even though there is plenty of air movement.
I don't know if the speed will improve with a firmware update, but MLXUP.exe does not recognize my adapter (I'm not sure I am using the correct switches on the command line; these are Windows machines). Any help with the speed and/or the firmware update would be highly appreciated. The cards currently have firmware rev 2.9.1000 and I have 2.9.1200 on hand to update to. I will be extremely happy if I can get a reliable 400-500Mb/s out of this setup, which is what I believe it should be capable of.
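For context, a typical query with the Mellanox tools looks something like the following (assuming the MFT/mlxup tools are installed; I am not certain these are the right switches on Windows, which is partly what I'm asking):
mlxup --query
flint -d <mst device> query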
System 1:
ASUS X370-A MB, Ryzen 5 1600, 16Mb RAM, 500Gb SSD, 6-4Tb NAS SATA disk array, 1-3Tb NAS SATA disk.
System 2:
ASUS X370-Pro MB, Ryzen 7 1800X, 32Mb RAM, 240Gb SSD, 6-4Tb NAS SATA disk array, 1-6Tb NAS SATA disk.
Also it looks like I have to choose a group for this discussion, so I am just choosing the closest fit.
Thank you in advance for any help.
Re: mlx5_core enable hca failed, mlx5_load_one failed with error code -22
Hi Pharthiphan,
I am not sure if your issue is still relevant, as it was posted on 6/11; however, which Mellanox OFED driver did you install, and have you validated the FW version/compatibility?
You can download the MFT package from the following link:
http://www.mellanox.com/page/management_tools
To query the FW:
#mst start
#mst status -v
#flint -d <mst device> q
Note: Check, based on the Release Notes of the drivers, that the FW is supported/compatible. If not, I would suggest aligning the FW to a supported version.
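As a rough sketch only, once you have confirmed the correct image for your card's PSID, the update itself would look something like:
#flint -d <mst device> -i <fw image file> burn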
Sophie.
Is Mellanox ConnectX-4 compatible with VPP 18.07?
Hello,
I have built VPP 18.07 according to the instructions from this page: How to Build VPP FD.IO Development Environment with Mellanox DPDK PMD for ConnectX-4 and ConnectX-5
VPP recognizes the Mellanox ports, but it hangs after the "up" command is applied to one of those ports.
VPP log (/var/log/vpp/vpp.log) shows no errors.
Is VPP 18.07 compatible with ConnectX-4?
Thank you.
Re: Ceph with OVS Offload
Hi Lazuardi,
I am not sure if you found a solution for your deployment; however, have you read about the ASAP2 solution?
ASAP2 is GA as part of MLNX_OFED 4.4 and has a separate page with more details:
https://www.mellanox.com/page/asap2?mtag=asap2&ssn=s2vtqtqjl6k87i8k5niimk2gl5
http://www.mellanox.com/page/asap2?mtag=asap2
Getting started with Mellanox ASAP^2
Sophie.
Re: Ceph with OVS Offload
Hi Sophie,
I have read all about ASAP2 on the Mellanox website. My question is about the performance of running Ceph with ASAP2 OVS offload and VXLAN offload.
Best regards,
Re: Ceph with OVS Offload
Hi Lazuardi,
Ceph has not been tested against the ASAP2 OVS offload solution.
Sophie.
Re: Ceph with OVS Offload
Hi Sophie,
How can I request that Mellanox run that test as a reference? I'm looking for a reference design for Ceph link redundancy without MLAG on the switches, while maximizing the offload features of the ConnectX-5 EN.
Best regards,
Re: rx_fifo_errors and rx_dropped errors using VMA where CPU user less than 40%
Number of channels - how many queues should be created.
Ring size - the size of each queue.
Generally, you shouldn't change the defaults, as they are based on the vendor's experience (any vendor); however, sometimes it pays to play with these settings. For example, setting the number of receive queues to the number of CPUs on the host might not be a bad idea, as a larger number of queues causes more context switches, which can lead to degradation.
The same goes for queue size - setting it to the maximum increases the amount of memory used by the queue, which can cause page swapping and, again, degradation.
Bottom line: there is no single recipe, only sensible defaults. Every change needs to be validated by running benchmarks that closely mimic the behaviour of the real-time application, or by the application itself.
Do you still have dropped packets after changing these parameters?
I would also recommend checking the Red Hat network performance tuning guide if you work with TCP/UDP. For VMA it is not really applicable, as VMA bypasses the kernel.
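A quick way to re-check the counters after each change (the interface name is just an example):
ethtool -S <interface> | grep -E 'drop|discard|fifo'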