Can't get full FDR bandwidth with Connect-IB card in PCI 2.0 x16 slot

July 3, 2018, 2:00 pm

≫ Next: Re: get a dump cqe when trying to invalid mr in cx4

Greetings,

We have some nodes (Dell R415s running RHEL 6.8) with Connect-IB cards in a PCI 2.0 x16 slot (the only one available) and can't seem to get more than 45 Gbit/s using ib_send_bw. I have two of the nodes connected directly using a new FDR cable and with the SM running on one of the nodes. I have updated the BIOS, the OFED and the HCA firmware on both nodes. Still I can't seem to get the full FDR bandwidth. The connect-ib product page (http://www.mellanox.com/page/products_dyn?product_family=142&mtag=connect_ib ) states the following:

"Connect-IB also enables PCI Express 2.0 x16 systems to take full advantage of FDR, delivering at least twice the bandwidth of existing PCIe 2.0 solutions."

Since the PCIe 2.0 x16 is able to support 64 Gbit/s in one direction, shouldn't I be able to achieve full FDR (~54ish Gbit/s) as the product page implies. Or, am I wrong and there is some extra overhead that reduces the bandwidth for PCIe 2.0 x16 vs. PCIe 3.0 x16?

I have gone through the Performance Tuning for Mellanox Adapters guide and there isn't much more that I can try based on this. The latest BIOS has nowhere near the number of setting that are suggested to be tweaked in the guide. I have also tried mlnx_tune and get one warning:

----------------------------------------------------------

Connect-IB Device Status on PCI 01:00.0

FW version 10.16.1200

OK: PCI Width x16

Warning: PCI Speed 5GT/s >>> PCI width status is below PCI capabilities. Check PCI configuration in BIOS. <--------------

PCI Max Payload Size 128

PCI Max Read Request 512

Local CPUs list [0, 1, 2, 3, 4, 5]

----------------------------------------------------------

But this is probably correct since I am using a PCIe 2.0 x16 slot (PCIe 2.0 can only do 5 GT/s), right?

Here is the output of ibv_devinfo:

-------------------------------------------

hca_id: mlx5_0

transport: InfiniBand (0)

fw_ver: 10.16.1200

node_guid: f452:1403:002e:eb40

sys_image_guid: f452:1403:002e:eb40

vendor_id: 0x02c9

vendor_part_id: 4113

hw_ver: 0x0

board_id: MT_1220110019

phys_port_cnt: 1

Device ports:

port: 1

state: PORT_ACTIVE (4)

max_mtu: 4096 (5)

active_mtu: 4096 (5)

sm_lid: 1

port_lid: 2

port_lmc: 0x00

link_layer: InfiniBand

-------------------------------------------

and iblinkinfo:

-------------------------------------------

CA: A HCA-1:

0xf4521403002ee9f0 1 1[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 2 1[ ] "B" ( )

CA: B HCA-1:

0xf4521403002eeb40 2 1[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 1 1[ ] "A" ( )

-------------------------------------------

Can anyone tell me if this is the best I can expect or is there something else I can change to achieve FDR bandwidth with these HCAs?

Thanks in advance!

Eric

↧

Re: get a dump cqe when trying to invalid mr in cx4

July 3, 2018, 2:29 pm

≫ Next: Bringing a SX6012 back to life

≪ Previous: Can't get full FDR bandwidth with Connect-IB card in PCI 2.0 x16 slot

Hi Edward,

See the link below to the Mellanox Adapters Programmer’s Reference Manual (PRM)

Supporting ConnectX®-4 and ConnectX®-4 Lx as they are different adapter as from the ConnectX-3.

http://www.mellanox.com/related-docs/user_manuals/Ethernet_Adapters_Programming_Manual.pdf

Sophie.

↧

Bringing a SX6012 back to life

July 4, 2018, 1:33 am

≫ Next: Re: mlnxofedinstall of 4.3-3.0.2.1-rhel7.5alternate-aarch64 has some checking bug need to be fixed

≪ Previous: Re: get a dump cqe when trying to invalid mr in cx4

Hello gents

Pardon this sheep, this is my first experience with any InfiniBand technologies at all.

I've gotten my hands on a SX6012 through some distant work connections that nobody seems to know what to do with it so I figured let's get cracking.

Upon power up, it throws the message "System is initializing! This may take a few minutes" , "Modules are being configured", then it throws "Internal error, code 1006 (see logs for details) A fatal internal error occurred". Then it kicked me out.

I looked in user manual, there's no mention of code 1006, searched this forum, nada. Almighty Google didn't have much either. How can I get that logs for further detail? And what does this code 1006 mean?

Your assistance is much appreciated.

Thank you.

↧

Re: mlnxofedinstall of 4.3-3.0.2.1-rhel7.5alternate-aarch64 has some checking bug need to be fixed

July 4, 2018, 2:58 am

≫ Next: Re: How to enable VF multi-queue for SR-IOV on KVM?

≪ Previous: Bringing a SX6012 back to life

Hi,

Can you show me your output of

rpm -qf /etc/issue on your Centos 7.4 and 7.5 to see the difference

dist_rpm use this output to check after the lines you refer to.

Thanks

Marc

↧

Re: How to enable VF multi-queue for SR-IOV on KVM?

July 4, 2018, 3:04 am

≫ Next: Re: mlnxofedinstall of 4.3-3.0.2.1-rhel7.5alternate-aarch64 has some checking bug need to be fixed

≪ Previous: Re: mlnxofedinstall of 4.3-3.0.2.1-rhel7.5alternate-aarch64 has some checking bug need to be fixed

Hi,

How did you set your irq affinity ?

Did you try to use set_irq_affinity_bynode.sh script ?

Try again and let me know

Marc

↧

Re: mlnxofedinstall of 4.3-3.0.2.1-rhel7.5alternate-aarch64 has some checking bug need to be fixed

July 6, 2018, 12:45 am

≫ Next: Re: How to configure host chaining for ConnectX-5 VPI

≪ Previous: Re: How to enable VF multi-queue for SR-IOV on KVM?

Hi Marc,

rpm -qf /etc/issue

centos-release-7-5.1804.el7.centos.a.aarch64

In the CentOS 7.5 release

http://mirror.centos.org/altarch/7.5.1804/updates/aarch64/Packages/centos-release-7-5.1804.el7.centos.a.2.aarch64.rpm

In the last release

Index of /altarch/7.4.1708/updates/aarch64/Packages

centos-release-7-4.1708.el7.centos.altarch.1.aarch64.rpm

You can see that the the naming conventions are a little different of altarch aarch64. I have no idea of this package's name on the Power. Many thanks

↧

Re: How to configure host chaining for ConnectX-5 VPI

July 9, 2018, 2:22 am

≫ Next: ConnectX-2 and ESXi 6.5 slow speed

≪ Previous: Re: mlnxofedinstall of 4.3-3.0.2.1-rhel7.5alternate-aarch64 has some checking bug need to be fixed

Hi Simon,

question: do you want to use Storage Spaces Direct in Windows Server 2016 with it? That is at least my problem.

Cheers Carsten Rachfahl

Microsoft Clud & Datacenter Managment MVP

↧

ConnectX-2 and ESXi 6.5 slow speed

July 9, 2018, 2:44 am

≫ Next: MLNX+NVIDIA ASYNC GPUDirect - Segmentation fault: invalid permissions for mapped object running mpi with CUDA

≪ Previous: Re: How to configure host chaining for ConnectX-5 VPI

Hi,

I have ConnectX-2 card in my ESXi 6.5 server

Added this card to FreeNAS (VM) and OpenMediaVault (VM) as VMXNET3 adaptor

When i test the speed between the FreeNAS/OMV (VM) and Windows (PC) i get only 200-300 MBytes/sec

When i test the speed between the FreeNAS (VM) and OMV (VM) i get 4000-5000 MBytes/sec - on the same vSwitch

When i test the speed between the Windows (PC) and Windows (PC) i get only 900-1000 MBytes/sec - Result that i expected to be in ESXi

Tested speeds with iperf

What can be the problem? drivers?

By the way when i passthrough the card to FreeNAS, the FreeNAS doesn't recognize it.

Thank you for help

↧

MLNX+NVIDIA ASYNC GPUDirect - Segmentation fault: invalid permissions for mapped object running mpi with CUDA

July 9, 2018, 4:35 am

≫ Next: Mellanox compatible InfiniBand cards

≪ Previous: ConnectX-2 and ESXi 6.5 slow speed

##Problem: Segmentation fault: invalid permissions for mapped object running mpi with CUDA

##Configurations

OS:

******************************

Centos 7.5 (3.10.0-862.el7.x86_64)

Connetivity:

******************************

Back to Back

Softwares:

******************************

cuda-repo-rhel7-9-2-local-9.2.88-1.x86_64

nccl_2.2.13-1+cuda9.2_x86_64.tar

MLNX_OFED_LINUX-4.3-3.0.2.1-rhel7.5-x86_64.tgz

nvidia-peer-memory_1.0-7.tar.gz

openmpi-3.1.1.tar.bz2

osu-micro-benchmarks-5.4.2.tar.gz

[root@LOCALNODE ~]# lsmod | grep nv_peer_mem

nv_peer_mem 13163 0

ib_core 283851 11 rdma_cm,ib_cm,iw_cm,nv_peer_mem,mlx4_ib,mlx5_ib,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib

nvidia 14019833 9 nv_peer_mem,nvidia_modeset,nvidia_uvm

[root@LOCALNODE ~]#

## Steps Followed

Followed document : http://www.mellanox.com/related-docs/prod_software/Mellanox_GPUDirect_User_Manual_v1.5.pdf

Openmpi command: mpirun --allow-run-as-root -host LOCALNODE,REMOTENODE -mca btl_openib_want_cuda_gdr 1 -np 2 -mca btl_openib_if_include mlx5_0:1 -mca -bind-to core -cpu-set 23 -x CUDA_VISIBLE_DEVICES=0 /usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency -d cuda D D

## Two issues/problem seen where we need help from MNLX

1. While running osu micro benchmarks between Device to Device (i.e D D ) getting segmentation fault.

2. Though normal RDMA traffic (ib_send_*) is running fine between both the Nodes and on Both the Ports, But while running osu micro benchmarks, traffic is only going through Port 1 (MLX5_1)

Note: NVidia GPU and Mellanox Adapter are in different NUMA Nodes.

[root@LOCALNODE ~]# cat /sys/module/mlx5_core/drivers/pci\:mlx5_core/0000\:*/numa_node

[root@LOCALNODE ~]# cat /sys/module/nvidia/drivers/pci\:nvidia/0000\:*/numa_node

[root@LOCALNODE ~]# lspci -tv | grep -i nvidia

| +-02.0-[19]----00.0 NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB]

[root@LOCALNODE ~]# lspci -tv | grep -i mellanox

-+-[0000:d7]-+-02.0-[d8]--+-00.0 Mellanox Technologies MT27800 Family [ConnectX-5]

| | \-00.1 Mellanox Technologies MT27800 Family [ConnectX-5]

## Issue Details:

******************************

Issue 1:

[root@LOCALNODE nccl-tests]# mpirun --allow-run-as-root -host LOCALNODE,REMOTENODE -mca btl_openib_want_cuda_gdr 1 -np 2 -mca btl_openib_if_include mlx5_0 -mca -bind-to core -cpu-set 23 -x CUDA_VISIBLE_DEVICES=0 /usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency -d cuda D D

--------------------------------------------------------------------------

No OpenFabrics connection schemes reported that they were able to be

used on a specific port. As such, the openib BTL (OpenFabrics

support) will be disabled for this port.

Local host: LOCALNODE

Local device: mlx5_0

Local port: 1

CPCs attempted: rdmacm, udcm

--------------------------------------------------------------------------

# OSU MPI-CUDA Latency Test v5.4.1

# Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)

# Size Latency (us)

0 1.20

[LOCALNODE:5297 :0:5297] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x7fd69ea00000)

==== backtrace ====

0 0x0000000000045e92 ucs_debug_cleanup() ???:0

1 0x000000000000f6d0 _L_unlock_13() funlockfile.c:0

2 0x0000000000156e50 __memcpy_ssse3_back() :0

3 0x00000000000318e1 uct_rc_mlx5_ep_am_short() ???:0

4 0x0000000000027a5a ucp_tag_send_nbr() ???:0

5 0x0000000000004c71 mca_pml_ucx_send() ???:0

6 0x0000000000080202 MPI_Send() ???:0

7 0x0000000000401d42 main() /home/NVIDIA/osu-micro-benchmarks-5.4.2/mpi/pt2pt/osu_latency.c:116

8 0x0000000000022445 __libc_start_main() ???:0

9 0x000000000040205b _start() ???:0

===================

-------------------------------------------------------

Primary job terminated normally, but 1 process returned

a non-zero exit code. Per user-direction, the job has been aborted.

-------------------------------------------------------

--------------------------------------------------------------------------

mpirun noticed that process rank 0 with PID 0 on node LOCALNODE exited on signal 11 (Segmentation fault).

--------------------------------------------------------------------------

[LOCALNODE:05291] 1 more process has sent help message help-mpi-btl-openib-cpc-base.txt / no cpcs for port

[LOCALNODE:05291] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

[root@LOCALNODE nccl-tests]#

Issue 2:

[root@LOCALNODE ~]# cat /sys/class/infiniband/mlx5_0/ports/1/counters/port_*

[root@LOCALNODE ~]# cat /sys/class/infiniband/mlx5_1/ports/1/counters/port_*

18919889

1011812

9549739941

35318041

[root@LOCALNODE ~]#

Thanks & Regards

Ratan B

↧

Mellanox compatible InfiniBand cards

July 9, 2018, 10:16 am

≫ Next: Re: How to enable VF multi-queue for SR-IOV on KVM?

≪ Previous: MLNX+NVIDIA ASYNC GPUDirect - Segmentation fault: invalid permissions for mapped object running mpi with CUDA

Hello, I would like to know which are the compatible InfiniBand cards with the Supermicro 1028R-TDW serverboard.

More specifically, I would also like to know if anyone knows about compatibility of this serverboard with the MT25408 InfiniBand cards.

Best Regards!
Andrés.

↧

Re: How to enable VF multi-queue for SR-IOV on KVM?

July 9, 2018, 8:01 pm

≫ Next: I want to know the compatibility of Cisco SW and Mellanox SW.

≪ Previous: Mellanox compatible InfiniBand cards

As I wrote above, I have manually set up irq in the virtual machine. You can see that there is only one queue in the hardware queue. I guess the driver does not support SR-IOV multi-queue function in the virtual machine.

VM：

[root@host-01 ~]# ls -la /sys/devices/pci0000\:00/0000\:00\:04.0/net/ib0/queues/

total 0

drwxr-xr-x 4 root root 0 Jun 29 10:11 .

drwxr-xr-x 5 root root 0 Jun 29 10:11 ..

drwxr-xr-x 2 root root 0 Jun 29 10:11 rx-0

drwxr-xr-x 3 root root 0 Jun 29 10:11 tx-0

Guest Host：

[root@testserver-1 ~]# ls -la /sys/devices/pci0000\:80/0000\:80\:01.0/0000\:81\:00.0/net/ib0/queues/

total 0

drwxr-xr-x 35 root root 0 Jun 28 19:59 .

drwxr-xr-x 5 root root 0 Jul 10 10:51 ..

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-0

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-1

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-10

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-11

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-12

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-13

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-14

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-15

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-2

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-3

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-4

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-5

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-6

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-7

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-8

drwxr-xr-x 2 root root 0 Jun 28 19:59 rx-9

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-0

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-1

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-10

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-11

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-12

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-13

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-14

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-15

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-16

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-2

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-3

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-4

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-5

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-6

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-7

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-8

drwxr-xr-x 3 root root 0 Jun 28 19:59 tx-9

↧

I want to know the compatibility of Cisco SW and Mellanox SW.

July 9, 2018, 10:46 pm

≫ Next: Re: How to configure host chaining for ConnectX-5 VPI

≪ Previous: Re: How to enable VF multi-queue for SR-IOV on KVM?

Customer requirements include LRM compatibility verification between Cisco switches and Melanox switches.

cisco switch : cisco catalyst 6509-E 10Gbase-LRM TYPE

mellanox switch : sn 2010 10Gbase LRM TYPE

I have confirmed that 10gbase lrm Gbic is recognized by melanox switch.

However, I did not have a Cisco device and I was not sure if it was compatible between the two devices.

If someone have tried connecting the two devices using 10gbase lrm, please let me know.

↧

Re: How to configure host chaining for ConnectX-5 VPI

July 10, 2018, 6:57 am

≫ Next: Assign a MAC to a VLAN

≪ Previous: I want to know the compatibility of Cisco SW and Mellanox SW.

Putting this out there since we had so many complications with host chaining in order for it to work; and something Google will pick up is infinitely better than nothing.

The idea we had was that we wanted something that would have redundancy. With a switch configuration, we'd have to get two switches, and a lot more cables; very expensive.

HOST_CHAINING_MODE was a great idea, switchless, less cables, and less expense.

You do NOT need a subnet manager for this to work!

In order to get it working:

Aside: There is no solid documentation on this process as of this writing

1. What Marc said was accurate, set HOST_CHAINING_MODE=1 via the mlxconfig utility.

Aside: Both the VPI and EN type cards will work with host chaining. The VPI type does require you to put it into ethernet mode.

2. Restart the servers to set the mode.

3. Put all of the ports on the same subnet. EG. 172.19.50.0/24 Restart networking stack as required.

4. From there, all ports should be pingable from all other ports.

5. Set the MTU up to 9000. (see caveats for bug; lower to 8000 if 9k doesn't work)

Aside: The MTU could be higher; I have been unable to test higher due to a bug in the firmware. Around these forums, I've seen 9k floated about, and it seems like a good standard number.

If you aren't getting the throughput you're expecting, do ALL of the tuning from BIOS (Performance Tuning for Mellanox Adapters , BIOS Performance Tuning Example ) and software (Understanding PCIe Configuration for Maximum Performance , Linux sysctl Tuning ) for all servers. It does make a difference. On our small (under-powered) test boxes, we gained 20 GBit/s from our starting benchmark.

Another thing to make sure is that you have the proper PCI bandwidth to support line rate; and get the socket direct cards if you do not.

There are a lot of caveats.

The bandwidth that is possible IS link speed, only between two directly connected nodes. From our tests, there is a small dip in performance on each hop; and each hop also limits your max theoretical throughput.
FW version 16.22.1002 had a few bugs related to host chaining; one of those was the max MTU supported was 8150. Higher MTU, less IP overhead.
The 'ring' topology is a little funny. It is only one direction. If there is a cable cut scenario, it will NOT route around properly for certain hosts.

Aside: A cable cut is different than a cable disconnect. The transceiver itself registers whether there is a cable attached or not. When there is no cable present on one side, but is on the other, the above scenario is true (not properly routing.) When both sides of the cable are removed, the ring outright stops and does not work at all. I don't have any data to support an actual cable cut.

The ring works as described in the (scant) documentation, but is as follows from the firmware release notes:

Received packets from the wire with DMAC equal to the host MAC are forwarded to the local host
Received traffic from the physical port with DMAC different than the current MAC are forwarded to the other port:
Traffic can be transmitted by the other physical port
Traffic can reach functions on the port's Physical Function
Device allows hosts to transmit traffic only with its permanent MAC
To prevent loops, the received traffic from the wire with SMAC equal to the port permanent MAC is dropped (the packet cannot start a new loop)

If you run into problems, tcpdump is your friend, and ping is a great little tool to check your sanity.

Hope any of this helps anyone in the future,

Daniel

↧

Assign a MAC to a VLAN

July 10, 2018, 2:56 pm

≫ Next: rx_fifo_errors and rx_dropped errors using VMA where CPU user less than 40%

≪ Previous: Re: How to configure host chaining for ConnectX-5 VPI

Hi all.

Sorry my english.

Im using SX 1024 with software version SX_PPC_M460EX SX_3.3.5006.

I can assign a MAC Address to a VLAN? I need create a VLAN and two MAC Address the VLAN.

It sounds pretty simple, but I did not find it in the manual.

Thank you all.

↧

rx_fifo_errors and rx_dropped errors using VMA where CPU user less than 40%

July 10, 2018, 11:01 pm

≫ Next: Re: mlx5_core - Cable error / Power budget exceeded

≪ Previous: Assign a MAC to a VLAN

Hi,

I'm getting rx_fifo errors and rx_dropped_errors receiving UDP packets. I have 8 applications each receiving ~8000 byte UDP packets from 7 different pieces of hardware with different IP addresses. The packet and data rate is identical for each application - totalling 440k packets/sec and 29 Gbit/sec respectively. The packets are all transmitted synchronously, at a rate of 2x8000 byte packets every 1.5 ms for each of 56 different hardware cards.

In this mode, rx_dropped and rx_fifo_errors increased at a few tens of packets per second. Attached is a dump of what ethtool shows. vma_stats shows no dropped packets. Each application is bound with numactl to NUMA node 1 (which is is where the NIC is attached). top shows each core on that node is running at < 40% CPU. The switch shows no dropped packets.

Libvma configuration as shown below. I had the same problem when not using libvma (i.e. vanilla linux kernel packet processing).

Can anyone give me some hints on where to look to reduce the number of lost packets?

Many thanks in advance,

Keith

export VMA_MTU=9000 #don't need to set - should be intelligent but we'll set it anyway for now
export VMA_RX_BUFS=32768 # number of buffers -each of 1xMTU. Default is 200000 = 1 GB!
export VMA_RX_WRE=4096 # number work requests
export VMA_RX_POLL=0 # Don't waste CPU time polling. WE don't need to
export VMA_TX_BUFS=256 # Dont need many of these, so make it smalle
export VMA_TX_WRE=32 # Don't need to tx so make this small to save memory
export VMA_INTERNAL_THREAD_AFFINITY=15
export VMA_MEM_ALLOC_TYPE=0
export VMA_THREAD_MODE=0 # all socket processing is single threaded
export VMA_CQ_AIM_INTERRUPTS_RATE_PER_SEC=200
export VMA_CQ_KEEP_QP_FULL=0 # this does packet drops according ot the docs??
export VMA_SPEC=throughput

ban115@tethys:~$ lspci -v | grep -A 10 ellanox
84:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]
    Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3]
    Flags: bus master, fast devsel, latency 0, IRQ 74, NUMA node 1
    Memory at c9800000 (64-bit, non-prefetchable) [size=1M]
    Memory at c9000000 (64-bit, prefetchable) [size=8M]
    Expansion ROM at <ignored> [disabled]
    Capabilities: <access denied>
    Kernel driver in use: mlx4_core
    Kernel modules: mlx4_core

ban115@tethys:~$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 15968 MB
node 0 free: 129 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 16114 MB
node 1 free: 2106 MB
node distances:
node 0 1
0: 10 21
1: 21 10

↧

Re: mlx5_core - Cable error / Power budget exceeded

July 11, 2018, 10:40 am

≫ Next: Re: How to enable VF multi-queue for SR-IOV on KVM?

≪ Previous: rx_fifo_errors and rx_dropped errors using VMA where CPU user less than 40%

Found solution:

sudo mlxconfig -e -d 04:00.0 set ADVANCED_POWER_SETTINGS=True

sudo mlxconfig -e -d 04:00.0 set DISABLE_SLOT_POWER_LIMITER=True

↧

Re: How to enable VF multi-queue for SR-IOV on KVM?

July 12, 2018, 1:27 am

≫ Next: Re: mlnxofedinstall of 4.3-3.0.2.1-rhel7.5alternate-aarch64 has some checking bug need to be fixed

≪ Previous: Re: mlx5_core - Cable error / Power budget exceeded

Hi,

Please open a support ticket.

Best Regards

Marc

↧

Re: mlnxofedinstall of 4.3-3.0.2.1-rhel7.5alternate-aarch64 has some checking bug need to be fixed

July 12, 2018, 4:57 am

≫ Next: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 On Ubuntu 16.04

≪ Previous: Re: How to enable VF multi-queue for SR-IOV on KVM?

Hi,

I have here a machine with Centos 7.5 on ARM and cannot see the same output.

I would like to investigate it even if you already a workaround.

For this purpose, I need you to open a case at support@mellanox.com

Thanks in advance

Marc

↧

Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 On Ubuntu 16.04

July 12, 2018, 3:56 pm

≫ Next: Small redundant MLAG setup

≪ Previous: Re: mlnxofedinstall of 4.3-3.0.2.1-rhel7.5alternate-aarch64 has some checking bug need to be fixed

Hi, have a Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 installed server with Ubuntu 16.04 and were able to install the drivers etc.,

But It card doesn't show up using ifconfig -a

Any ideas? Is this version of OS and Kernel supported for ConnectX VPI PCIe 2.0?

Here is more info:

root@ubuntu16-sdc:~# root@ubuntu16-sdc:~# uname -a

Linux ubuntu16-sdc 4.8.0-44-generic #47~16.04.1-Ubuntu SMP Wed Mar 22 18:51:56 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

root@ubuntu16-sdc:~# lspci | grep Mell

03:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)

root@ubuntu16-sdc:~# /etc/init.d/openibd restart

Unloading HCA driver: [ OK ]

Loading HCA driver and Access Layer: [ OK ]

root@ubuntu16-sdc:~# hca_self_test.ofed

---- Performing Adapter Device Self Test ----

Number of CAs Detected ................. 1

PCI Device Check ....................... PASS

Kernel Arch ............................ x86_64

Host Driver Version .................... MLNX_OFED_LINUX-4.4-1.0.0.0 (OFED-4.4-1.0.0): 4.8.0-44-generic

Host Driver RPM Check .................. PASS

Firmware on CA #0 HCA .................. v2.10.0720

Host Driver Initialization ............. PASS

Number of CA Ports Active .............. 0

Kernel Syslog Check .................... PASS

Node GUID on CA #0 (HCA) ............... NA

------------------ DONE ---------------------

root@ubuntu16-sdc:~# mlxfwmanager --online -u -d 0000:03:00.0

Querying Mellanox devices firmware ...

Device #1:

----------

Device Type: ConnectX2

Part Number: MHQH19B-XTR_A1-A3

Description: ConnectX-2 VPI adapter card; single-port 40Gb/s QSFP; PCIe2.0 x8 5.0GT/s; tall bracket; RoHS R6

PSID: MT_0D90110009

PCI Device Name: 0000:03:00.0

Port1 MAC: 0002c94f2ec0

Port2 MAC: 0002c94f2ec1

Versions: Current Available

FW 2.10.0720 N/A

Status: No matching image found

root@ubuntu16-sdc:~# lsmod | grep ib

ib_ucm 20480 0

ib_ipoib 172032 0

ib_cm 53248 3 rdma_cm,ib_ipoib,ib_ucm

ib_uverbs 106496 2 ib_ucm,rdma_ucm

ib_umad 24576 0

mlx5_ib 270336 0

mlx5_core 806912 2 mlx5_fpga_tools,mlx5_ib

mlx4_ib 212992 0

ib_core 286720 10 ib_cm,rdma_cm,ib_umad,ib_uverbs,ib_ipoib,iw_cm,mlx5_ib,ib_ucm,rdma_ucm,mlx4_ib

mlx4_core 348160 2 mlx4_en,mlx4_ib

mlx_compat 20480 15 ib_cm,rdma_cm,ib_umad,ib_core,mlx5_fpga_tools,ib_uverbs,mlx4_en,ib_ipoib,mlx5_core,iw_cm,mlx5_ib,mlx4_core,ib_ucm,rdma_ucm,mlx4_ib

devlink 28672 4 mlx4_en,mlx5_core,mlx4_core,mlx4_ib

libfc 114688 1 tcm_fc

libcomposite 65536 2 usb_f_tcm,tcm_usb_gadget

udc_core 53248 2 usb_f_tcm,libcomposite

scsi_transport_fc 61440 3 qla2xxx,tcm_qla2xxx,libfc

target_core_iblock 20480 0

target_core_mod 356352 9 iscsi_target_mod,usb_f_tcm,vhost_scsi,target_core_iblock,tcm_loop,tcm_qla2xxx,target_core_file,target_core_pscsi,tcm_fc

configfs 40960 6 rdma_cm,iscsi_target_mod,usb_f_tcm,target_core_mod,libcomposite

libiscsi_tcp 24576 1 iscsi_tcp

libiscsi 53248 2 libiscsi_tcp,iscsi_tcp

scsi_transport_iscsi 98304 3 libiscsi,iscsi_tcp

Please let me know if any other info needed..

↧

Small redundant MLAG setup

July 15, 2018, 3:46 am

≫ Next: DPDK with MLX4 VF on Hyper-v VM

≪ Previous: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 On Ubuntu 16.04

Hi there,

first of all thanks for all the great information that can be found here.

I'm trying to build a fully redundant setup for two racks in different locations with the smallest amount of switches. From my understanding MLAG will help me to keep redundancy towards the servers. That means for each rack:

use two switches to create a mlag domain
attach all servers to both switches.

So I need at least 4 switches. Now to my problem. I want to avoid another two spine switches. For my setup (two racks) this seems be a little overpowered and the spines would only use 4-6 ports each. The question mark in the picture shows where the magic must happen.

Looking at other posts I see the following option.

With MLNX-OS 3.6.6102 STP and MLAG can coexist.
I implement a fully redundant interconnect of all 4 switches.
I activate MSTP (as I have multiple VLANs)
MSTP will allow to utilize interconnection as good as possible

Is this ok or am I missing something?

Thanks in advance.

↧