Channel: Mellanox Interconnect Community: Message List

Re: Can multiple versions of mlnx-ofed exist in the same IB fabric?


Hi Greg,

 

It is recommended to perform the upgrade during a maintenance window rather than in production, because there are many differences between the 2.x and 3.x versions, and part of the driver upgrade is a firmware upgrade as well.

 

Best Regards,

Viki


Is this the best our FDR adapters can do?


We have a small test setup, illustrated below. I have done some ib_write_bw tests and got "decent" numbers, but not as fast as I anticipated. First, some background on the setup:

 

[Attached image: ipoib_for_the_network_layout_after.png]

 

The two 1U storage servers each have an EDR HCA (MCX455A-ECAT). The other four servers each have a ConnectX-3 VPI FDR 40/50Gb/s mezzanine HCA OEMed by Mellanox for Dell. The firmware version is 2.33.5040. This is not the latest (2.36.5000, according to hca_self_test.ofed), but I am new to IB and still getting up to speed with Mellanox's firmware update tools. The EDR HCA firmware was updated when MLNX_OFED was installed.

 

All servers:

CPU: 2 x Intel E5-2620v3 2.4GHz, 6 cores/12 HT

RAM: 8 x 16GiB DDR4 1833MHz DIMMs

OS: CentOS 7.2 Linux ... 3.10.0-327.28.2.el7.x86_64 #1 SMP Wed Aug 3 11:11:39 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

OFED: MLNX_OFED_LINUX-3.3-1.0.4.0 (OFED-3.3-1.0.4)

 

A typical ib_write_bw test:

 

Server:

[root@fs00 ~]# ib_write_bw -R

 

************************************

* Waiting for client to connect... *

************************************

---------------------------------------------------------------------------------------

                    RDMA_Write BW Test

Dual-port       : OFF Device         : mlx5_0

Number of qps   : 1 Transport type : IB

Connection type : RC Using SRQ      : OFF

CQ Moderation   : 100

Mtu             : 4096[B]

Link type       : IB

Max inline data : 0[B]

rdma_cm QPs  : ON

Data ex. method : rdma_cm

---------------------------------------------------------------------------------------

Waiting for client rdma_cm QP to connect

Please run the same command with the IB/RoCE interface IP

---------------------------------------------------------------------------------------

local address: LID 0x03 QPN 0x01a5 PSN 0x1f7290

remote address: LID 0x05 QPN 0x40248 PSN 0xcf02b4

---------------------------------------------------------------------------------------

#bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]

65536      5000             6084.80            6084.72    0.097356

---------------------------------------------------------------------------------------

 

Client:

[root@sc2u0n0 ~]# ib_write_bw -R 192.168.111.150

---------------------------------------------------------------------------------------

                    RDMA_Write BW Test

Dual-port       : OFF Device         : mlx5_1

Number of qps   : 1 Transport type : IB

Connection type : RC Using SRQ      : OFF

TX depth        : 128

CQ Moderation   : 100

Mtu             : 4096[B]

Link type       : Ethernet

Gid index       : 0

Max inline data : 0[B]

rdma_cm QPs  : ON

Data ex. method : rdma_cm

---------------------------------------------------------------------------------------

local address: LID 0x05 QPN 0x40248 PSN 0xcf02b4

GID: 254:128:00:00:00:00:00:00:00:00:00:00:00:00:00:00

remote address: LID 0x03 QPN 0x01a5 PSN 0x1f7290

GID: 03:00:00:00:00:00:00:00:96:199:05:141:123:127:00:00

---------------------------------------------------------------------------------------

#bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]

65536      5000             6084.80            6084.72    0.097356

---------------------------------------------------------------------------------------

 

Now 6084 MB/s ~ 48.68 Gbps. Even taking the 64/66b encoding overhead into account, shouldn't we be seeing over 50 Gbps, or is this the best this setup can do? Is there anything I can do to push the speed up further?
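For reference, here is the arithmetic I am working from (a back-of-the-envelope sketch; the 14.0625 Gb/s per-lane signalling rate and the 64/66b encoding are the standard FDR parameters, not values reported by the tools above):

    # Expected usable data rate of an FDR 4x link after 64/66b encoding.
    lane_rate_gbps = 14.0625     # FDR signalling rate per lane
    lanes = 4
    encoding = 64 / 66           # 64/66b encoding overhead

    print(f"{lane_rate_gbps * lanes * encoding:.2f} Gb/s")  # ~54.55 Gb/s usable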

 

I look forward to hearing experiences and observations from the experienced camp! Thanks!

Re: Is this the best our FDR adapters can do?


One thing to keep in mind is that you'll hit the bandwidth of the PCIe bus.

I've not used the ib_write_bw test myself, but I'm fairly sure it's not actually handling the data, just accepting it and tossing it away, so the result is going to be a theoretical maximum.

In real-life situations that bus is going to be handling all data in and out of the CPU, and on my oldest motherboards that maxes out at 25Gb/s, which is what I hit with fio tests on QDR links. I've heard that with PCIe Gen 3 you'll get up to 35Gb/s.
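For context, a quick sketch of the theoretical ceilings behind those numbers (my own back-of-the-envelope figures, assuming the older boards are PCIe Gen2; QDR signals at 10 Gb/s per lane and PCIe Gen2 at 5 GT/s per lane, both with 8b/10b encoding):

    # Rough theoretical ceilings for the QDR-over-PCIe-Gen2 case discussed above.
    # Standard line rates and encodings; real throughput is lower due to
    # protocol overhead, consistent with the ~25 Gb/s observed.
    qdr_4x_usable = 10 * 4 * 8 / 10     # 10 Gb/s/lane, 4 lanes, 8b/10b encoding
    pcie_gen2_x8 = 5 * 8 * 8 / 10       # 5 GT/s/lane, 8 lanes, 8b/10b encoding

    print(f"QDR 4x usable rate  : {qdr_4x_usable:.0f} Gb/s")  # 32 Gb/s
    print(f"PCIe Gen2 x8 ceiling: {pcie_gen2_x8:.0f} Gb/s")   # 32 Gb/s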

Generally, whenever newer networking tech rolls out, there is nothing a single computer can do to saturate the link unless it's pushing junk data; the only way to really max it out is switch-to-switch (hardware-to-hardware) traffic.

Of course, using IPoIB or anything other than native IB traffic is going to cost you performance. In my case of NFS over IPoIB (with or without RDMA), I quickly slam into the bandwidth of my SSDs. The only exception I'll have is the Oracle DB, where the low latency is what I'm after, as the database is small enough to fit in RAM.

Re: How to change speed from FDR to QDR using ibportstate command in centos 6.2


Hi,

 

Thanks for the help; unfortunately, however, it didn't solve the problem.

 

Before running the commands:

 

CA PortInfo:

# Port info: Lid 144 port 1

LinkState:.......................Active

PhysLinkState:...................LinkUp

Lid:.............................144

SMLid:...........................442

LMC:.............................0

LinkWidthSupported:..............1X or 4X

LinkWidthEnabled:................1X or 4X

LinkWidthActive:.................4X

LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps

LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps

LinkSpeedActive:.................10.0 Gbps

LinkSpeedExtSupported:...........14.0625 Gbps

LinkSpeedExtEnabled:.............14.0625 Gbps

LinkSpeedExtActive:..............14.0625 Gbps

# Extended Port info: Lid 144 port 1

StateChangeEnable:...............0x00

LinkSpeedSupported:..............0x01

LinkSpeedEnabled:................0x01

LinkSpeedActive:.................0x00

 

 

After running the above-mentioned commands, i.e.

ibportstate 144 1 fdr10 0 espeed 30

ibportstate 144 1 reset

 

we get the output below:

 

CA PortInfo:

# Port info: Lid 144 port 1

LinkState:.......................Active

PhysLinkState:...................LinkUp

Lid:.............................144

SMLid:...........................442

LMC:.............................0

LinkWidthSupported:..............1X or 4X

LinkWidthEnabled:................1X or 4X

LinkWidthActive:.................4X

LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps

LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps

LinkSpeedActive:.................10.0 Gbps

LinkSpeedExtSupported:...........14.0625 Gbps

LinkSpeedExtEnabled:.............14.0625 Gbps

LinkSpeedExtActive:..............14.0625 Gbps

# Extended Port info: Lid 144 port 1

StateChangeEnable:...............0x00

LinkSpeedSupported:..............0x01

LinkSpeedEnabled:................0x00

LinkSpeedActive:.................0x00

 

Can you please help? The HCA details are below:

 

CA 'mlx4_0'

        CA type: MT4099

        Number of ports: 1

        Firmware version: 2.30.8000

        Hardware version: 0

        Node GUID: 0x0002c903001f69e0

        System image GUID: 0x0002c903001f69e3

        Port 1:

                State: Active

                Physical state: LinkUp

                Rate: 56

                Base lid: 144

                LMC: 0

                SM lid: 442

                Capability mask: 0x02514868

                Port GUID: 0x0002c903001f69e1

                Link layer: InfiniBand

Re: Which ESXi driver to use for SRP/iSER over IB (not Eth!)?


KVM? No! KVM also has many limitations, for example with EoIB, etc.

InfiniBand communication relies on the SM (Subnet Manager).

The SM consists of several components and APIs, but the SM architecture was not designed for the hypervisor world.

Historically, many problems have existed in vSphere environments.

1st: ESXi.

In the vSphere 4.x era, VMware gave us two choices: ESX and ESXi. ESX consisted of the hypervisor plus an OEMed Red Hat console; ESXi consists of the hypervisor only.

In my experience, some IB tools did not work on an ESXi host, although the same tools worked nicely on an ESX host.

ESXi is not a general-purpose kernel, and I think that causes major IB driver porting problems.

2nd: the InfiniBand design itself.

The hypervisor controls all communication between guest VMs and the host network, while RDMA has a kernel-bypass feature (zero copy, RDMA read/write).

This feature is coordinated by the SM, but to add a hypervisor to such a network, many complex modifications must be made to the SM and the IB APIs.

There is no IBTA standard for this yet; it will presumably be standardized in the near future.

3rd: RDMA storage protocols.

The InfiniBand specification covers RDMA and all the ULP protocols.

Linux OFED improves very quickly, and no one knows which OFED version will be ported to the latest ESXi version.

Much complexity exists, and many issues must still be resolved in an ESXi environment.

iSER is also a good candidate for an ESXi RDMA protocol, but some critical problems exist.

I think we must check the latest Linux OFED release notes, which list many bugs and limitations.

Linux is a good platform, but it also suffers from IB's unique limitations.

Conclusion:

I think IB is the fastest and most efficient high-speed interconnect on the planet, but it is not ready for the enterprise network environment yet.

Mellanox says in their product brochures that they support the major OS environments, but in many cases that hasn't been so: beta-level drivers, manuals, bugs, limitations, etc.

Eventually, all of these problems will be overcome with new standards and products.

But not now...






Re: Is this the best our FDR adapters can do?


 

Thanks for sharing your experience.  I did the following:

 

[root@sc2u0n0 ~]# dmidecode |grep PCI

  Designation: PCIe Slot 1

  Type: x8 PCI Express 3 x16

  Designation: PCIe Slot 3

  Type: x8 PCI Express 3

 

lspci -vv

[...]

02:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

[...]

                LnkCap: Port #8, Speed 8GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 unlimited

                        ClockPM- Surprise- LLActRep- BwNot-

 

So, the theoretical speed should be 8Gbps/lane x 8 lane x 128b/130b = 63 Gbps.  In fact, we just did a fio sweep using fio-2.12.  The read is quite reasonable. We are now investigating why the write is so low.
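To spell out that arithmetic (a quick sketch; the 8 GT/s rate, the x8 width, and the 128b/130b encoding come from the PCIe Gen3 LnkCap line above, and this ignores PCIe protocol overhead):

    # Theoretical ceiling of the PCIe Gen3 x8 slot reported by lspci above.
    gt_per_s = 8            # PCIe Gen3 transfer rate per lane
    lanes = 8               # LnkCap: Width x8
    encoding = 128 / 130    # PCIe Gen3 128b/130b encoding

    print(f"{gt_per_s * lanes * encoding:.1f} Gb/s")  # ~63.0 Gb/s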

 

A. Read test results

 

  • Chunk size = 2 MiB
  • Num. Jobs = 32
  • IO Depth = 128
  • File size = 500 GiB
  • Test time = 360 seconds
Mode               Speed, Gbps   IOPS
psync, direct      47.77         2986
psync, buffered    24.49         1530
libaio, direct     49.17         3073

 

 

B. Write test results

 

  • Chunk size = 2 MiB
  • Num. Jobs = 32
  • IO Depth = 128
  • File size = 500 GiB
  • Test time = 360 seconds
Mode               Speed, Gbps   IOPS
psync, direct      24.14         1509
psync, buffered    9.32          583
libaio, direct     22.51         1407
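As a rough cross-check of the tables above (my own arithmetic, using the 2 MiB chunk size and the IOPS figures quoted):

    # Cross-check: throughput implied by IOPS x chunk size (2 MiB chunks).
    chunk_bits = 2 * 2**20 * 8
    for label, iops in [("read,  libaio direct", 3073),
                        ("write, psync  direct", 1509)]:
        print(f"{label}: {iops * chunk_bits / 1e9:.1f} Gb/s")
    # Prints ~51.6 and ~25.3 Gb/s, about 5% above the table's 49.17 and 24.14,
    # which is roughly the MiB-vs-MB factor (2**20 / 10**6 ~= 1.049).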

 


 

 

Is srp supported in RHEL7.2 PPC64 ?


Hi all,

I'm trying to hook up my IB storage device to my Power7 server running RHEL 7.2 ppc64. I installed the latest OFED, but SRP isn't working. I can't find any info anywhere that says it is not supported.

 

The cards are ConnectX-2 VPI cards. Has anyone tried this?

 

Regards

 

Mark Guz

Re: mlnx_tune does not detect the BIOS I/O non-posted prefetch settings?


IMHO this mlnx_tune Python script has a bug: the logic used to issue the warning message is incorrect.

 

  1462                  pci_width_compatible_to_cpu_ok = not (int(self.actual_pci_width) >= PciDeviceInfo.MIN_LENGTH_CHECK_HSW_COMPATIBLE and cpu_arch == Architecture.HASWELL)

 

In the same code,

  1408          MIN_LENGTH_CHECK_HSW_COMPATIBLE         = 16

 

Now if you follow the logic, the result is:

 

int(self.actual_pci_width) >= PciDeviceInfo.MIN_LENGTH_CHECK_HSW_COMPATIBLE is 16 >= 16, which is True
cpu_arch == Architecture.HASWELL is also True
True and True is True
not True is False
so pci_width_compatible_to_cpu_ok is False

 

But this is invalid! EDR needs PCIe Gen3 x16; that's mandatory, so why does the script complain? Using lspci -vv, it's easy to see the following:

 

LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
                       ClockPM- Surprise- LLActRep- BwNot-

See the Width x16 above?

 

So, even though lspci correctly reports that the EDR HCA in the system is in a PCIe Gen3 x16 slot, mlnx_tune issues a warning? This is a bug, I am afraid.
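To make the trace above concrete, here is a minimal standalone reproduction of the check (the names and the constant are taken from the mlnx_tune excerpt quoted above; the width and CPU values are the ones from my system):

    # Minimal standalone reproduction of the mlnx_tune check quoted above.
    MIN_LENGTH_CHECK_HSW_COMPATIBLE = 16     # from line 1408 of the script
    actual_pci_width = 16                    # lspci reports LnkCap Width x16
    cpu_is_haswell = True                    # cpu_arch == Architecture.HASWELL

    pci_width_compatible_to_cpu_ok = not (
        actual_pci_width >= MIN_LENGTH_CHECK_HSW_COMPATIBLE and cpu_is_haswell
    )
    print(pci_width_compatible_to_cpu_ok)    # False -> mlnx_tune issues the warning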

 


Re: Is this the best our FDR adapters can do?


I think I have the answer now. It's due to the confusion caused by the prevalent and inconsistent use of MB vs. MiB across different software applications.

 

When I ran ib_write_bw with the --report_gbits flag, I did see 50+ Gbps. That got me curious, so I assumed the MB/s output to actually be MiB/s; then 6084.72 MiB/s ≈ 51.04 Gbps, as anticipated.
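For anyone else who hits this, the two readings of the same 6084.72 figure differ by about 5% (a small sketch; treating the perftest "MB/sec" column as MiB/s is my assumption here):

    # The same ib_write_bw result interpreted as MB/s vs. MiB/s.
    bw = 6084.72
    print(f"As MB/s : {bw * 8e6 / 1e9:.2f} Gb/s")        # 48.68 Gb/s
    print(f"As MiB/s: {bw * 2**20 * 8 / 1e9:.2f} Gb/s")  # 51.04 Gb/s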

Re: Is srp supported in RHEL7.2 PPC64 ?


I'm not sure about the PPC64 architecture, but in general the situation with Mellanox InfiniBand and SRP support is this:

  • The SRP initiator is supported out of the box, both with the inbox drivers and with the latest Mellanox OFED distribution.
  • When using the RHEL inbox drivers, an SRP target is supported with LIO. If you install Mellanox OFED, the inbox drivers are removed and you lose SRP target support in LIO.
  • SCST has its own SRP target driver that can be used with both the inbox drivers and Mellanox OFED, up to the latest version, so you have to use the SCST target if you want to use Mellanox OFED.

Hope this helps.

 

Cheers!

Re: How to upgrade to latest version of MLNX-OS ?


Hi Anand,

 

Could you give me an FTP link for my two SX6036G VPI Gateway systems?

I read in another thread that the cable-detection bug will be resolved in a future MLNX-OS release.

My email address is here:

 

jhchoi AT kuwoo DOT co DOT kr

 

Best Regards.

Re: Is srp supported in RHEL7.2 PPC64 ?


Hi

 

Thanks for responding.

 

I'm not trying to create a target on my PPC64 box; I'm trying to connect to an external IB target.

 

I have a ConnectX-2 VPI card that does not work with the inbox drivers on ppc64:

 

mlx4_core: Initializing 0002:01:00.0
mlx4_core 0002:01:00.0: PCIe link speed is 5.0GT/s, device supports 5.0GT/s
mlx4_core 0002:01:00.0: PCIe link width is x8, device supports x8
mlx4_core 0002:01:00.0: Only 64 UAR pages (need more than 128)
mlx4_core 0002:01:00.0: Increase firmware log2_uar_bar_megabytes?
mlx4_core 0002:01:00.0: Failed to initialize user access region table, aborting

 

If I install the OFED, the card works, but there is no SRP support, so I am not able to connect to the external target.

 

Am I missing something?

 

Cheers

 

Mark Guz

 

Senior IT Specialist

Flash Systems & Technology

 

Office  713-587-1048

Mobile 832-290-8161

mguz@us.ibm.com

Re: 'State: Initializing' but works


Hi,

 

Can you please run the commands below and provide the output:

 

sminfo -P 1

sminfo -P 2

Re: Connecting SX1016 MLAG Pair to Cisco VSS


Hi Scott,

 

Can you copy and paste the switch configurations (including the Cisco 6509)? Please also indicate which ports are connected between the SX1016 switches and the Cisco.

Re: How to upgrade to latest version of MLNX-OS ?


Hi Anand,

 

EMC² is NOT supportive at all. Their support people do not even know about Mellanox switches (!?), although EMC² uses these switches in their ISILON NAS systems. Strange!

 

I recently devoted some time again to studying the extensive Mellanox documentation. My problem is getting the latest MLNX-OS onto my two MSX6012F-2BFS switches. These switches have non-standard EMC² firmware that degrades them to unmanaged switches, with the local 1Gb management LAN port not operable. I found the solution for how to do that in Appendix D of the MFT user manual version 2.7.

 

What I need is the following:

 

An OEM version of the FLINT program to burn the VPI switches' EEPROM for the first time. This version can burn MLNX-OS to an empty EEPROM. If the EEPROM really needs to be EMPTY, I would also need a program to bulk-erase the EEPROM.

 

The latest version of MLNX-OS for the PPC processor, plus documentation and release notes.

 

Thank you very much for your great support in the past!

 

Speedy


Re: How to upgrade to latest version of MLNX-OS ?


We don't support upgrading from EMC firmware to Mellanox firmware; you need to contact EMC for that.

 

Ophir.

WinOF v5.22 and Platform MPI problem on ConnectX-3 cards


Dear all,

I need to use Platform MPI 9.1.3 with my application running in -IBAL mode.

Unfortunately, no driver released later than WinOF v2.1.2 (3.xx, 4.xx, 5.xx) is detected by Platform MPI, and it fires errors like this:

Rank 0:1: MPI_Init: didn't find active interface/port

Rank 0:1: MPI_Init: Can't initialize RDMA device

Rank 0:1: MPI_Init: Internal Error: Cannot initialize RDMA protocol

 

If the -IBAL switch is removed, TCP is used and the program works, but InfiniBand is not utilized.

Please let me know the reason behind this. Why has Mellanox released so many drivers that are not compatible with Platform MPI?

Have I missed a tweak?

 

Thanks

Re: Odd, unsymmetric ib_send_lat results?


Six days ago, I reported that I had fixed my IPoIB setup. I found some time a while ago to revisit this issue. Indeed, as I suspected, the original, incorrect IPoIB setup was the cause of the jitter observed, most likely because I have been using some inexpensive third-party SFP+ DACs for the 10G Ethernet ports on the servers. So, problem solved for now.
