Channel: Mellanox Interconnect Community: Message List
Viewing all 6275 articles

Windows 2016 Storage Spaces Direct over IPoIB

Hello,

 

I am in need of some assistance regarding Ethernet vs Infiniband IPoIB and lossless networks.

 

We have a 3-node Windows 2016 Storage Spaces Direct cluster that was set up early last year, when documentation on S2D was still fairly sparse. We used InfiniBand IPoIB instead of Ethernet because we have been using it for years to connect our Hyper-V clusters to our Windows storage SANs. The S2D setup is not hyper-converged: storage and hypervisors are separate, so storage traffic and VM/Ethernet traffic do not share the same network.

 

We currently have a case open with Microsoft related to the Windows Server May Rollup, which caused a problem with a VD after a server restart. The MS engineers have stressed that the networking must be configured perfectly to create a lossless network, including the RoCE and QoS setup.

 

Since we are using IPoIB, this raises the question of whether our configuration is correct. Does InfiniBand IPoIB provide the resiliency needed for S2D traffic?

 

Please excuse me if the question seems too simple. I have been reading about RoCE and IPoIB for a couple of days and I think all the info is confusing me.

 

One added factor: since S2D was new at the time and there were a variety of unknowns, we included 4x 56Gb ports (2x MCX-354a-fcbt) in each node. The intent was to over-spec the network to reduce the possibility of congestion.

 

Thanks,

 

Todd


Re: Web interface error on SX6036

Hi Andrew,

Can you provide the version of Mellanox OS running on the switch?

 

Thanks,

Pratik

Remote VTEP mac learning is not working

[Image: topology diagram (pastedImage_0.png)]

 

I'm trying a VXLAN configuration with the above topology, with Mellanox switches (running Mellanox OS) as leaves and a Cisco N9k as spine. Both hosts are configured with VLAN 10 tagging. The loopbacks on the leaf switches are reachable via the spine. swp16 is configured as the nve port and VLAN 10 is bridged to VNI 10000 on both leaves. This is a controller-less configuration: remote VTEPs are added manually using the CLI, and remote learning is enabled using the commands below.

protocol nve

interface nve 1

interface nve 1 vxlan source interface loopback 1

interface ethernet 1/16 nve mode only force

interface nve 1 nve bridge 10000

interface ethernet 1/16 nve vlan 10 bridge 10000

no interface nve 1 nve fdb flood load-balance

interface nve 1 nve fdb flood bridge 10000 address 3.3.3.3

interface nve 1 nve fdb learning remote

 

But the hosts are not able to ping each other. What could be the problem here?

I can see that the VTEP on each switch has learned the MAC address of its directly connected host, but it does not learn the MACs of the hosts behind the remote VTEP. I used the command below to check the learned MACs.

show interface nve 1 mac-address-table

Also, the nve counters increase when host2 is pinged from host1, but no packets are going out of swp2.

show interface nve 1 counters
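
For readers following along, here is a rough mental model of what static flooding plus remote learning is expected to do in a generic flood-and-learn VXLAN setup. It is a Python sketch, not MLNX-OS internals: the class, its method names, the short MAC strings, and the 2.2.2.2 address are illustrative assumptions (only 3.3.3.3 comes from the configuration above).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Vtep:
    """Toy model of one VTEP for a single VNI (hypothetical, for illustration only)."""
    local_ip: str
    flood_list: List[str]                                      # static remote VTEP IPs
    mac_table: Dict[str, str] = field(default_factory=dict)    # MAC -> "local" or VTEP IP

    def from_host(self, src_mac: str, dst_mac: str) -> List[str]:
        """Frame arrives on the access port; return the VTEP IPs it is tunnelled to."""
        self.mac_table[src_mac] = "local"          # local learning
        target = self.mac_table.get(dst_mac)
        if target == "local":
            return []                              # same leaf, no encapsulation needed
        if target is not None:
            return [target]                        # known remote MAC: unicast to its VTEP
        return list(self.flood_list)               # unknown/broadcast: replicate to flood list

    def from_tunnel(self, outer_src_ip: str, inner_src_mac: str) -> None:
        """Decapsulated frame: bind the inner source MAC to the sending VTEP."""
        self.mac_table[inner_src_mac] = outer_src_ip


leaf1 = Vtep("2.2.2.2", flood_list=["3.3.3.3"])    # 2.2.2.2 is an assumed loopback
leaf2 = Vtep("3.3.3.3", flood_list=["2.2.2.2"])

# host1 sends an ARP broadcast; leaf1 floods it to the configured remote VTEP.
flooded_to = leaf1.from_host("aa:aa", "ff:ff:ff:ff:ff:ff")
# leaf2 decapsulates it and learns host1's MAC behind leaf1.
leaf2.from_tunnel("2.2.2.2", "aa:aa")
# host2's reply is now unicast straight back to leaf1.
reply_to = leaf2.from_host("bb:bb", "aa:aa")
```

In this model a remote entry only appears after a flooded frame actually arrives at the other leaf, so if the remote table stays empty, the first hop of that exchange (the encapsulated flood leaving the leaf and crossing the spine) is a reasonable place to start looking.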

Re: Problem installing MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-x86_64

Hi Sebastian,

 

1) Have you validated, based on the Release Notes (RN) of the drivers, that the following packages are installed:

 

apt-get install perl dpkg autotools-dev autoconf libtool automake1.10 \
    automake m4 dkms debhelper tcl tcl8.4 chrpath swig \
    graphviz tcl-dev tcl8.4-dev tk-dev tk8.4-dev bison flex dpatch \
    zlib1g-dev curl libcurl4-gnutls-dev python-libxml2 libvirt-bin \
    libvirt0 libnl-dev libglib2.0-dev libgfortran3 automake m4 \
    pkg-config libnuma logrotate ethtool lsof

 

2) Did you try to install the latest driver version, 4.4-2.0.7.0?

 

3) Can you run it with the following options:

 

./mlnx_add_kernel_support.sh --make-tgz -t /var/tmp/MOFED -k `uname -r` -s /usr/src/kernels/`uname -r` -m . -n MLNX_OFED_LINUX-4.4-2.0.7.0-ubuntu18.04-x86_64-`uname -r` -v

 

Possibly add: --distro ubuntu18.04
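
For item 1, a quick way to see which of the listed packages are absent is to compare the list against what dpkg reports. The following is only a minimal Python sketch: the function names are made up for illustration, `REQUIRED` is a subset of the names from the list above, and `installed_packages()` assumes a Debian/Ubuntu host with `dpkg-query` available.

```python
import subprocess
from typing import Iterable, List, Set

# A subset of the package names from the list above; extend as needed.
REQUIRED = ["perl", "dkms", "debhelper", "tcl", "tcl8.4", "chrpath",
            "swig", "libnuma", "ethtool", "lsof"]


def missing_packages(required: Iterable[str], installed: Set[str]) -> List[str]:
    """Return the required names that are not in the installed set, in order."""
    return [name for name in required if name not in installed]


def installed_packages() -> Set[str]:
    """Collect names of fully installed packages from dpkg (Debian/Ubuntu only)."""
    out = subprocess.run(
        ["dpkg-query", "-W", "-f=${Status}\t${Package}\n"],
        check=True, capture_output=True, text=True,
    ).stdout
    names = set()
    for line in out.splitlines():
        status, _, name = line.partition("\t")
        # Status looks like "install ok installed"; skip removed or half-installed entries.
        if status.split()[-1:] == ["installed"]:
            names.add(name)
    return names


# On the affected machine one would run:
#   print(missing_packages(REQUIRED, installed_packages()))
```

Because the comparison is by exact name, a package that is present under a different versioned name (for example tcl8.6 where tcl8.4 is listed) shows up as missing, which at least makes such differences visible.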

 

Sophie.

Re: when using write op with more than 1024B(MTU) in softroce mode,the operation fail

Hi Tianyu,

 

Have you properly configured Soft-RoCE, whether with the upstream driver or the Mellanox OFED driver?

See reference links below:

 

HowTo Configure Soft-RoCE

How to configure Soft-RoCE with Mellanox OFED 4.x

 

Also, your original statement is confusing or contradicts itself:

 

when my write opcode with length=1024, it is ok. but when length=1025 in the same code, it will fail.

when the same code with length=1024 or 1025 run using mellanox CX4 card, it is ok >>> Apparently working.
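
One thing worth noting about the 1024/1025 boundary: with a 1024-byte path MTU, 1025 bytes is the first length at which an RDMA WRITE message no longer fits in a single packet. The Python sketch below only illustrates that segmentation rule (the function name is hypothetical, and it says nothing about where Soft-RoCE actually fails):

```python
from typing import List


def write_packet_opcodes(length: int, path_mtu: int) -> List[str]:
    """Return the packet types used to carry one RDMA WRITE message.

    The payload is cut into path_mtu-sized packets. A message that fits in one
    packet is sent as ONLY; otherwise as FIRST, zero or more MIDDLE, then LAST.
    """
    if path_mtu <= 0 or length < 0:
        raise ValueError("path_mtu must be positive and length non-negative")
    # Ceiling division; a zero-length write still takes one packet.
    packets = max(1, -(-length // path_mtu))
    if packets == 1:
        return ["ONLY"]
    return ["FIRST"] + ["MIDDLE"] * (packets - 2) + ["LAST"]


one_packet = write_packet_opcodes(1024, 1024)
two_packets = write_packet_opcodes(1025, 1024)
```

So a failure that starts exactly at length=1025 suggests looking at the multi-packet path (the FIRST/LAST handling, or the MTU configured on the Soft-RoCE link) rather than at the application code itself.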

 

Sophie.

Re: How to configure host chaining for ConnectX-5 VPI

Hi,

 

I have a problem pinging between the NICs. This is my configuration:

 

SERVER 1: PORT1:192.168.10.10 PORT2: 192.168.10.11

SERVER 2: PORT1:192.168.10.12 PORT2: 192.168.10.13

SERVER 3: PORT1: 192.168.10.14 PORT2: 192.168.10.15

 

mlxconfig -d mt4119_pciconf0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2

mlxconfig -d mt4119_pciconf0 set HOST_CHAINING_MODE=1

mlxfwreset --device mt4119_pciconf0 reset

 

All commands work perfectly, but I can only ping ports that are directly interconnected; I need to be able to ping all ports.

 

Is my configuration correct?

Re: How to configure host chaining for ConnectX-5 VPI

That config looks correct. I'm being that guy... I'd be tempted to do a full machine restart.

Make sure you've issued those commands to the other servers, and done a restart to solidify the config.

 

I haven't used the mlxfwreset command, but looking at the docs, without the level argument, it is only doing the lowest level of what the adapter supports.

A physical 'shutdown -r now' has always worked for me.

Re: RoCE v2 configuration with Linux drivers and packages

Thank you! I was able to configure and run it. I had problems with the i40e and i40iw drivers.


Factors that determine compatibility of SFPs with new fibre services?

  1. Whilst I understand that product recommendations are off topic, can anyone help by explaining what the critical factors are when looking for SFPs that are going to be compatible with a new service?
  2. Is wavelength a defining factor that should be considered/matched, or should anything else be used to guide selection?

Sorry, I am new to 10G BASE-SR and I can't seem to find a good resource that can confirm whether an SFP supported in a Cisco Nexus 5548UP will be compatible with a new service. The new service is described as '10 Gigabit Ethernet LAN PHY IEEE 10G BASE-LR 10.3125 Gbps +/- 100 ppm 1310nm'.
Ultimately, I need to understand whether a 'cisco sfp-10g-sr', for which the transmitter wavelength spec is given as 850nm, is usable.

Thanks for your patience. I plan to get one; is there any site you would recommend?
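
As a starting point on the wavelength question: the two standards named above differ in both nominal wavelength and fibre type (10GBASE-SR is 850 nm over multimode fibre, 10GBASE-LR is 1310 nm over single-mode fibre), and both ends of a link are normally expected to agree on these. A minimal Python sketch of that check follows; the `Optic` class and `same_link_type` function are made up for illustration, and the provider's actual handoff should be confirmed with them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Optic:
    standard: str
    wavelength_nm: int
    fibre: str          # "MMF" (multimode) or "SMF" (single-mode)


# Nominal values for the two 10GbE optical standards named in the question.
SR = Optic("10GBASE-SR", 850, "MMF")
LR = Optic("10GBASE-LR", 1310, "SMF")


def same_link_type(a: Optic, b: Optic) -> bool:
    """Both ends of a link are expected to agree on wavelength and fibre type."""
    return a.wavelength_nm == b.wavelength_nm and a.fibre == b.fibre


sr_against_lr_service = same_link_type(SR, LR)   # False: the two do not match
```

By this check an 850 nm SR module would not pair with a service handed off as 10GBASE-LR at 1310 nm; an LR-type module that the switch supports would be the thing to look for.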

MSX1012B MSX6012F

Hi,

 

I have two switches that I am very pleased with: an MSX1012B-2BFS and an MSX6012F-1BRS_WT (as shown in the web console).

I have to put one switch at one site and the other at another site, with 6 kilometers between the sites.

 

So I want to install an MC2210511-LR4 module on each switch.

The optical fiber between the sites has been tested.

 

My questions are:

 

1. Will this optical QSFP+ module (40GbE) work on each switch? If so, on which port, or will any port be OK?

2. What do the letters 'WT' mean in the MSX6012 model name? (Wide Transceiver?) If so, does that mean I need to buy another MSX6012F_WT, or will the MSX1012B also work fine?

3. Currently, my MSX6012F is in VPI profile mode. Do I need to put it in single_ethernet mode? (I don't run InfiniBand on the network for now, but it is envisaged.)

4. As our needs are evolving and my two switches already have almost all their ports in use, I plan to buy two MSX6036F to replace the MSX1012B and MSX6012F. Will both MC2210511-LR4 modules work in them and, if so, on which ports?

 

Many thanks for your help !

 

Regards.

Re: Problem installing MLNX_OFED_LINUX-4.4-1.0.0.0-ubuntu18.04-x86_64

Hi Sophie,

 

I double-checked the packages. Instead of tcl8.4 I have tcl8.6 installed, and instead of libnuma I have libnuma1. Could this be the issue?

Version 4.4-2.0.7.0 does give me the same error messages as posted before. I can't even run mlnxofedinstall and get it to finish properly.

What can I do to provide more information so you can help me?

Thank you for your reply!

Re: How can I add a timestamp in Roce?

Hi Chris,

 

Have you received a solution for your problem?

 

I'm quite interested because we have just installed a new cluster in our company and, unfortunately, we get the same error message as you when we launch some MPI jobs.

 

It is a serious problem for us since, up to now, we have not been able to start our studies.

 

Thank you.

ConnectX-5 EN SR-IOV max_vfs

Hi,

 

I'm looking for the best value of max_vfs for the SR-IOV configuration of a virtualization host. I found the following statements in the ConnectX-5 EN brochure.

 

SR-IOV: Up to 1K Virtual Functions

SR-IOV: Up to 16 Physical Functions per host

 

Let's say that the virtualization host is equipped with a total of 40 cores (80 threads). Should I use 1024 or 64 (1024/16) for max_vfs? If neither, what is the optimal value for that environment? How can I utilize all ConnectX-5 EN VFs? How can I utilize all ConnectX-5 EN PFs?

 

Best regards,

Re: Is an p2p (dedicated link, without switch) Fibre connexion totally lossless ?

Hi Raphael,

RoCE v2 is a UDP-based protocol, and UDP, unfortunately, does not guarantee delivery, ordering, or duplicate protection of packets.

If you have additional programming questions, I would suggest asking them on the linux-rdma mailing list.
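
To make the point about UDP concrete: since UDP itself does not track any of this, the layer above has to. RoCE v2 carries the InfiniBand transport header inside the UDP payload, and that header includes a packet sequence number (PSN) that a reliable connection uses to notice loss, reordering, and duplicates. The Python sketch below is only a toy model of that idea; the function name is hypothetical and the 24-bit PSN wraparound is ignored.

```python
from typing import Iterable, List, Tuple


def classify(psns: Iterable[int], first_expected: int = 0) -> List[Tuple[int, str]]:
    """Label each arriving sequence number relative to the next one expected."""
    expected = first_expected
    result = []
    for psn in psns:
        if psn == expected:
            result.append((psn, "in-order"))
            expected += 1
        elif psn < expected:
            result.append((psn, "duplicate"))      # already accepted earlier
        else:
            result.append((psn, "gap"))            # an earlier packet was lost or delayed
    return result


# Packet 1 arrives twice, and packet 3 overtakes packet 2.
labels = classify([0, 1, 1, 3, 2])
```

Even on a dedicated point-to-point link, this is why the question of "lossless" ends up being about the transport and link-level flow control rather than about UDP.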

Re: SN2100B v3.6.8004

Hi Pratik,

 

Oohh I see. Will try this then and will give an update.

 

Thank you.


Re: why not just BUG_ON(!pci_channel_offline(dev->persist->pdev))

Re: Need help updating firmware/speed for MNPA19-XTR adapters

Hello Gerald,

Many thanks for reaching out to the Mellanox Community.

Unfortunately, the ConnectX-2 adapters you have are EOL/EOS for a while now. Even though the firmware you are running is the latest available, the current versions of the driver do not support the ConnectX-2 adapter anymore. 

We also notice that you are running on an AMD Ryzen platform, which we do not test. The only AMD platform we test and certify the driver and adapters for is the EPYC platform.

Nevertheless, you can check whether applying the recommendation from the following community post improves the performance. The link is https://community.mellanox.com/docs/DOC-3086

Our recommendation is to try to get your hands on a pair of ConnectX-3 adapters which are fully supported with our current drivers.

Many Thanks and regards,
~Mellanox Technical Support

Re: ConnectX-2 10GbE Ethernet "Flash not found", FW update problem

Hi Tibor,

 

Many thanks for reaching out to the Mellanox Community.

 

Please check whether reburning your firmware while the card is booted in Livefish mode resolves your issue. If this does not resolve the issue, the card is dead.

 

To boot your card into Livefish mode, please follow the instructions in the MFT (Mellanox Firmware Tools) User Manual ( http://www.mellanox.com/related-docs/MFT/MFT_user_manual_4_10_0.pdf ), Appendix C: "Booting HCA Device in Livefish Mode".

 

For correctly burning the firmware in this mode, please use the following command:

# flint -d <device> -i <firmware bin file> -nofs -guid <card guid> burn

The GUID can be found on the HCA sticker.

Example:

# flint -d /dev/mst/mt26428_pciconf0 -i MNPH29C-XTR.bin -nofs -guid 0002C903000B6D1C burn
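
Since a mistyped GUID is easy to miss, it can help to sanity-check the value copied from the sticker before passing it to flint. This is a minimal Python sketch, assuming the GUID is wanted as 16 hex digits as in the example above and that the sticker may print it with separators or a 0x prefix; `normalize_guid` is a hypothetical helper, not part of MFT.

```python
import re

_GUID_RE = re.compile(r"[0-9A-Fa-f]{16}")


def normalize_guid(text: str) -> str:
    """Strip a 0x prefix and common separators, then require exactly 16 hex digits."""
    cleaned = text.strip()
    if cleaned.lower().startswith("0x"):
        cleaned = cleaned[2:]
    cleaned = cleaned.replace(":", "").replace("-", "").replace(" ", "")
    if not _GUID_RE.fullmatch(cleaned):
        raise ValueError(f"expected 16 hex digits, got {text!r}")
    return cleaned.upper()


guid = normalize_guid("00:02:c9:03:00:0b:6d:1c")   # same value as in the example command
```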

 

After successfully burning the firmware, please reboot the card in Normal mode.

 

Many thanks,

~Mellanox Technical Support

Re: Is Mellanox ConnectX-4 compatible with VPP 18.07?

Hi Garegin,

Many thanks for reaching out to the Mellanox Community.

As the ConnectX-4 uses the same mlx5 PMD driver as the ConnectX-5, the ConnectX-4 is supported for use with VPP.

For VPP support, please refer to the VPP mailing list.

Many thanks.
~Mellanox Technical Support

DPDK-mlx5 set_mac question on Mellanox NIC passthru (or SR-IOV) at VMWare Hypervisor.

Hi, Experts:

 

When deploying a VM, I ran into an issue when using mlx5_mac_addr_set() to set a new MAC that differs from the one the VMware hypervisor generated: unicast traffic (ping) fails, even though ARP has learned the new MAC. Since the Mellanox NIC does not enable anti-spoofing by default, it looks like VMware adds some anti-MAC-spoofing functionality. The VF is configured through pciPassthru.

I have searched online and some links suggest setting the following parameters in the VMX file, but this does not seem to work:

pciPassthru0.noPromisc = "false"

pciPassthru0.noForgedSrcAddr = "false"

pciPassthru1.noPromisc = "false"

pciPassthru1.noForgedSrcAddr = "false"

 

(Note: it is not SR-IOV, and there is no uplink vSwitch port group for the passthru pciDevice.)

 

Someone else may have asked this question already. I have two questions:

 

1): Has the Mellanox community settled on a method or document for this, or is there no solution yet?

2): How can I check (with which command) the anti-MAC-spoofing status of the Mellanox NIC?

 

Thanks.

 

-- Edward
