Channel: Mellanox Interconnect Community: Message List
Viewing all 6275 articles

Re: Problem adding kernel support on XenServer

Hello, David

 

You are correct, I am running as root. There should not be any unusual permissions that I can think of. I have "no_root_squash" set in the export options of the mount, and I checked that I can create and modify files manually from that host. The fact that the previous version of OFED built fine also suggests that the problem is elsewhere.


Trying to setup eIPoIB in order to use IB interface from a VM

I was following the eIPoIB Manual Configuration guide in order to set up the ib interface as the main data interface for a VM. I followed the guide quite closely, except that after setting up the eIPoIB network interface (eth2), I attached it to a bridge. This bridge was used in the VM definition.

 

Here's my configuration on the hypervisor:

 

[root@mgmt2 ~]# ethtool -i eth2

driver: eth_ipoib

version: 1.0.0

firmware-version: 1

bus-info: ib0

 

[root@mgmt2 ~]# cat /sys/class/net/eth2/eth/vifs

SLAVE=ib0.1      MAC=00:1e:67:76:17:1d VLAN=N/A

 

[root@mgmt2 ~]# ifconfig eth2

eth2      Link encap:Ethernet  HWaddr 00:1E:67:76:17:1D 

          inet6 addr: fe80::21e:67ff:fe76:171d/64 Scope:Link

          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1

          RX packets:70141448 errors:0 dropped:0 overruns:0 frame:0

          TX packets:914489 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:138436120082 (128.9 GiB)  TX bytes:40287426 (38.4 MiB)

 

[root@mgmt2 ~]# ifconfig br-data

br-data   Link encap:Ethernet  HWaddr 00:1E:67:76:17:1D 

          inet addr:192.200.0.2  Bcast:192.200.255.255  Mask:255.255.0.0

          inet6 addr: fe80::21e:67ff:fe76:171d/64 Scope:Link

          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1

          RX packets:3528371 errors:0 dropped:0 overruns:0 frame:0

          TX packets:914442 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:135771247634 (126.4 GiB)  TX bytes:49426414 (47.1 MiB)

 

[root@mgmt2 ~]# brctl show

bridge name     bridge id               STP enabled     interfaces

br-data         8000.001e6776171d       no              eth2

                                                        vnet0

 

If I ping another machine on the 192.200.0.0/16 network from the hypervisor, everything works fine. However, if I set up a VM to use br-data, the VM cannot ping any machine on the 192.200.0.0/16 network other than the hypervisor. I'm having a little trouble visualizing how the cloning and enslaving of interfaces works. Any help would be greatly appreciated.

Re: Different types of cables

MCC4Q28C is an older cable that supports SDR, DDR, QDR and 10GigE.

MC2206130 is the alternative to MCC4Q28C and supports QDR, FDR10 and 40GigE.

Re: CONNECT-X works on ESXi5.x?

Not able to build lustre 2.5.0 against OFED 2.2.

Hi,

 

I am trying to build lustre 2.5.0 against OFED, but the compilation fails.

My CentOS version is 6.4, so I downloaded MLNX_OFED_LINUX-2.2-1.0.1-rhel6.4-x86_64.

# cat /etc/issue
CentOS release 6.4 (Final)
Kernel \r on an \m

 

I was then able to run the following steps successfully:

 

./mlnxofedinstall -vvv --add-kernel-support --without-32bit --without-fw-update --hpc

/etc/init.d/openibd restart

configure ib0 interface

 

After this I tried to compile lustre with OFED.

The command ./configure --with-linux=Path_to_linux-2.6.32-358.18.1.el6 --with-o2ib=/usr/src/ofa_kernel/default/ ran successfully.

 

But the make command gave me the error below:

 

In file included from include/trace/ftrace.h:440,
                  from include/trace/define_trace.h:73,
                  from /root/rpmbuild/BUILD/lustre-2.5.0/ldiskfs/trace/events/ldiskfs.h:904,
                  from /root/rpmbuild/BUILD/lustre-2.5.0/ldiskfs/super.c:56:
/root/rpmbuild/BUILD/lustre-2.5.0/ldiskfs/trace/events/ldiskfs.h: In function 'ftrace_profile_enable_ldiskfs_free_inode':
/root/rpmbuild/BUILD/lustre-2.5.0/ldiskfs/trace/events/ldiskfs.h:18: error: implicit declaration of function 'register_trace_ldiskfs_free_inode'
/root/rpmbuild/BUILD/lustre-2.5.0/ldiskfs/trace/events/ldiskfs.h: In function 'ftrace_profile_disable_ldiskfs_free_inode':
/root/rpmbuild/BUILD/lustre-2.5.0/ldiskfs/trace/events/ldiskfs.h:18: error: implicit declaration of function 'unregister_trace_ldiskfs_free_inode'
.....
.....

 

I looked online for this error and found a bug filed for it: https://jira.hpdd.intel.com/browse/LU-4266

I tried the patches suggested in that link, one at a time:

1. http://review.whamcloud.com/#/c/8451/2

2. http://review.whamcloud.com/#/c/9109

 

Both patches resolve the earlier error, but the build then fails with the error below:

CC [M]  /home/calsoft/common/lustre_2.5.0/lustre-2.5.0_InfinibandTrial/fresh/lustre-2.5.0_61/lnet/klnds/o2iblnd/o2iblnd.o
gcc: @EXTRA_OFED_INCLUDE@: No such file or directory

My understanding is that this is due to the difference in lustre version: the patches seem to target lustre 2.5.2 and 2.6.0.

Could anyone please point me to the corresponding patches for 2.5.0, or tell me if I am missing anything in the procedure described above?

 

Thanks,

Aayush.     
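
A literal @EXTRA_OFED_INCLUDE@ reaching gcc suggests an autoconf-style placeholder that was never substituted, which may happen when a patch changes the Makefile templates but the configure script is not regenerated to match. The following is a minimal Python sketch, not part of lustre or OFED, for spotting such leftovers in generated build files. The sample line is a hypothetical Makefile fragment used only for illustration.

```python
import re

# Autoconf substitution variables appear as @NAME@ in template files.
# Any that survive into a generated file were not replaced by configure.
PLACEHOLDER = re.compile(r"@([A-Z][A-Z0-9_]*)@")


def unsubstituted_placeholders(text):
    """Return the sorted, de-duplicated @NAME@ names still present in text."""
    return sorted(set(PLACEHOLDER.findall(text)))


# Hypothetical fragment of a generated Makefile, for illustration only.
sample = "EXTRA_CFLAGS += @EXTRA_OFED_INCLUDE@ -I/usr/src/ofa_kernel/default/include\n"
print(unsubstituted_placeholders(sample))
```

Running the same function over the generated Makefiles under lnet/klnds/o2iblnd would show whether other placeholders are affected as well.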

Re: Not able to build lustre 2.5.0 against OFED 2.2.

I can't find the firmware of the PSID SM_2091000001000 on your website

I can't find the firmware for PSID SM_2091000001000 on your website. The information for the device is listed below:


02:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0)

[root@VMM3 ~]# mstflint -d 02:00.0 q

Image type:      FS2

FW Version:      2.6.0

Device ID:       26428

Description:     Node             Port1            Port2            Sys image

GUIDs:           003048ffffcb063c 003048ffffcb063d 003048ffffcb063e 003048ffffcb063f

MACs:                                 003048cb063d     003048cb063e

VSD:             n/a

PSID:            SM_2091000001000

Re: I can't find the firmware of the PSID SM_2091000001000 on your website

Hi Ning Li,

 

The PSID indicates that it's an on-board SuperMicro HCA, so you should contact SuperMicro to get the latest firmware for that LOM device.

 

Please provide them with the PSID and the server model.
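
As a rough illustration of reading the PSID out of a query like the one above, here is a minimal Python sketch. The prefix table is an assumption built only from PSIDs that appear in this message list (SM_ on the SuperMicro on-board HCA, DEL on an adapter described as "for Dell"); it is not an official or complete vendor list.

```python
def parse_query(output):
    """Parse 'Key: value' lines from flint/mstflint query output into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


# Assumed mapping, taken only from examples in this message list.
OEM_PREFIXES = {"SM_": "SuperMicro", "DEL": "Dell"}


def psid_vendor(psid):
    """Return the OEM suggested by the PSID prefix, or None if unknown."""
    for prefix, vendor in OEM_PREFIXES.items():
        if psid.startswith(prefix):
            return vendor
    return None


sample = """Image type:      FS2
FW Version:      2.6.0
Device ID:       26428
PSID:            SM_2091000001000
"""
fields = parse_query(sample)
print(fields["PSID"], psid_vendor(fields["PSID"]))
```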


Re: I can't find the firmware of the PSID SM_2091000001000 on your website

Thanks for your reply! I'll contact SuperMicro.

Re: Not able to build lustre 2.5.0 against OFED 2.2.

get_pcounter and collectl

Hello All,

 

I'm attempting to set up a full implementation of collectl, and its conf file specifies the command 'get_pcounter', which is supposedly provided by the Mellanox OFED install. Has the command been deprecated or replaced by a new command? I have searched Google for a solution but have not come across anything relevant to get_pcounter. Any guidance is appreciated.

 

Thanks,

 

Jared

which VMA card to buy to connect HP DL360p Gen8 to 1000Base-T

I'm using HP DL360p Gen8

I need to connect it to 1 Gb copper Ethernet (1000BASE-T)

I want to use VMA to improve latency.

What is the best card for me (and compatible transceivers)?

Re: Re: "Missing UAR" kernel error on boot, ConnectX-3 not detected by the OS

I am having this problem too.

 

this is the message from my syslog:

Oct  2 05:59:56 fr-dev11 kernel: [  670.201294] mlx4_core: Mellanox ConnectX core driver v1.1 (Oct  2 2014)

Oct  2 05:59:56 fr-dev11 kernel: [  670.201319] mlx4_core: Initializing 0000:04:00.0

Oct  2 05:59:56 fr-dev11 kernel: [  670.201465] mlx4_core 0000:04:00.0: Missing UAR, aborting.

 

This is what mlxfwmanager knows:

Querying Mellanox devices firmware ...

 

Device #1:

----------

 

 

  Device Type:      ConnectX3

  Part Number:      0W0RM9_0Y3KKR

  Description:      ConnectX-3 Dual Port 10GbE SFP+ Adapter card for Dell

  PSID:             DEL0A80000023

  PCI Device Name:  /dev/mst/mt4099_pci_cr0

  Port1 MAC:        f452143d2ef0

  Port2 MAC:        f452143d2ef1

  Versions:         Current        Available    

     FW             2.32.5100      N/A          

     PXE            3.4.0306       N/A          

     UEFI           12.5.0022      N/A          

 

 

  Status:           No matching image found

This is my kernel:

Linux 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

 

This is what lshw knows:

 

  *-network UNCLAIMED

       description: Ethernet controller

       product: MT27500 Family [ConnectX-3]

       vendor: Mellanox Technologies

       physical id: 0

       bus info: pci@0000:04:00.0

       version: 00

       width: 64 bits

       clock: 33MHz

       capabilities: pm vpd msix pciexpress cap_list

       configuration: latency=0

       resources: memory:da700000-da7fffff memory:d9800000-d98fffff

 

This is what lspci knows:

 

04:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]

Re: Re: Re: "Missing UAR" kernel error on boot, ConnectX-3 not detected by the OS

A bit more information gleaned using flint:

 

Image type:      FS2

FW Version:      2.32.5100

FW Release Date: 3.9.2014

MIC Version:     1.5.0

Config Sectors:  2

Product Version: 02.32.51.00

Rom Info:        type=PXE version=3.4.306 devid=4099 proto=ETH

                 type=UEFI version=12.5.22 proto=ETH

Device ID:       4099

Description:     Node             Port1            Port2            Sys image

GUIDs:           ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff

MACs:                                 f452143d2ef0     f452143d2ef1

VSD:            

PSID:            DEL0A80000023

Re: which VMA card to buy to connect HP DL360p Gen8 to 1000Base-T


Re: how spread my net bandwidth when I used Send / receive or read / write semantics on ConnectX-3 VPI

Troubleshooting Mellanox InfiniBand card

 

We have a Mellanox MT27500 Family ConnectX-3 FDR InfiniBand card that we purchased and set up in our Mechanical Engineering department cluster. Everything was working fine until a week ago, when InfiniBand suddenly stopped working for no apparent reason. I have been trying to troubleshoot this issue with no success and am in need of some help from the experts.

 

 

When I try to start the subnet manager on the master node using the command

 

 

[user@server ~]# /etc/init.d/opensm start

 

 

I get an error saying it failed to start, and the following messages are written to the log file.

 

 

Sep 30 10:36:58 137756 [DE707700] 0x80 -> OpenSM 3.3.15
Entering DISCOVERING state

Sep 30 10:36:58 144767 [DE707700] 0x02 -> osm_vendor_init: 1000 pending umads specified
Sep 30 10:36:58 148482 [DE707700] 0x80 -> Entering DISCOVERING state

No local ports detected!
Sep 30 10:36:58 148959 [DE707700] 0x01 -> perfmgr_mad_unbind: ERR 5405: No previous bind
Sep 30 10:36:58 148969 [DE707700] 0x01 -> osm_congestion_control_shutdown: ERR C108: No previous bind
Sep 30 10:36:58 149163 [DE707700] 0x01 -> osm_sa_mad_ctrl_unbind: ERR 1A11: No previous bind
Exiting SM

 

 

The most curious thing is that the command ibstat returns nothing, which makes it really hard for me to troubleshoot this issue. However, running it in debug mode gives the following output.

 

 

[user@server ~] ibstat -dd

 

ibwarn: [29989] umad_init: umad_init
ibwarn: [29989] umad_get_cas_names: max 32
ibwarn: [29989] umad_get_cas_names: return 0 cas

 

 

I am more than willing to provide any other information you need to get to the bottom of it.

 

 

Any help is greatly appreciated!
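
One check that may narrow this down: "return 0 cas" and "No local ports detected!" are both consistent with the kernel not exposing any InfiniBand device at all, for example if the HCA driver did not load. A minimal Python sketch, assuming the usual Linux sysfs location /sys/class/infiniband, that lists whatever devices the kernel currently exposes:

```python
import os


def list_ib_devices(sysfs_dir="/sys/class/infiniband"):
    """Return the InfiniBand device names found under sysfs_dir, or [] if none."""
    try:
        return sorted(os.listdir(sysfs_dir))
    except FileNotFoundError:
        # The directory is absent when no InfiniBand driver has registered a device.
        return []


devices = list_ib_devices()
if devices:
    print("InfiniBand devices:", ", ".join(devices))
else:
    print("No InfiniBand devices exposed by the kernel; check lspci and the driver modules.")
```

An empty result would point at the driver or the card itself rather than at opensm.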

 

Re: How to stack two Voltaire 4036

Hi andre, sorry it took me so long to reply; the email ended up in my spam box and I have only just seen it. I will run the tests you recommended and see what happens. Thank you very much again for all your help.

Re: How to stack two Voltaire 4036

Hi!

If you test on the Windows platform, you must use SMB Direct with firmware 2.10.720.

 

QDR's real throughput is 32Gb.

But you can only get that performance with SRP, MPI, or SMB Direct.

 

QDR HCA's performance is 10Gb only (8/10 encoding, 8Gb real performance).

 

If you want 40Gb IPoIB performance, switch to an FDR CX-3 HCA, etc.

 

Good luck!
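
The 32Gb figure follows from the line encoding. A small Python sketch of the arithmetic, assuming a standard 4x link: QDR signals at 10 Gb/s per lane with 8b/10b encoding, and FDR at 14.0625 Gb/s per lane with 64b/66b encoding. These are link-level data rates only; what IPoIB actually delivers is lower and is not modelled here.

```python
def data_rate_gbps(lane_signal_gbps, lanes, payload_bits, coded_bits):
    """Usable link data rate after removing line-encoding overhead."""
    return lane_signal_gbps * lanes * payload_bits / coded_bits


# QDR 4x: 4 lanes x 10 Gb/s signalling, 8 payload bits per 10 coded bits.
qdr = data_rate_gbps(10.0, 4, 8, 10)

# FDR 4x: 4 lanes x 14.0625 Gb/s signalling, 64 payload bits per 66 coded bits.
fdr = data_rate_gbps(14.0625, 4, 64, 66)

print(f"QDR 4x: {qdr:.2f} Gb/s")
print(f"FDR 4x: {fdr:.2f} Gb/s")
```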

Problems installing WinOF on Windows 8.1

I'm trying to install WinOF for Windows 8.1 (which is currently installed on the computer), but when I run the installer I get "MLNX_VPI requires that your computer is running Windows 8.1". Does anyone know what my problem might be?
