Mixing OFED 1.5.3 and 2.2 in the same network?


Hi all,

 

We have two clusters using InfiniBand, as follows:

Computing cluster 1:

  • 27 nodes
  • IBM bladecenter
  • CentOS 6.2
  • Each node has 1x MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE]
  • Firmware version: 2.9.1100
  • OFED 1.5.3-3.0.0 (from the installer MLNX_OFED_LINUX-1.5.3-3.0.0-rhel6.2-x86_64, rebuilt with mlnx_add_kernel_support and so on)
  • ib_ipoib

 

GPFS server cluster:

  • 4 nodes
  • Dell PowerEdge R720
  • RHEL 6.3
  • Each node has 1x MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE]
  • Firmware version: 2.9.1000
  • OFED 1.5.3-3.1.0 (from the installer MLNX_OFED_LINUX-1.5.3-3.1.0-rhel6.3-x86_64)
  • ib_ipoib

 

Our network topology consists of:

  • 1x 36-port Mellanox FabricIT IS5030/U1, connected to:
    • 4x GPFS servers - 1x port each
    • 2x Voltaire 40Gb InfiniBand Switch Modules (the BladeCenter's switches) - 2x ports each
    • 4x other GPFS clients
  • 2x Voltaire 40Gb InfiniBand Switch Modules:
    • 13x nodes each, connected internally to the HCAs through the chassis backplane
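
In case it helps to cross-check that layout, the standard infiniband-diags tools (run from any node) should show the same picture; these are just the generic invocations, and the output obviously depends on the fabric:

ibswitches     # should list the IS5030 and the two Voltaire blade switches
ibhosts        # should list every HCA: GPFS servers, blades, other clients
iblinkinfo     # per-port link state, width and speed for every switch port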

 

Both clusters have been running for more than a year. The original GPFS + InfiniBand installation was done by IBM techs; we merely copied/adapted it when we moved the servers to the Dell machines. We never did much research on InfiniBand and mostly went with what we found already installed/configured.

We run GPFS over InfiniBand (that is why I think we have ib_ipoib there: to use IP addresses to name the nodes). The only InfiniBand parameter we ever modified was adding this line to /etc/modprobe.d/mlx4_en.conf on all nodes of both clusters:

 

options mlx4_core pfctx=0 pfcrx=0 log_num_mtt=20 log_mtts_per_seg=4
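
For reference, whether these options actually took effect after a driver reload can be checked through sysfs (assuming the usual module-parameter layout; newer OFED releases may drop some of these parameters, so not all of the files may exist):

cat /sys/module/mlx4_core/parameters/log_num_mtt       # should read 20
cat /sys/module/mlx4_core/parameters/log_mtts_per_seg  # should read 4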

 

Performance tests (ib_read/write_bw and ib_read/write_lat) report ~3250 MB/s and ~2.5 usec (I cannot show the exact numbers right now because heavy traffic is skewing them).
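
(These come from the usual perftest pairs, run roughly as below; the device name and the hostname are placeholders, not our real ones:

ib_read_bw -d mlx4_0 -a               # started on one node as the "server" side
ib_read_bw -d mlx4_0 -a gpfs-server1  # run on the other node against it
ib_read_lat -d mlx4_0 gpfs-server1    # same pattern for the latency test

and similarly for ib_write_bw / ib_write_lat.)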

 

GPFS performance showed single-stream read/write performance (dd) of 2.0-2.5 GB/s, and a global multi-node bandwidth of 6-10 GB/s.
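
(The single-stream numbers are from plain dd runs along these lines; the mount point and sizes are just an example:

dd if=/dev/zero of=/gpfs/ddtest.$(hostname) bs=1M count=16384 oflag=direct   # write ~16 GB
dd if=/gpfs/ddtest.$(hostname) of=/dev/null bs=1M iflag=direct               # read it back

The multi-node figure is the aggregate over several nodes running this at the same time.)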

 

With these numbers in mind, we consider that the system and the InfiniBand network are working fine. (Aren't they?)


Now we are planning to build a new computing cluster (and/or rebuild the current one), and we have started doing some tests with a couple of computing nodes.

We are moving to CentOS 6.5, and we are forced to move to OFED 2.2 (MLNX_OFED_LINUX-2.2-1.0.1-rhel6.5): we tried to install 1.5.3 (MLNX_OFED_LINUX-1.5.3-4.0.42-rhel6.3), but the mlnx_add_kernel_support script only supports up to rhel6.4, and even when we cheat past that check the compilation fails due to some missing includes. So we moved on and installed the test cluster with CentOS 6.5 and MLNX_OFED_LINUX-2.2-1.0.1.

 

We have had a couple of problems, which we more or less pinned down after asking on the OpenFabrics discussion list:

 

  • The perftest handshake mechanism changed between 1.5 and 2.2, so we cannot run tests between the new and the old cluster.

We can deal with this. The performance tests between the two OFED 2.2 nodes seemed OK.

 

  • Loading the ib_ipoib module under OFED 2.2 changes the MAC address of ib0.

This would not be a problem if the OFED installer did not delete CentOS's rdma-3.10-3.el6.noarch rpm. That rpm contains the ifup-ib and ifdown-ib scripts, which can initialize the InfiniBand interfaces while ignoring MAC address changes. We can work around this by copying the two "old" scripts back after OFED's installation in the kickstart, as sketched below.
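
Roughly, that part of the kickstart looks like this (where the saved copies of the scripts live is site-specific; /root/rdma-scripts is just a placeholder):

%post
# ... MLNX_OFED installation happens earlier in this %post ...
# The installer removes rdma-3.10-3.el6.noarch, so restore the two
# scripts that tolerate the IPoIB hardware-address change.
cp /root/rdma-scripts/ifup-ib   /etc/sysconfig/network-scripts/ifup-ib
cp /root/rdma-scripts/ifdown-ib /etc/sysconfig/network-scripts/ifdown-ib
chmod 755 /etc/sysconfig/network-scripts/ifup-ib /etc/sysconfig/network-scripts/ifdown-ib
%end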

 

Even after all these problems, we have been able to join these nodes as GPFS clients with "normal" performance. But having dealt with all this, we are wondering what to do with our InfiniBand/GPFS network.

 

Can we keep the GPFS servers on OFED 1.5.3 and move the (new) clients to 2.2? Should we try to update everything to 2.2? Will new problems appear when we upgrade the servers in production? Should we keep everything on 1.5.3? Should we use the community OFED? (The CentOS 6.5 default InfiniBand installation "works".)

 

Any other criticism of our setup is welcome.

 

Thanks in advance,

 

Txema

 

 

PS: All these doubts stem from one of our techs adding a client to GPFS with a "poorly installed" InfiniBand stack, which stalled all InfiniBand and GPFS traffic until we removed the node. So we are afraid of touching anything on that network.

