Channel: Mellanox Interconnect Community: Message List

Problems on ARM platform


I'm trying to use a Mellanox InfiniBand adapter with the SECO GPU DevKit on the NVIDIA CARMA platform. I connected a Netstor NA255A expansion box to the PCI slot of the SECO CARMA board and installed the InfiniBand adapter and an NVIDIA GPU inside it.

The SECO software is based on NVIDIA L4T R16 (Linux kernel 3.1.10). Unfortunately, I have not been able to find a driver that works correctly. I have tried the driver bundled with kernel 3.1.10, as well as the 1.5.3 and 2.0 drivers from the Mellanox website.

 

With the Mellanox ConnectX-2 VPI, the system starts fine and I can exchange data over InfiniBand. However, the system hangs after several hours of operation (this does not happen if InfiniBand is not started). I see the following messages before the crash:

[ 8671.282351] irq 130: nobody cared (try booting with the "irqpoll" option)
[ 8671.289890] handlers:
[ 8671.292225] [<c0043d7c>] tegra_pcie_isr
[ 8671.296211] [<be80a324>] mlx4_interrupt
[ 8671.300142] Disabling IRQ #130
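For anyone comparing logs from several runs, here is a minimal Python sketch that pulls the IRQ number and the registered handlers out of a "nobody cared" report like the one above. The function name `parse_spurious_irq` and the regular expressions are my own illustration and not part of any Mellanox or kernel tool; the sketch assumes the dmesg format shown in this post.

```python
import re

# Log excerpt copied from this post.
LOG = """\
[ 8671.282351] irq 130: nobody cared (try booting with the "irqpoll" option)
[ 8671.289890] handlers:
[ 8671.292225] [<c0043d7c>] tegra_pcie_isr
[ 8671.296211] [<be80a324>] mlx4_interrupt
[ 8671.300142] Disabling IRQ #130
"""

# Hypothetical patterns, written against the format shown above.
IRQ_RE = re.compile(r"irq (\d+): nobody cared")
HANDLER_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\S+)")


def parse_spurious_irq(log):
    """Return (irq_number, handler_names) for the last report, or None."""
    irq = None
    handlers = []
    in_handlers = False
    for line in log.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between log entries
        match = IRQ_RE.search(line)
        if match:
            irq = int(match.group(1))
            handlers = []
            in_handlers = False
            continue
        if irq is not None and line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            handler = HANDLER_RE.search(line)
            if handler:
                handlers.append(handler.group(1))
            else:
                in_handlers = False  # first non-handler line ends the list
    return (irq, handlers) if irq is not None else None


if __name__ == "__main__":
    print(parse_spurious_irq(LOG))
```

On the excerpt above this reports IRQ 130 with two registered handlers, `tegra_pcie_isr` and `mlx4_interrupt`, which means the interrupt line is shared between the Tegra PCIe controller and the mlx4 driver.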

 

With the Mellanox ConnectX-3 VPI, the system crashes almost immediately after the driver is loaded, reporting an internal error:

[  278.665090] mlx4_core 0000:06:00.0: Internal error detected:
[  278.670933] mlx4_core 0000:06:00.0:   buf[00]: 00000000
[  278.676315] mlx4_core 0000:06:00.0:   buf[01]: 00000000
[  278.681682] mlx4_core 0000:06:00.0:   buf[02]: 00000000
[  278.687097] mlx4_core 0000:06:00.0:   buf[03]: 00000000
[  278.692547] mlx4_core 0000:06:00.0:   buf[04]: 00000000
[  278.697943] mlx4_core 0000:06:00.0:   buf[05]: 00000000
[  278.703342] mlx4_core 0000:06:00.0:   buf[06]: 00000000
[  278.708720] mlx4_core 0000:06:00.0:   buf[07]: 00000000
[  278.714104] mlx4_core 0000:06:00.0:   buf[08]: 00000000
[  278.719480] mlx4_core 0000:06:00.0:   buf[09]: 00000000
[  278.724867] mlx4_core 0000:06:00.0:   buf[0a]: 00000000
[  278.730232] mlx4_core 0000:06:00.0:   buf[0b]: 00000000
[  278.735605] mlx4_core 0000:06:00.0:   buf[0c]: 00000000
[  278.740975] mlx4_core 0000:06:00.0:   buf[0d]: 00000000
[  278.746343] mlx4_core 0000:06:00.0:   buf[0e]: 00000000
[  278.751705] mlx4_core 0000:06:00.0:   buf[0f]: 00000000

 

In both cases, mlx4_core is loaded with the msi_x=0 option. Without it, there are not enough DMA resources to initialize the event queue table.
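To confirm that the option actually took effect on a running system, the value can be read back from sysfs. The following is a minimal Python sketch under one assumption: that the kernel in use exposes the parameter at /sys/module/mlx4_core/parameters/msi_x, as the mainline driver does. I have not verified this on the L4T R16 kernel. The helper name `read_module_param` is hypothetical.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return a module parameter value as a string, or None if unavailable.

    Assumes the parameter is exported through sysfs; returns None when the
    module is not loaded or the parameter file cannot be read.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    value = read_module_param("mlx4_core", "msi_x")
    if value is None:
        print("mlx4_core is not loaded or msi_x is not exposed in sysfs")
    else:
        print("mlx4_core msi_x =", value)
```

A value of 0 would indicate that the driver was told not to attempt MSI-X.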

 

Has anybody successfully used Mellanox adapters on an ARM platform? If so, which driver version and which options did you use?

