Channel: Mellanox Interconnect Community: Message List

Re: MLNX+NVIDIA ASYNC GPUDirect - Segmentation fault: invalid permissions for mapped object running mpi with CUDA


Thanks a lot for the reply. It solved the above issue, but after running mpirun I do not see any latency difference with and without GDR.

 

My questions:

  1. Why do I not see any latency difference with and without GDR?
  2. Is the sequence of steps below correct? Does it matter for question 1?

 

Note: I have a single GPU on both the host and the peer. The IOMMU is disabled.

## nvidia-smi topo -m

        GPU0    mlx5_0  mlx5_1  CPU Affinity
GPU0     X      PHB     PHB     18-35
mlx5_0  PHB      X      PIX
mlx5_1  PHB     PIX      X

 

Steps followed are:

1. Install CUDA 9.2 and add the library and bin paths to .bashrc

2. Install the latest MLNX_OFED

3. Compile and install the nv_peer_mem driver

4. Get UCX from git, configure it with CUDA support, and install it

5. Configure Open MPI 3.1.1 and install it:

./configure --prefix=/usr/local --with-wrapper-ldflags=-Wl,-rpath,/lib --enable-orterun-prefix-by-default --disable-io-romio --enable-picky --with-cuda=/usr/local/cuda-9.2

6. Configure OSU Benchmarks 5.4.2 with CUDA and install it:

./configure --prefix=/root/osu_benchmarks CC=mpicc --enable-cuda --with-cuda=/usr/local/cuda-9.2
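
A quick sanity check of the steps above might look like the sketch below. It only reports whether each piece appears to be present. The `check` helper is a hypothetical name introduced here, and the grep patterns are assumptions about typical tool output rather than anything captured from this setup.

```shell
#!/bin/sh
# check LABEL CMD...: run CMD quietly and report whether it succeeded.
# Hypothetical helper, not part of any of the packages above.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok: $label"
    else
        echo "MISSING: $label"
    fi
}

# Step 3: is the nv_peer_mem kernel module loaded?
check "nv_peer_mem loaded" sh -c 'lsmod | grep -q nv_peer_mem'

# Step 4: does UCX list any CUDA-related transports? (pattern is an assumption)
check "UCX reports CUDA transports" sh -c 'ucx_info -d | grep -qi cuda'

# Step 5: does Open MPI report that it was built with CUDA support?
check "Open MPI built with CUDA" sh -c \
    'ompi_info --parsable --all | grep -q "mpi_built_with_cuda_support:value:true"'
```

Any line that prints MISSING points at a step worth revisiting before comparing latencies.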

 

Then run mpirun. I do not see any latency difference with and without GDR.
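
For reference, the with/without comparison could be driven along the lines of the sketch below. It is a dry run by default and only prints the two command lines. The host names and the `osu_latency` path are placeholders, `gdr_cmd` is a hypothetical helper, and the `btl_openib_want_cuda_gdr` MCA parameter only applies if Open MPI is actually using the openib BTL, which is an assumption about this build.

```shell
#!/bin/sh
# Placeholder host names; replace with the two nodes under test.
HOSTS=${HOSTS:-host1,host2}
# Assumed to be on PATH; otherwise point this at the installed binary.
OSU_LATENCY=${OSU_LATENCY:-osu_latency}

# gdr_cmd FLAG: print an mpirun command line that requests GPUDirect RDMA
# on the openib BTL (FLAG=1) or leaves it off (FLAG=0).
# "D D" asks osu_latency to use device (GPU) buffers on both ranks.
gdr_cmd() {
    echo "mpirun -np 2 -H $HOSTS --mca btl_openib_want_cuda_gdr $1 $OSU_LATENCY D D"
}

for flag in 0 1; do
    cmd=$(gdr_cmd "$flag")
    echo "+ $cmd"
    # Dry run by default; set RUN=1 to actually launch the benchmark.
    if [ "${RUN:-0}" = 1 ]; then
        $cmd
    fi
done
```

Running it as `RUN=1 HOSTS=nodeA,nodeB sh compare.sh` (script name hypothetical) would launch both variants back to back so the small-message latencies can be compared side by side.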

 

Thanks for your help.

