Thanks a lot for the reply. It solved the above issue, but after running mpirun I do not see any latency difference with and without GDR.
My questions:
- Why do I not see any latency difference with and without GDR?
- Is the sequence of steps below correct? Does it matter for question 1?
Note: I have a single GPU on both the host and the peer. The IOMMU is disabled.
## nvidia-smi topo -m
            GPU0    mlx5_0  mlx5_1  CPU Affinity
    GPU0     X      PHB     PHB     18-35
    mlx5_0  PHB      X      PIX
    mlx5_1  PHB     PIX      X
The steps I followed are:
1. Install CUDA 9.2 and add its library and bin paths to .bashrc.
2. Install the latest Mellanox OFED (MLNX_OFED).
3. Compile and install the nv_peer_mem driver.
4. Get UCX from git, configure it with CUDA support, and install it.
5. Configure Open MPI 3.1.1 and install it:
./configure --prefix=/usr/local --with-wrapper-ldflags=-Wl,-rpath,/lib --enable-orterun-prefix-by-default --disable-io-romio --enable-picky --with-cuda=/usr/local/cuda-9.2
6. Configure OSU Micro-Benchmarks 5.4.2 with CUDA and install it:
./configure prefix=/root/osu_benchmarks CC=mpicc --enable-cuda --with-cuda=/usr/local/cuda-9.2
7. Run mpirun. I do not see any latency difference with and without GDR.
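
A minimal sketch of sanity checks for the steps above, to be run on both the host and the peer. It assumes `lsmod`, `ucx_info`, and `ompi_info` are on PATH; the `check` helper is a hypothetical name introduced only for this example, and a missing tool is simply reported as FAIL. The last check is there because the Open MPI configure line above does not pass `--with-ucx`, so whether the UCX PML was built depends on configure finding UCX on its own.

```shell
#!/bin/sh
# Sanity checks for the GPUDirect RDMA (GDR) software stack.
# Assumption: the tools below are on PATH; a missing tool counts as FAIL.

check() {
    # $1 = label, remaining args = command to run; prints OK or FAIL.
    label=$1; shift
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        echo "FAIL $label"
    fi
}

# Step 3: is the nv_peer_mem kernel module loaded?
check "nv_peer_mem loaded" sh -c 'lsmod | grep -q nv_peer_mem'

# Step 4: does UCX list any CUDA transports (e.g. cuda_copy, gdr_copy)?
check "UCX has CUDA transports" sh -c 'ucx_info -d | grep -qi cuda'

# Step 5: was Open MPI built with CUDA support?
check "Open MPI built with CUDA" sh -c \
    'ompi_info --parsable --all | grep -q "mpi_built_with_cuda_support:value:true"'

# Step 5: was the UCX PML component built into Open MPI?
check "Open MPI has UCX PML" sh -c 'ompi_info | grep -q "MCA pml: ucx"'
```

If any line prints FAIL, that step is a likely place to look before comparing latencies.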
Thanks for your help.