Channel: Mellanox Interconnect Community: Message List

On-Demand Paging with ConnectX-4 ECAT 456A


I spent a couple of hours trying to use ODP in my application, without success.

 

 

Issue 1: ibv_exp_query_device() reports no ODP capabilities: 'general_odp_caps' and every field of 'per_transport_caps' come back as zero. The code snippet is as follows:

 

 

===============================================================================================

  struct ibv_exp_device_attr attr;

  /* Zero the struct, then request the experimental capability flags and the ODP caps. */
  memset(&attr, 0, sizeof(struct ibv_exp_device_attr));
  attr.comp_mask = IBV_EXP_DEVICE_ATTR_EXP_CAP_FLAGS | IBV_EXP_DEVICE_ATTR_ODP;
  TEST_NZ(ibv_exp_query_device(ctxt, &attr), "Could not query experimental device attributes.");

  /* The first two lines print masked flag bits; the rest print the raw ODP capability masks. */
  printf("ODP device support:\t0x%lx\n", attr.exp_device_cap_flags & IBV_EXP_DEVICE_ODP);
  printf("ODP driver support:\t0x%x\n", attr.comp_mask & IBV_EXP_DEVICE_ATTR_ODP);
  printf("general_odp_caps=\t0x%lx\n", attr.odp_caps.general_odp_caps);
  printf("rc_odp_caps=\t0x%x\n", attr.odp_caps.per_transport_caps.rc_odp_caps);
  printf("uc_odp_caps=\t0x%x\n", attr.odp_caps.per_transport_caps.uc_odp_caps);
  printf("ud_odp_caps=\t0x%x\n", attr.odp_caps.per_transport_caps.ud_odp_caps);
  printf("dc_odp_caps=\t0x%x\n", attr.odp_caps.per_transport_caps.dc_odp_caps);
  printf("xrc_odp_caps=\t0x%x\n", attr.odp_caps.per_transport_caps.xrc_odp_caps);
  printf("raw_eth_odp_caps=\t0x%x\n", attr.odp_caps.per_transport_caps.raw_eth_odp_caps);

===============================================================================================

Results:

 

ODP device support:     0x8000000000

ODP driver support:     0x400

general_odp_caps=       0x0

rc_odp_caps=    0x0

uc_odp_caps=    0x0

ud_odp_caps=    0x0

dc_odp_caps=    0x0

xrc_odp_caps=   0x0

raw_eth_odp_caps=       0x0

 

 

Issue 2: Ignoring Issue 1, I tried to register a memory region with ODP as follows:

===============================================================================================

    struct ibv_exp_reg_mr_in in;

    in.pd = ctxt.pd;
    in.addr = ctxt.pages;
    in.length = page_size*MAX_PAGE;
    /* ODP plus local/remote access; both the IBV_EXP_ACCESS_* and IBV_ACCESS_* spellings are ORed in. */
    in.exp_access = IBV_EXP_ACCESS_ON_DEMAND |
                    IBV_EXP_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_WRITE |
                    IBV_EXP_ACCESS_LOCAL_WRITE | IBV_ACCESS_LOCAL_WRITE |
                    IBV_EXP_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_READ |
                    IBV_EXP_ACCESS_REMOTE_ATOMIC | IBV_ACCESS_REMOTE_ATOMIC;
    in.comp_mask = 0;
    TEST_Z(ctxt.mr=ibv_exp_reg_mr(&in),"Could not register odp mr");

===============================================================================================

I re-ran the RDMA transfer program (one-sided RDMA over an RC connection), which works without ODP, and got this error:

 

mlx5: compute26: got completion with error:

00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000

00000000 a9005604 0800013b 0000e4d2

28855:server_routine: Completion with error at server:

28855:server_routine: Failed status 4: wr_id 3, qp_num = 315, vendor_err = 86

 

Simply put, ibv_poll_cq() returned a wc whose status field is 4 (IBV_WC_LOC_PROT_ERR). I also tried a file-mapped region and got the same error.

 

Issue 3: The pre-installed ib_send_bw tool from MLNX_OFED_LINUX-3.3-1.0.0.0-ubuntu16.04-x86_64.tgz works with the --odp flag enabled. But if I build the 'perftest' tool from the source in the same package and run its 'ib_send_bw' with the --odp flag, it reports the following error:

 

---------------------------------------------------------------------------------------

On-demand paging not supported by driver.

failed to create mr

Failed to create MR

local address: LID 0000 QPN 0x013d PSN 0xab38a3

GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:09:25

remote address: LID 0000 QPN 0x013d PSN 0xae7d06

GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:09:26

---------------------------------------------------------------------------------------

#bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]

mlx5: compute25: got completion with error:

00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000

00000000 00000000 00000000 00000000

00000000 92003204 0000013d 000085e2

Completion with error at server

Failed status 4: wr_id 0 syndrom 0x32

rcnt=0

 

 

I suspect that the perftest binary and source do not match each other. If I could get the perftest build to run with ODP, I think I could work out what I need to do in my code.

 

Operating system:

weijia@compute26:~/workspace/odp$ cat /etc/lsb-release

DISTRIB_ID=Ubuntu

DISTRIB_RELEASE=16.04

DISTRIB_CODENAME=xenial

DISTRIB_DESCRIPTION="Ubuntu 16.04 LTS"

weijia@compute26:~/workspace/odp$ uname -a

Linux compute26 4.4.0-28-generic #47-Ubuntu SMP Fri Jun 24 10:09:13 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

 

HCA:

CA 'mlx5_1'

        CA type: MT4115

        Number of ports: 1

        Firmware version: 12.16.1006

        Hardware version: 0

        Node GUID: 0x7cfe90030080ab79

        System image GUID: 0x7cfe90030080ab78

        Port 1:

                State: Active

                Physical state: LinkUp

                Rate: 100

                Base lid: 0

                LMC: 0

                SM lid: 0

                Capability mask: 0x3c010000

                Port GUID: 0x7efe90fffe80ab79

                Link layer: Ethernet

