Yes, still having problems. Since my original post I'm now on my third switch (and the second Topspin 120), and I'm having the exact same issue. While I can plug two of the InfiniHost IIIs together and get a link light, when I plug them into the switch I get no link light. I also cannot get the ConnectX card to work with either, but I'm starting to suspect that's just a bad card. I just can't believe one person can have this much trouble with this stuff.
Re: New to infiniband, can't get a working connection.
Re: Re: New to infiniband, can't get a working connection.
No worries. Which OS are you using?
Is there any chance you could do stuff on CentOS/RHEL 6.4?
Asking that because it's what I'm super familiar with.
If you're ok with that, please install the CentOS/RHEL provided IB software, and also pciutils:
$ sudo yum groupinstall "Infiniband Support"
$ sudo yum install mstflint pciutils
$ sudo chkconfig rdma on
$ sudo service rdma start
Then let's do some basic info gathering so we know what we're dealing with.
- Run lspci -Qvvs on the ConnectX card and at least one of the InfiniHost IIIs, then post the results here
- Also query the firmware of both using mstflint
Example from a ConnectX card here. First I find out its PCI address in the box:
$ sudo lspci |grep Mell
01:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0)
Then use lspci -Qvvs on that address, to retrieve all of the potentially useful info:
$ sudo lspci -Qvvs 01:00.0
01:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0)
Subsystem: Mellanox Technologies Device 0006
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 16
Region 0: Memory at f7c00000 (64-bit, non-prefetchable) [size=1M]
Region 2: Memory at f0000000 (64-bit, prefetchable) [size=8M]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] Vital Product Data
Product Name: Eagle DDR
Read-only fields:
[PN] Part number: 375-3549-01
[EC] Engineering changes: 51
[SN] Serial number: 1388FMH-0905400010
[V0] Vendor specific: PCIe x8
[RV] Reserved: checksum good, 0 byte(s) reserved
Read/write fields:
[V1] Vendor specific: N/A
[YA] Asset tag: N/A
[RW] Read-write area: 111 byte(s) free
End
Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
Vector table: BAR=0 offset=0007c000
PBA: BAR=0 offset=0007d000
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #8, Speed 2.5GT/s, Width x8, ASPM L0s, Latency L0 unlimited, L1 unlimited
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 1
ARICtl: MFVC- ACS-, Function Group: 0
Kernel driver in use: mlx4_core
Kernel modules: mlx4_core
Note the highlighted bits (the VPD part/serial numbers and the PCIe link status). For ConnectX cards this stuff is useful. For my card it's showing a Sun part number, as it was originally a Sun-badged card (now reflashed to stock firmware). The PCI link is in x8 state too, which is good to confirm (if it wasn't, it would indicate a problem).
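If you just want to double-check the PCIe link state of a card without reading the full dump, filtering for the link lines works too (optional, same info as above):
$ sudo lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'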
And the mstflint output example:
$ sudo mstflint -d 01:00.0 q
Image type: ConnectX
FW Version: 2.9.1000
Device ID: 25418
Description: Node Port1 Port2 Sys image
GUIDs: 0003ba000100edb8 0003ba000100edb9 0003ba000100edba 0003ba000100edbb
MACs: 0003ba00edb9 0003ba00edba
Board ID: (MT_04A0120002)
VSD:
PSID: MT_04A0120002
That tells us the firmware version on the card. Useful to know, as it might need upgrading (very easy to do).
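For reference, a firmware upgrade with mstflint generally looks like the sketch below; the image filename is only a placeholder, the real file has to match the card's PSID:
$ sudo mstflint -d 01:00.0 -i <fw-image-for-your-psid>.bin burn
$ sudo mstflint -d 01:00.0 q
The second command re-checks the version; the new firmware usually only becomes active after a reboot or driver reload.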
After you've pasted that info here, we can start figuring out if there's anything wrong with the basics first and fix them. Then we can move onto the next stuff.
(note - edited for typo fixes)
ESX 5.1.0 IPoIB NFS datastore freezes
Hello,
hopefully this is the right place to provide some information about a VMware IPoIB datastore freeze. We are testing a new VMware ESX 5.1.0 setup. Sadly we only have 3 ConnectX (gen 1) cards left; the ConnectX-2 ones are all in our production ESX 4.1 environment. We know this is not officially supported right now, but I want to make sure the error will not happen when we upgrade the production machines.
The hardware is:
Fujitsu RX300 S6 (Dual Intel X5670)
ConnectX MT25418 Firmware 2.9.1000
ESX 5.1.0 1117900
Mellanox driver 1.8.1
When copying data between VMs, all of a sudden the adapter freezes and the datastore is "lost". In the vmkernel log we see endless lines like the ones below:
2013-07-22T17:22:48.775Z cpu10:8202)<3>vmnic_ib1:ipoib_send:504: found skb where it does not belongtx_head = 3827020, tx_tail =3827020
2013-07-22T17:22:48.775Z cpu10:8202)<3>vmnic_ib1:ipoib_send:505: netif_queue_stopped = 0
2013-07-22T17:22:48.775Z cpu10:8202)Backtrace for current CPU #10, worldID=8202, ebp=0x41220029af68
2013-07-22T17:22:48.776Z cpu10:8202)0x41220029af68:[0x41802a310d59]ipoib_send@<None>#<None>+0x5d4 stack: 0xffffff, 0x0, 0x412410d4c948,
2013-07-22T17:22:48.777Z cpu10:8202)0x41220029b018:[0x41802a310d59]ipoib_send@<None>#<None>+0x5d4 stack: 0x41220029b088, 0x418029e0a55b
2013-07-22T17:22:48.777Z cpu10:8202)0x41220029b148:[0x41802a317160]ipoib_mcast_send@<None>#<None>+0xf7 stack: 0x41220029b188, 0x418029d
2013-07-22T17:22:48.778Z cpu10:8202)0x41220029b238:[0x41802a31dabf]ipoib_start_xmit@<None>#<None>+0x396 stack: 0x41220029b598, 0x412200
2013-07-22T17:22:48.778Z cpu10:8202)0x41220029b398:[0x41802a31ac3b]vmipoib_start_xmit@<None>#<None>+0x49a stack: 0x41000be0b880, 0x839e
2013-07-22T17:22:48.779Z cpu10:8202)0x41220029b468:[0x41802a16d8f0]DevStartTxImmediate@com.vmware.driverAPI#9.2+0x137 stack: 0x41220029
2013-07-22T17:22:48.779Z cpu10:8202)0x41220029b4d8:[0x418029d3470e]UplinkDevTransmit@vmkernel#nover+0x295 stack: 0x10787a40, 0x41220029
2013-07-22T17:22:48.780Z cpu10:8202)0x41220029b558:[0x418029dabbaa]NetSchedFIFORunLocked@vmkernel#nover+0x1a5 stack: 0xc0bd95300, 0x0,
2013-07-22T17:22:48.781Z cpu10:8202)0x41220029b5e8:[0x418029dabf57]NetSchedFIFOInput@vmkernel#nover+0x24e stack: 0x41220029b638, 0x4180
2013-07-22T17:22:48.781Z cpu10:8202)0x41220029b698:[0x418029dab0b2]NetSchedInput@vmkernel#nover+0x191 stack: 0x41220029b748, 0x41000bd9
2013-07-22T17:22:48.782Z cpu10:8202)0x41220029b738:[0x418029d3ced0]IOChain_Resume@vmkernel#nover+0x247 stack: 0x41220029b798, 0x418029d
2013-07-22T17:22:48.782Z cpu10:8202)0x41220029b788:[0x418029d2c0e4]PortOutput@vmkernel#nover+0xe3 stack: 0x41220029b808, 0x41802a216a2a
2013-07-22T17:22:48.783Z cpu10:8202)0x41220029b808:[0x41802a2254c8]TeamES_Output@<None>#<None>+0x16b stack: 0x0, 0x418029cc3879, 0x4122
2013-07-22T17:22:48.784Z cpu10:8202)0x41220029ba08:[0x41802a218047]EtherswitchPortDispatch@<None>#<None>+0x142a stack: 0xffffffff000000
2013-07-22T17:22:48.784Z cpu10:8202)0x41220029ba78:[0x418029d2b2c7]Port_InputResume@vmkernel#nover+0x146 stack: 0x410001553540, 0x41220
2013-07-22T17:22:48.785Z cpu10:8202)0x41220029baa8:[0x41802a3b95cb]TcpipTxDispatch@<None>#<None>+0x9a stack: 0x7c1f45, 0x41220029bad8,
2013-07-22T17:22:48.785Z cpu10:8202)0x41220029bb28:[0x41802a3ba118]TcpipDispatch@<None>#<None>+0x1c7 stack: 0x246, 0x41220029bb70, 0x41
2013-07-22T17:22:48.786Z cpu10:8202)0x41220029bca8:[0x418029d0b245]WorldletProcessQueue@vmkernel#nover+0x4b0 stack: 0x41220029bd58, 0xb
2013-07-22T17:22:48.786Z cpu10:8202)0x41220029bce8:[0x418029d0b895]WorldletBHHandler@vmkernel#nover+0x60 stack: 0x100000000000001, 0x41
2013-07-22T17:22:48.786Z cpu10:8202)0x41220029bd68:[0x418029c2083a]BH_Check@vmkernel#nover+0x185 stack: 0x41220029be68, 0x41220029be08,
2013-07-22T17:22:48.787Z cpu10:8202)0x41220029be68:[0x418029dbc9bc]CpuSchedIdleLoopInt@vmkernel#nover+0x13b stack: 0x41220029be98, 0x41
2013-07-22T17:22:48.787Z cpu10:8202)0x41220029be78:[0x418029dc66de]CpuSched_IdleLoop@vmkernel#nover+0x15 stack: 0xa, 0x14, 0x41220029bf
2013-07-22T17:22:48.787Z cpu10:8202)0x41220029be98:[0x418029c4f71e]Init_SlaveIdle@vmkernel#nover+0x49 stack: 0x0, 0x0, 0x0, 0x0, 0x0
2013-07-22T17:22:48.788Z cpu10:8202)0x41220029bfe8:[0x418029ee26a6]SMPSlaveIdle@vmkernel#nover+0x31d stack: 0x0, 0x0, 0x0, 0x0, 0x0
Any help is appreciated.
Best regards.
Markus
Management command failed in KVM for SR-IOV
Hi,
This is my fourth day of fighting with SR-IOV and KVM.
I can ping from the VM to another IPoIB computer, but when I try to use the ibnetdiscover command I get a SIGSEGV:
ibnetdiscover
src/query_smp.c:98; send failed; -5
#
# Topology file: generated on Fri Jul 19 19:28:24 2013
#
Segmentation fault (core dumped)
Most of the IB commands fail; dmesg shows:
mlx4_core 0000:04:00.0: vhcr command MAD_IFC (0x24) slave:3 in_param 0x29f3a000 in_mod=0xffff0001, op_mod=0xc failed with error:0, status -1
mlx4_core 0000:04:00.0: vhcr command SET_PORT (0xc) slave:3 in_param 0x29f3a000 in_mod=0x1, op_mod=0x0 failed with error:0, status -22
mlx4_core 0000:04:00.0: slave 3 is trying to execute a Subnet MGMT MAD, class 0x1, method 0x81 for attr 0x11. Rejecting
It looks like the MAD_IFC firmware command is failing in the device for some reason, but I have no idea about the cause. Possibly this part of the code is related:
+ if (slave != dev->caps.function &&
+ ((smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) ||
+ (smp->mgmt_class == IB_MGMT_CLASS_SUBN_LID_ROUTED &&
+ smp->method == IB_MGMT_METHOD_SET))) {
+ mlx4_err(dev, "slave %d is trying to execute a Subnet MGMT MAD, "
+ "class 0x%x, method 0x%x for attr 0x%x. Rejecting\n",
+ slave, smp->method, smp->mgmt_class,
+ be16_to_cpu(smp->attr_id));
+ return -EPERM;
+ }
from
+static int mlx4_MAD_IFC_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
Please find below some details about my build.
I'd really appreciate it if anybody could point me in the right direction, or even better, help me fix the issue.
Thanks in advance
Marcin
Host:
-----
Motherboard: Supermicro X9DRI-F
CPUs: 2x E5-2640
System: CentOS 6.3:2.6.32-279.el6.x86_64 and CentOS 6.4 2.6.32-358.el6.x86_64
Infiniband: Mellanox Technologies MT27500 Family [ConnectX-3], MCX354A-QCB
Mellanox OFED: MLNX_OFED_LINUX-2.0-2.0.5-rhel6.3-x86_64
qemu-kvm.x86_64 2:0.12.1.2-2.355.el6
#lspci | grep Mel
04:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
04:00.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.2 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.3 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.4 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.5 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.6 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.7 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:01.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
#dmesg | grep mlx4
mlx4_core: Mellanox ConnectX core driver v1.1 (Apr 23 2013)
mlx4_core: Initializing 0000:04:00.0
mlx4_core 0000:04:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
mlx4_core 0000:04:00.0: setting latency timer to 64
mlx4_core 0000:04:00.0: Enabling SR-IOV with 5 VFs
mlx4_core 0000:04:00.0: Running in master mode
mlx4_core 0000:04:00.0: irq 109 for MSI/MSI-X
mlx4_core 0000:04:00.0: irq 110 for MSI/MSI-X
mlx4_core 0000:04:00.0: irq 111 for MSI/MSI-X
mlx4_core 0000:04:00.0: irq 112 for MSI/MSI-X
mlx4_core: Initializing 0000:04:00.1
mlx4_core 0000:04:00.1: enabling device (0000 -> 0002)
mlx4_core 0000:04:00.1: setting latency timer to 64
mlx4_core 0000:04:00.1: Detected virtual function - running in slave mode
mlx4_core 0000:04:00.1: Sending reset
mlx4_core 0000:04:00.0: Received reset from slave:1
mlx4_core 0000:04:00.1: Sending vhcr0
mlx4_core 0000:04:00.1: HCA minimum page size:512
mlx4_core 0000:04:00.1: irq 113 for MSI/MSI-X
mlx4_core 0000:04:00.1: irq 114 for MSI/MSI-X
mlx4_core 0000:04:00.1: irq 115 for MSI/MSI-X
mlx4_core 0000:04:00.1: irq 116 for MSI/MSI-X
mlx4_core: Initializing 0000:04:00.2
mlx4_core 0000:04:00.2: enabling device (0000 -> 0002)
mlx4_core 0000:04:00.2: setting latency timer to 64
mlx4_core 0000:04:00.2: Skipping virtual function:2
mlx4_core: Initializing 0000:04:00.3
mlx4_core 0000:04:00.3: enabling device (0000 -> 0002)
mlx4_core 0000:04:00.3: setting latency timer to 64
mlx4_core 0000:04:00.3: Skipping virtual function:3
mlx4_core: Initializing 0000:04:00.4
mlx4_core 0000:04:00.4: enabling device (0000 -> 0002)
mlx4_core 0000:04:00.4: setting latency timer to 64
mlx4_core 0000:04:00.4: Skipping virtual function:4
mlx4_core: Initializing 0000:04:00.5
mlx4_core 0000:04:00.5: enabling device (0000 -> 0002)
mlx4_core 0000:04:00.5: setting latency timer to 64
mlx4_core 0000:04:00.5: Skipping virtual function:5
<mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v1.0 (Apr 23 2013)
mlx4_core 0000:04:00.0: mlx4_ib: multi-function enabled
mlx4_core 0000:04:00.0: mlx4_ib: initializing demux service for 80 qp1 clients
mlx4_core 0000:04:00.1: mlx4_ib: multi-function enabled
mlx4_core 0000:04:00.1: mlx4_ib: operating in qp1 tunnel mode
mlx4_en: Mellanox ConnectX HCA Ethernet driver v2.1 (Apr 23 2013)
mlx4_en 0000:04:00.0: Activating port:2
mlx4_en: eth2: Using 216 TX rings
mlx4_en: eth2: Using 4 RX rings
mlx4_en: eth2: Initializing port
mlx4_en 0000:04:00.1: Activating port:2
mlx4_en: eth3: Using 216 TX rings
mlx4_en: eth3: Using 4 RX rings
mlx4_en: eth3: Initializing port
mlx4_core 0000:04:00.0: mlx4_ib: Port 1 logical link is up
mlx4_core 0000:04:00.0: Received reset from slave:2
mlx4_core 0000:04:00.0: slave 2 is trying to execute a Subnet MGMT MAD, class 0x1, method 0x81 for attr 0x11. Rejecting
mlx4_core 0000:04:00.0: vhcr command MAD_IFC (0x24) slave:2 in_param 0x106a10000 in_mod=0xffff0001, op_mod=0xc failed with error:0, status -1
mlx4_core 0000:04:00.1: mlx4_ib: Port 1 logical link is up
mlx4_core 0000:04:00.0: slave 2 is trying to execute a Subnet MGMT MAD, class 0x1, method 0x81 for attr 0x11. Rejecting
mlx4_core 0000:04:00.0: vhcr command MAD_IFC (0x24) slave:2 in_param 0x119079000 in_mod=0xffff0001, op_mod=0xc failed with error:0, status -1
mlx4_core 0000:04:00.0: mlx4_ib: Port 1 logical link is down
mlx4_core 0000:04:00.1: mlx4_ib: Port 1 logical link is down
mlx4_core 0000:04:00.0: mlx4_ib: Port 1 logical link is up
mlx4_core 0000:04:00.1: mlx4_ib: Port 1 logical link is up
# ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.11.500
node_guid: 0002:c903:00a2:8fb0
sys_image_guid: 0002:c903:00a2:8fb3
vendor_id: 0x02c9
vendor_part_id: 4099
hw_ver: 0x0
board_id: MT_1090110018
phys_port_cnt: 2
port: 1
state: PORT_ACTIVE (4)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 1
port_lid: 1
port_lmc: 0x00
link_layer: InfiniBand
port: 2
state: PORT_DOWN (1)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: InfiniBand
#cat /etc/modprobe.d/mlx4_core.conf
options mlx4_core num_vfs=8 port_type_array=1,1 probe_vf=1
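For context, a VF is normally handed to a KVM guest through standard libvirt PCI device assignment, roughly along these lines (the guest name and PCI address below are placeholders, not taken from this setup):
# virsh nodedev-detach pci_0000_04_00_3
# virsh attach-device CentOS64 vf-hostdev.xml --config
where vf-hostdev.xml contains a <hostdev mode='subsystem' type='pci'> entry pointing at 0000:04:00.3.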
KVM Guest: CentOS 6.4 and CentOS 6.3
----------------------
Mellanox OFED: MLNX_OFED_LINUX-2.0-2.0.5-rhel6.3-x86_64
Kernel: 2.6.32-279.el6.x86_64
#lspci | grep Mel
00:07.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
#ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.11.500
node_guid: 0014:0500:c0bb:4473
sys_image_guid: 0002:c903:00a2:8fb3
vendor_id: 0x02c9
vendor_part_id: 4100
hw_ver: 0x0
board_id: MT_1090110018
phys_port_cnt: 2
port: 1
state: PORT_ACTIVE (4)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 1
port_lid: 1
port_lmc: 0x00
link_layer: InfiniBand
port: 2
state: PORT_DOWN (1)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: InfiniBand
# sminfo
ibwarn: [3673] _do_madrpc: send failed; Function not implemented
ibwarn: [3673] mad_rpc: _do_madrpc failed; dport (Lid 1)
sminfo: iberror: failed: query
OpenSM log:
Jul 19 09:57:54 001056 [C520D700] 0x02 -> osm_vendor_init: 1000 pending umads specified
Jul 19 09:57:54 002074 [C520D700] 0x80 -> Entering DISCOVERING state
Using default GUID 0x14050000000002
Jul 19 09:57:54 191924 [C520D700] 0x02 -> osm_vendor_bind: Mgmt class 0x81 binding to port GUID 0x14050000000002
Jul 19 09:57:54 671075 [C520D700] 0x02 -> osm_vendor_bind: Mgmt class 0x03 binding to port GUID 0x14050000000002
Jul 19 09:57:54 671503 [C520D700] 0x02 -> osm_vendor_bind: Mgmt class 0x04 binding to port GUID 0x14050000000002
Jul 19 09:57:54 672363 [C520D700] 0x02 -> osm_vendor_bind: Mgmt class 0x21 binding to port GUID 0x14050000000002
Jul 19 09:57:54 672774 [C520D700] 0x02 -> osm_opensm_bind: Setting IS_SM on port 0x0014050000000002
Jul 19 09:57:54 673345 [C520D700] 0x01 -> osm_vendor_set_sm: ERR 5431: setting IS_SM capmask: cannot open file '/dev/infiniband/issm0': Invalid argument
Jul 19 09:57:54 674233 [C1605700] 0x01 -> osm_vendor_send: ERR 5430: Send p_madw = 0x7f11b00008c0 of size 256 TID 0x1234 failed -5 (Invalid argument)
Jul 19 09:57:54 674278 [C1605700] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_ERROR): SubnGet(NodeInfo), attr_mod 0x0, TID 0x1234
Jul 19 09:57:54 674311 [C1605700] 0x01 -> vl15_send_mad: ERR 3E03: MAD send failed (IB_UNKNOWN_ERROR)
Jul 19 09:57:54 674336 [C0C04700] 0x01 -> state_mgr_is_sm_port_down: ERR 3308: SM port GUID unknown
Re: Management command failed in KVM for SR-IOV
Hello,
Could you provide ibstat?
Could you provide an sminfo example that targets a specific port you know an SM is present in?
Where are you running your SM?
I do not see any indication of fabric connectivity other than state: PORT_ACTIVE (4). I am thinking that perhaps the commands you are getting bad MAD responses from are being directed toward a non-linked port. Similar behavior happens on dual-port cards with some commands.
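For example, something along these lines (the LID is just a placeholder for wherever your SM actually is):
# sminfo 2
i.e. pointing sminfo directly at the SM's LID, or at its port GUID via -G.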
Re: Management command failed in KVM for SR-IOV
Also... I've seen machines before that do not support SR-IOV. This feature also needs to be supported by the hardware, so you need to check with your server vendor.
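Two generic checks that usually help here, using only standard Linux tooling (just a sketch, adjust the PCI address):
# dmesg | grep -iE 'dmar|iommu'
# lspci -s 04:00.0 -vvv | grep -iA3 'sr-iov'
The first shows whether the platform IOMMU (VT-d) came up; the second shows the SR-IOV capability block the card advertises.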
Re: Using ConnectX-2 VPI adapters for network workstation with 2 nodes.
I have a question about the SX6018 switch, and any input would be greatly appreciated. Will the FDR ports auto-negotiate down to QDR and work with ConnectX-2 adapters? Thanks!
Re: Management command failed in KVM for SR-IOV
His hardware supports it because he sees the virtual functions.
ping time inconsistent in 10G Ethernet Cards
linux version : Linux 2.6.32-279.el6.x86_64(CentOS 6.3 64bit)
Mellanox OFED Version : MLNX_OFED_LINUX-1.9-0.1.8-rhel6.3-x86_64
VMA Version : libvma-6.3.28-0-x86_64.rpm
HCA Card : MCX312A-XCBT
# ethtool -i p5p1
driver: mlx4_en (MT_1080120023_CX-3)
version: 1.5.10 (Jan 2013)
firmware-version: 2.10.700
bus-info: 0000:04:00.0
# ethtool -i p5p1
driver: mlx4_en (MT_1080120023_CX-3)
version: 1.5.10 (Jan 2013)
firmware-version: 2.10.800
bus-info: 0000:04:00.0
My customer is using Mellanox 10G Ethernet cards.
The Mellanox 10G card does not have a consistent ping time (11us~38us, inconsistent),
but other cards (Solarflare, Chelsio, ...) have a consistent ping time (13~15us).
They are very sensitive to network latency and speed.
Is there a way to resolve this issue?
64 bytes from 10.154.136.10: icmp_seq=2 ttl=64 time=0.021 ms
64 bytes from 10.154.136.10: icmp_seq=3 ttl=64 time=0.026 ms
64 bytes from 10.154.136.10: icmp_seq=4 ttl=64 time=0.018 ms
64 bytes from 10.154.136.10: icmp_seq=5 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=6 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=7 ttl=64 time=0.020 ms
64 bytes from 10.154.136.10: icmp_seq=8 ttl=64 time=0.018 ms
64 bytes from 10.154.136.10: icmp_seq=9 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=10 ttl=64 time=0.020 ms
64 bytes from 10.154.136.10: icmp_seq=11 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=12 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=13 ttl=64 time=0.014 ms
64 bytes from 10.154.136.10: icmp_seq=14 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=15 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=16 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=17 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=18 ttl=64 time=0.044 ms
64 bytes from 10.154.136.10: icmp_seq=19 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=20 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=21 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=22 ttl=64 time=0.020 ms
64 bytes from 10.154.136.10: icmp_seq=23 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=24 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=25 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=26 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=27 ttl=64 time=0.020 ms
64 bytes from 10.154.136.10: icmp_seq=28 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=29 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=30 ttl=64 time=0.020 ms
64 bytes from 10.154.136.10: icmp_seq=31 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=32 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=33 ttl=64 time=0.018 ms
64 bytes from 10.154.136.10: icmp_seq=34 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=35 ttl=64 time=0.021 ms
64 bytes from 10.154.136.10: icmp_seq=36 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=37 ttl=64 time=0.014 ms
64 bytes from 10.154.136.10: icmp_seq=38 ttl=64 time=0.049 ms
64 bytes from 10.154.136.10: icmp_seq=39 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=40 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=41 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=42 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=43 ttl=64 time=0.020 ms
64 bytes from 10.154.136.10: icmp_seq=44 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=45 ttl=64 time=0.021 ms
64 bytes from 10.154.136.10: icmp_seq=46 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=47 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=48 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=49 ttl=64 time=0.021 ms
64 bytes from 10.154.136.10: icmp_seq=50 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=51 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=52 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=53 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=54 ttl=64 time=0.018 ms
64 bytes from 10.154.136.10: icmp_seq=55 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=56 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=57 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=58 ttl=64 time=0.044 ms
64 bytes from 10.154.136.10: icmp_seq=59 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=60 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=61 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=62 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=63 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=64 ttl=64 time=0.018 ms
64 bytes from 10.154.136.10: icmp_seq=65 ttl=64 time=0.020 ms
64 bytes from 10.154.136.10: icmp_seq=66 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=67 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=68 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=69 ttl=64 time=0.018 ms
64 bytes from 10.154.136.10: icmp_seq=70 ttl=64 time=0.018 ms
64 bytes from 10.154.136.10: icmp_seq=71 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=72 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=73 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=74 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=75 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=76 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=77 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=78 ttl=64 time=0.023 ms
64 bytes from 10.154.136.10: icmp_seq=79 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=80 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=81 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=82 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=83 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=84 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=85 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=86 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=87 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=88 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=89 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=90 ttl=64 time=0.019 ms
64 bytes from 10.154.136.10: icmp_seq=91 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=92 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=93 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=94 ttl=64 time=0.018 ms
64 bytes from 10.154.136.10: icmp_seq=95 ttl=64 time=0.013 ms
64 bytes from 10.154.136.10: icmp_seq=96 ttl=64 time=0.011 ms
64 bytes from 10.154.136.10: icmp_seq=97 ttl=64 time=0.021 ms
64 bytes from 10.154.136.10: icmp_seq=98 ttl=64 time=0.038 ms
64 bytes from 10.154.136.10: icmp_seq=99 ttl=64 time=0.012 ms
64 bytes from 10.154.136.10: icmp_seq=100 ttl=64 time=0.019 ms
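(Not a confirmed fix for this case, but one thing commonly tested for jitter like this on mlx4_en is adaptive interrupt moderation, e.g. disabling it temporarily with:
# ethtool -C p5p1 adaptive-rx off rx-usecs 0 rx-frames 1
and re-running the ping test.)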
Re: Management command failed in KVM for SR-IOV
Hi,
Thanks for the response and help.
First of all, the Supermicro motherboard supports SR-IOV and VT-d. In this version of the motherboard and BIOS, SR-IOV is turned on all the time.
I forgot to mention that when the physical device is passed through, everything works as expected in the VM.
For this issue I'm working on 2 physical nodes, S1 and G1; I have OpenSM on G1:
OpenSM (sminfo BUILD VERSION: 1.5.8.MLNX_20110906 Build date: Jun 26 2012 21:31:16)
S1 hosts the virtual system CentOS64.
I started working with InfiniBand technology about 1 year ago, and so far almost everything I needed worked out of the box,
so I can be a little bit clumsy.
[root@G1]# sminfo
sminfo: sm lid 2 sm guid 0x8f104039814b9, activity count 3626 priority 0 state 3 SMINFO_MASTER
[root@S1 ~]# sminfo
sminfo: sm lid 2 sm guid 0x8f104039814b9, activity count 3707 priority 0 state 3 SMINFO_MASTER
[root@S1 ~]# sminfo -G 0x8f104039814b9
sminfo: sm lid 2 sm guid 0x8f104039814b9, activity count 3718 priority 0 state 3 SMINFO_MASTER
[root@CentOS64 ~]# sminfo -G 0x8f104039814b9
ibwarn: [3178] ib_path_query_via: sa call path_query failed
sminfo: iberror: failed: can't resolve destination port 0x8f104039814b9
[root@CentOS64 ~]# ibstat
CA 'mlx4_0'
CA type: MT4100
Number of ports: 2
Firmware version: 2.11.500
Hardware version: 0
Node GUID: 0x001405008eaa0a36
System image GUID: 0x0002c90300a28fb3
Port 1:
State: Active
Physical state: LinkUp
Rate: 20
Base lid: 1
LMC: 0
SM lid: 2
Capability mask: 0x02514868
Port GUID: 0x0014050000000002
Link layer: InfiniBand
Port 2:
State: Down
Physical state: Disabled
Rate: 10
Base lid: 4
LMC: 0
SM lid: 1
Capability mask: 0x02514868
Port GUID: 0x0014050000000081
Link layer: InfiniBand
2 devices from: options mlx4_core num_vfs=5 port_type_array=1,1 probe_vf=1
[root@S1 ~]# ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 2
Firmware version: 2.11.500
Hardware version: 0
Node GUID: 0x0002c90300a28fb0
System image GUID: 0x0002c90300a28fb3
Port 1:
State: Active
Physical state: LinkUp
Rate: 20
Base lid: 1
LMC: 0
SM lid: 2
Capability mask: 0x02514868
Port GUID: 0x0002c90300a28fb1
Link layer: InfiniBand
Port 2:
State: Down
Physical state: Disabled
Rate: 10
Base lid: 4
LMC: 0
SM lid: 1
Capability mask: 0x02514868
Port GUID: 0x0002c90300a28fb2
Link layer: InfiniBand
CA 'mlx4_1'
CA type: MT4100
Number of ports: 2
Firmware version: 2.11.500
Hardware version: 0
Node GUID: 0x00140500f8bf4e16
System image GUID: 0x0002c90300a28fb3
Port 1:
State: Active
Physical state: LinkUp
Rate: 20
Base lid: 1
LMC: 0
SM lid: 2
Capability mask: 0x02514868
Port GUID: 0x0014050000000001
Link layer: InfiniBand
Port 2:
State: Down
Physical state: Disabled
Rate: 10
Base lid: 4
LMC: 0
SM lid: 1
Capability mask: 0x02514868
Port GUID: 0x0014050000000080
Link layer: InfiniBand
[root@G1 ]# ibstat
CA 'mthca0'
CA type: MT25208 (MT23108 compat mode)
Number of ports: 2
Firmware version: 4.7.600
Hardware version: a0
Node GUID: 0x0008f104039814b8
System image GUID: 0x0008f104039814bb
Port 1:
State: Active
Physical state: LinkUp
Rate: 20
Base lid: 2
LMC: 0
SM lid: 2
Capability mask: 0x02510a6a
Port GUID: 0x0008f104039814b9
Link layer: InfiniBand
Port 2:
State: Down
Physical state: Polling
Rate: 10
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x02510a68
Port GUID: 0x0008f104039814ba
Link layer: InfiniBand
Re: ESX 5.1.0 IPoIB NFS datastore freezes
Hello,
In the meantime I was able to narrow the error down to quite a simple test case:
1) create an NFS mount inside a VM that uses the IPoIB network interface of the ESX host
2) copy data via scp from somewhere into this machine, onto the NFS mount
When the error occurs for the first time, one can read the following in the vmkernel.log:
WARNING: LinDMA: Linux_DMACheckContraints:149:Cannot
map machine address = 0x15ffff37b0, length = 65160
for device 0000:02:00.0; reason = buffer straddles
device dma boundary (0xffffffff)
<3>vmnic_ib1:ipoib_send:504: found skb where it does not belong
tx_head = 323830, tx_tail =323830
<3>vmnic_ib1:ipoib_send:505: netif_queue_stopped = 0
Backtrace for current CPU #20, worldID=8212, ebp=0x41220051b028
ipoib_send@<None>#<None>+0x5d4 stack: 0x41800c4524aa, 0x4f0f5000000d
ipoib_send@<None>#<None>+0x5d4 stack: 0x41800c44bca8, 0x41000fe5d6c0
ipoib_start_xmit@<None>#<None>+0x53 stack: 0x41220051b238, 0x41800c4
...
Best regards.
Markus
Re: Management command failed in KVM for SR-IOV
Are you running the SM from the hypervisor host machine? If yes, can you try running the SM on a regular machine?
Re: Management command failed in KVM for SR-IOV
No, the SM is running on a regular one.
Re: Management command failed in KVM for SR-IOV
Your SM is running on the older HCA type (MT25208), which should be just fine, but I'm still thinking you could try running it on a newer ConnectX card (you have a few of those out there).
Re: Management command failed in KVM for SR-IOV
I have a few more computers running on older 20Gb/s devices; the new ConnectX-3 and the server were bought to test VM functionality. As you suggested, I moved OpenSM to the only new device I have at the moment, but there is no improvement. Now OpenSM is running on S1 and it's visible.
[root@S1 ~]# sminfo
sminfo: sm lid 1 sm guid 0x2c90300a28fb1, activity count 239 priority 0 state 3 SMINFO_MASTER
[root@G1 ]# sminfo
sminfo: sm lid 1 sm guid 0x2c90300a28fb1, activity count 303 priority 0 state 3 SMINFO_MASTER
[root@CentOS64 ~]# sminfo
ibwarn: [3702] _do_madrpc: send failed; Function not implemented
ibwarn: [3702] mad_rpc: _do_madrpc failed; dport (Lid 1)
sminfo: iberror: failed: query
I'm not sure whether sminfo in the VM shows errors coming from the SR-IOV functionality and the VF device, or whether it simply cannot get out of the VM because of an invalid configuration.
Should I configure OpenSM in any special way? Maybe this VF device is treated in a "special" way by the SM?
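In case it helps the next test: OpenSM can be bound to an explicit port GUID on the command line, for example (the GUID below is the S1 port GUID quoted earlier, used purely as an illustration):
# opensm -g 0x0002c90300a28fb1 -B
where -g/--guid selects the port to bind to and -B runs it as a daemon.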
Can I use FDR and QDR on the same infiniband switch?
I want to make a small cluster using the upcoming SX6012 switch, with an FDR ConnectX-3 adapter for the main server and QDR ConnectX-2 adapters for the nodes. From my reading on InfiniBand this shouldn't be a problem, but I just want to make sure. Thank you!
Re: Management command failed in KVM for SR-IOV
Hi,
Still nothing... I hope this info can be helpful.
I noticed that OpenSM must be started on the hypervisor host (in my case that is S1); otherwise the virtual function's ports are linked up but have state DOWN.
When I start OpenSM (option: PORTS="ALL"), all the ports become active (both are cable-connected).
I also noticed a few more things:
So far only ibnetdiscover in the virtual system produces this system message on the hypervisor host:
mlx4_core 0000:04:00.0: slave 2 is trying to execute a Subnet MGMT MAD, class 0x1, method 0x81 for attr 0x11. Rejecting
mlx4_core 0000:04:00.0: vhcr command MAD_IFC (0x24) slave:2 in_param 0x26aaf000 in_mod=0xffff0001, op_mod=0xc failed with error:0, status -1
The sminfo command targets the correct OpenSM LID (i.e. it uses the LID number of the OpenSM master), but the query still fails:
# sminfo --debug -v
ibwarn: [2843] smp_query_status_via: attr 0x20 mod 0x0 route Lid 1
ibwarn: [2843] _do_madrpc: send failed; Function not implemented
ibwarn: [2843] mad_rpc: _do_madrpc failed; dport (Lid 1)
sminfo: iberror: [pid 2843] main: failed: query
On the virtual host I can see this message in the log:
ibnetdiscover[2755]: segfault at e4 ip 00000031d420a8b6 sp 00007fffc2eee6b8 error 4 in libibmad.so.5.3.1[31d4200000+12000]
and on the hypervisor host:
<mlx4_ib> _mlx4_ib_mcg_port_cleanup: _mlx4_ib_mcg_port_cleanup-1102: ff12401bffff000000000000ffffffff (port 2): WARNING: group refcount 1!!! (pointer ffff88083f4fa000)
One more thing:
In the virtual machine I started OpenSM with the GUID pointing to the local port of the VF, and got these messages:
Jul 24 14:27:09 830432 [FA2C0700] 0x80 -> Entering DISCOVERING state
Using default GUID 0x14050000000002
Jul 24 14:27:09 994036 [FA2C0700] 0x02 -> osm_vendor_bind: Mgmt class 0x81 binding to port GUID 0x14050000000002
Jul 24 14:27:10 398748 [FA2C0700] 0x02 -> osm_vendor_bind: Mgmt class 0x03 binding to port GUID 0x14050000000002
Jul 24 14:27:10 398958 [FA2C0700] 0x02 -> osm_vendor_bind: Mgmt class 0x04 binding to port GUID 0x14050000000002
Jul 24 14:27:10 399371 [FA2C0700] 0x02 -> osm_vendor_bind: Mgmt class 0x21 binding to port GUID 0x14050000000002
Jul 24 14:27:10 399960 [FA2C0700] 0x02 -> osm_opensm_bind: Setting IS_SM on port 0x0014050000000002
Jul 24 14:27:10 400439 [FA2C0700] 0x01 -> osm_vendor_set_sm: ERR 5431: setting IS_SM capmask: cannot open file '/dev/infiniband/issm0': Invalid argument
Jul 24 14:27:10 401700 [F66B8700] 0x01 -> osm_vendor_send: ERR 5430: Send p_madw = 0x7fcfe40008c0 of size 256 TID 0x1234 failed -5 (Invalid argument)
Jul 24 14:27:10 401700 [F66B8700] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_ERROR): SubnGet(NodeInfo), attr_mod 0x0, TID 0x1234
Jul 24 14:27:10 401700 [F66B8700] 0x01 -> vl15_send_mad: ERR 3E03: MAD send failed (IB_UNKNOWN_ERROR)
Jul 24 14:27:10 401983 [F5CB7700] 0x01 -> state_mgr_is_sm_port_down: ERR 3308: SM port GUID unknown
A regular Linux cat of /dev/infiniband/issm0 works in the hypervisor system (at least it waits), whereas in the VM I get exactly the same error as in the OpenSM log:
# cat /dev/infiniband/issm0
cat: /dev/infiniband/issm0: Invalid argument
Both files, on the host and in the VM, are the same with regard to access permissions:
VM:
#ls -aZ /dev/infiniband/issm0
crw-rw----. root root system_u:object_r:device_t:s0 /dev/infiniband/issm0
Host:
#ls -lZ /dev/infiniband/issm0
crw-rw----. root root system_u:object_r:device_t:s0 /dev/infiniband/issm0
Re: ESX 5.1.0 IPoIB NFS datastore freezes
Memo (for myself): modifications to the environment during my tests:
The InfiniBand card was exchanged for a ConnectX PCIe gen2 (MT26418), the newer chip with PCIe 5.0 GT/s, but still not a ConnectX-2 card. The error stays the same.
Updating the host BIOS does not help either. Even with the latest version installed, the error still occurs.
Re: Can I use FDR and QDR on the same infiniband switch?
Shouldn't be a problem. The InfiniBand spec defines backward compatibility.
What are you planning on running between those nodes?
Re: ESX 5.1.0 IPoIB NFS datastore freezes
Hm,
it seems the error comes from using one port of an InfiniBand card both as
1) the VMkernel NFS interface
2) and the network interface for the VM.
See the picture below.
After separating the VM network and the VMkernel onto different ports of the adapter, one can transfer gigabytes without errors. Maybe one of the VMware driver developers has an idea?
Best regards.
Markus