Commit Graph

93 Commits

Author SHA1 Message Date
Roland Dreier d172f5a4ab Merge branches 'cma', 'cxgb4', 'ipoib', 'mlx4', 'mlx4-sriov', 'nes', 'qib' and 'srp' into for-linus 2012-10-02 07:43:59 -07:00
Jack Morgenstein 47605df953 mlx4: Modify proxy/tunnel QP mechanism so that guests do no calculations
Previously, the structure of a guest's proxy QPs followed the
structure of the PPF special qps (qp0 port 1, qp0 port 2, qp1 port 1,
qp1 port 2, ...).  The guest then did offset calculations on the
sqp_base qp number that the PPF passed to it in QUERY_FUNC_CAP().

This is now changed so that the guest does no offset calculations
regarding proxy or tunnel QPs to use.  This change frees the PPF from
needing to adhere to a specific order in allocating proxy and tunnel
QPs.

Now QUERY_FUNC_CAP provides each port individually with its proxy
qp0, proxy qp1, tunnel qp0, and tunnel qp1 QP numbers, and these are
used directly where required (with no offset calculations).

To accomplish this change, several fields were added to the phys_caps
structure for use by the PPF and by non-SR-IOV mode:

    base_sqpn -- in non-sriov mode, this was formerly sqp_start.
    base_proxy_sqpn -- the first physical proxy qp number -- used by PPF
    base_tunnel_sqpn -- the first physical tunnel qp number -- used by PPF.

The current code in the PPF still adheres to the previous layout of
sqps, proxy-sqps and tunnel-sqps.  However, the PPF can change this
layout without affecting VF or (paravirtualized) PF code.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-30 20:33:43 -07:00
Jack Morgenstein 1ffeb2eb8b IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support
1. Introduce the basic SR-IOV parvirtualization context objects for
   multiplexing and demultiplexing MADs.
2. Introduce support for the new proxy and tunnel QP types.

This patch introduces the objects required by the master for managing
QP paravirtualization for guests.

struct mlx4_ib_sriov is created by the master only.
It is a container for the following:

1. All the info required by the PPF to multiplex and de-multiplex MADs
   (including those from the PF). (struct mlx4_ib_demux_ctx demux)
2. All the info required to manage alias GUIDs (i.e., the GUID at
   index 0 that each guest perceives.  In fact, this is not the GUID
   which is actually at index 0, but is, in fact, the GUID which is at
   index[<VF number>] in the physical table.
3. structures which are used to manage CM paravirtualization
4. structures for managing the real special QPs when running in SR-IOV
   mode.  The real SQPs are controlled by the PPF in this case.  All
   SQPs created and controlled by the ib core layer are proxy SQP.

struct mlx4_ib_demux_ctx contains the information per port needed
to manage paravirtualization:

1. All multicast paravirt info
2. All tunnel-qp paravirt info for the port.
3. GUID-table and GUID-prefix for the port
4. work queues.

struct mlx4_ib_demux_pv_ctx contains all the info for managing the
paravirtualized QPs for one slave/port.

struct mlx4_ib_demux_pv_qp contains the info need to run an individual
QP (either tunnel qp or real SQP).

Note:  We made use of the 2 most significant bits in enum
mlx4_ib_qp_flags (based on enum ib_qp_create_flags in ib_verbs.h).
We need these bits in the low-level driver for internal purposes.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-30 20:33:30 -07:00
Dotan Barak 46db567deb IB/mlx4: Fill in sq_sig_type in query QP
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-30 20:33:08 -07:00
Kleber Sacilotto de Souza a0675a386a IB/mlx4: Check iboe netdev pointer before dereferencing it
Unlike other parts of the mlx4_ib code, the function build_mlx_header()
doesn't check if the iboe netdev of the given port is valid before
dereferencing it, which can cause a crash if the ethernet interface
has already been taken down.

Fix this by checking for a valid netdev pointer before using it to get
the port MAC address.

Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-08-16 09:38:19 -07:00
Linus Torvalds 5dedb9f3bd Merge tag 'rdma-for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull InfiniBand/RDMA changes from Roland Dreier:
 - Updates to the qib low-level driver
 - First chunk of changes for SR-IOV support for mlx4 IB
 - RDMA CM support for IPv6-only binding
 - Other misc cleanups and fixes

Fix up some add-add conflicts in include/linux/mlx4/device.h and
drivers/net/ethernet/mellanox/mlx4/main.c

* tag 'rdma-for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (30 commits)
  IB/qib: checkpatch fixes
  IB/qib: Add congestion control agent implementation
  IB/qib: Reduce sdma_lock contention
  IB/qib: Fix an incorrect log message
  IB/qib: Fix QP RCU sparse warnings
  mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and paravirtualize them
  mlx4_core: Allow guests to have IB ports
  mlx4_core: Implement mechanism for reserved Q_Keys
  net/mlx4_core: Free ICM table in case of error
  IB/cm: Destroy idr as part of the module init error flow
  mlx4_core: Remove double function declarations
  IB/mlx4: Fill the masked_atomic_cap attribute in query device
  IB/mthca: Fill in sq_sig_type in query QP
  IB/mthca: Warning about event for non-existent QPs should show event type
  IB/qib: Fix sparse RCU warnings in qib_keys.c
  net/mlx4_core: Initialize IB port capabilities for all slaves
  mlx4: Use port management change event instead of smp_snoop
  IB/qib: RCU locking for MR validation
  IB/qib: Avoid returning EBUSY from MR deregister
  IB/qib: Fix UC MR refs for immediate operations
  ...
2012-07-24 13:56:26 -07:00
Jack Morgenstein b1d8eb5a21 IB/mlx4: Add debug prints
Define pr_fmt and add some pr_debug prints.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-08 18:05:06 -07:00
Hadar Hen Zion 0ff1fb654b {NET, IB}/mlx4: Add device managed flow steering firmware API
The driver is modified to support three operation modes.

If supported by firmware use the device managed flow steering
API, that which we call device managed steering mode. Else, if
the firmware supports the B0 steering mode use it, and finally,
if none of the above, use the A0 steering mode.

When the steering mode is device managed, the code is modified
such that L2 based rules set by the mlx4_en driver for Ethernet
unicast and multicast, and the IB stack multicast attach calls
done through the mlx4_ib driver are all routed to use the device
managed API.

When attaching rule using device managed flow steering API,
the firmware returns a 64 bit registration id, which is to be
provided during detach.

Currently the firmware is always programmed during HCA initialization
to use standard L2 hashing. Future work should be done to allow
configuring the flow-steering hash function with common, non
proprietary means.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-07 16:23:05 -07:00
Sagi Grimberg fc2d004419 IB/mlx4: Fix max_wqe capacity reported from query device
1. Limit the max number of WQEs per QP reported when querying the
   device, so that ib_create_qp() will not fail for a QP size that the
   device claimed to support due to additional headroom WQEs being
   allocated.

2. Limit qp resources accepted for ib_create_qp() to the limits
   reported in ib_query_device().  In kernel space, make sure that the
   limits returned to the caller following qp creation also lie within
   the reported device limits. For userspace, report as before, and do
   adjustment in libmlx4 (so as not to break ABI).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Sagi Grimberg <sagig@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-06-06 10:08:03 -07:00
Roland Dreier cc169165c8 Merge branches 'core', 'cxgb4', 'ipath', 'iser', 'lockdep', 'mlx4', 'nes', 'ocrdma', 'qib' and 'raw-qp' into for-linus 2012-05-21 09:00:47 -07:00
Shlomo Pongratz 987c8f8fc9 IB/mlx4: Replace printk(KERN_yyy...) with pr_yyy(...)
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>

[ Replace one more printk_once() with pr_info_once().  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:59:36 -07:00
Oren Duer c0c1d3d761 IB/mlx4: Put priority bits in WQE of IBoE MLX QP
Otherwise CM packets going over MLX QP1 get fixed scheduling priority 0.
We want CM packets to get the same scheduling priority, and therefore
map to the same SQ (Schedule Queue) and eventually TC (Traffic Class),
as the application requested for the actual QP used for the connection.

Signed-off-by: Oren Duer <oren@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:48:09 -07:00
Or Gerlitz 3987a2d319 IB/mlx4: Add raw packet QP support
Implement raw packet QPs for Ethernet ports using the MLX transport (as
done by the mlx4_en Ethernet netdevice driver).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:18:09 -07:00
Eli Cohen 4ba6b8eaa9 IB/mlx4: Set bad_wr for invalid send opcode
If the opcode of a work request exceeds the range of valid opcodes,
return the pointer to the offending work request.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-26 01:37:30 -08:00
Or Gerlitz 9106c41069 IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE
For IBoE, SLs 0-7 are mapped to Ethernet 802.1Q user priority bits
(pbits) which are part of the VLAN tag, SLs 8-15 are reserved.

Under Ethernet, the ConnectX firmware treats (decode/encode) the four
bit SL field in various constructs such as QPC / UD WQE / CQE as PPP0
and not as 0PPP. This correlates well to the fact that within the
vlan tag the pbits are located in bits 15-13 and not 12-14.

The current code wasn't consistent around that area - the
encoding was correct for the IBoE QPC.path.schedule_queue field,
but was wrong for IBoE CQEs and when MLX header was built.

These inconsistencies resulted in wrong SL <--> wire 802.1Q pbits
mapping, which is fixed by using SL <--> PPP0 all around the place.

Signed-off-by: Oren Duer <oren@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-03 21:00:02 -08:00
Roland Dreier 504255f8d0 Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'fdr', 'ipath', 'ipoib', 'misc', 'mlx4', 'misc', 'nes', 'qib' and 'xrc' into for-next 2011-11-01 09:37:08 -07:00
Or Gerlitz 80a2dcd8d0 IB/mlx4: Don't set VLAN in IBoE WQEs' control segment
There's no need to set the vlan-related fields in an IBoE send WQE
control segment:

 - the vlan to be used by a UD QP is set in the datagram segment.
 - for GSI (CM) QP, all the headers down to 8021q and MAC are built by
   the software anyway.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31 11:57:51 -07:00
Sean Hefty 0a1405da99 IB/mlx4: Add support for XRC QPs
Support the creation of XRC INI and TGT QPs.  To handle the case where
a CQ or PD is not provided, we allocate them internally with the xrcd.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:44:18 -07:00
Or Gerlitz cfcde11c3d IB/mlx4: Use flow counters on IBoE ports
Allocate flow counter per Ethernet/IBoE port, and attach this counter
to all the QPs created on that port.  Based on patch by Eli Cohen
<eli@mellanox.co.il>.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18 21:04:36 -07:00
Eli Cohen e27535b9c6 IB/mlx4: Fix memory ordering of VLAN insertion control bits
We must fully update the control segment before marking it as valid,
so that hardware doesn't start executing it before we're ready.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>

[ Move VLAN control bit setting to before wmb().  - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-12-01 11:08:54 -08:00
Eli Cohen 4c3eb3ca13 IB/mlx4: Add VLAN support for IBoE
This patch allows IBoE traffic to be encapsulated in 802.1Q tagged
VLAN frames.  The VLAN tag is encoded in the GID and derived from it
by a simple computation.

The netdev notifier callback is modified to catch VLAN device
addition/removal and the port's GID table is updated to reflect the
change, so that for each netdevice there is an entry in the GID table.
When the port's GID table is exhausted, GID entries will not be added.
Only children of the main interfaces can add to the GID table; if a
VLAN interface is added on another VLAN interface (e.g. "vconfig add
eth2.6 8"), then that interfaces will not add an entry to the GID
table.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-25 10:20:39 -07:00
Eli Cohen af7bd46376 IB/core: Add VLAN support for IBoE
Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the
GID derived from a link local address in the following way:

    GID[11] GID[12] contain the VLAN ID when the GID contains a VLAN.

The 3 bits user priority field of the packets are identical to the 3
bits of the SL.

In case of rdma_cm apps, the TOS field is used to generate the SL
field by doing a shift right of 5 bits effectively taking to 3 MS bits
of the TOS field.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-25 10:20:39 -07:00
Eli Cohen fa417f7b52 IB/mlx4: Add support for IBoE
Add support for IBoE to mlx4_ib.  The bulk of the code is handling the
new address vector fields; mlx4 needs the MAC address of a remote node
to include it in a WQE (for datagrams) or in the QP context (for
connected QPs).  Address resolution is done by assuming all unicast
GIDs are either link-local IPv6 addresses.

Multicast group attach/detach needs to update the NIC's multicast
filters; but since attaching a QP to a multicast group can be done
before the QP is bound to a port, for IBoE we need to keep track of
all multicast groups that a QP is attached too before it transitions
from INIT to RTR (since it does not have a port in the INIT state).

Signed-off-by: Eli Cohen <eli@mellanox.co.il>

[ Many things cleaned up and otherwise monkeyed with; hope I didn't
  introduce too many bugs.  - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-25 10:20:39 -07:00
Eli Cohen ff7f5aab35 IB/pack: IBoE UD packet packing support
Add support for packing IBoE packet headers.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>

[ Clean up and fix ib_ud_header_init() a bit.  - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-14 12:41:29 -07:00
Vladimir Sokolovsky 6fa8f71984 IB/mlx4: Add support for masked atomic operations
Add support for masked atomic operations (masked compare and swap,
masked fetch and add).

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 16:37:49 -07:00