Commit Graph

460 Commits

Author SHA1 Message Date
Linus Torvalds 0cda611386 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull base rdma updates from Doug Ledford:
 "Round one of 4.8 code: while this is mostly normal, there is a new
  driver in here (the driver was hosted outside the kernel for several
  years and is actually a fairly mature and well coded driver).  It
  amounts to 13,000 of the 16,000 lines of added code in here.

  Summary:

   - Updates/fixes for iw_cxgb4 driver
   - Updates/fixes for mlx5 driver
   - Add flow steering and RSS API
   - Add hardware stats to mlx4 and mlx5 drivers
   - Add firmware version API for RDMA driver use
   - Add the rxe driver (this is a software RoCE driver that makes any
     Ethernet device a RoCE device)
   - Fixes for i40iw driver
   - Support for send only multicast joins in the cma layer
   - Other minor fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (72 commits)
  Soft RoCE driver
  IB/core: Support for CMA multicast join flags
  IB/sa: Add cached attribute containing SM information to SA port
  IB/uverbs: Fix race between uverbs_close and remove_one
  IB/mthca: Clean up error unwind flow in mthca_reset()
  IB/mthca: NULL arg to pci_dev_put is OK
  IB/hfi1: NULL arg to sc_return_credits is OK
  IB/mlx4: Add diagnostic hardware counters
  net/mlx4: Query performance and diagnostics counters
  net/mlx4: Add diagnostic counters capability bit
  Use smaller 512 byte messages for portmapper messages
  IB/ipoib: Report SG feature regardless of HW UD CSUM capability
  IB/mlx4: Don't use GFP_ATOMIC for CQ resize struct
  IB/hfi1: Disable by default
  IB/rdmavt: Disable by default
  IB/mlx5: Fix port counter ID association to QP offset
  IB/mlx5: Fix iteration overrun in GSI qps
  i40iw: Add NULL check for puda buffer
  i40iw: Change dup_ack_thresh to u8
  i40iw: Remove unnecessary check for moving CQ head
  ...
2016-08-04 20:10:31 -04:00
Doug Ledford 7f1d25b47d Merge branches 'misc' and 'rxe' into k.o/for-4.8-1 2016-08-04 11:13:47 -04:00
Krzysztof Kozlowski 00085f1efa dma-mapping: use unsigned long for dma_attrs
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer.  Thus the pointer can point to const data.
However the attributes do not have to be a bitfield.  Instead unsigned
long will do fine:

1. This is just simpler.  Both in terms of reading the code and setting
   attributes.  Instead of initializing local attributes on the stack
   and passing pointer to it to dma_set_attr(), just set the bits.

2. It brings safeness and checking for const correctness because the
   attributes are passed by value.

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-04 08:50:07 -04:00
Alex Vesker ab15c95a17 IB/core: Support for CMA multicast join flags
Added UCMA and CMA support for multicast join flags. Flags are
passed using UCMA CM join command previously reserved fields.
Currently supporting two join flags indicating two different
multicast JoinStates:

1. Full Member:
   The initiator creates the Multicast group(MCG) if it wasn't
   previously created, can send Multicast messages to the group
   and receive messages from the MCG.

2. Send Only Full Member:
   The initiator creates the Multicast group(MCG) if it wasn't
   previously created, can send Multicast messages to the group
   but doesn't receive any messages from the MCG.

   IB: Send Only Full Member requires a query of ClassPortInfo
       to determine if SM/SA supports this option. If SM/SA
       doesn't support Send-Only there will be no join request
       sent and an error will be returned.

   ETH: When Send Only Full Member is requested no IGMP join
	will be sent.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-03 21:06:46 -04:00
Doug Ledford fb92d8fb1b Merge branches 'cxgb4-4.8', 'mlx5-4.8' and 'fw-version' into k.o/for-4.8 2016-06-23 12:29:26 -04:00
Ira Weiny 5fa76c2045 IB/core: Add get FW version string to the core
Allow for a common core function to get firmware version strings
from the individual devices.

In later patches this format can then then be used to pass a
properly formated version string through the IPoIB layer.

The problem with the current code in the IPoIB layer is that it is
specific to certain hardware types.

Furthermore, this gives us a common function through which the core
can provide a common sysfs entry.  Eventually we may want to
remove the sysfs export but this provides for user space backwards
compatibility.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 12:08:33 -04:00
Maor Gottlieb 4c2aae712c IB/core: Add IPv6 support to flow steering
Add IPv6 flow specification support.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:45 -04:00
Yishai Hadas a9017e232f IB/core: Extend create QP to get indirection table
Extend create QP to get Receive Work Queue (WQ) indirection table.

QP can be created with external Receive Work Queue indirection table,
in that case it is ready to receive immediately.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:44 -04:00
Yishai Hadas de019a9404 IB/uverbs: Introduce RWQ Indirection table
User applications that want to spread traffic on several WQs, need to
create an indirection table, by using already created WQs.

Adding uverbs API in order to create and destroy this table.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:43 -04:00
Yishai Hadas 6d39786bf1 IB/core: Introduce Receive Work Queue indirection table
Introduce Receive Work Queue (WQ) indirection table.
This object can be used to spread incoming traffic to different
receive Work Queues.

A Receive WQ indirection table points to variable size of WQs.
This table is given to a QP in downstream patches.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimerg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:43 -04:00
Yishai Hadas f213c05272 IB/uverbs: Add WQ support
User space applications which use RSS functionality need to create
a work queue object (WQ). The lifetime of such an object is:
 * Create a WQ
 * Modify the WQ from reset to init state.
 * Use the WQ (by downstream patches).
 * Destroy the WQ.

These commands are added to the uverbs API.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@rimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:43 -04:00
Yishai Hadas 5fd251c8b4 IB/core: Introduce Work Queue object and its verbs
Introduce Work Queue object and its create/destroy/modify verbs.

QP can be created without internal WQs "packaged" inside it,
this QP can be configured to use "external" WQ object as its
receive/send queue.
WQ is a necessary component for RSS technology since RSS mechanism
is supposed to distribute the traffic between multiple
Receive Work Queues.

WQ associated (many to one) with Completion Queue and it owns WQ
properties (PD, WQ size, etc.).
WQ has a type, this patch introduces the IB_WQT_RQ (i.e.receive queue),
it may be extend to others such as IB_WQT_SQ. (send queue).
WQ from type IB_WQT_RQ contains receive work requests.

PD is an attribute of a work queue (i.e. send/receive queue), it's used
by the hardware for security validation before scattering to a memory
region which is pointed by the WQ. For that, an external WQ object
needs a PD, letting the hardware makes that validation.

When accessing a memory region that is pointed by the WQ its PD
is used and not the QP's PD, this behavior is similar
to a SRQ and a QP.

WQ context is subject to a well-defined state transitions done by
the modify_wq verb.
When WQ is created its initial state becomes IB_WQS_RESET.
>From IB_WQS_RESET it can be modified to itself or to IB_WQS_RDY.
>From IB_WQS_RDY it can be modified to itself, to IB_WQS_RESET
or to IB_WQS_ERR.
>From IB_WQS_ERR it can be modified to IB_WQS_RESET.

Note: transition to IB_WQS_ERR might occur implicitly in case there
was some HW error.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:43 -04:00
Mike Marciniszyn c755f4afa6 IB/rdmavt: Correct qp_priv_alloc() return value test
The current drivers return errors from this calldown
wrapped in an ERR_PTR().

The rdmavt code incorrectly tests for NULL.

The code is fixed to use IS_ERR() and change ret according
to the driver return value.

Cc: Stable <stable@vger.kernel.org> # 4.6+
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 10:16:15 -04:00
Max Gurtovoy c7e162a417 IB/core: Make all casts in ib_device_cap_flags enum consistent
Replace the few u64 casts with ULL to match the rest of the casts.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 09:50:55 -04:00
Max Gurtovoy 47355b3cd7 IB/core: Fix bit curruption in ib_device_cap_flags structure
ib_device_cap_flags 64-bit expansion caused caps overlapping
and made consumers read wrong device capabilities. For example
IB_DEVICE_SG_GAPS_REG was falsely read by the iser driver causing
it to use a non-existing capability. This happened because signed
int becomes sign extended when converted it to u64. Fix this by
casting IB_DEVICE_ON_DEMAND_PAGING enumeration to ULL.

Fixes: f5aa9159a4 ('IB/core: Add arbitrary sg_list support')
Reported-by: Robert LeBlanc <robert@leblancnet.us>
Cc: Stable <stable@vger.kernel.org> #[v4.6+]
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 09:50:54 -04:00
Christoph Lameter b40f4757da IB/core: Make device counter infrastructure dynamic
In practice, each RDMA device has a unique set of counters that the
hardware implements.  Having a central set of counters that they must
all adhere to is limiting and causes many useful counters to not be
available.

Therefore we create a dynamic counter registration infrastructure.

The driver must implement a stats structure allocation routine, in
which the driver must place the directory name it wants, a list of
names for all of the counters, an array of u64 counters themselves,
plus a few generic configuration options.

We then implement a core routine to create a sysfs file for each
of the named stats elements, and a core routine to retrieve the
stats when any of the sysfs attribute files are read.

To avoid excessive beating on the stats generation routine in the
drivers, the core code also caches the stats for a short period of
time so that someone attempting to read all of the stats in a
given device's directory will not result in a stats generation
call per file read.

Future work will attempt to standardize just the shared stats
elements, and possibly add a method to get the stats via netlink
in addition to sysfs.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
[ Add caching, make structure names more informative, add i40iw support,
  other significant rewrites from the original patch ]
2016-05-26 12:52:51 -04:00
Doug Ledford 8779e7658d Merge branch 'hfi1-2' into k.o/for-4.7 2016-05-26 12:50:05 -04:00
Mike Marciniszyn 8b103e9cde IB/rdamvt: Fix rdmavt s_ack_queue sizing
rdmavt allows the driver to specify the size of the ack queue, but
only uses it for the modify QP limit testing for setting the atomic
limit value.

The driver dependent size is now used to size the s_ack_queue ring
dynamicially.

Since the driver knows its size, the driver will use its define
for any ring size dependent code.

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26 12:21:10 -04:00
Mike Marciniszyn 4c0b653335 IB/rdmavt: Max atomic value should be a u8
This matches the ib_qp_attr size and
avoids a extremely large value when the lower level
driver registers.

As part of the patch, the u8 ordinals are moved to the
end of the struct to reduce pahole noted excesses.

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26 12:21:10 -04:00
Erez Shitrit 628e6f7515 IB/SA Agent: Add support for SA agent get ClassPortInfo
New SA query function to return the ClassPortInfo struct from the SA.
If the SM supports FullMemberSendOnly mode for MCG's, it sets a
capability bit in the capability_mask2 field of the response.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-25 15:39:02 -04:00
Erez Shitrit 507f6afa3a IB/core: Introduce capabilitymask2 field in ClassPortInfo mad
Change struct ib_class_port_info to conform to IB Spec 1.3
That in order to get specific capability mask from ClassPortInfo mad.

>From the IB Spec, ClassPortInfo section:
        "CapabilityMask2 Bits 0-26: Additional class-specific capabilities...
         RespTimeValue the rest 5 bits"

The new struct now has one field for capabilitymask2 (previously was the
reserved field) and the resp_time field.

And it fixes up qib and srpt, use of the field repurposed to be used as
capabilitymask2:
IB/qib: Change pma_get_classportinfo
IB/srpt: Adjust the use of ib_class_port_info

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-25 15:39:02 -04:00
Doug Ledford 0651ec932a Merge branches 'cxgb4-2', 'i40iw-2', 'ipoib', 'misc-4.7' and 'mlx5-fcs' into k.o/for-4.7 2016-05-13 19:40:38 -04:00
Majd Dibbiny b531b90948 IB/core: Add Scatter FCS create flag
Raw Packet QPs that were created with Scatter FCS flag, will scatter
the FCS into the receive buffers.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:40:28 -04:00
Majd Dibbiny 45686f2d65 IB/core: Add Raw Scatter FCS device capability
Raw Scatter FCS device capability is set when the NIC supports
scattering the FCS to the receive buffers of Raw Packet QPs.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:40:27 -04:00
Jianxin Xiong bf77cc34f3 ib_pack.h: Add opcode definition for send with invalidate
The opcode for "SEND Last with Invalidate" and "SEND Only with
Invalidate" have been defined for RC in IBA Specification Vol 1
since Release 1.2. Add the definition to the header file in
preparation of supporting these opcodes in rdmavt based drivers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:39:19 -04:00