Mlx5's mkey mechanism is also used for memory windows.
The current code base uses MR (memory region) naming, which is
inaccurate. Changing MR to mkey in order to represent its different
usages more accurately.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch adds support for re-registration of memory regions in MLX5.
The functionality is basically the same as deregister followed by
register, but attempts to reuse the existing resources as much as
possible.
Original memory keys are kept if possible, saving the need to
communicate new ones to remote peers.
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In order to add re-registration of memory region, some logic was
extracted to separate functions:
- ODP related logic.
- Some of the UMR WQE preparation code.
- DMA mapping.
- Umem creation.
- Creating MKey using FW interface.
- MR fields assignments after successful creation.
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Commit 4c21b5bcef ("IB/cma: Add net_dev and private data checks to RDMA
CM") added checks for incoming RDMA CM requests that they can be matched to
a netdev based on the P_Key in the BTH of the request. This behavior was
reverted in commit ab3964ad2a ("IB/cma: Use inner P_Key to determine
netdev"), since the mlx5 and ipath drivers didn't send the correct value
in the BTH P_Key.
Since the ipath driver was removed, and the mlx5 driver can now send GSI
packets on different P_Keys, we could revert the patch to let the rdma_cm
module look on the BTH P_Key when deciding to what netdev a packet belongs.
However, that still breaks compatibility with the older drivers.
Change the behavior to print a warning when receiving a request that has a
different BTH P_Key and inner payload P_Key. In the future, after users
have seen the warnings and upgraded their setups, remove the warning and
block these requests.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Now that the transmission of GSI MADs is done with the special transmission
QPs, eliminate the send buffers in the GSI receive QP.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pick the QP to use according to the wr.ud.pkey_index field in the work
request. If the QP doesn't exist, it means the P_Key is zero and the packet
would have been dropped, so just generate a completion and move on.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The emulated GSI QP's send completions are generated by multiple hardware
QPs, so their completions could arrive out of order with respect to the
order their work request were submitted.
Reorder the completions by keeping a list of the posted work request and
their completions. A newly received completion from the hardware updates
the list and marks its work request as completed. However, the completions
are only reported to the client according to the list order.
In order to support that, create a new private CQ to handle the hardware
completions.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The GSI QP emulation requires also emulating completions for transmitted
MADs. The CQ on which these completions are generated can also be used by
the hardware, and the MAD layer is free to use any CQ of the device for the
GSI QP.
Add a method for generating software completions to each mlx5 CQ. Software
completions are polled first, and generate calls to the completion handler
callback if necessary.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Whenever the P_Key table is changed, we create the required GSI
transmission QPs on-demand.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In order to send GSI MADs on different P_Keys, mlx5 needs different QPs to
be created, each with a different P_Key set when the QP is modified to the
INIT state.
Create QPs for each non-zero P_Key in the P_Key table.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
mlx5 creates special GSI QPs that has limited ability to control the P_Key
of transmitted packets. The sent P_Key is taken from the QP object,
similarly to what happens with regular UD QPs.
Create a software wrapper around GSI QPs that with the following patches
will be able to emulate the functionality of a GSI QP including control of
the P_Key per work request.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add debugging prints to the modify QP verb to help understand the cause a
returned error.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In order to create multiple GSI QPs, we need to set the source QP number to
one on all these QPs. Add the necessary definitions and infrastructure to
do that.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The driver checks the csum from the HW when completion arrived and marks
it in the wc->wc_flags field for the ulp drivers.
These is for packets from type IB_WC_RECV only.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In order to support LSO and CSUM in the TX flow the driver does the
following:
* LSO bit for the enum mlx5_ib_qp_flags was added, indicates QP that
supports LSO offloads.
* Enables the special offload when the QP is created, and enable the
special work request id (IB_WR_LSO) when comes.
* Calculates the size of the WQE according to the new WQE format that
support these offloads.
* Handles the new WQE format when arrived, sets the relevant
fields, and copies the needed data.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The HW can supply several offloads for UD QP, added offloads for
checksumming for both TX and RX and LSO for TX.
Two new bits were added in order to expose and enable these offloads:
1. HCA capability bit: declares the support for IPoIB basic offloads.
2. QPC bit which will be used in the QP creation flow, which set these
abilities in the QP.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Modify mlx5_ib_process_mad to use PPCNT and query_vport commands
instead of MAD_IFC, as MAD_IFC is deprecated on new firmware
versions (and doesn't support RoCE anyway).
Traffic counters exist in both 32-bit and 64-bit forms.
Declaring support of extended coutners results in traffic counters
to be read in their 64-bit form only via the query_vport command.
Error counters exist only in 32-bit form and read via PPCNT command.
This commit also adds counters support in RoCE.
Signed-off-by: Meny Yossefi <menyy@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Added helper function to read IB standard error counters
via the PPCNT register.
The PPCNT register read command provides the 32-bit error counters
of both IB/RoCE link layer and transport layer.
Signed-off-by: Meny Yossefi <menyy@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pull perf fixes from Thomas Gleixner:
"A rather largish series of 12 patches addressing a maze of race
conditions in the perf core code from Peter Zijlstra"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Robustify task_function_call()
perf: Fix scaling vs. perf_install_in_context()
perf: Fix scaling vs. perf_event_enable()
perf: Fix scaling vs. perf_event_enable_on_exec()
perf: Fix ctx time tracking by introducing EVENT_TIME
perf: Cure event->pending_disable race
perf: Fix race between event install and jump_labels
perf: Fix cloning
perf: Only update context time when active
perf: Allow perf_release() with !event->ctx
perf: Do not double free
perf: Close install vs. exit race
Pull x86 fixes from Thomas Gleixner:
"This update contains:
- Hopefully the last ASM CLAC fixups
- A fix for the Quark family related to the IMR lock which makes
kexec work again
- A off-by-one fix in the MPX code. Ironic, isn't it?
- A fix for X86_PAE which addresses once more an unsigned long vs
phys_addr_t hickup"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mpx: Fix off-by-one comparison with nr_registers
x86/mm: Fix slow_virt_to_phys() for X86_PAE again
x86/entry/compat: Add missing CLAC to entry_INT80_32
x86/entry/32: Add an ASM_CLAC to entry_SYSENTER_32
x86/platform/intel/quark: Change the kernel's IMR lock bit to false
Pull scheduler fixlet from Thomas Gleixner:
"A trivial printk typo fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/deadline: Fix trivial typo in printk() message
Pull irq fixes from Thomas Gleixner:
"Four small fixes for irqchip drivers:
- Add missing low level irq handler initialization on mxs, so
interrupts can acutally be delivered
- Add a missing barrier to the GIC driver
- Two fixes for the GIC-V3-ITS driver, addressing a double EOI write
and a cache flush beyond the actual region"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3: Add missing barrier to 32bit version of gic_read_iar()
irqchip/mxs: Add missing set_handle_irq()
irqchip/gicv3-its: Avoid cache flush beyond ITS_BASERn memory size
irqchip/gic-v3-its: Fix double ICC_EOIR write for LPI in EOImode==1
Pull staging/android fix from Greg KH:
"Here is one patch, for the android binder driver, to resolve a
reported problem. Turns out it has been around for a while (since
3.15), so it is good to finally get it resolved.
It has been in linux-next for a while with no reported issues"
* tag 'staging-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
drivers: android: correct the size of struct binder_uintptr_t for BC_DEAD_BINDER_DONE