Commit Graph

1332 Commits

Author SHA1 Message Date
Chuck Lever
0aa44595d6 RDMA/core: Fix a couple of obvious typos in comments
Fix typos.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169643338101.8035.6826446669479247727.stgit@manet.1015granger.net
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-04 21:55:44 +03:00
Kees Cook
4755dc6f29 RDMA: Annotate struct rdma_hw_stats with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct rdma_hw_stats.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230929180431.3005464-1-keescook@chromium.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-02 14:44:54 +03:00
Or Har-Toov
561b4a3ac6 IB/mlx5: Expose XDR speed through MAD
Under MAD query port, Report NDR speed when NDR is supported in the port
capability mask.

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/d30bdec2a66a8a2edd1d84ee61453c58cf346b43.1695204156.git.leon@kernel.org
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 12:38:43 +03:00
Or Har-Toov
703289ce43 IB/core: Add support for XDR link speed
Add new IBTA speed XDR, the new rate that was added to Infiniband spec
as part of XDR and supporting signaling rate of 200Gb.

In order to report that value to rdma-core, add new u32 field to
query_port response.

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/9d235fc600a999e8274010f0e18b40fa60540e6c.1695204156.git.leon@kernel.org
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 12:38:39 +03:00
wenglianfa
aebf8145e1 RDMA/core: Add support to dump SRQ resource in RAW format
Add support to dump SRQ resource in raw format. It enable drivers to
return the entire device specific SRQ context without setting each
field separately.

Example:
$ rdma res show srq -r
dev hns3 149000...

$ rdma res show srq -j -r
[{"ifindex":0,"ifname":"hns3","data":[149,0,0,...]}]

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Link: https://lore.kernel.org/r/20230918131110.3987498-3-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-20 10:50:54 +03:00
wenglianfa
0e32d7d43b RDMA/core: Add dedicated SRQ resource tracker function
Add a dedicated callback function for SRQ resource tracker.

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Link: https://lore.kernel.org/r/20230918131110.3987498-2-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-20 10:50:54 +03:00
Yue Haibing
40cc695d63 RDMA Remove unused function declarations
Commit c2261dd76b ("RDMA/device: Add ib_device_set_netdev() as an alternative to get_netdev")
declared but never implemented ib_device_netdev(), remove it.
Commit 922a8e9fb2 ("RDMA: iWARP Connection Manager.") declared but never implemented
iw_cm_unbind_qp() and iw_cm_get_qp().

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20230809142718.42316-1-yuehaibing@huawei.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-08-13 10:32:35 +03:00
Arnd Bergmann
b39aeb338a rdma: fix INFINIBAND_USER_ACCESS dependency
After a change to the bnxt_re driver, it fails to link when
CONFIG_INFINIBAND_USER_ACCESS is disabled:

  aarch64-linux-ld: drivers/infiniband/hw/bnxt_re/ib_verbs.o: in function `bnxt_re_handler_BNXT_RE_METHOD_ALLOC_PAGE':
  ib_verbs.c:(.text+0xd64): undefined reference to `ib_uverbs_get_ucontext_file'
  aarch64-linux-ld: drivers/infiniband/hw/bnxt_re/ib_verbs.o:(.rodata+0x168): undefined reference to `uverbs_idr_class'
  aarch64-linux-ld: drivers/infiniband/hw/bnxt_re/ib_verbs.o:(.rodata+0x1a8): undefined reference to `uverbs_destroy_def_handler'

The problem is that the 'bnxt_re_uapi_defs' structure is built
unconditionally and references a couple of functions that are never
really called in this configuration but instead require other functions
that are left out.

Adding an #ifdef around the new code, or a Kconfig dependency would
address this problem, but adding the compile-time check inside of the
UAPI_DEF_CHAIN_OBJ_TREE_NAMED() macro seems best because that also
addresses the problem in other drivers that may run into the same
dependency.

Fixes: 360da60d6c ("RDMA/bnxt_re: Enable low latency push")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-03 16:55:04 -07:00
Mark Zhang
58030c76cc RDMA/cma: Always set static rate to 0 for RoCE
Set static rate to 0 as it should be discovered by path query and
has no meaning for RoCE.
This also avoid of using the rtnl lock and ethtool API, which is
a bottleneck when try to setup many rdma-cm connections at the same
time, especially with multiple processes.

Fixes: 3c86aa70bf ("RDMA/cm: Add RDMA CM support for IBoE devices")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/f72a4f8b667b803aee9fa794069f61afb5839ce4.1685960567.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11 11:26:02 +03:00
Jason Gunthorpe
8d7c7c0eeb RDMA: Add ib_virt_dma_to_page()
Make it clearer what is going on by adding a function to go back from the
"virtual" dma_addr to a kva and another to a struct page. This is used in the
ib_uses_virt_dma() style drivers (siw, rxe, hfi, qib).

Call them instead of a naked casting and  virt_to_page() when working with dma_addr
values encoded by the various ib_map functions.

This also fixes the virt_to_page() casting problem Linus Walleij has been
chasing.

Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v2-05ea785520ed+10-ib_virt_page_jgg@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-04-16 11:08:07 +03:00
Jason Gunthorpe
91d088a030 RDMA/umem: Remove unused 'work' member from struct ib_umem
It is not used now.

Fixes: b95df5e3e4 ("drivers/IB,core: reduce scope of mmap_sem")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v1-22a2667fa089+a3-umem_work_jgg@nvidia.com
Reviewed-by: Devesh Sharma <devesh.s.sharma@oracle.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-12 20:25:25 +02:00
Deming Wang
68e416255b RDMA/restrack: Correct spelling
Fix spelling errors.

Signed-off-by: Deming Wang <wangdeming@inspur.com>
Link: https://lore.kernel.org/r/20230206085725.1507-1-wangdeming@inspur.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-07 11:25:10 +02:00
Mark Zhang
312b8f79eb RDMA/mlx: Calling qp event handler in workqueue context
Move the call of qp event handler from atomic to workqueue context,
so that the handler is able to block. This is needed by following
patches.

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Link: https://lore.kernel.org/r/0cd17b8331e445f03942f4bb28d447f24ac5669d.1672821186.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-15 12:23:10 +02:00
Mark Zhang
ccae0447af RDMA/cma: Refactor the inbound/outbound path records process flow
Refactors based on comments [1] of the multiple path records support
patchset:
- Return failure if not able to set inbound/outbound PRs;
- Simplify the flow when receiving the PRs from netlink channel: When
  a good PR response is received, unpack it and call the path_query
  callback directly. This saves two memory allocations;
- Define RDMA_PRIMARY_PATH_MAX_REC_NUM in a proper place.

[1] https://lore.kernel.org/linux-rdma/Yyxp9E9pJtUids2o@nvidia.com/

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org> #srp
Link: https://lore.kernel.org/r/7610025d57342b8b6da0f19516c9612f9c3fdc37.1672819376.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-10 10:49:50 +02:00
Li Zhijian
208e3a134b RDMA: Extend RDMA kernel verbs ABI to support flush
This commit extends the RDMA kernel verbs ABI to support the flush
operation defined in IBA A19.4.1. These changes are
backward compatible with the existing RDMA kernel verbs ABI.

It makes device/HCA support new FLUSH attributes/capabilities, and it
also makes memory region support new FLUSH access flags.

Users can use ibv_reg_mr(3) to register flush access flags. Only the
access flags also supported by device's capabilities can be registered
successfully.

Once registered successfully, it means the MR is flushable. Similarly,
A flushable MR should also have one or both of GLOBAL_VISIBILITY and
PERSISTENT attributes/capabilities like device/HCA.

Link: https://lore.kernel.org/r/20221206130201.30986-3-lizhijian@fujitsu.com
Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09 19:36:01 -04:00
Xiao Yang
3ff81e827b RDMA: Extend RDMA kernel ABI to support atomic write
1) Define new atomic write request/completion in kernel.
2) Define new atomic write capability in kernel.
3) Define new atomic write opcode for RC service in packet.

Link: https://lore.kernel.org/r/1669905432-14-3-git-send-email-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-01 19:51:09 -04:00
Jason Gunthorpe
09f530f0c6 RDMA: Add netdevice_tracker to ib_device_set_netdev()
This will cause an informative backtrace to print if the user of
ib_device_set_netdev() isn't careful about tearing down the ibdevice
before its the netdevice parent is destroyed. Such as like this:

  unregister_netdevice: waiting for vlan0 to become free. Usage count = 2
  leaked reference.
   ib_device_set_netdev+0x266/0x730
   siw_newlink+0x4e0/0xfd0
   nldev_newlink+0x35c/0x5c0
   rdma_nl_rcv_msg+0x36d/0x690
   rdma_nl_rcv+0x2ee/0x430
   netlink_unicast+0x543/0x7f0
   netlink_sendmsg+0x918/0xe20
   sock_sendmsg+0xcf/0x120
   ____sys_sendmsg+0x70d/0x8b0
   ___sys_sendmsg+0x11d/0x1b0
   __sys_sendmsg+0xfa/0x1d0
   do_syscall_64+0x35/0xb0
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

This will help debug the issues syzkaller is seeing.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v1-a7c81b3842ce+e5-netdev_tracker_jgg@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-28 11:58:19 +02:00
Jiangshan Yi
7ac7bfe746 RDMA/opa_vnic: fix spelling typo in comment
Fix spelling typo in comment.

Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn>
Link: https://lore.kernel.org/r/20221009081047.2643471-1-13667453960@163.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-19 10:02:37 +03:00
Li Zhijian
53c2d5b14a RDMA/core: return -EOPNOSUPP for ODP unsupported device
ib_reg_mr(3) which is used to register a MR with specific access flags
for specific HCA will set errno when something go wrong.
So, here we should return the specific -EOPNOTSUPP when the being
requested ODP access flag is unsupported by the HCA(such as RXE).

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/20221001020045.8324-1-lizhijian@fujitsu.com
Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-19 10:02:18 +03:00
Jason Gunthorpe
015bda8abd RDMA/core: Add UVERBS_ATTR_RAW_FD
This uses the same passing protocol as UVERBS_ATTR_FD (eg len = 0 data_s64
= fd), except that the FD is not required to be a uverbs object and the
core code does not covert the FD to an object handle automatically.

Access to the int fd is provided by uverbs_get_raw_fd().

Link: https://lore.kernel.org/r/2-v1-bd147097458e+ede-umem_dmabuf_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-09-27 10:15:24 -03:00
Mark Zhang
eb8336dbe3 RDMA/cm: Use DLID from inbound/outbound PathRecords as the datapath DLID
In inter-subnet cases, when inbound/outbound PRs are available,
outbound_PR.dlid is used as the requestor's datapath DLID and
inbound_PR.dlid is used as the responder's DLID. The inbound_PR.dlid
is passed to responder side with the "ConnectReq.Primary_Local_Port_LID"
field. With this solution the PERMISSIVE_LID is no longer used in
Primary Local LID field.

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Link: https://lore.kernel.org/r/b3f6cac685bce9dde37c610be82e2c19d9e51d9e.1662631201.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-22 12:35:31 +03:00
Mark Zhang
5a37494933 RDMA/cma: Multiple path records support with netlink channel
Support receiving inbound and outbound IB path records (along with GMP
PathRecord) from user-space service through the RDMA netlink channel.
The LIDs in these 3 PRs can be used in this way:
1. GMP PR: used as the standard local/remote LIDs;
2. DLID of outbound PR: Used as the "dlid" field for outbound traffic;
3. DLID of inbound PR: Used as the "dlid" field for outbound traffic in
   responder side.

This is aimed to support adaptive routing. With current IB routing
solution when a packet goes out it's assigned with a fixed DLID per
target, meaning a fixed router will be used.
The LIDs in inbound/outbound path records can be used to identify group
of routers that allow communication with another subnet's entity. With
them packets from an inter-subnet connection may travel through any
router in the set to reach the target.

As confirmed with Jason, when sending a netlink request, kernel uses
LS_RESOLVE_PATH_USE_ALL so that the service knows kernel supports
multiple PRs.

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Link: https://lore.kernel.org/r/2fa2b6c93c4c16c8915bac3cfc4f27be1d60519d.1662631201.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-22 12:35:21 +03:00
Mark Zhang
bf9a992851 RDMA/core: Rename rdma_route.num_paths field to num_pri_alt_paths
This fields means the total number of primary and alternative paths,
i.e.,:
  0 - No primary nor alternate path is available;
  1 - Only primary path is available;
  2 - Both primary and alternate path are available.
Rename it to avoid confusion as with follow patches primary path will
support multiple path records.

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Link: https://lore.kernel.org/r/cbe424de63a56207870d70c5edce7c68e45f429e.1662631201.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-22 12:35:13 +03:00
Mark Zhang
a461b746c5 IB/cm: remove cm_id_priv->id.service_mask and service_mask parameter of cm_init_listen()
The service_mask is always ~cpu_to_be64(0), so the result is always
a NOP when it is &'d with a service_id. Remove it for simplicity.

Link: https://lore.kernel.org/r/20220819090859.957943-3-markzhang@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 12:14:23 +03:00
Mark Zhang
91a3f14ec9 IB/cm: Remove the service_mask parameter from ib_cm_listen()
Remove the service_mask parameter of ib_cm_listen(), as all callers
use 0.

Link: https://lore.kernel.org/r/20220819090859.957943-2-markzhang@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 12:14:23 +03:00