Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband/rdma updates from Roland Dreier:
 - Re-enable flow steering verbs with new improved userspace ABI
 - Fixes for slow connection due to GID lookup scalability
 - IPoIB fixes
 - Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
 - Further improvements to SRP error handling
 - Add new transport type for Cisco usNIC

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
  IB/core: Re-enable create_flow/destroy_flow uverbs
  IB/core: extended command: an improved infrastructure for uverbs commands
  IB/core: Remove ib_uverbs_flow_spec structure from userspace
  IB/core: Use a common header for uverbs flow_specs
  IB/core: Make uverbs flow structure use names like verbs ones
  IB/core: Rename 'flow' structs to match other uverbs structs
  IB/core: clarify overflow/underflow checks on ib_create/destroy_flow
  IB/ucma: Convert use of typedef ctl_table to struct ctl_table
  IB/cm: Convert to using idr_alloc_cyclic()
  IB/mlx5: Fix page shift in create CQ for userspace
  IB/mlx4: Fix device max capabilities check
  IB/mlx5: Fix list_del of empty list
  IB/mlx5: Remove dead code
  IB/core: Encorce MR access rights rules on kernel consumers
  IB/mlx4: Fix endless loop in resize CQ
  RDMA/cma: Remove unused argument and minor dead code
  RDMA/ucma: Discard events for IDs not yet claimed by user space
  IB/core: Add Cisco usNIC rdma node and transport types
  RDMA/nes: Remove self-assignment from nes_query_qp()
  IB/srp: Report receive errors correctly
  ...
This commit is contained in:
Linus Torvalds
2013-11-18 15:36:04 -08:00
52 changed files with 1988 additions and 610 deletions
@@ -61,6 +61,12 @@ Description: Interface for making ib_srp connect to a new target.
interrupt is handled by a different CPU then the comp_vector
parameter can be used to spread the SRP completion workload
over multiple CPU's.
* tl_retry_count, a number in the range 2..7 specifying the
IB RC retry count.
* queue_size, the maximum number of commands that the
initiator is allowed to queue per SCSI host. The default
value for this parameter is 62. The lowest supported value
is 2.
What: /sys/class/infiniband_srp/srp-<hca>-<port_number>/ibdev
Date: January 2, 2006
@@ -153,6 +159,13 @@ Contact: linux-rdma@vger.kernel.org
Description: InfiniBand service ID used for establishing communication with
the SRP target.
What: /sys/class/scsi_host/host<n>/sgid
Date: February 1, 2014
KernelVersion: 3.13
Contact: linux-rdma@vger.kernel.org
Description: InfiniBand GID of the source port used for communication with
the SRP target.
What: /sys/class/scsi_host/host<n>/zero_req_lim
Date: September 20, 2006
KernelVersion: 2.6.18
@@ -5,6 +5,24 @@ Contact: linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
Description: Instructs an SRP initiator to disconnect from a target and to
remove all LUNs imported from that target.
What: /sys/class/srp_remote_ports/port-<h>:<n>/dev_loss_tmo
Date: February 1, 2014
KernelVersion: 3.13
Contact: linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
Description: Number of seconds the SCSI layer will wait after a transport
layer error has been observed before removing a target port.
Zero means immediate removal. Setting this attribute to "off"
will disable the dev_loss timer.
What: /sys/class/srp_remote_ports/port-<h>:<n>/fast_io_fail_tmo
Date: February 1, 2014
KernelVersion: 3.13
Contact: linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
Description: Number of seconds the SCSI layer will wait after a transport
layer error has been observed before failing I/O. Zero means
failing I/O immediately. Setting this attribute to "off" will
disable the fast_io_fail timer.
What: /sys/class/srp_remote_ports/port-<h>:<n>/port_id
Date: June 27, 2007
KernelVersion: 2.6.24
@@ -12,8 +30,29 @@ Contact: linux-scsi@vger.kernel.org
Description: 16-byte local SRP port identifier in hexadecimal format. An
example: 4c:49:4e:55:58:20:56:49:4f:00:00:00:00:00:00:00.
What: /sys/class/srp_remote_ports/port-<h>:<n>/reconnect_delay
Date: February 1, 2014
KernelVersion: 3.13
Contact: linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
Description: Number of seconds the SCSI layer will wait after a reconnect
attempt failed before retrying. Setting this attribute to
"off" will disable time-based reconnecting.
What: /sys/class/srp_remote_ports/port-<h>:<n>/roles
Date: June 27, 2007
KernelVersion: 2.6.24
Contact: linux-scsi@vger.kernel.org
Description: Role of the remote port. Either "SRP Initiator" or "SRP Target".
What: /sys/class/srp_remote_ports/port-<h>:<n>/state
Date: February 1, 2014
KernelVersion: 3.13
Contact: linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
Description: State of the transport layer used for communication with the
remote port. "running" if the transport layer is operational;
"blocked" if a transport layer error has been encountered but
the fast_io_fail_tmo timer has not yet fired; "fail-fast"
after the fast_io_fail_tmo timer has fired and before the
"dev_loss_tmo" timer has fired; "lost" after the
"dev_loss_tmo" timer has fired and before the port is finally
removed.
-11
View File
@@ -31,17 +31,6 @@ config INFINIBAND_USER_ACCESS
libibverbs, libibcm and a hardware driver library from
<http://www.openfabrics.org/git/>.
config INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING
bool "Experimental and unstable ABI for userspace access to flow steering verbs"
depends on INFINIBAND_USER_ACCESS
depends on STAGING
---help---
The final ABI for userspace access to flow steering verbs
has not been defined. To use the current ABI, *WHICH WILL
CHANGE IN THE FUTURE*, say Y here.
If unsure, say N.
config INFINIBAND_USER_MEM
bool
depends on INFINIBAND_USER_ACCESS != n
+1 -4
View File
@@ -383,14 +383,11 @@ static int cm_alloc_id(struct cm_id_private *cm_id_priv)
{
unsigned long flags;
int id;
static int next_id;
idr_preload(GFP_KERNEL);
spin_lock_irqsave(&cm.lock, flags);
id = idr_alloc(&cm.local_id_table, cm_id_priv, next_id, 0, GFP_NOWAIT);
if (id >= 0)
next_id = max(id + 1, 0);
id = idr_alloc_cyclic(&cm.local_id_table, cm_id_priv, 0, 0, GFP_NOWAIT);
spin_unlock_irqrestore(&cm.lock, flags);
idr_preload_end();
+33 -35
View File
@@ -328,28 +328,6 @@ static int cma_set_qkey(struct rdma_id_private *id_priv, u32 qkey)
return ret;
}
static int find_gid_port(struct ib_device *device, union ib_gid *gid, u8 port_num)
{
int i;
int err;
struct ib_port_attr props;
union ib_gid tmp;
err = ib_query_port(device, port_num, &props);
if (err)
return err;
for (i = 0; i < props.gid_tbl_len; ++i) {
err = ib_query_gid(device, port_num, i, &tmp);
if (err)
return err;
if (!memcmp(&tmp, gid, sizeof tmp))
return 0;
}
return -EADDRNOTAVAIL;
}
static void cma_translate_ib(struct sockaddr_ib *sib, struct rdma_dev_addr *dev_addr)
{
dev_addr->dev_type = ARPHRD_INFINIBAND;
@@ -371,13 +349,14 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a
return ret;
}
static int cma_acquire_dev(struct rdma_id_private *id_priv)
static int cma_acquire_dev(struct rdma_id_private *id_priv,
struct rdma_id_private *listen_id_priv)
{
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
struct cma_device *cma_dev;
union ib_gid gid, iboe_gid;
int ret = -ENODEV;
u8 port;
u8 port, found_port;
enum rdma_link_layer dev_ll = dev_addr->dev_type == ARPHRD_INFINIBAND ?
IB_LINK_LAYER_INFINIBAND : IB_LINK_LAYER_ETHERNET;
@@ -389,17 +368,39 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
iboe_addr_get_sgid(dev_addr, &iboe_gid);
memcpy(&gid, dev_addr->src_dev_addr +
rdma_addr_gid_offset(dev_addr), sizeof gid);
if (listen_id_priv &&
rdma_port_get_link_layer(listen_id_priv->id.device,
listen_id_priv->id.port_num) == dev_ll) {
cma_dev = listen_id_priv->cma_dev;
port = listen_id_priv->id.port_num;
if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
ret = ib_find_cached_gid(cma_dev->device, &iboe_gid,
&found_port, NULL);
else
ret = ib_find_cached_gid(cma_dev->device, &gid,
&found_port, NULL);
if (!ret && (port == found_port)) {
id_priv->id.port_num = found_port;
goto out;
}
}
list_for_each_entry(cma_dev, &dev_list, list) {
for (port = 1; port <= cma_dev->device->phys_port_cnt; ++port) {
if (listen_id_priv &&
listen_id_priv->cma_dev == cma_dev &&
listen_id_priv->id.port_num == port)
continue;
if (rdma_port_get_link_layer(cma_dev->device, port) == dev_ll) {
if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
ret = find_gid_port(cma_dev->device, &iboe_gid, port);
ret = ib_find_cached_gid(cma_dev->device, &iboe_gid, &found_port, NULL);
else
ret = find_gid_port(cma_dev->device, &gid, port);
ret = ib_find_cached_gid(cma_dev->device, &gid, &found_port, NULL);
if (!ret) {
id_priv->id.port_num = port;
if (!ret && (port == found_port)) {
id_priv->id.port_num = found_port;
goto out;
}
}
@@ -1292,7 +1293,7 @@ static int cma_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
}
mutex_lock_nested(&conn_id->handler_mutex, SINGLE_DEPTH_NESTING);
ret = cma_acquire_dev(conn_id);
ret = cma_acquire_dev(conn_id, listen_id);
if (ret)
goto err2;
@@ -1451,7 +1452,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
{
struct rdma_cm_id *new_cm_id;
struct rdma_id_private *listen_id, *conn_id;
struct net_device *dev = NULL;
struct rdma_cm_event event;
int ret;
struct ib_device_attr attr;
@@ -1481,7 +1481,7 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
goto out;
}
ret = cma_acquire_dev(conn_id);
ret = cma_acquire_dev(conn_id, listen_id);
if (ret) {
mutex_unlock(&conn_id->handler_mutex);
rdma_destroy_id(new_cm_id);
@@ -1529,8 +1529,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
cma_deref_id(conn_id);
out:
if (dev)
dev_put(dev);
mutex_unlock(&listen_id->handler_mutex);
return ret;
}
@@ -2066,7 +2064,7 @@ static void addr_handler(int status, struct sockaddr *src_addr,
goto out;
if (!status && !id_priv->cma_dev)
status = cma_acquire_dev(id_priv);
status = cma_acquire_dev(id_priv, NULL);
if (status) {
if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_RESOLVED,
@@ -2563,7 +2561,7 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
if (ret)
goto err1;
ret = cma_acquire_dev(id_priv);
ret = cma_acquire_dev(id_priv, NULL);
if (ret)
goto err1;
}
+1 -1
View File
@@ -148,7 +148,7 @@ static int ibnl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
list_for_each_entry(client, &client_list, list) {
if (client->index == index) {
if (op < 0 || op >= client->nops ||
!client->cb_table[RDMA_NL_GET_OP(op)].dump)
!client->cb_table[op].dump)
return -EINVAL;
{
+1
View File
@@ -612,6 +612,7 @@ static ssize_t show_node_type(struct device *device,
switch (dev->node_type) {
case RDMA_NODE_IB_CA: return sprintf(buf, "%d: CA\n", dev->node_type);
case RDMA_NODE_RNIC: return sprintf(buf, "%d: RNIC\n", dev->node_type);
case RDMA_NODE_USNIC: return sprintf(buf, "%d: usNIC\n", dev->node_type);
case RDMA_NODE_IB_SWITCH: return sprintf(buf, "%d: switch\n", dev->node_type);
case RDMA_NODE_IB_ROUTER: return sprintf(buf, "%d: router\n", dev->node_type);
default: return sprintf(buf, "%d: <unknown>\n", dev->node_type);
+2 -2
View File
@@ -57,7 +57,7 @@ MODULE_LICENSE("Dual BSD/GPL");
static unsigned int max_backlog = 1024;
static struct ctl_table_header *ucma_ctl_table_hdr;
static ctl_table ucma_ctl_table[] = {
static struct ctl_table ucma_ctl_table[] = {
{
.procname = "max_backlog",
.data = &max_backlog,
@@ -271,7 +271,7 @@ static int ucma_event_handler(struct rdma_cm_id *cm_id,
goto out;
}
ctx->backlog--;
} else if (!ctx->uid) {
} else if (!ctx->uid || ctx->cm_id != cm_id) {
/*
* We ignore events for new connections until userspace has set
* their context. This can only happen if an error occurs on a
+32 -4
View File
@@ -47,6 +47,14 @@
#include <rdma/ib_umem.h>
#include <rdma/ib_user_verbs.h>
#define INIT_UDATA(udata, ibuf, obuf, ilen, olen) \
do { \
(udata)->inbuf = (void __user *) (ibuf); \
(udata)->outbuf = (void __user *) (obuf); \
(udata)->inlen = (ilen); \
(udata)->outlen = (olen); \
} while (0)
/*
* Our lifetime rules for these structs are the following:
*
@@ -178,6 +186,22 @@ void ib_uverbs_event_handler(struct ib_event_handler *handler,
struct ib_event *event);
void ib_uverbs_dealloc_xrcd(struct ib_uverbs_device *dev, struct ib_xrcd *xrcd);
struct ib_uverbs_flow_spec {
union {
union {
struct ib_uverbs_flow_spec_hdr hdr;
struct {
__u32 type;
__u16 size;
__u16 reserved;
};
};
struct ib_uverbs_flow_spec_eth eth;
struct ib_uverbs_flow_spec_ipv4 ipv4;
struct ib_uverbs_flow_spec_tcp_udp tcp_udp;
};
};
#define IB_UVERBS_DECLARE_CMD(name) \
ssize_t ib_uverbs_##name(struct ib_uverbs_file *file, \
const char __user *buf, int in_len, \
@@ -217,9 +241,13 @@ IB_UVERBS_DECLARE_CMD(destroy_srq);
IB_UVERBS_DECLARE_CMD(create_xsrq);
IB_UVERBS_DECLARE_CMD(open_xrcd);
IB_UVERBS_DECLARE_CMD(close_xrcd);
#ifdef CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING
IB_UVERBS_DECLARE_CMD(create_flow);
IB_UVERBS_DECLARE_CMD(destroy_flow);
#endif /* CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING */
#define IB_UVERBS_DECLARE_EX_CMD(name) \
int ib_uverbs_ex_##name(struct ib_uverbs_file *file, \
struct ib_udata *ucore, \
struct ib_udata *uhw)
IB_UVERBS_DECLARE_EX_CMD(create_flow);
IB_UVERBS_DECLARE_EX_CMD(destroy_flow);
#endif /* UVERBS_H */
+49 -60
View File
@@ -54,17 +54,7 @@ static struct uverbs_lock_class qp_lock_class = { .name = "QP-uobj" };
static struct uverbs_lock_class ah_lock_class = { .name = "AH-uobj" };
static struct uverbs_lock_class srq_lock_class = { .name = "SRQ-uobj" };
static struct uverbs_lock_class xrcd_lock_class = { .name = "XRCD-uobj" };
#ifdef CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING
static struct uverbs_lock_class rule_lock_class = { .name = "RULE-uobj" };
#endif /* CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING */
#define INIT_UDATA(udata, ibuf, obuf, ilen, olen) \
do { \
(udata)->inbuf = (void __user *) (ibuf); \
(udata)->outbuf = (void __user *) (obuf); \
(udata)->inlen = (ilen); \
(udata)->outlen = (olen); \
} while (0)
/*
* The ib_uobject locking scheme is as follows:
@@ -939,13 +929,9 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
if ((cmd.start & ~PAGE_MASK) != (cmd.hca_va & ~PAGE_MASK))
return -EINVAL;
/*
* Local write permission is required if remote write or
* remote atomic permission is also requested.
*/
if (cmd.access_flags & (IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_REMOTE_WRITE) &&
!(cmd.access_flags & IB_ACCESS_LOCAL_WRITE))
return -EINVAL;
ret = ib_check_mr_access(cmd.access_flags);
if (ret)
return ret;
uobj = kmalloc(sizeof *uobj, GFP_KERNEL);
if (!uobj)
@@ -2128,6 +2114,9 @@ ssize_t ib_uverbs_post_send(struct ib_uverbs_file *file,
}
next->wr.ud.remote_qpn = user_wr->wr.ud.remote_qpn;
next->wr.ud.remote_qkey = user_wr->wr.ud.remote_qkey;
if (next->opcode == IB_WR_SEND_WITH_IMM)
next->ex.imm_data =
(__be32 __force) user_wr->ex.imm_data;
} else {
switch (next->opcode) {
case IB_WR_RDMA_WRITE_WITH_IMM:
@@ -2601,8 +2590,7 @@ out_put:
return ret ? ret : in_len;
}
#ifdef CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING
static int kern_spec_to_ib_spec(struct ib_kern_spec *kern_spec,
static int kern_spec_to_ib_spec(struct ib_uverbs_flow_spec *kern_spec,
union ib_flow_spec *ib_spec)
{
ib_spec->type = kern_spec->type;
@@ -2642,28 +2630,31 @@ static int kern_spec_to_ib_spec(struct ib_kern_spec *kern_spec,
return 0;
}
ssize_t ib_uverbs_create_flow(struct ib_uverbs_file *file,
const char __user *buf, int in_len,
int out_len)
int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
struct ib_udata *ucore,
struct ib_udata *uhw)
{
struct ib_uverbs_create_flow cmd;
struct ib_uverbs_create_flow_resp resp;
struct ib_uobject *uobj;
struct ib_flow *flow_id;
struct ib_kern_flow_attr *kern_flow_attr;
struct ib_uverbs_flow_attr *kern_flow_attr;
struct ib_flow_attr *flow_attr;
struct ib_qp *qp;
int err = 0;
void *kern_spec;
void *ib_spec;
int i;
int kern_attr_size;
if (out_len < sizeof(resp))
if (ucore->outlen < sizeof(resp))
return -ENOSPC;
if (copy_from_user(&cmd, buf, sizeof(cmd)))
return -EFAULT;
err = ib_copy_from_udata(&cmd, ucore, sizeof(cmd));
if (err)
return err;
ucore->inbuf += sizeof(cmd);
ucore->inlen -= sizeof(cmd);
if (cmd.comp_mask)
return -EINVAL;
@@ -2672,32 +2663,27 @@ ssize_t ib_uverbs_create_flow(struct ib_uverbs_file *file,
!capable(CAP_NET_ADMIN)) || !capable(CAP_NET_RAW))
return -EPERM;
if (cmd.flow_attr.num_of_specs < 0 ||
cmd.flow_attr.num_of_specs > IB_FLOW_SPEC_SUPPORT_LAYERS)
if (cmd.flow_attr.num_of_specs > IB_FLOW_SPEC_SUPPORT_LAYERS)
return -EINVAL;
kern_attr_size = cmd.flow_attr.size - sizeof(cmd) -
sizeof(struct ib_uverbs_cmd_hdr_ex);
if (cmd.flow_attr.size < 0 || cmd.flow_attr.size > in_len ||
kern_attr_size < 0 || kern_attr_size >
(cmd.flow_attr.num_of_specs * sizeof(struct ib_kern_spec)))
if (cmd.flow_attr.size > ucore->inlen ||
cmd.flow_attr.size >
(cmd.flow_attr.num_of_specs * sizeof(struct ib_uverbs_flow_spec)))
return -EINVAL;
if (cmd.flow_attr.num_of_specs) {
kern_flow_attr = kmalloc(cmd.flow_attr.size, GFP_KERNEL);
kern_flow_attr = kmalloc(sizeof(*kern_flow_attr) + cmd.flow_attr.size,
GFP_KERNEL);
if (!kern_flow_attr)
return -ENOMEM;
memcpy(kern_flow_attr, &cmd.flow_attr, sizeof(*kern_flow_attr));
if (copy_from_user(kern_flow_attr + 1, buf + sizeof(cmd),
kern_attr_size)) {
err = -EFAULT;
err = ib_copy_from_udata(kern_flow_attr + 1, ucore,
cmd.flow_attr.size);
if (err)
goto err_free_attr;
}
} else {
kern_flow_attr = &cmd.flow_attr;
kern_attr_size = sizeof(cmd.flow_attr);
}
uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
@@ -2714,7 +2700,7 @@ ssize_t ib_uverbs_create_flow(struct ib_uverbs_file *file,
goto err_uobj;
}
flow_attr = kmalloc(cmd.flow_attr.size, GFP_KERNEL);
flow_attr = kmalloc(sizeof(*flow_attr) + cmd.flow_attr.size, GFP_KERNEL);
if (!flow_attr) {
err = -ENOMEM;
goto err_put;
@@ -2729,19 +2715,22 @@ ssize_t ib_uverbs_create_flow(struct ib_uverbs_file *file,
kern_spec = kern_flow_attr + 1;
ib_spec = flow_attr + 1;
for (i = 0; i < flow_attr->num_of_specs && kern_attr_size > 0; i++) {
for (i = 0; i < flow_attr->num_of_specs &&
cmd.flow_attr.size > offsetof(struct ib_uverbs_flow_spec, reserved) &&
cmd.flow_attr.size >=
((struct ib_uverbs_flow_spec *)kern_spec)->size; i++) {
err = kern_spec_to_ib_spec(kern_spec, ib_spec);
if (err)
goto err_free;
flow_attr->size +=
((union ib_flow_spec *) ib_spec)->size;
kern_attr_size -= ((struct ib_kern_spec *) kern_spec)->size;
kern_spec += ((struct ib_kern_spec *) kern_spec)->size;
cmd.flow_attr.size -= ((struct ib_uverbs_flow_spec *)kern_spec)->size;
kern_spec += ((struct ib_uverbs_flow_spec *) kern_spec)->size;
ib_spec += ((union ib_flow_spec *) ib_spec)->size;
}
if (kern_attr_size) {
pr_warn("create flow failed, %d bytes left from uverb cmd\n",
kern_attr_size);
if (cmd.flow_attr.size || (i != flow_attr->num_of_specs)) {
pr_warn("create flow failed, flow %d: %d bytes left from uverb cmd\n",
i, cmd.flow_attr.size);
goto err_free;
}
flow_id = ib_create_flow(qp, flow_attr, IB_FLOW_DOMAIN_USER);
@@ -2760,11 +2749,10 @@ ssize_t ib_uverbs_create_flow(struct ib_uverbs_file *file,
memset(&resp, 0, sizeof(resp));
resp.flow_handle = uobj->id;
if (copy_to_user((void __user *)(unsigned long) cmd.response,
&resp, sizeof(resp))) {
err = -EFAULT;
err = ib_copy_to_udata(ucore,
&resp, sizeof(resp));
if (err)
goto err_copy;
}
put_qp_read(qp);
mutex_lock(&file->mutex);
@@ -2777,7 +2765,7 @@ ssize_t ib_uverbs_create_flow(struct ib_uverbs_file *file,
kfree(flow_attr);
if (cmd.flow_attr.num_of_specs)
kfree(kern_flow_attr);
return in_len;
return 0;
err_copy:
idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
destroy_flow:
@@ -2794,16 +2782,18 @@ err_free_attr:
return err;
}
ssize_t ib_uverbs_destroy_flow(struct ib_uverbs_file *file,
const char __user *buf, int in_len,
int out_len) {
int ib_uverbs_ex_destroy_flow(struct ib_uverbs_file *file,
struct ib_udata *ucore,
struct ib_udata *uhw)
{
struct ib_uverbs_destroy_flow cmd;
struct ib_flow *flow_id;
struct ib_uobject *uobj;
int ret;
if (copy_from_user(&cmd, buf, sizeof(cmd)))
return -EFAULT;
ret = ib_copy_from_udata(&cmd, ucore, sizeof(cmd));
if (ret)
return ret;
uobj = idr_write_uobj(&ib_uverbs_rule_idr, cmd.flow_handle,
file->ucontext);
@@ -2825,9 +2815,8 @@ ssize_t ib_uverbs_destroy_flow(struct ib_uverbs_file *file,
put_uobj(uobj);
return ret ? ret : in_len;
return ret;
}
#endif /* CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING */
static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
struct ib_uverbs_create_xsrq *cmd,
+99 -35
View File
@@ -115,10 +115,13 @@ static ssize_t (*uverbs_cmd_table[])(struct ib_uverbs_file *file,
[IB_USER_VERBS_CMD_CLOSE_XRCD] = ib_uverbs_close_xrcd,
[IB_USER_VERBS_CMD_CREATE_XSRQ] = ib_uverbs_create_xsrq,
[IB_USER_VERBS_CMD_OPEN_QP] = ib_uverbs_open_qp,
#ifdef CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING
[IB_USER_VERBS_CMD_CREATE_FLOW] = ib_uverbs_create_flow,
[IB_USER_VERBS_CMD_DESTROY_FLOW] = ib_uverbs_destroy_flow
#endif /* CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING */
};
static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file,
struct ib_udata *ucore,
struct ib_udata *uhw) = {
[IB_USER_VERBS_EX_CMD_CREATE_FLOW] = ib_uverbs_ex_create_flow,
[IB_USER_VERBS_EX_CMD_DESTROY_FLOW] = ib_uverbs_ex_destroy_flow
};
static void ib_uverbs_add_one(struct ib_device *device);
@@ -589,6 +592,7 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
{
struct ib_uverbs_file *file = filp->private_data;
struct ib_uverbs_cmd_hdr hdr;
__u32 flags;
if (count < sizeof hdr)
return -EINVAL;
@@ -596,45 +600,105 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
if (copy_from_user(&hdr, buf, sizeof hdr))
return -EFAULT;
if (hdr.command >= ARRAY_SIZE(uverbs_cmd_table) ||
!uverbs_cmd_table[hdr.command])
return -EINVAL;
flags = (hdr.command &
IB_USER_VERBS_CMD_FLAGS_MASK) >> IB_USER_VERBS_CMD_FLAGS_SHIFT;
if (!file->ucontext &&
hdr.command != IB_USER_VERBS_CMD_GET_CONTEXT)
return -EINVAL;
if (!flags) {
__u32 command;
if (!(file->device->ib_dev->uverbs_cmd_mask & (1ull << hdr.command)))
return -ENOSYS;
#ifdef CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING
if (hdr.command >= IB_USER_VERBS_CMD_THRESHOLD) {
struct ib_uverbs_cmd_hdr_ex hdr_ex;
if (copy_from_user(&hdr_ex, buf, sizeof(hdr_ex)))
return -EFAULT;
if (((hdr_ex.in_words + hdr_ex.provider_in_words) * 4) != count)
if (hdr.command & ~(__u32)(IB_USER_VERBS_CMD_FLAGS_MASK |
IB_USER_VERBS_CMD_COMMAND_MASK))
return -EINVAL;
return uverbs_cmd_table[hdr.command](file,
buf + sizeof(hdr_ex),
(hdr_ex.in_words +
hdr_ex.provider_in_words) * 4,
(hdr_ex.out_words +
hdr_ex.provider_out_words) * 4);
} else {
#endif /* CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING */
command = hdr.command & IB_USER_VERBS_CMD_COMMAND_MASK;
if (command >= ARRAY_SIZE(uverbs_cmd_table) ||
!uverbs_cmd_table[command])
return -EINVAL;
if (!file->ucontext &&
command != IB_USER_VERBS_CMD_GET_CONTEXT)
return -EINVAL;
if (!(file->device->ib_dev->uverbs_cmd_mask & (1ull << command)))
return -ENOSYS;
if (hdr.in_words * 4 != count)
return -EINVAL;
return uverbs_cmd_table[hdr.command](file,
buf + sizeof(hdr),
hdr.in_words * 4,
hdr.out_words * 4);
#ifdef CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING
return uverbs_cmd_table[command](file,
buf + sizeof(hdr),
hdr.in_words * 4,
hdr.out_words * 4);
} else if (flags == IB_USER_VERBS_CMD_FLAG_EXTENDED) {
__u32 command;
struct ib_uverbs_ex_cmd_hdr ex_hdr;
struct ib_udata ucore;
struct ib_udata uhw;
int err;
size_t written_count = count;
if (hdr.command & ~(__u32)(IB_USER_VERBS_CMD_FLAGS_MASK |
IB_USER_VERBS_CMD_COMMAND_MASK))
return -EINVAL;
command = hdr.command & IB_USER_VERBS_CMD_COMMAND_MASK;
if (command >= ARRAY_SIZE(uverbs_ex_cmd_table) ||
!uverbs_ex_cmd_table[command])
return -ENOSYS;
if (!file->ucontext)
return -EINVAL;
if (!(file->device->ib_dev->uverbs_ex_cmd_mask & (1ull << command)))
return -ENOSYS;
if (count < (sizeof(hdr) + sizeof(ex_hdr)))
return -EINVAL;
if (copy_from_user(&ex_hdr, buf + sizeof(hdr), sizeof(ex_hdr)))
return -EFAULT;
count -= sizeof(hdr) + sizeof(ex_hdr);
buf += sizeof(hdr) + sizeof(ex_hdr);
if ((hdr.in_words + ex_hdr.provider_in_words) * 8 != count)
return -EINVAL;
if (ex_hdr.response) {
if (!hdr.out_words && !ex_hdr.provider_out_words)
return -EINVAL;
} else {
if (hdr.out_words || ex_hdr.provider_out_words)
return -EINVAL;
}
INIT_UDATA(&ucore,
(hdr.in_words) ? buf : 0,
(unsigned long)ex_hdr.response,
hdr.in_words * 8,
hdr.out_words * 8);
INIT_UDATA(&uhw,
(ex_hdr.provider_in_words) ? buf + ucore.inlen : 0,
(ex_hdr.provider_out_words) ? (unsigned long)ex_hdr.response + ucore.outlen : 0,
ex_hdr.provider_in_words * 8,
ex_hdr.provider_out_words * 8);
err = uverbs_ex_cmd_table[command](file,
&ucore,
&uhw);
if (err)
return err;
return written_count;
}
#endif /* CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING */
return -ENOSYS;
}
static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
+17
View File
@@ -114,6 +114,8 @@ rdma_node_get_transport(enum rdma_node_type node_type)
return RDMA_TRANSPORT_IB;
case RDMA_NODE_RNIC:
return RDMA_TRANSPORT_IWARP;
case RDMA_NODE_USNIC:
return RDMA_TRANSPORT_USNIC;
default:
BUG();
return 0;
@@ -130,6 +132,7 @@ enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_
case RDMA_TRANSPORT_IB:
return IB_LINK_LAYER_INFINIBAND;
case RDMA_TRANSPORT_IWARP:
case RDMA_TRANSPORT_USNIC:
return IB_LINK_LAYER_ETHERNET;
default:
return IB_LINK_LAYER_UNSPECIFIED;
@@ -958,6 +961,11 @@ EXPORT_SYMBOL(ib_resize_cq);
struct ib_mr *ib_get_dma_mr(struct ib_pd *pd, int mr_access_flags)
{
struct ib_mr *mr;
int err;
err = ib_check_mr_access(mr_access_flags);
if (err)
return ERR_PTR(err);
mr = pd->device->get_dma_mr(pd, mr_access_flags);
@@ -980,6 +988,11 @@ struct ib_mr *ib_reg_phys_mr(struct ib_pd *pd,
u64 *iova_start)
{
struct ib_mr *mr;
int err;
err = ib_check_mr_access(mr_access_flags);
if (err)
return ERR_PTR(err);
if (!pd->device->reg_phys_mr)
return ERR_PTR(-ENOSYS);
@@ -1010,6 +1023,10 @@ int ib_rereg_phys_mr(struct ib_mr *mr,
struct ib_pd *old_pd;
int ret;
ret = ib_check_mr_access(mr_access_flags);
if (ret)
return ret;
if (!mr->device->rereg_phys_mr)
return -ENOSYS;
+2 -2
View File
@@ -602,10 +602,10 @@ static int c4iw_rdev_open(struct c4iw_rdev *rdev)
rdev->lldi.vr->qp.size,
rdev->lldi.vr->cq.start,
rdev->lldi.vr->cq.size);
PDBG("udb len 0x%x udb base %p db_reg %p gts_reg %p qpshift %lu "
PDBG("udb len 0x%x udb base %llx db_reg %p gts_reg %p qpshift %lu "
"qpmask 0x%x cqshift %lu cqmask 0x%x\n",
(unsigned)pci_resource_len(rdev->lldi.pdev, 2),
(void *)(unsigned long)pci_resource_start(rdev->lldi.pdev, 2),
(u64)pci_resource_start(rdev->lldi.pdev, 2),
rdev->lldi.db_reg,
rdev->lldi.gts_reg,
rdev->qpshift, rdev->qpmask,
@@ -280,9 +280,7 @@ static int ipath_user_sdma_pin_pages(const struct ipath_devdata *dd,
int j;
int ret;
ret = get_user_pages(current, current->mm, addr,
npages, 0, 1, pages, NULL);
ret = get_user_pages_fast(addr, npages, 0, pages);
if (ret != npages) {
int i;
@@ -811,10 +809,7 @@ int ipath_user_sdma_writev(struct ipath_devdata *dd,
while (dim) {
const int mxp = 8;
down_write(&current->mm->mmap_sem);
ret = ipath_user_sdma_queue_pkts(dd, pq, &list, iov, dim, mxp);
up_write(&current->mm->mmap_sem);
if (ret <= 0)
goto done_unlock;
else {
+7 -2
View File
@@ -324,7 +324,7 @@ static int mlx4_ib_get_outstanding_cqes(struct mlx4_ib_cq *cq)
u32 i;
i = cq->mcq.cons_index;
while (get_sw_cqe(cq, i & cq->ibcq.cqe))
while (get_sw_cqe(cq, i))
++i;
return i - cq->mcq.cons_index;
@@ -365,7 +365,7 @@ int mlx4_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata)
mutex_lock(&cq->resize_mutex);
if (entries < 1 || entries > dev->dev->caps.max_cqes) {
if (entries < 1) {
err = -EINVAL;
goto out;
}
@@ -376,6 +376,11 @@ int mlx4_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata)
goto out;
}
if (entries > dev->dev->caps.max_cqes) {
err = -EINVAL;
goto out;
}
if (ibcq->uobject) {
err = mlx4_alloc_resize_umem(dev, cq, entries, udata);
if (err)
+3 -5
View File
@@ -1685,11 +1685,9 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ibdev->ib_dev.create_flow = mlx4_ib_create_flow;
ibdev->ib_dev.destroy_flow = mlx4_ib_destroy_flow;
#ifdef CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING
ibdev->ib_dev.uverbs_cmd_mask |=
(1ull << IB_USER_VERBS_CMD_CREATE_FLOW) |
(1ull << IB_USER_VERBS_CMD_DESTROY_FLOW);
#endif /* CONFIG_INFINIBAND_EXPERIMENTAL_UVERBS_FLOW_STEERING */
ibdev->ib_dev.uverbs_ex_cmd_mask |=
(1ull << IB_USER_VERBS_EX_CMD_CREATE_FLOW) |
(1ull << IB_USER_VERBS_EX_CMD_DESTROY_FLOW);
}
mlx4_ib_alloc_eqs(dev, ibdev);
+10 -15
View File
@@ -556,7 +556,7 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata,
goto err_db;
}
mlx5_ib_populate_pas(dev, cq->buf.umem, page_shift, (*cqb)->pas, 0);
(*cqb)->ctx.log_pg_sz = page_shift - PAGE_SHIFT;
(*cqb)->ctx.log_pg_sz = page_shift - MLX5_ADAPTER_PAGE_SHIFT;
*index = to_mucontext(context)->uuari.uars[0].index;
@@ -620,7 +620,7 @@ static int create_cq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq,
}
mlx5_fill_page_array(&cq->buf.buf, (*cqb)->pas);
(*cqb)->ctx.log_pg_sz = cq->buf.buf.page_shift - PAGE_SHIFT;
(*cqb)->ctx.log_pg_sz = cq->buf.buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT;
*index = dev->mdev.priv.uuari.uars[0].index;
return 0;
@@ -653,8 +653,11 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev, int entries,
int eqn;
int err;
if (entries < 0)
return ERR_PTR(-EINVAL);
entries = roundup_pow_of_two(entries + 1);
if (entries < 1 || entries > dev->mdev.caps.max_cqes)
if (entries > dev->mdev.caps.max_cqes)
return ERR_PTR(-EINVAL);
cq = kzalloc(sizeof(*cq), GFP_KERNEL);
@@ -747,17 +750,9 @@ int mlx5_ib_destroy_cq(struct ib_cq *cq)
return 0;
}
static int is_equal_rsn(struct mlx5_cqe64 *cqe64, struct mlx5_ib_srq *srq,
u32 rsn)
static int is_equal_rsn(struct mlx5_cqe64 *cqe64, u32 rsn)
{
u32 lrsn;
if (srq)
lrsn = be32_to_cpu(cqe64->srqn) & 0xffffff;
else
lrsn = be32_to_cpu(cqe64->sop_drop_qpn) & 0xffffff;
return rsn == lrsn;
return rsn == (ntohl(cqe64->sop_drop_qpn) & 0xffffff);
}
void __mlx5_ib_cq_clean(struct mlx5_ib_cq *cq, u32 rsn, struct mlx5_ib_srq *srq)
@@ -787,8 +782,8 @@ void __mlx5_ib_cq_clean(struct mlx5_ib_cq *cq, u32 rsn, struct mlx5_ib_srq *srq)
while ((int) --prod_index - (int) cq->mcq.cons_index >= 0) {
cqe = get_cqe(cq, prod_index & cq->ibcq.cqe);
cqe64 = (cq->mcq.cqe_sz == 64) ? cqe : cqe + 64;
if (is_equal_rsn(cqe64, srq, rsn)) {
if (srq)
if (is_equal_rsn(cqe64, rsn)) {
if (srq && (ntohl(cqe64->srqn) & 0xffffff))
mlx5_ib_free_srq_wqe(srq, be16_to_cpu(cqe64->wqe_counter));
++nfreed;
} else if (nfreed) {
+2 -1
View File
@@ -745,7 +745,8 @@ static int alloc_pa_mkey(struct mlx5_ib_dev *dev, u32 *key, u32 pdn)
seg->qpn_mkey7_0 = cpu_to_be32(0xffffff << 8);
seg->start_addr = 0;
err = mlx5_core_create_mkey(&dev->mdev, &mr, in, sizeof(*in));
err = mlx5_core_create_mkey(&dev->mdev, &mr, in, sizeof(*in),
NULL, NULL, NULL);
if (err) {
mlx5_ib_warn(dev, "failed to create mkey, %d\n", err);
goto err_in;
+6
View File
@@ -262,6 +262,9 @@ struct mlx5_ib_mr {
int npages;
struct completion done;
enum ib_wc_status status;
struct mlx5_ib_dev *dev;
struct mlx5_create_mkey_mbox_out out;
unsigned long start;
};
struct mlx5_ib_fast_reg_page_list {
@@ -323,6 +326,7 @@ struct mlx5_cache_ent {
struct mlx5_ib_dev *dev;
struct work_struct work;
struct delayed_work dwork;
int pending;
};
struct mlx5_mr_cache {
@@ -358,6 +362,8 @@ struct mlx5_ib_dev {
spinlock_t mr_lock;
struct mlx5_ib_resources devr;
struct mlx5_mr_cache cache;
struct timer_list delay_timer;
int fill_delay;
};
static inline struct mlx5_ib_cq *to_mibcq(struct mlx5_core_cq *mcq)
+122 -45
View File
@@ -35,11 +35,12 @@
#include <linux/random.h>
#include <linux/debugfs.h>
#include <linux/export.h>
#include <linux/delay.h>
#include <rdma/ib_umem.h>
#include "mlx5_ib.h"
enum {
DEF_CACHE_SIZE = 10,
MAX_PENDING_REG_MR = 8,
};
enum {
@@ -63,6 +64,51 @@ static int order2idx(struct mlx5_ib_dev *dev, int order)
return order - cache->ent[0].order;
}
static void reg_mr_callback(int status, void *context)
{
struct mlx5_ib_mr *mr = context;
struct mlx5_ib_dev *dev = mr->dev;
struct mlx5_mr_cache *cache = &dev->cache;
int c = order2idx(dev, mr->order);
struct mlx5_cache_ent *ent = &cache->ent[c];
u8 key;
unsigned long flags;
spin_lock_irqsave(&ent->lock, flags);
ent->pending--;
spin_unlock_irqrestore(&ent->lock, flags);
if (status) {
mlx5_ib_warn(dev, "async reg mr failed. status %d\n", status);
kfree(mr);
dev->fill_delay = 1;
mod_timer(&dev->delay_timer, jiffies + HZ);
return;
}
if (mr->out.hdr.status) {
mlx5_ib_warn(dev, "failed - status %d, syndorme 0x%x\n",
mr->out.hdr.status,
be32_to_cpu(mr->out.hdr.syndrome));
kfree(mr);
dev->fill_delay = 1;
mod_timer(&dev->delay_timer, jiffies + HZ);
return;
}
spin_lock_irqsave(&dev->mdev.priv.mkey_lock, flags);
key = dev->mdev.priv.mkey_key++;
spin_unlock_irqrestore(&dev->mdev.priv.mkey_lock, flags);
mr->mmr.key = mlx5_idx_to_mkey(be32_to_cpu(mr->out.mkey) & 0xffffff) | key;
cache->last_add = jiffies;
spin_lock_irqsave(&ent->lock, flags);
list_add_tail(&mr->list, &ent->head);
ent->cur++;
ent->size++;
spin_unlock_irqrestore(&ent->lock, flags);
}
static int add_keys(struct mlx5_ib_dev *dev, int c, int num)
{
struct mlx5_mr_cache *cache = &dev->cache;
@@ -78,36 +124,39 @@ static int add_keys(struct mlx5_ib_dev *dev, int c, int num)
return -ENOMEM;
for (i = 0; i < num; i++) {
if (ent->pending >= MAX_PENDING_REG_MR) {
err = -EAGAIN;
break;
}
mr = kzalloc(sizeof(*mr), GFP_KERNEL);
if (!mr) {
err = -ENOMEM;
goto out;
break;
}
mr->order = ent->order;
mr->umred = 1;
mr->dev = dev;
in->seg.status = 1 << 6;
in->seg.xlt_oct_size = cpu_to_be32((npages + 1) / 2);
in->seg.qpn_mkey7_0 = cpu_to_be32(0xffffff << 8);
in->seg.flags = MLX5_ACCESS_MODE_MTT | MLX5_PERM_UMR_EN;
in->seg.log2_page_size = 12;
spin_lock_irq(&ent->lock);
ent->pending++;
spin_unlock_irq(&ent->lock);
mr->start = jiffies;
err = mlx5_core_create_mkey(&dev->mdev, &mr->mmr, in,
sizeof(*in));
sizeof(*in), reg_mr_callback,
mr, &mr->out);
if (err) {
mlx5_ib_warn(dev, "create mkey failed %d\n", err);
kfree(mr);
goto out;
break;
}
cache->last_add = jiffies;
spin_lock(&ent->lock);
list_add_tail(&mr->list, &ent->head);
ent->cur++;
ent->size++;
spin_unlock(&ent->lock);
}
out:
kfree(in);
return err;
}
@@ -121,16 +170,16 @@ static void remove_keys(struct mlx5_ib_dev *dev, int c, int num)
int i;
for (i = 0; i < num; i++) {
spin_lock(&ent->lock);
spin_lock_irq(&ent->lock);
if (list_empty(&ent->head)) {
spin_unlock(&ent->lock);
spin_unlock_irq(&ent->lock);
return;
}
mr = list_first_entry(&ent->head, struct mlx5_ib_mr, list);
list_del(&mr->list);
ent->cur--;
ent->size--;
spin_unlock(&ent->lock);
spin_unlock_irq(&ent->lock);
err = mlx5_core_destroy_mkey(&dev->mdev, &mr->mmr);
if (err)
mlx5_ib_warn(dev, "failed destroy mkey\n");
@@ -162,9 +211,13 @@ static ssize_t size_write(struct file *filp, const char __user *buf,
return -EINVAL;
if (var > ent->size) {
err = add_keys(dev, c, var - ent->size);
if (err)
return err;
do {
err = add_keys(dev, c, var - ent->size);
if (err && err != -EAGAIN)
return err;
usleep_range(3000, 5000);
} while (err);
} else if (var < ent->size) {
remove_keys(dev, c, ent->size - var);
}
@@ -280,23 +333,37 @@ static void __cache_work_func(struct mlx5_cache_ent *ent)
struct mlx5_ib_dev *dev = ent->dev;
struct mlx5_mr_cache *cache = &dev->cache;
int i = order2idx(dev, ent->order);
int err;
if (cache->stopped)
return;
ent = &dev->cache.ent[i];
if (ent->cur < 2 * ent->limit) {
add_keys(dev, i, 1);
if (ent->cur < 2 * ent->limit)
queue_work(cache->wq, &ent->work);
if (ent->cur < 2 * ent->limit && !dev->fill_delay) {
err = add_keys(dev, i, 1);
if (ent->cur < 2 * ent->limit) {
if (err == -EAGAIN) {
mlx5_ib_dbg(dev, "returned eagain, order %d\n",
i + 2);
queue_delayed_work(cache->wq, &ent->dwork,
msecs_to_jiffies(3));
} else if (err) {
mlx5_ib_warn(dev, "command failed order %d, err %d\n",
i + 2, err);
queue_delayed_work(cache->wq, &ent->dwork,
msecs_to_jiffies(1000));
} else {
queue_work(cache->wq, &ent->work);
}
}
} else if (ent->cur > 2 * ent->limit) {
if (!someone_adding(cache) &&
time_after(jiffies, cache->last_add + 60 * HZ)) {
time_after(jiffies, cache->last_add + 300 * HZ)) {
remove_keys(dev, i, 1);
if (ent->cur > ent->limit)
queue_work(cache->wq, &ent->work);
} else {
queue_delayed_work(cache->wq, &ent->dwork, 60 * HZ);
queue_delayed_work(cache->wq, &ent->dwork, 300 * HZ);
}
}
}
@@ -336,18 +403,18 @@ static struct mlx5_ib_mr *alloc_cached_mr(struct mlx5_ib_dev *dev, int order)
mlx5_ib_dbg(dev, "order %d, cache index %d\n", ent->order, i);
spin_lock(&ent->lock);
spin_lock_irq(&ent->lock);
if (!list_empty(&ent->head)) {
mr = list_first_entry(&ent->head, struct mlx5_ib_mr,
list);
list_del(&mr->list);
ent->cur--;
spin_unlock(&ent->lock);
spin_unlock_irq(&ent->lock);
if (ent->cur < ent->limit)
queue_work(cache->wq, &ent->work);
break;
}
spin_unlock(&ent->lock);
spin_unlock_irq(&ent->lock);
queue_work(cache->wq, &ent->work);
@@ -374,12 +441,12 @@ static void free_cached_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
return;
}
ent = &cache->ent[c];
spin_lock(&ent->lock);
spin_lock_irq(&ent->lock);
list_add_tail(&mr->list, &ent->head);
ent->cur++;
if (ent->cur > 2 * ent->limit)
shrink = 1;
spin_unlock(&ent->lock);
spin_unlock_irq(&ent->lock);
if (shrink)
queue_work(cache->wq, &ent->work);
@@ -394,16 +461,16 @@ static void clean_keys(struct mlx5_ib_dev *dev, int c)
cancel_delayed_work(&ent->dwork);
while (1) {
spin_lock(&ent->lock);
spin_lock_irq(&ent->lock);
if (list_empty(&ent->head)) {
spin_unlock(&ent->lock);
spin_unlock_irq(&ent->lock);
return;
}
mr = list_first_entry(&ent->head, struct mlx5_ib_mr, list);
list_del(&mr->list);
ent->cur--;
ent->size--;
spin_unlock(&ent->lock);
spin_unlock_irq(&ent->lock);
err = mlx5_core_destroy_mkey(&dev->mdev, &mr->mmr);
if (err)
mlx5_ib_warn(dev, "failed destroy mkey\n");
@@ -464,12 +531,18 @@ static void mlx5_mr_cache_debugfs_cleanup(struct mlx5_ib_dev *dev)
debugfs_remove_recursive(dev->cache.root);
}
static void delay_time_func(unsigned long ctx)
{
struct mlx5_ib_dev *dev = (struct mlx5_ib_dev *)ctx;
dev->fill_delay = 0;
}
int mlx5_mr_cache_init(struct mlx5_ib_dev *dev)
{
struct mlx5_mr_cache *cache = &dev->cache;
struct mlx5_cache_ent *ent;
int limit;
int size;
int err;
int i;
@@ -479,6 +552,7 @@ int mlx5_mr_cache_init(struct mlx5_ib_dev *dev)
return -ENOMEM;
}
setup_timer(&dev->delay_timer, delay_time_func, (unsigned long)dev);
for (i = 0; i < MAX_MR_CACHE_ENTRIES; i++) {
INIT_LIST_HEAD(&cache->ent[i].head);
spin_lock_init(&cache->ent[i].lock);
@@ -489,13 +563,11 @@ int mlx5_mr_cache_init(struct mlx5_ib_dev *dev)
ent->order = i + 2;
ent->dev = dev;
if (dev->mdev.profile->mask & MLX5_PROF_MASK_MR_CACHE) {
size = dev->mdev.profile->mr_cache[i].size;
if (dev->mdev.profile->mask & MLX5_PROF_MASK_MR_CACHE)
limit = dev->mdev.profile->mr_cache[i].limit;
} else {
size = DEF_CACHE_SIZE;
else
limit = 0;
}
INIT_WORK(&ent->work, cache_work_func);
INIT_DELAYED_WORK(&ent->dwork, delayed_cache_work_func);
ent->limit = limit;
@@ -522,6 +594,7 @@ int mlx5_mr_cache_cleanup(struct mlx5_ib_dev *dev)
clean_keys(dev, i);
destroy_workqueue(dev->cache.wq);
del_timer_sync(&dev->delay_timer);
return 0;
}
@@ -551,7 +624,8 @@ struct ib_mr *mlx5_ib_get_dma_mr(struct ib_pd *pd, int acc)
seg->qpn_mkey7_0 = cpu_to_be32(0xffffff << 8);
seg->start_addr = 0;
err = mlx5_core_create_mkey(mdev, &mr->mmr, in, sizeof(*in));
err = mlx5_core_create_mkey(mdev, &mr->mmr, in, sizeof(*in), NULL, NULL,
NULL);
if (err)
goto err_in;
@@ -660,14 +734,14 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
int err;
int i;
for (i = 0; i < 10; i++) {
for (i = 0; i < 1; i++) {
mr = alloc_cached_mr(dev, order);
if (mr)
break;
err = add_keys(dev, order2idx(dev, order), 1);
if (err) {
mlx5_ib_warn(dev, "add_keys failed\n");
if (err && err != -EAGAIN) {
mlx5_ib_warn(dev, "add_keys failed, err %d\n", err);
break;
}
}
@@ -759,8 +833,10 @@ static struct mlx5_ib_mr *reg_create(struct ib_pd *pd, u64 virt_addr,
in->seg.xlt_oct_size = cpu_to_be32(get_octo_len(virt_addr, length, 1 << page_shift));
in->seg.log2_page_size = page_shift;
in->seg.qpn_mkey7_0 = cpu_to_be32(0xffffff << 8);
in->xlat_oct_act_size = cpu_to_be32(get_octo_len(virt_addr, length, 1 << page_shift));
err = mlx5_core_create_mkey(&dev->mdev, &mr->mmr, in, inlen);
in->xlat_oct_act_size = cpu_to_be32(get_octo_len(virt_addr, length,
1 << page_shift));
err = mlx5_core_create_mkey(&dev->mdev, &mr->mmr, in, inlen, NULL,
NULL, NULL);
if (err) {
mlx5_ib_warn(dev, "create mkey failed\n");
goto err_2;
@@ -944,7 +1020,8 @@ struct ib_mr *mlx5_ib_alloc_fast_reg_mr(struct ib_pd *pd,
* TBD not needed - issue 197292 */
in->seg.log2_page_size = PAGE_SHIFT;
err = mlx5_core_create_mkey(&dev->mdev, &mr->mmr, in, sizeof(*in));
err = mlx5_core_create_mkey(&dev->mdev, &mr->mmr, in, sizeof(*in), NULL,
NULL, NULL);
kfree(in);
if (err)
goto err_free;

Some files were not shown because too many files have changed in this diff Show More