Commit Graph

44 Commits

Author SHA1 Message Date
Roland Dreier
9ead190bfd IB/uverbs: Don't serialize with ib_uverbs_idr_mutex
Currently, all userspace verbs operations that call into the kernel
are serialized by ib_uverbs_idr_mutex.  This can be a scalability
issue for some workloads, especially for devices driven by the ipath
driver, which needs to call into the kernel even for datapath
operations.

Fix this by adding reference counts to the userspace objects, and then
converting ib_uverbs_idr_mutex into a spinlock that only protects the
idrs long enough to take a reference on the object being looked up.
Because remove operations may fail, we have to do a slightly funky
two-step deletion, which is described in the comments at the top of
uverbs_cmd.c.

This also still leaves ib_uverbs_idr_lock as a single lock that is
possibly subject to contention.  However, the lock hold time will only
be a single idr operation, so multiple threads should still be able to
make progress, even if ib_uverbs_idr_lock is being ping-ponged.

Surprisingly, these changes even shrink the object code:

add/remove: 23/5 grow/shrink: 4/21 up/down: 633/-693 (-60)

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:44:49 -07:00
Sean Hefty
6d969a471b IB/sa: Add ib_init_ah_from_path()
Add a call to initialize address handle attributes given a path record.
This is used by the CM, and would be useful for users of UD QPs.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:39 -07:00
Sean Hefty
4e00d69454 IB: Add ib_init_ah_from_wc()
Add a function to initialize address handle attributes from a work
completion.  This functionality is duplicated by both verbs and the CM.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:39 -07:00
Sean Hefty
75af908851 IB/ucm: Get rid of duplicate P_Key parameter
The P_Key is provided into a SIDR REQ in two places, once as a
parameter, and again in the path record.  Remove the P_Key as a
parameter and always use the one given in the path record.

This change has no practical effect on ABI functionality.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:39 -07:00
Leonid Arsh
da2ab62ab5 IB: Move struct port_info from ipath to <rdma/ib_smi.h>
Move ipath's struct port_info into <rdma/ib_smi.h>, so that it can be
used by mthca to implement client reregister support.

Remove the __attribute__((packed)) because all the members of the struct
are naturally aligned anyway.

Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:36 -07:00
Leonid Arsh
63942c9a98 IB: Add client reregister event type
Add IB_EVENT_CLIENT_REREGISTER to enum so low-level drivers can
generate "client reregister" events.

Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:35 -07:00
Jack Morgenstein
6fb9cdbf2c IB: Add caching of ports' LMC
Add an LMC cache to struct ib_device, and add a function
ib_get_cached_lmc() to query the cache.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:34 -07:00
Sean Hefty
e51060f08a IB: IP address based RDMA connection manager
Kernel connection management agent over InfiniBand that connects based
on IP addresses.  The agent defines a generic RDMA connection
abstraction to support clients wanting to connect over different RDMA
devices.

The agent also handles RDMA device hotplug events on behalf of clients.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:29 -07:00
Sean Hefty
7025fcd36b IB: address translation to map IP toIB addresses (GIDs)
Add an address translation service that maps IP addresses to
InfiniBand GID addresses using IPoIB.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:28 -07:00
Sean Hefty
6e61d04f2d IB/cm: Match connection requests based on private data
Extend matching connection requests to listens in the InfiniBand CM to
include private data checks.

This allows applications to listen on the same service identifier,
with private data directing the request to the appropriate application.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:28 -07:00
Sean Hefty
6a9af2e18a IB: common handling for marshalling parameters to/from userspace
Provide common handling for marshalling data between userspace clients
and kernel InfiniBand drivers.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:27 -07:00
Jack Morgenstein
bf6a9e31cf IB: simplify static rate encoding
Push translation of static rate to HCA format into low-level drivers,
where it belongs.  For static rate encoding, use encoding of rate
field from IB standard PathRecord, with addition of value 0, for
backwards compatibility with current usage.  The changes are:

 - Add enum ib_rate to midlayer includes.
 - Get rid of static rate translation in IPoIB; just use static rate
   directly from Path and MulticastGroup records.
 - Update mthca driver to translate absolute static rate into the
   format used by hardware.  This also fixes mthca's static rate
   handling for HCAs that are capable of 4X DDR.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10 09:43:47 -07:00
Hal Rosenstock
618a3c03fc IB/mad: RMPP support for additional classes
Add RMPP support for additional management classes that support it.
Also, validate RMPP is consistent with management class specified.

Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-30 07:19:51 -08:00
Jack Morgenstein
f36e1793e2 IB/umad: Add support for large RMPP transfers
Add support for sending and receiving large RMPP transfers.  The old
code supports transfers only as large as a single contiguous kernel
memory allocation.  This patch uses linked list of memory buffers when
sending and receiving data to avoid needing contiguous pages for
larger transfers.

  Receive side: copy the arriving MADs in chunks instead of coalescing
  to one large buffer in kernel space.

  Send side: split a multipacket MAD buffer to a list of segments,
  (multipacket_list) and send these using a gather list of size 2.
  Also, save pointer to last sent segment, and retrieve requested
  segments by walking list starting at last sent segment. Finally,
  save pointer to last-acked segment.  When retrying, retrieve
  segments for resending relative to this pointer.  When updating last
  ack, start at this pointer.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:23 -08:00
Dotan Barak
ea88fd16d6 IB/uverbs: Return actual capacity from create SRQ operation
Pass actual capacity of created SRQ back to userspace, so that
userspace can report accurate capacities.  This requires an ABI bump,
to change struct ib_uverbs_create_srq_resp.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:16 -08:00
Dotan Barak
abb6e9ba17 IB/mthca: Return actual capacity from create_srq
Have mthca's create_srq method return the actual capacity of the SRQ
that gets created.  Also update comments in <rdma/ib_verbs.h> to
clarify that this is what is expected from ib_create_srq().

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:16 -08:00
Roland Dreier
4d9781c5ce IB/uverbs: Fix alignment of struct ib_uverbs_create_qp_resp
The size of struct ib_uverbs_create_qp_resp is not even multiple of 8
bytes.  This causes problems for low-level drivers that add private
data after the structure: 32-bit userspace will look in the wrong
place for a response from a 64-bit kernel.  Fix this by adding a
reserved field.  Also, bump the ABI version because this changes the
size of a structure.

Pointed out by Hoang-Nam Nguyen <HNGUYEN@de.ibm.com>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:16 -08:00
Dotan Barak
8bdb0e8632 IB/uverbs: Support for query SRQ from userspace
Add support to uverbs to handle querying userspace SRQs (shared
receive queues), including adding an ABI for marshalling requests and
responses.  The kernel midlayer already has the underlying
ib_query_srq() function.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:14 -08:00
Dotan Barak
7ccc9a24e0 IB/uverbs: Support for query QP from userspace
Add support to uverbs to handle querying userspace QPs (queue pairs),
including adding an ABI for marshalling requests and responses.  The
kernel midlayer already has the underlying ib_query_qp() function.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:14 -08:00
Roland Dreier
a74cd4af0b IB: Whitespace cleanups
Remove trailing whitespace and fix indentation that with spaces
instead of tabs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:13 -08:00
Roland Dreier
8a51866f08 IB: Add ib_modify_qp_is_ok() library function
The in-kernel mthca driver contains a table of which attributes are
valid for each queue pair state transition.  It turns out that both
other IB drivers -- ipath and ehca -- which are being prepared for
merging have copied this table, errors and all.

To forestall this code duplication, move this table and the code to
check parameters against it into a midlayer library function,
ib_modify_qp_is_ok().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:12 -08:00
Or Gerlitz
d36f34aadf IB: Enable FMR pool user to set page size
This patch allows the consumer to set the page size of "pages" mapped
by the pool FMRs, which is a feature already existing in the base
verbs API.  On the cosmetic side it changes ib_fmr_attr.page_size field
to be named page_shift.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:10 -08:00
Roland Dreier
c5bcbbb9fe IB: Allow userspace to set node description
Expose a writable "node_desc" sysfs attribute for InfiniBand devices.
This allows userspace to update the node description with information
such as the node's hostname, so that IB network management software
can tie its view to the real world.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:09 -08:00
Roland Dreier
33b9b3ee97 IB: Add userspace support for resizing CQs
Add support to uverbs to handle resizing userspace CQs (completion
queues), including adding an ABI for marshalling requests and
responses.  The kernel midlayer already has ib_resize_cq().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:07 -08:00
Sean Hefty
cf311cd49a IB: Add node_guid to struct ib_device
Add a node_guid field to struct ib_device.  It is the responsibility
of the low-level driver to initialize this field before registering a
device with the midlayer.  Convert everyone to looking at this field
instead of calling ib_query_device() when all they want is the node
GUID, and remove the node_guid field from struct ib_device_attr.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-10 07:39:34 -08:00