Commit Graph

475 Commits

Author SHA1 Message Date
Alexey Dobriyan
d43c36dc6b headers: remove sched.h from interrupt.h
After m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-10-11 11:20:58 -07:00
David J. Wilder
85f20b39fd RDMA/addr: Fix resolution of local IPv6 addresses
This patch allows a local IPv6 address to be resolved by rdma_cm.

To reproduce the problem:

 $ rping -s -v -a ::0  &
 $ rping -c -v -a <IPv6 address local to this system>
 rdma_resolve_addr error -1

Local IPv6 address was obtained with "ip addr show ib0"

Addresses: https://bugs.openfabrics.org/show_bug.cgi?id=1759
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-10-07 16:03:18 -07:00
Steve Wise
54e05f15cc RDMA/iwcm: Don't call provider reject func with irqs disabled
In commit cb58160e ("RDMA/iwcm: Reject the connection when the cm_id
is destroyed") a call to the provider's reject handler was added to
destroy_cm_id() to fix a provider endpoint leak.  This call needs to
be done with interrupts enabled.  So unlock and relock around this
call.  This is safe because:

1) the provider will do nothing with this endpoint until the iwcm either
   accepts or rejects.
2) the lock is only released after the iwcm state is changed, so an
   errant iwcm app that is destroying -and- rejecting the connection
   concurrently will get a failure on one of the calls.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-10-07 15:38:12 -07:00
Alexey Dobriyan
a99bbaf5ee headers: remove sched.h from poll.h
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-04 15:05:10 -07:00
Roland Dreier
0e442afd92 IB/mad: Fix lock-lock-timer deadlock in RMPP code
Holding agent->lock across cancel_delayed_work() (which does
del_timer_sync()) in ib_cancel_rmpp_recvs() leads to lockdep reports of
possible lock-timer deadlocks if a consumer ever does something that
connects agent->lock to a lock taken in IRQ context (cf
http://marc.info/?l=linux-rdma&m=125243699026045).

Fix this by changing the list items to a new state "CANCELING" while
holding the lock, and then canceling the delayed work without holding
the lock.  If the delayed work runs after the lock is dropped, it will
see the state is CANCELING and return immediately, so the list will
stay stable while we traverse it with the lock not held.

Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-23 11:10:15 -07:00
Roland Dreier
73f526da02 Merge branch 'mad' into for-linus
Conflicts:
	drivers/infiniband/core/mad.c
2009-09-10 21:19:45 -07:00
Steve Wise
cb58160e72 RDMA/iwcm: Reject the connection when the cm_id is destroyed
If the cm_id of a connect request is destroyed prior to the ULP
accepting or rejecting the connection, then the provider never cleans
up the connection.  The iwcm should explicitly reject these
connections if the cm_id is destroyed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-09 11:37:38 -07:00
Hal Rosenstock
b76aabc395 IB/mad: Allow tuning of QP0 and QP1 sizes
MADs are UD and can be dropped if there are no receives posted, so
allow receive queue size to be set with a module parameter in case the
queue needs to be lengthened.  Send side tuning is done for symmetry
with receive.

Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-07 08:28:48 -07:00
Roland Dreier
6b2eef8fd7 IB/mad: Fix possible lock-lock-timer deadlock
Lockdep reported a possible deadlock with cm_id_priv->lock,
mad_agent_priv->lock and mad_agent_priv->timed_work.timer; this
happens because the mad module does

	cancel_delayed_work(&mad_agent_priv->timed_work);

while holding mad_agent_priv->lock.  cancel_delayed_work() internally
does del_timer_sync(&mad_agent_priv->timed_work.timer).

This can turn into a deadlock because mad_agent_priv->lock is taken
inside cm_id_priv->lock, so we can get the following set of contexts
that deadlock each other:

 A: holding cm_id_priv->lock, waiting for mad_agent_priv->lock
 B: holding mad_agent_priv->lock, waiting for del_timer_sync()
 C: interrupt during mad_agent_priv->timed_work.timer that takes
    cm_id_priv->lock

Fix this by using the new __cancel_delayed_work() interface (which
internally does del_timer() instead of del_timer_sync()) in all the
places where we are holding a lock.

Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=13757
Reported-by: Bart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-07 08:27:50 -07:00
Jack Morgenstein
b1b8afb833 IB/uverbs: Return ENOSYS for unimplemented commands (not EINVAL)
Since the original commit 883a99c7 ("[IB] uverbs: Add a mask of device
methods allowed for userspace"), the uverbs core returns EINVAL for
commands not implemented by a specific low-level driver.

This creates a problem that there is no way to tell the difference
between an unimplemented command and an implemented one which is
incorrectly invoked (which also returns EINVAL).

The fix is to have unimplemented commands return ENOSYS.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:24:24 -07:00
Yossi Etigin
e1d7806df3 IB/core: Fix send multicast group leave retry
Until now, retries were only sent when joining a multicast group. This
patch will adds retries when leaving a multicast group as well.

Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:24:24 -07:00
Roland Dreier
6276e08a9b IB: Use DEFINE_SPINLOCK() for static spinlocks
Rather than just defining static spinlock_t variables and then
initializing them later in init functions, simply define them with
DEFINE_SPINLOCK() and remove the calls to spin_lock_init().  This cleans
up the source a tad and also shrinks the compiled code; eg on x86-64:

add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-40 (-40)
function                                     old     new   delta
ib_uverbs_init                               336     326     -10
ib_mad_init_module                           147     137     -10
ib_sa_init                                   123     103     -20

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:24:23 -07:00
Roland Dreier
60f2b652f5 IB/mad: Check hop count field in directed route MAD to avoid array overflow
The hop count field in a directed route MAD is only allowed to be in the
range 0 to 63 (by spec).  Check that this really is the case to avoid
accessing outside the bounds of the hop array.

Reported-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:24:10 -07:00
Peter Huewe
716abb1fdf RDMA: Add __init/__exit macros to addr.c and cma.c
Add __init and __exit annotations to the module_init/module_exit
functions from drivers/infiniband/core/addr.c and cma.c.

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-23 10:38:42 -07:00
Greg Kroah-Hartman
3f7c58a05f infiniband: remove driver_data direct access of struct device
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device.  Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.


Cc: general@lists.openfabrics.org
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:26 -07:00
Yossi Etigin
d2ca39f262 RDMA/cma: Create cm id even when IB port is down
When doing rdma_resolve_addr(), if the relevant IB port is down, the
function fails and the cm_id is not bound to the correct device.
Therefore, application does not have a device handle and cannot wait
for the port to become active.  The function fails because the
underlying IPoIB interface is not joined to the broadcast group and
therefore the SA does not have a multicast record to take a Q_Key
from.

The fix is to use lazy Q_Key resolution - cma_set_qkey() will set
id_priv->qkey if it was not set, and will be called just before the
Q_Key is really required.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 13:42:33 -07:00
Yossi Etigin
84adeee9aa RDMA/cma: Use rate from IPoIB broadcast when joining IPoIB multicast groups
When joining an IPoIB multicast group, use the same rate as in the
broadcast group.  Otherwise, if the RDMA CM creates this group before
IPoIB does, it might get a different rate.  This will cause IPoIB to
fail joining to the same group later on, because IPoIB uses strict
rate selection.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-01 13:55:32 -07:00
Roland Dreier
09f98bafea Merge branches 'cxgb3', 'endian', 'ipath', 'ipoib', 'iser', 'mad', 'misc', 'mlx4', 'mthca', 'nes' and 'sysfs' into for-next 2009-03-24 20:44:41 -07:00
Roland Dreier
6432f36684 IB: Remove useless ibdev_is_alive() tests from sysfs code
Some attribute show functions test ibdev_is_alive() to make sure that
it's OK to access device state.  However, the sysfs attributes will
not be registered until the device is fully initialized, and they'll
be unregistered before anything is torn down, so ibdev_is_alive()
doesn't do anything useful.  Remove it.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-04 15:22:39 -08:00
Jack Morgenstein
6b708b3dde IB/sa_query: Fix AH leak due to update_sm_ah() race
Our testing uncovered a race condition in ib_sa_event():

	spin_lock_irqsave(&port->ah_lock, flags);
	if (port->sm_ah)
		kref_put(&port->sm_ah->ref, free_sm_ah);
	port->sm_ah = NULL;
	spin_unlock_irqrestore(&port->ah_lock, flags);

	schedule_work(&sa_dev->port[event->element.port_num -
				    sa_dev->start_port].update_task);

If two events occur back-to-back (e.g., client-reregister and LID
change), both may pass the spinlock-protected code above before the
scheduled work updates the port->sm_ah handle.  Then if the scheduled
work ends up running twice, the second operation will then find a
non-NULL port->sm_ah, and will simply overwrite it in update_sm_ah --
resulting in an AH leak.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-03 14:30:01 -08:00
Ralph Campbell
4780c1953f IB/mad: Fix ib_post_send_mad() returning 0 with no generate send comp
If ib_post_send_mad() returns 0, the API guarantees that there will be
a callback to send_buf->mad_agent->send_handler() so that the sender
can call ib_free_send_mad().  Otherwise, the ib_mad_send_buf will be
leaked and the mad_agent reference count will never go to zero and the
IB device module cannot be unloaded.  The above can happen without
this patch if process_mad() returns (IB_MAD_RESULT_SUCCESS |
IB_MAD_RESULT_CONSUMED).

If process_mad() returns IB_MAD_RESULT_SUCCESS and there is no agent
registered to receive the mad being sent, handle_outgoing_dr_smp()
returns zero which causes a MAD packet which is at the end of the
directed route to be incorrectly sent on the wire but doesn't cause a
hang since the HCA generates a send completion.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-03 14:22:17 -08:00
Ralph Campbell
d9620a4c82 IB/mad: initialize mad_agent_priv before putting on lists
There is a potential race in ib_register_mad_agent() where the struct
ib_mad_agent_private is not fully initialized before it is added to
the list of agents per IB port. This means the ib_mad_agent_private
could be seen before the refcount, spin locks, and linked lists are
initialized.  The fix is to initialize the structure earlier.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-27 14:44:32 -08:00
Ralph Campbell
1d9bc6d648 IB/mad: Fix null pointer dereference in local_completions()
handle_outgoing_dr_smp() can queue a struct ib_mad_local_private
*local on the mad_agent_priv->local_work work queue with
local->mad_priv == NULL if device->process_mad() returns
IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY and
(!ib_response_mad(&mad_priv->mad.mad) ||
!mad_agent_priv->agent.recv_handler).

In this case, local_completions() will be called with local->mad_priv
== NULL. The code does check for this case and skips calling
recv_mad_agent->agent.recv_handler() but recv == 0 so
kmem_cache_free() is called with a NULL pointer.

Also, since recv isn't reinitialized each time through the loop, it
can cause a memory leak if recv should have been zero.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
2009-02-27 10:34:30 -08:00
Roland Dreier
9206dff157 IB: Remove sysfs files before unregistering device
Move the ib_device_unregister_sysfs() call from ib_dealloc_device() to
ib_unregister_device().  The old code allows device unregister to
proceed even if some sysfs files are open, which leaves a window where
userspace can open a file before a device is removed but then end up
reading the file after the device is removed, which leads to various
kernel crashes either because the device data structure is freed or
because the low-level driver code is gone after module removal.

By not returning from ib_unregister_device() until after all sysfs
entries are removed, we make sure that data structures and/or module
code is not freed until after all sysfs access is done.

Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-25 13:27:46 -08:00
Harvey Harrison
9c3da09917 IB: Remove __constant_{endian} uses
The base versions handle constant folding just fine, use them
directly.  The replacements are OK in the include/ files as they are
not exported to userspace so we don't need the __ prefixed versions.

This patch does not affect code generation at all.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-17 17:11:57 -08:00