Commit Graph

1506 Commits

Author SHA1 Message Date
Pete Wyckoff 35fb5340e3 Revert "IB/fmr_pool: ib_fmr_pool_flush() should flush all dirty FMRs"
This reverts commit a3cd7d9070.

The original commit breaks iSER reliably, making it complain:

    iser: iser_reg_page_vec:ib_fmr_pool_map_phys failed: -11

The FMR cleanup thread runs ib_fmr_batch_release() as dirty entries
build up.  This commit causes clean but used FMR entries also to be
purged.  During that process, another thread can see that there are no
free FMRs and fail, even though there should always have been enough
available.

Signed-off-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-29 13:29:19 -08:00
Sean Hefty 84ba284cd7 IB/cm: Flush workqueue when removing device
When a CM MAD is received, it is queued to a CM workqueue for
processing.  The queued work item references the port and device on
which the MAD was received.  If that device is removed from the system
before the work item can execute, the work item will reference freed
memory.

To fix this, flush the workqueue after unregistering to receive MAD,
and before the device is be freed.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-29 13:27:52 -08:00
John Lacombe 4b1cc7e7ca RDMA/nes: Fix interrupt moderation low threshold
Interrupt moderation low threshold value was incorrectly triggering,
indicating that the threshold should be lowered.

The impact was the timer was likely to become 40usecs and get stuck
there.  The biggest side effect was too many interrupts and nonoptimal
performance.

Signed-off-by: John Lacombe <jlacombe@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-26 16:24:29 -08:00
Faisal Latif 30da7cff87 RDMA/nes: Fix CRC endianness for RDMA connection establishment on big-endian
With commit ef19454b ("[LIB] crc32c: Keep intermediate crc state in
cpu order"), the behavior of crc32c changes on big-endian platforms.

Our algorithm expects the previous behavior; otherwise we have RDMA
connection establishment failure on big-endian platforms like powerpc.
Apply cpu_to_le32() to value returned by crc32c() to get the previous
behavior.

Signed-off-by: Faisal Latif <flatif@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-26 16:24:29 -08:00
Faisal Latif a2e9c384ce RDMA/nes: Fix use-after-free in mini_cm_dec_refcnt_listen()
Fix use-after-free spotted by Coverity checker flagged by Adrian Bunk.

Signed-off-by: Faisal Latif <flatif@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-26 16:24:29 -08:00
Glenn Streiff f84fba6f96 RDMA/nes: Fix use-after-free in nes_create_cq()
Just delete the debugging statement so we don't use cqp_request after
freeing it.  Adrian Bunk flagged this use-after-free issue spotted by
the Coverity checker.

Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-26 16:24:29 -08:00
Adrian Bunk a4435febd4 RDMA/nes: Fix a check-after-use in nes_probe()
Fix a check-after-use spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-26 16:24:29 -08:00
Adrian Bunk ed0ba33d64 RDMA/nes: Fix a memory leak in schedule_nes_timer()
Fix a memory leak spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-26 16:24:27 -08:00
Adrian Bunk 65b07ec293 RDMA/nes: Fix off-by-one
Fix an off-by-one spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-25 16:00:30 -08:00
Chien Tung 9300c0c067 RDMA/nes: Resurrect error path dead code
Adrian Bunk pointed out that a Coverity scan found some apparently
dead code in nes_verbs.c that really shouldn't have been dead.

The function nes_create_cq() was missing the assignment

	err = 1;

just prior to an iteration that conditionally set err = 0 if a PBL was
found for a given virtual CQ.  I also noticed we should have been
returning -EFAULT on a couple related error paths.

Signed-off-by: Chien Tung <ctung@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-25 16:00:30 -08:00
Bryan Rosenburg 82d416fffb RDMA/cxgb3: Fix shift calc in build_phys_page_list() for 1-entry page lists
A single entry (addr 0x10001000, size 0x2000) will get converted to
page address 0x10000000 with a page size of 0x4000.  The code as it
stands doesn't address the single buffer case, but in fact it allows
the subsequent single-buffer special case to be eliminated entirely.
Because the mask now includes the (page adjusted) starting and ending
addresses, the general case works for the single buffer case as well.

Signed-off-by: Bryan Rosenburg <rosnbrg@us.ibm.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-25 16:00:29 -08:00
Roland Dreier b7f9c112a5 IB/mthca: Free correct MPT on error exit from mthca_fmr_alloc()
When mthca_fmr_alloc() returns an error, it should free the MPT at the
index key, not mr->ibmr.lkey, since the lkey has been mangled by
hw_index_to_key() and no longer is the real index.  This bug causes
corruption of the MPT table free bitmap when mthca_fmr_alloc() fails.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-19 10:42:50 -08:00
Pradeep Satyanarayana ec229e5e81 IPoIB/cm: Fix ipoib_cm_dev_stop() cleanup when drain times out
Commit efcd9971 ("IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list()")
introduced a bug in ipoib_cm_dev_stop() when the receive drain times
out.  In that case, the function moves all the pending rx stuff into a
private list but then calls ipoib_cm_free_rx_reap_list(), which
handles a different list.

Fix this by moving everything to the rx_reap_list that will actually
get freed up.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=906>.

Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-19 10:25:11 -08:00
Roland Dreier 51af33e8e4 RDMA/nes: Fix possible array overrun
In nes_create_qp(), the test

	if (nesqp->mmap_sq_db_index > NES_MAX_USER_WQ_REGIONS) {

is used to error out if the db_index is too large; however, if the
test doesn't trigger, then the index is used as

	nes_ucontext->mmap_nesqp[nesqp->mmap_sq_db_index] = nesqp;

and mmap_nesqp is declared as

	struct nes_qp      *mmap_nesqp[NES_MAX_USER_WQ_REGIONS];

which leads to an array overrun if the index is exactly equal to
NES_MAX_USER_WQ_REGIONS.  Fix this by bailing out if the index is
greater than or equal to NES_MAX_USER_WQ_REGIONS.

This was spotted by the Coverity checker (CID 2162).

Acked-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-18 10:33:59 -08:00
Chien Tung edd2fd643c RDMA/nes: Fix VLAN support
We need to account for the VLAN header size in nes_netdev_change_mtu()
and nes_netdev_init().  Also, add spin lock/unlock during VLAN RX
registration so only one process can assign VLAN group for a given
interface at a time.

Signed-off-by: Chien Tung <ctung@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-16 21:16:33 -08:00
Glenn Streiff 11e0704b7e RDMA/nes: Fix MAC interrupt erroneously masked on ifdown
Only mask out MAC interrupt if necessary and re-enable on ifup.  There
could be multiple netdevs going through the same MAC.  MAC interrupts
should not be masked off until the last netdev is downed.

Signed-off-by: Chien Tung <ctung@neteffect.com>
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-15 15:05:05 -08:00
Li Zefan c7482b81c8 IB: Fix return value in ib_device_register_sysfs()
If kobject_create_and_add() fails and returns NULL, the current code
in ib_device_register_sysfs() does not set ret and hence returns 0.
Set ret to -ENOMEM for this failure, so that the caller knows that
ib_device_register_sysfs() actually failed.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-15 15:05:05 -08:00
Sean Hefty ead595aeb0 RDMA/cma: Do not issue MRA if user rejects connection request
There's an undesirable interaction with issuing MRA requests to
increase connection timeouts and the listen backlog.

When the rdma_cm receives a connection request, it queues an MRA with
the ib_cm.  (The ib_cm will send an MRA if it receives a duplicate
REQ.)  The rdma_cm will then create a new rdma_cm_id and give that to
the user, which in this case is the rdma_user_cm.

If the listen backlog maintained in the rdma_user_cm is full, it
destroys the rdma_cm_id, which in turns destroys the ib_cm_id.  The
ib_cm_id generates a REJ because the state of the ib_cm_id has changed
to MRA sent, versus REQ received.  When the backlog is full, we just
want to drop the REQ so that it is retried later.

Fix this by deferring queuing the MRA until after the user of the
rdma_cm has examined the connection request.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-14 15:30:41 -08:00
Jack Morgenstein e6028c0e00 IB/mlx4: mlx4_ib_fmr_alloc() should call mlx4_fmr_enable()
Currently mlx4_ib_fmr_alloc() calls mlx4_mr_enable() instead of
mlx4_fmr_enable().  The two functions are equivalent at the moment, but 
this is not really correct (and the change is needed to fix a bug).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-14 10:39:36 -08:00
Eli Cohen a9d1884925 IPoIB: Remove unused struct ipoib_cm_tx.ibwc member
struct ipoib_cm_tx.ibwc is unused since commit 1b524963 ("IPoIB/cm:
Use common CQ for CM send completions"), so remove it.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
2008-02-14 10:30:50 -08:00
Jack Morgenstein 167c42655c IPoIB: On P_Key change event, reset state properly
In P_Key event handling, if the old P_Key is no longer available, the
driver must call ipoib_ib_dev_stop() -- just as it does when the P_Key
is still available (see procedure __ipoib_ib_dev_flush()).

When a P_Key becomes available, the driver will perform ipoib_open(),
which assumes that the QP is in RESET, the cm_id has been
destroyed/deleted, etc.  If ipoib_ib_dev_stop() is not called as
described above, then these assumptions will be false, and the attempt
to bring the interface up will fail.

Found by Mellanox QA.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-14 10:15:06 -08:00
Marcin Slusarz 5163dc1a64 IB/mthca: Convert to use be16_add_cpu()
replace:

	big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) +
						expression_in_cpu_byteorder);

with:

	beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder);

Generated with a semantic patch.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-13 07:47:47 -08:00
Steve Wise 8704e9a879 RDMA/cxgb3: Fail loopback connections
The cxgb3 HW and driver don't support loopback RDMA connections.  So
fail any connection attempt where the destination address is local.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-13 07:47:42 -08:00
Roland Dreier 7c7a9bccd2 IB/cm: Fix infiniband_cm class kobject ref counting
Commit 9af57b7a ("IB/cm: Add basic performance counters") introduced a
bug in how the reference count for cm_class.subsys.kobj was handled:
the path that released a device did a kobject_put() on that kobject, but
there was no kobject_get() in the path the handles adding a device.  So
the reference count ended up too low, which leads to bad things.  Fix up
and simplify the reference counting to avoid this.

(Actually, I introduced the bug when fixing the patch up to match some
of Greg's kobject changes, but who's counting)

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-12 14:38:27 -08:00
Roland Dreier ab64b96067 IB/cm: Remove debug printk()s that snuck upstream
Pesky little devils, sneaking around...

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-12 14:38:27 -08:00