Commit Graph

222 Commits

Author SHA1 Message Date
Ralph Campbell 841adfca9c IPoIB/cm: Partial error clean up unmaps wrong address
If a page can't be allocated for the frag list of a skb, the code to
unmap the partially allocated list is off by one.  For exaple, if
'frags' equals one, i == 0, and the alloc_page() fails, then the old
loop would have unmapped mapping[1] which is uninitialized.  The same
would happen if the call to ib_dma_map_page() failed.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-02 20:48:31 -07:00
Roland Dreier 13ef5f44c3 IPoIB/cm: Remove dead definition of struct ipoib_cm_id
It's completely unused.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-21 13:39:08 -07:00
Michael S. Tsirkin 82c3aca6ad IPoIB/cm: Fix interoperability when MTU doesn't match
IPoIB connected mode currently rejects a connection request unless the
supported MTU is >= the local netdevice MTU. This breaks
interoperability with implementations that might have tweaked
IPOIB_CM_MTU, and there's real no longer a reason to do so: this test
is just a leftover from when we did not tweak MTU per-connection.  Fix
this by making the test as permissive as possible.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-21 13:38:08 -07:00
Michael S. Tsirkin 3ec7393a68 IPoIB/cm: Initialize RX before moving QP to RTR
Fix a crasher bug in IPoIB CM: once a QP is in the RTR state, a
receive completion (or even an asynchronous error) might be observed
on this QP, so we have to initialize all of our receive data
structures before moving to the RTR state.

As an optimization (since modify_qp might take a long time), the
jiffies update done when moving RX to the passive_ids list is also
left in place to reduce the chance of the RX being misdetected as
stale.

This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=662>.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-21 13:03:50 -07:00
Michael S. Tsirkin ec56dc0b7f IPoIB/cm: Fix performance regression on Mellanox
commit 518b1646 ("IPoIB/cm: Fix SRQ WR leak") introduced a severe
performance regression on Mellanox cards, because keeping a QP in the
error state for extended periods of time moves hardware to the slow
path (until the QP is destroyed).  For example, MPI latency goes from
~3 usecs to ~7 usecs.

Fix this by posting a send WR on one of the QPs that are being
flushed, instead of using a separate drain QP that is kept in the
error state.

This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=636>,
reported and bisected by Scott Weitzenkamp at Cisco and debugged by
Sasha Mikheev at Voltaire.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-29 16:07:09 -07:00
Michael S. Tsirkin 2dfbfc3712 IPoIB/cm: Drain cq in ipoib_cm_dev_stop()
Since NAPI polling is disabled while ipoib_cm_dev_stop() is running,
ipoib_cm_dev_stop() must poll the CQ itself in order to see the
packets draining.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-24 14:02:40 -07:00
Michael S. Tsirkin 8fd357a6e3 IPoIB/cm: Fix timeout check in ipoib_cm_dev_stop()
time_after() was used backwards, so the timeout occurred immediately.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-24 14:02:39 -07:00
Michael S. Tsirkin 518b1646f8 IPoIB/cm: Fix SRQ WR leak
SRQ WR leakage has been observed with IPoIB/CM: e.g. flipping ports on
and off will, with time, leak out all WRs and then all connections
will start getting RNR NAKs.  Fix this in the way suggested by spec:
move the QP being destroyed to the error state, wait for "Last WQE
Reached" event and then post WR on a "drain QP" connected to the same
CQ.  Once we observe a completion on the drain QP, it's safe to call
ib_destroy_qp.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-21 13:35:40 -07:00
Michael S. Tsirkin 24bd1e4e32 IB/ipoib: Fix typos in error messages
Trivial error message fixups.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-21 13:29:15 -07:00
Yosef Etigin 26bbf13ce1 IPoIB: Handle P_Key table reordering
SM reconfiguration or failover possibly causes a shuffling of the values
in the P_Key table. Right now, IPoIB only queries for the P_Key index
once when it creates the device QP, and hence there are problems if the
index of a P_Key value changes.  Fix this by using the PKEY_CHANGE event
to trigger a recheck of the P_Key index.

Signed-off-by: Yosef Etigin <yosefe@voltaire.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:54 -07:00
Michael S. Tsirkin 7c5b9ef857 IPoIB/cm: Optimize stale connection detection
In the presence of some running RX connections, we repeat
queue_delayed_work calls each 4 RX WRs, which is a waste.  It's enough
to start stale task when a first passive connection is added, and
rerun it every IPOIB_CM_RX_DELAY as long as there are outstanding
passive connections.

This removes some code from RX data path.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-14 14:11:01 -07:00
Randy Dunlap e63340ae6b header cleaning: don't include smp_lock.h when not used
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Linus Torvalds 972d45fb43 Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
  IPoIB: Convert to NAPI
  IB: Return "maybe missed event" hint from ib_req_notify_cq()
  IB: Add CQ comp_vector support
  IB/ipath: Fix a race condition when generating ACKs
  IB/ipath: Fix two more spin lock problems
  IB/fmr_pool: Add prefix to all printks
  IB/srp: Set proc_name
  IB/srp: Add orig_dgid sysfs attribute to scsi_host
  IPoIB/cm: Don't crash if remote side uses one QP for both directions
  RDMA/cxgb3: Support for new abort logic
  RDMA/cxgb3: Initialize cpu_idx field in cpl_close_listserv_req message
  RDMA/cxgb3: Fail qp creation if the requested max_inline is too large
  RDMA/cxgb3: Fix TERM codes
  IPoIB/cm: Fix error handling in ipoib_cm_dev_open()
  IB/ipath: Don't corrupt pending mmap list when unmapped objects are freed
  IB/mthca: Work around kernel QP starvation
  IB/ipath: Don't put QP in timeout queue if waiting to send
  IB/ipath: Don't call spin_lock_irq() from interrupt context
2007-05-07 12:18:21 -07:00
Roland Dreier 8d1cc86a62 IPoIB: Convert to NAPI
Convert the IP-over-InfiniBand network device driver over to using
NAPI to handle completions for the main CQ.  This covers all receives
as well as datagram mode sends; send completions for connected mode
connections are still handled from interrupt context.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06 21:18:11 -07:00
Michael S. Tsirkin f4fd0b224d IB: Add CQ comp_vector support
Add a num_comp_vectors member to struct ib_device and extend
ib_create_cq() to pass in a comp_vector parameter -- this parallels
the userspace libibverbs API.  Update all hardware drivers to set
num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector
value.  Pass the value of num_comp_vectors to userspace rather than
hard-coding a value of 1.

We want multiple CQ event vector support (via MSI-X or similar for
adapters that can generate multiple interrupts), but it's not clear
how many vectors we want, or how we want to deal with policy issues
such as how to decide which vector to use or how to set up interrupt
affinity.  This patch is useful for experimenting, since no core
changes will be necessary when updating a driver to support multiple
vectors, and we know that we want to make at least these changes
anyway.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06 21:18:11 -07:00
Roland Dreier b7f008fdc9 IB/srp: Set proc_name
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06 21:18:11 -07:00
Ishai Rabinovitz 3633b3d096 IB/srp: Add orig_dgid sysfs attribute to scsi_host
Add an orig_dgid attribute in sysfs for SRP scsi_hosts, so that
userspace can tell what the original dgid value written to the
add_target file was, even if the connection is redirected to a
different port while connecting.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06 21:18:11 -07:00
Michael S. Tsirkin d6ef7d68f6 IPoIB/cm: Don't crash if remote side uses one QP for both directions
The IPoIB CM spec allows the use of a single connection in both
active->passive and passive->active directions.  The current Linux
code uses one connection for both directions, but if another node only
uses one connection for both directions, we oops when we try to look
up the passive connection.  Fix by checking that qp_context is
non-NULL before dereferencing it.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
2007-05-06 21:18:11 -07:00
Linus Torvalds 4f7a307dc6 Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (87 commits)
  [SCSI] fusion: fix domain validation loops
  [SCSI] qla2xxx: fix regression on sparc64
  [SCSI] modalias for scsi devices
  [SCSI] sg: cap reserved_size values at max_sectors
  [SCSI] BusLogic: stop using check_region
  [SCSI] tgt: fix rdma transfer bugs
  [SCSI] aacraid: fix aacraid not finding device
  [SCSI] aacraid: Correct SMC products in aacraid.txt
  [SCSI] scsi_error.c: Add EH Start Unit retry
  [SCSI] aacraid: [Fastboot] Panics for AACRAID driver during 'insmod' for kexec test.
  [SCSI] ipr: Driver version to 2.3.2
  [SCSI] ipr: Faster sg list fetch
  [SCSI] ipr: Return better qc_issue errors
  [SCSI] ipr: Disrupt device error
  [SCSI] ipr: Improve async error logging level control
  [SCSI] ipr: PCI unblock config access fix
  [SCSI] ipr: Fix for oops following SATA request sense
  [SCSI] ipr: Log error for SAS dual path switch
  [SCSI] ipr: Enable logging of debug error data for all devices
  [SCSI] ipr: Add new PCI-E IDs to device table
  ...
2007-05-05 13:30:44 -07:00
Jean Delvare 6473d160b4 PCI: Cleanup the includes of <linux/pci.h>
I noticed that many source files include <linux/pci.h> while they do
not appear to need it. Here is an attempt to clean it all up.

In order to find all possibly affected files, I searched for all
files including <linux/pci.h> but without any other occurence of "pci"
or "PCI". I removed the include statement from all of these, then I
compiled an allmodconfig kernel on both i386 and x86_64 and fixed the
false positives manually.

My tests covered 66% of the affected files, so there could be false
positives remaining. Untested files are:

arch/alpha/kernel/err_common.c
arch/alpha/kernel/err_ev6.c
arch/alpha/kernel/err_ev7.c
arch/ia64/sn/kernel/huberror.c
arch/ia64/sn/kernel/xpnet.c
arch/m68knommu/kernel/dma.c
arch/mips/lib/iomap.c
arch/powerpc/platforms/pseries/ras.c
arch/ppc/8260_io/enet.c
arch/ppc/8260_io/fcc_enet.c
arch/ppc/8xx_io/enet.c
arch/ppc/syslib/ppc4xx_sgdma.c
arch/sh64/mach-cayman/iomap.c
arch/xtensa/kernel/xtensa_ksyms.c
arch/xtensa/platform-iss/setup.c
drivers/i2c/busses/i2c-at91.c
drivers/i2c/busses/i2c-mpc.c
drivers/media/video/saa711x.c
drivers/misc/hdpuftrs/hdpu_cpustate.c
drivers/misc/hdpuftrs/hdpu_nexus.c
drivers/net/au1000_eth.c
drivers/net/fec_8xx/fec_main.c
drivers/net/fec_8xx/fec_mii.c
drivers/net/fs_enet/fs_enet-main.c
drivers/net/fs_enet/mac-fcc.c
drivers/net/fs_enet/mac-fec.c
drivers/net/fs_enet/mac-scc.c
drivers/net/fs_enet/mii-bitbang.c
drivers/net/fs_enet/mii-fec.c
drivers/net/ibm_emac/ibm_emac_core.c
drivers/net/lasi_82596.c
drivers/parisc/hppb.c
drivers/sbus/sbus.c
drivers/video/g364fb.c
drivers/video/platinumfb.c
drivers/video/stifb.c
drivers/video/valkyriefb.c
include/asm-arm/arch-ixp4xx/dma.h
sound/oss/au1550_ac97.c

I would welcome test reports for these files. I am fine with removing
the untested files from the patch if the general opinion is that these
changes aren't safe. The tested part would still be nice to have.

Note that this patch depends on another header fixup patch I submitted
to LKML yesterday:
  [PATCH] scatterlist.h needs types.h
  http://lkml.org/lkml/2007/3/01/141

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:35 -07:00
Michael S. Tsirkin 347fcfbed2 IPoIB/cm: Fix error handling in ipoib_cm_dev_open()
If skb allocation fails when we start the device, we call
ipoib_cm_dev_stop() even though ipoib_cm_dev_open() did not run to
completion, so we pass an invalid pointer to ib_destroy_cm_id and get
an oops.

Fix by clearing cm.id on error, and testing it in cm_dev_stop().
This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=561>

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-30 17:30:28 -07:00
Linus Torvalds afc2e82c08 Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (49 commits)
  IB: Set class_dev->dev in core for nice device symlink
  IB/ehca: Implement modify_port
  IB/umad: Clarify documentation of transaction ID
  IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements
  IB/mad: Change SMI to use enums rather than magic return codes
  IB/umad: Implement GRH handling for sent/received MADs
  IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr
  IB/sa: Set src_path_bits correctly in ib_init_ah_from_path()
  IB/ucm: Simplify ib_ucm_event()
  RDMA/ucma: Simplify ucma_get_event()
  IB/mthca: Simplify CQ cleaning in mthca_free_qp()
  IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory
  IB/mthca: Update HCA firmware revisions
  IB/ipath: Fix WC format drift between user and kernel space
  IB/ipath: Check that a UD work request's address handle is valid
  IB/ipath: Remove duplicate stuff from ipath_verbs.h
  IB/ipath: Check reserved memory keys
  IB/ipath: Fix unit selection when all CPU affinity bits set
  IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times
  IB/ipath: Disable IB link earlier in shutdown sequence
  ...
2007-04-27 09:39:27 -07:00
Arnaldo Carvalho de Melo 459a98ed88 [SK_BUFF]: Introduce skb_reset_mac_header(skb)
For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple case, next will handle the slightly more
"complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:32 -07:00
Roland Dreier 37aebbde70 IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements
There are quite a few places in ipoib_cm.c where we know IRQs are
enabled because we do something that sleeps in the same function, so
we can convert several occurrences of spin_lock_irqsave() to a plain
spin_lock_irq().  This cleans up the source a little and makes the
code smaller too:

add/remove: 0/0 grow/shrink: 1/5 up/down: 3/-51 (-48)
function                                     old     new   delta
ipoib_cm_tx_reap                             403     406      +3
ipoib_cm_stale_task                          146     145      -1
ipoib_cm_dev_stop                            173     172      -1
ipoib_cm_tx_handler                          964     956      -8
ipoib_cm_rx_handler                          956     937     -19
ipoib_cm_skb_reap                            212     190     -22

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-24 21:30:37 -07:00
Sean Hefty 46f1b3d7af IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr
To support destinations that are not on the local IB subnet, IPoIB
should include the GRH information when constructing an address
handle.  Using the existing ib_init_ah_from_path() call will do this
for us.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
2007-04-24 16:31:12 -07:00