Commit Graph

1371 Commits

Author SHA1 Message Date
Anton Blanchard bf31a1a02e IB/ehca: Replace vmalloc() with kmalloc() for queue allocation
To improve performance of driver resource allocation, replace
vmalloc() calls with kmalloc().

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:40 -07:00
Linus Torvalds c98861f7de Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Don't overwrite fast registration page list when posting work request
  RDMA/cxgb3: Don't complete flushed send work requests twice
2009-05-13 16:31:12 -07:00
Roland Dreier 8be741b0ac Merge branches 'cxgb3' and 'mlx4' into for-linus
2009-05-13 15:16:17 -07:00
Al Viro 265e771e81 Fix deadlock in ipathfs ->get_sb()
forgot to unlock superblock before calling deactivate_super()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09 10:49:40 -04:00
Jack Morgenstein 2b6b7d4be4 IB/mlx4: Don't overwrite fast registration page list when posting work request
When posting a fast register work request, the low-level mlx4 driver
converted the page-list addresses to big-endian in place and set a
"present" bit.  This caused problems later when the consumer attempted
to unmap the pages using the page-list (whose addresses were assumed
to still be in CPU-endian order).  Fix the mlx4 driver to allocate two
buffers and use a private buffer for the hardware-format bus
addresses.
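
The two-buffer idea can be sketched in user-space C as below.  This is
only an illustration, not the mlx4 code: fill_hw_page_list(),
to_be64() and SKETCH_PRESENT_BIT are hypothetical names, and the value
of the "present" bit is an assumption.  The point is that the
consumer's CPU-endian list is read but never written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag; the real bit used by the hardware may differ. */
#define SKETCH_PRESENT_BIT 1ull

/* Return a value whose in-memory byte layout is v in big-endian order. */
static uint64_t to_be64(uint64_t v)
{
	unsigned char b[8];
	uint64_t out;
	int i;

	for (i = 0; i < 8; i++)
		b[i] = (unsigned char)(v >> (56 - 8 * i));
	memcpy(&out, b, sizeof(out));
	return out;
}

/*
 * Build the hardware-format list in a private buffer (hw_list) and
 * leave the consumer's CPU-endian list (cpu_list) untouched, so it can
 * still be used later to unmap the pages.
 */
static void fill_hw_page_list(uint64_t *hw_list, const uint64_t *cpu_list,
			      size_t n)
{
	size_t i;

	assert(hw_list != NULL && cpu_list != NULL);
	for (i = 0; i < n; i++)
		hw_list[i] = to_be64(cpu_list[i] | SKETCH_PRESENT_BIT);
}
```

Declaring the source list const lets the compiler reject any attempt
to convert it in place, which is the failure described above.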

This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1571>,
an NFS/RDMA server crash.  The cause of the crash was found by Vu Pham
of Mellanox.  The fix is along the lines suggested by Steve Wise in
comment #21 in bug 1571.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-07 21:35:13 -07:00
Steve Wise ec6995ddaa RDMA/cxgb3: Don't complete flushed send work requests twice
When the SQ is flushed, mark the flushed entries as not signaled so
the poll logic doesn't re-insert the CQ entry thinking it's an
out-of-order completion.

The bug can cause the NFS/RDMA server to crash due to processing the
same completed work request twice.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-29 15:15:59 -07:00
Roland Dreier 9308f96c79 Merge branches 'cxgb3', 'ipoib', 'mthca', 'mlx4' and 'nes' into for-linus
2009-04-28 16:01:31 -07:00
Chien Tung 26cc5e57bb RDMA/nes: Update iw_nes version
Update version number to 1.5.0.0

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:46:29 -07:00
Faisal Latif 9256b25130 RDMA/nes: Fix error path in nes_accept()
If reg_phys_mem() fails, we need to free memory allocated for MPA
frame with private data before returning the error. Also move
nes_add_ref() after the reg_phys_mem() is successful.
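
The shape of such an error path can be sketched in plain C as below.
This is not the nes_accept() code: struct conn, register_mem() and
accept_conn() are hypothetical stand-ins, with malloc()/free() in
place of the kernel allocators.  It shows the two points made above:
free the frame when registration fails, and take the reference only
after registration has succeeded.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical connection state, for illustration only. */
struct conn {
	int refs;
	void *mpa_frame;
};

/* Stand-in for a registration step that may fail. */
static int register_mem(int should_fail)
{
	return should_fail ? -EINVAL : 0;
}

static int accept_conn(struct conn *c, size_t frame_len, int fail_reg)
{
	int ret;

	c->mpa_frame = malloc(frame_len);
	if (!c->mpa_frame)
		return -ENOMEM;

	ret = register_mem(fail_reg);
	if (ret)
		goto err_free_frame;

	/* Take the reference only once nothing else can fail. */
	c->refs++;
	return 0;

err_free_frame:
	free(c->mpa_frame);
	c->mpa_frame = NULL;
	return ret;
}
```

Because the reference is taken last, the failure path has no
reference to drop and only the allocation needs to be unwound.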

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:45:19 -07:00
Faisal Latif 109d67e4f1 RDMA/nes: Fix hang issues for large cluster dynamic connections
When running a large cluster setup, we see hangs after many hours of
testing.  Fixing this required going over the code and making sure the
rexmit entry is properly removed based on the cm_node's state and the
packet received.  Also, when receiving a FIN packet, check the seq# and
make sure there were no errors before calling handle_fin().

Following are the changes done in nes_cm.c:

* handle_ack_pkt() needs to return error value, so in case of error,
  handle_fin() is not called. Some cleanup done while going over the code.

* handle_rst_pkt(), handling of cm_node's NES_CM_STATE_LAST_ACK is missing.

* process_packet(), when a FIN-only packet is received, call
  check_seq() before processing.

* in handle_fin_pkt(), we are calling cleanup_retrans_entry() for all
  conditions, even if the packets need to be dropped.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:41:06 -07:00
Faisal Latif 4e9c390036 RDMA/nes: Increase rexmit timeout interval
Under heavy load with large cluster testing, it may take longer to
receive a response to MPA requests.  Change the driver to wait longer
after each rexmit, up to a maximum time value.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:39:36 -07:00
Faisal Latif c11470f9f4 RDMA/nes: Check for sequence number wrap-around
check_seq() was not checking if the seq#s have wrapped.  Fix it.
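
A common way to compare 32-bit sequence numbers across a wrap is to
look at the sign of their difference, the same idiom as the before()
and after() helpers in the kernel's TCP code.  The sketch below uses
that idiom; seq_before() and seq_in_window() are hypothetical helpers,
and this is not necessarily how check_seq() was changed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if a comes before b in sequence space.  The subtraction is done
 * modulo 2^32, so the result stays correct when the numbers wrap, as
 * long as a and b are less than 2^31 apart.  Assumes the usual
 * two's-complement conversion to int32_t.
 */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* True if seq lies in [start, start + len), even if the window wraps. */
static bool seq_in_window(uint32_t seq, uint32_t start, uint32_t len)
{
	return (uint32_t)(seq - start) < len;
}
```

A plain "a < b" comparison gets the wrapped case wrong: 0xfffffff0
comes before 0x10 in sequence space even though it is numerically
larger.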

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:38:31 -07:00
Faisal Latif 53094c388f RDMA/nes: Do not set apbvt entry for loopback
When a connect request comes, apbvt should only be set for
non-loopback connections.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:37:34 -07:00
Chien Tung 1f0dba1e51 RDMA/nes: Fix unused variable compile warning when INFINIBAND_NES_DEBUG=n
Remove the NES_DEBUG that is causing the compile warning about an
unused variable when INFINIBAND_NES_DEBUG is not enabled.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:36:03 -07:00
Chien Tung 0e4562da9e RDMA/nes: Fix fw_ver in /sys
/sys/class/infiniband/nes?/fw_ver is not displaying firmware version
properly (it shows 0.0.0 with the current code).  Fill in the correct
firmware version number.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:33:48 -07:00
Chien Tung 923223776b RDMA/nes: Set trace length to 1 inch for SFP_D
With updated PHY firmware for SFP_D, setting the trace length to 1
inch for SFP_D provides a more stable link.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:30:35 -07:00
Chien Tung e998c25bc2 RDMA/nes: Enable repause timer for port 1
Enable repause timer for port 1.  Without this setting, under stress,
the chip may misbehave.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:29:42 -07:00
Chien Tung 366835e249 RDMA/nes: Correct CDR loop filter setting for port 1
In commit 1b949324 ("RDMA/nes: Fix SFP+ PHY initialization") there is
a mistake in the clean up code that removed port 1 CDR loop filter
settings for 10G cards other than CX4.  Put the correct setting back
for appropriate PHY types.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:28:41 -07:00
Chien Tung 010db4d127 RDMA/nes: Modify thermo mitigation to flip SerDes1 ref clk to internal
Change thermo mitigation code to flip the SerDes1 reference clock to
internal, to match the change in commit a4849fc1 ("RDMA/nes: Add
wide_ppm_offset parm for switch compatibility").

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:27:21 -07:00
Miroslaw Walukiewicz 5d1af5c832 RDMA/nes: Fix resource issues in nes_create_cq() and nes_destroy_cq()
In error paths where a CQ is not created, pbl is not freed properly.

In nes_destroy_cq(), add the corresponding check for nescq->mcrqf to
not call nes_free_resource() when it is already done in nes_create_cq().

Signed-off-by: Miroslaw Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-21 16:16:48 -07:00
Matt Kraai cc005fa20c RDMA/nes: Remove root_256()'s unused pbl_count_256 parameter
Signed-off-by: Matt Kraai <kraai@ftbfs.org>
Acked-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-21 10:43:21 -07:00
Jack Morgenstein 8531f1f14a IB/mthca: Fix timeout for INIT_HCA and a few other commands
Commands INIT_HCA, CLOSE_HCA, SYS_EN, SYS_DIS, and CLOSE_IB all have
1-second timeouts.  For INIT_HCA this causes problems when more than
2^18 QPs are configured, since the command takes more than 1 second
to complete.

All other commands have 60-second timeouts.  This patch makes the
above commands consistent with the rest of the commands (and with the
chip documentation).

This patch is an expansion of a patch from Arthur Kepner
<akepner@sgi.com> fixing just the INIT_HCA timeout.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 21:12:25 -07:00
Steve Wise cde9e2f930 RDMA/cxgb3: Don't zero QP attrs when moving to IDLE
QP attributes must stay initialized when moving back to IDLE.  Zeroing
them will crash the system in _flush_qp() if the QP is subsequently
moved to ERROR and back to IDLE.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 17:00:53 -07:00
Don Wood 3f32eb1185 RDMA/nes: Fix bugs in nes_reg_phys_mr()
The code incorrectly failed memory registration if the buffer was not
page aligned.  Also, the length field was mangled, causing the hardware
to think the registration was much larger than it really was.

The fix is to remove the page alignment restriction as well as the
incorrect length adjustment.  Also make sure that all buffers after
the first start at a page boundary, and all buffers except the last
end on a page boundary.
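
The boundary rule above can be sketched as a standalone check.  This
is an illustration rather than the nes_reg_phys_mr() code: struct
phys_buf, phys_bufs_valid() and the 4096-byte SKETCH_PAGE_SIZE are
assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the example */

/* Hypothetical physical buffer descriptor: bus address and length. */
struct phys_buf {
	uint64_t addr;
	uint64_t size;
};

/*
 * Accept a buffer list only if every buffer after the first starts on
 * a page boundary and every buffer except the last ends on one.  The
 * first buffer may start, and the last may end, anywhere in a page.
 */
static bool phys_bufs_valid(const struct phys_buf *bufs, size_t n)
{
	size_t i;

	if (n == 0)
		return false;

	for (i = 0; i < n; i++) {
		uint64_t end = bufs[i].addr + bufs[i].size;

		if (bufs[i].size == 0)
			return false;
		if (i > 0 && bufs[i].addr % SKETCH_PAGE_SIZE != 0)
			return false;
		if (i + 1 < n && end % SKETCH_PAGE_SIZE != 0)
			return false;
	}
	return true;
}
```

With this rule a single buffer is always acceptable regardless of
alignment, which matches removing the page alignment restriction.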

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 14:53:00 -07:00
Chien Tung 1af9222b52 RDMA/nes: Fix compiler warning at nes_verbs.c:1955
Initialize pbl_count_256 to 0 to get rid of the warning:

    drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_reg_mr':
    drivers/infiniband/hw/nes/nes_verbs.c:1955: warning: 'pbl_count_256' may be used uninitialized in this function

Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 14:50:36 -07:00