Commit Graph

307 Commits

Author SHA1 Message Date
James Bottomley
622f9a8e7b Merge tag 'fcoe' into for-linus
A short series of fixes to libfc, libfcoe and fcoe.
Most patches fix formatting problems, one changes
the behavior of which discovered ports can/will be
logged into and another fixes a memory leak.
2013-07-13 08:22:56 +04:00
Robert Love
3a2926054a libfc: Differentiate echange timer cancellation debug statements
There are two debug statements with the same output string regarding
echange timer cancellation. This patch simply changes the output of
one string so that they can be differentiated.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2013-07-09 11:18:41 -07:00
Robert Love
4a80f083dd libfc: Remove extra space in fc_exch_timer_cancel definition
Simply remove an extra space that violates coding style.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2013-07-09 11:18:33 -07:00
Mark Rustad
e0335f67a2 libfc: Reject PLOGI from nodes with incompatible role
Reject a PLOGI from a node with an incompatible role,
that is, initiator-to-initiator or target-to-target.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Reviewed-by: Yi Zou <yi.zou@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-07-09 09:29:14 -07:00
Linus Torvalds
80cc38b163 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "The usual stuff from trivial tree"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
  treewide: relase -> release
  Documentation/cgroups/memory.txt: fix stat file documentation
  sysctl/net.txt: delete reference to obsolete 2.4.x kernel
  spinlock_api_smp.h: fix preprocessor comments
  treewide: Fix typo in printk
  doc: device tree: clarify stuff in usage-model.txt.
  open firmware: "/aliasas" -> "/aliases"
  md: bcache: Fixed a typo with the word 'arithmetic'
  irq/generic-chip: fix a few kernel-doc entries
  frv: Convert use of typedef ctl_table to struct ctl_table
  sgi: xpc: Convert use of typedef ctl_table to struct ctl_table
  doc: clk: Fix incorrect wording
  Documentation/arm/IXP4xx fix a typo
  Documentation/networking/ieee802154 fix a typo
  Documentation/DocBook/media/v4l fix a typo
  Documentation/video4linux/si476x.txt fix a typo
  Documentation/virtual/kvm/api.txt fix a typo
  Documentation/early-userspace/README fix a typo
  Documentation/video4linux/soc-camera.txt fix a typo
  lguest: fix CONFIG_PAE -> CONFIG_x86_PAE in comment
  ...
2013-07-04 11:40:58 -07:00
Geert Uytterhoeven
83a35e3604 treewide: relase -> release
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-06-28 14:34:33 +02:00
Neil Horman
fb00cc2353 libfc: extend ex_lock to protect all of fc_seq_send
This warning was reported recently:

WARNING: at drivers/scsi/libfc/fc_exch.c:478 fc_seq_send+0x14f/0x160 [libfc]()
(Not tainted)
Hardware name: ProLiant DL120 G7
Modules linked in: tcm_fc target_core_iblock target_core_file target_core_pscsi
target_core_mod configfs dm_round_robin dm_multipath 8021q garp stp llc bnx2fc
cnic uio fcoe libfcoe libfc scsi_transport_fc scsi_tgt autofs4 sunrpc
pcc_cpufreq ipv6 hpilo hpwdt e1000e microcode iTCO_wdt iTCO_vendor_support
serio_raw shpchp ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif pata_acpi
ata_generic ata_piix hpsa dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
scsi_wait_scan]
Pid: 5464, comm: target_completi Not tainted 2.6.32-272.el6.x86_64 #1
Call Trace:
 [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20
 [<ffffffffa025f7df>] ? fc_seq_send+0x14f/0x160 [libfc]
 [<ffffffffa035cbce>] ? ft_queue_status+0x16e/0x210 [tcm_fc]
 [<ffffffffa030a660>] ? target_complete_ok_work+0x0/0x4b0 [target_core_mod]
 [<ffffffffa030a766>] ? target_complete_ok_work+0x106/0x4b0 [target_core_mod]
 [<ffffffffa030a660>] ? target_complete_ok_work+0x0/0x4b0 [target_core_mod]
 [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0
 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0
 [<ffffffff81091d66>] ? kthread+0x96/0xa0
 [<ffffffff8100c14a>] ? child_rip+0xa/0x20
 [<ffffffff81091cd0>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20

It occurs because fc_seq_send can have multiple contexts executing within it at
the same time, and fc_seq_send doesn't consistently use the ep->ex_lock that
protects this structure.  Because of that, its possible for one context to clear
the INIT bit in the ep->esb_state field while another checks it, leading to the
above stack trace generated by the WARN_ON in the function.

We should probably undertake the effort to convert access to the fc_exch
structures to use rcu, but that a larger work item.  To just fix this specific
issue, we can just extend the ex_lock protection through the entire fc_seq_send
path

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Gris Ge <fge@redhat.com>
CC: Robert Love <robert.w.love@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-05-10 10:19:23 -07:00
Mark Rustad
732bdb9d14 libfc: Correct check for initiator role
The service_params field is being checked against the symbol
FC_RPORT_ROLE_FCP_INITIATOR where it really should be checked
against FCP_SPPF_INIT_FCN.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-05-10 10:19:22 -07:00
Robert Love
0807619d3c libfc, fcoe, bnx2fc: Split fc_disc_init into fc_disc_{init, config}
Split discovery initialization in code that is setup once (fcoe_disc_init)
and code that can be re-configured (fcoe_disc_config).

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
2013-03-25 16:03:03 -07:00
Robert Love
8a9a713812 libfc, fcoe, bnx2fc: Always use fcoe_disc_init for discovery layer initialization
Currently libfcoe is doing some libfc discovery layer initialization outside of
libfc. This patch moves this code into libfc and sets up a split in discovery
(one time) initialization code and (re-configurable) settings that will come in
the next patch.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
2013-03-25 16:01:10 -07:00
Krishna Mohan
a586069b0f libfc: XenServer fails to mount root filesystem.
schedule_delayed_work() is using msec instead of jiffies. On PLOGI
reject from target, remote port retry is scheduled @ 20 sec instead
of 2sec(FC_DEF_E_D_TOV).
XenServer dom0 kernel is configured with CONFIG_HZ=100Hz

Signed-off-by: Krishna Mohan <krmohan@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-02-15 09:33:27 -08:00
Robert Love
8e6c5363dc libfc, libfcoe, fcoe: Convert debug_logging macros to pr_info
Convert libfc, libfcoe and fcoe's debug_logging macros
to use pr_info() instead of printk(KERN_INFO, ...). checkpatch.pl
now complains about this, so convert libfcoe to preferred
method.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
2012-12-14 10:38:55 -08:00
Vasu Dev
5b97fabdc8 libfc: fix REC handling
Currently fc_fcp_timeout doesn't check FC_RP_FLAGS_REC_SUPPORTED
flag first, this prevents REC request ever going out at all
to the target having REC support. So this patches fixes the
fc_fcp_timeout by checking FC_RP_FLAGS_REC_SUPPORTED flag first.

The changed order won't cause any issue during clearing
FC_RP_FLAGS_REC_SUPPORTED on failed IO with target not supporting
FC_RP_FLAGS_REC_SUPPORTED, since retry on failed IO would succeed.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2012-12-04 10:16:38 -08:00
Yi Zou
3b64b18811 [SCSI] libfc: fix lun reset failure bugs in fc_fcp_resp handling of FCP_RSP_INFO
In LUN RESET testing involving NetApp targets, it is observed that LUN
RESET is failing. The fc_fcp_resp() is not completing the completion
for the LUN RESET task since fc_fcp_resp assumes that the FCP_RSP_INFO
is 8 bytes with the 4 byte reserved field, where in case of NetApp targets
the FCP_RSP to LUN RESET only has 4 bytes of FCP_RSP_INFO. This leads
fc_fcp_resp to error out w/o completing the task completion, eventually
causing LUN RESET to be escalated to host reset, which is not very nice.

Per FCP-3 r04, clause 9.5.15 and Table 23, the FCP_RSP_INFO field can be either
4 bytes or 8 bytes, with the last 4 bytes as "Reserved (if any)". Therefore it
is valid to have 4 bytes FCP_RSP_INFO like some of the NetApp targets behave.
Fixing this by validating the FCP_RSP_INFO against both the two spec allowed
length.

Reported-by: Frank Zhang <frank_1.zhang@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-10-07 11:52:55 +01:00
Yi Zou
a752359f2b [SCSI] libfc: fix sending REC after FCP_RESP is received
This is exposed in the case the FCP_DATA frames somehow got lost and fc_fcp got
the FCP_RSP, in fc_fcp_recv_resp(), since xfer_len is less than the expected_len
it resets the the timer to wait to 2 more jiffies in case the data frames are
already queued locally. However, for target does not support REC, it would just
send RJT w/ ELS_RJT_UNSUP. The rec response handler thus only clears the rport
flag for not doing REC later, but does not do fcp_io_complete() on the
associated fsp.

The fix is just check status of FCP_RSP being received already, i.e. using the
FC_SRB_RCV_STATUS flag, in fc_fcp_timeout before start sending REC. We should
have waited long enough if there is truely data frames queued locally.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:56 +01:00
Vasu Dev
ac166d2fbd [SCSI] libfc: fix retries with FDMI lport states
The FC-GS-3 sepc requires to wait for least 3 times R_A_TOV per
sec 4.6.1 "If the Requesting_CT does not receive a Response
CT_IU from the Responding_CT within three times R_A_TOV,
it shall consider this to be a protocol error."

This means added four new states with management server
could add significant delay with multiple retries
on default 12 second timeout(3 * R_A_TOV), so instead
just skip these states on very first timeout on any of
these states to not stuck with states for such longer
period.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:56 +01:00
Yi Zou
db95fc004e [SCSI] libfc: don't exch_done() on invalid sequence ptr
The lport_recv(), i.e., fc_lport_recv_req() may get called w/o the sequence ptr
being set in fr_seq(), particularly in the case of vn2vn mode, this may happen
if the passive fcp provider, e.g., tcm_fc, has not been registered yet.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:56 +01:00
Vasu Dev
b29a4f309f [SCSI] libfc: add exch timer debug info
Add exch timeout info to have debug log with exch timeout
value to match with retries, also add debug info
on exch timer cancel.

Added common fc_exch_timer_cancel() func and grouped this
along with fc_exch_timer_set() function, so that
added debug code is not repeated.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:55 +01:00
Vasu Dev
4e5fae7adb [SCSI] libfc: update fcp and exch stats
Updates newly added stats from fc_get_host_stats,
added new function fc_exch_update_stats to
update exches related stats from fc_exch.c
by going thru internal ema_list elements.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Acked-by : Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:31:48 +01:00
Vasu Dev
0f02a66528 [SCSI] libfc: adds FCP failures stats
Adds stats to track FCP pkt and frame alloc
failure.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Acked-by : Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:31:48 +01:00
Vasu Dev
1bd49b4820 [SCSI] libfc, fcoe, bnx2fc: cleanup fcoe_dev_stats
The libfc is used by fcoe but fcoe agnostic,
and therefore should not have any fcoe references.

So renaming fcoe_dev_stats from libfc as its for fc_stats.
After that libfc is fcoe string free except some strings for
Open-FCoE.org.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Acked-by : Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Acked-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:31:47 +01:00
James Bottomley
e346933365 Merge tag 'isci-for-3.5' into misc
isci update for 3.5

1/ Rework remote-node-context (RNC) handling for proper management of
   the silicon state machine in error handling and hot-plug conditions.
   Further details below, suffice to say if the RNC is mismanaged the
   silicon state machines may lock up.

2/ Refactor the initialization code to be reused for suspend/resume support

3/ Miscellaneous bug fixes to address discovery issues and hardware
   compatibility.

RNC rework details from Jeff Skirvin:

In the controller, devices as they appear on a SAS domain (or
direct-attached SATA devices) are represented by memory structures known
as "Remote Node Contexts" (RNCs).  These structures are transferred from
main memory to the controller using a set of register commands; these
commands include setting up the context ("posting"), removing the
context ("invalidating"), and commands to control the scheduling of
commands and connections to that remote device ("suspensions" and
"resumptions").  There is a similar path to control RNC scheduling from
the protocol engine, which interprets the results of command and data
transmission and reception.

In general, the controller chooses among non-suspended RNCs to find one
that has work requiring scheduling the transmission of command and data
frames to a target.  Likewise, when a target tries to return data back
to the initiator, the state of the RNC is used by the controller to
determine how to treat the incoming request. As an example, if the RNC
is in the state "TX/RX Suspended", incoming SSP connection requests from
the target will be rejected by the controller hardware.  When an RNC is
"TX Suspended", it will not be selected by the controller hardware to
start outgoing command or data operations (with certain priority-based
exceptions).

As mentioned above, there are two sources for management of the RNC
states: commands from driver software, and the result of transmission
and reception conditions of commands and data signaled by the controller
hardware.  As an example of the latter, if an outgoing SSP command ends
with a OPEN_REJECT(BAD_DESTINATION) status, the RNC state will
transition to the "TX Suspended" state, and this is signaled by the
controller hardware in the status to the completion of the pending
command as well as signaled in a controller hardware event.  Examples of
the former are included in the patch changelogs.

Driver software is required to suspend the RNC in a "TX/RX Suspended"
condition before any outstanding commands can be terminated.  Failure to
guarantee this can lead to a complete hardware hang condition.  Earlier
versions of the driver software did not guarantee that an RNC was
correctly managed before I/O termination, and so operated in an unsafe
way.

Further, the driver performed unnecessary contortions to preserve the
remote device command state and so was more complicated than it needed
to be.  A simplifying driver assumption is that once an I/O has entered
the error handler path without having completed in the target, the
requirement on the driver is that all use of the sas_task must end.
Beyond that, recovery of operation is dependent on libsas and other
components to reset, rediscover and reconfigure the device before normal
operation can restart.  In the driver, this simplifying assumption meant
that the RNC management could be reduced to entry into the suspended
state, terminating the targeted I/O request, and resuming the RNC as
needed for device-specific management such as an SSP Abort Task or LUN
Reset Management request.
2012-05-21 12:17:30 +01:00
Vasu Dev
061446a159 [SCSI] libfc: flush lport worker after its disabled
The lport could get timeout armed while its getting disabled,
so flush lport worker after its disabled and ignore lport
retry in that case instead of WARN_ON.

 [13192.936858] WARNING: at drivers/scsi/libfc/fc_lport.c:1573 fc_lport_timeout+0x53/0xa9 [libfc]()
 [13192.938026] Hardware name: Bochs
 [13192.938620] Modules linked in: fcoe libfcoe libfc scsi_transport_fc scsi_tgt fuse 8021q garp stp llc sunrpc ipv6 uinput microcode joydev pcspkr ixgbe e1000 i2c_piix4 i2c_core virtio_balloon dca mdio virtio_blk virtio_pci virtio_ring virtio floppy [last unloaded: speedstep_lib]
 [13192.942589] Pid: 23605, comm: kworker/0:6 Tainted: G        W    3.2.0+ #71
 [13192.943587] Call Trace:
 [13192.944052]  [<ffffffff810403f4>] warn_slowpath_common+0x85/0x9d
 [13192.944940]  [<ffffffff81040426>] warn_slowpath_null+0x1a/0x1c
 [13192.945734]  [<ffffffffa02746eb>] fc_lport_timeout+0x53/0xa9 [libfc]
 [13192.946665]  [<ffffffff81058d88>] process_one_work+0x20c/0x3ad
 [13192.947541]  [<ffffffff81058cbe>] ? process_one_work+0x142/0x3ad
 [13192.948423]  [<ffffffffa0274698>] ? fc_lport_enter_ns+0x178/0x178 [libfc]
 [13192.949363]  [<ffffffff8105a313>] worker_thread+0xfd/0x181
 [13192.950191]  [<ffffffff8105a216>] ? manage_workers.clone.15+0x173/0x173
 [13192.951100]  [<ffffffff8105e19b>] kthread+0xa4/0xac
 [13192.951755]  [<ffffffff814edbb4>] kernel_thread_helper+0x4/0x10
 [13192.952520]  [<ffffffff814e5cb4>] ? retint_restore_args+0x13/0x13
 [13192.953398]  [<ffffffff8105e0f7>] ? __init_kthread_worker+0x5b/0x5b
 [13192.954278]  [<ffffffff814edbb0>] ? gs_change+0x13/0x13
 [13192.954911] ---[ end trace 9763213b95bbd803 ]---

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-10 08:59:25 +01:00
Vasu Dev
93f90e5186 [SCSI] libfc: update mfs boundry checking
A previous commit changed the mfs checking to ensure the new
mfs is less or equal to the mfs supported by the FCF. This
doesn't work for BRDCM cards as they set an mfs of 2048 regardless
of whether the switch returns a larger mfs.

This patch validates the new mfs against the upper and lower spec
defined boundries for a FCoE mfs.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-04-25 08:46:29 +01:00
Linus Torvalds
a75ee6ecd4 Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
Pull SCSI updates from James Bottomley:
 "This is primarily another round of driver updates (lpfc, bfa, fcoe,
  ipr) plus a new ufshcd driver.  There shouldn't be anything
  controversial in here (The final deletion of scsi proc_ops which
  caused some build breakage has been held over until the next merge
  window to give us more time to stabilise it).

  I'm afraid, with me moving continents at exactly the wrong time,
  anything submitted after the merge window opened has been held over to
  the next merge window."

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (63 commits)
  [SCSI] ipr: Driver version 2.5.3
  [SCSI] ipr: Increase alignment boundary of command blocks
  [SCSI] ipr: Increase max concurrent oustanding commands
  [SCSI] ipr: Remove unnecessary memory barriers
  [SCSI] ipr: Remove unnecessary interrupt clearing on new adapters
  [SCSI] ipr: Fix target id allocation re-use problem
  [SCSI] atp870u, mpt2sas, qla4xxx use pci_dev->revision
  [SCSI] fcoe: Drop the rtnl_mutex before calling fcoe_ctlr_link_up
  [SCSI] bfa: Update the driver version to 3.0.23.0
  [SCSI] bfa: BSG and User interface fixes.
  [SCSI] bfa: Fix to avoid vport delete hang on request queue full scenario.
  [SCSI] bfa: Move service parameter programming logic into firmware.
  [SCSI] bfa: Revised Fabric Assigned Address(FAA) feature implementation.
  [SCSI] bfa: Flash controller IOC pll init fixes.
  [SCSI] bfa: Serialize the IOC hw semaphore unlock logic.
  [SCSI] bfa: Modify ISR to process pending completions
  [SCSI] bfa: Add fc host issue lip support
  [SCSI] mpt2sas: remove extraneous sas_log_info messages
  [SCSI] libfc: fcoe_transport_create fails in single-CPU environment
  [SCSI] fcoe: reduce contention for fcoe_rx_list lock [v2]
  ...
2012-03-31 13:31:23 -07:00