Normally, in the HBA reset path, the MPT driver flushes the existing work on
its work queue (mpt/0). From the MPT driver's point of view this is a dummy
activity: HBA reset turns off work queue events, so each work item simply
returns without doing anything.
The exception is work that is already halfway done, which has to run to
completion.
In that case we get stuck forever in a deadlock between the SCSI midlayer
and the MPT driver. sd_sync_cache() waits forever because the HBA is not in
the Running state, and the HBA never reaches the Running state because
sd_sync_cache() was called from the HBA reset context.
The new code no longer waits for half-finished work to complete
before returning from HBA reset.
Once we are out of HBA reset, the EH thread changes the host state from
recovery to running, and the work waiting for the HBA's running state finishes.
The new code turns firmware events back on from another special work item
called Rescan topology.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The driver now returns DID_NO_CONNECT for all pending I/O
requests on the SAS bus type if it finds that the target has been
removed at the firmware level.
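A minimal user-space sketch of the idea, not the driver code: struct
pending_cmd and fail_pending_for_target() are hypothetical stand-ins.
The only detail taken from the SCSI midlayer is that the host byte
occupies bits 16..23 of the result and that DID_NO_CONNECT is 0x01.

```c
#include <stddef.h>
#include <stdint.h>

/* Host byte value the Linux SCSI midlayer uses for "no connection". */
#define DID_NO_CONNECT 0x01

/* Hypothetical stand-in for an outstanding SCSI command. */
struct pending_cmd {
    unsigned int target_id;   /* target the command was issued to */
    uint32_t     result;      /* host byte lives in bits 16..23 */
    int          completed;   /* nonzero once the command is finished */
};

/*
 * Complete every outstanding command addressed to a removed target with
 * DID_NO_CONNECT. Returns the number of commands completed this way.
 */
size_t fail_pending_for_target(struct pending_cmd *cmds, size_t n,
                               unsigned int removed_target)
{
    size_t done = 0;

    for (size_t i = 0; i < n; i++) {
        if (cmds[i].completed || cmds[i].target_id != removed_target)
            continue;
        cmds[i].result = (uint32_t)DID_NO_CONNECT << 16;
        cmds[i].completed = 1;
        done++;
    }
    return done;
}
```

Commands for other targets are left alone, so only I/O to the removed
device is failed back to the midlayer.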
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch fixes DMA operation on PAE kernels.
On a PAE system a DMA address (dma_addr) can be wider than
unsigned long, so casting it through unsigned long changes its value.
dma_addr is no longer cast to unsigned long.
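The truncation can be modelled in user space. In this sketch the two
typedef names are hypothetical; they only mimic the widths found on a
32-bit PAE kernel (64-bit DMA address, 32-bit unsigned long).

```c
#include <stdint.h>

/* Width model for a 32-bit PAE kernel. Both names are hypothetical. */
typedef uint64_t model_dma_addr_t;   /* DMA address: 64 bits */
typedef uint32_t model_ulong_t;      /* unsigned long: 32 bits */

/* Buggy pattern: routing the address through the narrower type. */
model_dma_addr_t via_ulong_cast(model_dma_addr_t addr)
{
    return (model_dma_addr_t)(model_ulong_t)addr;
}

/* Fixed pattern: keep the full-width type from end to end. */
model_dma_addr_t without_cast(model_dma_addr_t addr)
{
    return addr;
}
```

Any buffer mapped above 4 GB loses its upper bits in the first
function, which is the failure this patch avoids.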
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
On big-endian systems the kernel crashes because address translation
is not handled properly.
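This message does not say which field was mishandled. As a general
illustration of this class of bug, and assuming the address arrives as
little-endian bytes from the firmware, the sketch below decodes it the
same way on any host. get_le32() and get_le64_addr() are hypothetical
helpers, not driver functions.

```c
#include <stdint.h>

/*
 * Decode a 32-bit little-endian field byte by byte, so the result does
 * not depend on the byte order of the host CPU.
 */
uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Combine two little-endian 32-bit halves (low word first) into 64 bits. */
uint64_t get_le64_addr(const uint8_t *p)
{
    return (uint64_t)get_le32(p) | ((uint64_t)get_le32(p + 4) << 32);
}
```

Reading such a field as a native integer gives the right answer only on
little-endian hosts, which is why bugs of this kind show up only on
big-endian machines.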
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
In fabric login, the firmware uses iop BIT_8 to signal the presence of
an FCP2 device, which is not necessarily a TAPE device. So instead of
setting the FCF_TAPE_PRESENT flag there, we set it from
scsi_device->type once the mid-layer scan has recognised the type of
the device.
This patch also adds a new flag, FCF_FCP2_DEVICE, for future use.
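A rough sketch of the two decision points, under stated assumptions:
struct fcport_model and both functions are hypothetical, and the flag
bit positions are illustrative rather than the driver's real values.
TYPE_TAPE (0x01) is the standard SCSI peripheral device type for
sequential-access devices.

```c
#include <stdint.h>

#define TYPE_TAPE 0x01   /* SCSI peripheral device type: sequential access */

/* Bit positions are illustrative only, not the driver's definitions. */
#define FCF_TAPE_PRESENT (1u << 0)
#define FCF_FCP2_DEVICE  (1u << 1)

/* Hypothetical stand-in for the driver's per-port structure. */
struct fcport_model {
    uint32_t flags;
};

/* Fabric login: the firmware only tells us the device speaks FCP-2. */
void on_fabric_login(struct fcport_model *fcport, int fw_reports_fcp2)
{
    if (fw_reports_fcp2)
        fcport->flags |= FCF_FCP2_DEVICE;
}

/* After the mid-layer scan: the real device type is known, so decide here. */
void on_device_scanned(struct fcport_model *fcport, int sdev_type)
{
    if (sdev_type == TYPE_TAPE)
        fcport->flags |= FCF_TAPE_PRESENT;
}
```

With this split, an FCP-2 disk gets FCF_FCP2_DEVICE but never
FCF_TAPE_PRESENT.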
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This fixes one cause of an occasional problem when unloading
libfc where the exchange manager pool doesn't have all items freed.
The existing WARN_ON(mp->total_exches <= 0) isn't hit.
However, note that total_exches is decremented when the
exchange is completed, and it can be held with a refcnt
for a while after that.
I'm not sure what the offending exchange is, but I suspect
it is an incoming request, because outgoing state machines
should be all stopped at this point.
Note that although receive is stopped before the exchange
manager is freed, there could still be active threads
handling received frames.
This patch flushes the queues by allocating a new skb,
sending it through, and having the thread handle
this new skb specially. This is similar to the way the work
queues are flushed now by putting work items in them and waiting
until they make it through the queue.
An skb->destructor function is used to inform us of
the completion of the flush, and the fr_dev() is left
NULL to indicate to fcoe_percpu_receive_thread() that
the skb should be just freed. There's already a check
for the lp being NULL which prints a message.
We skip printing the message if the destructor is for flushing.
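The flush technique can be sketched in user space as follows. This is a
single-threaded model, not the fcoe code: struct buf stands in for an
skb, the dev field for fr_dev(), and the flush_done flag for whatever
completion mechanism the real code waits on. All names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for an skb queued to the receive thread. */
struct buf {
    void *dev;                          /* NULL marks a flush buffer */
    void (*destructor)(struct buf *);   /* runs when the buffer is freed */
    struct buf *next;
};

struct rx_queue {
    struct buf *head, *tail;
    int frames_handled;   /* normal frames processed */
    int warnings;         /* "no device" messages that would be printed */
};

int flush_done;   /* stands in for a completion the flusher waits on */

/* Destructor attached only to the flush buffer. */
void flush_destructor(struct buf *b)
{
    (void)b;
    flush_done = 1;
}

void enqueue(struct rx_queue *q, struct buf *b)
{
    b->next = NULL;
    if (q->tail)
        q->tail->next = b;
    else
        q->head = b;
    q->tail = b;
}

/* "Free" a buffer: here that only means running its destructor. */
void free_buf(struct buf *b)
{
    if (b->destructor)
        b->destructor(b);
}

/* One pass of the receive thread over everything queued so far. */
void drain(struct rx_queue *q)
{
    struct buf *b;

    while ((b = q->head) != NULL) {
        q->head = b->next;
        if (!q->head)
            q->tail = NULL;
        if (!b->dev) {
            /* Stay quiet when the buffer is just the flush marker. */
            if (b->destructor != flush_destructor)
                q->warnings++;
            free_buf(b);
            continue;
        }
        q->frames_handled++;
        free_buf(b);
    }
}
```

Because the marker is queued behind every earlier frame, its destructor
runs only after those frames have been handled.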
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
I saw an lport debug message from the exchange manager saying:
"lport 70500: Received response for out of range oxid:ffff"
A trace showed this was a BA_RJT sent due to an incoming ABTS
which arrived on an unknown exchange. So, the sender of the
BA_RJT was in error, but in this case, both the initiator and
responder were the same machine.
The OX_ID and RX_ID should not have been reversed in this case.
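As a sketch of the intended header handling (struct fc_hdr_model and
fill_ba_rjt_hdr() are hypothetical, and only four header fields are
modelled): a reply swaps the source and destination addresses, but
OX_ID always names the originator's exchange and RX_ID the responder's,
so both are copied through unchanged.

```c
#include <stdint.h>

/* Hypothetical, reduced model of a Fibre Channel frame header. */
struct fc_hdr_model {
    uint32_t s_id;    /* source address */
    uint32_t d_id;    /* destination address */
    uint16_t ox_id;   /* exchange ID assigned by the originator */
    uint16_t rx_id;   /* exchange ID assigned by the responder */
};

/* Build the header of a BA_RJT answering the received frame rx. */
void fill_ba_rjt_hdr(const struct fc_hdr_model *rx, struct fc_hdr_model *tx)
{
    /* Addresses are swapped: the reply goes back to the sender. */
    tx->s_id = rx->d_id;
    tx->d_id = rx->s_id;

    /* Exchange IDs keep their fields; they are not swapped. */
    tx->ox_id = rx->ox_id;
    tx->rx_id = rx->rx_id;
}
```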
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When an RSCN indicates changes to individual remote ports,
don't blindly log them out and then back in. Instead, determine
whether they're still in the directory, by doing GPN_ID.
If that is successful, call login, which will send ADISC and reverify;
otherwise, call logoff. Perhaps we should just delete the rport rather
than send LOGO, but sending it seems safer.
Also, fix a possible issue where, if a mix of records in the RSCN
causes us to queue disc_ports for disc_single and we then decide
to do full rediscovery, we leak the memory for the queued disc_ports.
So, go through the list of disc_ports even if doing full discovery.
Free the disc_ports in any case. If any of the disc_single() calls
returns an error, do a full discovery.
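The list handling described above can be sketched like this. It is a
user-space model with hypothetical names: struct disc_port is reduced
to an FC_ID and a link, the query hook stands in for disc_single(), and
ports_freed exists only so the behaviour can be checked.

```c
#include <stdlib.h>

/* Hypothetical, reduced model of a queued single-port discovery record. */
struct disc_port {
    unsigned int fc_id;
    struct disc_port *next;
};

/* Hypothetical hook that starts a GPN_ID query; returns 0 on success. */
typedef int (*disc_single_fn)(unsigned int fc_id);

enum rport_action { RPORT_LOGIN, RPORT_LOGOFF };

int ports_freed;   /* counts freed records so the sketch can be checked */

/* Push a record on the front of the list. */
struct disc_port *push_port(struct disc_port *head, unsigned int fc_id)
{
    struct disc_port *dp = malloc(sizeof(*dp));

    if (!dp)
        return head;
    dp->fc_id = fc_id;
    dp->next = head;
    return dp;
}

/* GPN_ID outcome: still in the directory means reverify, else log off. */
enum rport_action gpn_id_done(int found_in_directory)
{
    return found_in_directory ? RPORT_LOGIN : RPORT_LOGOFF;
}

/*
 * Walk the list even when full discovery was already requested, so every
 * record is freed. Returns nonzero if full discovery should run.
 */
int process_rscn_ports(struct disc_port *list, int full_requested,
                       disc_single_fn disc_single)
{
    int need_full = full_requested;

    while (list) {
        struct disc_port *dp = list;

        list = dp->next;
        if (!need_full && disc_single(dp->fc_id) != 0)
            need_full = 1;   /* one failure falls back to full discovery */
        free(dp);
        ports_freed++;
    }
    return need_full;
}

/* Two stub queries used to exercise the sketch. */
int query_always_ok(unsigned int fc_id) { (void)fc_id; return 0; }
int query_fails_for_2(unsigned int fc_id) { return fc_id == 2 ? -1 : 0; }
```

Once full discovery is needed, the remaining ports are not queried
individually, but their records are still freed.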
The ability to fill in GPN_ID requests was added to fc_ct_fill().
For this, it needs the FC_ID to be passed in as an arg.
The did parameter for fc_elsct_send() is used for that, since the
actual D_DID will always be 0xfffffc for all CT requests so far.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The local port facility has been replying to ADISC requests without
looking to see if the remote port is logged in. This is incorrect.
An ADISC request requires PLOGI first. It should be rejected if
the sending remote port is not logged in.
This is like other incoming requests that require login, all of
which should be handled in the remote port module.
Move the ADISC request handling from fc_lport.c to fc_rport.c.
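The check itself is small. A sketch with hypothetical names
(struct rport_model, recv_adisc()), where a NULL pointer models a
sender for which no remote port is known:

```c
#include <stddef.h>

enum els_reply { REPLY_LS_ACC, REPLY_LS_RJT };

/* Hypothetical, reduced model of a remote port. */
struct rport_model {
    int logged_in;   /* nonzero once PLOGI has completed */
};

/*
 * Decide how to answer an incoming ADISC. rport is NULL when the sender
 * is not a known remote port.
 */
enum els_reply recv_adisc(const struct rport_model *rport)
{
    if (!rport || !rport->logged_in)
        return REPLY_LS_RJT;   /* ADISC requires a prior PLOGI */
    return REPLY_LS_ACC;
}
```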
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When rport_login is called on an rport that is already thought
to be logged in, use ADISC. If that fails, redo PLOGI.
This is less disruptive after fabric changes that don't affect
the state of the target.
Implement the sending of ADISC via fc_els_fill.
Add ADISC state to the rport state machine. This is entered from READY
and returns to READY after successful completion. If it fails, the rport
is either logged off and deleted or re-does PLOGI.
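The new transitions can be sketched as a small state function. The
state and result names are hypothetical, and the mapping of failures is
an assumption: this message does not say which failures lead to PLOGI
and which to deletion, so the sketch sends a rejected ADISC back to
PLOGI and deletes the rport when retries are exhausted.

```c
/* Hypothetical, reduced set of rport states. */
enum rport_state { ST_INIT, ST_PLOGI, ST_READY, ST_ADISC, ST_DELETE };

/* Hypothetical outcomes of an ADISC exchange. */
enum adisc_result { ADISC_ACCEPTED, ADISC_REJECTED, ADISC_GAVE_UP };

/* Login request: reverify with ADISC if already READY, else start PLOGI. */
enum rport_state rport_login(enum rport_state s)
{
    return s == ST_READY ? ST_ADISC : ST_PLOGI;
}

/* ADISC completion. Only meaningful while in the ADISC state. */
enum rport_state adisc_done(enum rport_state s, enum adisc_result r)
{
    if (s != ST_ADISC)
        return s;   /* stale response: ignore */

    switch (r) {
    case ADISC_ACCEPTED:
        return ST_READY;    /* target state unchanged, nothing disrupted */
    case ADISC_REJECTED:
        return ST_PLOGI;    /* assumption: a reject triggers a fresh PLOGI */
    case ADISC_GAVE_UP:
        break;              /* assumption: exhausted retries delete the rport */
    }
    return ST_DELETE;
}
```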
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
fc_rport_logo_resp() had a call to fc_rport_enter_rtv() if the
LOGO was accepted. This must've been a copy/paste mistake, but
it didn't matter since we don't stay in the LOGO state long enough
to hit this code.
Change fc_rport_logo_resp() to just enter the delete state
no matter what.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
After a quick link flap, a target was seen to send us a LOGO.
Apparently, it saw an RSCN reporting that we had dropped out of the
fabric after we had logged back into it.
This is likely in larger fabrics (more than 2 FC switches) after
a quick link flap at the initiator. Each link transition causes
a port-specific RSCN to the target. After the link comes back up,
the initiator successfully discovers and does a PLOGI to the target
before the target sees the first RSCN reporting the initiator is gone,
and it sends a LOGO. The target may see a subsequent RSCN saying the
port is back, but probably wouldn't send a PLOGI and leaves it
up to the initiator to re-login.
An RSCN can be delayed by the switches due to software layers but a
PLOGI is forwarded in hardware causing the PLOGI to beat the RSCN.
If a remote port is in the discovered set and sends a LOGO, re-login to it.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When receiving an ELS request, if the request isn't recognized,
the unsupported operation error should be given even if the port
is not found or not logged in.
Also, the LOGO request shouldn't give the login-required explanation.
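The ordering of the checks can be sketched as follows. All names are
hypothetical, and the handling of LOGO and PLOGI is an assumption:
the sketch passes both on without a login check, since this message
only says that LOGO must not be answered with the login-required
explanation.

```c
/* Hypothetical, reduced set of ELS request types. */
enum els_op { ELS_OP_PLOGI, ELS_OP_PRLI, ELS_OP_ADISC, ELS_OP_LOGO,
              ELS_OP_UNKNOWN };

enum els_verdict { ELS_HANDLE, ELS_REJECT_UNSUPPORTED,
                   ELS_REJECT_LOGIN_REQUIRED };

/*
 * Classify an incoming ELS request. The request type is examined first,
 * so an unrecognized request is always reported as unsupported, whatever
 * the login state of the sender.
 */
enum els_verdict classify_els(enum els_op op, int port_logged_in)
{
    switch (op) {
    case ELS_OP_PLOGI:
    case ELS_OP_LOGO:
        /* Assumption: neither is subject to the login-required check. */
        return ELS_HANDLE;
    case ELS_OP_PRLI:
    case ELS_OP_ADISC:
        return port_logged_in ? ELS_HANDLE : ELS_REJECT_LOGIN_REQUIRED;
    default:
        return ELS_REJECT_UNSUPPORTED;
    }
}
```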
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
libfc receives PLOGIs from switches which are trying to discover what
kind of devices are present, and from other initiators to find out
if we're a target.
As an initiator, some argue we don't need to handle incoming PLOGI
requests, and we currently reject them from unknown remote ports,
but accept them if we're in the middle of a PLOGI to the remote port.
For eventual target implementations, we want to handle them always.
For incoming PLOGI, don't fail if the rport_priv doesn't exist.
Just create it and go straight to READY without going through PRLI. If
PRLI occurs, then our roles will be set and we'll become READY again.
Also, allow incoming PRLI in RTV state.
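A lookup-or-create sketch of the new PLOGI handling. The fixed-size
table and all names are hypothetical, and the sketch simplifies one
point: an existing rport is moved to READY from any state, whereas the
real state machine distinguishes more cases.

```c
#include <stddef.h>

/* Hypothetical, reduced set of rport states. RP_UNUSED marks a free slot. */
enum rp_state { RP_UNUSED, RP_PLOGI, RP_PRLI, RP_RTV, RP_READY };

struct rport_model {
    unsigned int port_id;
    enum rp_state state;
};

#define MAX_RPORTS 4

/* Hypothetical fixed-size table standing in for the rport list. */
struct rport_table {
    struct rport_model slot[MAX_RPORTS];
};

struct rport_model *rport_lookup(struct rport_table *t, unsigned int port_id)
{
    for (size_t i = 0; i < MAX_RPORTS; i++)
        if (t->slot[i].state != RP_UNUSED && t->slot[i].port_id == port_id)
            return &t->slot[i];
    return NULL;
}

struct rport_model *rport_create(struct rport_table *t, unsigned int port_id)
{
    for (size_t i = 0; i < MAX_RPORTS; i++) {
        if (t->slot[i].state == RP_UNUSED) {
            t->slot[i].port_id = port_id;
            t->slot[i].state = RP_PLOGI;
            return &t->slot[i];
        }
    }
    return NULL;   /* table full */
}

/* Incoming PLOGI. Returns 1 to accept, 0 to reject. */
int recv_plogi(struct rport_table *t, unsigned int port_id)
{
    struct rport_model *rp = rport_lookup(t, port_id);

    if (!rp)
        rp = rport_create(t, port_id);   /* unknown sender: create it */
    if (!rp)
        return 0;                        /* out of resources */
    rp->state = RP_READY;                /* READY without going through PRLI */
    return 1;
}
```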
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Improve lport and rport debug messages to indicate whether
the response is LS_ACC, LS_RJT, closed, or timeout.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>