Pull rdma updates from Jason Gunthorpe:
"This has been a fairly typical cycle, with the usual sorts of driver
updates. Several series continue to come through which improve and
modernize various parts of the core code, and we finally are starting
to get the uAPI command interface cleaned up.
- Various driver fixes for bnxt_re, cxgb3/4, hfi1, hns, i40iw, mlx4,
mlx5, qib, rxe, usnic
- Rework the entire syscall flow for uverbs to be able to run over
ioctl(). Finally getting past the historic bad choice to use
write() for command execution
- More functional coverage with the mlx5 'devx' user API
- Start of the HFI1 series for 'TID RDMA'
- SRQ support in the hns driver
- Support for new IBTA defined 2x lane widths
- A big series to consolidate all the driver function pointers into a
big struct and have drivers provide a 'static const' version of the
struct instead of open coding initialization
- New 'advise_mr' uAPI to control device caching/loading of page
tables
- Support for inline data in SRPT
- Modernize how umad uses the driver core and creates cdev's and
sysfs files
- First steps toward removing 'uobject' from the view of the drivers"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (193 commits)
RDMA/srpt: Use kmem_cache_free() instead of kfree()
RDMA/mlx5: Signedness bug in UVERBS_HANDLER()
IB/uverbs: Signedness bug in UVERBS_HANDLER()
IB/mlx5: Allocate the per-port Q counter shared when DEVX is supported
IB/umad: Start using dev_groups of class
IB/umad: Use class_groups and let core create class file
IB/umad: Refactor code to use cdev_device_add()
IB/umad: Avoid destroying device while it is accessed
IB/umad: Simplify and avoid dynamic allocation of class
IB/mlx5: Fix wrong error unwind
IB/mlx4: Remove set but not used variable 'pd'
RDMA/iwcm: Don't copy past the end of dev_name() string
IB/mlx5: Fix long EEH recover time with NVMe offloads
IB/mlx5: Simplify netdev unbinding
IB/core: Move query port to ioctl
RDMA/nldev: Expose port_cap_flags2
IB/core: uverbs copy to struct or zero helper
IB/rxe: Reuse code which sets port state
IB/rxe: Make counters thread safe
IB/mlx5: Use the correct commands for UMEM and UCTX allocation
...
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas.
Additionally, we have a pile of annotation, unused variable and minor
updates.
The big API change is the updates for Christoph's DMA rework which
include removing the DISABLE_CLUSTERING flag.
And finally there are a couple of target tree updates"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits)
scsi: isci: request: mark expected switch fall-through
scsi: isci: remote_node_context: mark expected switch fall-throughs
scsi: isci: remote_device: Mark expected switch fall-throughs
scsi: isci: phy: Mark expected switch fall-through
scsi: iscsi: Capture iscsi debug messages using tracepoints
scsi: myrb: Mark expected switch fall-throughs
scsi: megaraid: fix out-of-bound array accesses
scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through
scsi: fcoe: remove set but not used variable 'port'
scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown()
scsi: smartpqi: fix build warnings
scsi: smartpqi: update driver version
scsi: smartpqi: add ofa support
scsi: smartpqi: increase fw status register read timeout
scsi: smartpqi: bump driver version
scsi: smartpqi: add smp_utils support
scsi: smartpqi: correct lun reset issues
scsi: smartpqi: correct volume status
scsi: smartpqi: do not offline disks for transient did no connect conditions
scsi: smartpqi: allow for larger raid maps
...
The same effects can be achieved by setting the dma_boundary to
PAGE_SIZE - 1 and the max_segment_size to PAGE_SIZE, so shift those
settings into the drivers. Note that in many cases the setting might
be bogus, but this keeps the status quo.
[mkp: fix myrs and myrb]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This allows the host driver to indicate the maximum supported
segment size in a nice an easy way, so that the driver doesn't
have to worry about DMA-layer imposed limitations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Most SCSI drivers want to enable "clustering", that is merging of
segments so that they might span more than a single page. Remove the
ENABLE_CLUSTERING define, and require drivers to explicitly set
DISABLE_CLUSTERING to disable this feature.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Smatch generates a warning:
drivers/scsi/scsi_lib.c:1656 scsi_mq_done() warn: test_bit() takes a bit number
The problem is that SCMD_STATE_COMPLETE is supposed to be bit number 0
and not a mask like "(1 << 0)". It is used like this:
if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
The test_and_set_bit() has a shift built in so it's a double left shift
and uses bit number 1 instead of number 0. This bug is harmless because
it's done consistently and it doesn't clash with any other flags.
Fixes: f1342709d1 ("scsi: Do not rely on blk-mq for double completions")
Reviewed-by: Keith Busch <keith.busch@intel.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The scsi timeout error handling had been directly updating the block
layer's request state to prevent a error handling and a natural completion
from completing the same request twice. Fix this layering violation
by having scsi control the fate of its commands with scsi owned flags
rather than use blk-mq's.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove the last use of the old BLKPREP_* values, which get converted
to BLK_STS_* later anyway.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Replace the old BLKPREP_* values with the BLK_STS_ ones that they are
converted to later anyway.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull SCSI updates from James Bottomley:
"This is mostly updates to the usual drivers: mpt3sas, lpfc, qla2xxx,
hisi_sas, smartpqi, megaraid_sas, arcmsr.
In addition, with the continuing absence of Nic we have target updates
for tcmu and target core (all with reviews and acks).
The biggest observable change is going to be that we're (again) trying
to switch to mulitqueue as the default (a user can still override the
setting on the kernel command line).
Other major core stuff is the removal of the remaining Microchannel
drivers, an update of the internal timers and some reworks of
completion and result handling"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
scsi: core: use blk_mq_run_hw_queues in scsi_kick_queue
scsi: ufs: remove unnecessary query(DM) UPIU trace
scsi: qla2xxx: Fix issue reported by static checker for qla2x00_els_dcmd2_sp_done()
scsi: aacraid: Spelling fix in comment
scsi: mpt3sas: Fix calltrace observed while running IO & reset
scsi: aic94xx: fix an error code in aic94xx_init()
scsi: st: remove redundant pointer STbuffer
scsi: qla2xxx: Update driver version to 10.00.00.08-k
scsi: qla2xxx: Migrate NVME N2N handling into state machine
scsi: qla2xxx: Save frame payload size from ICB
scsi: qla2xxx: Fix stalled relogin
scsi: qla2xxx: Fix race between switch cmd completion and timeout
scsi: qla2xxx: Fix Management Server NPort handle reservation logic
scsi: qla2xxx: Flush mailbox commands on chip reset
scsi: qla2xxx: Fix unintended Logout
scsi: qla2xxx: Fix session state stuck in Get Port DB
scsi: qla2xxx: Fix redundant fc_rport registration
scsi: qla2xxx: Silent erroneous message
scsi: qla2xxx: Prevent sysfs access when chip is down
scsi: qla2xxx: Add longer window for chip reset
...
To avoid introducing problems like those fixed in commit f7068114d4
("sr: pass down correctly sized SCSI sense buffer"), this creates a macro
wrapper for scsi_execute() that verifies the size of the sense buffer
similar to what was done for command string sizes in commit 3756f6401c
("exec: avoid gcc-8 warning for get_task_comm").
Another solution could be to add a length argument to scsi_execute(),
but this function already takes a lot of arguments and Jens was not fond
of that approach.
Additionally, this moves the SCSI_SENSE_BUFFERSIZE definition into
scsi_device.h, and removes a redundant include for scsi_device.h from
scsi_cmnd.h.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently this function is implemented in the scsi layer, but it's
actual place should be the block layer since T10-PI is a general
data integrity feature that is used in the nvme protocol as well.
Suggested-by: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit 2623c7a5f2 ("libata: add refcounting to ata_host") v4.17+ introduced
refcounting to ata_host and will increase or decrease the refcount when
adding or deleting transport ATA port.
Now the ata host for libsas is embedded in domain_device, and the ->kref
member is not initialized. Afer we add ata transport class, ata_host_get()
will be called when adding transport ATA port and a warning will be
triggered as below:
refcount_t: increment on 0; use-after-free.
WARNING: CPU: 2 PID: 103 at
lib/refcount.c:153 refcount_inc+0x40/0x48 ...... Call trace:
refcount_inc+0x40/0x48
ata_host_get+0x10/0x18
ata_tport_add+0x40/0x120
ata_sas_tport_add+0xc/0x14
sas_ata_init+0x7c/0xc8
sas_discover_domain+0x380/0x53c
process_one_work+0x12c/0x288
worker_thread+0x58/0x3f0
kthread+0xfc/0x128
ret_from_fork+0x10/0x18
And also when removing transport ATA port ata_host_put() will be called and
another similar warning will be triggered. If the refcount decreased to
zero, the ata host will be freed. But this ata host is only part of
domain_device, it cannot be freed directly.
So we have to change this embedded static ata host to a dynamically
allocated ata host and initialize the ->kref member. To use ata_host_get()
and ata_host_put() in libsas, we need to move the declaration of these
functions to the public libata.h and export them.
Fixes: b6240a4df0 ("scsi: libsas: add transport class for ATA devices")
Signed-off-by: Jason Yan <yanaijie@huawei.com>
CC: John Garry <john.garry@huawei.com>
CC: Taras Kondratiuk <takondra@cisco.com>
CC: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull SCSI updates from James Bottomley:
"This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc,
xfcp, hisi_sas, cxlflash, qla2xxx.
In the absence of Nic, we're also taking target updates which are
mostly minor except for the tcmu refactor.
The only real core change to worry about is the removal of high page
bouncing (in sas, storvsc and iscsi). This has been well tested and no
problems have shown up so far"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (268 commits)
scsi: lpfc: update driver version to 12.0.0.4
scsi: lpfc: Fix port initialization failure.
scsi: lpfc: Fix 16gb hbas failing cq create.
scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc
scsi: lpfc: correct oversubscription of nvme io requests for an adapter
scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx)
scsi: hisi_sas: Mark PHY as in reset for nexus reset
scsi: hisi_sas: Fix return value when get_free_slot() failed
scsi: hisi_sas: Terminate STP reject quickly for v2 hw
scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command
scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot
scsi: hisi_sas: Try wait commands before before controller reset
scsi: hisi_sas: Init disks after controller reset
scsi: hisi_sas: Create a scsi_host_template per HW module
scsi: hisi_sas: Reset disks when discovered
scsi: hisi_sas: Add LED feature for v3 hw
scsi: hisi_sas: Change common allocation mode of device id
scsi: hisi_sas: change slot index allocation mode
scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate()
scsi: hisi_sas: fix a typo in hisi_sas_task_prep()
...
The BLK_EH_NOT_HANDLED implies nothing happen, but very often that
is not what is happening - instead the driver already completed the
command. Fix the symbolic name to reflect that a little better.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Always GFP_KERNEL, and keeping it would cause serious complications for
the next change.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
On Fujitsu ETERNUS systems, sense code ABORTED COMMAND with ASC/Q C1/01
is used to indicate temporary condition where the storage-internal path
to a target is switched from one controller to another. SCSI commands
that return with this error code must be retried unconditionally
(i.e. without the "maybe_retry" logic in scsi_decide_disposition);
otherwise dm-multipath might initiate a failover from a healthy path
e.g. for REQ_FAILFAST_DEV commands.
Introduce a new blist flag for this case.
[mkp: applied by hand]
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
EMC Symmetrix returns 'internal target error' for a variety of
conditions, most of which will be transient. So we should always retry
it, even with failfast set. Otherwise we'd get spurious path flaps with
multipath.
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Warn if a device (or the user) sets blist flags which are unknown
or have been removed. This should enable us to reuse freed blist
bits in later releases.
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Space for SCSI blist flags is gradually running out. Change the type to
__u64 and fix a checkpatch complaint about symbolic mode flags in
scsi_devinfo.c.
Make checkpatch happy by replacing simple_strtoul() with kstrtoull().
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In commit 2104551969 ("scsi: use per-cpu buffer for formatting sense"),
function scsi_show_extd_sense() was removed, switching use over to
scsi_format_extd_sense(). Remove last reference to scsi_show_extd_sense()
in include/scsi/scsi_dbg.h.
Signed-off-by: John Pittman <jpittman@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>