Pull block updates from Jens Axboe:
- clean up how we pass around gfp_t and
blk_mq_req_flags_t (Christoph)
- prepare us to defer scheduler attach (Christoph)
- clean up drivers handling of bounce buffers (Christoph)
- fix timeout handling corner cases (Christoph/Bart/Keith)
- bcache fixes (Coly)
- prep work for bcachefs and some block layer optimizations (Kent).
- convert users of bio_sets to using embedded structs (Kent).
- fixes for the BFQ io scheduler (Paolo/Davide/Filippo)
- lightnvm fixes and improvements (Matias, with contributions from Hans
and Javier)
- adding discard throttling to blk-wbt (me)
- sbitmap blk-mq-tag handling (me/Omar/Ming).
- remove the sparc jsflash block driver, acked by DaveM.
- Kyber scheduler improvement from Jianchao, making it more friendly
wrt merging.
- conversion of symbolic proc permissions to octal, from Joe Perches.
Previously the block parts were a mix of both.
- nbd fixes (Josef and Kevin Vigor)
- unify how we handle the various kinds of timestamps that the block
core and utility code uses (Omar)
- three NVMe pull requests from Keith and Christoph, bringing AEN to
feature completeness, file backed namespaces, cq/sq lock split, and
various fixes
- various little fixes and improvements all over the map
* tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block: (196 commits)
blk-mq: update nr_requests when switching to 'none' scheduler
block: don't use blocking queue entered for recursive bio submits
dm-crypt: fix warning in shutdown path
lightnvm: pblk: take bitmap alloc. out of critical section
lightnvm: pblk: kick writer on new flush points
lightnvm: pblk: only try to recover lines with written smeta
lightnvm: pblk: remove unnecessary bio_get/put
lightnvm: pblk: add possibility to set write buffer size manually
lightnvm: fix partial read error path
lightnvm: proper error handling for pblk_bio_add_pages
lightnvm: pblk: fix smeta write error path
lightnvm: pblk: garbage collect lines with failed writes
lightnvm: pblk: rework write error recovery path
lightnvm: pblk: remove dead function
lightnvm: pass flag on graceful teardown to targets
lightnvm: pblk: check for chunk size before allocating it
lightnvm: pblk: remove unnecessary argument
lightnvm: pblk: remove unnecessary indirection
lightnvm: pblk: return NVM_ error on failed submission
lightnvm: pblk: warn in case of corrupted write buffer
...
Problem:
$ cat /sys/kernel/config/target/core/user_0/block/attrib/qfull_time_out
-1
$ echo "-1" > /sys/kernel/config/target/core/user_0/block/attrib/qfull_time_out
-bash: echo: write error: Invalid argument
Fix:
This patch will help reset qfull_time_out to its default
i.e. qfull_time_out=-1.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Minor optimization - remove a pointer indirection when using fs_bio_set.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Switch everyone to blk_get_request_flags, and then rename
blk_get_request_flags to blk_get_request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull block layer updates from Jens Axboe:
"It's a pretty quiet round this time, which is nice. This contains:
- series from Bart, cleaning up the way we set/test/clear atomic
queue flags.
- series from Bart, fixing races between gendisk and queue
registration and removal.
- set of bcache fixes and improvements from various folks, by way of
Michael Lyle.
- set of lightnvm updates from Matias, most of it being the 1.2 to
2.0 transition.
- removal of unused DIO flags from Nikolay.
- blk-mq/sbitmap memory ordering fixes from Omar.
- divide-by-zero fix for BFQ from Paolo.
- minor documentation patches from Randy.
- timeout fix from Tejun.
- Alpha "can't write a char atomically" fix from Mikulas.
- set of NVMe fixes by way of Keith.
- bsg and bsg-lib improvements from Christoph.
- a few sed-opal fixes from Jonas.
- cdrom check-disk-change deadlock fix from Maurizio.
- various little fixes, comment fixes, etc from various folks"
* tag 'for-4.17/block-20180402' of git://git.kernel.dk/linux-block: (139 commits)
blk-mq: Directly schedule q->timeout_work when aborting a request
blktrace: fix comment in blktrace_api.h
lightnvm: remove function name in strings
lightnvm: pblk: remove some unnecessary NULL checks
lightnvm: pblk: don't recover unwritten lines
lightnvm: pblk: implement 2.0 support
lightnvm: pblk: implement get log report chunk
lightnvm: pblk: rename ppaf* to addrf*
lightnvm: pblk: check for supported version
lightnvm: implement get log report chunk helpers
lightnvm: make address conversions depend on generic device
lightnvm: add support for 2.0 address format
lightnvm: normalize geometry nomenclature
lightnvm: complete geo structure with maxoc*
lightnvm: add shorten OCSSD version in geo
lightnvm: add minor version to generic geometry
lightnvm: simplify geometry structure
lightnvm: pblk: refactor init/exit sequences
lightnvm: Avoid validation of default op value
lightnvm: centralize permission check for lightnvm ioctl
...
Changes since v1:
Added changes in these files:
drivers/infiniband/hw/usnic/usnic_transport.c
drivers/staging/lustre/lnet/lnet/lib-socket.c
drivers/target/iscsi/iscsi_target_login.c
drivers/vhost/net.c
fs/dlm/lowcomms.c
fs/ocfs2/cluster/tcp.c
security/tomoyo/network.c
Before:
All these functions either return a negative error indicator,
or store length of sockaddr into "int *socklen" parameter
and return zero on success.
"int *socklen" parameter is awkward. For example, if caller does not
care, it still needs to provide on-stack storage for the value
it does not need.
None of the many FOO_getname() functions of various protocols
ever used old value of *socklen. They always just overwrite it.
This change drops this parameter, and makes all these functions, on success,
return length of sockaddr. It's always >= 0 and can be differentiated
from an error.
Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.
rpc_sockname() lost "int buflen" parameter, since its only use was
to be passed to kernel_getsockname() as &buflen and subsequently
not used in any way.
Userspace API is not changed.
text data bss dec hex filename
30108430 2633624 873672 33615726 200ef6e vmlinux.before.o
30108109 2633612 873672 33615393 200ee21 vmlinux.o
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: David S. Miller <davem@davemloft.net>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: linux-bluetooth@vger.kernel.org
CC: linux-decnet-user@lists.sourceforge.net
CC: linux-wireless@vger.kernel.org
CC: linux-rdma@vger.kernel.org
CC: linux-sctp@vger.kernel.org
CC: linux-nfs@vger.kernel.org
CC: linux-x25@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull SCSI target updates from Nicholas Bellinger:
"The highlights include:
- numerous target-core-user improvements related to queue full and
timeout handling. (MNC)
- prevent target-core-user corruption when invalid data page is
requested. (MNC)
- add target-core device action configfs attributes to allow
user-space to trigger events separate from existing attributes
exposed to end-users. (MNC)
- fix iscsi-target NULL pointer dereference 4.6+ regression in CHAP
error path. (David Disseldorp)
- avoid target-core backend UNMAP callbacks if range is zero. (Andrei
Vagin)
- fix a iscsi-target 4.14+ regression related multiple PDU logins,
that was exposed due to removal of TCP prequeue support. (Florian
Westphal + MNC)
Also, there is a iser-target bug still being worked on for post -rc1
code to address a long standing issue resulting in persistent
ib_post_send() failures, for RNICs with small max_send_sge"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits)
iscsi-target: make sure to wake up sleeping login worker
tcmu: Fix trailing semicolon
tcmu: fix cmd user after free
target: fix destroy device in target_configure_device
tcmu: allow userspace to reset ring
target core: add device action configfs files
tcmu: fix error return code in tcmu_configure_device()
target_core_user: add cmd id to broken ring message
target: add SAM_STAT_BUSY sense reason
tcmu: prevent corruption when invalid data page requested
target: don't call an unmap callback if a range length is zero
target/iscsi: avoid NULL dereference in CHAP auth error path
cxgbit: call neigh_event_send() to update MAC address
target: tcm_loop: Use seq_puts() in tcm_loop_show_info()
target: tcm_loop: Delete an unnecessary return statement in tcm_loop_submission_work()
target: tcm_loop: Delete two unnecessary variable initialisations in tcm_loop_issue_tmr()
target: tcm_loop: Combine substrings for 26 messages
target: tcm_loop: Improve a size determination in two functions
target: tcm_loop: Delete an error message for a failed memory allocation in four functions
sbp-target: Delete an error message for a failed memory allocation in three functions
...
Pull block updates from Jens Axboe:
"This is the main pull request for block IO related changes for the
4.16 kernel. Nothing major in this pull request, but a good amount of
improvements and fixes all over the map. This contains:
- BFQ improvements, fixes, and cleanups from Angelo, Chiara, and
Paolo.
- Support for SMR zones for deadline and mq-deadline from Damien and
Christoph.
- Set of fixes for bcache by way of Michael Lyle, including fixes
from himself, Kent, Rui, Tang, and Coly.
- Series from Matias for lightnvm with fixes from Hans Holmberg,
Javier, and Matias. Mostly centered around pblk, and the removing
rrpc 1.2 in preparation for supporting 2.0.
- A couple of NVMe pull requests from Christoph. Nothing major in
here, just fixes and cleanups, and support for command tracing from
Johannes.
- Support for blk-throttle for tracking reads and writes separately.
From Joseph Qi. A few cleanups/fixes also for blk-throttle from
Weiping.
- Series from Mike Snitzer that enables dm to register its queue more
logically, something that's alwways been problematic on dm since
it's a stacked device.
- Series from Ming cleaning up some of the bio accessor use, in
preparation for supporting multipage bvecs.
- Various fixes from Ming closing up holes around queue mapping and
quiescing.
- BSD partition fix from Richard Narron, fixing a problem where we
can't mount newer (10/11) FreeBSD partitions.
- Series from Tejun reworking blk-mq timeout handling. The previous
scheme relied on atomic bits, but it had races where we would think
a request had timed out if it to reused at the wrong time.
- null_blk now supports faking timeouts, to enable us to better
exercise and test that functionality separately. From me.
- Kill the separate atomic poll bit in the request struct. After
this, we don't use the atomic bits on blk-mq anymore at all. From
me.
- sgl_alloc/free helpers from Bart.
- Heavily contended tag case scalability improvement from me.
- Various little fixes and cleanups from Arnd, Bart, Corentin,
Douglas, Eryu, Goldwyn, and myself"
* 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits)
block: remove smart1,2.h
nvme: add tracepoint for nvme_complete_rq
nvme: add tracepoint for nvme_setup_cmd
nvme-pci: introduce RECONNECTING state to mark initializing procedure
nvme-rdma: remove redundant boolean for inline_data
nvme: don't free uuid pointer before printing it
nvme-pci: Suspend queues after deleting them
bsg: use pr_debug instead of hand crafted macros
blk-mq-debugfs: don't allow write on attributes with seq_operations set
nvme-pci: Fix queue double allocations
block: Set BIO_TRACE_COMPLETION on new bio during split
blk-throttle: use queue_is_rq_based
block: Remove kblockd_schedule_delayed_work{,_on}()
blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays
blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()
lib/scatterlist: Fix chaining support in sgl_alloc_order()
blk-throttle: track read and write request individually
block: add bdev_read_only() checks to common helpers
block: fail op_is_write() requests to read-only partitions
blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
...
Mike Christie reports:
Starting in 4.14 iscsi logins will fail around 50% of the time.
Problem appears to be that iscsi_target_sk_data_ready() callback may
return without doing anything in case it finds the login work queue
is still blocked in sock_recvmsg().
Nicholas Bellinger says:
It would indicate users providing their own ->sk_data_ready() callback
must be responsible for waking up a kthread context blocked on
sock_recvmsg(..., MSG_WAITALL), when a second ->sk_data_ready() is
received before the first sock_recvmsg(..., MSG_WAITALL) completes.
So, do this and invoke the original data_ready() callback -- in
case of tcp sockets this takes care of waking the thread.
Disclaimer: I do not understand why this problem did not show up before
tcp prequeue removal.
(Drop WARN_ON usage - nab)
Reported-by: Mike Christie <mchristi@redhat.com>
Bisected-by: Mike Christie <mchristi@redhat.com>
Tested-by: Mike Christie <mchristi@redhat.com>
Diagnosed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Fixes: e7942d0633 ("tcp: remove prequeue support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Cc: stable@vger.kernel.org # 4.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The trailing semicolon is an empty statement that does no operation.
It is completely stripped out by the compiler. Removing it since it doesn't do
anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If we are failing the command due to a qfull timeout we are
also freeing the tcmu command, so we cannot access it later
to get the se_cmd.
Note: The clearing of cmd->se_cmd is not needed. We do not check
it later for something like determining if the command was failed
due to a timeout. As a result I am dropping it.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
After dev->transport->configure_device succeeds, target_configure_device
exits abnormally, dev_flags has not set DF_CONFIGURED yet, does not call
destroy_device function in free_device.
Signed-off-by: tangwenji <tang.wenji@zte.com.cn>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds 2 tcmu attrs to block/unblock a device and
reset the ring buffer. They are used when the userspace
daemon has crashed or forced to shutdown while IO is executing.
On restart, the daemon can block the device so new IO is not
sent to userspace while it puts the ring in a clean state.
Notes: The reset ring opreation is specific to tcmu, but the
block one could be generic. I kept it tcmu specific, because
it requires some extra locking/state checks in the main IO
path and since other backend modules did not need this
functionality I thought only tcmu should take the perf hit.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a new group of files that are to be used to
have the kernel module execution some action. The next patch
will have target_core_user use the group/files to be able to block
a device and to reset its memory buffer used to pass commands
between user/kernel space.
This type of file is different from the existing device attributes
in that they may be write only and when written to they result in
the kernel module executing some function. These need to be
separate from the normal device attributes which get/set device
values so userspace can continue to loop over all the attribs and
get/set them during initialization.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Fix to return error code -ENOMEM from the kzalloc() error handling
case instead of 0, as done elsewhere in this function.
Fixes: 80eb876 ("tcmu: allow max block and global max blocks to be settable")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Log cmd id that was not found in the tcmu_handle_completions
lookup failure path.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Add SAM_STAT_BUSY sense_reason. The next patch will have
target_core_user return this value while it is temporarily
blocked and restarting.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We will always have a page mapped for cmd data if it is
valid command. If the mapping does not exist then something
bad happened in userspace and it should not proceed. This
has us return VM_FAULT_SIGBUS when this happens instead of
returning a freshly allocated paged. The latter can cause
corruption because userspace might write the pages data
overwriting valid data or return it to the initiator.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If a length of a range is zero, it means there is nothing to unmap
and we can skip this range.
Here is one more reason, why we have to skip such ranges. An unmap
callback calls file_operations->fallocate(), but the man page for the
fallocate syscall says that fallocate(fd, mode, offset, let) returns
EINVAL, if len is zero. It means that file_operations->fallocate() isn't
obligated to handle zero ranges too.
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>