rq->errors never has been part of the communication protocol between drivers
and the block stack and most drivers will not have initialized it.
Return -EIO to upper layers when the driver returns BLK_MQ_RQ_QUEUE_ERROR
unconditionally. If a driver want to return a different error it can easily
do so by returning success after calling blk_mq_end_io itself.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Witch to using a preallocated flush_rq for blk-mq similar to what's done
with the old request path. This allows us to set up the request properly
with a tag from the actually allowed range and ->rq_disk as needed by
some drivers. To make life easier we also switch to dynamic allocation
of ->flush_rq for the old path.
This effectively reverts most of
"blk-mq: fix for flush deadlock"
and
"blk-mq: Don't reserve a tag for flush request"
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Rework I/O completions to work more like the old code path. blk_mq_end_io
now stays out of the business of deferring completions to others CPUs
and calling blk_mark_rq_complete. The latter is very important to allow
completing requests that have timed out and thus are already marked completed,
the former allows using the IPI callout even for driver specific completions
instead of having to reimplement them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
This patch adds the missing bio_integrity_enabled() +
bio_integrity_prep() setup into blk_mq_make_request()
in order to use DIF protection with scsi-mq.
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Make blk-mq handle the dma_drain_size field the same way as the old request
path.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
This is neede for proper SG_IO operation as well as various uses of
blk_execute_rq from the SCSI midlayer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Reserving a tag (request) for flush to avoid dead lock is a overkill. A
tag is valuable resource. We can track the number of flush requests and
disallow having too many pending flush requests allocated. With this
patch, blk_mq_alloc_request_pinned() could do a busy nop (but not a dead
loop) if too many pending requests are allocated and new flush request
is allocated. But this should not be a problem, too many pending flush
requests are very rare case.
I verified this can fix the deadlock caused by too many pending flush
requests.
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
'struct page' has two list_head fields: 'lru' and 'list'. Conveniently,
they are unioned together. This means that code can use them
interchangably, which gets horribly confusing.
The blk-mq made the logical decision to try to use page->list. But, that
field was actually introduced just for the slub code. ->lru is the right
field to use outside of slab/slub.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
__smp_call_function_single already avoids multiple IPIs by internally
queing up the items, and now also is available for non-SMP builds as
a trivially correct stub, so there is no need to wrap it. If the
additional lock roundtrip cause problems my patch to convert the
generic IPI code to llists is waiting to get merged will fix it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_rq_init() is called in req's complete handler to initialize
the request, so the members of start_time and start_time_ns might
become inaccurate when it is allocated in future.
The patch initializes the two members in blk_mq_rq_ctx_init() to
fix the problem.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_mq_free_queue() is called from release handler of
queue kobject, so it needn't be called from drivers.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch moves synchronization on mq->delay_work
from blk_mq_free_queue() to blk_sync_queue(), so that
blk_sync_queue can work on mq.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_mq_drain_queue() is introduced so that we can drain
mq queue inside blk_cleanup_queue().
Also don't accept new requests any more if queue is marked
as dying.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Needed to bring blk-mq uptodate, since changes have been going in
since for-3.14/core was established.
Fixup merge issues related to the immutable biovec changes.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Conflicts:
block/blk-flush.c
fs/btrfs/check-integrity.c
fs/btrfs/extent_io.c
fs/btrfs/scrub.c
fs/logfs/dev_bdev.c
If accounting is on, we will do the IO completion accounting after
we have freed the request. Fix that by moving it sooner instead.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If __GFP_WAIT isn't set and we fail allocating, when we go
to drop the reference on the ctx, we will attempt to dereference
the NULL rq. Fix that.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If disk stats are enabled on the queue, a request needs to
be marked with REQ_IO_STAT for accounting to be active on
that request. This fixes an issue with virtio-blk not
showing up in /proc/diskstats after the conversion to
blk-mq.
Add QUEUE_FLAG_MQ_DEFAULT, setting stats and same cpu-group
completion on by default.
Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull second round of block driver updates from Jens Axboe:
"As mentioned in the original pull request, the bcache bits were pulled
because of their dependency on the immutable bio vecs. Kent re-did
this part and resubmitted it, so here's the 2nd round of (mostly)
driver updates for 3.13. It contains:
- The bcache work from Kent.
- Conversion of virtio-blk to blk-mq. This removes the bio and request
path, and substitutes with the blk-mq path instead. The end result
almost 200 deleted lines. Patch is acked by Asias and Christoph, who
both did a bunch of testing.
- A removal of bootmem.h include from Grygorii Strashko, part of a
larger series of his killing the dependency on that header file.
- Removal of __cpuinit from blk-mq from Paul Gortmaker"
* 'for-linus' of git://git.kernel.dk/linux-block: (56 commits)
virtio_blk: blk-mq support
blk-mq: remove newly added instances of __cpuinit
bcache: defensively handle format strings
bcache: Bypass torture test
bcache: Delete some slower inline asm
bcache: Use ida for bcache block dev minor
bcache: Fix sysfs splat on shutdown with flash only devs
bcache: Better full stripe scanning
bcache: Have btree_split() insert into parent directly
bcache: Move spinlock into struct time_stats
bcache: Kill sequential_merge option
bcache: Kill bch_next_recurse_key()
bcache: Avoid deadlocking in garbage collection
bcache: Incremental gc
bcache: Add make_btree_freeing_key()
bcache: Add btree_node_write_sync()
bcache: PRECEDING_KEY()
bcache: bch_(btree|extent)_ptr_invalid()
bcache: Don't bother with bucket refcount for btree node allocations
bcache: Debug code improvements
...
The new blk-mq code added new instances of __cpuinit usage.
We removed this a couple versions ago; we now want to remove
the compat no-op stubs. Introducing new users is not what
we want to see at this point in time, as it will break once
the stubs are gone.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We switched to plug mq_list for mq, but some code are still using old list.
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The flush state machine takes in a struct request, which then is
submitted multiple times to the underling driver. The old block code
requeses the same request for each of those, so it does not have an
issue with tapping into the request pool. The new one on the other hand
allocates a new request for each of the actualy steps of the flush
sequence. If have already allocated all of the tags for IO, we will
fail allocating the flush request.
Set aside a reserved request just for flushes.
Signed-off-by: Jens Axboe <axboe@kernel.dk>