Previously, LBA read and write were not supported in the lightnvm
specification. Now that it supports it, lets use the traditional
NVMe gendisk, and attach the lightnvm sysfs geometry export.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
When struct nvme_request was introduced, the nvme_nvm_submit_io was
converted to the new interface. The interface moves nvme_nvm_command
data structure into the struct request pdu. On io completion, rq->cmd is
freed, which should have been the dereferenced pdu nvme_request->cmd.
Fixes: d49187e97e "nvme: introduce struct nvme_request"
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
After commit 287922eb0b ("block: defer timeouts to a workqueue"),
deleting the timeout work after freezing the queue shouldn't be
necessary, since the synchronization is already enforced by the
acquisition of a q_usage_counter reference in blk_mq_timeout_work.
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Currently there's no way to enable wbt if it's not enabled in the
kernel config by default for a device. Allow a write to the
'wbt_lat_usec' queue sysfs file to enable wbt.
This is useful for both the kernel config case, but also if the
device is CFQ managed and it was turned off by default.
Signed-off-by: Jens Axboe <axboe@fb.com>
Make it clear that we are disabling wbt for the specified queued,
if it was enabled by default. This is in preparation for allowing
users to re-enable wbt, and not have it disabled automatically
again.
Signed-off-by: Jens Axboe <axboe@fb.com>
Allow a write of '-1' to reset the default latency target for
a given device. This removes knowledge of the different default
settings for rotational vs non-rotational from user space.
Signed-off-by: Jens Axboe <axboe@fb.com>
Multiple paths don't set it properly, ensure that we do.
Fixes: 9561a7ade0 ("nbd: add multi-connection support")
Signed-off-by: Jens Axboe <axboe@fb.com>
Bit #7 is already used, move to bit #8 which is the first unused
one.
Fixes: 9561a7ade0 ("nbd: add multi-connection support")
Signed-off-by: Jens Axboe <axboe@fb.com>
NBD can become contended on its single connection. We have to serialize all
writes and we can only process one read response at a time. Fix this by
allowing userspace to provide multiple connections to a single nbd device. This
coupled with block-mq drastically increases performance in multi-process cases.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
blkcg allocates some per-cgroup data structures with GFP_NOWAIT and
when that fails falls back to operations which aren't specific to the
cgroup. Occassional failures are expected under pressure and falling
back to non-cgroup operation is the right thing to do.
Unfortunately, I forgot to add __GFP_NOWARN to these allocations and
these expected failures end up creating a lot of noise. Add
__GFP_NOWARN.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Marc MERLIN <marc@merlins.org>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
The check on bio->bi_vcnt doesn't make sense in erase_end_io().
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Also this patch simplify the code a bit.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Always bio_add_page() is the standard and preferred way to
do the task.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Instead we use standard iterator way to do that.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
When the bio is full, bio_add_pc_page() will return zero,
so use this information tell when the bio is full.
Also replace access to .bi_vcnt for pr_debug() with bio_segments().
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
For a non-cloned bio, bio_add_page() only returns failure when
the io vec table is full, but in that case, bio->bi_vcnt can't
be zero at all.
So remove the impossible failure handling.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Some drivers often use external bvec table, so introduce
this helper for this case. It is always safe to access the
bio->bi_io_vec in this way for this case.
After converting to this usage, it will becomes a bit easier
to evaluate the remaining direct access to bio->bi_io_vec,
so it can help to prepare for the following multipage bvec
support.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Fixed up the new O_DIRECT cases.
Signed-off-by: Jens Axboe <axboe@fb.com>
We store the bits in the bdev sector size locally, but we don't use
the calculation anymore. All we do with it is shift it back up to
the bdev sector size. So let's just use that directly and kill the
variable and bits calculation.
Signed-off-by: Jens Axboe <axboe@fb.com>
A direct I/O alignment must be always checked against the device blocks size,
but the I/O offset (bio->bi_iter.bi_sector must always use 512B sector unit, and
not the actual logical block size.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
If a ZBC device is partitioned and operations are performed on the partition
the zone information is rebased to the partition, however the zone reset
is not mapped from the partition to device as are other operations.
This causes the API (report zones / reset zone) to be unbalanced in this
regard. Checking for the zone reset op code explicitly will balance the
API.
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Since commit 87374179 ("block: add a proper block layer data direction
encoding") we only or the new op and flags into bi_opf in bio_set_op_attrs
instead of clearing the old value. I've not seen any breakage with the
new behavior, but it seems dangerous.
Also convert it to an inline function to make the argument passing
safer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
This driver is both orphaned, and not really useful anymore. Mark
it as such, and remove it in a future kernel after a release or
two.
Signed-off-by: Jens Axboe <axboe@fb.com>
With compilers which follow the C99 standard (like modern versions of
gcc and clang), "extern inline" does the opposite thing from older
versions of gcc (emits code for an externally linkable version of the
inline function).
"static inline" does the intended behavior in all cases instead.
Description taken from commit 6d91857d48 ("staging, rtl8192e,
LLVMLinux: Change extern inline to static inline").
This also fixes the following GCC warning when building with CONFIG_PM
disabled:
./include/linux/blkdev.h:1143:20: warning: no previous prototype for 'blk_set_runtime_active' [-Wmissing-prototypes]
Fixes: d07ab6d114 ("block: Add blk_set_runtime_active()")
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jens Axboe <axboe@fb.com>