Commit Graph

114 Commits

Author SHA1 Message Date
Linus Torvalds 513a4befae Merge branch 'for-4.9/block' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:
 "This is the main pull request for block layer changes in 4.9.

  As mentioned at the last merge window, I've changed things up and now
  do just one branch for core block layer changes, and driver changes.
  This avoids dependencies between the two branches. Outside of this
  main pull request, there are two topical branches coming as well.

  This pull request contains:

   - A set of fixes, and a conversion to blk-mq, of nbd. From Josef.

   - Set of fixes and updates for lightnvm from Matias, Simon, and Arnd.
     Followup dependency fix from Geert.

   - General fixes from Bart, Baoyou, Guoqing, and Linus W.

   - CFQ async write starvation fix from Glauber.

   - Add supprot for delayed kick of the requeue list, from Mike.

   - Pull out the scalable bitmap code from blk-mq-tag.c and make it
     generally available under the name of sbitmap. Only blk-mq-tag uses
     it for now, but the blk-mq scheduling bits will use it as well.
     From Omar.

   - bdev thaw error progagation from Pierre.

   - Improve the blk polling statistics, and allow the user to clear
     them. From Stephen.

   - Set of minor cleanups from Christoph in block/blk-mq.

   - Set of cleanups and optimizations from me for block/blk-mq.

   - Various nvme/nvmet/nvmeof fixes from the various folks"

* 'for-4.9/block' of git://git.kernel.dk/linux-block: (54 commits)
  fs/block_dev.c: return the right error in thaw_bdev()
  nvme: Pass pointers, not dma addresses, to nvme_get/set_features()
  nvme/scsi: Remove power management support
  nvmet: Make dsm number of ranges zero based
  nvmet: Use direct IO for writes
  admin-cmd: Added smart-log command support.
  nvme-fabrics: Add host_traddr options field to host infrastructure
  nvme-fabrics: revise host transport option descriptions
  nvme-fabrics: rework nvmf_get_address() for variable options
  nbd: use BLK_MQ_F_BLOCKING
  blkcg: Annotate blkg_hint correctly
  cfq: fix starvation of asynchronous writes
  blk-mq: add flag for drivers wanting blocking ->queue_rq()
  blk-mq: remove non-blocking pass in blk_mq_map_request
  blk-mq: get rid of manual run of queue with __blk_mq_run_hw_queue()
  block: export bio_free_pages to other modules
  lightnvm: propagate device_add() error code
  lightnvm: expose device geometry through sysfs
  lightnvm: control life of nvm_dev in driver
  blk-mq: register device instead of disk
  ...
2016-10-07 14:42:05 -07:00
Arnd Bergmann 1e3aeae4ea lightnvm: propagate device_add() error code
device_add() may fail, and all callers are supposed to check the
return value, but one new user in lightnvm doesn't:

drivers/lightnvm/sysfs.c: In function 'nvm_sysfs_register_dev':
drivers/lightnvm/sysfs.c:184:2: error: ignoring return value of 'device_add',
  declared with attribute warn_unused_result [-Werror=unused-result]

This changes the caller to propagate any error codes, which avoids
the warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 38c9e260b9f9 ("lightnvm: expose device geometry through sysfs")
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-21 07:57:32 -06:00
Simon A. F. Lund 40267efddc lightnvm: expose device geometry through sysfs
For a host to access an Open-Channel SSD, it has to know its geometry,
so that it writes and reads at the appropriate device bounds.

Currently, the geometry information is kept within the kernel, and not
exported to user-space for consumption. This patch exposes the
configuration through sysfs and enables user-space libraries, such as
liblightnvm, to use the sysfs implementation to get the geometry of an
Open-Channel SSD.

The sysfs entries are stored within the device hierarchy, and can be
found using the "lightnvm" device type.

An example configuration looks like this:

/sys/class/nvme/
└── nvme0n1
   ├── capabilities: 3
   ├── device_mode: 1
   ├── erase_max: 1000000
   ├── erase_typ: 1000000
   ├── flash_media_type: 0
   ├── media_capabilities: 0x00000001
   ├── media_type: 0
   ├── multiplane: 0x00010101
   ├── num_blocks: 1022
   ├── num_channels: 1
   ├── num_luns: 4
   ├── num_pages: 64
   ├── num_planes: 1
   ├── page_size: 4096
   ├── prog_max: 100000
   ├── prog_typ: 100000
   ├── read_max: 10000
   ├── read_typ: 10000
   ├── sector_oob_size: 0
   ├── sector_size: 4096
   ├── media_manager: gennvm
   ├── ppa_format: 0x380830082808001010102008
   ├── vendor_opcode: 0
   ├── max_phys_secs: 64
   └── version: 1

Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-21 07:57:31 -06:00
Matias Bjørling b0b4e09c1a lightnvm: control life of nvm_dev in driver
LightNVM compatible device drivers does not have a method to expose
LightNVM specific sysfs entries.

To enable LightNVM sysfs entries to be exposed, lightnvm device
drivers require a struct device to attach it to. To allow both the
actual device driver and lightnvm sysfs entries to coexist, the device
driver tracks the lifetime of the nvm_dev structure.

This patch refactors NVMe and null_blk to handle the lifetime of struct
nvm_dev, which eliminates the need for struct gendisk when a lightnvm
compatible device is provided.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-21 07:56:18 -06:00
Matias Bjørling ac81bfa986 nvme: refactor namespaces to support non-gendisk devices
With LightNVM enabled namespaces, the gendisk structure is not exposed
to the user. This prevents LightNVM users from accessing the NVMe device
driver specific sysfs entries, and LightNVM namespace geometry.

Refactor the revalidation process, so that a namespace, instead of a
gendisk, is revalidated. This later allows patches to wire up the
sysfs entries up to a non-gendisk namespace.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-21 07:56:12 -06:00
Geert Uytterhoeven e105ddb4a2 lightnvm: NVM should depend on HAS_DMA
If NO_DMA=y:

    drivers/built-in.o: In function `nvme_nvm_dev_dma_free':
    lightnvm.c:(.text+0x23df1a): undefined reference to `dma_pool_free'
    drivers/built-in.o: In function `nvme_nvm_dev_dma_alloc':
    lightnvm.c:(.text+0x23df38): undefined reference to `dma_pool_alloc'
    drivers/built-in.o: In function `nvme_nvm_destroy_dma_pool':
    lightnvm.c:(.text+0x23df4c): undefined reference to `dma_pool_destroy'
    drivers/built-in.o: In function `nvme_nvm_create_dma_pool':
    lightnvm.c:(.text+0x23df7e): undefined reference to `dma_pool_create'

and

    ERROR: "dma_pool_destroy" [drivers/nvme/host/nvme-core.ko] undefined!
    ERROR: "dma_pool_free" [drivers/nvme/host/nvme-core.ko] undefined!
    ERROR: "dma_pool_alloc" [drivers/nvme/host/nvme-core.ko] undefined!
    ERROR: "dma_pool_create" [drivers/nvme/host/nvme-core.ko] undefined!

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-21 07:56:10 -06:00
PrasannaKumar Muralidharan d6a38c0ba7 miscdevice: Use module_misc_device() macro
This patch removes module_init()/module_exit() from driver code by using
module_misc_device() macro. All modules in this patch has a print
statement which is removed when module_misc_device() macro is used.
If undesirable this patch can be dropped entirely, this is the only
purpose of making this as a separate patch.

Signed-off-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-31 14:12:35 +02:00
Christoph Hellwig 70246286e9 block: get rid of bio_rw and READA
These two are confusing leftover of the old world order, combining
values of the REQ_OP_ and REQ_ namespaces.  For callers that don't
special case we mostly just replace bi_rw with bio_data_dir or
op_is_write, except for the few cases where a switch over the REQ_OP_
values makes more sense.  Any check for READA is replaced with an
explicit check for REQ_RAHEAD.  Also remove the READA alias for
REQ_RAHEAD.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-20 17:37:01 -06:00
Matias Bjørling d9e46d5d7c lightnvm: make __nvm_submit_ppa static
The __nvm_submit_ppa() function is not used outside lightnvm core.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 8680f165ea lightnvm: make ppa_list const in nvm_set_rqd_list
The passed by reference ppa list in nvm_set_rqd_list() is updated when
multiple planes are available. In that case, each PPA plane is
incremented when the device side PPA list is created. This prevents the
caller to rely on the PPA list to be unmodified after a call.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 24d4a7d721 lightnvm: fix lun offset calculation for mark blk
The gen_mark_blk_bad function marks the wrong block when a block is on
a different channel. Fix the index calculation, so that it updates the
correct block.

Reported-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 855cdd2c0b lightnvm: make rrpc_map_page call nvm_get_blk outside locks
The nvm_get_blk() function is called with rlun->lock held. This is ok
when the media manager implementation doesn't go out of its atomic
context. However, if a media manager persists its metadata, and
guarantees that the block is given to the target, this is no longer
a viable approach. Therefore, clean up the flow of rrpc_map_page,
and make sure that nvm_get_blk() is called without any locks acquired.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 41285fad51 lightnvm: remove _unlocked variant of [get/put]_blk
The [get/put]_blk API enables targets to get ownership of blocks at
runtime. This information is currently not recorded on disk, and the
information is therefore lost on power failure. To restore the
metadata, the [get/put]_blk must persist its metadata. In that case,
we need to control the outer lock, so that we can disable them while
updating the on-disk metadata. Fortunately, the _unlocked versions can
be removed, which allows us to move the lock into the [get/put]_blk
functions.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 8c39eddbf2 lightnvm: remove unused lists from struct rrpc_block
The ->list, ->open_list, and ->closed_list lists were previously used
for statistics. However, their usage have been removed, and thus these
can safely be removed.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 5cd907853d lightnvm: remove nested lock conflict with mm
If a media manager tries to initialize it targets upon media manager
initialization, the media manager will need to know which target types
are available in LightNVM. The lists of which managers and target types
are available shares the same lock.

Therefore, on initialization, the nvm_lock is taken by LightNVM core,
which later leads to a deadlock when target types are enumerated by the
media manager.

Add an exclusive lock for target types to resolve this conflict.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling b76eb20bb0 lightnvm: move target mgmt into media mgr
To enable persistent block management to easily control creation and
removal of targets, we move target management into the media
manager. The LightNVM core continues to maintain which target types are
registered, while the media manager now keeps track of its initialized
targets.

Two new callbacks for the media manager are introduced. create_tgt and
remove_tgt. Note that remove_tgt returns 0 on successfully removing a
target, and returns 1 if the target was not found.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 5e60edb7dc lightnvm: rename gennvm and update description
The generic manager should be called the general media manager, and
instead of using the rather long name of "gennvm" in front of each data
structures, use "gen" instead to shorten it. Update the description of
the media manager as well to make the media manager purpose clearer.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 077d238999 lightnvm: remove open/close statistics for gennvm
The responsibility of the media manager is not to keep track of
open/closed blocks. This is better maintained within a target,
that already manages this information on writes.

Remove the statistics and merge the states NVM_BLK_ST_OPEN and
NVM_BLK_ST_CLOSED.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 12624af26e lightnvm: fix checkpatch terse errors
A couple of small checkpatch fixups to stop it from complaining.

./drivers/lightnvm/core.c:360: WARNING: line over 80 characters
./drivers/lightnvm/core.c:360: ERROR: trailing statements should be on
next line
./drivers/lightnvm/core.c:503: WARNING: Block comments use a trailing */
on a separate line

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 5114e2773a lightnvm: remove checkpatch warning for unsigned ints
Checkpatch found two incidents where the type was preferred to be
written out in full.

./drivers/lightnvm/rrpc.h:184: WARNING: Prefer 'unsigned int' to bare
use of 'unsigned'
./drivers/lightnvm/rrpc.h:209: WARNING: Prefer 'unsigned int' to bare
use of 'unsigned'
./drivers/lightnvm/rrpc.c:51: WARNING: Prefer 'unsigned int' to bare use
of 'unsigned'

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Johannes Thumshirn 58eaaf9b6c lightnvm: Make functions not used by ouside static
Mark functions not used by ouside of thier implementing file as static.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Javier González 529435e818 lightnvm: add media manager mark_blk helper
Expose media manager mark_blk() to targets, as done for the rest of the
media manager callback functions.

Signed-off-by: Javier González <javier@cnexlabs.com>
Updated description
Signed-off-by: Matias Bjørling <m@bjorling.me>

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Wenwei Tao 0de2415bb7 lightnvm: break the loop when rqd is not null
Break the loop when rqd is not null to reduce
an unnecessary schedule.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Mike Christie 95fe6c1a20 block, fs, mm, drivers: use bio set/get op accessors
This patch converts the simple bi_rw use cases in the block,
drivers, mm and fs code to set/get the bio operation using
bio_set_op_attrs/bio_op

These should be simple one or two liner cases, so I just did them
in one patch. The next patches handle the more complicated
cases in a module per patch.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-07 13:41:38 -06:00
Javier González 116f7d4a21 lightnvm: reserved space calculation incorrect
The nvm_dev->max_pages_per_blk variable was removed in favor of the new
nvm->sec_per_blk variable. The ->max_pages_per_blk variable was still
used in rrpc_capacity, reporting the reserved capacity to zero. Replace
with ->sec_per_blk to calculate the reserved area again.

Signed-off-by: Javier González <javier@cnexlabs.com>
Updated patch description. Was "lightnvm: eliminate redundant variable"
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00