Remove useless __dev_status call while processing an ioctl that sets up
device geometry and target message. The data is not returned to
userspace so there is no point collecting it and in the case of
target_message it is collected before processing the message so if it
did return it might be stale.
Signed-off-by: Peter Rajnoha <prajnoha@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Set a new DM_UEVENT_GENERATED_FLAG when returning from ioctls to
indicate that a uevent was actually generated. This tells the userspace
caller that it may need to wait for the event to be processed.
Signed-off-by: Peter Rajnoha <prajnoha@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Only issue a uevent on a resume if the state of the device changed,
i.e. if it was suspended and/or its table was replaced.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
When swapping a new table into place, retain the old table until
its replacement is in place.
An old check for an empty table is removed because this is enforced
in populate_table().
__unbind() becomes redundant when followed by __bind().
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Add the flag DM_QUERY_INACTIVE_TABLE_FLAG to the ioctls to return
infomation about the loaded-but-not-yet-active table instead of the live
table. Prior to this patch it was impossible to obtain this information
until the device had been 'resumed'.
Userspace dmsetup and libdevmapper support the flag as of version 1.02.40.
e.g. dmsetup info --inactive vg1-lv1
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Once we begin deleting a device, prevent any further messages being sent
to targets of its table (to avoid races).
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
strlcpy() will always null terminate the string.
The code should already guarantee this as the last bytes are already
NULs and the string lengths were restricted before being stored in
hc. Removing the '-1' becomes necessary so strlcpy() doesn't
lose the last character of a maximum-length string.
- agk
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Fix a reported deadlock if there are still unprocessed multipath events
on a device that is being removed.
_hash_lock is held during dev_remove while trying to send the
outstanding events. Sending the events requests the _hash_lock
again in dm_copy_name_and_uuid.
This patch introduces a separate lock around regions that modify the
link to the hash table (dm_set_mdptr) or the name or uuid so that
dm_copy_name_and_uuid no longer needs _hash_lock.
Additionally, dm_copy_name_and_uuid can only be called if md exists
so we can drop the dm_get() and dm_put() which can lead to a BUG()
while md is being freed.
The deadlock:
#0 [ffff8106298dfb48] schedule at ffffffff80063035
#1 [ffff8106298dfc20] __down_read at ffffffff8006475d
#2 [ffff8106298dfc60] dm_copy_name_and_uuid at ffffffff8824f740
#3 [ffff8106298dfc90] dm_send_uevents at ffffffff88252685
#4 [ffff8106298dfcd0] event_callback at ffffffff8824c678
#5 [ffff8106298dfd00] dm_table_event at ffffffff8824dd01
#6 [ffff8106298dfd10] __hash_remove at ffffffff882507ad
#7 [ffff8106298dfd30] dev_remove at ffffffff88250865
#8 [ffff8106298dfd60] ctl_ioctl at ffffffff88250d80
#9 [ffff8106298dfee0] do_ioctl at ffffffff800418c4
#10 [ffff8106298dff00] vfs_ioctl at ffffffff8002fab9
#11 [ffff8106298dff40] sys_ioctl at ffffffff8004bdaf
#12 [ffff8106298dff80] tracesys at ffffffff8005d28d (via system_call)
Cc: stable@kernel.org
Reported-by: guy keren <choo@actcom.co.il>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This allows subsytems to provide devtmpfs with non-default permissions
for the device node. Instead of the default mode of 0600, null, zero,
random, urandom, full, tty, ptmx now have a mode of 0666, which allows
non-privileged processes to access standard device nodes in case no
other userspace process applies the expected permissions.
This also fixes a wrong assignment in pktcdvd and a checkpatch.pl complain.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch enables request-based dm.
o Request-based dm and bio-based dm coexist, since there are
some target drivers which are more fitting to bio-based dm.
Also, there are other bio-based devices in the kernel
(e.g. md, loop).
Since bio-based device can't receive struct request,
there are some limitations on device stacking between
bio-based and request-based.
type of underlying device
bio-based request-based
----------------------------------------------
bio-based OK OK
request-based -- OK
The device type is recognized by the queue flag in the kernel,
so dm follows that.
o The type of a dm device is decided at the first table binding time.
Once the type of a dm device is decided, the type can't be changed.
o Mempool allocations are deferred to at the table loading time, since
mempools for request-based dm are different from those for bio-based
dm and needed mempool type is fixed by the type of table.
o Currently, request-based dm supports only tables that have a single
target. To support multiple targets, we need to support request
splitting or prevent bio/request from spanning multiple targets.
The former needs lots of changes in the block layer, and the latter
needs that all target drivers support merge() function.
Both will take a time.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Add support for passing a 32 bit "cookie" into the kernel with the
DM_SUSPEND, DM_DEV_RENAME and DM_DEV_REMOVE ioctls. The (unsigned)
value of this cookie is returned to userspace alongside the uevents
issued by these ioctls in the variable DM_COOKIE.
This means the userspace process issuing these ioctls can be notified
by udev after udev has completed any actions triggered.
To minimise the interface extension, we pass the cookie into the
kernel in the event_nr field which is otherwise unused when calling
these ioctls. Incrementing the version number allows userspace to
determine in advance whether or not the kernel supports the cookie.
If the kernel does support this but userspace does not, there should
be no impact as the new variable will just get ignored.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This adds support for misc devices to report their requested nodename to
userspace. It also updates a number of misc drivers to provide the
needed subdirectory and device name to be used for them.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch provides support for data integrity passthrough in the device
mapper.
- If one or more component devices support integrity an integrity
profile is preallocated for the DM device.
- If all component devices have compatible profiles the DM device is
flagged as capable.
- Handle integrity metadata when splitting and cloning bios.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Fix an error introduced in dm-table-rework-reference-counting.patch.
When there is failure after table initialization, we need to use
dm_table_destroy, not dm_table_put, to free the table.
dm_table_put may be used only after dm_table_get.
Cc: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
When renaming a mapped device validate the length of the new name.
The rename ioctl accepted any correctly-terminated string enclosed
within the data passed from userspace. The other ioctls enforce a
size limit of DM_NAME_LEN. If the name is changed and becomes longer
than that, the device can no longer be addressed by name.
Fix it by properly checking for device name length (including
terminating zero).
Cc: stable@kernel.org
Signed-off-by: Milan Broz <mbroz@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Rework table reference counting.
The existing code uses a reference counter. When the last reference is
dropped and the counter reaches zero, the table destructor is called.
Table reference counters are acquired/released from upcalls from other
kernel code (dm_any_congested, dm_merge_bvec, dm_unplug_all).
If the reference counter reaches zero in one of the upcalls, the table
destructor is called from almost random kernel code.
This leads to various problems:
* dm_any_congested being called under a spinlock, which calls the
destructor, which calls some sleeping function.
* the destructor attempting to take a lock that is already taken by the
same process.
* stale reference from some other kernel code keeps the table
constructed, which keeps some devices open, even after successful
return from "dmsetup remove". This can confuse lvm and prevent closing
of underlying devices or reusing device minor numbers.
The patch changes reference counting so that the table destructor can be
called only at predetermined places.
The table has always exactly one reference from either mapped_device->map
or hash_cell->new_map. After this patch, this reference is not counted
in table->holders. A pair of dm_create_table/dm_destroy_table functions
is used for table creation/destruction.
Temporary references from the other code increase table->holders. A pair
of dm_table_get/dm_table_put functions is used to manipulate it.
When the table is about to be destroyed, we wait for table->holders to
reach 0. Then, we call the table destructor. We use active waiting with
msleep(1), because the situation happens rarely (to one user in 5 years)
and removing the device isn't performance-critical task: the user doesn't
care if it takes one tick more or not.
This way, the destructor is called only at specific points
(dm_table_destroy function) and the above problems associated with lazy
destruction can't happen.
Finally remove the temporary protection added to dm_any_congested().
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Allow NULL buffer in dm_copy_name_and_uuid if you only want to return one of
the fields.
(Required by a following patch that adds these fields to sysfs.)
Signed-off-by: Milan Broz <mbroz@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Split struct dm_dev in two and publish the part that other targets need in
include/linux/device-mapper.h.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
* Implement disk_devt() and part_devt() and use them to directly
access devt instead of computing it from ->major and ->first_minor.
Note that all references to ->major and ->first_minor outside of
block layer is used to determine devt of the disk (the part0) and as
->major and ->first_minor will continue to represent devt for the
disk, converting these users aren't strictly necessary. However,
convert them for consistency.
* Implement disk_max_parts() to avoid directly deferencing
genhd->minors.
* Update bdget_disk() such that it doesn't assume consecutive minor
space.
* Move devt computation from register_disk() to add_disk() and make it
the only one (all other usages use the initially determined value).
These changes clean up the code and will help disk->part dereference
fix and extended block device numbers.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
drivers/md/dm-ioctl.c:1405: warning: 'param' may be used uninitialized in this function
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>