Commit Graph

57 Commits

Author SHA1 Message Date
Chris Mason 6db4a7335d Merge branch 'fix/waitqueue-barriers' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4 2015-10-12 16:24:40 -07:00
Chris Mason 62fb50ab7c Merge branch 'anand/sysfs-updates-v4.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4
Signed-off-by: Chris Mason <clm@fb.com>
2015-10-12 16:24:15 -07:00
David Sterba 730d9ec36b btrfs: remove waitqueue_active check from btrfs_rm_dev_replace_unblocked
Normally the waitqueue_active would need a barrier, but this is not
necessary here because it's not a performance sensitive context and we
can call wake_up directly.

Suggested-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-10 18:16:38 +02:00
David Sterba ecaeb14b91 btrfs: switch message printers to _in_rcu variants
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-08 11:07:55 +02:00
Liu Bo 73416dab23 Btrfs: move kobj stuff out of dev_replace lock range
To avoid deadlock described in commit 084b6e7c76 ("btrfs: Fix a
lockdep warning when running xfstest."), we should move kobj stuff out
of dev_replace lock range.

  "It is because the btrfs_kobj_{add/rm}_device() will call memory
  allocation with GFP_KERNEL,
  which may flush fs page cache to free space, waiting for it self to do
  the commit, causing the deadlock.

  To solve the problem, move btrfs_kobj_{add/rm}_device() out of the
  dev_replace lock range, also involing split the
  btrfs_rm_dev_replace_srcdev() function into remove and free parts.

  Now only btrfs_rm_dev_replace_remove_srcdev() is called in dev_replace
  lock range, and kobj_{add/rm} and btrfs_rm_dev_replace_free_srcdev() are
  called out of the lock range."

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
[added lockup description]
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-01 18:07:59 +02:00
Anand Jain 9e271ae27e Btrfs: kernel operation should come after user input has been verified
By general rule of thumb there shouldn't be any way that user land
could trigger a kernel operation just by sending wrong arguments.

Here do commit cleanups after user input has been verified.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-01 17:45:10 +02:00
Anand Jain 3257604048 Btrfs: rename btrfs_kobj_rm_device to btrfs_sysfs_rm_device_link
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-09-29 16:29:59 +02:00
Anand Jain e3bd6973bc Btrfs: rename btrfs_kobj_add_device to btrfs_sysfs_add_device_link
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-09-29 16:29:59 +02:00
Tsutomu Itoh 527afb4493 Btrfs: cleanup: remove unnecessary check before btrfs_free_path is called
We need not check path before btrfs_free_path() is called because
path is checked in btrfs_free_path().

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-08-31 11:46:41 -07:00
Anand Jain 3e303ea60d btrfs: its btrfs_err() instead of btrfs_error()
sorry I indented to use btrfs_err() and I have no idea
how btrfs_error() got there.
infact I was thinking about these kind of oversights
since these two func are too closely named.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-07-22 18:20:53 -07:00
Anand Jain d2ff1b2008 Btrfs: sysfs: add support to show replacing target in the sysfs
This patch will add support to show the replacing target in sysfs
during the process of replacement.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2015-06-19 14:03:54 +02:00
Anand Jain 6c14a1641b Btrfs: sysfs btrfs_kobj_rm_device() pass fs_devices instead of fs_info
since btrfs_kobj_rm_device() does nothing with fs_info

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2015-05-27 12:27:21 +02:00
Anand Jain 1ba43816af Btrfs: sysfs btrfs_kobj_add_device() pass fs_devices instead of fs_info
btrfs_kobj_add_device() does not need fs_info any more.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2015-05-27 12:27:21 +02:00
David Sterba f8c269d722 btrfs: cleanup 64bit/32bit divs, compile time constants
Switch to div_u64 if the divisor is a numeric constant or sum of
sizeof()s. We can remove a few instances of do_div that has the hidden
semtantics of changing the 1st argument.

Small power-of-two divisors are converted to bitshifts, large values are
kept intact for clarity.

Signed-off-by: David Sterba <dsterba@suse.cz>
2015-03-03 17:23:57 +01:00
Zhao Lei 7653947fe6 Btrfs: btrfs_rm_dev_replace_blocked(): Use wait_event()
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21 18:06:48 -08:00
Zhao Lei 09dd7a01c3 Btrfs: Cleanup btrfs_bio_counter_inc_blocked()
1: Remove no-need DEFINE_WAIT(wait)
2: Add likely() for BTRFS_FS_STATE_DEV_REPLACING condition
3: Use while loop instead of goto

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21 18:06:48 -08:00
Chris Mason 9627aeee3e Merge branch 'raid56-scrub-replace' of git://github.com/miaoxie/linux-btrfs into for-linus 2014-12-02 18:42:03 -08:00
Zhao Lei 5d3edd8f44 Btrfs, replace: enable dev-replace for raid56
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
2014-12-03 10:20:48 +08:00
Miao Xie 4245215d6a Btrfs, raid56: fix use-after-free problem in the final device replace procedure on raid56
The commit c404e0dc (Btrfs: fix use-after-free in the finishing
procedure of the device replace) fixed a use-after-free problem
which happened when removing the source device at the end of device
replace, but at that time, btrfs didn't support device replace
on raid56, so we didn't fix the problem on the raid56 profile.
Currently, we implemented device replace for raid56, so we need
kick that problem out before we enable that function for raid56.

The fix method is very simple, we just increase the bio per-cpu
counter before we submit a raid56 io, and decrease the counter
when the raid56 io ends.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
2014-12-03 10:18:47 +08:00
Qu Wenruo 084b6e7c76 btrfs: Fix a lockdep warning when running xfstest.
The following lockdep warning is triggered during xfstests:

[ 1702.980872] =========================================================
[ 1702.981181] [ INFO: possible irq lock inversion dependency detected ]
[ 1702.981482] 3.18.0-rc1 #27 Not tainted
[ 1702.981781] ---------------------------------------------------------
[ 1702.982095] kswapd0/77 just changed the state of lock:
[ 1702.982415]  (&delayed_node->mutex){+.+.-.}, at: [<ffffffffa03b0b51>] __btrfs_release_delayed_node+0x41/0x1f0 [btrfs]
[ 1702.982794] but this lock took another, RECLAIM_FS-unsafe lock in the past:
[ 1702.983160]  (&fs_info->dev_replace.lock){+.+.+.}

and interrupts could create inverse lock ordering between them.

[ 1702.984675]
other info that might help us debug this:
[ 1702.985524] Chain exists of:
  &delayed_node->mutex --> &found->groups_sem --> &fs_info->dev_replace.lock

[ 1702.986799]  Possible interrupt unsafe locking scenario:

[ 1702.987681]        CPU0                    CPU1
[ 1702.988137]        ----                    ----
[ 1702.988598]   lock(&fs_info->dev_replace.lock);
[ 1702.989069]                                local_irq_disable();
[ 1702.989534]                                lock(&delayed_node->mutex);
[ 1702.990038]                                lock(&found->groups_sem);
[ 1702.990494]   <Interrupt>
[ 1702.990938]     lock(&delayed_node->mutex);
[ 1702.991407]
 *** DEADLOCK ***

It is because the btrfs_kobj_{add/rm}_device() will call memory
allocation with GFP_KERNEL,
which may flush fs page cache to free space, waiting for it self to do
the commit, causing the deadlock.

To solve the problem, move btrfs_kobj_{add/rm}_device() out of the
dev_replace lock range, also involing split the
btrfs_rm_dev_replace_srcdev() function into remove and free parts.

Now only btrfs_rm_dev_replace_remove_srcdev() is called in dev_replace
lock range, and kobj_{add/rm} and btrfs_rm_dev_replace_free_srcdev() are
called out of the lock range.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-11-25 05:55:38 -08:00
Eryu Guan 2fc9f6baa2 Btrfs: return failure if btrfs_dev_replace_finishing() failed
device replace could fail due to another running scrub process or any
other errors btrfs_scrub_dev() may hit, but this failure doesn't get
returned to userspace.

The following steps could reproduce this issue

	mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
	mount /dev/sdb1 /mnt/btrfs
	while true; do btrfs scrub start -B /mnt/btrfs >/dev/null 2>&1; done &
	btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
	# if this replace succeeded, do the following and repeat until
	# you see this log in dmesg
	# BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
	#btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs

	# once you see the error log in dmesg, check return value of
	# replace
	echo $?

Introduce a new dev replace result

BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS

to catch -EINPROGRESS explicitly and return other errors directly to
userspace.

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-11-20 17:14:28 -08:00
Miao Xie 82372bc816 Btrfs: make the logic of source device removing more clear
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-09-17 13:38:46 -07:00
Miao Xie 67a2c45ee7 Btrfs: fix use-after-free problem of the device during device replace
The problem is:
	Task0(device scan task)		Task1(device replace task)
	scan_one_device()
	mutex_lock(&uuid_mutex)
	device = find_device()
					mutex_lock(&device_list_mutex)
					lock_chunk()
					rm_and_free_source_device
					unlock_chunk()
					mutex_unlock(&device_list_mutex)
	check device

Destroying the target device if device replace fails also has the same problem.

We fix this problem by locking uuid_mutex during destroying source device or
target device, just like the device remove operation.

It is a temporary solution, we can fix this problem and make the code more
clear by atomic counter in the future.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-09-17 13:38:44 -07:00
Miao Xie 2196d6e8a7 Btrfs: Fix misuse of chunk mutex
There were several problems about chunk mutex usage:
- Lock chunk mutex when updating metadata. It would cause the nested
  deadlock because updating metadata might need allocate new chunks
  that need acquire chunk mutex. We remove chunk mutex at this case,
  because b-tree lock and other lock mechanism can help us.
- ABBA deadlock occured between device_list_mutex and chunk_mutex.
  When we update device status, we must acquire device_list_mutex at the
  beginning, and then we might get chunk_mutex during the device status
  update because we need allocate new chunks for metadata COW. But at
  most place, we acquire chunk_mutex at first and then acquire device list
  mutex. We need change the lock order.
- Some place we needn't acquire chunk_mutex. For example we needn't get
  chunk_mutex when we free a empty seed fs_devices structure.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-09-17 13:38:42 -07:00
Miao Xie 7cc8e58d53 Btrfs: fix unprotected device's variants on 32bits machine
->total_bytes,->disk_total_bytes,->bytes_used is protected by chunk
lock when we change them, but sometimes we read them without any lock,
and we might get unexpected value. We fix this problem like inode's
i_size.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-09-17 13:38:38 -07:00