Commit Graph

362878 Commits

Author SHA1 Message Date
Josef Bacik 69a85bd87c Btrfs: don't null pointer deref on abort
I'm sorry, there's no excuse for this sort of work.  We need to use
root->leafsize since eb may be NULL.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:12 -04:00
Gabriel de Perthuis 03b71c6ca6 btrfs: don't stop searching after encountering the wrong item
The search ioctl skips items that are too large for a result buffer, but
inline items of a certain size occurring before any search result is
found would trigger an overflow and stop the search entirely.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=57641

Cc: stable@vger.kernel.org
Signed-off-by: Gabriel de Perthuis <g2p.code+btrfs@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:10 -04:00
Liu Bo a52f4cd2b1 Btrfs: fix off-by-one in fiemap
lock_extent/unlock_extent expect an exclusive end.

Tested-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 16:27:26 -04:00
David Sterba 60b62978bc btrfs: annotate quota tree for lockdep
Quota tree has been missing from lockdep annotations, though no warning
has been seen in the wild.

There's currently one entry that does not belong there,
BTRFS_ORPHAN_OBJECTID.  No such tree exists; it's probably a copy &
paste mistake, since the id is defined among the tree ids.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 16:27:25 -04:00
Chris Mason 667e7d94a1 Btrfs: allow superblock mismatch from older mkfs
We've added new checks to make sure the super block crc is correct
during mount.  A fresh filesystem from an older mkfs won't have the
crc set.  This adds a warning when it finds a newly created filesystem
but doesn't fail the mount.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-05-07 11:00:13 -04:00
David Sterba 1104a88551 btrfs: enhance superblock checks
The superblock checksum is not verified upon mount. <awkward silence>

Add that check and also reorder existing checks to a more logical
order.

Current mkfs.btrfs does not calculate the correct checksum of
super_block and thus a freshly created filesystem will fail to mount when
this patch is applied.

First transaction commit calculates correct superblock checksum and
saves it to disk.

Reproducer:
$ mkfs.btrfs /dev/sda
$ mount /dev/sda /mnt
$ btrfs scrub start /mnt
$ sleep 5
$ btrfs scrub status /mnt
... super:2 ...

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-05-07 10:50:27 -04:00
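A minimal sketch of the mount-time checksum check described in 1104a88551. It is not the kernel code: the helper names, the 32-byte checksum field at the start of the block, and the host-endian storage of the value are assumptions made for illustration. CRC-32C is computed bit by bit to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: the checksum field occupies the first
 * CSUM_SIZE bytes of the block and covers everything after it. */
#define CSUM_SIZE 32

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82f63b78. */
static uint32_t crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xffffffffu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82f63b78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical helper: what the first transaction commit would do. */
static void super_set_csum(uint8_t *block, size_t len)
{
	uint32_t crc = crc32c(block + CSUM_SIZE, len - CSUM_SIZE);

	memset(block, 0, CSUM_SIZE);
	memcpy(block, &crc, sizeof(crc));	/* host byte order, for brevity */
}

/* Hypothetical helper: the check added at mount time. Returns 1 if OK. */
static int super_csum_ok(const uint8_t *block, size_t len)
{
	uint32_t stored;

	memcpy(&stored, block, sizeof(stored));
	return stored == crc32c(block + CSUM_SIZE, len - CSUM_SIZE);
}
```

A block whose checksum field was never filled in by mkfs fails super_csum_ok() until something calls super_set_csum(), which mirrors the fresh-filesystem behaviour described in the message.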
David Sterba b6919a58f0 btrfs: fix misleading variable name for flags
The variable was named 'data' in btrfs_reserve_extent and that's the
only function that actually uses it to let btrfs_get_alloc_profile know
what profile we want. Then it's passed down as u64 flags.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:27 -04:00
David Sterba 410748882a btrfs: use unsigned long type for extent state bits
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:27 -04:00
Liu Bo 625f1c8dc6 Btrfs: improve the loop of scrub_stripe
1) Right now scrub_stripe() is looping in some unnecessary cases:
* when the found extent item's objectid is already outside the dev extent's range
  but we haven't finished scanning the whole range within the dev extent
* when all the items have been processed but we haven't finished scanning the
  whole range within the dev extent

In both cases, we can just finish the loop to save costs.

2) Besides, when the found extent item's length is larger than the stripe
len (64k), we don't have to release the path and search again, as that would land on the
same key used in the last loop; we can instead advance the logical cursor in
place until all space of the extent is scanned.

3) We use 0 as the key's offset to search the btree, then step to the previous item
to find a smaller item, and then have to move to the next one again to get the right
item.  Setting offset=-1 and using previous_item() is the correct way.

4) As we won't find any checksum at offset unless this 'offset' is in a data
extent, we can just find checksum when we're really going to scrub an extent.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:26 -04:00
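Point 3 of 625f1c8dc6 can be illustrated on a sorted array instead of a btree. This is only a sketch: the two-field key, the array, and the function names are hypothetical stand-ins, not btrfs structures. Searching with the largest possible offset lands just past every item with that objectid, so a single step back yields the last item at or before the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified key: real btrfs keys also carry a type field. */
struct key {
	uint64_t objectid;
	uint64_t offset;
};

static int key_cmp(const struct key *a, const struct key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Binary search: index of the first key greater than target. */
static size_t search_slot(const struct key *keys, size_t n,
			  const struct key *target)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (key_cmp(&keys[mid], target) <= 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Search with offset = (u64)-1, then step back one slot.
 * Returns the index of the last item whose objectid is <= the
 * requested one, or -1 if there is none. */
static long find_item_at_or_before(const struct key *keys, size_t n,
				   uint64_t objectid)
{
	struct key target = { objectid, UINT64_MAX };
	size_t slot = search_slot(keys, n, &target);

	return slot ? (long)slot - 1 : -1;
}
```

With offset 0 the search would stop in front of the items sharing the target objectid, which is why the old code had to step back and then forward again.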
David Sterba 55793c0d03 btrfs: read entire device info under lock
There's a theoretical possibility of reading stale (or even more
theoretically, freed) data from DEV_INFO ioctl when the device would
disappear between an early mutex unlock and data being copied from the
device structure.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:25 -04:00
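The pattern behind 55793c0d03, copying everything needed while the lock is still held, can be sketched in userspace with a pthread mutex. The structures and the function name here are hypothetical; the kernel code uses its own device list and mutex.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

struct device {
	uint64_t devid;
	uint64_t total_bytes;
	char path[64];
};

struct device_info {
	uint64_t devid;
	uint64_t total_bytes;
	char path[64];
};

struct device_list {
	pthread_mutex_t lock;
	struct device *dev;	/* NULL once the device has gone away */
};

/* Copy every field while holding the lock, so the device cannot
 * disappear between the lookup and the copy. Returns 0 on success,
 * -1 if the device is gone. */
static int get_dev_info(struct device_list *list, struct device_info *out)
{
	int ret = 0;

	pthread_mutex_lock(&list->lock);
	if (!list->dev) {
		ret = -1;
	} else {
		out->devid = list->dev->devid;
		out->total_bytes = list->dev->total_bytes;
		memcpy(out->path, list->dev->path, sizeof(out->path));
	}
	pthread_mutex_unlock(&list->lock);
	return ret;
}
```

The caller only ever touches its private copy after the unlock, so nothing stale or freed is read.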
David Sterba f7a52a40ca btrfs: remove unused gfp mask parameter from release_extent_buffer callchain
It's unused since 0b32f4bbb4.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:24 -04:00
David Sterba 34c2b29079 btrfs: handle errors returned from get_tree_block_key
Signed-off-by: David Sterba <dsterba@suse.cz>
Reviewed-by: Zach Brown <zab@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:24 -04:00
Eric Sandeen 48a3b6366f btrfs: make static code static & remove dead code
Big patch, but all it does is add statics to functions which
are in fact static, then remove the associated dead-code fallout.

removed functions:

btrfs_iref_to_path()
__btrfs_lookup_delayed_deletion_item()
__btrfs_search_delayed_insertion_item()
__btrfs_search_delayed_deletion_item()
find_eb_for_page()
btrfs_find_block_group()
range_straddles_pages()
extent_range_uptodate()
btrfs_file_extent_length()
btrfs_scrub_cancel_devid()
btrfs_start_transaction_lflush()

btrfs_print_tree() is left because it is used for debugging.
btrfs_start_transaction_lflush() and btrfs_reada_detach() are
left for symmetry.

ulist.c functions are left, another patch will take care of those.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:23 -04:00
Josef Bacik 634554dc0a Btrfs: deal with errors in write_dev_supers
If you try to mount -o loop a restored file system it will panic if the file
ends up being smaller than the original disk.  This is because we go to try and
get a block for a super that may be past the EOF which makes __getblk return
NULL for a buffer head when we aren't expecting it to.  Fix this by dealing with
this case and just jacking up the errors count.  With this patch we no longer
panic when mounting a restored file system loopback.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:22 -04:00
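A sketch of the approach described in 634554dc0a: treat a missing block as a counted error instead of dereferencing NULL. The backing buffer, helper names, and mirror offsets (64KiB, 64MiB, 256GiB) are assumptions for this example; get_block() merely stands in for a block lookup that fails past EOF.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SUPER_MIRROR_MAX 3
#define SUPER_SIZE 4096u

/* Assumed mirror layout: 64KiB, 64MiB, 256GiB. */
static uint64_t sb_offset(int mirror)
{
	uint64_t start = 16 * 1024;

	if (mirror)
		return start << (12 * mirror);
	return 64 * 1024;
}

/* Stand-in for a block lookup: returns NULL past the end of the file. */
static uint8_t *get_block(uint8_t *backing, uint64_t backing_len,
			  uint64_t off)
{
	if (off + SUPER_SIZE > backing_len)
		return NULL;
	return backing + off;
}

/* Write every super copy that fits; count the ones that do not.
 * Returns the number of copies that could not be written. */
static int write_supers(uint8_t *backing, uint64_t backing_len, uint8_t fill)
{
	int errors = 0;

	for (int i = 0; i < SUPER_MIRROR_MAX; i++) {
		uint8_t *blk = get_block(backing, backing_len, sb_offset(i));

		if (!blk) {
			errors++;	/* no crash, just account for it */
			continue;
		}
		memset(blk, fill, SUPER_SIZE);
	}
	return errors;
}
```

The caller can then decide how many failed copies are tolerable rather than crashing on the first one.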
Josef Bacik 3650860b90 Btrfs: remove almost all of the BUG()'s from tree-log.c
There were a whole bunch and I was doing it for other things.  I haven't tested
these error paths but at the very least this is better than panicking.  I've only
left 2 BUG_ON()'s since they are logic errors and I want to replace them with an
ASSERT framework that we can compile out for production users.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:21 -04:00
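One possible shape for the compile-out assertion mentioned in 3650860b90, shown purely as an illustration. SKETCH_DEBUG, SKETCH_ASSERT, and replay_entry() are hypothetical names, not anything from the btrfs tree. Logic errors go through the macro, while runtime failures are returned to the caller.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef SKETCH_DEBUG
/* Debug builds: report the failed expression and abort. */
#define SKETCH_ASSERT(expr)                                           \
	do {                                                          \
		if (!(expr)) {                                        \
			fprintf(stderr, "assertion failed: %s\n", #expr); \
			abort();                                      \
		}                                                     \
	} while (0)
#else
/* Production builds: the check compiles away entirely. */
#define SKETCH_ASSERT(expr) ((void)0)
#endif

/* Hypothetical replay step: a NULL 'out' is a caller bug (assert),
 * a NULL 'entry' is a runtime condition (error return, no panic). */
static int replay_entry(const int *entry, int *out)
{
	SKETCH_ASSERT(out != NULL);

	if (entry == NULL)
		return -EIO;
	*out = *entry;
	return 0;
}
```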
Josef Bacik b50c6e250e Btrfs: deal with free space cache errors while replaying log
So everybody who got hit by my fsync bug will still continue to hit this
BUG_ON() in the free space cache, which is pretty heavy handed.  So I took a
file system that had this bug and fixed up all the BUG_ON()'s and leaks that
popped up when I tried to mount a broken file system like this.  With this patch
we just fail to mount instead of panicking.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:20 -04:00
Jan Schmidt 3d7b5a2882 Btrfs: automatic rescan after "quota enable" command
When qgroup tracking is enabled, we do an automatic cycle of the new rescan
mechanism.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:20 -04:00
Jan Schmidt 2f2320360b Btrfs: rescan for qgroups
If qgroup tracking is out of sync, a rescan operation can be started. It
iterates the complete extent tree and recalculates all qgroup tracking data.
This is an expensive operation and should not be used unless required.

A filesystem under rescan can still be unmounted. The rescan continues on the
next mount.  Status information is provided with a separate ioctl while a
rescan operation is in progress.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:19 -04:00
Jan Schmidt 46b665ceb1 Btrfs: split btrfs_qgroup_account_ref into four functions
The function is separated into a preparation part and the three accounting
steps mentioned in the qgroups documentation. The goal is to make steps two
and three usable by the rescan functionality. A side effect is that the
function is restructured into readable subunits.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:18 -04:00
Miao Xie 3c76cd84e0 Btrfs: allocate new chunks if the space is not enough for global rsv
When running xfstests test 208, the fs returned ENOSPC even though
there was lots of free space on the disk.

By bisecting, we found it was introduced by commit 96f1bb5777.
That commit makes the space check for the global reservation in
can_overcommit() inconsistent with should_alloc_chunk().
can_overcommit() requires that the free space is 2 times the size
of the global reservation, or we can't overcommit. Instead,
we need to reclaim some reserved space, and if we still don't have
enough free space, we need to allocate a new chunk. Unfortunately,
should_alloc_chunk() only requires that the free space is 1 times
the size of the global reservation, so we would not try to
allocate a new chunk if the free space size falls between
these two requirements, and would just return ENOSPC. Fix it.

Cc: Jim Schutt <jaschut@sandia.gov>
Cc: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:17 -04:00
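The gap described in 3c76cd84e0 can be modelled with two threshold checks. This is a deliberately simplified, hypothetical model (the real functions weigh more inputs); alloc_multiplier is an illustrative parameter standing for the factor used by the chunk-allocation check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a reservation would fail with ENOSPC:
 * overcommit is refused (free < 2 * rsv) and no new chunk is
 * allocated either (free >= alloc_multiplier * rsv). */
static bool returns_enospc(uint64_t free_bytes, uint64_t global_rsv,
			   unsigned int alloc_multiplier)
{
	bool overcommit = free_bytes >= 2 * global_rsv;
	bool alloc_chunk = free_bytes < (uint64_t)alloc_multiplier * global_rsv;

	return !overcommit && !alloc_chunk;
}
```

With a multiplier of 1, any free space between 1x and 2x the reservation falls through both checks; with a multiplier of 2 the two checks agree and the gap disappears.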
Jan Schmidt fc36ed7e0b Btrfs: separate sequence numbers for delayed ref tracking and tree mod log
Sequence numbers for delayed refs have been introduced in the first version
of the qgroup patch set. To solve the problem of find_all_roots on a busy
file system, the tree mod log was introduced. The sequence numbers for that
were simply shared between those two users.

However, at one point in qgroup's quota accounting, there's a statement
accessing the previous sequence number, that's still just doing (seq - 1)
just as it would have to in the very first version.

To satisfy that requirement, this patch makes the sequence number counter 64
bit and splits it into a major part (used for qgroup sequence number
counting) and a minor part (incremented for each tree modification in the
log). This enables us to go exactly one major step backwards, as required
for qgroups, while still incrementing the sequence counter for tree mod log
insertions to keep track of their order. Keeping them in a single variable
means there's no need to change all the code dealing with comparisons of two
sequence numbers.

The sequence number is reset to 0 on commit (not new in this patch), which
ensures we won't overflow the two 32 bit counters.

Without this fix, the qgroup tracking can occasionally go wrong and WARN_ONs
from the tree mod log code may happen.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:17 -04:00
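A sketch of the split counter described in fc36ed7e0b, assuming the major part lives in the upper 32 bits and the minor part in the lower 32 bits. The helper names are hypothetical and the kernel's atomic handling is omitted.

```c
#include <stdint.h>

#define SEQ_MAJOR_MASK 0xffffffff00000000ULL
#define SEQ_MAJOR_STEP (1ULL << 32)

/* One tree modification: bump the minor part. */
static uint64_t seq_inc_minor(uint64_t seq)
{
	return seq + 1;
}

/* One qgroup step: clear the minor part and bump the major part. */
static uint64_t seq_inc_major(uint64_t seq)
{
	return (seq & SEQ_MAJOR_MASK) + SEQ_MAJOR_STEP;
}

/* Largest sequence number belonging to the previous major step.
 * Only meaningful once the major part is at least 1. */
static uint64_t seq_prev_major(uint64_t seq)
{
	return (seq & SEQ_MAJOR_MASK) - 1;
}

static uint32_t seq_major(uint64_t seq)
{
	return (uint32_t)(seq >> 32);
}

static uint32_t seq_minor(uint64_t seq)
{
	return (uint32_t)seq;
}
```

Because both parts share one 64-bit value, an ordinary integer comparison still orders any two sequence numbers, which is why the comparison code does not need to change.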
Eric Sandeen 6d49ba1b47 btrfs: move leak debug code to functions
Clean up the leak debugging in extent_io.c by moving
the debug code into functions.  This also removes the
list_heads used for debugging from the extent_buffer
and extent_state structures when debug is not enabled.

Since we need a global debug config to do that last
part, implement CONFIG_BTRFS_DEBUG to accommodate.

Thanks to Dave Sterba for the Kconfig bit.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:16 -04:00
Liu Bo ace68bac61 Btrfs: return free space in cow error path
Replace some BUG_ONs with proper handling and take allocated space back to
free space cache for later use.

We don't have to worry about extent maps since they'd be freed in releasepage
path.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:15 -04:00
Stefan Behrens 6463fe58ea Btrfs: set UUID in root_item for created trees
It is a rare exception that a new tree is created, like the qgroups
tree. So far these new trees have had an all-zero UUID in their root
items. All trees that mkfs.btrfs has created get a UUID during the
first mount, when btrfs_read_root_item() rewrites the root_item to
the v2 structure style. These UUIDs have not been used so far, but
since it is better to have this uniform for all trees, this
commit adds some lines that generate and write a UUID for newly
created trees.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:14 -04:00
Stefan Behrens 5fbf83c10c Btrfs: delete unused parameter to btrfs_read_root_item()
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-06 15:55:14 -04:00