For TREE_BLOCK_REF, SHARED_DATA_REF and SHARED_BLOCK_REF we need to
check:
| TREE_BLOCK_REF | SHARED_BLOCK_REF | SHARED_BLOCK_REF
--------------+----------------+-----------------+------------------
key->objectid | Alignment | Alignment | Alignment
key->offset | Any value | Alignment | Alignment
item_size | 0 | 0 | sizeof(le32) (*)
*: sizeof(struct btrfs_shared_data_ref)
So introduce a check to check all these 3 key types together.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch introduces the ability to check extent items.
This check involves:
- key->objectid check
Basic alignment check.
- key->type check
Against btrfs_extent_item::type and SKINNY_METADATA feature.
- key->offset alignment check for EXTENT_ITEM
- key->offset check for METADATA_ITEM
- item size check
Both against minimal size and stepping check.
- btrfs_extent_item check
Checks its flags and generation.
- btrfs_extent_inline_ref checks
Against 4 types inline ref.
Checks bytenr alignment and tree level.
- btrfs_extent_item::refs check
Check against total refs found in inline refs.
This check would be the most complex single item check due to its nature
of inlined items.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The btrfs_get_chunk_map() never returns NULL, it returns error pointers.
Fixes: 89b798ad1b ("btrfs: Use btrfs_get_io_geometry appropriately")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In the function btrfs_init_dev_stats() goto out is not needed, because the
alloc has failed. So just return -ENOMEM.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
%found_key is not used, drop it since it hasn't been used since the
beginning in 733f4fbbc1 ("Btrfs: read device stats on mount, write
modified ones during commit").
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function is used only for the readahead machinery. It makes no
sense to keep it external to reada.c file. Place it above its sole
caller and make it static. No functional changes.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The set_level callbacks do not do anything special and can be replaced
by a helper that uses the levels defined in the tables.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The maximum and default levels do not change and can be defined
directly. The set_level callback was a temporary solution and will be
removed.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The BTRFS_DEV_REPLACE_ITEM_STATE_x defines, as shown in [1], are
unused in both kernel and btrfs-progs (except for one instance of
BTRFS_DEV_REPLACE_ITEM_STATE_NEVER_STARTED in kernel).
[1]
btrfs.h:#define BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED 2
btrfs.h:#define BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED 3
btrfs.h:#define BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED 4
Further these define-values are different form its counterpart
BTRFS_IOCTL_DEV_REPLACE_STATE_x series as shown in [2].
[2]
btrfs_tree.h:#define BTRFS_DEV_REPLACE_ITEM_STATE_SUSPENDED 2
btrfs_tree.h:#define BTRFS_DEV_REPLACE_ITEM_STATE_FINISHED 3
btrfs_tree.h:#define BTRFS_DEV_REPLACE_ITEM_STATE_CANCELED 4
So this patch deletes the BTRFS_DEV_REPLACE_ITEM_STATE_x altogether, and
one instance of BTRFS_DEV_REPLACE_ITEM_STATE_NEVER_STARTED is replaced
with BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED in the kernel.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have this weird space flushing loop inside inode.c for evict where
we'll do the normal LIMIT flush, and then commit the transaction and
hope we get our space. This is super janky, and in fact there's really
nothing stopping us from using FLUSH_ALL except that we run delayed
iputs, which means we could deadlock. So introduce a new flush state
for eviction that does the normal priority flushing with all of the
states that are safe for eviction.
The nice side-effect of this is that we'll try harder for evictions.
Previously if (for example generic/269) you had a bunch of other
operations happening on the fs you could race with those reservations
when committing the transaction, and eventually miss getting a
reservation for the evict. With this code we'll have our ticket in
place through the transaction commit, so any pinned bytes will go to our
pending evictions first.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With the eviction flushing stuff we'll want to allow for different
states, but still work basically the same way that
priority_reclaim_metadata_space works currently. Refactor this to take
the flushing states and size as an argument so we can use the same logic
for limit flushing and eviction flushing.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We're going to make this logic a little more complicated for evict, so
factor the ticket flushing/waiting code out of __reserve_metadata_bytes.
This has no functional change.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently we handle the cleanup of errored out tickets in both the
priority flush path and the normal flushing path. This is the same code
in both places, so just refactor so we don't duplicate the cleanup work.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Delayed iputs could very well free up enough space without needing to
commit the transaction, so make this step it's own step. This will
allow us to skip the step for evictions in a later patch.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
These were renamed and exported to facilitate logical migration of
different code chunks into block-group.c. Now that all the users are in
one file go ahead and rename them back, move the code around, and make
them static.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This can now be easily migrated as well.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ refresh on top of sysfs cleanups ]
Signed-off-by: David Sterba <dsterba@suse.com>
These feel more at home in block-group.c.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ refresh, adjust btrfs_get_alloc_profile exports ]
Signed-off-by: David Sterba <dsterba@suse.com>
This feels more at home in block-group.c than in extent-tree.c.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>i
[ refresh ]
Signed-off-by: David Sterba <dsterba@suse.com>
This gets used by a few different logical chunks of the block group
code, export it while we move things around.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
All of the prep work has been done so we can now cleanly move this chunk
over.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ refresh, add btrfs_get_alloc_profile export, comment updates ]
Signed-off-by: David Sterba <dsterba@suse.com>