mirror of
https://github.com/linux-apfs/linux-apfs.git
synced 2026-05-01 15:00:59 -07:00
Merge branch 'for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
 "We have a good sized cleanup of our internal read ahead code, and the
  first series of commits from Chandan to enable PAGE_SIZE > sectorsize

  Otherwise, it's a normal series of cleanups and fixes, with many
  thanks to Dave Sterba for doing most of the patch wrangling this time"

* 'for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (82 commits)
  btrfs: make sure we stay inside the bvec during __btrfs_lookup_bio_sums
  btrfs: Fix misspellings in comments.
  btrfs: Print Warning only if ENOSPC_DEBUG is enabled
  btrfs: scrub: silence an uninitialized variable warning
  btrfs: move btrfs_compression_type to compression.h
  btrfs: rename btrfs_print_info to btrfs_print_mod_info
  Btrfs: Show a warning message if one of objectid reaches its highest value
  Documentation: btrfs: remove usage specific information
  btrfs: use kbasename in btrfsic_mount
  Btrfs: do not collect ordered extents when logging that inode exists
  Btrfs: fix race when checking if we can skip fsync'ing an inode
  Btrfs: fix listxattrs not listing all xattrs packed in the same item
  Btrfs: fix deadlock between direct IO reads and buffered writes
  Btrfs: fix extent_same allowing destination offset beyond i_size
  Btrfs: fix file loss on log replay after renaming a file and fsync
  Btrfs: fix unreplayable log after snapshot delete + parent dir fsync
  Btrfs: fix lockdep deadlock warning due to dev_replace
  btrfs: drop unused argument in btrfs_ioctl_get_supported_features
  btrfs: add GET_SUPPORTED_FEATURES to the control device ioctls
  btrfs: change max_inline default to 2048
  ...
@@ -1,20 +1,10 @@
-
 BTRFS
 =====

-Btrfs is a copy on write filesystem for Linux aimed at
-implementing advanced features while focusing on fault tolerance,
-repair and easy administration. Initially developed by Oracle, Btrfs
-is licensed under the GPL and open for contribution from anyone.
-
-Linux has a wealth of filesystems to choose from, but we are facing a
-number of challenges with scaling to the large storage subsystems that
-are becoming common in today's data centers. Filesystems need to scale
-in their ability to address and manage large storage, and also in
-their ability to detect, repair and tolerate errors in the data stored
-on disk. Btrfs is under heavy development, and is not suitable for
-any uses other than benchmarking and review. The Btrfs disk format is
-not yet finalized.
+Btrfs is a copy on write filesystem for Linux aimed at implementing advanced
+features while focusing on fault tolerance, repair and easy administration.
+Jointly developed by several companies, licensed under the GPL and open for
+contribution from anyone.

 The main Btrfs features include:

@@ -28,243 +18,14 @@ The main Btrfs features include:
 	* Checksums on data and metadata (multiple algorithms available)
 	* Compression
 	* Integrated multiple device support, with several raid algorithms
-	* Online filesystem check (not yet implemented)
-	* Very fast offline filesystem check
-	* Efficient incremental backup and FS mirroring (not yet implemented)
+	* Offline filesystem check
+	* Efficient incremental backup and FS mirroring
 	* Online filesystem defragmentation

+For more information please refer to the wiki

-Mount Options
-=============
+  https://btrfs.wiki.kernel.org

-When mounting a btrfs filesystem, the following option are accepted.
-Options with (*) are default options and will not show in the mount options.
-
-  alloc_start=<bytes>
-	Debugging option to force all block allocations above a certain
-	byte threshold on each block device. The value is specified in
-	bytes, optionally with a K, M, or G suffix, case insensitive.
-	Default is 1MB.
-
-  noautodefrag(*)
-  autodefrag
-	Disable/enable auto defragmentation.
-	Auto defragmentation detects small random writes into files and queue
-	them up for the defrag process. Works best for small files;
-	Not well suited for large database workloads.
-
-  check_int
-  check_int_data
-  check_int_print_mask=<value>
-	These debugging options control the behavior of the integrity checking
-	module (the BTRFS_FS_CHECK_INTEGRITY config option required).
-
-	check_int enables the integrity checker module, which examines all
-	block write requests to ensure on-disk consistency, at a large
-	memory and CPU cost.
-
-	check_int_data includes extent data in the integrity checks, and
-	implies the check_int option.
-
-	check_int_print_mask takes a bitmask of BTRFSIC_PRINT_MASK_* values
-	as defined in fs/btrfs/check-integrity.c, to control the integrity
-	checker module behavior.
-
-	See comments at the top of fs/btrfs/check-integrity.c for more info.
-
-  commit=<seconds>
-	Set the interval of periodic commit, 30 seconds by default. Higher
-	values defer data being synced to permanent storage with obvious
-	consequences when the system crashes. The upper bound is not forced,
-	but a warning is printed if it's more than 300 seconds (5 minutes).
-
-  compress
-  compress=<type>
-  compress-force
-  compress-force=<type>
-	Control BTRFS file data compression. Type may be specified as "zlib"
-	"lzo" or "no" (for no compression, used for remounting). If no type
-	is specified, zlib is used. If compress-force is specified,
-	all files will be compressed, whether or not they compress well.
-	If compression is enabled, nodatacow and nodatasum are disabled.
-
-  degraded
-	Allow mounts to continue with missing devices. A read-write mount may
-	fail with too many devices missing, for example if a stripe member
-	is completely missing.
-
-  device=<devicepath>
-	Specify a device during mount so that ioctls on the control device
-	can be avoided. Especially useful when trying to mount a multi-device
-	setup as root. May be specified multiple times for multiple devices.
-
-  nodiscard(*)
-  discard
-	Disable/enable discard mount option.
-	Discard issues frequent commands to let the block device reclaim space
-	freed by the filesystem.
-	This is useful for SSD devices, thinly provisioned
-	LUNs and virtual machine images, but may have a significant
-	performance impact. (The fstrim command is also available to
-	initiate batch trims from userspace).
-
-  noenospc_debug(*)
-  enospc_debug
-	Disable/enable debugging option to be more verbose in some ENOSPC conditions.
-
-  fatal_errors=<action>
-	Action to take when encountering a fatal error:
-	  "bug" - BUG() on a fatal error. This is the default.
-	  "panic" - panic() on a fatal error.
-
-  noflushoncommit(*)
-  flushoncommit
-	The 'flushoncommit' mount option forces any data dirtied by a write in a
-	prior transaction to commit as part of the current commit. This makes
-	the committed state a fully consistent view of the file system from the
-	application's perspective (i.e., it includes all completed file system
-	operations). This was previously the behavior only when a snapshot is
-	created.
-
-  inode_cache
-	Enable free inode number caching. Defaults to off due to an overflow
-	problem when the free space crcs don't fit inside a single page.
-
-  max_inline=<bytes>
-	Specify the maximum amount of space, in bytes, that can be inlined in
-	a metadata B-tree leaf. The value is specified in bytes, optionally
-	with a K, M, or G suffix, case insensitive. In practice, this value
-	is limited by the root sector size, with some space unavailable due
-	to leaf headers. For a 4k sector size, max inline data is ~3900 bytes.
-
-  metadata_ratio=<value>
-	Specify that 1 metadata chunk should be allocated after every <value>
-	data chunks. Off by default.
-
-  acl(*)
-  noacl
-	Enable/disable support for Posix Access Control Lists (ACLs). See the
-	acl(5) manual page for more information about ACLs.
-
-  barrier(*)
-  nobarrier
-	Enable/disable the use of block layer write barriers. Write barriers
-	ensure that certain IOs make it through the device cache and are on
-	persistent storage. If disabled on a device with a volatile
-	(non-battery-backed) write-back cache, nobarrier option will lead to
-	filesystem corruption on a system crash or power loss.
-
-  datacow(*)
-  nodatacow
-	Enable/disable data copy-on-write for newly created files.
-	Nodatacow implies nodatasum, and disables all compression.
-
-  datasum(*)
-  nodatasum
-	Enable/disable data checksumming for newly created files.
-	Datasum implies datacow.
-
-  treelog(*)
-  notreelog
-	Enable/disable the tree logging used for fsync and O_SYNC writes.
-
-  recovery
-	Enable autorecovery attempts if a bad tree root is found at mount time.
-	Currently this scans a list of several previous tree roots and tries to
-	use the first readable.
-
-  rescan_uuid_tree
-	Force check and rebuild procedure of the UUID tree. This should not
-	normally be needed.
-
-  skip_balance
-	Skip automatic resume of interrupted balance operation after mount.
-	May be resumed with "btrfs balance resume."
-
-  space_cache (*)
-	Enable the on-disk freespace cache.
-  nospace_cache
-	Disable freespace cache loading without clearing the cache.
-  clear_cache
-	Force clearing and rebuilding of the disk space cache if something
-	has gone wrong.
-
-  ssd
-  nossd
-  ssd_spread
-	Options to control ssd allocation schemes. By default, BTRFS will
-	enable or disable ssd allocation heuristics depending on whether a
-	rotational or non-rotational disk is in use. The ssd and nossd options
-	can override this autodetection.
-
-	The ssd_spread mount option attempts to allocate into big chunks
-	of unused space, and may perform better on low-end ssds. ssd_spread
-	implies ssd, enabling all other ssd heuristics as well.
-
-  subvol=<path>
-	Mount subvolume at <path> rather than the root subvolume. <path> is
-	relative to the top level subvolume.
-
-  subvolid=<ID>
-	Mount subvolume specified by an ID number rather than the root subvolume.
-	This allows mounting of subvolumes which are not in the root of the mounted
-	filesystem.
-	You can use "btrfs subvolume list" to see subvolume ID numbers.
-
-  subvolrootid=<objectid> (deprecated)
-	Mount subvolume specified by <objectid> rather than the root subvolume.
-	This allows mounting of subvolumes which are not in the root of the mounted
-	filesystem.
-	You can use "btrfs subvolume show " to see the object ID for a subvolume.
-
-  thread_pool=<number>
-	The number of worker threads to allocate. The default number is equal
-	to the number of CPUs + 2, or 8, whichever is smaller.
-
-  user_subvol_rm_allowed
-	Allow subvolumes to be deleted by a non-root user. Use with caution.
-
-MAILING LIST
-============
-
-There is a Btrfs mailing list hosted on vger.kernel.org. You can
-find details on how to subscribe here:
-
-http://vger.kernel.org/vger-lists.html#linux-btrfs
-
-Mailing list archives are available from gmane:
-
-http://dir.gmane.org/gmane.comp.file-systems.btrfs
-
-
-
-IRC
-===
-
-Discussion of Btrfs also occurs on the #btrfs channel of the Freenode
-IRC network.
-
-
-
-	UTILITIES
-	=========
-
-Userspace tools for creating and manipulating Btrfs file systems are
-available from the git repository at the following location:
-
- http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git
- git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
-
-These include the following tools:
-
-* mkfs.btrfs: create a filesystem
-
-* btrfs: a single tool to manage the filesystems, refer to the manpage for more details
-
-* 'btrfsck' or 'btrfs check': do a consistency check of the filesystem
-
-Other tools for specific tasks:
-
-* btrfs-convert: in-place conversion from ext2/3/4 filesystems
-
-* btrfs-image: dump filesystem metadata for debugging
+that maintains information about administration tasks, frequently asked
+questions, use cases, mount options, comprehensible changelogs, features,
+manual pages, source code repositories, contacts etc.
+4
-8
@@ -148,8 +148,7 @@ int __init btrfs_prelim_ref_init(void)

 void btrfs_prelim_ref_exit(void)
 {
-	if (btrfs_prelim_ref_cache)
-		kmem_cache_destroy(btrfs_prelim_ref_cache);
+	kmem_cache_destroy(btrfs_prelim_ref_cache);
 }

 /*
@@ -566,17 +565,14 @@ static void __merge_refs(struct list_head *head, int mode)
 		struct __prelim_ref *pos2 = pos1, *tmp;

 		list_for_each_entry_safe_continue(pos2, tmp, head, list) {
-			struct __prelim_ref *xchg, *ref1 = pos1, *ref2 = pos2;
+			struct __prelim_ref *ref1 = pos1, *ref2 = pos2;
 			struct extent_inode_elem *eie;

 			if (!ref_for_same_block(ref1, ref2))
 				continue;
 			if (mode == 1) {
-				if (!ref1->parent && ref2->parent) {
-					xchg = ref1;
-					ref1 = ref2;
-					ref2 = xchg;
-				}
+				if (!ref1->parent && ref2->parent)
+					swap(ref1, ref2);
 			} else {
 				if (ref1->parent != ref2->parent)
 					continue;
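Note on the hunk above: it replaces a hand-written three-assignment exchange through `xchg` with a single `swap()` call. The following is a rough userspace sketch of that pattern, not the kernel's own macro. `swap_sketch`, `struct ref_sketch` and `order_refs` are hypothetical names made up for illustration, and `__typeof__` assumes GCC or Clang.

```c
#include <assert.h>

/* Hypothetical stand-in for a generic swap helper (GCC/Clang __typeof__). */
#define swap_sketch(a, b) \
	do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical, trimmed-down reference: only the field the hunk tests. */
struct ref_sketch {
	unsigned long parent;
};

/*
 * Mirror the condition in the hunk: if the first ref has no parent but
 * the second one does, exchange the two pointers so that the ref with a
 * parent comes first.
 */
static void order_refs(struct ref_sketch **ref1, struct ref_sketch **ref2)
{
	assert(ref1 && ref2 && *ref1 && *ref2);
	if (!(*ref1)->parent && (*ref2)->parent)
		swap_sketch(*ref1, *ref2);
}
```

Calling `order_refs()` a second time on the same pair leaves it unchanged, because the condition no longer holds once the ref with a parent is first.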
@@ -95,6 +95,7 @@
 #include <linux/genhd.h>
 #include <linux/blkdev.h>
 #include <linux/vmalloc.h>
+#include <linux/string.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "hash.h"
@@ -105,6 +106,7 @@
 #include "locking.h"
 #include "check-integrity.h"
 #include "rcu-string.h"
+#include "compression.h"

 #define BTRFSIC_BLOCK_HASHTABLE_SIZE 0x10000
 #define BTRFSIC_BLOCK_LINK_HASHTABLE_SIZE 0x10000
@@ -176,7 +178,7 @@ struct btrfsic_block {
  * Elements of this type are allocated dynamically and required because
  * each block object can refer to and can be ref from multiple blocks.
  * The key to lookup them in the hashtable is the dev_bytenr of
- * the block ref to plus the one from the block refered from.
+ * the block ref to plus the one from the block referred from.
  * The fact that they are searchable via a hashtable and that a
  * ref_cnt is maintained is not required for the btrfs integrity
  * check algorithm itself, it is only used to make the output more
@@ -3076,7 +3078,7 @@ int btrfsic_mount(struct btrfs_root *root,

 	list_for_each_entry(device, dev_head, dev_list) {
 		struct btrfsic_dev_state *ds;
-		char *p;
+		const char *p;

 		if (!device->bdev || !device->name)
 			continue;
@@ -3092,11 +3094,7 @@ int btrfsic_mount(struct btrfs_root *root,
 		ds->state = state;
 		bdevname(ds->bdev, ds->name);
 		ds->name[BDEVNAME_SIZE - 1] = '\0';
-		for (p = ds->name; *p != '\0'; p++);
-		while (p > ds->name && *p != '/')
-			p--;
-		if (*p == '/')
-			p++;
+		p = kbasename(ds->name);
 		strlcpy(ds->name, p, sizeof(ds->name));
 		btrfsic_dev_state_hashtable_add(ds,
 				&btrfsic_dev_state_hashtable);
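Note on the hunk above: it drops a manual backwards scan for the last '/' in favour of `kbasename()`. The following is a minimal userspace sketch of what such a helper does, assuming it simply returns the text after the final slash. `basename_sketch` is a hypothetical stand-in, not the kernel function.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for a kbasename()-style helper:
 * return a pointer to the component after the last '/', or the whole
 * string when it contains no '/'. The result points into the input
 * string, so nothing is allocated or copied.
 */
static const char *basename_sketch(const char *path)
{
	const char *tail;

	assert(path != NULL);
	tail = strrchr(path, '/');
	return tail ? tail + 1 : path;
}
```

Because the result aliases the input, the `const char *p` declaration in the earlier hunk is the natural type for it.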
@@ -48,6 +48,15 @@ int btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 void btrfs_clear_biovec_end(struct bio_vec *bvec, int vcnt,
 				   unsigned long pg_index,
 				   unsigned long pg_offset);
+
+enum btrfs_compression_type {
+	BTRFS_COMPRESS_NONE = 0,
+	BTRFS_COMPRESS_ZLIB = 1,
+	BTRFS_COMPRESS_LZO = 2,
+	BTRFS_COMPRESS_TYPES = 2,
+	BTRFS_COMPRESS_LAST = 3,
+};
+
 struct btrfs_compress_op {
 	struct list_head *(*alloc_workspace)(void);

+18
-18
@@ -311,7 +311,7 @@ struct tree_mod_root {

 struct tree_mod_elem {
 	struct rb_node node;
-	u64 index;		/* shifted logical */
+	u64 logical;
 	u64 seq;
 	enum mod_log_op op;

@@ -435,11 +435,11 @@ void btrfs_put_tree_mod_seq(struct btrfs_fs_info *fs_info,

 /*
  * key order of the log:
- *       index -> sequence
+ *       node/leaf start address -> sequence
  *
- * the index is the shifted logical of the *new* root node for root replace
- * operations, or the shifted logical of the affected block for all other
- * operations.
+ * The 'start address' is the logical address of the *new* root node
+ * for root replace operations, or the logical address of the affected
+ * block for all other operations.
  *
  * Note: must be called with write lock (tree_mod_log_write_lock).
  */
@@ -460,9 +460,9 @@ __tree_mod_log_insert(struct btrfs_fs_info *fs_info, struct tree_mod_elem *tm)
 	while (*new) {
 		cur = container_of(*new, struct tree_mod_elem, node);
 		parent = *new;
-		if (cur->index < tm->index)
+		if (cur->logical < tm->logical)
 			new = &((*new)->rb_left);
-		else if (cur->index > tm->index)
+		else if (cur->logical > tm->logical)
 			new = &((*new)->rb_right);
 		else if (cur->seq < tm->seq)
 			new = &((*new)->rb_left);
@@ -523,7 +523,7 @@ alloc_tree_mod_elem(struct extent_buffer *eb, int slot,
 	if (!tm)
 		return NULL;

-	tm->index = eb->start >> PAGE_CACHE_SHIFT;
+	tm->logical = eb->start;
 	if (op != MOD_LOG_KEY_ADD) {
 		btrfs_node_key(eb, &tm->key, slot);
 		tm->blockptr = btrfs_node_blockptr(eb, slot);
@@ -588,7 +588,7 @@ tree_mod_log_insert_move(struct btrfs_fs_info *fs_info,
 		goto free_tms;
 	}

-	tm->index = eb->start >> PAGE_CACHE_SHIFT;
+	tm->logical = eb->start;
 	tm->slot = src_slot;
 	tm->move.dst_slot = dst_slot;
 	tm->move.nr_items = nr_items;
@@ -699,7 +699,7 @@ tree_mod_log_insert_root(struct btrfs_fs_info *fs_info,
 		goto free_tms;
 	}

-	tm->index = new_root->start >> PAGE_CACHE_SHIFT;
+	tm->logical = new_root->start;
 	tm->old_root.logical = old_root->start;
 	tm->old_root.level = btrfs_header_level(old_root);
 	tm->generation = btrfs_header_generation(old_root);
@@ -739,16 +739,15 @@ __tree_mod_log_search(struct btrfs_fs_info *fs_info, u64 start, u64 min_seq,
 	struct rb_node *node;
 	struct tree_mod_elem *cur = NULL;
 	struct tree_mod_elem *found = NULL;
-	u64 index = start >> PAGE_CACHE_SHIFT;

 	tree_mod_log_read_lock(fs_info);
 	tm_root = &fs_info->tree_mod_log;
 	node = tm_root->rb_node;
 	while (node) {
 		cur = container_of(node, struct tree_mod_elem, node);
-		if (cur->index < index) {
+		if (cur->logical < start) {
 			node = node->rb_left;
-		} else if (cur->index > index) {
+		} else if (cur->logical > start) {
 			node = node->rb_right;
 		} else if (cur->seq < min_seq) {
 			node = node->rb_left;
@@ -1230,9 +1229,10 @@ __tree_mod_log_oldest_root(struct btrfs_fs_info *fs_info,
 		return NULL;

 	/*
-	 * the very last operation that's logged for a root is the replacement
-	 * operation (if it is replaced at all). this has the index of the *new*
-	 * root, making it the very first operation that's logged for this root.
+	 * the very last operation that's logged for a root is the
+	 * replacement operation (if it is replaced at all). this has
+	 * the logical address of the *new* root, making it the very
+	 * first operation that's logged for this root.
 	 */
 	while (1) {
 		tm = tree_mod_log_search_oldest(fs_info, root_logical,
@@ -1336,7 +1336,7 @@ __tree_mod_log_rewind(struct btrfs_fs_info *fs_info, struct extent_buffer *eb,
 		if (!next)
 			break;
 		tm = container_of(next, struct tree_mod_elem, node);
-		if (tm->index != first_tm->index)
+		if (tm->logical != first_tm->logical)
 			break;
 	}
 	tree_mod_log_read_unlock(fs_info);
@@ -5361,7 +5361,7 @@ int btrfs_compare_trees(struct btrfs_root *left_root,
 		goto out;
 	}

-	tmp_buf = kmalloc(left_root->nodesize, GFP_NOFS);
+	tmp_buf = kmalloc(left_root->nodesize, GFP_KERNEL);
 	if (!tmp_buf) {
 		ret = -ENOMEM;
 		goto out;
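Note on the tree mod log hunks above: they switch the key from a page-shifted index (`eb->start >> PAGE_CACHE_SHIFT`) to the unshifted logical address. One plausible reading, given the pull message's mention of PAGE_SIZE > sectorsize, is that two tree blocks sharing a page would collapse onto the same shifted index. The sketch below illustrates this with assumed numbers (a 64K page and 16K tree blocks). `PAGE_SHIFT_SKETCH` and all three helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration only: a 64K page, i.e. a shift of 16. */
#define PAGE_SHIFT_SKETCH 16

/* Old-style key: logical start address shifted down by the page shift. */
static uint64_t key_by_index(uint64_t start)
{
	return start >> PAGE_SHIFT_SKETCH;
}

/* New-style key: the logical start address itself. */
static uint64_t key_by_logical(uint64_t start)
{
	return start;
}

/* Returns 1 when two distinct block addresses map to the same key. */
static int collides(uint64_t a, uint64_t b, uint64_t (*key)(uint64_t))
{
	assert(a != b);
	return key(a) == key(b);
}
```

With these assumed sizes, blocks starting at 1 MiB and at 1 MiB + 16 KiB share a shifted index, while their logical addresses stay distinct.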
+61
-26
@@ -100,6 +100,9 @@ struct btrfs_ordered_sum;
 /* tracks free space in block groups. */
 #define BTRFS_FREE_SPACE_TREE_OBJECTID 10ULL

+/* device stats in the device tree */
+#define BTRFS_DEV_STATS_OBJECTID 0ULL
+
 /* for storing balance parameters in the root tree */
 #define BTRFS_BALANCE_OBJECTID -4ULL

@@ -715,14 +718,6 @@ struct btrfs_timespec {
 	__le32 nsec;
 } __attribute__ ((__packed__));

-enum btrfs_compression_type {
-	BTRFS_COMPRESS_NONE = 0,
-	BTRFS_COMPRESS_ZLIB = 1,
-	BTRFS_COMPRESS_LZO = 2,
-	BTRFS_COMPRESS_TYPES = 2,
-	BTRFS_COMPRESS_LAST = 3,
-};
-
 struct btrfs_inode_item {
 	/* nfs style generation number */
 	__le64 generation;
@@ -793,7 +788,7 @@ struct btrfs_root_item {

 	/*
 	 * This generation number is used to test if the new fields are valid
-	 * and up to date while reading the root item. Everytime the root item
+	 * and up to date while reading the root item. Every time the root item
 	 * is written out, the "generation" field is copied into this field. If
 	 * anyone ever mounted the fs with an older kernel, we will have
 	 * mismatching generation values here and thus must invalidate the
@@ -1002,8 +997,10 @@ struct btrfs_dev_replace {
 	pid_t lock_owner;
 	atomic_t nesting_level;
 	struct mutex lock_finishing_cancel_unmount;
-	struct mutex lock_management_lock;
-	struct mutex lock;
+	rwlock_t lock;
+	atomic_t read_locks;
+	atomic_t blocking_readers;
+	wait_queue_head_t read_lock_wq;

 	struct btrfs_scrub_progress scrub_progress;
 };
@@ -1222,10 +1219,10 @@ struct btrfs_space_info {
 	 * we've called update_block_group and dropped the bytes_used counter
 	 * and increased the bytes_pinned counter. However this means that
 	 * bytes_pinned does not reflect the bytes that will be pinned once the
-	 * delayed refs are flushed, so this counter is inc'ed everytime we call
-	 * btrfs_free_extent so it is a realtime count of what will be freed
-	 * once the transaction is committed. It will be zero'ed everytime the
-	 * transaction commits.
+	 * delayed refs are flushed, so this counter is inc'ed every time we
+	 * call btrfs_free_extent so it is a realtime count of what will be
+	 * freed once the transaction is committed. It will be zero'ed every
+	 * time the transaction commits.
 	 */
 	struct percpu_counter total_bytes_pinned;

@@ -1822,6 +1819,9 @@ struct btrfs_fs_info {
 	spinlock_t reada_lock;
 	struct radix_tree_root reada_tree;

+	/* readahead works cnt */
+	atomic_t reada_works_cnt;
+
 	/* Extent buffer radix tree */
 	spinlock_t buffer_lock;
 	struct radix_tree_root buffer_radix;
@@ -2185,13 +2185,43 @@ struct btrfs_ioctl_defrag_range_args {
  */
 #define BTRFS_QGROUP_RELATION_KEY 246

+/*
+ * Obsolete name, see BTRFS_TEMPORARY_ITEM_KEY.
+ */
 #define BTRFS_BALANCE_ITEM_KEY 248

 /*
- * Persistantly stores the io stats in the device tree.
- * One key for all stats, (0, BTRFS_DEV_STATS_KEY, devid).
+ * The key type for tree items that are stored persistently, but do not need to
+ * exist for extended period of time. The items can exist in any tree.
+ *
+ * [subtype, BTRFS_TEMPORARY_ITEM_KEY, data]
+ *
+ * Existing items:
+ *
+ * - balance status item
+ *   (BTRFS_BALANCE_OBJECTID, BTRFS_TEMPORARY_ITEM_KEY, 0)
  */
-#define BTRFS_DEV_STATS_KEY 249
+#define BTRFS_TEMPORARY_ITEM_KEY 248
+
+/*
+ * Obsolete name, see BTRFS_PERSISTENT_ITEM_KEY
+ */
+#define BTRFS_DEV_STATS_KEY 249
+
+/*
+ * The key type for tree items that are stored persistently and usually exist
+ * for a long period, eg. filesystem lifetime. The item kinds can be status
+ * information, stats or preference values. The item can exist in any tree.
+ *
+ * [subtype, BTRFS_PERSISTENT_ITEM_KEY, data]
+ *
+ * Existing items:
+ *
+ * - device statistics, store IO stats in the device tree, one key for all
+ *   stats
+ *   (BTRFS_DEV_STATS_OBJECTID, BTRFS_DEV_STATS_KEY, 0)
+ */
+#define BTRFS_PERSISTENT_ITEM_KEY 249

 /*
  * Persistantly stores the device replace state in the device tree.
@@ -2241,7 +2271,7 @@ struct btrfs_ioctl_defrag_range_args {
 #define BTRFS_MOUNT_ENOSPC_DEBUG (1 << 15)
 #define BTRFS_MOUNT_AUTO_DEFRAG (1 << 16)
 #define BTRFS_MOUNT_INODE_MAP_CACHE (1 << 17)
-#define BTRFS_MOUNT_RECOVERY (1 << 18)
+#define BTRFS_MOUNT_USEBACKUPROOT (1 << 18)
 #define BTRFS_MOUNT_SKIP_BALANCE (1 << 19)
 #define BTRFS_MOUNT_CHECK_INTEGRITY (1 << 20)
 #define BTRFS_MOUNT_CHECK_INTEGRITY_INCLUDING_EXTENT_DATA (1 << 21)
@@ -2250,9 +2280,10 @@ struct btrfs_ioctl_defrag_range_args {
 #define BTRFS_MOUNT_FRAGMENT_DATA (1 << 24)
 #define BTRFS_MOUNT_FRAGMENT_METADATA (1 << 25)
 #define BTRFS_MOUNT_FREE_SPACE_TREE (1 << 26)
+#define BTRFS_MOUNT_NOLOGREPLAY (1 << 27)

 #define BTRFS_DEFAULT_COMMIT_INTERVAL (30)
-#define BTRFS_DEFAULT_MAX_INLINE (8192)
+#define BTRFS_DEFAULT_MAX_INLINE (2048)

 #define btrfs_clear_opt(o, opt) ((o) &= ~BTRFS_MOUNT_##opt)
 #define btrfs_set_opt(o, opt) ((o) |= BTRFS_MOUNT_##opt)
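Note on the hunk above: the `btrfs_set_opt()` and `btrfs_clear_opt()` macros shown as context build the flag name by token pasting, so the new NOLOGREPLAY bit needs no further plumbing to be set or cleared. The following is a small userspace sketch of that behaviour. The four macros are copied from this diff, while `demo_opts` is a hypothetical helper added for illustration.

```c
#include <assert.h>

/* Copied from the diff above. */
#define BTRFS_MOUNT_SKIP_BALANCE	(1 << 19)
#define BTRFS_MOUNT_NOLOGREPLAY		(1 << 27)

#define btrfs_clear_opt(o, opt)		((o) &= ~BTRFS_MOUNT_##opt)
#define btrfs_set_opt(o, opt)		((o) |= BTRFS_MOUNT_##opt)

/*
 * Hypothetical helper: set two options, clear one again, and return
 * the resulting bitmask. The second macro argument is pasted onto the
 * BTRFS_MOUNT_ prefix, so NOLOGREPLAY becomes BTRFS_MOUNT_NOLOGREPLAY.
 */
static unsigned long demo_opts(void)
{
	unsigned long opts = 0;

	btrfs_set_opt(opts, NOLOGREPLAY);
	btrfs_set_opt(opts, SKIP_BALANCE);
	assert(opts & BTRFS_MOUNT_SKIP_BALANCE);

	btrfs_clear_opt(opts, SKIP_BALANCE);
	return opts;
}
```

After the clear, only the NOLOGREPLAY bit (bit 27) remains set in the returned mask.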
@@ -2353,6 +2384,9 @@ struct btrfs_map_token {
 	unsigned long offset;
 };

+#define BTRFS_BYTES_TO_BLKS(fs_info, bytes) \
+				((bytes) >> (fs_info)->sb->s_blocksize_bits)
+
 static inline void btrfs_init_map_token (struct btrfs_map_token *token)
 {
 	token->kaddr = NULL;
@@ -3448,8 +3482,7 @@ u64 btrfs_csum_bytes_to_leaves(struct btrfs_root *root, u64 csum_bytes);
 static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_root *root,
 						 unsigned num_items)
 {
-	return (root->nodesize + root->nodesize * (BTRFS_MAX_LEVEL - 1)) *
-		2 * num_items;
+	return root->nodesize * BTRFS_MAX_LEVEL * 2 * num_items;
 }

 /*
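Note on the hunk above: the rewritten return statement is an algebraic simplification, since n + n * (L - 1) equals n * L. The following sketch checks that both forms agree for a few inputs. `MAX_LEVEL_SKETCH` is an assumed stand-in for BTRFS_MAX_LEVEL (its value is not shown in this diff), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stands in for BTRFS_MAX_LEVEL, whose value is not in this diff. */
#define MAX_LEVEL_SKETCH 8

/* The expression as it read before the change. */
static uint64_t size_old(uint64_t nodesize, unsigned num_items)
{
	return (nodesize + nodesize * (MAX_LEVEL_SKETCH - 1)) * 2 * num_items;
}

/* The simplified expression after the change. */
static uint64_t size_new(uint64_t nodesize, unsigned num_items)
{
	return nodesize * MAX_LEVEL_SKETCH * 2 * num_items;
}

/* Compare both forms over a handful of node sizes and item counts. */
static void check_equivalence(void)
{
	static const uint64_t nodesizes[] = { 4096, 16384, 65536 };
	unsigned i, items;

	for (i = 0; i < sizeof(nodesizes) / sizeof(nodesizes[0]); i++)
		for (items = 0; items <= 16; items++)
			assert(size_old(nodesizes[i], items) ==
			       size_new(nodesizes[i], items));
}
```

Under the assumed level count of 8, a single item with a 16K node size reserves 16384 * 8 * 2 = 262144 bytes in either form.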
@@ -4027,7 +4060,7 @@ int btrfs_unlink_subvol(struct btrfs_trans_handle *trans,
 			struct btrfs_root *root,
 			struct inode *dir, u64 objectid,
 			const char *name, int name_len);
-int btrfs_truncate_page(struct inode *inode, loff_t from, loff_t len,
+int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 			int front);
 int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 			       struct btrfs_root *root,
@@ -4089,6 +4122,7 @@ void btrfs_test_inode_set_ops(struct inode *inode);

 /* ioctl.c */
 long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
+int btrfs_ioctl_get_supported_features(void __user *arg);
 void btrfs_update_iflags(struct inode *inode);
 void btrfs_inherit_iflags(struct inode *inode, struct inode *dir);
 int btrfs_is_empty_uuid(u8 *uuid);
@@ -4151,7 +4185,8 @@ void btrfs_sysfs_remove_mounted(struct btrfs_fs_info *fs_info);
 ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size);

 /* super.c */
-int btrfs_parse_options(struct btrfs_root *root, char *options);
+int btrfs_parse_options(struct btrfs_root *root, char *options,
+			unsigned long new_flags);
 int btrfs_sync_fs(struct super_block *sb, int wait);

 #ifdef CONFIG_PRINTK
@@ -4525,8 +4560,8 @@ struct reada_control *btrfs_reada_add(struct btrfs_root *root,
 			      struct btrfs_key *start, struct btrfs_key *end);
 int btrfs_reada_wait(void *handle);
 void btrfs_reada_detach(void *handle);
-int btree_readahead_hook(struct btrfs_root *root, struct extent_buffer *eb,
-			 u64 start, int err);
+int btree_readahead_hook(struct btrfs_fs_info *fs_info,
+			 struct extent_buffer *eb, u64 start, int err);

 static inline int is_fstree(u64 rootid)
 {
@@ -43,8 +43,7 @@ int __init btrfs_delayed_inode_init(void)

 void btrfs_delayed_inode_exit(void)
 {
-	if (delayed_node_cache)
-		kmem_cache_destroy(delayed_node_cache);
+	kmem_cache_destroy(delayed_node_cache);
 }

 static inline void btrfs_init_delayed_node(
@@ -651,9 +650,14 @@ static int btrfs_delayed_inode_reserve_metadata(
 			goto out;

 		ret = btrfs_block_rsv_migrate(src_rsv, dst_rsv, num_bytes);
-		if (!WARN_ON(ret))
+		if (!ret)
 			goto out;

+		if (btrfs_test_opt(root, ENOSPC_DEBUG)) {
+			btrfs_debug(root->fs_info,
+				    "block rsv migrate returned %d", ret);
+			WARN_ON(1);
+		}
 		/*
 		 * Ok this is a problem, let's just steal from the global rsv
 		 * since this really shouldn't happen that often.
@@ -929,14 +929,10 @@ btrfs_find_delayed_ref_head(struct btrfs_trans_handle *trans, u64 bytenr)

 void btrfs_delayed_ref_exit(void)
 {
-	if (btrfs_delayed_ref_head_cachep)
-		kmem_cache_destroy(btrfs_delayed_ref_head_cachep);
-	if (btrfs_delayed_tree_ref_cachep)
-		kmem_cache_destroy(btrfs_delayed_tree_ref_cachep);
-	if (btrfs_delayed_data_ref_cachep)
-		kmem_cache_destroy(btrfs_delayed_data_ref_cachep);
-	if (btrfs_delayed_extent_op_cachep)
-		kmem_cache_destroy(btrfs_delayed_extent_op_cachep);
+	kmem_cache_destroy(btrfs_delayed_ref_head_cachep);
+	kmem_cache_destroy(btrfs_delayed_tree_ref_cachep);
+	kmem_cache_destroy(btrfs_delayed_data_ref_cachep);
+	kmem_cache_destroy(btrfs_delayed_extent_op_cachep);
 }

 int btrfs_delayed_ref_init(void)
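Note on the exit-path hunks above: several of them drop the `if (cache)` guard in front of `kmem_cache_destroy()`, presumably because the destroy routine itself tolerates a NULL argument, much as `free(NULL)` does. The following userspace sketch shows that pattern under this assumption. `struct cache_sketch`, `cache_destroy_sketch`, `module_exit_sketch` and `destroyed_count` are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a slab cache handle. */
struct cache_sketch {
	int live_objects;
};

/* Counts real destructions so the no-op case can be observed. */
static int destroyed_count;

/*
 * NULL-tolerant destroy: the check lives in one place, so callers can
 * pass a pointer that was never allocated without guarding the call.
 */
static void cache_destroy_sketch(struct cache_sketch *cache)
{
	if (!cache)
		return;
	free(cache);
	destroyed_count++;
}

/* Exit path in the style of the hunks above: no per-call NULL checks. */
static void module_exit_sketch(struct cache_sketch *a, struct cache_sketch *b)
{
	cache_destroy_sketch(a);
	cache_destroy_sketch(b);
}
```

Centralising the check keeps an exit path short and makes a partially initialised module safe to tear down.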
+73
-63
@@ -202,13 +202,13 @@ int btrfs_run_dev_replace(struct btrfs_trans_handle *trans,
 	struct btrfs_dev_replace_item *ptr;
 	struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;

-	btrfs_dev_replace_lock(dev_replace);
+	btrfs_dev_replace_lock(dev_replace, 0);
 	if (!dev_replace->is_valid ||
 	    !dev_replace->item_needs_writeback) {
-		btrfs_dev_replace_unlock(dev_replace);
+		btrfs_dev_replace_unlock(dev_replace, 0);
 		return 0;
 	}
-	btrfs_dev_replace_unlock(dev_replace);
+	btrfs_dev_replace_unlock(dev_replace, 0);

 	key.objectid = 0;
 	key.type = BTRFS_DEV_REPLACE_KEY;
@@ -264,7 +264,7 @@ int btrfs_run_dev_replace(struct btrfs_trans_handle *trans,
 	ptr = btrfs_item_ptr(eb, path->slots[0],
 			     struct btrfs_dev_replace_item);
 
-	btrfs_dev_replace_lock(dev_replace);
+	btrfs_dev_replace_lock(dev_replace, 1);
 	if (dev_replace->srcdev)
 		btrfs_set_dev_replace_src_devid(eb, ptr,
 			dev_replace->srcdev->devid);
@@ -287,7 +287,7 @@ int btrfs_run_dev_replace(struct btrfs_trans_handle *trans,
 	btrfs_set_dev_replace_cursor_right(eb, ptr,
 		dev_replace->cursor_right);
 	dev_replace->item_needs_writeback = 0;
-	btrfs_dev_replace_unlock(dev_replace);
+	btrfs_dev_replace_unlock(dev_replace, 1);
 
 	btrfs_mark_buffer_dirty(eb);
 
@@ -356,7 +356,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 		return PTR_ERR(trans);
 	}
 
-	btrfs_dev_replace_lock(dev_replace);
+	btrfs_dev_replace_lock(dev_replace, 1);
 	switch (dev_replace->replace_state) {
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED:
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED:
@@ -395,7 +395,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 	dev_replace->is_valid = 1;
 	dev_replace->item_needs_writeback = 1;
 	args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR;
-	btrfs_dev_replace_unlock(dev_replace);
+	btrfs_dev_replace_unlock(dev_replace, 1);
 
 	ret = btrfs_sysfs_add_device_link(tgt_device->fs_devices, tgt_device);
 	if (ret)
@@ -407,7 +407,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 	trans = btrfs_start_transaction(root, 0);
 	if (IS_ERR(trans)) {
 		ret = PTR_ERR(trans);
-		btrfs_dev_replace_lock(dev_replace);
+		btrfs_dev_replace_lock(dev_replace, 1);
 		goto leave;
 	}
 
@@ -433,7 +433,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 leave:
 	dev_replace->srcdev = NULL;
 	dev_replace->tgtdev = NULL;
-	btrfs_dev_replace_unlock(dev_replace);
+	btrfs_dev_replace_unlock(dev_replace, 1);
 	btrfs_destroy_dev_replace_tgtdev(fs_info, tgt_device);
 	return ret;
 }
@@ -471,18 +471,18 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	/* don't allow cancel or unmount to disturb the finishing procedure */
 	mutex_lock(&dev_replace->lock_finishing_cancel_unmount);
 
-	btrfs_dev_replace_lock(dev_replace);
+	btrfs_dev_replace_lock(dev_replace, 0);
 	/* was the operation canceled, or is it finished? */
 	if (dev_replace->replace_state !=
 	    BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED) {
-		btrfs_dev_replace_unlock(dev_replace);
+		btrfs_dev_replace_unlock(dev_replace, 0);
 		mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
 		return 0;
 	}
 
 	tgt_device = dev_replace->tgtdev;
 	src_device = dev_replace->srcdev;
-	btrfs_dev_replace_unlock(dev_replace);
+	btrfs_dev_replace_unlock(dev_replace, 0);
 
 	/*
 	 * flush all outstanding I/O and inode extent mappings before the
@@ -507,7 +507,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	/* keep away write_all_supers() during the finishing procedure */
 	mutex_lock(&root->fs_info->fs_devices->device_list_mutex);
 	mutex_lock(&root->fs_info->chunk_mutex);
-	btrfs_dev_replace_lock(dev_replace);
+	btrfs_dev_replace_lock(dev_replace, 1);
 	dev_replace->replace_state =
 		scrub_ret ? BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED
 			  : BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED;
@@ -528,7 +528,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 			      rcu_str_deref(src_device->name),
 			      src_device->devid,
 			      rcu_str_deref(tgt_device->name), scrub_ret);
-		btrfs_dev_replace_unlock(dev_replace);
+		btrfs_dev_replace_unlock(dev_replace, 1);
 		mutex_unlock(&root->fs_info->chunk_mutex);
 		mutex_unlock(&root->fs_info->fs_devices->device_list_mutex);
 		mutex_unlock(&uuid_mutex);
@@ -565,7 +565,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	list_add(&tgt_device->dev_alloc_list, &fs_info->fs_devices->alloc_list);
 	fs_info->fs_devices->rw_devices++;
 
-	btrfs_dev_replace_unlock(dev_replace);
+	btrfs_dev_replace_unlock(dev_replace, 1);
 
 	btrfs_rm_dev_replace_blocked(fs_info);
 
@@ -649,7 +649,7 @@ void btrfs_dev_replace_status(struct btrfs_fs_info *fs_info,
 	struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
 	struct btrfs_device *srcdev;
 
-	btrfs_dev_replace_lock(dev_replace);
+	btrfs_dev_replace_lock(dev_replace, 0);
 	/* even if !dev_replace_is_valid, the values are good enough for
 	 * the replace_status ioctl */
 	args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR;
@@ -675,7 +675,7 @@ void btrfs_dev_replace_status(struct btrfs_fs_info *fs_info,
 			div_u64(btrfs_device_get_total_bytes(srcdev), 1000));
 		break;
 	}
-	btrfs_dev_replace_unlock(dev_replace);
+	btrfs_dev_replace_unlock(dev_replace, 0);
 }
 
 int btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info,
@@ -698,13 +698,13 @@ static u64 __btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info)
 		return -EROFS;
 
 	mutex_lock(&dev_replace->lock_finishing_cancel_unmount);
-	btrfs_dev_replace_lock(dev_replace);
+	btrfs_dev_replace_lock(dev_replace, 1);
 	switch (dev_replace->replace_state) {
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED:
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED:
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED:
 		result = BTRFS_IOCTL_DEV_REPLACE_RESULT_NOT_STARTED;
-		btrfs_dev_replace_unlock(dev_replace);
+		btrfs_dev_replace_unlock(dev_replace, 1);
 		goto leave;
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED:
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED:
@@ -717,7 +717,7 @@ static u64 __btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info)
 	dev_replace->replace_state = BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED;
 	dev_replace->time_stopped = get_seconds();
 	dev_replace->item_needs_writeback = 1;
-	btrfs_dev_replace_unlock(dev_replace);
+	btrfs_dev_replace_unlock(dev_replace, 1);
 	btrfs_scrub_cancel(fs_info);
 
 	trans = btrfs_start_transaction(root, 0);
@@ -740,7 +740,7 @@ void btrfs_dev_replace_suspend_for_unmount(struct btrfs_fs_info *fs_info)
 	struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
 
 	mutex_lock(&dev_replace->lock_finishing_cancel_unmount);
-	btrfs_dev_replace_lock(dev_replace);
+	btrfs_dev_replace_lock(dev_replace, 1);
 	switch (dev_replace->replace_state) {
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED:
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED:
@@ -756,7 +756,7 @@ void btrfs_dev_replace_suspend_for_unmount(struct btrfs_fs_info *fs_info)
 		break;
 	}
 
-	btrfs_dev_replace_unlock(dev_replace);
+	btrfs_dev_replace_unlock(dev_replace, 1);
 	mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
 }
 
@@ -766,12 +766,12 @@ int btrfs_resume_dev_replace_async(struct btrfs_fs_info *fs_info)
 	struct task_struct *task;
 	struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
 
-	btrfs_dev_replace_lock(dev_replace);
+	btrfs_dev_replace_lock(dev_replace, 1);
 	switch (dev_replace->replace_state) {
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED:
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED:
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED:
-		btrfs_dev_replace_unlock(dev_replace);
+		btrfs_dev_replace_unlock(dev_replace, 1);
 		return 0;
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED:
 		break;
@@ -784,10 +784,10 @@ int btrfs_resume_dev_replace_async(struct btrfs_fs_info *fs_info)
 		btrfs_info(fs_info, "cannot continue dev_replace, tgtdev is missing");
 		btrfs_info(fs_info,
 			"you may cancel the operation after 'mount -o degraded'");
-		btrfs_dev_replace_unlock(dev_replace);
+		btrfs_dev_replace_unlock(dev_replace, 1);
 		return 0;
 	}
-	btrfs_dev_replace_unlock(dev_replace);
+	btrfs_dev_replace_unlock(dev_replace, 1);
 
 	WARN_ON(atomic_xchg(
 		&fs_info->mutually_exclusive_operation_running, 1));
@@ -802,7 +802,7 @@ static int btrfs_dev_replace_kthread(void *data)
 	struct btrfs_ioctl_dev_replace_args *status_args;
 	u64 progress;
 
-	status_args = kzalloc(sizeof(*status_args), GFP_NOFS);
+	status_args = kzalloc(sizeof(*status_args), GFP_KERNEL);
 	if (status_args) {
 		btrfs_dev_replace_status(fs_info, status_args);
 		progress = status_args->status.progress_1000;
@@ -858,57 +858,67 @@ int btrfs_dev_replace_is_ongoing(struct btrfs_dev_replace *dev_replace)
 		 * not called and the the filesystem is remounted
 		 * in degraded state. This does not stop the
 		 * dev_replace procedure. It needs to be canceled
-		 * manually if the cancelation is wanted.
+		 * manually if the cancellation is wanted.
 		 */
 		break;
 	}
 	return 1;
 }
 
-void btrfs_dev_replace_lock(struct btrfs_dev_replace *dev_replace)
+void btrfs_dev_replace_lock(struct btrfs_dev_replace *dev_replace, int rw)
 {
-	/* the beginning is just an optimization for the typical case */
-	if (atomic_read(&dev_replace->nesting_level) == 0) {
-acquire_lock:
-		/* this is not a nested case where the same thread
-		 * is trying to acqurire the same lock twice */
-		mutex_lock(&dev_replace->lock);
-		mutex_lock(&dev_replace->lock_management_lock);
-		dev_replace->lock_owner = current->pid;
-		atomic_inc(&dev_replace->nesting_level);
-		mutex_unlock(&dev_replace->lock_management_lock);
-		return;
+	if (rw == 1) {
+		/* write */
+again:
+		wait_event(dev_replace->read_lock_wq,
+			   atomic_read(&dev_replace->blocking_readers) == 0);
+		write_lock(&dev_replace->lock);
+		if (atomic_read(&dev_replace->blocking_readers)) {
+			write_unlock(&dev_replace->lock);
+			goto again;
+		}
+	} else {
+		read_lock(&dev_replace->lock);
+		atomic_inc(&dev_replace->read_locks);
 	}
-
-	mutex_lock(&dev_replace->lock_management_lock);
-	if (atomic_read(&dev_replace->nesting_level) > 0 &&
-	    dev_replace->lock_owner == current->pid) {
-		WARN_ON(!mutex_is_locked(&dev_replace->lock));
-		atomic_inc(&dev_replace->nesting_level);
-		mutex_unlock(&dev_replace->lock_management_lock);
-		return;
-	}
-
-	mutex_unlock(&dev_replace->lock_management_lock);
-	goto acquire_lock;
 }
 
-void btrfs_dev_replace_unlock(struct btrfs_dev_replace *dev_replace)
+void btrfs_dev_replace_unlock(struct btrfs_dev_replace *dev_replace, int rw)
 {
-	WARN_ON(!mutex_is_locked(&dev_replace->lock));
-	mutex_lock(&dev_replace->lock_management_lock);
-	WARN_ON(atomic_read(&dev_replace->nesting_level) < 1);
-	WARN_ON(dev_replace->lock_owner != current->pid);
-	atomic_dec(&dev_replace->nesting_level);
-	if (atomic_read(&dev_replace->nesting_level) == 0) {
-		dev_replace->lock_owner = 0;
-		mutex_unlock(&dev_replace->lock_management_lock);
-		mutex_unlock(&dev_replace->lock);
+	if (rw == 1) {
+		/* write */
+		ASSERT(atomic_read(&dev_replace->blocking_readers) == 0);
+		write_unlock(&dev_replace->lock);
 	} else {
-		mutex_unlock(&dev_replace->lock_management_lock);
+		ASSERT(atomic_read(&dev_replace->read_locks) > 0);
+		atomic_dec(&dev_replace->read_locks);
+		read_unlock(&dev_replace->lock);
 	}
 }
 
+/* inc blocking cnt and release read lock */
+void btrfs_dev_replace_set_lock_blocking(
+					struct btrfs_dev_replace *dev_replace)
+{
+	/* only set blocking for read lock */
+	ASSERT(atomic_read(&dev_replace->read_locks) > 0);
+	atomic_inc(&dev_replace->blocking_readers);
+	read_unlock(&dev_replace->lock);
+}
+
+/* acquire read lock and dec blocking cnt */
+void btrfs_dev_replace_clear_lock_blocking(
+					struct btrfs_dev_replace *dev_replace)
+{
+	/* only set blocking for read lock */
+	ASSERT(atomic_read(&dev_replace->read_locks) > 0);
+	ASSERT(atomic_read(&dev_replace->blocking_readers) > 0);
+	read_lock(&dev_replace->lock);
+	if (atomic_dec_and_test(&dev_replace->blocking_readers) &&
+	    waitqueue_active(&dev_replace->read_lock_wq))
+		wake_up(&dev_replace->read_lock_wq);
+}
+
 void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info)
 {
 	percpu_counter_inc(&fs_info->bio_counter);
@@ -34,8 +34,11 @@ int btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info,
 void btrfs_dev_replace_suspend_for_unmount(struct btrfs_fs_info *fs_info);
 int btrfs_resume_dev_replace_async(struct btrfs_fs_info *fs_info);
 int btrfs_dev_replace_is_ongoing(struct btrfs_dev_replace *dev_replace);
-void btrfs_dev_replace_lock(struct btrfs_dev_replace *dev_replace);
-void btrfs_dev_replace_unlock(struct btrfs_dev_replace *dev_replace);
+void btrfs_dev_replace_lock(struct btrfs_dev_replace *dev_replace, int rw);
+void btrfs_dev_replace_unlock(struct btrfs_dev_replace *dev_replace, int rw);
+void btrfs_dev_replace_set_lock_blocking(struct btrfs_dev_replace *dev_replace);
+void btrfs_dev_replace_clear_lock_blocking(
+					struct btrfs_dev_replace *dev_replace);
 
 static inline void btrfs_dev_replace_stats_inc(atomic64_t *stat_value)
 {
+41
-30
@@ -50,6 +50,7 @@
 #include "raid56.h"
 #include "sysfs.h"
 #include "qgroup.h"
+#include "compression.h"
 
 #ifdef CONFIG_X86
 #include <asm/cpufeature.h>
@@ -110,8 +111,7 @@ int __init btrfs_end_io_wq_init(void)
 
 void btrfs_end_io_wq_exit(void)
 {
-	if (btrfs_end_io_wq_cache)
-		kmem_cache_destroy(btrfs_end_io_wq_cache);
+	kmem_cache_destroy(btrfs_end_io_wq_cache);
 }
 
 /*
@@ -612,6 +612,7 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
 	int found_level;
 	struct extent_buffer *eb;
 	struct btrfs_root *root = BTRFS_I(page->mapping->host)->root;
+	struct btrfs_fs_info *fs_info = root->fs_info;
 	int ret = 0;
 	int reads_done;
 
@@ -637,21 +638,21 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
 
 	found_start = btrfs_header_bytenr(eb);
 	if (found_start != eb->start) {
-		btrfs_err_rl(eb->fs_info, "bad tree block start %llu %llu",
-			     found_start, eb->start);
+		btrfs_err_rl(fs_info, "bad tree block start %llu %llu",
+			     found_start, eb->start);
 		ret = -EIO;
 		goto err;
 	}
-	if (check_tree_block_fsid(root->fs_info, eb)) {
-		btrfs_err_rl(eb->fs_info, "bad fsid on block %llu",
-			     eb->start);
+	if (check_tree_block_fsid(fs_info, eb)) {
+		btrfs_err_rl(fs_info, "bad fsid on block %llu",
+			     eb->start);
 		ret = -EIO;
 		goto err;
 	}
 	found_level = btrfs_header_level(eb);
 	if (found_level >= BTRFS_MAX_LEVEL) {
-		btrfs_err(root->fs_info, "bad tree block level %d",
-			  (int)btrfs_header_level(eb));
+		btrfs_err(fs_info, "bad tree block level %d",
+			  (int)btrfs_header_level(eb));
 		ret = -EIO;
 		goto err;
 	}
@@ -659,7 +660,7 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
 	btrfs_set_buffer_lockdep_class(btrfs_header_owner(eb),
 				       eb, found_level);
 
-	ret = csum_tree_block(root->fs_info, eb, 1);
+	ret = csum_tree_block(fs_info, eb, 1);
 	if (ret) {
 		ret = -EIO;
 		goto err;
@@ -680,7 +681,7 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
 err:
 	if (reads_done &&
 	    test_and_clear_bit(EXTENT_BUFFER_READAHEAD, &eb->bflags))
-		btree_readahead_hook(root, eb, eb->start, ret);
+		btree_readahead_hook(fs_info, eb, eb->start, ret);
 
 	if (ret) {
 		/*
@@ -699,14 +700,13 @@ out:
 static int btree_io_failed_hook(struct page *page, int failed_mirror)
 {
 	struct extent_buffer *eb;
-	struct btrfs_root *root = BTRFS_I(page->mapping->host)->root;
 
 	eb = (struct extent_buffer *)page->private;
 	set_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags);
 	eb->read_mirror = failed_mirror;
 	atomic_dec(&eb->io_pages);
 	if (test_and_clear_bit(EXTENT_BUFFER_READAHEAD, &eb->bflags))
-		btree_readahead_hook(root, eb, eb->start, -EIO);
+		btree_readahead_hook(eb->fs_info, eb, eb->start, -EIO);
 	return -EIO;	/* we fixed nothing */
 }
 
@@ -816,7 +816,7 @@ static void run_one_async_done(struct btrfs_work *work)
 	    waitqueue_active(&fs_info->async_submit_wait))
 		wake_up(&fs_info->async_submit_wait);
 
-	/* If an error occured we just want to clean up the bio and move on */
+	/* If an error occurred we just want to clean up the bio and move on */
 	if (async->error) {
 		async->bio->bi_error = async->error;
 		bio_endio(async->bio);
@@ -1296,9 +1296,10 @@ static void __setup_root(u32 nodesize, u32 sectorsize, u32 stripesize,
 	spin_lock_init(&root->root_item_lock);
 }
 
-static struct btrfs_root *btrfs_alloc_root(struct btrfs_fs_info *fs_info)
+static struct btrfs_root *btrfs_alloc_root(struct btrfs_fs_info *fs_info,
+		gfp_t flags)
 {
-	struct btrfs_root *root = kzalloc(sizeof(*root), GFP_NOFS);
+	struct btrfs_root *root = kzalloc(sizeof(*root), flags);
 	if (root)
 		root->fs_info = fs_info;
 	return root;
@@ -1310,7 +1311,7 @@ struct btrfs_root *btrfs_alloc_dummy_root(void)
 {
 	struct btrfs_root *root;
 
-	root = btrfs_alloc_root(NULL);
+	root = btrfs_alloc_root(NULL, GFP_KERNEL);
 	if (!root)
 		return ERR_PTR(-ENOMEM);
 	__setup_root(4096, 4096, 4096, root, NULL, 1);
@@ -1332,7 +1333,7 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 	int ret = 0;
 	uuid_le uuid;
 
-	root = btrfs_alloc_root(fs_info);
+	root = btrfs_alloc_root(fs_info, GFP_KERNEL);
 	if (!root)
 		return ERR_PTR(-ENOMEM);
 
@@ -1408,7 +1409,7 @@ static struct btrfs_root *alloc_log_tree(struct btrfs_trans_handle *trans,
 	struct btrfs_root *tree_root = fs_info->tree_root;
 	struct extent_buffer *leaf;
 
-	root = btrfs_alloc_root(fs_info);
+	root = btrfs_alloc_root(fs_info, GFP_NOFS);
 	if (!root)
 		return ERR_PTR(-ENOMEM);
 
@@ -1506,7 +1507,7 @@ static struct btrfs_root *btrfs_read_tree_root(struct btrfs_root *tree_root,
 	if (!path)
 		return ERR_PTR(-ENOMEM);
 
-	root = btrfs_alloc_root(fs_info);
+	root = btrfs_alloc_root(fs_info, GFP_NOFS);
 	if (!root) {
 		ret = -ENOMEM;
 		goto alloc_fail;
@@ -2272,9 +2273,11 @@ static void btrfs_init_dev_replace_locks(struct btrfs_fs_info *fs_info)
 	fs_info->dev_replace.lock_owner = 0;
 	atomic_set(&fs_info->dev_replace.nesting_level, 0);
 	mutex_init(&fs_info->dev_replace.lock_finishing_cancel_unmount);
-	mutex_init(&fs_info->dev_replace.lock_management_lock);
-	mutex_init(&fs_info->dev_replace.lock);
+	rwlock_init(&fs_info->dev_replace.lock);
+	atomic_set(&fs_info->dev_replace.read_locks, 0);
+	atomic_set(&fs_info->dev_replace.blocking_readers, 0);
 	init_waitqueue_head(&fs_info->replace_wait);
+	init_waitqueue_head(&fs_info->dev_replace.read_lock_wq);
 }
 
 static void btrfs_init_qgroup(struct btrfs_fs_info *fs_info)
@@ -2385,7 +2388,7 @@ static int btrfs_replay_log(struct btrfs_fs_info *fs_info,
 		return -EIO;
 	}
 
-	log_tree_root = btrfs_alloc_root(fs_info);
+	log_tree_root = btrfs_alloc_root(fs_info, GFP_KERNEL);
 	if (!log_tree_root)
 		return -ENOMEM;
 
@@ -2510,8 +2513,8 @@ int open_ctree(struct super_block *sb,
 	int backup_index = 0;
 	int max_active;
 
-	tree_root = fs_info->tree_root = btrfs_alloc_root(fs_info);
-	chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info);
+	tree_root = fs_info->tree_root = btrfs_alloc_root(fs_info, GFP_KERNEL);
+	chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info, GFP_KERNEL);
 	if (!tree_root || !chunk_root) {
 		err = -ENOMEM;
 		goto fail;
@@ -2603,6 +2606,7 @@ int open_ctree(struct super_block *sb,
 	atomic_set(&fs_info->nr_async_bios, 0);
 	atomic_set(&fs_info->defrag_running, 0);
 	atomic_set(&fs_info->qgroup_op_seq, 0);
+	atomic_set(&fs_info->reada_works_cnt, 0);
 	atomic64_set(&fs_info->tree_mod_seq, 0);
 	fs_info->sb = sb;
 	fs_info->max_inline = BTRFS_DEFAULT_MAX_INLINE;
@@ -2622,7 +2626,7 @@ int open_ctree(struct super_block *sb,
 	INIT_LIST_HEAD(&fs_info->ordered_roots);
 	spin_lock_init(&fs_info->ordered_root_lock);
 	fs_info->delayed_root = kmalloc(sizeof(struct btrfs_delayed_root),
-					GFP_NOFS);
+					GFP_KERNEL);
 	if (!fs_info->delayed_root) {
 		err = -ENOMEM;
 		goto fail_iput;
@@ -2750,7 +2754,7 @@ int open_ctree(struct super_block *sb,
 	 */
 	fs_info->compress_type = BTRFS_COMPRESS_ZLIB;
 
-	ret = btrfs_parse_options(tree_root, options);
+	ret = btrfs_parse_options(tree_root, options, sb->s_flags);
 	if (ret) {
 		err = ret;
 		goto fail_alloc;
@@ -3029,8 +3033,9 @@ retry_root_backup:
 	if (ret)
 		goto fail_trans_kthread;
 
-	/* do not make disk changes in broken FS */
-	if (btrfs_super_log_root(disk_super) != 0) {
+	/* do not make disk changes in broken FS or nologreplay is given */
+	if (btrfs_super_log_root(disk_super) != 0 &&
+	    !btrfs_test_opt(tree_root, NOLOGREPLAY)) {
 		ret = btrfs_replay_log(fs_info, fs_devices);
 		if (ret) {
 			err = ret;
@@ -3146,6 +3151,12 @@ retry_root_backup:
 
 	fs_info->open = 1;
 
+	/*
+	 * backuproot only affect mount behavior, and if open_ctree succeeded,
+	 * no need to keep the flag
+	 */
+	btrfs_clear_opt(fs_info->mount_opt, USEBACKUPROOT);
+
 	return 0;
 
 fail_qgroup:
@@ -3200,7 +3211,7 @@ fail:
 	return err;
 
 recovery_tree_root:
-	if (!btrfs_test_opt(tree_root, RECOVERY))
+	if (!btrfs_test_opt(tree_root, USEBACKUPROOT))
 		goto fail_tree_roots;
 
 	free_root_pointers(fs_info, 0);
+23
-17
@@ -4838,7 +4838,7 @@ static inline int need_do_async_reclaim(struct btrfs_space_info *space_info,
 	u64 thresh = div_factor_fine(space_info->total_bytes, 98);
 
 	/* If we're just plain full then async reclaim just slows us down. */
-	if (space_info->bytes_used >= thresh)
+	if ((space_info->bytes_used + space_info->bytes_reserved) >= thresh)
 		return 0;
 
 	return (used >= thresh && !btrfs_fs_closing(fs_info) &&
@@ -5373,27 +5373,33 @@ static void update_global_block_rsv(struct btrfs_fs_info *fs_info)
 
 	block_rsv->size = min_t(u64, num_bytes, SZ_512M);
 
-	num_bytes = sinfo->bytes_used + sinfo->bytes_pinned +
-		    sinfo->bytes_reserved + sinfo->bytes_readonly +
-		    sinfo->bytes_may_use;
-
-	if (sinfo->total_bytes > num_bytes) {
-		num_bytes = sinfo->total_bytes - num_bytes;
-		block_rsv->reserved += num_bytes;
-		sinfo->bytes_may_use += num_bytes;
-		trace_btrfs_space_reservation(fs_info, "space_info",
-				      sinfo->flags, num_bytes, 1);
-	}
-
-	if (block_rsv->reserved >= block_rsv->size) {
+	if (block_rsv->reserved < block_rsv->size) {
+		num_bytes = sinfo->bytes_used + sinfo->bytes_pinned +
+			sinfo->bytes_reserved + sinfo->bytes_readonly +
+			sinfo->bytes_may_use;
+		if (sinfo->total_bytes > num_bytes) {
+			num_bytes = sinfo->total_bytes - num_bytes;
+			num_bytes = min(num_bytes,
+					block_rsv->size - block_rsv->reserved);
+			block_rsv->reserved += num_bytes;
+			sinfo->bytes_may_use += num_bytes;
+			trace_btrfs_space_reservation(fs_info, "space_info",
+						      sinfo->flags, num_bytes,
+						      1);
+		}
+	} else if (block_rsv->reserved > block_rsv->size) {
 		num_bytes = block_rsv->reserved - block_rsv->size;
 		sinfo->bytes_may_use -= num_bytes;
 		trace_btrfs_space_reservation(fs_info, "space_info",
 				      sinfo->flags, num_bytes, 0);
 		block_rsv->reserved = block_rsv->size;
-		block_rsv->full = 1;
 	}
 
+	if (block_rsv->reserved == block_rsv->size)
+		block_rsv->full = 1;
+	else
+		block_rsv->full = 0;
+
 	spin_unlock(&block_rsv->lock);
 	spin_unlock(&sinfo->lock);
 }
@@ -5752,7 +5758,7 @@ out_fail:
 
 		/*
 		 * This is tricky, but first we need to figure out how much we
-		 * free'd from any free-ers that occured during this
+		 * free'd from any free-ers that occurred during this
 		 * reservation, so we reset ->csum_bytes to the csum_bytes
 		 * before we dropped our lock, and then call the free for the
 		 * number of bytes that were freed while we were trying our
@@ -7018,7 +7024,7 @@ btrfs_lock_cluster(struct btrfs_block_group_cache *block_group,
 		   struct btrfs_free_cluster *cluster,
 		   int delalloc)
 {
-	struct btrfs_block_group_cache *used_bg;
+	struct btrfs_block_group_cache *used_bg = NULL;
 	bool locked = false;
 again:
 	spin_lock(&cluster->refill_lock);
+18
-22
@@ -206,10 +206,8 @@ void extent_io_exit(void)
 	 * destroy caches.
 	 */
 	rcu_barrier();
-	if (extent_state_cache)
-		kmem_cache_destroy(extent_state_cache);
-	if (extent_buffer_cache)
-		kmem_cache_destroy(extent_buffer_cache);
+	kmem_cache_destroy(extent_state_cache);
+	kmem_cache_destroy(extent_buffer_cache);
 	if (btrfs_bioset)
 		bioset_free(btrfs_bioset);
 }
@@ -232,7 +230,7 @@ static struct extent_state *alloc_extent_state(gfp_t mask)
 	if (!state)
 		return state;
 	state->state = 0;
-	state->private = 0;
+	state->failrec = NULL;
 	RB_CLEAR_NODE(&state->rb_node);
 	btrfs_leak_debug_add(&state->leak_list, &states);
 	atomic_set(&state->refs, 1);
@@ -1844,7 +1842,8 @@ out:
  * set the private field for a given byte offset in the tree. If there isn't
  * an extent_state there already, this does nothing.
  */
-static int set_state_private(struct extent_io_tree *tree, u64 start, u64 private)
+static noinline int set_state_failrec(struct extent_io_tree *tree, u64 start,
+		struct io_failure_record *failrec)
 {
 	struct rb_node *node;
 	struct extent_state *state;
@@ -1865,13 +1864,14 @@ static int set_state_private(struct extent_io_tree *tree, u64 start, u64 private
 		ret = -ENOENT;
 		goto out;
 	}
-	state->private = private;
+	state->failrec = failrec;
 out:
 	spin_unlock(&tree->lock);
 	return ret;
 }
 
-int get_state_private(struct extent_io_tree *tree, u64 start, u64 *private)
+static noinline int get_state_failrec(struct extent_io_tree *tree, u64 start,
+		struct io_failure_record **failrec)
 {
 	struct rb_node *node;
 	struct extent_state *state;
@@ -1892,7 +1892,7 @@ int get_state_private(struct extent_io_tree *tree, u64 start, u64 *private)
 		ret = -ENOENT;
 		goto out;
 	}
-	*private = state->private;
+	*failrec = state->failrec;
 out:
 	spin_unlock(&tree->lock);
 	return ret;
@@ -1972,7 +1972,7 @@ int free_io_failure(struct inode *inode, struct io_failure_record *rec)
 	int err = 0;
 	struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
 
-	set_state_private(failure_tree, rec->start, 0);
+	set_state_failrec(failure_tree, rec->start, NULL);
 	ret = clear_extent_bits(failure_tree, rec->start,
 				rec->start + rec->len - 1,
 				EXTENT_LOCKED | EXTENT_DIRTY, GFP_NOFS);
@@ -2089,7 +2089,6 @@ int clean_io_failure(struct inode *inode, u64 start, struct page *page,
 		     unsigned int pg_offset)
 {
 	u64 private;
-	u64 private_failure;
 	struct io_failure_record *failrec;
 	struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
 	struct extent_state *state;
@@ -2102,12 +2101,11 @@ int clean_io_failure(struct inode *inode, u64 start, struct page *page,
 	if (!ret)
 		return 0;
 
-	ret = get_state_private(&BTRFS_I(inode)->io_failure_tree, start,
-				&private_failure);
+	ret = get_state_failrec(&BTRFS_I(inode)->io_failure_tree, start,
+			&failrec);
 	if (ret)
 		return 0;
 
-	failrec = (struct io_failure_record *)(unsigned long) private_failure;
 	BUG_ON(!failrec->this_mirror);
 
 	if (failrec->in_validation) {
@@ -2167,7 +2165,7 @@ void btrfs_free_io_failure_record(struct inode *inode, u64 start, u64 end)
 
 		next = next_state(state);
 
-		failrec = (struct io_failure_record *)(unsigned long)state->private;
+		failrec = state->failrec;
 		free_extent_state(state);
 		kfree(failrec);
 
@@ -2177,10 +2175,9 @@ void btrfs_free_io_failure_record(struct inode *inode, u64 start, u64 end)
 }
 
 int btrfs_get_io_failure_record(struct inode *inode, u64 start, u64 end,
-				struct io_failure_record **failrec_ret)
+		struct io_failure_record **failrec_ret)
 {
 	struct io_failure_record *failrec;
-	u64 private;
 	struct extent_map *em;
 	struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
 	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
@@ -2188,7 +2185,7 @@ int btrfs_get_io_failure_record(struct inode *inode, u64 start, u64 end,
 	int ret;
 	u64 logical;
 
-	ret = get_state_private(failure_tree, start, &private);
+	ret = get_state_failrec(failure_tree, start, &failrec);
 	if (ret) {
 		failrec = kzalloc(sizeof(*failrec), GFP_NOFS);
 		if (!failrec)
@@ -2237,8 +2234,7 @@ int btrfs_get_io_failure_record(struct inode *inode, u64 start, u64 end,
 		ret = set_extent_bits(failure_tree, start, end,
 					EXTENT_LOCKED | EXTENT_DIRTY, GFP_NOFS);
 		if (ret >= 0)
-			ret = set_state_private(failure_tree, start,
-						(u64)(unsigned long)failrec);
+			ret = set_state_failrec(failure_tree, start, failrec);
 		/* set the bits in the inode's tree */
 		if (ret >= 0)
 			ret = set_extent_bits(tree, start, end, EXTENT_DAMAGED,
@@ -2248,7 +2244,6 @@ int btrfs_get_io_failure_record(struct inode *inode, u64 start, u64 end,
 			return ret;
 		}
 	} else {
-		failrec = (struct io_failure_record *)(unsigned long)private;
 		pr_debug("Get IO Failure Record: (found) logical=%llu, start=%llu, len=%llu, validation=%d\n",
 			 failrec->logical, failrec->start, failrec->len,
 			 failrec->in_validation);
@@ -3177,7 +3172,8 @@ static int __extent_read_full_page(struct extent_io_tree *tree,
 
 	while (1) {
 		lock_extent(tree, start, end);
-		ordered = btrfs_lookup_ordered_extent(inode, start);
+		ordered = btrfs_lookup_ordered_range(inode, start,
+						PAGE_CACHE_SIZE);
 		if (!ordered)
 			break;
 		unlock_extent(tree, start, end);
@@ -61,6 +61,7 @@
 struct extent_state;
 struct btrfs_root;
 struct btrfs_io_bio;
+struct io_failure_record;
 
 typedef int (extent_submit_bio_hook_t)(struct inode *inode, int rw,
 				       struct bio *bio, int mirror_num,
@@ -111,8 +112,7 @@ struct extent_state {
 	atomic_t refs;
 	unsigned state;
 
-	/* for use by the FS */
-	u64 private;
+	struct io_failure_record *failrec;
 
 #ifdef CONFIG_BTRFS_DEBUG
 	struct list_head leak_list;
@@ -342,7 +342,6 @@ int extent_readpages(struct extent_io_tree *tree,
 		     get_extent_t get_extent);
 int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		__u64 start, __u64 len, get_extent_t *get_extent);
-int get_state_private(struct extent_io_tree *tree, u64 start, u64 *private);
 void set_page_extent_mapped(struct page *page);
 
 struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,

@@ -4,6 +4,7 @@
 #include <linux/hardirq.h>
 #include "ctree.h"
 #include "extent_map.h"
+#include "compression.h"
 
 
 static struct kmem_cache *extent_map_cache;
@@ -20,8 +21,7 @@ int __init extent_map_init(void)
 
 void extent_map_exit(void)
 {
-	if (extent_map_cache)
-		kmem_cache_destroy(extent_map_cache);
+	kmem_cache_destroy(extent_map_cache);
 }
 
 /**
@@ -62,7 +62,7 @@ struct extent_map *alloc_extent_map(void)
 
 /**
  * free_extent_map - drop reference count of an extent_map
- * @em: extent map beeing releasead
+ * @em: extent map being releasead
  *
  * Drops the reference out on @em by one and free the structure
  * if the reference count hits zero.
@@ -422,7 +422,7 @@ struct extent_map *search_extent_mapping(struct extent_map_tree *tree,
 /**
  * remove_extent_mapping - removes an extent_map from the extent tree
  * @tree: extent tree to remove from
- * @em: extent map beeing removed
+ * @em: extent map being removed
  *
  * Removes @em from @tree. No reference counts are dropped, and no checks
  * are done to see if the range is in use
+70
-33
@@ -25,6 +25,7 @@
 #include "transaction.h"
 #include "volumes.h"
 #include "print-tree.h"
+#include "compression.h"
 
 #define __MAX_CSUM_ITEMS(r, size) ((unsigned long)(((BTRFS_LEAF_DATA_SIZE(r) - \
 				   sizeof(struct btrfs_item) * 2) / \
@@ -172,6 +173,7 @@ static int __btrfs_lookup_bio_sums(struct btrfs_root *root,
 	u64 item_start_offset = 0;
 	u64 item_last_offset = 0;
 	u64 disk_bytenr;
+	u64 page_bytes_left;
 	u32 diff;
 	int nblocks;
 	int bio_index = 0;
@@ -220,6 +222,8 @@ static int __btrfs_lookup_bio_sums(struct btrfs_root *root,
 	disk_bytenr = (u64)bio->bi_iter.bi_sector << 9;
 	if (dio)
 		offset = logical_offset;
+
+	page_bytes_left = bvec->bv_len;
 	while (bio_index < bio->bi_vcnt) {
 		if (!dio)
 			offset = page_offset(bvec->bv_page) + bvec->bv_offset;
@@ -243,7 +247,7 @@ static int __btrfs_lookup_bio_sums(struct btrfs_root *root,
 				if (BTRFS_I(inode)->root->root_key.objectid ==
 				    BTRFS_DATA_RELOC_TREE_OBJECTID) {
 					set_extent_bits(io_tree, offset,
-						offset + bvec->bv_len - 1,
+						offset + root->sectorsize - 1,
 						EXTENT_NODATASUM, GFP_NOFS);
 				} else {
 					btrfs_info(BTRFS_I(inode)->root->fs_info,
@@ -281,13 +285,29 @@ static int __btrfs_lookup_bio_sums(struct btrfs_root *root,
 found:
 		csum += count * csum_size;
 		nblocks -= count;
-		bio_index += count;
+
 		while (count--) {
-			disk_bytenr += bvec->bv_len;
-			offset += bvec->bv_len;
-			bvec++;
+			disk_bytenr += root->sectorsize;
+			offset += root->sectorsize;
+			page_bytes_left -= root->sectorsize;
+			if (!page_bytes_left) {
+				bio_index++;
+				/*
+				 * make sure we're still inside the
+				 * bio before we update page_bytes_left
+				 */
+				if (bio_index >= bio->bi_vcnt) {
+					WARN_ON_ONCE(count);
+					goto done;
+				}
+				bvec++;
+				page_bytes_left = bvec->bv_len;
+			}
+
 		}
 	}
+
+done:
 	btrfs_free_path(path);
 	return 0;
 }
@@ -432,6 +452,8 @@ int btrfs_csum_one_bio(struct btrfs_root *root, struct inode *inode,
 	struct bio_vec *bvec = bio->bi_io_vec;
 	int bio_index = 0;
 	int index;
+	int nr_sectors;
+	int i;
 	unsigned long total_bytes = 0;
 	unsigned long this_sum_bytes = 0;
 	u64 offset;
@@ -459,41 +481,56 @@ int btrfs_csum_one_bio(struct btrfs_root *root, struct inode *inode,
 		if (!contig)
 			offset = page_offset(bvec->bv_page) + bvec->bv_offset;
 
-		if (offset >= ordered->file_offset + ordered->len ||
-		    offset < ordered->file_offset) {
-			unsigned long bytes_left;
-			sums->len = this_sum_bytes;
-			this_sum_bytes = 0;
-			btrfs_add_ordered_sum(inode, ordered, sums);
-			btrfs_put_ordered_extent(ordered);
+		data = kmap_atomic(bvec->bv_page);
 
-			bytes_left = bio->bi_iter.bi_size - total_bytes;
+		nr_sectors = BTRFS_BYTES_TO_BLKS(root->fs_info,
+						bvec->bv_len + root->sectorsize
+						- 1);
 
-			sums = kzalloc(btrfs_ordered_sum_size(root, bytes_left),
-				       GFP_NOFS);
-			BUG_ON(!sums); /* -ENOMEM */
-			sums->len = bytes_left;
-			ordered = btrfs_lookup_ordered_extent(inode, offset);
-			BUG_ON(!ordered); /* Logic error */
-			sums->bytenr = ((u64)bio->bi_iter.bi_sector << 9) +
-				       total_bytes;
-			index = 0;
+		for (i = 0; i < nr_sectors; i++) {
+			if (offset >= ordered->file_offset + ordered->len ||
+				offset < ordered->file_offset) {
+				unsigned long bytes_left;
+
+				kunmap_atomic(data);
+				sums->len = this_sum_bytes;
+				this_sum_bytes = 0;
+				btrfs_add_ordered_sum(inode, ordered, sums);
+				btrfs_put_ordered_extent(ordered);
+
+				bytes_left = bio->bi_iter.bi_size - total_bytes;
+
+				sums = kzalloc(btrfs_ordered_sum_size(root, bytes_left),
+					GFP_NOFS);
+				BUG_ON(!sums); /* -ENOMEM */
+				sums->len = bytes_left;
+				ordered = btrfs_lookup_ordered_extent(inode,
+								offset);
+				ASSERT(ordered); /* Logic error */
+				sums->bytenr = ((u64)bio->bi_iter.bi_sector << 9)
+					+ total_bytes;
+				index = 0;
+
+				data = kmap_atomic(bvec->bv_page);
+			}
+
+			sums->sums[index] = ~(u32)0;
+			sums->sums[index]
+				= btrfs_csum_data(data + bvec->bv_offset
+						+ (i * root->sectorsize),
+						sums->sums[index],
+						root->sectorsize);
+			btrfs_csum_final(sums->sums[index],
+					(char *)(sums->sums + index));
+			index++;
+			offset += root->sectorsize;
+			this_sum_bytes += root->sectorsize;
+			total_bytes += root->sectorsize;
 		}
 
-		data = kmap_atomic(bvec->bv_page);
-		sums->sums[index] = ~(u32)0;
-		sums->sums[index] = btrfs_csum_data(data + bvec->bv_offset,
-						    sums->sums[index],
-						    bvec->bv_len);
 		kunmap_atomic(data);
-		btrfs_csum_final(sums->sums[index],
-				 (char *)(sums->sums + index));
 
 		bio_index++;
-		index++;
-		total_bytes += bvec->bv_len;
-		this_sum_bytes += bvec->bv_len;
-		offset += bvec->bv_len;
 		bvec++;
 	}
 	this_sum_bytes = 0;
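The `__btrfs_lookup_bio_sums` hunk above stops advancing one whole bio segment per checksum and instead steps one sector at a time, moving to the next segment only when the current one is used up. A minimal userspace sketch of that walk follows. It is an illustration, not the kernel code: `struct sketch_vec` and `sketch_walk_sectors` are hypothetical names, and every segment length is assumed to be a non-zero multiple of the sector size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct bio_vec: only the length matters. */
struct sketch_vec {
	uint32_t len;	/* bytes in this segment; assumed a non-zero multiple of sectorsize */
};

/*
 * Step `count` sectors through vecs[0..nvecs), starting at the beginning of
 * vecs[0]. Returns the number of sectors actually consumed; the walk stops
 * early once the last segment is exhausted, mirroring the "goto done" bail-out.
 */
static size_t sketch_walk_sectors(const struct sketch_vec *vecs, size_t nvecs,
				  size_t count, uint32_t sectorsize)
{
	size_t index = 0;
	size_t done = 0;
	uint32_t bytes_left;

	if (nvecs == 0)
		return 0;
	bytes_left = vecs[0].len;

	while (count--) {
		bytes_left -= sectorsize;
		done++;
		if (!bytes_left) {
			index++;
			/* stay inside the array before reading the next length */
			if (index >= nvecs)
				break;
			bytes_left = vecs[index].len;
		}
	}
	return done;
}
```

With segments of 8192 and 4096 bytes and a 4096-byte sector, a request for five sectors consumes only the three that exist. This is the situation the `bio_index >= bio->bi_vcnt` check in the hunk guards against.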
+90
-68
@@ -41,6 +41,7 @@
 #include "locking.h"
 #include "volumes.h"
 #include "qgroup.h"
+#include "compression.h"
 
 static struct kmem_cache *btrfs_inode_defrag_cachep;
 /*
@@ -498,7 +499,7 @@ int btrfs_dirty_pages(struct btrfs_root *root, struct inode *inode,
 	loff_t isize = i_size_read(inode);
 
 	start_pos = pos & ~((u64)root->sectorsize - 1);
-	num_bytes = ALIGN(write_bytes + pos - start_pos, root->sectorsize);
+	num_bytes = round_up(write_bytes + pos - start_pos, root->sectorsize);
 
 	end_of_last_block = start_pos + num_bytes - 1;
 	err = btrfs_set_extent_delalloc(inode, start_pos, end_of_last_block,
@@ -1379,16 +1380,19 @@ fail:
 static noinline int
 lock_and_cleanup_extent_if_need(struct inode *inode, struct page **pages,
 				size_t num_pages, loff_t pos,
+				size_t write_bytes,
 				u64 *lockstart, u64 *lockend,
 				struct extent_state **cached_state)
 {
+	struct btrfs_root *root = BTRFS_I(inode)->root;
 	u64 start_pos;
 	u64 last_pos;
 	int i;
 	int ret = 0;
 
-	start_pos = pos & ~((u64)PAGE_CACHE_SIZE - 1);
-	last_pos = start_pos + ((u64)num_pages << PAGE_CACHE_SHIFT) - 1;
+	start_pos = round_down(pos, root->sectorsize);
+	last_pos = start_pos
+		+ round_up(pos + write_bytes - start_pos, root->sectorsize) - 1;
 
 	if (start_pos < inode->i_size) {
 		struct btrfs_ordered_extent *ordered;
@@ -1503,6 +1507,7 @@ static noinline ssize_t __btrfs_buffered_write(struct file *file,
 
 	while (iov_iter_count(i) > 0) {
 		size_t offset = pos & (PAGE_CACHE_SIZE - 1);
+		size_t sector_offset;
 		size_t write_bytes = min(iov_iter_count(i),
 					 nrptrs * (size_t)PAGE_CACHE_SIZE -
 					 offset);
@@ -1511,6 +1516,8 @@ static noinline ssize_t __btrfs_buffered_write(struct file *file,
 		size_t reserve_bytes;
 		size_t dirty_pages;
 		size_t copied;
+		size_t dirty_sectors;
+		size_t num_sectors;
 
 		WARN_ON(num_pages > nrptrs);
 
@@ -1523,29 +1530,29 @@ static noinline ssize_t __btrfs_buffered_write(struct file *file,
 			break;
 		}
 
-		reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
+		sector_offset = pos & (root->sectorsize - 1);
+		reserve_bytes = round_up(write_bytes + sector_offset,
+				root->sectorsize);
 
-		if (BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
-					     BTRFS_INODE_PREALLOC)) {
-			ret = check_can_nocow(inode, pos, &write_bytes);
-			if (ret < 0)
-				break;
-			if (ret > 0) {
-				/*
-				 * For nodata cow case, no need to reserve
-				 * data space.
-				 */
-				only_release_metadata = true;
-				/*
-				 * our prealloc extent may be smaller than
-				 * write_bytes, so scale down.
-				 */
-				num_pages = DIV_ROUND_UP(write_bytes + offset,
-							 PAGE_CACHE_SIZE);
-				reserve_bytes = num_pages << PAGE_CACHE_SHIFT;
-				goto reserve_metadata;
-			}
+		if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
+					      BTRFS_INODE_PREALLOC)) &&
+		    check_can_nocow(inode, pos, &write_bytes) > 0) {
+			/*
+			 * For nodata cow case, no need to reserve
+			 * data space.
+			 */
+			only_release_metadata = true;
+			/*
+			 * our prealloc extent may be smaller than
+			 * write_bytes, so scale down.
+			 */
+			num_pages = DIV_ROUND_UP(write_bytes + offset,
+						 PAGE_CACHE_SIZE);
+			reserve_bytes = round_up(write_bytes + sector_offset,
+					root->sectorsize);
+			goto reserve_metadata;
 		}
+
 		ret = btrfs_check_data_free_space(inode, pos, write_bytes);
 		if (ret < 0)
 			break;
@@ -1576,8 +1583,8 @@ again:
 			break;
 
 		ret = lock_and_cleanup_extent_if_need(inode, pages, num_pages,
-						      pos, &lockstart, &lockend,
-						      &cached_state);
+						pos, write_bytes, &lockstart,
+						&lockend, &cached_state);
 		if (ret < 0) {
 			if (ret == -EAGAIN)
 				goto again;
@@ -1612,9 +1619,16 @@ again:
 		 * we still have an outstanding extent for the chunk we actually
 		 * managed to copy.
 		 */
-		if (num_pages > dirty_pages) {
-			release_bytes = (num_pages - dirty_pages) <<
-				PAGE_CACHE_SHIFT;
+		num_sectors = BTRFS_BYTES_TO_BLKS(root->fs_info,
+						reserve_bytes);
+		dirty_sectors = round_up(copied + sector_offset,
+					root->sectorsize);
+		dirty_sectors = BTRFS_BYTES_TO_BLKS(root->fs_info,
+						dirty_sectors);
+
+		if (num_sectors > dirty_sectors) {
+			release_bytes = (write_bytes - copied)
+				& ~((u64)root->sectorsize - 1);
 			if (copied > 0) {
 				spin_lock(&BTRFS_I(inode)->lock);
 				BTRFS_I(inode)->outstanding_extents++;
@@ -1633,7 +1647,8 @@ again:
 			}
 		}
 
-		release_bytes = dirty_pages << PAGE_CACHE_SHIFT;
+		release_bytes = round_up(copied + sector_offset,
+					root->sectorsize);
 
 		if (copied > 0)
 			ret = btrfs_dirty_pages(root, inode, pages,
@@ -1654,8 +1669,7 @@ again:
 
 		if (only_release_metadata && copied > 0) {
 			lockstart = round_down(pos, root->sectorsize);
-			lockend = lockstart +
-				(dirty_pages << PAGE_CACHE_SHIFT) - 1;
+			lockend = round_up(pos + copied, root->sectorsize) - 1;
 
 			set_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
 				       lockend, EXTENT_NORESERVE, NULL,
@@ -1761,6 +1775,8 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	ssize_t err;
 	loff_t pos;
 	size_t count;
+	loff_t oldsize;
+	int clean_page = 0;
 
 	inode_lock(inode);
 	err = generic_write_checks(iocb, from);
@@ -1799,14 +1815,17 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	pos = iocb->ki_pos;
 	count = iov_iter_count(from);
 	start_pos = round_down(pos, root->sectorsize);
-	if (start_pos > i_size_read(inode)) {
+	oldsize = i_size_read(inode);
+	if (start_pos > oldsize) {
 		/* Expand hole size to cover write data, preventing empty gap */
 		end_pos = round_up(pos + count, root->sectorsize);
-		err = btrfs_cont_expand(inode, i_size_read(inode), end_pos);
+		err = btrfs_cont_expand(inode, oldsize, end_pos);
 		if (err) {
 			inode_unlock(inode);
 			goto out;
 		}
+		if (start_pos > round_up(oldsize, root->sectorsize))
+			clean_page = 1;
 	}
 
 	if (sync)
@@ -1818,6 +1837,9 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 		num_written = __btrfs_buffered_write(file, from, pos);
 		if (num_written > 0)
 			iocb->ki_pos = pos + num_written;
+		if (clean_page)
+			pagecache_isize_extended(inode, oldsize,
+						i_size_read(inode));
 	}
 
 	inode_unlock(inode);
@@ -1825,7 +1847,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	/*
 	 * We also have to set last_sub_trans to the current log transid,
 	 * otherwise subsequent syncs to a file that's been synced in this
-	 * transaction will appear to have already occured.
+	 * transaction will appear to have already occurred.
 	 */
 	spin_lock(&BTRFS_I(inode)->lock);
 	BTRFS_I(inode)->last_sub_trans = root->log_transid;
@@ -1996,10 +2018,11 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	 */
 	smp_mb();
 	if (btrfs_inode_in_log(inode, root->fs_info->generation) ||
-	    (BTRFS_I(inode)->last_trans <=
-	     root->fs_info->last_trans_committed &&
-	     (full_sync ||
-	      !btrfs_have_ordered_extents_in_range(inode, start, len)))) {
+	    (full_sync && BTRFS_I(inode)->last_trans <=
+	     root->fs_info->last_trans_committed) ||
+	    (!btrfs_have_ordered_extents_in_range(inode, start, len) &&
+	     BTRFS_I(inode)->last_trans
+	     <= root->fs_info->last_trans_committed)) {
 		/*
 		 * We'v had everything committed since the last time we were
 		 * modified so clear this flag in case it was set for whatever
@@ -2293,10 +2316,10 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 	int ret = 0;
 	int err = 0;
 	unsigned int rsv_count;
-	bool same_page;
+	bool same_block;
 	bool no_holes = btrfs_fs_incompat(root->fs_info, NO_HOLES);
 	u64 ino_size;
-	bool truncated_page = false;
+	bool truncated_block = false;
 	bool updated_inode = false;
 
 	ret = btrfs_wait_ordered_range(inode, offset, len);
@@ -2304,7 +2327,7 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 		return ret;
 
 	inode_lock(inode);
-	ino_size = round_up(inode->i_size, PAGE_CACHE_SIZE);
+	ino_size = round_up(inode->i_size, root->sectorsize);
 	ret = find_first_non_hole(inode, &offset, &len);
 	if (ret < 0)
 		goto out_only_mutex;
@@ -2317,31 +2340,30 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 	lockstart = round_up(offset, BTRFS_I(inode)->root->sectorsize);
 	lockend = round_down(offset + len,
 			     BTRFS_I(inode)->root->sectorsize) - 1;
-	same_page = ((offset >> PAGE_CACHE_SHIFT) ==
-		    ((offset + len - 1) >> PAGE_CACHE_SHIFT));
-
+	same_block = (BTRFS_BYTES_TO_BLKS(root->fs_info, offset))
+		== (BTRFS_BYTES_TO_BLKS(root->fs_info, offset + len - 1));
 	/*
-	 * We needn't truncate any page which is beyond the end of the file
+	 * We needn't truncate any block which is beyond the end of the file
 	 * because we are sure there is no data there.
 	 */
 	/*
-	 * Only do this if we are in the same page and we aren't doing the
-	 * entire page.
+	 * Only do this if we are in the same block and we aren't doing the
+	 * entire block.
 	 */
-	if (same_page && len < PAGE_CACHE_SIZE) {
+	if (same_block && len < root->sectorsize) {
 		if (offset < ino_size) {
-			truncated_page = true;
-			ret = btrfs_truncate_page(inode, offset, len, 0);
+			truncated_block = true;
+			ret = btrfs_truncate_block(inode, offset, len, 0);
 		} else {
 			ret = 0;
 		}
 		goto out_only_mutex;
 	}
 
-	/* zero back part of the first page */
+	/* zero back part of the first block */
 	if (offset < ino_size) {
-		truncated_page = true;
-		ret = btrfs_truncate_page(inode, offset, 0, 0);
+		truncated_block = true;
+		ret = btrfs_truncate_block(inode, offset, 0, 0);
 		if (ret) {
 			inode_unlock(inode);
 			return ret;
@@ -2376,9 +2398,10 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 		if (!ret) {
 			/* zero the front end of the last page */
 			if (tail_start + tail_len < ino_size) {
-				truncated_page = true;
-				ret = btrfs_truncate_page(inode,
-						tail_start + tail_len, 0, 1);
+				truncated_block = true;
+				ret = btrfs_truncate_block(inode,
+							tail_start + tail_len,
+							0, 1);
 				if (ret)
 					goto out_only_mutex;
 			}
@@ -2544,7 +2567,7 @@ out_trans:
 		goto out_free;
 
 	inode_inc_iversion(inode);
-	inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+	inode->i_mtime = inode->i_ctime = current_fs_time(inode->i_sb);
 
 	trans->block_rsv = &root->fs_info->trans_block_rsv;
 	ret = btrfs_update_inode(trans, root, inode);
@@ -2558,7 +2581,7 @@ out:
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart, lockend,
 			     &cached_state, GFP_NOFS);
 out_only_mutex:
-	if (!updated_inode && truncated_page && !ret && !err) {
+	if (!updated_inode && truncated_block && !ret && !err) {
 		/*
 		 * If we only end up zeroing part of a page, we still need to
 		 * update the inode item, so that all the time fields are
@@ -2611,7 +2634,7 @@ static int add_falloc_range(struct list_head *head, u64 start, u64 len)
 		return 0;
 	}
 insert:
-	range = kmalloc(sizeof(*range), GFP_NOFS);
+	range = kmalloc(sizeof(*range), GFP_KERNEL);
 	if (!range)
 		return -ENOMEM;
 	range->start = start;
@@ -2678,10 +2701,10 @@ static long btrfs_fallocate(struct file *file, int mode,
 	} else if (offset + len > inode->i_size) {
 		/*
 		 * If we are fallocating from the end of the file onward we
-		 * need to zero out the end of the page if i_size lands in the
-		 * middle of a page.
+		 * need to zero out the end of the block if i_size lands in the
+		 * middle of a block.
 		 */
-		ret = btrfs_truncate_page(inode, inode->i_size, 0, 0);
+		ret = btrfs_truncate_block(inode, inode->i_size, 0, 0);
 		if (ret)
 			goto out;
 	}
@@ -2712,7 +2735,7 @@ static long btrfs_fallocate(struct file *file, int mode,
 			btrfs_put_ordered_extent(ordered);
 			unlock_extent_cached(&BTRFS_I(inode)->io_tree,
 					     alloc_start, locked_end,
-					     &cached_state, GFP_NOFS);
+					     &cached_state, GFP_KERNEL);
 			/*
 			 * we can't wait on the range with the transaction
 			 * running or with the extent lock held
@@ -2794,7 +2817,7 @@ static long btrfs_fallocate(struct file *file, int mode,
 		if (IS_ERR(trans)) {
 			ret = PTR_ERR(trans);
 		} else {
-			inode->i_ctime = CURRENT_TIME;
+			inode->i_ctime = current_fs_time(inode->i_sb);
 			i_size_write(inode, actual_end);
 			btrfs_ordered_update_i_size(inode, actual_end, NULL);
 			ret = btrfs_update_inode(trans, root, inode);
@@ -2806,7 +2829,7 @@ static long btrfs_fallocate(struct file *file, int mode,
 	}
 out_unlock:
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, alloc_start, locked_end,
-			     &cached_state, GFP_NOFS);
+			     &cached_state, GFP_KERNEL);
 out:
 	/*
 	 * As we waited the extent range, the data_rsv_map must be empty
@@ -2939,8 +2962,7 @@ const struct file_operations btrfs_file_operations = {
 
 void btrfs_auto_defrag_exit(void)
 {
-	if (btrfs_inode_defrag_cachep)
-		kmem_cache_destroy(btrfs_inode_defrag_cachep);
+	kmem_cache_destroy(btrfs_inode_defrag_cachep);
 }
 
 int btrfs_auto_defrag_init(void)

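Several hunks above replace page-granular arithmetic (`num_pages << PAGE_CACHE_SHIFT`) with sector-granular rounding (`round_up(write_bytes + sector_offset, root->sectorsize)`). The sketch below shows the difference in plain userspace C. The `sketch_*` helpers are hypothetical stand-ins, not kernel API. They assume power-of-two alignments, and the page-based variant assumes `num_pages` is derived with `DIV_ROUND_UP` as in the nocow branch shown in the diff.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's round_down()/round_up();
 * both assume `align` is a power of two. */
static inline uint64_t sketch_round_down(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

static inline uint64_t sketch_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* New scheme: reserve only the sectors touched by [pos, pos + write_bytes). */
static inline uint64_t sketch_reserve_by_sector(uint64_t pos, uint64_t write_bytes,
						uint64_t sectorsize)
{
	uint64_t sector_offset = pos & (sectorsize - 1);

	return sketch_round_up(write_bytes + sector_offset, sectorsize);
}

/* Old scheme: reserve every page touched by the write, in whole pages. */
static inline uint64_t sketch_reserve_by_page(uint64_t pos, uint64_t write_bytes,
					      uint64_t page_size)
{
	uint64_t offset = pos & (page_size - 1);
	uint64_t num_pages = (write_bytes + offset + page_size - 1) / page_size;

	return num_pages * page_size;
}
```

For a 100-byte write at offset 5000 with 4 KiB sectors and 64 KiB pages, the sector-based helper reserves 4096 bytes and the page-based one reserves 65536. When the page size equals the sector size the two give the same result.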
@@ -556,6 +556,9 @@ int btrfs_find_free_objectid(struct btrfs_root *root, u64 *objectid)
 	mutex_lock(&root->objectid_mutex);
 
 	if (unlikely(root->highest_objectid >= BTRFS_LAST_FREE_OBJECTID)) {
+		btrfs_warn(root->fs_info,
+			   "the objectid of root %llu reaches its highest value",
+			   root->root_key.objectid);
 		ret = -ENOSPC;
 		goto out;
 	}
+226
-100
File diff suppressed because it is too large
+18
-17
@@ -59,6 +59,8 @@
 #include "props.h"
 #include "sysfs.h"
 #include "qgroup.h"
+#include "tree-log.h"
+#include "compression.h"
 
 #ifdef CONFIG_64BIT
 /* If we have a 32-bit userspace and 64-bit kernel, then the UAPI
@@ -347,7 +349,7 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
 
 	btrfs_update_iflags(inode);
 	inode_inc_iversion(inode);
-	inode->i_ctime = CURRENT_TIME;
+	inode->i_ctime = current_fs_time(inode->i_sb);
 	ret = btrfs_update_inode(trans, root, inode);
 
 	btrfs_end_transaction(trans, root);
@@ -443,7 +445,7 @@ static noinline int create_subvol(struct inode *dir,
 	struct btrfs_root *root = BTRFS_I(dir)->root;
 	struct btrfs_root *new_root;
 	struct btrfs_block_rsv block_rsv;
-	struct timespec cur_time = CURRENT_TIME;
+	struct timespec cur_time = current_fs_time(dir->i_sb);
 	struct inode *inode;
 	int ret;
 	int err;
@@ -844,10 +846,6 @@ static noinline int btrfs_mksubvol(struct path *parent,
 	if (IS_ERR(dentry))
 		goto out_unlock;
 
-	error = -EEXIST;
-	if (d_really_is_positive(dentry))
-		goto out_dput;
-
 	error = btrfs_may_create(dir, dentry);
 	if (error)
 		goto out_dput;
@@ -2097,8 +2095,6 @@ static noinline int search_ioctl(struct inode *inode,
 		key.offset = (u64)-1;
 		root = btrfs_read_fs_root_no_name(info, &key);
 		if (IS_ERR(root)) {
-			btrfs_err(info, "could not find root %llu",
-				  sk->tree_id);
 			btrfs_free_path(path);
 			return -ENOENT;
 		}
@@ -2476,6 +2472,8 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 	trans->block_rsv = &block_rsv;
 	trans->bytes_reserved = block_rsv.size;
 
+	btrfs_record_snapshot_destroy(trans, dir);
+
 	ret = btrfs_unlink_subvol(trans, root, dir,
 				dest->root_key.objectid,
 				dentry->d_name.name,
@@ -2960,8 +2958,8 @@ static int btrfs_cmp_data_prepare(struct inode *src, u64 loff,
 	 * of the array is bounded by len, which is in turn bounded by
 	 * BTRFS_MAX_DEDUPE_LEN.
 	 */
-	src_pgarr = kzalloc(num_pages * sizeof(struct page *), GFP_NOFS);
-	dst_pgarr = kzalloc(num_pages * sizeof(struct page *), GFP_NOFS);
+	src_pgarr = kcalloc(num_pages, sizeof(struct page *), GFP_KERNEL);
+	dst_pgarr = kcalloc(num_pages, sizeof(struct page *), GFP_KERNEL);
 	if (!src_pgarr || !dst_pgarr) {
 		kfree(src_pgarr);
 		kfree(dst_pgarr);
@@ -3066,6 +3064,9 @@ static int btrfs_extent_same(struct inode *src, u64 loff, u64 olen,
 		inode_lock(src);
 
 		ret = extent_same_check_offsets(src, loff, &len, olen);
+		if (ret)
+			goto out_unlock;
+		ret = extent_same_check_offsets(src, dst_loff, &len, olen);
 		if (ret)
 			goto out_unlock;
 
@@ -3217,7 +3218,7 @@ static int clone_finish_inode_update(struct btrfs_trans_handle *trans,
 
 	inode_inc_iversion(inode);
 	if (!no_time_update)
-		inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+		inode->i_mtime = inode->i_ctime = current_fs_time(inode->i_sb);
 	/*
 	 * We round up to the block size at eof when determining which
 	 * extents to clone above, but shouldn't round up the file size.
@@ -3889,8 +3890,9 @@ static noinline int btrfs_clone_files(struct file *file, struct file *file_src,
 	 * Truncate page cache pages so that future reads will see the cloned
 	 * data immediately and not the previous data.
 	 */
-	truncate_inode_pages_range(&inode->i_data, destoff,
-				   PAGE_CACHE_ALIGN(destoff + len) - 1);
+	truncate_inode_pages_range(&inode->i_data,
+				round_down(destoff, PAGE_CACHE_SIZE),
+				round_up(destoff + len, PAGE_CACHE_SIZE) - 1);
 out_unlock:
 	if (!same_inode)
 		btrfs_double_inode_unlock(src, inode);
@@ -5031,7 +5033,7 @@ static long _btrfs_ioctl_set_received_subvol(struct file *file,
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_root_item *root_item = &root->root_item;
 	struct btrfs_trans_handle *trans;
-	struct timespec ct = CURRENT_TIME;
+	struct timespec ct = current_fs_time(inode->i_sb);
 	int ret = 0;
 	int received_uuid_changed;
 
@@ -5262,8 +5264,7 @@ out_unlock:
 	.compat_ro_flags = BTRFS_FEATURE_COMPAT_RO_##suffix, \
 	.incompat_flags = BTRFS_FEATURE_INCOMPAT_##suffix }
 
-static int btrfs_ioctl_get_supported_features(struct file *file,
-					      void __user *arg)
+int btrfs_ioctl_get_supported_features(void __user *arg)
 {
 	static const struct btrfs_ioctl_feature_flags features[3] = {
 		INIT_FEATURE_FLAGS(SUPP),
@@ -5542,7 +5543,7 @@ long btrfs_ioctl(struct file *file, unsigned int
 	case BTRFS_IOC_SET_FSLABEL:
 		return btrfs_ioctl_set_fslabel(file, argp);
 	case BTRFS_IOC_GET_SUPPORTED_FEATURES:
-		return btrfs_ioctl_get_supported_features(file, argp);
+		return btrfs_ioctl_get_supported_features(argp);
 	case BTRFS_IOC_GET_FEATURES:
 		return btrfs_ioctl_get_features(file, argp);
 	case BTRFS_IOC_SET_FEATURES:

Some files were not shown because too many files have changed in this diff.