Commit Graph

659 Commits

Author SHA1 Message Date
Chris Mason 7c2fe32a23 Btrfs: Fix add_extent_mapping to check for duplicates across the whole range
add_extent_mapping was allowing the insertion of overlapping extents.
This never used to happen because it only inserted the extents from disk
and those were never overlapping.

But, with the data=ordered code, the disk and memory representations of the
file are not the same.  add_extent_mapping needs to ensure a new extent
does not overlap before it inserts.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
David Woodhouse 902b22f341 Btrfs: Remove broken optimisations in end_bio functions.
These ended up freeing objects while they were still using them. Under
guidance from Chris, just rip out the 'clever' bits and do things the
simple way.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 53863232ef Btrfs: Lower contention on the csum mutex
This takes the csum mutex deeper in the call chain and releases it
more often.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 4854ddd0ed Btrfs: Wait for kernel threads to make progress during async submission
Before this change, btrfs would use a bdi congestion function to make
sure there weren't too many pending async checksum work items.

This change makes the process creating async work items wait instead,
leading to fewer congestion returns from the bdi.  This improves
pdflush background_writeout scanning.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 5443be45f5 Btrfs: Give all the worker threads descriptive names
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 777e6bd706 Btrfs: Transaction commit: don't use filemap_fdatawait
After writing out all the remaining btree blocks in the transaction,
the commit code would use filemap_fdatawait to make sure it was all
on disk.  This means it would wait for blocks written by other procs
as well.

The new code walks the list of blocks for this transaction again
and waits only for those required by this transaction.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 0986fe9eac Btrfs: Count async bios separately from async checksum work items
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason b720d20952 Btrfs: Limit the number of async bio submission kthreads to the number of devices
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason db69e0ebae Btrfs: Init address_space->writeback_index properly
The writeback_index field is used by write_cache_pages to pick up where
writeback on a given inode left off.  But, it is never set to a sane
value, so writeback can often start at a random offset in the file.

Kernels 2.6.28 and higher will have this fixed, but for everyone else,
we also fill in the value in btrfs.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
David Woodhouse 2db04966ae Btrfs: Change TestSetPageLocked() to trylock_page()
Add backwards compatibility in compat.h

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
 compat.h    |    3 +++
 extent_io.c |    3 ++-
 2 files changed, 5 insertions(+), 1 deletions(-)

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Eric Sandeen 5036f53868 Btrfs: fix RHEL test for ClearPageFsMisc
Newer RHEL5 kernels define both ClearPageFSMisc and
ClearPageChecked, so test for both before redefining.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 5707e3b6f7 Btrfs: Update version.sh to v0.16
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 4ca8b41e3f Btrfs: Avoid calling into the FS for the final iput on fake root inodes
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Yan Zheng 7ea394f119 Btrfs: Fix nodatacow for the new data=ordered mode
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 00e4e6b33a Get rid of BTRFS_I(inode)->index and use local vars instead
rename and link don't always have a lock on the source inode, and
our use of a per-inode index variable was racy.  This changes things to
store the index in a local variable instead.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 7d2b4daa67 Btrfs: Fix the multi-bio code to save the original bio for completion
The multi-bio code is responsible for duplicating blocks in raid1 and
single spindle duplication.  It has counters to make sure all of
the locations for a given extent are properly written before io completion
is returned to the higher layers.

But, it didn't always complete the same bio it was given, sometimes a
clone was completed instead.  This lead to problems with the async
work queues because they saved a pointer to the bio in a struct off
bi_private.

The fix is to remember the original bio and only complete that one.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Yan Zheng ae01a0abf3 Btrfs: Update clone file ioctl
This patch updates the file clone ioctl for the tree locking and new
data ordered code.

---

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Yan Zheng b48652c101 Btrfs: Various small fixes.
This trivial patch contains two locking fixes and a off by one fix.

---

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 3de9d6b649 btrfs_lookup_bio_sums seems broken, go back to the readpage_io_hook for now
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason ea8c281947 Btrfs: Maintain a list of inodes that are delalloc and a way to wait on them
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason d7a029a89e Btrfs: Don't corrupt ram in shrink_extent_tree, leak it instead
Far from the perfect fix, but these structs are small.  TODO for the
next release.  The block group cache structs are referenced in many
different places, and it isn't safe to just free them while resizing.

A real fix will be a larger change to the allocator so that it doesn't
have to carry about the block group cache structs to find good places
to search for free blocks.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Sage Weil 9ca9ee09c1 Btrfs: fix ioctl-initiated transactions vs wait_current_trans()
Commit 597:466b27332893 (btrfs_start_transaction: wait for commits in
progress) breaks the transaction start/stop ioctls by making
btrfs_start_transaction conditionally wait for the next transaction to
start.  If an application artificially is holding a transaction open,
things deadlock.

This workaround maintains a count of open ioctl-initiated transactions in
fs_info, and avoids wait_current_trans() if any are currently open (in
start_transaction() and btrfs_throttle()).  The start transaction ioctl
uses a new btrfs_start_ioctl_transaction() that _does_ call
wait_current_trans(), effectively pushing the join/wait decision to the
outer ioctl-initiated transaction.

This more or less neuters btrfs_throttle() when ioctl-initiated
transactions are in use, but that seems like a pretty fundamental
consequence of wrapping lots of write()'s in a transaction.  Btrfs has no
way to tell if the application considers a given operation as part of it's
transaction.

Obviously, if the transaction start/stop ioctls aren't being used, there
is no effect on current behavior.

Signed-off-by: Sage Weil <sage@newdream.net>
---
 ctree.h       |    1 +
 ioctl.c       |   12 +++++++++++-
 transaction.c |   18 +++++++++++++-----
 transaction.h |    2 ++
 4 files changed, 27 insertions(+), 6 deletions(-)

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 3117a77370 Btrfs: Add support for HW assisted crc32c
Intel doesn't yet ship hardware to the public with this enabled, but when they
do, they will be ready.  Original code from:

Austin Zhang <austin_zhang@linux.intel.com>

It is currently disabled, but edit crc32c.h to turn it on.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 6dab815743 Btrfs: Hold csum mutex while reading in sums during readpages
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00
Chris Mason 2dd3e67b1e Btrfs: More throttle tuning
* Make walk_down_tree wake up throttled tasks more often
* Make walk_down_tree call cond_resched during long loops
* As the size of the ref cache grows, wait longer in throttle
* Get rid of the reada code in walk_down_tree, the leaves don't get
  read anymore, thanks to the ref cache.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25 11:04:06 -04:00