Commit Graph

35847 Commits

Author SHA1 Message Date
Alex Thorlton ab0e113f6b exec: kill the unnecessary mm->def_flags setting in load_elf_binary()
load_elf_binary() sets current->mm->def_flags = def_flags and def_flags
is always zero.  Not only this looks strange, this is unnecessary
because mm_init() has already set ->def_flags = 0.

Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:35:52 -07:00
Fabian Frederick 87c1b497c2 ntfs: logging clean-up
- Convert spinlock/static array to va_format (inspired by Joe Perches
  help on previous logging patches).

- Convert printk(KERN_ERR to pr_warn in __ntfs_warning.

- Convert printk(KERN_ERR to pr_err in __ntfs_error.

- Convert printk(KERN_DEBUG to pr_debug in __ntfs_debug.  (Note that
  __ntfs_debug is still guarded by #if DEBUG)

- Improve !DEBUG to parse all arguments (Joe Perches).

- Sparse pr_foo() conversions in super.c

NTFS, NTFS-fs prefixes as well as 'warning' and 'error' were removed :
pr_foo() automatically adds module name and error level is already
specified.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:35:49 -07:00
Linus Torvalds 240cd6a817 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph updates from Sage Weil:
 "The biggest chunk is a series of patches from Ilya that add support
  for new Ceph osd and crush map features, including some new tunables,
  primary affinity, and the new encoding that is needed for erasure
  coding support.  This brings things into parity with the server side
  and the looming firefly release.  There is also support for allocation
  hints in RBD that help limit fragmentation on the server side.

  There is also a series of patches from Zheng fixing NFS reexport,
  directory fragmentation support, flock vs fnctl behavior, and some
  issues with clustered MDS.

  Finally, there are some miscellaneous fixes from Yunchuan Wen for
  fscache, Fabian Frederick for ACLs, and from me for fsync(dirfd)
  behavior"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (79 commits)
  ceph: skip invalid dentry during dcache readdir
  libceph: dump pool {read,write}_tier to debugfs
  libceph: output primary affinity values on osdmap updates
  ceph: flush cap release queue when trimming session caps
  ceph: don't grabs open file reference for aborted request
  ceph: drop extra open file reference in ceph_atomic_open()
  ceph: preallocate buffer for readdir reply
  libceph: enable PRIMARY_AFFINITY feature bit
  libceph: redo ceph_calc_pg_primary() in terms of ceph_calc_pg_acting()
  libceph: add support for osd primary affinity
  libceph: add support for primary_temp mappings
  libceph: return primary from ceph_calc_pg_acting()
  libceph: switch ceph_calc_pg_acting() to new helpers
  libceph: introduce apply_temps() helper
  libceph: introduce pg_to_raw_osds() and raw_to_up_osds() helpers
  libceph: ceph_can_shift_osds(pool) and pool type defines
  libceph: ceph_osd_{exists,is_up,is_down}(osd) definitions
  libceph: enable OSDMAP_ENC feature bit
  libceph: primary_affinity decode bits
  libceph: primary_affinity infrastructure
  ...
2014-04-07 11:09:13 -07:00
Linus Torvalds 3021112598 Merge tag 'for-f2fs-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
 "This patch-set includes the following major enhancement patches.
   - introduce large directory support
   - introduce f2fs_issue_flush to merge redundant flush commands
   - merge write IOs as much as possible aligned to the segment
   - add sysfs entries to tune the f2fs configuration
   - use radix_tree for the free_nid_list to reduce in-memory operations
   - remove costly bit operations in f2fs_find_entry
   - enhance the readahead flow for CP/NAT/SIT/SSA blocks

  The other bug fixes are as follows:
   - recover xattr node blocks correctly after sudden-power-cut
   - fix to calculate the maximum number of node ids
   - enhance to handle many error cases

  And, there are a bunch of cleanups"

* tag 'for-f2fs-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (62 commits)
  f2fs: fix wrong statistics of inline data
  f2fs: check the acl's validity before setting
  f2fs: introduce f2fs_issue_flush to avoid redundant flush issue
  f2fs: fix to cover io->bio with io_rwsem
  f2fs: fix error path when fail to read inline data
  f2fs: use list_for_each_entry{_safe} for simplyfying code
  f2fs: avoid free slab cache under spinlock
  f2fs: avoid unneeded lookup when xattr name length is too long
  f2fs: avoid unnecessary bio submit when wait page writeback
  f2fs: return -EIO when node id is not matched
  f2fs: avoid RECLAIM_FS-ON-W warning
  f2fs: skip unnecessary node writes during fsync
  f2fs: introduce fi->i_sem to protect fi's info
  f2fs: change reclaim rate in percentage
  f2fs: add missing documentation for dir_level
  f2fs: remove unnecessary threshold
  f2fs: throttle the memory footprint with a sysfs entry
  f2fs: avoid to drop nat entries due to the negative nr_shrink
  f2fs: call f2fs_wait_on_page_writeback instead of native function
  f2fs: introduce nr_pages_to_write for segment alignment
  ...
2014-04-07 10:55:36 -07:00
David Sterba 36523e9512 btrfs: export global block reserve size as space_info
Introduce a block group type bit for a global reserve and fill the space
info for SPACE_INFO ioctl. This should replace the newly added ioctl
(01e219e806) to get just the 'size' part
of the global reserve, while the actual usage can be now visible in the
'btrfs fi df' output during ENOSPC stress.

The unpatched userspace tools will show the blockgroup as 'unknown'.

CC: Jeff Mahoney <jeffm@suse.com>
CC: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 10:41:53 -07:00
Sergei Trofimovich 800ee2247f btrfs: fix crash in remount(thread_pool=) case
Reproducer:
    mount /dev/ubda /mnt
    mount -oremount,thread_pool=42 /mnt

Gives a crash:
    ? btrfs_workqueue_set_max+0x0/0x70
    btrfs_resize_thread_pool+0xe3/0xf0
    ? sync_filesystem+0x0/0xc0
    ? btrfs_resize_thread_pool+0x0/0xf0
    btrfs_remount+0x1d2/0x570
    ? kern_path+0x0/0x80
    do_remount_sb+0xd9/0x1c0
    do_mount+0x26a/0xbf0
    ? kfree+0x0/0x1b0
    SyS_mount+0xc4/0x110

It's a call
    btrfs_workqueue_set_max(fs_info->scrub_wr_completion_workers, new_pool_size);
with
    fs_info->scrub_wr_completion_workers = NULL;

as scrub wqs get created only on user's demand.

Patch skips not-created-yet workqueues.

Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
CC: Qu Wenruo <quwenruo@cn.fujitsu.com>
CC: Chris Mason <clm@fb.com>
CC: Josef Bacik <jbacik@fb.com>
CC: linux-btrfs@vger.kernel.org
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 10:41:52 -07:00
Linus Torvalds c29aa153ef Merge tag 'for-linus-20140405' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
 - A few SPI NOR ID definitions
 - Kill the NAND "max pagesize" restriction
 - Fix some x16 bus-width NAND support
 - Add NAND JEDEC parameter page support
 - DT bindings for NAND ECC
 - GPMI NAND updates (subpage reads)
 - More OMAP NAND refactoring
 - New STMicro SPI NOR driver (now in 40 patches!)
 - A few other random bugfixes

* tag 'for-linus-20140405' of git://git.infradead.org/linux-mtd: (120 commits)
  Fix index regression in nand_read_subpage
  mtd: diskonchip: mem resource name is not optional
  mtd: nand: fix mention to CONFIG_MTD_NAND_ECC_BCH
  mtd: nand: fix GET/SET_FEATURES address on 16-bit devices
  mtd: omap2: Use devm_ioremap_resource()
  mtd: denali_dt: Use devm_ioremap_resource()
  mtd: devices: elm: update DRIVER_NAME as "omap-elm"
  mtd: devices: elm: configure parallel channels based on ecc_steps
  mtd: devices: elm: clean elm_load_syndrome
  mtd: devices: elm: check for hardware engine's design constraints
  mtd: st_spi_fsm: Succinctly reorganise .remove()
  mtd: st_spi_fsm: Allow loop to run at least once before giving up CPU
  mtd: st_spi_fsm: Correct vendor name spelling issue - missing "M"
  mtd: st_spi_fsm: Avoid duplicating MTD core code
  mtd: st_spi_fsm: Remove useless consts from function arguments
  mtd: st_spi_fsm: Convert ST SPI FSM (NOR) Flash driver to new DT partitions
  mtd: st_spi_fsm: Move runtime configurable msg sequences into device's struct
  mtd: st_spi_fsm: Supply the W25Qxxx chip specific configuration call-back
  mtd: st_spi_fsm: Supply the S25FLxxx chip specific configuration call-back
  mtd: st_spi_fsm: Supply the MX25xxx chip specific configuration call-back
  ...
2014-04-07 10:17:30 -07:00
Josef Bacik c4a050bbbb Btrfs: abort the transaction when we don't find our extent ref
I'm not sure why we weren't aborting here in the first place, it is obviously a
bad time from the fact that we print the leaf and yell loudly about it.  Fix
this up, otherwise we panic because our path could be pointing into oblivion.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:51 -07:00
Chris Mason 3a29bc0928 Btrfs: fix EINVAL checks in btrfs_clone
btrfs_drop_extents can now return -EINVAL, but only one caller
in btrfs_clone was checking for it.  This adds it to the
caller for inline extents, which is where we really need it.

Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:50 -07:00
Wang Shilong a1ecaabbf9 Btrfs: fix unlock in __start_delalloc_inodes()
This patch fix a regression caused by the following patch:
Btrfs: don't flush all delalloc inodes when we doesn't get s_umount lock

break while loop will make us call @spin_unlock() without
calling @spin_lock() before, fix it.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:50 -07:00
Wang Shilong 3b080b2564 Btrfs: scrub raid56 stripes in the right way
Steps to reproduce:
 # mkfs.btrfs -f /dev/sda[8-11] -m raid5 -d raid5
 # mount /dev/sda8 /mnt
 # btrfs scrub start -BR /mnt
 # echo $? <--unverified errors make return value be 3

This is because we don't setup right mapping between physical
and logical address for raid56, which makes checksum mismatch.
But we will find everthing is fine later when rechecking using
btrfs_map_block().

This patch fixed the problem by settuping right mappings and
we only verify data stripes' checksums.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:49 -07:00
Wang Shilong 68bb462d42 Btrfs: don't compress for a small write
To compress a small file range(<=blocksize) that is not
an inline extent can not save disk space at all. skip it can
save us some cpu time.

This patch can also fix wrong setting nocompression flag for
inode, say a case when @total_in is 4096, and then we get
@total_compressed 52,because we do aligment to page cache size
firstly, and then we get into conclusion @total_in=@total_compressed
thus we will clear this inode's compression flag.

An exception comes from inserting inline extent failure but we
still have @total_compressed < @total_in,so we will still reset
inode's flag, this is ok, because we don't have good compression
effect.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:48 -07:00
Filipe Manana c50d3e71c3 Btrfs: more efficient io tree navigation on wait_extent_bit
If we don't reschedule use rb_next to find the next extent state
instead of a full tree search, which is more efficient and safe
since we didn't release the io tree's lock.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:47 -07:00
Filipe Manana c715e155c9 Btrfs: send, build path string only once in send_hole
There's no point building the path string in each iteration of the
send_hole loop, as it produces always the same string.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:46 -07:00
Gui Hecheng 9a40f1222a btrfs: filter invalid arg for btrfs resize
Originally following cmds will work:
	# btrfs fi resize -10A  <mnt>
	# btrfs fi resize -10Gaha <mnt>
Filter the arg by checking the return pointer of memparse.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:45 -07:00
Filipe Manana 766b5e5ae7 Btrfs: send, fix data corruption due to incorrect hole detection
During an incremental send, when we finish processing an inode (corresponding to
a regular file) we would assume the gap between the end of the last processed file
extent and the file's size corresponded to a file hole, and therefore incorrectly
send a bunch of zero bytes to overwrite that region in the file.

This affects only kernel 3.14.

Reproducer:

    mkfs.btrfs -f /dev/sdc
    mount /dev/sdc /mnt

    xfs_io -f -c "falloc -k 0 268435456" /mnt/foo

    btrfs subvolume snapshot -r /mnt /mnt/mysnap0

    xfs_io -c "pwrite -S 0x01 -b 9216 16190218 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x02 -b 1121 198720104 1121" /mnt/foo
    xfs_io -c "pwrite -S 0x05 -b 9216 107887439 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x06 -b 9216 225520207 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x07 -b 67584 102138300 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x08 -b 7000 94897484 7000" /mnt/foo
    xfs_io -c "pwrite -S 0x09 -b 113664 245083212 113664" /mnt/foo
    xfs_io -c "pwrite -S 0x10 -b 123 17937788 123" /mnt/foo
    xfs_io -c "pwrite -S 0x11 -b 39936 229573311 39936" /mnt/foo
    xfs_io -c "pwrite -S 0x12 -b 67584 174792222 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x13 -b 9216 249253213 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x16 -b 67584 150046083 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x17 -b 39936 118246040 39936" /mnt/foo
    xfs_io -c "pwrite -S 0x18 -b 67584 215965442 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x19 -b 33792 97096725 33792" /mnt/foo
    xfs_io -c "pwrite -S 0x20 -b 125952 166300596 125952" /mnt/foo
    xfs_io -c "pwrite -S 0x21 -b 123 1078957 123" /mnt/foo
    xfs_io -c "pwrite -S 0x25 -b 9216 212044492 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x26 -b 7000 265037146 7000" /mnt/foo
    xfs_io -c "pwrite -S 0x27 -b 42757 215922685 42757" /mnt/foo
    xfs_io -c "pwrite -S 0x28 -b 7000 69865411 7000" /mnt/foo
    xfs_io -c "pwrite -S 0x29 -b 67584 67948958 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x30 -b 39936 266967019 39936" /mnt/foo
    xfs_io -c "pwrite -S 0x31 -b 1121 19582453 1121" /mnt/foo
    xfs_io -c "pwrite -S 0x32 -b 17408 257710255 17408" /mnt/foo
    xfs_io -c "pwrite -S 0x33 -b 39936 3895518 39936" /mnt/foo
    xfs_io -c "pwrite -S 0x34 -b 125952 12045847 125952" /mnt/foo
    xfs_io -c "pwrite -S 0x35 -b 17408 19156379 17408" /mnt/foo
    xfs_io -c "pwrite -S 0x36 -b 39936 50160066 39936" /mnt/foo
    xfs_io -c "pwrite -S 0x37 -b 113664 9549793 113664" /mnt/foo
    xfs_io -c "pwrite -S 0x38 -b 105472 94391506 105472" /mnt/foo
    xfs_io -c "pwrite -S 0x39 -b 23552 143632863 23552" /mnt/foo
    xfs_io -c "pwrite -S 0x40 -b 39936 241283845 39936" /mnt/foo
    xfs_io -c "pwrite -S 0x41 -b 113664 199937606 113664" /mnt/foo
    xfs_io -c "pwrite -S 0x42 -b 67584 67380093 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x43 -b 67584 26793129 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x44 -b 39936 14421913 39936" /mnt/foo
    xfs_io -c "pwrite -S 0x45 -b 123 253097405 123" /mnt/foo
    xfs_io -c "pwrite -S 0x46 -b 1121 128233424 1121" /mnt/foo
    xfs_io -c "pwrite -S 0x47 -b 105472 91577959 105472" /mnt/foo
    xfs_io -c "pwrite -S 0x48 -b 1121 7245381 1121" /mnt/foo
    xfs_io -c "pwrite -S 0x49 -b 113664 182414694 113664" /mnt/foo
    xfs_io -c "pwrite -S 0x50 -b 9216 32750608 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x51 -b 67584 266546049 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x52 -b 67584 87969398 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x53 -b 9216 260848797 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x54 -b 39936 119461243 39936" /mnt/foo
    xfs_io -c "pwrite -S 0x55 -b 7000 200178693 7000" /mnt/foo
    xfs_io -c "pwrite -S 0x56 -b 9216 243316029 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x57 -b 7000 209658229 7000" /mnt/foo
    xfs_io -c "pwrite -S 0x58 -b 101376 179745192 101376" /mnt/foo
    xfs_io -c "pwrite -S 0x59 -b 9216 64012300 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x60 -b 125952 181705139 125952" /mnt/foo
    xfs_io -c "pwrite -S 0x61 -b 23552 235737348 23552" /mnt/foo
    xfs_io -c "pwrite -S 0x62 -b 113664 106021355 113664" /mnt/foo
    xfs_io -c "pwrite -S 0x63 -b 67584 135753552 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x64 -b 23552 95730888 23552" /mnt/foo
    xfs_io -c "pwrite -S 0x65 -b 11 17311415 11" /mnt/foo
    xfs_io -c "pwrite -S 0x66 -b 33792 120695553 33792" /mnt/foo
    xfs_io -c "pwrite -S 0x67 -b 9216 17164631 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x68 -b 9216 136065853 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x69 -b 67584 37752198 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x70 -b 101376 189717473 101376" /mnt/foo
    xfs_io -c "pwrite -S 0x71 -b 7000 227463698 7000" /mnt/foo
    xfs_io -c "pwrite -S 0x72 -b 9216 12655137 9216" /mnt/foo
    xfs_io -c "pwrite -S 0x73 -b 7000 7488866 7000" /mnt/foo
    xfs_io -c "pwrite -S 0x74 -b 113664 87813649 113664" /mnt/foo
    xfs_io -c "pwrite -S 0x75 -b 33792 25802183 33792" /mnt/foo
    xfs_io -c "pwrite -S 0x76 -b 39936 93524024 39936" /mnt/foo
    xfs_io -c "pwrite -S 0x77 -b 33792 113336388 33792" /mnt/foo
    xfs_io -c "pwrite -S 0x78 -b 105472 184955320 105472" /mnt/foo
    xfs_io -c "pwrite -S 0x79 -b 101376 225691598 101376" /mnt/foo
    xfs_io -c "pwrite -S 0x80 -b 23552 77023155 23552" /mnt/foo
    xfs_io -c "pwrite -S 0x81 -b 11 201888192 11" /mnt/foo
    xfs_io -c "pwrite -S 0x82 -b 11 115332492 11" /mnt/foo
    xfs_io -c "pwrite -S 0x83 -b 67584 230278015 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x84 -b 11 120589073 11" /mnt/foo
    xfs_io -c "pwrite -S 0x85 -b 125952 202207819 125952" /mnt/foo
    xfs_io -c "pwrite -S 0x86 -b 113664 86672080 113664" /mnt/foo
    xfs_io -c "pwrite -S 0x87 -b 17408 208459603 17408" /mnt/foo
    xfs_io -c "pwrite -S 0x88 -b 7000 73372211 7000" /mnt/foo
    xfs_io -c "pwrite -S 0x89 -b 7000 42252122 7000" /mnt/foo
    xfs_io -c "pwrite -S 0x90 -b 23552 46784881 23552" /mnt/foo
    xfs_io -c "pwrite -S 0x91 -b 101376 63172351 101376" /mnt/foo
    xfs_io -c "pwrite -S 0x92 -b 23552 59341931 23552" /mnt/foo
    xfs_io -c "pwrite -S 0x93 -b 39936 239599283 39936" /mnt/foo
    xfs_io -c "pwrite -S 0x94 -b 67584 175643105 67584" /mnt/foo
    xfs_io -c "pwrite -S 0x97 -b 23552 105534880 23552" /mnt/foo
    xfs_io -c "pwrite -S 0x98 -b 113664 8236844 113664" /mnt/foo
    xfs_io -c "pwrite -S 0x99 -b 125952 144489686 125952" /mnt/foo
    xfs_io -c "pwrite -S 0xa0 -b 7000 73273112 7000" /mnt/foo
    xfs_io -c "pwrite -S 0xa1 -b 125952 194580243 125952" /mnt/foo
    xfs_io -c "pwrite -S 0xa2 -b 123 56296779 123" /mnt/foo
    xfs_io -c "pwrite -S 0xa3 -b 11 233066845 11" /mnt/foo
    xfs_io -c "pwrite -S 0xa4 -b 39936 197727090 39936" /mnt/foo
    xfs_io -c "pwrite -S 0xa5 -b 101376 53579812 101376" /mnt/foo
    xfs_io -c "pwrite -S 0xa6 -b 9216 85669738 9216" /mnt/foo
    xfs_io -c "pwrite -S 0xa7 -b 125952 21266322 125952" /mnt/foo
    xfs_io -c "pwrite -S 0xa8 -b 23552 125726568 23552" /mnt/foo
    xfs_io -c "pwrite -S 0xa9 -b 9216 18423680 9216" /mnt/foo
    xfs_io -c "pwrite -S 0xb0 -b 1121 165901483 1121" /mnt/foo

    btrfs subvolume snapshot -r /mnt /mnt/mysnap1

    xfs_io -c "pwrite -S 0xff -b 10 16190218 10" /mnt/foo

    btrfs subvolume snapshot -r /mnt /mnt/mysnap2

    md5sum /mnt/foo          # returns 79e53f1466bfc09fd82b450689e6119e
    md5sum /mnt/mysnap2/foo  # returns 79e53f1466bfc09fd82b450689e6119e too

    btrfs send /mnt/mysnap1 -f /tmp/1.snap
    btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/2.snap

    mkfs.btrfs -f /dev/sdc
    mount /dev/sdc /mnt

    btrfs receive /mnt -f /tmp/1.snap
    btrfs receive /mnt -f /tmp/2.snap

    md5sum /mnt/mysnap2/foo  # returns 2bb414c5155767cedccd7063e51beabd !!

A testcase for xfstests follows soon too.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:45 -07:00
Dan Carpenter 84dbeb87d1 Btrfs: kmalloc() doesn't return an ERR_PTR
The error handling was copy and pasted from memdup_user().  It should be
checking for NULL obviously.

Fixes: abccd00f8a ('btrfs: Fix 32/64-bit problem with BTRFS_SET_RECEIVED_SUBVOL ioctl')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:44 -07:00
Wang Shilong e9894fd3e3 Btrfs: fix snapshot vs nocow writting
While running fsstress and snapshots concurrently, we will hit something
like followings:

Thread 1			Thread 2

|->fallocate
  |->write pages
    |->join transaction
       |->add ordered extent
    |->end transaction
				|->flushing data
				  |->creating pending snapshots
|->write data into src root's
   fallocated space

After above work flows finished, we will get a state that source and
snapshot root share same space, but source root have written data into
fallocated space, this will make fsck fail to verify checksums for
snapshot root's preallocating file extent data.Nocow writting also
has this same problem.

Fix this problem by syncing snapshots with nocow writting:

 1.for nocow writting,if there are pending snapshots, we will
 fall into COW way.

 2.if there are pending nocow writes, snapshots for this root
 will be blocked until nocow writting finish.

Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:43 -07:00
Qu Wenruo 3ac0d7b96a btrfs: Change the expanding write sequence to fix snapshot related bug.
When testing fsstress with snapshot making background, some snapshot
following problem.

Snapshot 270:
inode 323: size 0

Snapshot 271:
inode 323: size 349145
|-------Hole---|---------Empty gap-------|-------Hole-----|
0	    122880			172032	      349145

Snapshot 272:
inode 323: size 349145
|-------Hole---|------------Data---------|-------Hole-----|
0	    122880			172032	      349145

The fsstress operation on inode 323 is the following:
write: 		offset 	126832 	len 43124
truncate: 	size 	349145

Since the write with offset is consist of 2 operations:
1. punch hole
2. write data
Hole punching is faster than data write, so hole punching in write
and truncate is done first and then buffered write, so the snapshot 271 got
empty gap, which will not pass btrfsck.

To fix the bug, this patch will change the write sequence which will
first punch a hole covering the write end if a hole is needed.

Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:42 -07:00
David Sterba 60999ca4b4 btrfs: make device scan less noisy
Print the message only when the device is seen for the first time.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:41 -07:00
Jeff Mahoney ed55b6ac07 btrfs: fix lockdep warning with reclaim lock inversion
When encountering memory pressure, testers have run into the following
lockdep warning. It was caused by __link_block_group calling kobject_add
with the groups_sem held. kobject_add calls kvasprintf with GFP_KERNEL,
which gets us into reclaim context. The kobject doesn't actually need
to be added under the lock -- it just needs to ensure that it's only
added for the first block group to be linked.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.14.0-rc8-default #1 Not tainted
---------------------------------------------------------
kswapd0/169 just changed the state of lock:
 (&delayed_node->mutex){+.+.-.}, at: [<ffffffffa018baea>] __btrfs_release_delayed_node+0x3a/0x200 [btrfs]
but this lock took another, RECLAIM_FS-unsafe lock in the past:
 (&found->groups_sem){+++++.}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
 Possible interrupt unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&found->groups_sem);
                               local_irq_disable();
                               lock(&delayed_node->mutex);
                               lock(&found->groups_sem);
  <Interrupt>
    lock(&delayed_node->mutex);

 *** DEADLOCK ***
2 locks held by kswapd0/169:
 #0:  (shrinker_rwsem){++++..}, at: [<ffffffff81159e8a>] shrink_slab+0x3a/0x160
 #1:  (&type->s_umount_key#27){++++..}, at: [<ffffffff811bac6f>] grab_super_passive+0x3f/0x90

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:40 -07:00
Josef Bacik 3f8a18cc53 Btrfs: hold the commit_root_sem when getting the commit root during send
We currently rely too heavily on roots being read-only to save us from just
accessing root->commit_root.  We can easily balance blocks out from underneath a
read only root, so to save us from getting screwed make sure we only access
root->commit_root under the commit root sem.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07 09:08:39 -07:00
Chao Yu 48b230a583 f2fs: fix wrong statistics of inline data
If we remove a file that has inline data after mount, our statistics turns to
inaccurate.

cat /sys/kernel/debug/f2fs/status
  - Inline_data Inode: 4294967295

Let's add stat_inc_inline_inode() to stat inline info of the file when lookup.

Change log from v1:
 o stat in f2fs_lookup() instead of in do_read_inode() for excluding wrong stat.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-07 12:40:58 +09:00
ZhangZhen 3a8861e271 f2fs: check the acl's validity before setting
Before setting the acl, call posix_acl_valid() to check if it is
valid or not.

Signed-off-by: zhangzhen <zhenzhang.zhang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-07 12:18:30 +09:00
Jaegeuk Kim 6b4afdd794 f2fs: introduce f2fs_issue_flush to avoid redundant flush issue
Some storage devices show relatively high latencies to complete cache_flush
commands, even though their normal IO speed is prettry much high. In such
the case, it needs to merge cache_flush commands as much as possible to avoid
issuing them redundantly.
So, this patch introduces a mount option, "-o flush_merge", to mitigate such
the overhead.

If this option is enabled by user, F2FS merges the cache_flush commands and then
issues just one cache_flush on behalf of them. Once the single command is
finished, F2FS sends a completion signal to all the pending threads.

Note that, this option can be used under a workload consisting of very intensive
concurrent fsync calls, while the storage handles cache_flush commands slowly.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-07 09:50:58 +09:00