Commit Graph

3899 Commits

Author SHA1 Message Date
Filipe Manana 8afebc0223 btrfs: stress send with deduplication and balance running in parallel
Stress send running in parallel with balance and deduplication against
files that belong to the snapshots used by send. The goal is to verify
that these operations running in parallel do not lead to send crashing
(trigger assertion failures and BUG_ONs), or send finding an inconsistent
snapshot that leads to a failure (reported in dmesg/syslog). The test
needs big trees (snapshots) with large differences between the parent and
send snapshots in order to hit such issues with a good probability.

This currently fails on btrfs, hitting a BUG_ON() often, and with btrfs
error messages in dmesg/syslog. The problem has always existed and it is
not new, but probably unnoticed due to lack of test cases that exercise
these btrfs features running in parallel.

The following patches for btrfs fix the problems:

 "Btrfs: fix race between send and deduplication that lead to failures and
  crashes"

 "Btrfs: prevent send failures and crashes due to concurrent relocation"

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-25 15:30:58 +08:00
Darrick J. Wong 872050db3c check: filter lockdep bugs when scanning dmesg
Ignore lockdep complaining about its own bugginess when scanning dmesg
output, because we shouldn't be failing filesystem tests on account of
lockdep.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-21 23:37:19 +08:00
Darrick J. Wong d0e484ac69 check: wipe scratch devices between tests
Wipe the scratch devices in between each test to ensure that tests are
formatting them and not making assumptions about previous contents.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-21 23:37:19 +08:00
Darrick J. Wong 5bb196119d check: remove require_{test,scratch}* after a test fails
Remove the require_{test,scratch]* sentinel files after a test fails.
This eliminates false fsck corruption reports such as the following:

1. Test A calls _require_scratch, which creates the sentinel file
$RESULT_DIR/require_scratch to facilitate fsck after the test completes.

2. Test A runs some test, which corrupts the scratch filesystem due to
kernel bug or something.

3. Test A calls _fail because of the errors in (2).  Note that the test
case returned 1, so ./check unmounts the test and scratch filesystems
without checking them or removing $RESULT_DIR/require_scratch

4. Test B starts up, but does not call _require_scratch.  The
$RESULT_DIR/require_scratch file is still there.

5. Test B completes successfully.

6. ./check calls _check_filesystems, which sees the
$RESULT_DIR/require_scratch file and runs fsck.

7. fsck reports the corrupt scratch device (which is associated with
test B) even though B did not ever touch the scratch device and it was
actually test A that corrupted the filesystem.

Note that with the "check: wipe scratch devices between tests" patch
applied, we can also reproduce this problem by running xfs/172 and
xfs/195 with a scratch device small enough that the files created in 172
span multiple AGs and therefore cause 172 to fail.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-21 23:37:19 +08:00
Filipe Manana 18562c4228 btrfs: test send on subvolume with delalloc after setting it to RO mode
Test that if we have a subvolume/snapshot that is writable, has a file
with unflushed delalloc (buffered writes not yet flushed), turn the
subvolume to readonly mode and then use it for send a operation, the send
stream will contain the delalloc data - that is, no data loss happens.

This currently files on btrfs (data loss) but is fixed by a patch for
the linux kernel titled:

  "Btrfs: send, flush dellaloc in order to avoid data loss"

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-20 15:34:43 +08:00
Filipe Manana c7d96baf9c fssum: add support for checking xattrs
Currently fssum, mostly used for btrfs test cases that test the btrfs send
feature, ignores completely the existence of xattrs. This change teaches
fssum to find xattrs and make them contribute to the checksum of a
filesystem, so that we can catch filesystem bugs regarding missing, corrupt
or not supposed to exist xattrs (i.e. that an incremental btrfs send does
not forget to create, update or remove xattrs).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-20 15:34:43 +08:00
Filipe Manana 378722046a fsstress: add operation for listing xattrs from files and directories
The previous patches added support for operations to set, get and delete
xattrs on regular files and directories, this patch just adds an operation
to list the xattrs of a file/directory.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-20 15:34:36 +08:00
Filipe Manana 0babbf1604 fsstress: add operation for deleting xattrs from files and directories
The previous patches added support for operations to set and get xattrs on
regular files and directories, this patch just adds one operation to delete
xattrs on files and directories.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-20 15:34:29 +08:00
Filipe Manana e9ffa8daec fsstress: add operation for reading xattrs from files and directories
The previous patch added support for an operation to set xattrs on regular
files and directories, this patch just adds one operation to read (get)
them.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-20 15:34:22 +08:00
Filipe Manana fc4b0a55b9 fsstress: add operation for setting xattrs on files and directories
Currently fsstress does not exercise creating, reading or deleting xattrs
on files or directories. This change adds support for setting xattrs on
files and directories, using only the xattr user namespace (the other
namespaces are not general purpose and are used for security, capabilities,
ACLs, etc). This adds a counter for each file entry structure that keeps
track of the number of xattrs set for the file entry, and each new xattr
has a name that includes the counter's value (example: "user.x4").
Values for the xattrs have at most 100 bytes, which is more than the
maximum size supported for all major filesystems.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-20 15:34:15 +08:00
Filipe Manana e1a9b1db89 fsstress: allow afsync on directories too
Currently the afsync function can only be performed against regular files.
Allow it to operate on directories too, to increase test coverage and
allow for chances of finding bugs in a filesystem's implementation of
fsync against directories.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-19 15:25:09 +08:00
Filipe Manana 2c492c4d16 fsstress: allow fsync on directories too
Currently the fsync function can only be performed against regular files.
Allow it to operate on directories too, to increase test coverage and
allow for chances of finding bugs in a filesystem's implementation of
fsync against directories.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-19 15:25:01 +08:00
Darrick J. Wong 830349865e clonerange: test remapping the rainbow
Add some more clone range tests that missed various "wacky" combinations
of file state.  Specifically, we test reflinking into and out of rainbow
ranges (a mix of real, unwritten, hole, delalloc, and shared extents),
and also we test that we can correctly handle double-inode locking no
matter what order of inodes or the filesystem's locking rules.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-14 19:21:52 +08:00
Murphy Zhou 13818987d5 generic/504: fix hard coded fd number
Bash supports file discriptor assignment in this way. So remove the hard
coded numbers. Also close this opened fd in cleanup.

Signed-off-by: Murphy Zhou <xzhou@redhat.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-14 19:05:36 +08:00
Anand Jain 4529b20e1a btrfs/048: amend property validation cases
Add more property validation cases which are fixed by the patches [1]
 [1]
  btrfs: fix vanished compression property after failed set
  btrfs: fix zstd compression parameter

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-14 10:48:15 +08:00
Brian Foster 2516e6cd4e fsx: test copy_file_range() using non-zero length copy
The copy_file_range() test detection code performs a zero-length
copy to determine whether to perform such calls during the test run.
While this detects the common case of syscall availability,
copy_file_range() has a somewhat variable implementation on the
kernel side that can depend on certain per-filesystem features, etc.
In some implementations, a zero length copy can shortcut and return
success before ever invoking per-filesystem functionality and thus
not thoroughly testing the copy mechanism on the current system.
This can cause the test detection code to pass only to run into an
immediate failure on the first copy_file_range() call during the
test.

Tweak test_copy_range() to perform a small single byte copy to avoid
this problem. Also fix a typo bug in the errno check of the clone
range detection logic.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-13 11:01:30 +08:00
zhangyi (F) 4fbdfbe246 generic/230: reset grace time before overcome hardlimit
Currently, we call repquota to report the latest quota information
after each test case. But repquota will invoke Q_SYNC on the ext4 file
system with old quota, which may be time consuming on the low speed or
busy scratch device. If we call repquota between the "overcome
softlimit" and the "overcome hardlimit" cases, the softlimit grace time
may be exceed after repquota return, and lead to test failure.

Now, we capture the following failure when the disk is busy:

   pwrite: Disk quota exceeded
   Touch 3+4
   Touch 5+6
  +touch: cannot touch 'SCRATCH_MNT/file5': Disk quota exceeded
   touch: cannot touch 'SCRATCH_MNT/file6': Disk quota exceeded
   Touch 5
   touch: cannot touch 'SCRATCH_MNT/file5': Disk quota exceeded

This patch reset grace time before the "overcome hardlimit" case to
avoid this failure.

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-13 10:58:18 +08:00
Zorro Lang fe23418828 src/t_attr_corruption: covert value to little endian order
generic/529 always fails on ppc64 or s390x big-endian machine as:

  set posix acl: Operation not supported

Due to the members of struct posix_acl_xattr_entry/header need to be
little-endian byte order, so use htole*() helper to make sure that.

Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-09 16:14:08 +08:00
Xiaoli Feng c1edf46dbc common/rc: use _try_scratch_mount for scratch_remount
When call _scratch_remount for cifs , it always requires to input
password. This will make generic/306 generic/452 failed because
cifs remount failed.

Signed-off-by: Xiaoli Feng <xifeng@redhat.com>
Reviewed-and-tested-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-09 11:34:30 +08:00
Yang Xu b1887f84e6 common/populate: decrease the step of rm file
Now that we have allocated 2*4096*64/16(32768) inodes after "Inode btree",
 but the step of rm file is too large to create enough free inodes in agi.
 So the freecount is not enough large to make free_level gt 1 and call
 _scratch__populate on xfs will report the following failure(such as xfs/083):

Failed to create fino of sufficient height!

By decreasing the step of rm file, xfs/083 will pass.

Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-09 09:46:31 +08:00
Amir Goldstein f3c1bca7fb generic: Test that SEEK_HOLE can find a punched hole
Added a test case to seek_sanity_test and a test to run it.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-06 19:54:57 +08:00
Amir Goldstein 3408108774 fstests: Add more sanity to seek_sanity_test
seek_sanity_test checks for one of several SEEK_DATA/HOLE
behaviors and allows for the default behavior of filesystems,
where SEEK_HOLE always returns EOF.

This means that if filesystem has a regression in finding
holes, the sanity test won't catch it. And indeed this regression
happened in overlayfs on kernel v4.19 and went unnoticed.

To improve test coverage, add a flag -f to seek_sanity_test to
indicate that the default behavior is not acceptable.
Whitelist all filesystem types that are expected to detect holes
and use wrapper when invoking seek_sanity_test to add the -f flag
to those filesystems.

Overlayfs inherits expected behavior from base fs type.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-06 19:47:53 +08:00
Anand Jain cbae878f41 btrfs: try use forget to unregister device
Some btrfs test cases use btrfs module-reload to unregister devices
in the btrfs kernel. The problem with the module-reload approach is,
if test system contains btrfs as rootfs, then you can't run these
test cases.

Patches [1] introduced btrfs forget feature which can unregister
devices without the module-reload approach.

 [1]
 btrfs-progs: device scan: add new option to forget one or all scanned devices
 btrfs: introduce new ioctl to unregister a btrfs device

And this patch makes relevant changes in the fstests to use this new
feature, when available.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-06 19:41:47 +08:00
Anand Jain 1f139bbe01 fstests: btrfs verify hardening agaist duplicate fsid
We have a known bug in btrfs, that we let the device path be changed
after the device has been mounted. So using this loop hole the new
copied device would appears as if its mounted immediately after its
been copied. So this test case reproduces this issue.

For example:

Initially.. /dev/mmcblk0p4 is mounted as /

lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
mmcblk0     179:0    0 29.2G  0 disk
|-mmcblk0p4 179:4    0    4G  0 part /
|-mmcblk0p2 179:2    0  500M  0 part /boot
|-mmcblk0p3 179:3    0  256M  0 part [SWAP]
`-mmcblk0p1 179:1    0  256M  0 part /boot/efi

btrfs fi show
Label: none  uuid: 07892354-ddaa-4443-90ea-f76a06accaba
    Total devices 1 FS bytes used 1.40GiB
    devid    1 size 4.00GiB used 3.00GiB path /dev/mmcblk0p4

Copy mmcblk0 to sda
dd if=/dev/mmcblk0 of=/dev/sda

And immediately after the copy completes the change in the device
superblock is notified which the automount scans using
btrfs device scan and the new device sda becomes the mounted root
device.

lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    1 14.9G  0 disk
|-sda4        8:4    1    4G  0 part /
|-sda2        8:2    1  500M  0 part
|-sda3        8:3    1  256M  0 part
`-sda1        8:1    1  256M  0 part
mmcblk0     179:0    0 29.2G  0 disk
|-mmcblk0p4 179:4    0    4G  0 part
|-mmcblk0p2 179:2    0  500M  0 part /boot
|-mmcblk0p3 179:3    0  256M  0 part [SWAP]
`-mmcblk0p1 179:1    0  256M  0 part /boot/efi
btrfs fi show /
Label: none  uuid: 07892354-ddaa-4443-90ea-f76a06accaba
    Total devices 1 FS bytes used 1.40GiB
    devid    1 size 4.00GiB used 3.00GiB path /dev/sda4

The bug is quite nasty that you can't either unmount /dev/sda4 or
/dev/mmcblk0p4. And the problem does not get solved until you take
the sda out of the system on to another system to change its fsid using
the 'btrfstune -u' command.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-06 19:33:15 +08:00
Nikolay Borisov 0cd952cb42 btrfs/003: enable test with virtio_blk devices in VM
For a long time this test has been failing on all kinds of VM
configuration, which are using virtio_blk devices. This is due to
the fact that scsi devices are deletable and virtio_blk are not.
However, this only prevents device replace case to run and has no
negative effect on the other useful test cases.

Re-enable btrfs/003 to run by making
_require_deletable_scratch_dev_pool private to the test case and
modifying it to return success (0) or failure (1) if devices are not
deletable. Further modify the replace test case to check the return
value of this function and skip it if devices are not deletable.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-04-06 19:23:47 +08:00