btrfs-progs v4.10 changed the behaviour of balance: starting a balance
operation without any filters now delays the start by 10 seconds and
prints a warning to stdout saying that a full balance is about to start
and that it can be a slow operation. The '--full-balance' flag was added
in that release to skip both the 10 second delay and the warning
message.
Our existing helper _run_btrfs_balance_start() passes that flag when the
installed btrfs-progs version supports it, avoiding the 10 second wait.
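The detection described above can be sketched roughly as follows. This is
a minimal illustration and not the actual fstests helper: the function
names balance_full_opt() and sketch_balance_start() are hypothetical, and
it assumes that BTRFS_UTIL_PROG names the btrfs binary and that the help
text of 'btrfs balance start' mentions the flag when it is supported.

```shell
# Hypothetical sketch, not the real _run_btrfs_balance_start() helper.
# Assumption: BTRFS_UTIL_PROG names the btrfs binary (defaults to "btrfs").
BTRFS_UTIL_PROG="${BTRFS_UTIL_PROG:-btrfs}"

# Print "--full-balance" if this btrfs-progs advertises the flag in its
# help text, otherwise print nothing.
balance_full_opt()
{
	if "$BTRFS_UTIL_PROG" balance start --help 2>&1 | \
			grep -q -- "--full-balance"; then
		echo "--full-balance"
	fi
}

# Start a balance, adding --full-balance when it is supported so that an
# unfiltered balance skips the 10 second delay and the warning.
sketch_balance_start()
{
	local opt
	opt="$(balance_full_opt)"
	# $opt is intentionally unquoted: an empty value expands to nothing.
	"$BTRFS_UTIL_PROG" balance start $opt "$@"
}
```

With an older btrfs-progs the option string stays empty, so the same call
works on both sides of the v4.10 change.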
Make all existing btrfs tests that trigger balance operations use the
_run_btrfs_balance_start() helper, so that we avoid wasting time and
speed up some of the tests. In particular, test btrfs/014 is now about
10x faster and tests btrfs/060 to btrfs/064 are 3 to 5 times faster
(depending on the fsstress random load).
Besides speeding up many tests that do balance operations, this also
fixes functional problems:
1) Since btrfs-progs v4.10 the test case btrfs/014 has been broken. Its
purpose is to run balance and snapshot creation in parallel, but that
was no longer happening: all snapshots were created during the 10
second delay of the first balance operation, so balance and snapshot
creation were serialized instead of running in parallel.
Fixing this test to avoid the 10 second delay immediately exposes a
regression that went into kernel 5.7-rc1, which is fixed by the
following commit:
aec7db3b13a0 ("btrfs: fix setting last_trans for reloc roots")
2) Test cases btrfs/060 to btrfs/064 now spend much more time running
fsstress, balance and other operations in parallel. There are no
longer 10 second intervals where balance is not running concurrently
with those other operations, making the tests a lot more useful
again.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
In my testing on 1GB zram devices btrfs/187 usually fails with
ENOSPC.
Add a requirement for 8GB scratch devices (empirically measured).
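A size requirement of this kind boils down to comparing the device size
against a minimum. The sketch below only illustrates the arithmetic: the
has_min_size_kb() helper is hypothetical and is not the check added by
this patch, and expressing sizes in KB is an assumption.

```shell
# Hypothetical helper for illustration; not part of fstests.
# Succeeds if a device of dev_kb kilobytes meets the min_kb requirement.
has_min_size_kb()
{
	local dev_kb=$1 min_kb=$2
	[ "$dev_kb" -ge "$min_kb" ]
}

# 8GB expressed in KB (assumed unit for this sketch).
min_kb=$((8 * 1024 * 1024))
```

A 1GB device (1 * 1024 * 1024 KB) does not satisfy the check, so the test
would be skipped on it rather than fail with ENOSPC.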
Cc: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Stress send running in parallel with balance and deduplication against
files that belong to the snapshots used by send. The goal is to verify
that these operations running in parallel do not lead to send crashing
(triggering assertion failures and BUG_ONs), or to send finding an
inconsistent snapshot that leads to a failure (reported in
dmesg/syslog). The test needs big trees (snapshots) with large
differences between the parent and send snapshots in order to hit such
issues with a good probability.
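The general shape of such a stress test is to keep the interfering
operations looping in the background while send runs in the foreground.
The sketch below shows one possible way to do that. It is not taken from
the test itself: loop_until_stopped() and the stop-file mechanism are
hypothetical and chosen only for illustration.

```shell
# Hypothetical sketch; the function name and the stop-file mechanism are
# illustrative and are not taken from the test.

# Run the given command repeatedly until the stop file appears.
loop_until_stopped()
{
	local stop_file=$1
	shift
	while [ ! -e "$stop_file" ]; do
		"$@" >/dev/null 2>&1
	done
}

# Example use (assumes SCRATCH_MNT is a mounted btrfs filesystem):
#   stop=$(mktemp -u)
#   loop_until_stopped "$stop" \
#       btrfs balance start --full-balance "$SCRATCH_MNT" &
#   ... run send in the foreground ...
#   touch "$stop"
#   wait
```

Each background loop finishes its current iteration before exiting, so the
final wait returns only once no stressor is still running.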
This currently fails on btrfs, often hitting a BUG_ON() and producing
btrfs error messages in dmesg/syslog. The problem is not new; it has
probably gone unnoticed due to the lack of test cases that exercise
these btrfs features running in parallel.
The following patches for btrfs fix the problems:
"Btrfs: fix race between send and deduplication that lead to failures and
crashes"
"Btrfs: prevent send failures and crashes due to concurrent relocation"
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>