apfstests/tests/btrfs/157
Qu Wenruo a0b0287cb7 btrfs/15[78]: Use proper helper to get both devid and physical offset for corruption
[BUG]
When using btrfs-progs v5.4, btrfs/157 and btrfs/158 will fail:

btrfs/157 1s ... - output mismatch (see xfstests/results//btrfs/157.out.bad)
    --- tests/btrfs/157.out 2018-09-16 21:30:48.505104287 +0100
    +++ xfstests/results//btrfs/157.out.bad 2019-12-10 15:35:43.112390076 +0000
    @@ -1,9 +1,9 @@
     QA output created by 157
     wrote 131072/131072 bytes at offset 0
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -wrote 65536/65536 bytes at offset 9437184
    +wrote 65536/65536 bytes at offset 22020096
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -wrote 65536/65536 bytes at offset 9437184
    ...
    (Run 'diff -u xfstests/tests/btrfs/157.out xfstests/results//btrfs/157.out.bad'  to see the entire diff)
btrfs/158 2s ... - output mismatch (see xfstests/results//btrfs/158.out.bad)
    --- tests/btrfs/158.out 2018-09-16 21:30:48.505104287 +0100
    +++ xfstests/results//btrfs/158.out.bad 2019-12-10 15:35:44.844388521 +0000
    @@ -1,9 +1,9 @@
     QA output created by 158
     wrote 131072/131072 bytes at offset 0
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -wrote 65536/65536 bytes at offset 9437184
    +wrote 65536/65536 bytes at offset 22020096
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -wrote 65536/65536 bytes at offset 9437184
    ...
    (Run 'diff -u xfstests/tests/btrfs/158.out xfstests/results//btrfs/158.out.bad'  to see the entire diff)

[CAUSE]
These two tests use the physical offset as golden output, but mkfs.btrfs
is free to arrange its chunk layout however it likes, so the physical
offset is never reliable.

And btrfs-progs commit c501c9e3b816 ("btrfs-progs: mkfs: match devid
order to the stripe index") just changed the layout.

So the output no longer matches and the tests fail.

[FIX]
In fact, that btrfs-progs commit changed not only the offset but also the
device order.

So we can't simply remove the physical offset from the golden output; we
also need to use proper helpers to get both the devid (mapped to its device
path) and the physical offset for corruption.
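
As a rough illustration of what such helpers extract, the sketch below runs a
similar awk filter over a hand-written sample of chunk-tree lines. The sample
text, its values, and the names sample_chunk_dump and stripe_field are
assumptions made for this example; they are not taken from a real dump or
from the test itself.

```shell
#!/bin/bash
# Hypothetical chunk-tree excerpt. The layout mimics "stripe N devid D offset O"
# lines; every value here is invented for illustration.
sample_chunk_dump()
{
	cat <<'EOF'
length 2147483648 owner 2 stripe_len 65536 type DATA|RAID6
num_stripes 4 sub_stripes 1
stripe 0 devid 1 offset 22020096
dev_uuid 00000000-0000-0000-0000-000000000001
stripe 1 devid 2 offset 1048576
dev_uuid 00000000-0000-0000-0000-000000000002
EOF
}

# Print one whitespace-separated field of the line describing a given stripe:
# field 4 is the devid, field 6 is the physical offset.
stripe_field()
{
	local stripe=$1 field=$2
	sample_chunk_dump | awk -v s="$stripe" -v f="$field" \
		'$1 == "stripe" && $3 == "devid" && $2 == s { print $f }'
}

echo "stripe 0: devid $(stripe_field 0 4) phy $(stripe_field 0 6)"
echo "stripe 1: devid $(stripe_field 1 4) phy $(stripe_field 1 6)"
```

Because both values are read from the same line, the devid and the offset
always describe the same stripe, whatever order mkfs.btrfs chose.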

As long as mkfs.btrfs still assigns devids sequentially, these tests should
handle future chunk layout changes without problems.
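
The reliance on sequential devids can be sketched as follows: if devid N
belongs to the Nth device in the pool, the device path is simply the Nth
whitespace-separated field of the pool string. The pool value and the name
devid_to_path below are hypothetical stand-ins for this example.

```shell
#!/bin/bash
# Hypothetical pool; device names are invented for illustration.
pool="/dev/sdb /dev/sdc /dev/sdd /dev/sde"

# Map a devid to a device path, assuming devid N is the Nth pool device.
devid_to_path()
{
	local devid=$1
	echo "$pool" | awk -v n="$devid" '{ print $n }'
}

devid_to_path 2
```

If mkfs.btrfs ever assigned devids in a different order, this mapping would
point at the wrong device, which is the assumption stated above.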

Reported-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
2019-12-30 02:01:21 +08:00


#! /bin/bash
# SPDX-License-Identifier: GPL-2.0
# Copyright (c) 2017 Oracle. All Rights Reserved.
#
# FS QA Test 157
#
# The test case is to reproduce a bug in raid6 reconstruction process that
# would end up with read failure if there is data corruption on two disks in
# the same horizontal stripe, e.g. due to bitrot.
#
# The bug happens when
# a) all disks are good to read,
# b) there is corrupted data on two disks in the same horizontal stripe due to
# something like bitrot,
# c) when rebuilding data after crc fails, btrfs is not able to tell whether
# other copies are good or corrupted because btrfs doesn't have crc for
# unallocated blocks.
#
# The kernel fixes are
# Btrfs: do not merge rbios if their fail stripe index are not identical
# Btrfs: make raid6 rebuild retry more
#
seq=`basename $0`
seqres=$RESULT_DIR/$seq
echo "QA output created by $seq"
here=`pwd`
tmp=/tmp/$$
status=1 # failure is the default!
trap "_cleanup; exit \$status" 0 1 2 3 15
_cleanup()
{
	cd /
	rm -f $tmp.*
}
# get standard environment, filters and checks
. ./common/rc
. ./common/filter
# remove previous $seqres.full before test
rm -f $seqres.full
# real QA test starts here
# Modify as appropriate.
_supported_fs btrfs
_supported_os Linux
_require_scratch_dev_pool 4
_require_btrfs_command inspect-internal dump-tree
_require_btrfs_fs_feature raid56
# Print the physical offset of the given stripe index, read from the chunk
# tree (tree id 3).
get_physical()
{
	local stripe=$1
	$BTRFS_UTIL_PROG inspect-internal dump-tree -t 3 $SCRATCH_DEV | \
		grep " DATA\|RAID6" -A 10 | \
		$AWK_PROG "(\$1 ~ /stripe/ && \$3 ~ /devid/ && \$2 ~ /$stripe/) { print \$6 }"
}
# Print the devid of the given stripe index, read from the chunk tree.
get_devid()
{
	local stripe=$1
	$BTRFS_UTIL_PROG inspect-internal dump-tree -t 3 $SCRATCH_DEV | \
		grep " DATA\|RAID6" -A 10 | \
		$AWK_PROG "(\$1 ~ /stripe/ && \$3 ~ /devid/ && \$2 ~ /$stripe/) { print \$4 }"
}
# Map a devid to its device path: devid N is the Nth entry of SCRATCH_DEV_POOL.
get_device_path()
{
	local devid=$1
	echo "$SCRATCH_DEV_POOL" | $AWK_PROG "{print \$$devid}"
}
_scratch_dev_pool_get 4
# step 1: create a raid6 btrfs and create a 128K file
echo "step 1......mkfs.btrfs" >>$seqres.full
mkfs_opts="-d raid6 -b 1G"
_scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
# -o nospace_cache makes sure data is written to the start position of the data
# chunk
_scratch_mount -o nospace_cache
# [0,64K) is written to stripe 0 and [64K, 128K) is written to stripe 1
$XFS_IO_PROG -f -d -c "pwrite -S 0xaa 0 128K" -c "fsync" \
"$SCRATCH_MNT/foobar" | _filter_xfs_io
logical=`${FILEFRAG_PROG} -v $SCRATCH_MNT/foobar | _filter_filefrag | cut -d '#' -f 1`
_scratch_unmount
phy0=$(get_physical 0)
devid0=$(get_devid 0)
devpath0=$(get_device_path $devid0)
phy1=$(get_physical 1)
devid1=$(get_devid 1)
devpath1=$(get_device_path $devid1)
# step 2: corrupt stripe #0 and #1
echo "step 2......simulate bitrot at:" >>$seqres.full
echo " ......stripe #0: devid $devid0 devpath $devpath0 phy $phy0" \
>>$seqres.full
echo " ......stripe #1: devid $devid1 devpath $devpath1 phy $phy1" \
>>$seqres.full
$XFS_IO_PROG -f -d -c "pwrite -S 0xbb $phy0 64K" $devpath0 > /dev/null
$XFS_IO_PROG -f -d -c "pwrite -S 0xbb $phy1 64K" $devpath1 > /dev/null
# step 3: read foobar to repair the bitrot
echo "step 3......repair the bitrot" >> $seqres.full
_scratch_mount -o nospace_cache
# read the 2nd stripe, i.e. [64K, 128K), to trigger repair
od -x -j 64K $SCRATCH_MNT/foobar
_scratch_dev_pool_put
# success, all done
status=0
exit