The block sizes reported by stat and DIOINFO aren't required to be
powers of two. This can happen on an XFS filesystem with a realtime
extent size that isn't a power of two; on such filesystems, certain IO
calls will fail due to alignment issues. Fix that by providing rounding
helpers that work for all sizes.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
If fsx encounters a situation where copy_file_range reports that it
copied more than it was asked to, we report this as a failure.
Unfortunately, the parameters to the print function are backwards,
leading to this bogus complaint about a short copy:
do_copy_range: asked 28672, copied 24576??
When we really asked to copy 24k but 28k was copied instead.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
We shouldn't feed sizeof() to strncpy as the string length. Just use
snprintf, which at least doesn't have the zero termination problems.
In file included from /usr/include/string.h:495,
from ../src/global.h:73,
from fsx.c:16:
In function 'strncpy',
inlined from 'main' at fsx.c:2944:5:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:106:10: warning:
'__builtin_strncpy' specified bound 4096 equals destination size
[-Wstringop-truncation]
106 | return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'strncpy',
inlined from 'main' at fsx.c:2914:4:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:106:10: warning:
'__builtin_strncpy' specified bound 1024 equals destination size
[-Wstringop-truncation]
106 | return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
The fsx contains different read and write operations, especially the
AIO and general IO read/write. The fsx chooses one kind of read/write
from AIO and general IO to run. To make the logic clear, use a common
fsx_rw() function to swith between these two kinds of read/write.
Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
We have very similar code that generates the destination range for clone,
dedupe and copy_file_range operations, so avoid duplicating the code three
times and move it into a helper function.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
While running generic/521 I've had fsx taking a lot of CPU time and not
making any progress for several hours. Attaching gdb to the fsx process
revealed that fsx was in the loop that generates the ranges for a
copy_file_range operation, in particular the loop seemed to never end
because the range defined by 'offset2' kept overlapping with the range
defined by 'offset'.
So far this happened one time only in one of my test VMs with generic/521.
Fix this by breaking out of the loop after trying 30 times, like we
currently do for dedupe operations, which results in logging the operation
as skipped.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
While running generic/457 I've had fsx taking a lot of CPU time and not
making any progress for over an hour. Attaching gdb to the fsx process
revealed that fsx was in the loop that generates the ranges for a clone
operation, in particular the loop seemed to never end because the range
defined by 'offset2' kept overlapping with the range defined by 'offset'.
So far this happened two times in one of my test VMs with generic/457.
Fix this by breaking out of the loop after trying 30 times, like we
currently do for dedupe operations, which results in logging the operation
as skipped.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Currently we are limiting the range for zero range operations to stay
within the i_size boundary. This is not optimal because like this we lose
coverage of the filesystem's zero range implementation, since zero range
operations are allowed to cross the i_size. Fix this by limiting the range
to 'maxfilelen' and not 'file_size', and update the 'file_size' after each
zero range operation if needed.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
We are never using the FALLOC_FL_KEEP_SIZE flag for zero range operations
even when we intend to use it. So fix it by setting that flag for the
call to fallocate(2) if the 'keep_size' parameter is true.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Support 64-bit operation counts so that we can run long-soak tests for
more than 2 billion fsxops.
I figured that testcalls and simulatedopcount should both be signed
because numops is also signed. Granted, I guess numops is signed so
that we can set it to the magic value -1 and have fsx run "forever".
[Eryu: add Darrick's explanation about why we change testcalls to a
signed value.]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
fsx has a closeopen option where it will close and then re-open the
file. This is handy, but what is really more useful is to drop the file
from cache completely, so add a drop_caches into this operation so that
the file is read back completely from disk to be really evil.
[Eryu: fix drop cache failure number to stay within 190]
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
I was running down a i_size problem and was missing the failure until
the next iteration of fsx operations because we do the file size check
_before_ the closeopen operation. Move it after the closeopen operation
so we can catch problems where the file gets messed up on disk.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
On 32-bit systems, the offsets are 'unsigned long' (32-bit) which means
that we must cast the explicitly to unsigned long long before feeding
them to llabs. Without the type conversion we fail to sign-extend the
llabs parameter, try to make a copy/clone/dedupe call with overlapping
ranges, and fsx aborts and the test fails.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
The copy_file_range() test detection code performs a zero-length
copy to determine whether to perform such calls during the test run.
While this detects the common case of syscall availability,
copy_file_range() has a somewhat variable implementation on the
kernel side that can depend on certain per-filesystem features, etc.
In some implementations, a zero length copy can shortcut and return
success before ever invoking per-filesystem functionality and thus
not thoroughly testing the copy mechanism on the current system.
This can cause the test detection code to pass only to run into an
immediate failure on the first copy_file_range() call during the
test.
Tweak test_copy_range() to perform a small single byte copy to avoid
this problem. Also fix a typo bug in the errno check of the clone
range detection logic.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Older kernels (prior commit 494633fac7896 "vfs: vfs_dedupe_file_range()
doesn't return EOPNOTSUPP") will return EINVAL when operation is not
supported. Make fsx treat this error as a sign of unsupported
deduplication as well to make it usable with these older kernels.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
In configure script, we only check whether or not the build of test
program succeeds, but that doesn't mean the kernel has implemented
the syscall, so checking for this case.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Add support for the copy_file_range system call to fsx.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[dchinner: copy_file_range() needs to obey read/write constraints
otherwise is blows up when direct IO is used]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Use an enum to define operation codes and the boundaries between
operation classes so that we can add new commands without having to
change a bunch of unrelated #defines.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-By: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Add a new option to make fsx read the file after each operation and
compare it with the good buffer to try to catch corruptions as soon as
they occur.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-By: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
When running multiple fsx processes simultaneously (e.g.
generic/455), it is difficult to tell the seed value for one fsx
process if the seed value is needed to reproduce a log-replay
failure.
Fix it by outputting the seed value after logid is initialized.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>