We limit the number of blocks written in a single loop of
ext4_da_writepages() to 64 when inode uses indirect blocks. That is
unnecessary as credit estimates for mapping logically continguous run
of blocks is rather low even for inode with indirect blocks. So just
lift this limitation and properly calculate the number of necessary
credits.
This better credit estimate will also later allow us to always write
at least a single page in one iteration.
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext4_ind_trans_blocks() wrongly used 'chunk' argument to decide whether
blocks mapped are logically contiguous. That is wrong since the argument
informs whether the blocks are physically contiguous. As the blocks
mapped are always logically contiguous and that's all
ext4_ind_trans_blocks() cares about, just remove the 'chunk' argument.
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This attribute is now unused so deprecate it. We still show the old
default value to keep some compatibility but we don't allow writing to
that attribute anymore.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Writeback code got better in how it submits IO and now the number of
pages requested to be written is usually higher than original 1024.
The number is now dynamically computed based on observed throughput
and is set to be about 0.5 s worth of writeback. E.g. on ordinary
SATA drive this ends up somewhere around 10000 as my testing shows.
So remove the unnecessary smarts from ext4_da_writepages().
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In some cases we cannot start a transaction because of locking
constraints and passing started transaction into those places is not
handy either because we could block transaction commit for too long.
Transaction reservation is designed to solve these issues. It
reserves a handle with given number of credits in the journal and the
handle can be later attached to the running transaction without
blocking on commit or checkpointing. Reserved handles do not block
transaction commit in any way, they only reduce maximum size of the
running transaction (because we have to always be prepared to
accomodate request for attaching reserved handle).
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Change writeback path to create just one io_end structure for the
extent to which we submit IO and share it among bios writing that
extent. This prevents needless splitting and joining of unwritten
extents when they cannot be submitted as a single bio.
Bugs in ENOMEM handling found by Linux File System Verification project
(linuxtesting.org) and fixed by Alexey Khoroshilov
<khoroshilov@ispras.ru>.
CC: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The arithmetics adding delalloc blocks to the number of used blocks in
ext4_getattr() can easily overflow on 32-bit archs as we first multiply
number of blocks by blocksize and then divide back by 512. Make the
arithmetics more clever and also use proper type (unsigned long long
instead of unsigned long).
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
On 32-bit architectures with 32-bit sector_t computation of data offset
in ext4_xattr_fiemap() can overflow resulting in reporting bogus data
location. Fix the problem by typing block number to proper type before
shifting.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext4_lblk_t is just u32 so multiplying it by blocksize can easily
overflow for files larger than 4 GB. Fix that by properly typing the
block offsets before shifting.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
On 32-bit archs when sector_t is defined as 32-bit the logic computing
data offset in ext4_inline_data_fiemap(). Fix that by properly typing
the shifted value.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Suppress the messages releating to processing the ext4 orphan list
("truncating inode" and "deleting unreferenced inode") unless the
debug option is on, since otherwise they end up taking up space in the
log that could be used for more useful information.
Tested by opening several files, unlinking them, then
crashing the system, rebooting the system and examining
/var/log/messages.
Addresses the problem described in http://crbug.com/220976
Signed-off-by: Paul Taysom <taysom@chromium.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently punch hole is disabled in file systems with bigalloc
feature enabled. However the recent changes in punch hole patch should
make it easier to support punching holes on bigalloc enabled file
systems.
This commit changes partial_cluster handling in ext4_remove_blocks(),
ext4_ext_rm_leaf() and ext4_ext_remove_space(). Currently
partial_cluster is unsigned long long type and it makes sure that we
will free the partial cluster if all extents has been released from that
cluster. However it has been specifically designed only for truncate.
With punch hole we can be freeing just some extents in the cluster
leaving the rest untouched. So we have to make sure that we will notice
cluster which still has some extents. To do this I've changed
partial_cluster to be signed long long type. The only scenario where
this could be a problem is when cluster_size == block size, however in
that case there would not be any partial clusters so we're safe. For
bigger clusters the signed type is enough. Now we use the negative value
in partial_cluster to mark such cluster used, hence we know that we must
not free it even if all other extents has been freed from such cluster.
This scenario can be described in simple diagram:
|FFF...FF..FF.UUU|
^----------^
punch hole
. - free space
| - cluster boundary
F - freed extent
U - used extent
Also update respective tracepoints to use signed long long type for
partial_cluster.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The "head removal" branch in the condition is never used in any code
path in ext4 since the function only caller ext4_ext_rm_leaf() will make
sure that the extent is properly split before removing blocks. Note that
there is a bug in this branch anyway.
This commit removes the unused code completely and makes use of
ext4_error() instead of printk if dubious range is provided.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The discard_partial_page_buffers is no longer used anywhere so we can
simply remove it including the *_no_lock variant and
EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED define.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We're doing to get rid of ext4_discard_partial_page_buffers() since it is
duplicating some code and also partially duplicating work of
truncate_pagecache_range(), moreover the old implementation was much
clearer.
Now when the truncate_inode_pages_range() can handle truncating non page
aligned regions we can use this to invalidate and zero out block aligned
region of the punched out range and then use ext4_block_truncate_page()
to zero the unaligned blocks on the start and end of the range. This
will greatly simplify the punch hole code. Moreover after this commit we
can get rid of the ext4_discard_partial_page_buffers() completely.
We also introduce function ext4_prepare_punch_hole() to do come common
operations before we attempt to do the actual punch hole on
indirect or extent file which saves us some code duplication.
This has been tested on ppc64 with 1k block size with fsx and xfstests
without any problems.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently we do not tell mm to zero out tail of the page before truncate
in orphan_cleanup(). This is ok, because the page should not be
uptodate, however this may eventually change and I might cause problems.
Call truncate_inode_pages() as precautionary measure. Thanks Jan Kara
for pointing this out.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This reverts commit 189e868fa8.
This commit reintroduces the use of ext4_block_truncate_page() in ext4
truncate operation instead of ext4_discard_partial_page_buffers().
The statement in the commit description that the truncate operation only
zero block unaligned portion of the last page is not exactly right,
since truncate_pagecache_range() also zeroes and invalidate the unaligned
portion of the page. Then there is no need to zero and unmap it once more
and ext4_block_truncate_page() was doing the right job, although we
still need to update the buffer head containing the last block, which is
exactly what ext4_block_truncate_page() is doing.
Moreover the problem described in the commit is fixed more properly with
commit
15291164b2
jbd2: clear BH_Delay & BH_Unwritten in journal_unmap_buffer
This was tested on ppc64 machine with block size of 1024 bytes without
any problems.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In data=ordered mode we should call ext4_jbd2_file_inode() so that crash
after the truncate transaction has committed does not expose stall data
in the tail of the block.
Thanks Jan Kara for pointing that out.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This reverts commit ccb4d7af91.
This commit reintroduces functions ext4_block_truncate_page() and
ext4_block_zero_page_range() which has been previously removed in favour
of ext4_discard_partial_page_buffers().
In future commits we want to reintroduce those function and remove
ext4_discard_partial_page_buffers() since it is duplicating some code
and also partially duplicating work of truncate_pagecache_range(),
moreover the old implementation was much clearer.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
->invalidatepage() aop now accepts range to invalidate so we can make
use of it in all ext4 invalidatepage routines.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
invalidatepage now accepts range to invalidate and there are two file
system using jbd2 also implementing punch hole feature which can benefit
from this. We need to implement the same thing for jbd2 layer in order to
allow those file system take benefit of this functionality.
This commit adds length argument to the jbd2_journal_invalidatepage()
and updates all instances in ext4 and ocfs2.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Currently there is no way to truncate partial page where the end
truncate point is not at the end of the page. This is because it was not
needed and the functionality was enough for file system truncate
operation to work properly. However more file systems now support punch
hole feature and it can benefit from mm supporting truncating page just
up to the certain point.
Specifically, with this functionality truncate_inode_pages_range() can
be changed so it supports truncating partial page at the end of the
range (currently it will BUG_ON() if 'end' is not at the end of the
page).
This commit changes the invalidatepage() address space operation
prototype to accept range to be invalidated and update all the instances
for it.
We also change the block_invalidatepage() in the same way and actually
make a use of the new length argument implementing range invalidation.
Actual file system implementations will follow except the file systems
where the changes are really simple and should not change the behaviour
in any way .Implementation for truncate_page_range() which will be able
to accept page unaligned ranges will follow as well.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Pull ext4 update from Ted Ts'o:
"Fixed regressions (two stability regressions and a performance
regression) introduced during the 3.10-rc1 merge window.
Also included is a bug fix relating to allocating blocks after
resizing an ext3 file system when using the ext4 file system driver"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
jbd,jbd2: fix oops in jbd2_journal_put_journal_head()
ext4: revert "ext4: use io_end for multiple bios"
ext4: limit group search loop for non-extent files
ext4: fix fio regression