Commit Graph

348542 Commits

Author SHA1 Message Date
Eric Sandeen e7b04ac00e jbd2: don't wake kjournald unnecessarily
Don't send an extra wakeup to kjournald in the case where we
already have the proper target in j_commit_request, i.e. that
transaction has already been requested for commit.

commit deeeaf13 "jbd2: fix fsync() tid wraparound bug" changed
the logic leading to a wakeup, but it caused some extra wakeups
which were found to lead to a measurable performance regression.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
[tytso@mit.edu: reworked check to make it clearer]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-30 00:39:28 -05:00
Jan Kara 091e26dfc1 ext4: fix possible use-after-free with AIO
Running AIO is pinning inode in memory using file reference. Once AIO
is completed using aio_complete(), file reference is put and inode can
be freed from memory. So we have to be sure that calling aio_complete()
is the last thing we do with the inode.

CC: stable@vger.kernel.org
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-29 22:48:17 -05:00
Guo Chao b1deefc99e ext4: remove unnecessary NULL pointer check
brelse() and ext4_journal_force_commit() are both inlined and able
to handle NULL.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 21:41:02 -05:00
Guo Chao 41be871f74 ext4: remove useless assignment in dx_probe()
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 21:33:28 -05:00
Guo Chao 2bbbee2a68 ext4: remove unused variable in add_dirent_to_buf()
After commit 978fef9 (create __ext4_insert_dentry for dir entry
insertion), 'reclen' is not used anymore.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
2013-01-28 21:26:44 -05:00
Guo Chao d5ac777305 ext4: release buffer when checksum failed
Commit b0336e8d (ext4: calculate and verify checksums of directory
leaf blocks) and commit dbe89444 (ext4: Calculate and verify checksums
for htree nodes) forget to release buffer when checksum failed, at
some places.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
2013-01-28 21:23:24 -05:00
Lukas Czerner b06acd38a4 ext4: remove explicit WARN_ON when ext4_map_blocks() fails
In two places we call WARN_ON() before we print out the debug message,
however we agreed that the WARN_ON() is unnecessary at those places so
remove them.

Also use ext4_warning() instead of ext4_msg() and printk().

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 21:21:12 -05:00
Lukas Czerner cfa7275482 ext4: remove unused variable flags
Remove unused variable flags from dump_completed_IO(). The code is
only exercised when EXT4FS_DEBUG is defined.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
2013-01-28 21:14:11 -05:00
Jan Kara fe386132f6 ext4: fix ext4_writepage() to achieve data=ordered guarantees
So far ext4_writepage() skipped writing pages that had any delayed or
unwritten buffers attached. When blocksize < pagesize this breaks
data=ordered mode guarantees as we can have a page with one freshly
allocated buffer whose allocation is part of the committing
transaction and another buffer in the page which is delayed or
unwritten. So fix this problem by calling ext4_bio_writepage()
anyway. It will submit mapped buffers and leave others alone.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 21:06:42 -05:00
Jan Kara 8a850c3fb8 ext4: Make ext4_bio_writepage() handle unprepared buffers
So far ext4_bio_writepage() unconditionally cleared dirty bit on all
buffers underlying the page. That implicitely assumes we can write all
buffers. So far that is true because callers call into
ext4_bio_writepage() make sure all buffers in the page are mapped but:

a) it's a data corruption bug waiting to happen
b) in data=ordered mode when blocksize < pagesize we do need to write
   pages that may have only some of dirty buffers mapped.

So change ext4_bio_writepage() to skip buffers that cannot be written without
clearing their dirty bit.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 20:53:28 -05:00
Jan Kara b6a8e62f8b ext4: simplify mpage_add_bh_to_extent()
The argument b_size of mpage_add_bh_to_extent() was bogus since it was
always == blocksize (which we can easily derive from inode->i_blkbits).
Also second branch of condition:
	if (nrblocks >= EXT4_MAX_TRANS_DATA) {
	} else if ((nrblocks + (b_size >> mpd->inode->i_blkbits)) >
						EXT4_MAX_TRANS_DATA) {
	}
was never taken because (b_size >> mpd->inode->i_blkbits) == 1.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 13:06:48 -05:00
Jan Kara f8bec37037 ext4: dirty page has always buffers attached
ext4_writepage(), write_cache_pages_da(), and mpage_da_submit_io()
doesn't have to deal with the case when page doesn't have buffers. We
attach buffers to a page in ->write_begin() and ->page_mkwrite() which
covers all places where a page can become dirty.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 12:55:08 -05:00
Jan Kara 002bd7fa3a ext4: simplify list handling in ext4_do_flush_completed_IO()
The function splices i_completed_io_list to its private list
first.  From that moment on we don't need any lock for working with
io_end structures because all io_end structure on the list are only
our own. So we can remove the other two lists in the function and free
io_end immediately after we are done with it.

CC: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 09:49:15 -05:00
Jan Kara 84c17543ab ext4: move work from io_end to inode
It does not make much sense to have struct work in ext4_io_end_t
because we always use it for only one ext4_io_end_t per inode (the
first one in the i_completed_io list). So just move the structure to
inode itself.  This also allows for a small simplification in
processing io_end structures.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 09:43:46 -05:00
Jan Kara fe089c77f1 ext4: remove __ext4_journalled_writepage() from mpage_da_submit_io()
We don't support delayed allocation in data=journal mode. So checking for it in
mpage_da_submit_io() doesn't make really sence. If we ever decide to extend
delayed allocation support to data=journal mode, adding
__ext4_journalled_writepage() call will be the least of problems we have to
solve. Most likely we'd have to implement separate writepages call anyways
because we don't have transaction credits for writing more than a single page
so mapping of page buffers would have to be done differently.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 09:38:49 -05:00
Jan Kara 1ae48a6354 ext4: use redirty_page_for_writepage() in ext4_bio_write_page()
When we cannot write a page we should use redirty_page_for_writepage()
instead of plain set_page_dirty(). That tells writeback code we have
problems, redirties only the page (redirtying buffers is not needed),
and updates mm accounting of failed page writes.

Also move clearing of buffer dirty flag after io_submit_add_bh(). At that
moment we are sure buffer will be going to disk.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 09:32:54 -05:00
Jan Kara 36ade451a5 ext4: Always use ext4_bio_write_page() for writeout
Currently we sometimes used block_write_full_page() and sometimes
ext4_bio_write_page() for writeback (depending on mount options and call
path). Let's always use ext4_bio_write_page() to simplify things a bit.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 09:30:52 -05:00
Zheng Liu 8bad6fc813 ext4: add punching hole support for non-extent-mapped files
This patch add supports for indirect file support punching hole.  It
is almost the same as ext4_ext_punch_hole.  First, we invalidate all
pages between this hole, and then we try to deallocate all blocks of
this hole.

A recursive function is used to handle deallocation of blocks.  In
this function, it iterates over the entries in inode's i_blocks or
indirect blocks, and try to free the block for each one of them.

After applying this patch, xfstest #255 will not pass w/o extent because
indirect-based file doesn't support unwritten extents.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-28 09:21:37 -05:00
Chen Gang 03dafb5f59 ext4: fix memory leak when quota options are specified multiple times
When usrjquota or grpjquota mount options are specified several times,
we leak memory storing the names. Free the memory correctly.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2013-01-24 23:24:58 -05:00
Theodore Ts'o c3ad83d9ef quota: autoload the quota_v2 module for QFMT_VFS_V1 quota format
Otherwise, ext4 file systems with the quota featured enable will get a
very confusing "No such process" error message if the quota code is
built as a module and the quota_v2 module has not been loaded.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
2013-01-24 23:24:56 -05:00
Theodore Ts'o 72ba74508b ext4: release sysfs kobject when failing to enable quotas on mount
In addition, print the error returned from ext4_enable_quotas()

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Cc: stable@vger.kernel.org
2013-01-24 23:24:54 -05:00
Zheng Liu aaddea812c ext4: add tracepoint in punching hole
This patch adds a tracepoint in ext4_punch_hole.

CC: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-16 20:21:26 -05:00
Theodore Ts'o 7f5118629f ext4: trigger the lazy inode table initialization after resize
After we have finished extending the file system, we need to trigger a
the lazy inode table thread to zero out the inode tables.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-13 08:41:45 -05:00
Eryu Guan 15b49132fc ext4: check bh in ext4_read_block_bitmap()
Validate the bh pointer before using it, since
ext4_read_block_bitmap_nowait() might return NULL.

I've seen this in fsfuzz testing.

 EXT4-fs error (device loop0): ext4_read_block_bitmap_nowait:385: comm touch: Cannot get buffer for block bitmap - block_group = 0, block_bitmap = 3925999616
 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [<ffffffff8121de25>] ext4_wait_block_bitmap+0x25/0xe0
 ...
 Call Trace:
  [<ffffffff8121e1e5>] ext4_read_block_bitmap+0x35/0x60
  [<ffffffff8125e9c6>] ext4_free_blocks+0x236/0xb80
  [<ffffffff811d0d36>] ? __getblk+0x36/0x70
  [<ffffffff811d0a5f>] ? __find_get_block+0x8f/0x210
  [<ffffffff81191ef3>] ? kmem_cache_free+0x33/0x140
  [<ffffffff812678e5>] ext4_xattr_release_block+0x1b5/0x1d0
  [<ffffffff812679be>] ext4_xattr_delete_inode+0xbe/0x100
  [<ffffffff81222a7c>] ext4_free_inode+0x7c/0x4d0
  [<ffffffff812277b8>] ? ext4_mark_inode_dirty+0x88/0x230
  [<ffffffff8122993c>] ext4_evict_inode+0x32c/0x490
  [<ffffffff811b8cd7>] evict+0xa7/0x1c0
  [<ffffffff811b8ed3>] iput_final+0xe3/0x170
  [<ffffffff811b8f9e>] iput+0x3e/0x50
  [<ffffffff812316fd>] ext4_add_nondir+0x4d/0x90
  [<ffffffff81231d0b>] ext4_create+0xeb/0x170
  [<ffffffff811aae9c>] vfs_create+0xac/0xd0
  [<ffffffff811ac845>] lookup_open+0x185/0x1c0
  [<ffffffff8129e3b9>] ? selinux_inode_permission+0xa9/0x170
  [<ffffffff811acb54>] do_last+0x2d4/0x7a0
  [<ffffffff811af743>] path_openat+0xb3/0x480
  [<ffffffff8116a8a1>] ? handle_mm_fault+0x251/0x3b0
  [<ffffffff811afc49>] do_filp_open+0x49/0xa0
  [<ffffffff811bbaad>] ? __alloc_fd+0xdd/0x150
  [<ffffffff8119da28>] do_sys_open+0x108/0x1f0
  [<ffffffff8119db51>] sys_open+0x21/0x30
  [<ffffffff81618959>] system_call_fastpath+0x16/0x1b

Also fix comment for ext4_read_block_bitmap_nowait()

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-01-12 16:33:25 -05:00
Wang Shilong aebf02430d ext4: use unlikely to improve the efficiency of the kernel
Because the function 'sb_getblk' seldomly fails to return NULL
value,it will be better to use 'unlikely' to optimize it.

Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-12 16:28:47 -05:00