linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Eric Whitney	d0abafac8c	ext4: fix bigalloc regression Commit `f5a44db5d2` introduced a regression on filesystems created with the bigalloc feature (cluster size > blocksize). It causes xfstests generic/006 and /013 to fail with an unexpected JBD2 failure and transaction abort that leaves the test file system in a read only state. Other xfstests run on bigalloc file systems are likely to fail as well. The cause is the accidental use of a cluster mask where a cluster offset was needed in ext4_ext_map_blocks(). Signed-off-by: Eric Whitney <enwlinux@gmail.com>	2014-01-06 14:00:23 -05:00
Theodore Ts'o	f5a44db5d2	ext4: add explicit casts when masking cluster sizes The missing casts can cause the high bits of the 64-bit physical block numbers to be lost. Set up new macros which allow us to make sure the right thing happens, even if at some point we end up supporting larger logical block numbers. Thanks to Emese Revfy and the PaX security team for reporting this issue. Reported-by: PaX Team <pageexec@freemail.hu> Reported-by: Emese Revfy <re.emese@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-20 09:29:35 -05:00
Jan Kara	34cf865d54	ext4: fix deadlock when writing in ENOSPC conditions Akira-san has been reporting rare deadlocks of his machine when running xfstests test 269 on ext4 filesystem. The problem turned out to be in ext4_da_reserve_metadata() and ext4_da_reserve_space() which called ext4_should_retry_alloc() while holding i_data_sem. Since ext4_should_retry_alloc() can force a transaction commit, this is a lock ordering violation and leads to deadlocks. Fix the problem by just removing the retry loops. These functions should just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that function must take care of retrying after dropping all necessary locks. Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-18 00:44:44 -05:00
Jan Kara	30fac0f75d	ext4: Do not reserve clusters when fs doesn't support extents When the filesystem doesn't support extents (like in ext2/3 compatibility modes), there is no need to reserve any clusters. Space estimates for writing are exact, hole punching doesn't need new metadata, and there are no unwritten extents to convert. This fixes a problem where a filesystem that still has some free space when accessed with a native ext2/3 driver suddenly reports ENOSPC when accessed with the ext4 driver. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-08 21:11:59 -05:00
Al Viro	9105bb149b	ext4: fix del_timer() misuse for ->s_err_report That thing should be del_timer_sync(); consider what happens if ext4_put_super() call of del_timer() happens to come just as it's getting run on another CPU. Since that timer reschedules itself to run next day, you are pretty much guaranteed that you'll end up with kfree'd scheduled timer, with usual fun consequences. AFAICS, that's -stable fodder all way back to 2010... [the second del_timer_sync() is almost certainly not needed, but it doesn't hurt either] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-08 20:52:31 -05:00
Eryu Guan	5946d08937	ext4: check for overlapping extents in ext4_valid_extent_entries() A corrupted ext4 may have out of order leaf extents, i.e. extent: lblk 0--1023, len 1024, pblk 9217, flags: LEAF UNINIT extent: lblk 1000--2047, len 1024, pblk 10241, flags: LEAF UNINIT ^^^^ overlap with previous extent Reading such an extent could hit BUG_ON() in ext4_es_cache_extent(). BUG_ON(end < lblk); The problem is that __read_extent_tree_block() tries to cache holes as well but assumes 'lblk' is greater than 'prev' and passes an underflowed length to ext4_es_cache_extent(). Fix it by checking for overlapping extents in ext4_valid_extent_entries(). I hit this when fuzz testing ext4, and am able to reproduce it by modifying the on-disk extent by hand. Also add the check for (ee_block + len - 1) in ext4_valid_extent() to make sure the value does not overflow. Ran xfstests on patched ext4 and saw no regressions. Cc: Lukáš Czerner <lczerner@redhat.com> Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-03 21:22:21 -05:00
Junho Ryu	4e8d213980	ext4: fix use-after-free in ext4_mb_new_blocks ext4_mb_put_pa should hold pa->pa_lock before accessing pa->pa_count. While ext4_mb_use_preallocated checks pa->pa_deleted first and then increments pa->count later, ext4_mb_put_pa decrements pa->pa_count before holding pa->pa_lock and then sets pa->pa_deleted. * Free sequence ext4_mb_put_pa (1): atomic_dec_and_test pa->pa_count ext4_mb_put_pa (2): lock pa->pa_lock ext4_mb_put_pa (3): check pa->pa_deleted ext4_mb_put_pa (4): set pa->pa_deleted=1 ext4_mb_put_pa (5): unlock pa->pa_lock ext4_mb_put_pa (6): remove pa from a list ext4_mb_pa_callback: free pa * Use sequence ext4_mb_use_preallocated (1): iterate over preallocation ext4_mb_use_preallocated (2): lock pa->pa_lock ext4_mb_use_preallocated (3): check pa->pa_deleted ext4_mb_use_preallocated (4): increase pa->pa_count ext4_mb_use_preallocated (5): unlock pa->pa_lock ext4_mb_release_context: access pa * Use-after-free sequence [initial status] <pa->pa_deleted = 0, pa_count = 1> ext4_mb_use_preallocated (1): iterate over preallocation ext4_mb_use_preallocated (2): lock pa->pa_lock ext4_mb_use_preallocated (3): check pa->pa_deleted ext4_mb_put_pa (1): atomic_dec_and_test pa->pa_count [pa_count decremented] <pa->pa_deleted = 0, pa_count = 0> ext4_mb_use_preallocated (4): increase pa->pa_count [pa_count incremented] <pa->pa_deleted = 0, pa_count = 1> ext4_mb_use_preallocated (5): unlock pa->pa_lock ext4_mb_put_pa (2): lock pa->pa_lock ext4_mb_put_pa (3): check pa->pa_deleted ext4_mb_put_pa (4): set pa->pa_deleted=1 [race condition!] <pa->pa_deleted = 1, pa_count = 1> ext4_mb_put_pa (5): unlock pa->pa_lock ext4_mb_put_pa (6): remove pa from a list ext4_mb_pa_callback: free pa ext4_mb_release_context: access pa AddressSanitizer has detected use-after-free in ext4_mb_new_blocks Bug report: http://goo.gl/rG1On3 Signed-off-by: Junho Ryu <jayr@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-12-03 18:10:28 -05:00
Theodore Ts'o	ae1495b12d	ext4: call ext4_error_inode() if jbd2_journal_dirty_metadata() fails While it's true that errors can only happen if there is a bug in jbd2_journal_dirty_metadata(), if a bug does happen, we need to halt the kernel or remount the file system read-only in order to avoid further data loss. The ext4_journal_abort_handle() function doesn't do any of this, and while this call (since it doesn't adjust refcounts) will likely result in the file system eventually deadlocking since the current transaction will never be able to close, it's much cleaner to let ext4's error handling system deal with this situation. There's a separate bug here which is that if certain jbd2 errors occur and the file system is mounted errors=continue, the file system will probably eventually grind to a halt as described above. But things have been this way for a long time, and usually when we have these sorts of errors it's pretty much a disaster --- and that's why the jbd2 layer aggressively retries memory allocations, which is the most likely cause of these jbd2 errors. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org	2013-12-02 09:31:36 -05:00
Linus Torvalds	4fbf888acc	Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 changes from Ted Ts'o: "Ext4 updates for 3.13. Mostly bug fixes and cleanups" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: add prototypes for macro-generated functions ext4: return non-zero st_blocks for inline data ext4: use prandom_u32() instead of get_random_bytes() ext4: remove unreachable code after ext4_can_extents_be_merged() ext4: remove unreachable code in ext4_can_extents_be_merged() ext4: avoid bh leak in retry path of ext4_expand_extra_isize_ea() ext4: don't count free clusters from a corrupt block group ext4: fix FITRIM in no journal mode ext4: drop set but otherwise unused variable from ext4_add_dirent_to_inline() ext4: change ext4_read_inline_dir() to return 0 on success ext4: pair trace_ext4_writepages & trace_ext4_writepages_result ext4: add ratelimiting to ext4 messages ext4: fix performance regression in ext4_writepages ext4: fixup kerndoc annotation of mpage_map_and_submit_extent() ext4: fix assertion in ext4_add_complete_io()	2013-11-14 17:19:58 +09:00
Linus Torvalds	9bc9ccd7db	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "All kinds of stuff this time around; some more notable parts: - RCU'd vfsmounts handling - new primitives for coredump handling - files_lock is gone - Bruce's delegations handling series - exportfs fixes plus misc stuff all over the place" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits) ecryptfs: ->f_op is never NULL locks: break delegations on any attribute modification locks: break delegations on link locks: break delegations on rename locks: helper functions for delegation breaking locks: break delegations on unlink namei: minor vfs_unlink cleanup locks: implement delegations locks: introduce new FL_DELEG lock flag vfs: take i_mutex on renamed file vfs: rename I_MUTEX_QUOTA now that it's not used for quotas vfs: don't use PARENT/CHILD lock classes for non-directories vfs: pull ext4's double-i_mutex-locking into common code exportfs: fix quadratic behavior in filehandle lookup exportfs: better variable name exportfs: move most of reconnect_path to helper function exportfs: eliminate unused "noprogress" counter exportfs: stop retrying once we race with rename/remove exportfs: clear DISCONNECTED on all parents sooner exportfs: more detailed comment for path_reconnect ...	2013-11-13 15:34:18 +09:00
Andreas Dilger	3f61c0cc70	ext4: add prototypes for macro-generated functions It isn't very easy to find the declarations for the functions created by EXT4_INODE_BIT_FNS() because the names are generated by macros: ext4_test_inode_flag, ext4_set_inode_flag, ext4_clear_inode_flag ext4_test_inode_state, ext4_set_inode_state, ext4_clear_inode_state Add explicit declarations for these functions so that grep and tags can find them. Signed-off-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-11 22:40:40 -05:00
Andreas Dilger	9206c56155	ext4: return non-zero st_blocks for inline data Return a non-zero st_blocks to userspace for statfs() and friends. Some versions of tar will assume that files with st_blocks == 0 do not contain any data and will skip reading them entirely. Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-11 22:38:12 -05:00
J. Bruce Fields	375e289ea8	vfs: pull ext4's double-i_mutex-locking into common code We want to do this elsewhere as well. Also catch any attempts to use it for directories (where this ordering would conflict with ancestor-first directory ordering in lock_rename). Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Acked-by: Jeff Layton <jlayton@redhat.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:39 -05:00
Theodore Ts'o	dd1f723bf5	ext4: use prandom_u32() instead of get_random_bytes() Many of the uses of get_random_bytes() do not actually need cryptographically secure random numbers. Replace those uses with a call to prandom_u32(), which is faster and which doesn't consume entropy from the /dev/random driver. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-08 00:14:53 -05:00
Eric Sandeen	f275411440	ext4: remove unreachable code after ext4_can_extents_be_merged() Commit `ec22ba8e` ("ext4: disable merging of uninitialized extents") ensured that if either extent under consideration is uninit, we decline to merge, and ext4_can_extents_be_merged() returns false. So there is no need for the caller to then test whether the extent under consideration is uninitialized; if it were, we wouldn't have gotten that far. The comments were also inaccurate; ext4_can_extents_be_merged() no longer XORs the states, it fails if either is uninit. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>	2013-11-07 22:22:08 -05:00
Eric Sandeen	da0169b3b9	ext4: remove unreachable code in ext4_can_extents_be_merged() Commit `ec22ba8e` ("ext4: disable merging of uninitialized extents") ensured that if either extent under consideration is uninit, we decline to merge, and immediately return. But right after that test, we test again for an uninit extent; we can never hit this. So just remove the impossible test and associated variable. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>	2013-11-04 09:58:26 -05:00
Theodore Ts'o	dcb9917ba0	ext4: avoid bh leak in retry path of ext4_expand_extra_isize_ea() Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-10-31 23:00:24 -04:00
Darrick J. Wong	2746f7a170	ext4: don't count free clusters from a corrupt block group A bg that's been flagged "corrupt" by definition has no free blocks, so that the allocator won't be tempted to use the damaged bg. Therefore, we shouldn't count the clusters in the damaged group when calculating free counts. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>	2013-10-31 11:46:31 -04:00
Lukas Czerner	8f9ff18920	ext4: fix FITRIM in no journal mode When using FITRIM ioctl on a file system without journal it will only trim the block group once, no matter how many times you invoke FITRIM ioctl and how many block you release from the block group. It is because we only clear EXT4_GROUP_INFO_WAS_TRIMMED_BIT in journal callback. Fix this by clearing the bit in no journal mode as well. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Jorge Fábregas <jorge.fabregas@gmail.com>	2013-10-30 11:10:52 -04:00
Azat Khuzhin	5ba052fe33	ext4: drop set but otherwise unused variable from ext4_add_dirent_to_inline() Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-30 10:53:10 -04:00
BoxiLiu	48ffdab1c1	ext4: change ext4_read_inline_dir() to return 0 on success In ext4_read_inline_dir(), if there is inline data, the successful return value is the return value of ext4_read_inline_data(). However, this is used by ext4_readdir(), and while it seems harmless to return a positive value on success, it's inconsistent, since historically we've always returned 0 on success. Signed-off-by: BoxiLiu <lewis.liulei@huawei.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Tao Ma <boyu.mt@taobao.com>	2013-10-30 08:07:20 -04:00
Ming Lei	bbf023c74d	ext4: pair trace_ext4_writepages & trace_ext4_writepages_result Pair the two trace events to make troubleshooting writepages easier, and to make it more convenient to write a simple script to parse the traces. Cc: linux-ext4@vger.kernel.org Cc: Jan Kara <jack@suse.cz> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-30 07:27:16 -04:00
Theodore Ts'o	efbed4dc58	ext4: add ratelimiting to ext4 messages In the case of a storage device that suddenly disappears, or in the case of significant file system corruption, this can result in a huge flood of messages being sent to the console. This can overflow the file system containing /var/log/messages, or if a serial console is configured, this can slow down the system so much that a hardware watchdog can end up triggering forcing a system reboot. Google-Bug-Id: `7258357` Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-17 21:11:01 -04:00
Ming Lei	aeac589a74	ext4: fix performance regression in ext4_writepages Commit 4e7ea81db5 (ext4: restructure writeback path) introduces another performance regression on random write: - one more page may be added to the ext4 extent in mpage_prepare_extent_to_map, and will be submitted for I/O, so nr_to_write will become -1 before 'done' is set - worse, dirty pages may still be retrieved from the page cache after nr_to_write becomes negative, so lots of small chunks can be submitted to the block device when page writeback is catching up with the write path, and performance suffers. On one ARM A15 board with a SATA 3.0 SSD (CPU: 1.5GHz dual core, RAM: 2GB, SATA controller: 3.0Gbps), this patch improves the result of the test below from 157MB/sec to 174MB/sec (>10%): dd if=/dev/zero of=./z.img bs=8K count=512K The above test is actually a prototype of the block write test in the bonnie++ utility. This patch makes sure no more pages than nr_to_write can be added to the extent for mapping, so that nr_to_write won't become negative. Cc: linux-ext4@vger.kernel.org Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-17 18:56:16 -04:00
Linus Torvalds	0056019da4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull tmpfile fix from Al Viro: "A fix for double iput() in ->tmpfile() on ext3 and ext4; I'd fucked it up, Miklos has caught it" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ext[34]: fix double put in tmpfile	2013-10-16 17:18:18 -07:00
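
Note on f5a44db5d2 and d0abafac8c: the two bigalloc commits at the top of this log turn on the difference between a cluster mask (round a block number down to the start of its cluster) and a cluster offset (position of the block inside its cluster), and on whether the mask is widened before it is applied to a 64-bit physical block number. The following is a minimal user-space sketch of that distinction, not the kernel's macros. The type and function names are hypothetical, the cluster ratio is assumed to be a power of two, and `unsigned int` is assumed to be 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the ext4 block-number types. */
typedef uint64_t fsblk_t;   /* physical block number (64-bit) */
typedef uint32_t lblk_t;    /* logical block number (32-bit)  */

/*
 * Uncast mask: (ratio - 1) and its complement are computed in 32 bits.
 * The 32-bit mask is then zero-extended to 64 bits, so bits 32..63 of
 * pblk are cleared along with the in-cluster offset bits.
 */
static inline fsblk_t pblk_cluster_start_uncast(fsblk_t pblk, unsigned int ratio)
{
    return pblk & ~(ratio - 1);
}

/* Cast first: the complement is taken in 64 bits and the high bits survive. */
static inline fsblk_t pblk_cluster_start(fsblk_t pblk, unsigned int ratio)
{
    return pblk & ~((fsblk_t)ratio - 1);
}

/* Cluster mask for a logical block: first block of the containing cluster. */
static inline lblk_t lblk_cluster_start(lblk_t lblk, unsigned int ratio)
{
    return lblk & ~((lblk_t)ratio - 1);
}

/* Cluster offset for a logical block: index of the block within its cluster. */
static inline lblk_t lblk_cluster_offset(lblk_t lblk, unsigned int ratio)
{
    return lblk & ((lblk_t)ratio - 1);
}
```

With a ratio of 16, logical block 35 has cluster start 32 and cluster offset 3. Using one helper where the other was intended silently yields a plausible-looking but wrong block number, which is the kind of mix-up d0abafac8c describes in ext4_ext_map_blocks().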


2111 Commits
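
Note on 5946d08937: the overlap check described in that commit message can be illustrated with a small user-space sketch. This is not the code of ext4_valid_extent() or ext4_valid_extent_entries(); the struct and function names below are hypothetical, and the extent is reduced to a starting logical block and a length. It assumes leaf extents are expected in ascending order, and it rejects both an extent whose last block wraps around 32 bits and an extent that starts before the previous one ends.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified extent: starting logical block and length. */
struct extent {
    uint32_t lblk;
    uint32_t len;
};

/* Reject empty extents and extents whose last block (lblk + len - 1) wraps. */
static inline bool extent_range_ok(const struct extent *ex)
{
    if (ex->len == 0)
        return false;
    uint32_t last = ex->lblk + ex->len - 1;   /* wraps modulo 2^32 */
    return last >= ex->lblk;
}

/* Check that a leaf's extents are individually sane, ordered, and disjoint. */
static inline bool leaf_extents_valid(const struct extent *ex, size_t n)
{
    uint64_t next_free = 0;   /* first logical block not yet covered */

    for (size_t i = 0; i < n; i++) {
        if (!extent_range_ok(&ex[i]))
            return false;
        if (ex[i].lblk < next_free)
            return false;     /* overlaps or precedes the previous extent */
        next_free = (uint64_t)ex[i].lblk + ex[i].len;
    }
    return true;
}
```

For the corrupted leaf quoted in the commit message (one extent starting at block 0 with length 1024, followed by one starting at block 1000 with length 1024), `leaf_extents_valid()` returns false because the second extent starts before block 1024.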