linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Zheng Liu	8bb9da943a	jbd: use kmem_cache_zalloc for allocating journal head This commit tries to use kmem_cache_zalloc instead of kmem_cache_alloc/ memset when a new journal head is alloctated. Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz>	2013-04-29 14:34:05 +02:00
Jan Kara	e678a4f0f5	jbd: don't wait (forever) for stale tid caused by wraparound In the case where an inode has a very stale transaction id (tid) in i_datasync_tid or i_sync_tid, it's possible that after a very large (2**31) number of transactions, that the tid number space might wrap, causing tid_geq()'s calculations to fail. Commit `d9b0193` "jbd: fix fsync() tid wraparound bug" attempted to fix this problem, but it only avoided kjournald spinning forever by fixing the logic in jbd_log_start_commit(). Signed-off-by: Jan Kara <jack@suse.cz>	2013-03-27 17:30:59 +01:00
Eric Sandeen	7e2fb2d7e6	jbd: don't wake kjournald unnecessarily Don't send an extra wakeup to kjournald in the case where we already have the proper target in j_commit_request, i.e. that commit has already been requested for commit. commit `d9b0193` "jbd: fix fsync() tid wraparound bug" changed the logic leading to a wakeup, but it caused some extra wakeups which were found to lead to a measurable performance regression. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>	2013-01-14 22:50:45 +01:00
Jan Kara	2e84f2641e	jbd: don't write superblock when unmounting an ro filesystem This sequence: results in an IO error when unmounting the RO filesystem. The bug was introduced by: commit `9754e39c7b` Author: Jan Kara <jack@suse.cz> Date: Sat Apr 7 12:33:03 2012 +0200 jbd: Split updating of journal superblock and marking journal empty which lost some of the magic in journal_update_superblock() which used to test for a journal with no outstanding transactions. This is a port of a jbd2 fix by Eric Sandeen. CC: <stable@vger.kernel.org> # 3.4.x Signed-off-by: Jan Kara <jack@suse.cz>	2012-08-15 13:53:30 +02:00
Artem Bityutskiy	12810ad708	jbd/jbd2: nuke write_super from comments The '->write_super' superblock method is gone, and this patch removes all the references to 'write_super' from various jbd and jbd2. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-08-04 12:15:36 +04:00
Jan Kara	fd2cbd4dfa	jbd: Write journal superblock with WRITE_FUA after checkpointing If journal superblock is written only in disk's caches and other transaction starts reusing space of the transaction cleaned from the log, it can happen blocks of a new transaction reach the disk before journal superblock. When power failure happens in such case, subsequent journal replay would still try to replay the old transaction but some of it's blocks may be already overwritten by the new transaction. For this reason we must use WRITE_FUA when updating log tail and we must first write new log tail to disk and update in-memory information only after that. Signed-off-by: Jan Kara <jack@suse.cz>	2012-05-15 23:34:37 +02:00
Jan Kara	1ce8486dcc	jbd: protect all log tail updates with j_checkpoint_mutex There are some log tail updates that are not protected by j_checkpoint_mutex. Some of these are harmless because they happen during startup or shutdown but updates in journal_commit_transaction() and journal_flush() can really race with other log tail updates (e.g. someone doing journal_flush() with someone running cleanup_journal_tail()). So protect all log tail updates with j_checkpoint_mutex. Signed-off-by: Jan Kara <jack@suse.cz>	2012-05-15 23:34:36 +02:00
Jan Kara	9754e39c7b	jbd: Split updating of journal superblock and marking journal empty There are three case of updating journal superblock. In the first case, we want to mark journal as empty (setting s_sequence to 0), in the second case we want to update log tail, in the third case we want to update s_errno. Split these cases into separate functions. It makes the code slightly more straightforward and later patches will make the distinction even more important. Signed-off-by: Jan Kara <jack@suse.cz>	2012-05-15 23:34:36 +02:00
Jan Kara	2db938bee3	jbd: Refine commit writeout logic Currently we write out all journal buffers in WRITE_SYNC mode. This improves performance for fsync heavy workloads but hinders performance when writes are mostly asynchronous, most noticably it slows down readers and users complain about slow desktop response etc. So submit writes as asynchronous in the normal case and only submit writes as WRITE_SYNC if we detect someone is waiting for current transaction commit. I've gathered some numbers to back this change. The first is the read latency test. It measures time to read 1 MB after several seconds of sleeping in presence of streaming writes. Top 10 times (out of 90) in us: Before After 2131586 697473 1709932 557487 1564598 535642 1480462 347573 1478579 323153 1408496 222181 1388960 181273 1329565 181070 1252486 172832 1223265 172278 Average: 619377 82180 So the improvement in both maximum and average latency is massive. I've measured fsync throughput by: fs_mark -n 100 -t 1 -s 16384 -d /mnt/fsync/ -S 1 -L 4 in presence of streaming reader. The numbers (fsyncs/s) are: Before After 9.9 6.3 6.8 6.0 6.3 6.2 5.8 6.1 So fsync performance seems unharmed by this change. Signed-off-by: Jan Kara <jack@suse.cz>	2012-04-11 11:12:44 +02:00
Linus Torvalds	c7c66c0cb0	Merge tag 'pm-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates for 3.4 from Rafael Wysocki: "Assorted extensions and fixes including: * Introduction of early/late suspend/hibernation device callbacks. * Generic PM domains extensions and fixes. * devfreq updates from Axel Lin and MyungJoo Ham. * Device PM QoS updates. * Fixes of concurrency problems with wakeup sources. * System suspend and hibernation fixes." * tag 'pm-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (43 commits) PM / Domains: Check domain status during hibernation restore of devices PM / devfreq: add relation of recommended frequency. PM / shmobile: Make MTU2 driver use pm_genpd_dev_always_on() PM / shmobile: Make CMT driver use pm_genpd_dev_always_on() PM / shmobile: Make TMU driver use pm_genpd_dev_always_on() PM / Domains: Introduce "always on" device flag PM / Domains: Fix hibernation restore of devices, v2 PM / Domains: Fix handling of wakeup devices during system resume sh_mmcif / PM: Use PM QoS latency constraint tmio_mmc / PM: Use PM QoS latency constraint PM / QoS: Make it possible to expose PM QoS latency constraints PM / Sleep: JBD and JBD2 missing set_freezable() PM / Domains: Fix include for PM_GENERIC_DOMAINS=n case PM / Freezer: Remove references to TIF_FREEZE in comments PM / Sleep: Add more wakeup source initialization routines PM / Hibernate: Enable usermodehelpers in hibernate() error path PM / Sleep: Make __pm_stay_awake() delete wakeup source timers PM / Sleep: Fix race conditions related to wakeup source timer function PM / Sleep: Fix possible infinite loop during wakeup source destruction PM / Hibernate: print physical addresses consistently with other parts of kernel ...	2012-03-21 10:15:51 -07:00
Cong Wang	8fb53c46d9	jbd: remove the second argument of k[un]map_atomic() Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Cong Wang <amwang@redhat.com>	2012-03-20 21:48:23 +08:00
Nigel Cunningham	35c80422af	PM / Sleep: JBD and JBD2 missing set_freezable() With the latest and greatest changes to the freezer, I started seeing panics that were caused by jbd2 running post-process freezing and hitting the canary BUG_ON for non-TuxOnIce I/O submission. I've traced this back to a lack of set_freezable calls in both jbd and jbd2. Since they're clearly meant to be frozen (there are tests for freezing()), I submit the following patch to add the missing calls. Signed-off-by: Nigel Cunningham <nigel@tuxonice.net> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2012-03-13 22:36:44 +01:00
Linus Torvalds	ac69e09280	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext2/3/4: delete unneeded includes of module.h ext{3,4}: Fix potential race when setversion ioctl updates inode udf: Mark LVID buffer as uptodate before marking it dirty ext3: Don't warn from writepage when readonly inode is spotted after error jbd: Remove j_barrier mutex reiserfs: Force inode evictions before umount to avoid crash reiserfs: Fix quota mount option parsing udf: Treat symlink component of type 2 as / udf: Fix deadlock when converting file from in-ICB one to normal one udf: Cleanup calling convention of inode_getblk() ext2: Fix error handling on inode bitmap corruption ext3: Fix error handling on inode bitmap corruption ext3: replace ll_rw_block with other functions ext3: NULL dereference in ext3_evict_inode() jbd: clear revoked flag on buffers before a new transaction started ext3: call ext3_mark_recovery_complete() when recovery is really needed	2012-01-09 12:51:21 -08:00
Jan Kara	0048278552	jbd: Remove j_barrier mutex j_barrier mutex is used for serializing different journal lock operations. The problem with it is that e.g. FIFREEZE ioctl results in process leaving kernel with j_barrier mutex held which makes lockdep freak out. Also hibernation code wants to freeze filesystem but it cannot do so because it then cannot hibernate the system because of mutex being locked. So we remove j_barrier mutex and use direct wait on j_barrier_count instead. Since locking journal is a rare operation we don't have to care about fairness or such things. CC: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Jan Kara <jack@suse.cz>	2012-01-09 13:52:09 +01:00
Tejun Heo	a0acae0e88	freezer: unexport refrigerator() and update try_to_freeze() slightly There is no reason to export two functions for entering the refrigerator. Calling refrigerator() instead of try_to_freeze() doesn't save anything noticeable or removes any race condition. * Rename refrigerator() to __refrigerator() and make it return bool indicating whether it scheduled out for freezing. * Update try_to_freeze() to return bool and relay the return value of __refrigerator() if freezing(). * Convert all refrigerator() users to try_to_freeze(). * Update documentation accordingly. * While at it, add might_sleep() to try_to_freeze(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Chris Mason <chris.mason@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Christoph Hellwig <hch@infradead.org>	2011-11-21 12:32:22 -08:00
Eryu Guan	8762202dd0	jbd/jbd2: validate sb->s_first in journal_get_superblock() I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3 image has s_first = 0 in journal superblock, and the 0 is passed to journal->j_head in journal_reset(), then to blocknr in cleanup_journal_tail(), in the end the J_ASSERT failed. So validate s_first after reading journal superblock from disk in journal_get_superblock() to ensure s_first is valid. The following script could reproduce it: fstype=ext3 blocksize=1024 img=$fstype.img offset=0 found=0 magic="c0 3b 39 98" dd if=/dev/zero of=$img bs=1M count=8 mkfs -t $fstype -b $blocksize -F $img filesize=`stat -c %s $img` while [ $offset -lt $filesize ] do if od -j $offset -N 4 -t x1 $img \| grep -i "$magic";then echo "Found journal: $offset" found=1 break fi offset=`echo "$offset+$blocksize" \| bc` done if [ $found -ne 1 ];then echo "Magic \"$magic\" not found" exit 1 fi dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1 mkdir -p ./mnt mount -o loop $img ./mnt Cc: Jan Kara <jack@suse.cz> Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-11-01 19:04:59 -04:00
Jan Kara	bb189247f3	jbd: Fix oops in journal_remove_journal_head() journal_remove_journal_head() can oops when trying to access journal_head returned by bh2jh(). This is caused for example by the following race: TASK1 TASK2 journal_commit_transaction() ... processing t_forget list __journal_refile_buffer(jh); if (!jh->b_transaction) { jbd_unlock_bh_state(bh); journal_try_to_free_buffers() journal_grab_journal_head(bh) jbd_lock_bh_state(bh) __journal_try_to_free_buffer() journal_put_journal_head(jh) journal_remove_journal_head(bh); journal_put_journal_head() in TASK2 sees that b_jcount == 0 and buffer is not part of any transaction and thus frees journal_head before TASK1 gets to doing so. Note that even buffer_head can be released by try_to_free_buffers() after journal_put_journal_head() which adds even larger opportunity for oops (but I didn't see this happen in reality). Fix the problem by making transactions hold their own journal_head reference (in b_jcount). That way we don't have to remove journal_head explicitely via journal_remove_journal_head() and instead just remove journal_head when b_jcount drops to zero. The result of this is that [__]journal_refile_buffer(), [__]journal_unfile_buffer(), and __journal_remove_checkpoint() can free journal_head which needs modification of a few callers. Also we have to be careful because once journal_head is removed, buffer_head might be freed as well. So we have to get our own buffer_head reference where it matters. Signed-off-by: Jan Kara <jack@suse.cz>	2011-06-27 11:44:37 +02:00
Lukas Czerner	99cb1a318c	jbd: Add fixed tracepoints This commit adds fixed tracepoint for jbd. It has been based on fixed tracepoints for jbd2, however there are missing those for collecting statistics, since I think that it will require more intrusive patch so I should have its own commit, if someone decide that it is needed. Also there are new tracepoints in __journal_drop_transaction() and journal_update_superblock(). The list of jbd tracepoints: jbd_checkpoint jbd_start_commit jbd_commit_locking jbd_commit_flushing jbd_commit_logging jbd_drop_transaction jbd_end_commit jbd_do_submit_data jbd_cleanup_journal_tail jbd_update_superblock_end Signed-off-by: Lukas Czerner <lczerner@redhat.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz>	2011-06-25 17:29:51 +02:00
Ted Ts'o	d9b01934d5	jbd: fix fsync() tid wraparound bug If an application program does not make any changes to the indirect blocks or extent tree, i_datasync_tid will not get updated. If there are enough commits (i.e., 2*31) such that tid_geq()'s calculations wrap, and there isn't a currently active transaction at the time of the fdatasync() call, this can end up triggering a BUG_ON in fs/jbd/commit.c: J_ASSERT(journal->j_running_transaction != NULL); It's pretty rare that this can happen, since it requires the use of fdatasync() plus very* frequent and excessive use of fsync(). But with the right workload, it can. We fix this by replacing the use of tid_geq() with an equality test, since there's only one valid transaction id that is valid for us to start: namely, the currently running transaction (if it exists). CC: stable@kernel.org Reported-by: Martin_Zielinski@McAfee.com Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz>	2011-05-17 13:47:41 +02:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Justin P. Mattock	3c26bdb423	jbd: Remove one to many n's in a word. The Patch below removes one to many "n's" in a word.. Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: linux-ext4@vger.kernel.org Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz>	2011-02-28 21:55:58 +01:00
Andrea Gelmini	bcf3d0bcff	jbd/2: fixed typos "wakup" Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Jan Kara <jack@suse.cz>	2010-10-28 01:30:05 +02:00
Namhyung Kim	2a0e33889b	jbd: Check return value of __getblk() Fail journal creation if __getblk() returns NULL. unlikely() is added because it is called in a loop and we've been OK without the check until now. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2010-10-28 01:30:04 +02:00
Darrick J. Wong	dff6825e9f	ext3/jbd: Avoid WARN() messages when failing to write the superblock This fixes a WARN backtrace in mark_buffer_dirty() that occurs during unmount when the underlying block device is removed. This bug has been seen on System Z when removing all paths from a multipath-backed ext3 mount; on System P when injecting enough PCI EEH errors to make the SCSI controller go offline; and similar warnings have been seen (and patched) with ext2/ext4. The super block update from a previous operation has marked the buffer as in error, and the flag has to be cleared before doing the update. Similar changes have been made to ext4 by commit `914258bf2c`. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Jan Kara <jack@suse.cz>	2010-10-28 01:30:02 +02:00
Namhyung Kim	f81e3d4564	jbd: Use printk_ratelimited() in journal_alloc_journal_head() Use printk_ratelimited() instead of doing it manually. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2010-10-28 01:30:01 +02:00

1 2 3

74 Commits