linux

mirror of https://github.com/Dasharo/linux.git synced 2026-03-06 15:25:10 -08:00

Author	SHA1	Message	Date
Linus Torvalds	6d61a53dd6	Merge tag 'f2fs-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this series, there are several major improvements such as folio conversion by Matthew, speed-up of block truncation, and caching more dentry pages. In addition, we implemented a linear dentry search to address recent unicode regression, and figured out some false alarms that we could get rid of. Enhancements: - foilio conversion in various IO paths - optimize f2fs_truncate_data_blocks_range() - cache more dentry pages - remove unnecessary blk_finish_plug - procfs: show mtime in segment_bits Bug fixes: - introduce linear search for dentries - don't call block truncation for aliased file - fix using wrong 'submitted' value in f2fs_write_cache_pages - fix to do sanity check correctly on i_inline_xattr_size - avoid trying to get invalid block address - fix inconsistent dirty state of atomic file" * tag 'f2fs-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits) f2fs: fix inconsistent dirty state of atomic file f2fs: fix to avoid changing 'check only' behaior of recovery f2fs: Clean up the loop outside of f2fs_invalidate_blocks() f2fs: procfs: show mtime in segment_bits f2fs: fix to avoid return invalid mtime from f2fs_get_section_mtime() f2fs: Fix format specifier in sanity_check_inode() f2fs: avoid trying to get invalid block address f2fs: fix to do sanity check correctly on i_inline_xattr_size f2fs: remove blk_finish_plug f2fs: Optimize f2fs_truncate_data_blocks_range() f2fs: fix using wrong 'submitted' value in f2fs_write_cache_pages f2fs: add parameter @len to f2fs_invalidate_blocks() f2fs: update_sit_entry_for_release() supports consecutive blocks. f2fs: introduce update_sit_entry_for_release/alloc() f2fs: don't call block truncation for aliased file f2fs: Introduce linear search for dentries f2fs: add parameter @len to f2fs_invalidate_internal_cache() f2fs: expand f2fs_invalidate_compress_page() to f2fs_invalidate_compress_pages_range() f2fs: ensure that node info flags are always initialized f2fs: The GC triggered by ioctl also needs to mark the segno as victim ...	2025-01-27 20:58:58 -08:00
Linus Torvalds	9c5968db9e	Merge tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "The various patchsets are summarized below. Plus of course many indivudual patches which are described in their changelogs. - "Allocate and free frozen pages" from Matthew Wilcox reorganizes the page allocator so we end up with the ability to allocate and free zero-refcount pages. So that callers (ie, slab) can avoid a refcount inc & dec - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to use large folios other than PMD-sized ones - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance and fixes for this small built-in kernel selftest - "mas_anode_descend() related cleanup" from Wei Yang tidies up part of the mapletree code - "mm: fix format issues and param types" from Keren Sun implements a few minor code cleanups - "simplify split calculation" from Wei Yang provides a few fixes and a test for the mapletree code - "mm/vma: make more mmap logic userland testable" from Lorenzo Stoakes continues the work of moving vma-related code into the (relatively) new mm/vma.c - "mm/page_alloc: gfp flags cleanups for alloc_contig_()" from David Hildenbrand cleans up and rationalizes handling of gfp flags in the page allocator - "readahead: Reintroduce fix for improper RA window sizing" from Jan Kara is a second attempt at fixing a readahead window sizing issue. It should reduce the amount of unnecessary reading - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng addresses an issue where "huge" amounts of pte pagetables are accumulated: https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/ Qi's series addresses this windup by synchronously freeing PTE memory within the context of madvise(MADV_DONTNEED) - "selftest/mm: Remove warnings found by adding compiler flags" from Muhammad Usama Anjum fixes some build warnings in the selftests code when optional compiler warnings are enabled - "mm: don't use __GFP_HARDWALL when migrating remote pages" from David Hildenbrand tightens the allocator's observance of __GFP_HARDWALL - "pkeys kselftests improvements" from Kevin Brodsky implements various fixes and cleanups in the MM selftests code, mainly pertaining to the pkeys tests - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to estimate application working set size - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn provides some cleanups to memcg's hugetlb charging logic - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song removes the global swap cgroup lock. A speedup of 10% for a tmpfs-based kernel build was demonstrated - "zram: split page type read/write handling" from Sergey Senozhatsky has several fixes and cleaups for zram in the area of zram_write_page(). A watchdog softlockup warning was eliminated - "move pagetable__dtor() to __tlb_remove_table()" from Kevin Brodsky cleans up the pagetable destructor implementations. A rare use-after-free race is fixed - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes simplifies and cleans up the debugging code in the VMA merging logic - "Account page tables at all levels" from Kevin Brodsky cleans up and regularizes the pagetable ctor/dtor handling. This results in improvements in accounting accuracy - "mm/damon: replace most damon_callback usages in sysfs with new core functions" from SeongJae Park cleans up and generalizes DAMON's sysfs file interface logic - "mm/damon: enable page level properties based monitoring" from SeongJae Park increases the amount of information which is presented in response to DAMOS actions - "mm/damon: remove DAMON debugfs interface" from SeongJae Park removes DAMON's long-deprecated debugfs interfaces. Thus the migration to sysfs is completed - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from Peter Xu cleans up and generalizes the hugetlb reservation accounting - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino removes a never-used feature of the alloc_pages_bulk() interface - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park extends DAMOS filters to support not only exclusion (rejecting), but also inclusion (allowing) behavior - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi introduces a new memory descriptor for zswap.zpool that currently overlaps with struct page for now. This is part of the effort to reduce the size of struct page and to enable dynamic allocation of memory descriptors - "mm, swap: rework of swap allocator locks" from Kairui Song redoes and simplifies the swap allocator locking. A speedup of 400% was demonstrated for one workload. As was a 35% reduction for kernel build time with swap-on-zram - "mm: update mips to use do_mmap(), make mmap_region() internal" from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that mmap_region() can be made MM-internal - "mm/mglru: performance optimizations" from Yu Zhao fixes a few MGLRU regressions and otherwise improves MGLRU performance - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae Park updates DAMON documentation - "Cleanup for memfd_create()" from Isaac Manjarres does that thing - "mm: hugetlb+THP folio and migration cleanups" from David Hildenbrand provides various cleanups in the areas of hugetlb folios, THP folios and migration - "Uncached buffered IO" from Jens Axboe implements the new RWF_DONTCACHE flag which provides synchronous dropbehind for pagecache reading and writing. To permite userspace to address issues with massive buildup of useless pagecache when reading/writing fast devices - "selftests/mm: virtual_address_range: Reduce memory" from Thomas Weißschuh fixes and optimizes some of the MM selftests" * tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits) mm/compaction: fix UBSAN shift-out-of-bounds warning s390/mm: add missing ctor/dtor on page table upgrade kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags() tools: add VM_WARN_ON_VMG definition mm/damon/core: use str_high_low() helper in damos_wmark_wait_us() seqlock: add missing parameter documentation for raw_seqcount_try_begin() mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh mm/page_alloc: remove the incorrect and misleading comment zram: remove zcomp_stream_put() from write_incompressible_page() mm: separate move/undo parts from migrate_pages_batch() mm/kfence: use str_write_read() helper in get_access_type() selftests/mm/mkdirty: fix memory leak in test_uffdio_copy() kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags() selftests/mm: virtual_address_range: avoid reading from VM_IO mappings selftests/mm: vm_util: split up /proc/self/smaps parsing selftests/mm: virtual_address_range: unmap chunks after validation selftests/mm: virtual_address_range: mmap() without PROT_WRITE selftests/memfd/memfd_test: fix possible NULL pointer dereference mm: add FGP_DONTCACHE folio creation flag mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue ...	2025-01-26 18:36:23 -08:00
Kairui Song	27701521be	mm, swap: clean up device availability check Remove highest_bit and lowest_bit. After the HDD allocation path has been removed, the only purpose of these two fields is to determine whether the device is full or not, which can instead be determined by checking the inuse_pages. Link: https://lkml.kernel.org/r/20250113175732.48099-6-ryncsn@gmail.com Signed-off-by: Kairui Song <kasong@tencent.com> Reviewed-by: Baoquan He <bhe@redhat.com> Cc: Barry Song <v-songbaohua@oppo.com> Cc: Chis Li <chrisl@kernel.org> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Hugh Dickens <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-01-25 20:22:36 -08:00
Jianan Huang	03511e9369	f2fs: fix inconsistent dirty state of atomic file When testing the atomic write fix patches, the f2fs_bug_on was triggered as below: ------------[ cut here ]------------ kernel BUG at fs/f2fs/inode.c:935! Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 3 UID: 0 PID: 257 Comm: bash Not tainted 6.13.0-rc1-00033-gc283a70d3497 #5 RIP: 0010:f2fs_evict_inode+0x50f/0x520 Call Trace: <TASK> ? __die_body+0x65/0xb0 ? die+0x9f/0xc0 ? do_trap+0xa1/0x170 ? f2fs_evict_inode+0x50f/0x520 ? f2fs_evict_inode+0x50f/0x520 ? handle_invalid_op+0x65/0x80 ? f2fs_evict_inode+0x50f/0x520 ? exc_invalid_op+0x39/0x50 ? asm_exc_invalid_op+0x1a/0x20 ? __pfx_f2fs_get_dquots+0x10/0x10 ? f2fs_evict_inode+0x50f/0x520 ? f2fs_evict_inode+0x2e5/0x520 evict+0x186/0x2f0 prune_icache_sb+0x75/0xb0 super_cache_scan+0x1a8/0x200 do_shrink_slab+0x163/0x320 shrink_slab+0x2fc/0x470 drop_slab+0x82/0xf0 drop_caches_sysctl_handler+0x4e/0xb0 proc_sys_call_handler+0x183/0x280 vfs_write+0x36d/0x450 ksys_write+0x68/0xd0 do_syscall_64+0xc8/0x1a0 ? arch_exit_to_user_mode_prepare+0x11/0x60 ? irqentry_exit_to_user_mode+0x7e/0xa0 The root cause is: f2fs uses FI_ATOMIC_DIRTIED to indicate dirty atomic files during commit. If the inode is dirtied during commit, such as by f2fs_i_pino_write, the vfs inode keeps clean and the f2fs inode is set to FI_DIRTY_INODE. The FI_DIRTY_INODE flag cann't be cleared by write_inode later due to the clean vfs inode. Finally, f2fs_bug_on is triggered due to this inconsistent state when evict. To reproduce this situation: - fd = open("/mnt/test.db", O_WRONLY) - ioctl(fd, F2FS_IOC_START_ATOMIC_WRITE) - mv /mnt/test.db /mnt/test1.db - ioctl(fd, F2FS_IOC_COMMIT_ATOMIC_WRITE) - echo 3 > /proc/sys/vm/drop_caches To fix this problem, clear FI_DIRTY_INODE after commit, then f2fs_mark_inode_dirty_sync will ensure a consistent dirty state. Fixes: `fccaa81de8` ("f2fs: prevent atomic file from being dirtied before commit") Signed-off-by: Yunlei He <heyunlei@xiaomi.com> Signed-off-by: Jianan Huang <huangjianan@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-25 01:12:12 +00:00
Zhiguo Niu	edf3c08600	f2fs: fix to avoid changing 'check only' behaior of recovery The following two 'check only recovery' processes are very dependent on the return value of f2fs_recover_fsync_data, especially when the return value is greater than 0. 1. when device has readonly mode, shown as commit `23738e7447` ("f2fs: fix to restrict mount condition on readonly block device") 2. mount optiont NORECOVERY or DISABLE_ROLL_FORWARD is set, shown as commit `6781eabba1` ("f2fs: give -EINVAL for norecovery and rw mount") However, commit `c426d99127` ("f2fs: Check write pointer consistency of open zones") will change the return value unexpectedly, thereby changing the caller's behavior This patch let the f2fs_recover_fsync_data return correct value,and not do f2fs_check_and_fix_write_pointer when the device is read-only. Fixes: `c426d99127` ("f2fs: Check write pointer consistency of open zones") Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-22 21:04:56 +00:00
Yi Sun	6d4008dc4a	f2fs: Clean up the loop outside of f2fs_invalidate_blocks() Now f2fs_invalidate_blocks() supports a continuous range of addresses, so the for loop can be omitted. Signed-off-by: Yi Sun <yi.sun@unisoc.com> Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-22 21:04:56 +00:00
Chao Yu	f6370a360d	f2fs: procfs: show mtime in segment_bits Show mtime in segment_bits for debug. cat /proc/fs//f2fs/loop0/segment_bits format: segment_type\|valid_blocks\|bitmaps\|mtime segment_type(0:HD, 1:WD, 2:CD, 3:HN, 4:WN, 5:CN) 0 3\|1 \| 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\| ffffffffffffffff 1 4\|3 \| 00 d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\| ffffffffffffffff 2 5\|0 \| 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\| ffffffffffffffff 3 0\|1 \| 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\| ffffffffffffffff Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-22 21:04:56 +00:00
Chao Yu	207764e5d6	f2fs: fix to avoid return invalid mtime from f2fs_get_section_mtime() syzbot reported a f2fs bug as below: ------------[ cut here ]------------ kernel BUG at fs/f2fs/gc.c:373! CPU: 0 UID: 0 PID: 5316 Comm: syz.0.0 Not tainted 6.13.0-rc3-syzkaller-00044-gaef25be35d23 #0 RIP: 0010:get_cb_cost fs/f2fs/gc.c:373 [inline] RIP: 0010:get_gc_cost fs/f2fs/gc.c:406 [inline] RIP: 0010:f2fs_get_victim+0x68b1/0x6aa0 fs/f2fs/gc.c:912 Call Trace: <TASK> __get_victim fs/f2fs/gc.c:1707 [inline] f2fs_gc+0xc89/0x2f60 fs/f2fs/gc.c:1915 f2fs_ioc_gc fs/f2fs/file.c:2624 [inline] __f2fs_ioctl+0x4cc9/0xb8b0 fs/f2fs/file.c:4482 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:906 [inline] __se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f w/ below testcase, it can reproduce directly: - dd if=/dev/zero of=/tmp/file bs=1M count=64 - mkfs.f2fs /tmp/file - mount -t f2fs -o loop,mode=fragment:block /tmp/file /mnt/f2fs - echo 0 > /sys/fs/f2fs/loop0/min_ssr_sections - dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=5 - umount /mnt/f2fs - for((i=4096;i<16384;i+=512)) do inject.f2fs --sit 0 --blk $i --mb mtime --val -1 /tmp/file; done - mount -o loop /tmp/file /mnt/f2fs - f2fs_io gc 0 /mnt/f2fs/file static unsigned int get_cb_cost() { ... mtime = f2fs_get_section_mtime(sbi, segno); f2fs_bug_on(sbi, mtime == INVALID_MTIME); ... } The root cause is: mtime in f2fs_sit_entry can be fuzzed to INVALID_MTIME, then it will trigger BUG_ON in get_cb_cost() during GC. Let's change behavior of f2fs_get_section_mtime() as below for fix: - return INVALID_MTIME only if total valid blocks is zero. - return INVALID_MTIME - 1 if average mtime calculated is INVALID_MTIME. Fixes: `b19ee72722` ("f2fs: introduce f2fs_get_section_mtime") Reported-by: syzbot+b9972806adbe20a910eb@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/6768c82e.050a0220.226966.0035.GAE@google.com Cc: liuderong <liuderong@oppo.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-22 21:04:56 +00:00
Nathan Chancellor	a68905d48a	f2fs: Fix format specifier in sanity_check_inode() When building for 32-bit platforms, for which 'size_t' is 'unsigned int', there is a warning due to an incorrect format specifier: fs/f2fs/inode.c:320:6: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat] 318 \| f2fs_warn(sbi, "%s: inode (ino=%lx) has corrupted i_inline_xattr_size: %d, min: %lu, max: %lu", \| ~~~ \| %u 319 \| __func__, inode->i_ino, fi->i_inline_xattr_size, 320 \| MIN_INLINE_XATTR_SIZE, MAX_INLINE_XATTR_SIZE); \| ^~~~~~~~~~~~~~~~~~~~~ fs/f2fs/f2fs.h:1855:46: note: expanded from macro 'f2fs_warn' 1855 \| f2fs_printk(sbi, false, KERN_WARNING fmt, ##__VA_ARGS__) \| ~~~ ^~~~~~~~~~~ fs/f2fs/xattr.h:86:31: note: expanded from macro 'MIN_INLINE_XATTR_SIZE' 86 \| #define MIN_INLINE_XATTR_SIZE (sizeof(struct f2fs_xattr_header) / sizeof(__le32)) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use the format specifier for 'size_t', '%zu', to resolve the warning. Fixes: `5c1768b672` ("f2fs: fix to do sanity check correctly on i_inline_xattr_size") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-22 21:04:49 +00:00
Jaegeuk Kim	e02938613e	f2fs: avoid trying to get invalid block address In f2fs_new_inode(), if we fail to get a new inode, we go iput(), followed by f2fs_evict_inode(). If the inode is not marked as bad, it'll try to call f2fs_remove_inode_page() which tries to read the inode block given node id. But, there's no block address allocated yet, which gives a chance to access a wrong block address, if the block device has some garbage data in NAT table. We need to make sure NAT table should have zero data for all the unallocated node ids, but also would be better to take this unnecessary path as well. Let's mark the faild inode as bad. Fixes: `0abd675e97` ("f2fs: support plain user/group quota") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-21 16:28:16 +00:00
Chao Yu	5c1768b672	f2fs: fix to do sanity check correctly on i_inline_xattr_size syzbot reported an out-of-range access issue as below: UBSAN: array-index-out-of-bounds in fs/f2fs/f2fs.h:3292:19 index 18446744073709550491 is out of range for type '__le32[923]' (aka 'unsigned int[923]') CPU: 0 UID: 0 PID: 5338 Comm: syz.0.0 Not tainted 6.12.0-syzkaller-10689-g7af08b57bcb9 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:231 [inline] __ubsan_handle_out_of_bounds+0x121/0x150 lib/ubsan.c:429 read_inline_xattr+0x273/0x280 lookup_all_xattrs fs/f2fs/xattr.c:341 [inline] f2fs_getxattr+0x57b/0x13b0 fs/f2fs/xattr.c:533 vfs_getxattr_alloc+0x472/0x5c0 fs/xattr.c:393 ima_read_xattr+0x38/0x60 security/integrity/ima/ima_appraise.c:229 process_measurement+0x117a/0x1fb0 security/integrity/ima/ima_main.c:353 ima_file_check+0xd9/0x120 security/integrity/ima/ima_main.c:572 security_file_post_open+0xb9/0x280 security/security.c:3121 do_open fs/namei.c:3830 [inline] path_openat+0x2ccd/0x3590 fs/namei.c:3987 do_file_open_root+0x3a7/0x720 fs/namei.c:4039 file_open_root+0x247/0x2a0 fs/open.c:1382 do_handle_open+0x85b/0x9d0 fs/fhandle.c:414 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f index: 18446744073709550491 (decimal, unsigned long long) = 0xfffffffffffffb9b (hexadecimal) = -1125 (decimal, long long) UBSAN detects that inline_xattr_addr() tries to access .i_addr[-1125]. w/ below testcase, it can reproduce this bug easily: - mkfs.f2fs -f -O extra_attr,flexible_inline_xattr /dev/sdb - mount -o inline_xattr_size=512 /dev/sdb /mnt/f2fs - touch /mnt/f2fs/file - umount /mnt/f2fs - inject.f2fs --node --mb i_inline --nid 4 --val 0x1 /dev/sdb - inject.f2fs --node --mb i_inline_xattr_size --nid 4 --val 2048 /dev/sdb - mount /dev/sdb /mnt/f2fs - getfattr /mnt/f2fs/file The root cause is if metadata of filesystem and inode were fuzzed as below: - extra_attr feature is enabled - flexible_inline_xattr feature is enabled - ri.i_inline_xattr_size = 2048 - F2FS_EXTRA_ATTR bit in ri.i_inline was not set sanity_check_inode() will skip doing sanity check on fi->i_inline_xattr_size, result in using invalid inline_xattr_size later incorrectly, fix it. Meanwhile, let's fix to check lower boundary for .i_inline_xattr_size w/ MIN_INLINE_XATTR_SIZE like we did in parse_options(). There is a related issue reported by syzbot, Qasim Ijaz has anlyzed and fixed it w/ very similar way [1], as discussed, we all agree that it will be better to do sanity check in sanity_check_inode() for fix, so finally, let's fix these two related bugs w/ current patch. Including commit message from Qasim's patch as below, thanks a lot for his contribution. "In f2fs_getxattr(), the function lookup_all_xattrs() allocates a 12-byte (base_size) buffer for an inline extended attribute. However, when __find_inline_xattr() calls __find_xattr(), it uses the macro "list_for_each_xattr(entry, addr)", which starts by calling XATTR_FIRST_ENTRY(addr). This skips a 24-byte struct f2fs_xattr_header at the beginning of the buffer, causing an immediate out-of-bounds read in a 12-byte allocation. The subsequent !IS_XATTR_LAST_ENTRY(entry) check then dereferences memory outside the allocated region, triggering the slab-out-of bounds read. This patch prevents the out-of-bounds read by adding a check to bail out early if inline_size is too small and does not account for the header plus the 4-byte value that IS_XATTR_LAST_ENTRY reads." [1]: https://lore.kernel.org/linux-f2fs-devel/Z32y1rfBY9Qb5ZjM@qasdev.system/ Fixes: `6afc662e68` ("f2fs: support flexible inline xattr size") Reported-by: syzbot+69f5379a1717a0b982a1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/674f4e7d.050a0220.17bd51.004f.GAE@google.com Reported-by: syzbot <syzbot+f5e74075e096e757bdbf@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=f5e74075e096e757bdbf Tested-by: syzbot <syzbot+f5e74075e096e757bdbf@syzkaller.appspotmail.com> Tested-by: Qasim Ijaz <qasdev00@gmail.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-16 17:27:51 +00:00
Jaegeuk Kim	4811fee828	f2fs: remove blk_finish_plug Let's remove unclear blk_finish_plug. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-16 17:27:51 +00:00
Yi Sun	120ac1dc32	f2fs: Optimize f2fs_truncate_data_blocks_range() Function f2fs_invalidate_blocks() can process consecutive blocks at a time, so f2fs_truncate_data_blocks_range() is optimized to use the new functionality of f2fs_invalidate_blocks(). Add two variables @blkstart and @blklen, @blkstart records the first address of the consecutive blocks, and @blkstart records the number of consecutive blocks. Signed-off-by: Yi Sun <yi.sun@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-16 17:27:29 +00:00
zangyangyang1	c84c242493	f2fs: fix using wrong 'submitted' value in f2fs_write_cache_pages When f2fs_write_single_data_page fails, f2fs_write_cache_pages will use the last 'submitted' value incorrectly, which will cause 'nwritten' and 'wbc->nr_to_write' calculation errors Signed-off-by: zangyangyang1 <zangyangyang1@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-13 18:49:38 +00:00
Yi Sun	e53c568f46	f2fs: add parameter @len to f2fs_invalidate_blocks() New function can process some consecutive blocks at a time. Function f2fs_invalidate_blocks()->down_write() and up_write() are very time-consuming, so if f2fs_invalidate_blocks() can process consecutive blocks at one time, it will save a lot of time. Signed-off-by: Yi Sun <yi.sun@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-13 18:49:32 +00:00
Yi Sun	81ffbd224e	f2fs: update_sit_entry_for_release() supports consecutive blocks. This function can process some consecutive blocks at a time. When using update_sit_entry() to release consecutive blocks, ensure that the consecutive blocks belong to the same segment. Because after update_sit_entry_for_realese(), @segno is still in use in update_sit_entry(). Signed-off-by: Yi Sun <yi.sun@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-08 18:31:46 +00:00
Yi Sun	66baee2b88	f2fs: introduce update_sit_entry_for_release/alloc() No logical changes, just for cleanliness. Signed-off-by: Yi Sun <yi.sun@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-08 18:31:45 +00:00
Jaegeuk Kim	cf5817ce66	f2fs: don't call block truncation for aliased file This patch should avoid the below warning which does not corrupt the metadata tho. [ 51.508120][ T253] F2FS-fs (dm-59): access invalid blkaddr:36 [ 51.508156][ T253] __f2fs_is_valid_blkaddr+0x330/0x384 [ 51.508162][ T253] f2fs_is_valid_blkaddr_raw+0x10/0x24 [ 51.508163][ T253] f2fs_truncate_data_blocks_range+0x1ec/0x438 [ 51.508177][ T253] f2fs_remove_inode_page+0x8c/0x148 [ 51.508194][ T253] f2fs_evict_inode+0x230/0x76c Fixes: `128d333f0d` ("f2fs: introduce device aliasing file") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-08 18:31:45 +00:00
Daniel Lee	91b587ba79	f2fs: Introduce linear search for dentries This patch addresses an issue where some files in case-insensitive directories become inaccessible due to changes in how the kernel function, utf8_casefold(), generates case-folded strings from the commit `5c26d2f1d3` ("unicode: Don't special case ignorable code points"). F2FS uses these case-folded names to calculate hash values for locating dentries and stores them on disk. Since utf8_casefold() can produce different output across kernel versions, stored hash values and newly calculated hash values may differ. This results in affected files no longer being found via the hash-based lookup. To resolve this, the patch introduces a linear search fallback. If the initial hash-based search fails, F2FS will sequentially scan the directory entries. Fixes: `5c26d2f1d3` ("unicode: Don't special case ignorable code points") Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586 Signed-off-by: Daniel Lee <chullee@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-08 18:31:45 +00:00
Yi Sun	d217b5cea4	f2fs: add parameter @len to f2fs_invalidate_internal_cache() New function can process some consecutive blocks at a time. Signed-off-by: Yi Sun <yi.sun@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-08 18:31:45 +00:00
Yi Sun	3d56fbb1f0	f2fs: expand f2fs_invalidate_compress_page() to f2fs_invalidate_compress_pages_range() New function f2fs_invalidate_compress_pages_range() adds the @len parameter. So it can process some consecutive blocks at a time. Signed-off-by: Yi Sun <yi.sun@unisoc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-01-08 18:31:45 +00:00
Dmitry Antipov	76f01376df	f2fs: ensure that node info flags are always initialized Syzbot has reported the following KMSAN splat: BUG: KMSAN: uninit-value in f2fs_new_node_page+0x1494/0x1630 f2fs_new_node_page+0x1494/0x1630 f2fs_new_inode_page+0xb9/0x100 f2fs_init_inode_metadata+0x176/0x1e90 f2fs_add_inline_entry+0x723/0xc90 f2fs_do_add_link+0x48f/0xa70 f2fs_symlink+0x6af/0xfc0 vfs_symlink+0x1f1/0x470 do_symlinkat+0x471/0xbc0 __x64_sys_symlink+0xcf/0x140 x64_sys_call+0x2fcc/0x3d90 do_syscall_64+0xd9/0x1b0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Local variable new_ni created at: f2fs_new_node_page+0x9d/0x1630 f2fs_new_inode_page+0xb9/0x100 So adjust 'f2fs_get_node_info()' to ensure that 'flag' field of 'struct node_info' is always initialized. Reported-by: syzbot+5141f6db57a2f7614352@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=5141f6db57a2f7614352 Fixes: `e05df3b115` ("f2fs: add node operations") Suggested-by: Chao Yu <chao@kernel.org> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2024-12-16 16:12:54 +00:00
Yongpeng Yang	e9a844f6e4	f2fs: The GC triggered by ioctl also needs to mark the segno as victim In SSR mode, the segment selected for allocation might be the same as the target segment of the GC triggered by ioctl, resulting in the GC moving the CURSEG_I(sbi, type)->segno. Thread A Thread B or Thread A - f2fs_ioc_gc_range - __f2fs_ioc_gc_range(.victim_segno=segno#N) - f2fs_gc - __get_victim - f2fs_get_victim : segno#N is valid, return segno#N as source segment of GC - f2fs_allocate_data_block - need_new_seg - get_ssr_segment - f2fs_get_victim : get segno #N as destination segment - change_curseg Fixes: `e066b83c9b` ("f2fs: add ioctl to flush data from faster device to cold area") Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2024-12-16 16:12:29 +00:00
zangyangyang1	5f65945427	f2fs: cache more dentry pages While traversing dir entries in dentry page, it's better to refresh current accessed page in lru list by using FGP_ACCESSED flag, otherwise, such page may has less chance to survive during memory reclaim, result in causing additional IO when revisiting dentry page. Signed-off-by: zangyangyang1 <zangyangyang1@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2024-12-16 16:12:28 +00:00
Matthew Wilcox (Oracle)	c910a64bc4	f2fs: Remove calls to folio_file_mapping() All folios that f2fs sees belong to f2fs and not to the swapcache so it can dereference folio->mapping directly like all other filesystems do. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2024-12-16 16:12:26 +00:00

1 2 3 4 5 ...

4265 Commits