As reported, status debugfs entry shows inconsistent GC stats as below:
GC calls: 6008 (BG: 6161)
- data segments : 3053 (BG: 3053)
- node segments : 2955 (BG: 2955)
Total GC calls is larger than BGGC calls, the reason is:
- f2fs_stat_info.call_count accounts total migrated section count
by f2fs_gc()
- f2fs_stat_info.bg_gc accounts total call times of f2fs_gc() from
background gc_thread
Another issue is gc_foreground_calls sysfs entry shows total GC call
count rather than FGGC call count.
This patch changes as below for fix:
- account GC calls and migrated segment count separately
- support to account migrated section count if it enables large section
mode
- fix to show correct value in gc_foreground_calls sysfs entry
Fixes: fc7100ea2a ("f2fs: Add f2fs stats to sysfs")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
With -O quota mkfs option, xfstests generic/417 fails due to fsck detects
data corruption on quota inodes.
[ASSERT] (fsck_chk_quota_files:2051) --> Quota file is missing or invalid quota file content found.
The root cause is there is a hole f2fs doesn't hold quota inodes,
so all recovered quota data will be dropped due to SBI_POR_DOING
flag was set.
- f2fs_fill_super
- f2fs_recover_orphan_inodes
- f2fs_enable_quota_files
- f2fs_quota_off_umount
<--- quota inodes were dropped --->
- f2fs_recover_fsync_data
- f2fs_enable_quota_files
- f2fs_quota_off_umount
This patch tries to eliminate the hole by holding quota inodes
during entire recovery flow as below:
- f2fs_fill_super
- f2fs_recover_quota_begin
- f2fs_recover_orphan_inodes
- f2fs_recover_fsync_data
- f2fs_recover_quota_end
Then, recovered quota data can be persisted after SBI_POR_DOING
is cleared.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
For each loop add a local f2fs_sb_info pointer insted of looking it up.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Export ipu_policy as a string in debugfs for better readability and
it can help us better understand some strategies of the file system.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
To fix:
WARNING: function definition argument 'struct f2fs_attr *' should also have an identifier name
+ ssize_t (*show)(struct f2fs_attr *, struct f2fs_sb_info *, char *);
WARNING: return sysfs_emit(...) formats should include a terminating newline
+ return sysfs_emit(buf, "(none)");
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
+ unsigned npages = NODE_MAPPING(sbi)->nrpages;
WARNING: Missing a blank line after declarations
+ unsigned npages = COMPRESS_MAPPING(sbi)->nrpages;
+ si->page_mem += (unsigned long long)npages << PAGE_SHIFT;
WARNING: quoted string split across lines
+ seq_printf(s, "CP merge (Queued: %4d, Issued: %4d, Total: %4d, "
+ "Cur time: %4d(ms), Peak time: %4d(ms))\n",
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch introduces a runtime hot/cold data separation method
for f2fs, in order to improve the accuracy for data temperature
classification, reduce the garbage collection overhead after
long-term data updates.
Enhanced hot/cold data separation can record data block update
frequency as "age" of the extent per inode, and take use of the age
info to indicate better temperature type for data block allocation:
- It records total data blocks allocated since mount;
- When file extent has been updated, it calculate the count of data
blocks allocated since last update as the age of the extent;
- Before the data block allocated, it searches for the age info and
chooses the suitable segment for allocation.
Test and result:
- Prepare: create about 30000 files
* 3% for cold files (with cold file extension like .apk, from 3M to 10M)
* 50% for warm files (with random file extension like .FcDxq, from 1K
to 4M)
* 47% for hot files (with hot file extension like .db, from 1K to 256K)
- create(5%)/random update(90%)/delete(5%) the files
* total write amount is about 70G
* fsync will be called for .db files, and buffered write will be used
for other files
The storage of test device is large enough(128G) so that it will not
switch to SSR mode during the test.
Benefit: dirty segment count increment reduce about 14%
- before: Dirty +21110
- after: Dirty +18286
Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com>
Signed-off-by: xiongping1 <xiongping1@xiaomi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
inode_lock[ATOMIC_FILE] was used for protecting sbi->atomic_files,
update atomic_files variable's type to atomic_t instead of unsigned
int, then inode_lock[ATOMIC_FILE] can be obsoleted.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
ERROR: code indent should use tabs where possible
ERROR: spaces required around that ':'
ERROR: incorrect tab
Found serveral code type errors when review the code and fix it.
There is no function change.
Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch fixes counting unusable blocks set by zone capacity when
checking the valid block count in a section.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Current atomic write has three major issues like below.
- keeps the updates in non-reclaimable memory space and they are even
hard to be migrated, which is not good for contiguous memory
allocation.
- disk spaces used for atomic files cannot be garbage collected, so
this makes it difficult for the filesystem to be defragmented.
- If atomic write operations hit the threshold of either memory usage
or garbage collection failure count, All the atomic write operations
will fail immediately.
To resolve the issues, I will keep a COW inode internally for all the
updates to be flushed from memory, when we need to flush them out in a
situation like high memory pressure. These COW inodes will be tagged
as orphan inodes to be reclaimed in case of sudden power-cut or system
failure during atomic writes.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We need a mid level of gc urgent mode to do GC forcibly in a period
of given gc_urgent_sleep_time, but not like using greedy GC approach
and switching to SSR mode such as gc urgent high mode. This can be
used for more aggressive periodic storage clean up.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This adds a sysfs entry to call checkpoint during fsync() in order to avoid
long elapsed time to run roll-forward recovery when booting the device.
Default value doesn't enforce the limitation which is same as before.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
To fix:
WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>