linux

mirror of https://github.com/armbian/linux.git synced 2026-01-06 10:13:00 -08:00

Author	SHA1	Message	Date
Linus Torvalds	3f490f7f99	Merge tag 'for-f2fs-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch-set includes the following major enhancement patches: - remount_fs callback function - restore parent inode number to enhance the fsync performance - xattr security labels - reduce the number of redundant lock/unlock data pages - avoid frequent write_inode calls The other minor bug fixes are as follows. - endian conversion bugs - various bugs in the roll-forward recovery routine" * tag 'for-f2fs-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (56 commits) f2fs: fix to recover i_size from roll-forward f2fs: remove the unused argument "sbi" of func destroy_fsync_dnodes() f2fs: remove reusing any prefree segments f2fs: code cleanup and simplify in func {find/add}_gc_inode f2fs: optimize the init_dirty_segmap function f2fs: fix an endian conversion bug detected by sparse f2fs: fix crc endian conversion f2fs: add remount_fs callback support f2fs: recover wrong pino after checkpoint during fsync f2fs: optimize do_write_data_page() f2fs: make locate_dirty_segment() as static f2fs: remove unnecessary parameter "offset" from __add_sum_entry() f2fs: avoid freqeunt write_inode calls f2fs: optimise the truncate_data_blocks_range() range f2fs: use the F2FS specific flags in f2fs_ioctl() f2fs: sync dir->i_size with its block allocation f2fs: fix i_blocks translation on various types of files f2fs: set sb->s_fs_info before calling parse_options() f2fs: support xattr security labels f2fs: fix iget/iput of dir during recovery ...	2013-07-02 09:42:38 -07:00
Jaegeuk Kim	8ae8f1627f	f2fs: support xattr security labels This patch adds the support of security labels for f2fs, which will be used by Linus Security Models (LSMs). Quote from http://en.wikipedia.org/wiki/Linux_Security_Modules: "Linux Security Modules (LSM) is a framework that allows the Linux kernel to support a variety of computer security models while avoiding favoritism toward any single security implementation. The framework is licensed under the terms of the GNU General Public License and is standard part of the Linux kernel since Linux 2.6. AppArmor, SELinux, Smack and TOMOYO Linux are the currently accepted modules in the official kernel.". Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-06-11 16:01:03 +09:00
Jaegeuk Kim	f356fe0cba	f2fs: add debug msgs in the recovery routine This patch adds some trivial debugging messages in the recovery process. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:02 +09:00
Jaegeuk Kim	44a83ff6a8	f2fs: update inode page after creation I found a bug when testing power-off-recovery as follows. [Bug Scenario] 1. create a file 2. fsync the file 3. reboot w/o any sync 4. try to recover the file - found its fsync mark - found its dentry mark : try to recover its dentry - get its file name - get its parent inode number : here we got zero value The reason why we get the wrong parent inode number is that we didn't synchronize the inode page with its newly created inode information perfectly. Especially, previous f2fs stores fi->i_pino and writes it to the cached node page in a wrong order, which incurs the zero-valued i_pino during the recovery. So, this patch modifies the creation flow to fix the synchronization order of inode page with its inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:02 +09:00
Jaegeuk Kim	1646cfac95	f2fs: skip get_node_page if locked node page is passed If get_dnode_of_data gets a locked node page, let's skip redundant get_node_page calls. This is for the futher enhancement. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:01 +09:00
Jaegeuk Kim	65e5cd0a15	f2fs: fix inconsistency of block count during recovery Currently f2fs recovers the dentry of fsynced files. When power-off-recovery is conducted, this newly recovered inode should increase node block count as well as inode block count. This patch resolves this inconsistency that results in: 1. create a file 2. write data 3. fsync 4. reboot without sync 5. mount and recover the file 6. node block count is 1 and inode block count is 2 : fall into the inconsistent state 7. unlink the file : trigger the following BUG_ON ------------[ cut here ]------------ kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/f2fs.h:716! Call Trace: [<ffffffffa0344100>] ? get_node_page+0x50/0x1a0 [f2fs] [<ffffffffa0344bfc>] remove_inode_page+0x8c/0x100 [f2fs] [<ffffffffa03380f0>] ? f2fs_evict_inode+0x180/0x2d0 [f2fs] [<ffffffffa033812e>] f2fs_evict_inode+0x1be/0x2d0 [f2fs] [<ffffffff811c7a67>] evict+0xa7/0x1a0 [<ffffffff811c82b5>] iput+0x105/0x190 [<ffffffff811c2b30>] d_kill+0xe0/0x120 [<ffffffff811c2c57>] dput+0xe7/0x1e0 [<ffffffff811acc3d>] __fput+0x19d/0x2d0 [<ffffffff811acd7e>] ____fput+0xe/0x10 [<ffffffff81070645>] task_work_run+0xb5/0xe0 [<ffffffff81002941>] do_notify_resume+0x71/0xb0 [<ffffffff8175f14a>] int_signal+0x12/0x17 Reported-and-Tested-by: Chris Fries <C.Fries@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:00 +09:00
Lukas Czerner	d47992f86b	mm: change invalidatepage prototype to accept length Currently there is no way to truncate partial page where the end truncate point is not at the end of the page. This is because it was not needed and the functionality was enough for file system truncate operation to work properly. However more file systems now support punch hole feature and it can benefit from mm supporting truncating page just up to the certain point. Specifically, with this functionality truncate_inode_pages_range() can be changed so it supports truncating partial page at the end of the range (currently it will BUG_ON() if 'end' is not at the end of the page). This commit changes the invalidatepage() address space operation prototype to accept range to be invalidated and update all the instances for it. We also change the block_invalidatepage() in the same way and actually make a use of the new length argument implementing range invalidation. Actual file system implementations will follow except the file systems where the changes are really simple and should not change the behaviour in any way .Implementation for truncate_page_range() which will be able to accept page unaligned ranges will follow as well. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com>	2013-05-21 23:17:23 -04:00
Jaegeuk Kim	59bbd474ab	f2fs: cover free_nid management with spin_lock After build_free_nids() searches free nid candidates from nat pages and current journal blocks, it checks all the candidates if they are allocated so that the nat cache has its nid with an allocated block address. In this procedure, previously we used list_for_each_entry_safe(fnid, next_fnid, &nm_i->free_nid_list, list). But, this is not covered by free_nid_list_lock, resulting in null pointer bug. This patch moves this checking routine inside add_free_nid() in order not to use the spin_lock. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-08 19:54:22 +09:00
Haicheng Li	23d3884427	f2fs: optimize scan_nat_page() When nm_i->fcnt > 2 * MAX_FREE_NIDS, stop scanning other NAT entries. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> [Jaegeuk Kim: fix handling the return value of add_free_nid()] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-08 19:54:22 +09:00
Haicheng Li	8760952d92	f2fs: code cleanup for scan_nat_page() and build_free_nids() This patch does two cleanups: 1. remove unused variable "fcnt" in build_free_nids(). 2. make scan_nat_page() as void type and remove useless variable "fcnt". Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-08 19:54:21 +09:00
Haicheng Li	95630cbadc	f2fs: bugfix for alloc_nid_failed() Directly drop the free_nid cache when nm_i->fcnt > 2 * MAX_FREE_NIDS Since there is NOT nmi->free_nid_list_lock spinlock protection between a sequential calling of alloc_nid() and alloc_nid_failed(), some other threads may already add new free_nid to the free_nid_list during this period. We need to make sure nmi->fcnt is never > 2 * MAX_FREE_NIDS. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> [Jaegeuk Kim: fit the coding style] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-08 19:54:20 +09:00
Jaegeuk Kim	ac5d156c78	f2fs: modify the number of issued pages to merge IOs When testing f2fs on an SSD, I found some 128 page IOs followed by 1 page IO were issued by f2fs_write_node_pages. This means that there were some mishandling flows which degrades performance. Previous f2fs_write_node_pages determines the number of pages to be written, nr_to_write, as follows. 1. The bio_get_nr_vecs returns 129 pages. 2. The bio_alloc makes a room for 128 pages. 3. The initial 128 pages go into one bio. 4. The existing bio is submitted, and a new bio is prepared for the last 1 page. 5. Finally, sync_node_pages submits the last 1 page bio. The problem is from the use of bio_get_nr_vecs, so this patch replace it with max_hw_blocks using queue_max_sectors. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-30 12:07:32 +09:00
Haicheng Li	6cac3759ce	f2fs: fix inconsistent using of NM_WOUT_THRESHOLD try_to_free_nats() is usually called with parameter nr_shrink as "nm_i->nat_cnt - NM_WOUT_THRESHOLD" by flush_nat_entries() during checkpointing process. However, this is inconsistent with the actual threshold check as "if (nm_i->nat_cnt < 2 * NM_WOUT_THRESHOLD)" , which will ignore the free_nats requests when NM_WOUT_THRESHOLD < nm_i->nat_cnt < 2 * NM_WOUT_THRESHOLD So fix the threshold check condition. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-30 12:07:32 +09:00
Jaegeuk Kim	afcb7ca01f	f2fs: check truncation of mapping after lock_page We call lock_page when we need to update a page after readpage. Between grab and lock page, the page can be truncated by other thread. So, we should check the page after lock_page whether it was truncated or not. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-29 11:19:32 +09:00
Jaegeuk Kim	55008d845d	f2fs: enhance alloc_nid and build_free_nids flows In order to avoid build_free_nid lock contention, let's change the order of function calls as follows. At first, check whether there is enough free nids. - If available, just get a free nid with spin_lock without any overhead. - Otherwise, conduct build_free_nids. : scan nat pages, journal nat entries, and nat cache entries. We should consider carefullly not to serve free nids intermediately made by build_free_nids. We can get stable free nids only after build_free_nids is done. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-29 11:19:21 +09:00
Jaegeuk Kim	9198aceb53	f2fs: check nid == 0 in add_free_nid It is more obvious that add_free_nid checks whether the free nid is zero or not. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-26 10:35:13 +09:00
Jaegeuk Kim	c718379b6b	f2fs: give a chance to merge IOs by IO scheduler Previously, background GC submits many 4KB read requests to load victim blocks and/or its (i)node blocks. ... f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0] f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0] f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0] ... However, by the fact that many IOs are sequential, we can give a chance to merge the IOs by IO scheduler. In order to do that, let's use blk_plug. ... f2fs_gc : f2fs_iget: ino = 143 f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee f2fs_gc : f2fs_iget: ino = 143 f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef <idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0] <idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0] <idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0] <idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0] <idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0] <idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0] <idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0] <idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0] <idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0] ... Note that this issue should be addressed in checkpoint, and some readahead flows too. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-26 10:35:10 +09:00
Namjae Jeon	51dd624934	f2fs: add tracepoints for truncate operation add tracepoints for tracing the truncate operations like truncate node/data blocks, f2fs_truncate etc. Tracepoints are added at entry and exit of operation to trace the success & failure of operation. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [Jaegeuk: combine and modify the tracepoint structures] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-23 16:40:38 +09:00
Jaegeuk Kim	399368372e	f2fs: introduce a new global lock scheme In the previous version, f2fs uses global locks according to the usage types, such as directory operations, block allocation, block write, and so on. Reference the following lock types in f2fs.h. enum lock_type { RENAME, /* for renaming operations / DENTRY_OPS, / for directory operations / DATA_WRITE, / for data write / DATA_NEW, / for data allocation / DATA_TRUNC, / for data truncate / NODE_NEW, / for node allocation / NODE_TRUNC, / for node truncate / NODE_WRITE, / for node write */ NR_LOCK_TYPE, }; In that case, we lose the performance under the multi-threading environment, since every types of operations must be conducted one at a time. In order to address the problem, let's share the locks globally with a mutex array regardless of any types. So, let users grab a mutex and perform their jobs in parallel as much as possbile. For this, I propose a new global lock scheme as follows. 0. Data structure - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS] - f2fs_sb_info -> node_write 1. mutex_lock_op(sbi) - try to get an avaiable lock from the array. - returns the index of the gottern lock variable. 2. mutex_unlock_op(sbi, index of the lock) - unlock the given index of the lock. 3. mutex_lock_all(sbi) - grab all the locks in the array before the checkpoint. 4. mutex_unlock_all(sbi) - release all the locks in the array after checkpoint. 5. block_operations() - call mutex_lock_all() - sync_dirty_dir_inodes() - grab node_write - sync_node_pages() Note that, the pairs of mutex_lock_op()/mutex_unlock_op() and mutex_lock_all()/mutex_unlock_all() should be used together. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-09 18:21:18 +09:00
Jaegeuk Kim	49952fa182	f2fs: reduce redundant spin_lock operations This patch reduces redundant spin_lock operations in alloc_nid_failed(). The alloc_nid_failed() does not need to delete entry and add one again by triggering spin_lock and spin_unlock redundantly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-03 22:19:03 +09:00
Jaegeuk Kim	b74737541c	f2fs: avoid race for summary information In order to do GC more reliably, I'd like to lock the vicitm summary page until its GC is completed, and also prevent any checkpoint process. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-03 17:27:51 +09:00
Jaegeuk Kim	56ae674cc2	f2fs: remove redundant lock_page calls In get_node_page, we do not need to call lock_page all the time. If the node page is cached as uptodate, 1. grab_cache_page locks the page, 2. read_node_page unlocks the page, and 3. lock_page is called for further process. Let's avoid this. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-03 17:27:42 +09:00
Alexandru Gheorghiu	79b5793be4	f2fs: use kmemdup Use kmemdup instead of kzalloc and memcpy. Signed-off-by: Alexandru Gheorghiu <gheorghiuandru@gmail.com> Acked-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-31 09:12:18 +09:00
Jaegeuk Kim	fa37241743	f2fs: remain nat cache entries for further free nid allocation In the checkpoint flow, the f2fs investigates the total nat cache entries. Previously, if an entry has NULL_ADDR, f2fs drops the entry and adds the obsolete nid to the free nid list. However, this free nid will be reused sooner, resulting in its nat entry miss. In order to avoid this, we don't need to drop the nat cache entry at this moment. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-27 09:16:18 +09:00
Jaegeuk Kim	04431c44e5	f2fs: fix not to allocate max_nid The build_free_nid should not add free nids over nm_i->max_nid. But, there was a hole that invalid free nid was added by the following scenario. Let's suppose nm_i->max_nid = 150 and the last NAT page has 100 ~ 200 nids. build_free_nids - get_current_nat_page loads the last NAT page - scan_nat_page can add 100 ~ 200 nids -> Bug here! So, when scanning an NAT page, we should check each candidate whether it is over max_nid or not. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-20 18:30:13 +09:00

1 2 3

52 Commits