Commit Graph

38 Commits

Author SHA1 Message Date
Jaegeuk Kim 55008d845d f2fs: enhance alloc_nid and build_free_nids flows
In order to avoid build_free_nid lock contention, let's change the order of
function calls as follows.

At first, check whether there is enough free nids.
 - If available, just get a free nid with spin_lock without any overhead.
 - Otherwise, conduct build_free_nids.
  : scan nat pages, journal nat entries, and nat cache entries.

We should consider carefullly not to serve free nids intermediately made by
build_free_nids.
We can get stable free nids only after build_free_nids is done.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-29 11:19:21 +09:00
Jaegeuk Kim 9198aceb53 f2fs: check nid == 0 in add_free_nid
It is more obvious that add_free_nid checks whether the free nid is zero or not.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-26 10:35:13 +09:00
Jaegeuk Kim c718379b6b f2fs: give a chance to merge IOs by IO scheduler
Previously, background GC submits many 4KB read requests to load victim blocks
and/or its (i)node blocks.

...
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed
f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee
f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef
f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0]
...

However, by the fact that many IOs are sequential, we can give a chance to merge
the IOs by IO scheduler.
In order to do that, let's use blk_plug.

...
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef
<idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0]
<idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0]
<idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0]
<idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0]
<idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0]
<idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0]
<idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0]
...

Note that this issue should be addressed in checkpoint, and some readahead
flows too.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-26 10:35:10 +09:00
Namjae Jeon 51dd624934 f2fs: add tracepoints for truncate operation
add tracepoints for tracing the truncate operations
like truncate node/data blocks, f2fs_truncate etc.

Tracepoints are added at entry and exit of operation
to trace the success & failure of operation.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: combine and modify the tracepoint structures]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23 16:40:38 +09:00
Jaegeuk Kim 399368372e f2fs: introduce a new global lock scheme
In the previous version, f2fs uses global locks according to the usage types,
such as directory operations, block allocation, block write, and so on.

Reference the following lock types in f2fs.h.
enum lock_type {
	RENAME,		/* for renaming operations */
	DENTRY_OPS,	/* for directory operations */
	DATA_WRITE,	/* for data write */
	DATA_NEW,	/* for data allocation */
	DATA_TRUNC,	/* for data truncate */
	NODE_NEW,	/* for node allocation */
	NODE_TRUNC,	/* for node truncate */
	NODE_WRITE,	/* for node write */
	NR_LOCK_TYPE,
};

In that case, we lose the performance under the multi-threading environment,
since every types of operations must be conducted one at a time.

In order to address the problem, let's share the locks globally with a mutex
array regardless of any types.
So, let users grab a mutex and perform their jobs in parallel as much as
possbile.

For this, I propose a new global lock scheme as follows.

0. Data structure
 - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS]
 - f2fs_sb_info -> node_write

1. mutex_lock_op(sbi)
 - try to get an avaiable lock from the array.
 - returns the index of the gottern lock variable.

2. mutex_unlock_op(sbi, index of the lock)
 - unlock the given index of the lock.

3. mutex_lock_all(sbi)
 - grab all the locks in the array before the checkpoint.

4. mutex_unlock_all(sbi)
 - release all the locks in the array after checkpoint.

5. block_operations()
 - call mutex_lock_all()
 - sync_dirty_dir_inodes()
 - grab node_write
 - sync_node_pages()

Note that,
 the pairs of mutex_lock_op()/mutex_unlock_op() and
 mutex_lock_all()/mutex_unlock_all() should be used together.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-09 18:21:18 +09:00
Jaegeuk Kim 49952fa182 f2fs: reduce redundant spin_lock operations
This patch reduces redundant spin_lock operations in alloc_nid_failed().
The alloc_nid_failed() does not need to delete entry and add one again
by triggering spin_lock and spin_unlock redundantly.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-03 22:19:03 +09:00
Jaegeuk Kim b74737541c f2fs: avoid race for summary information
In order to do GC more reliably, I'd like to lock the vicitm summary page
until its GC is completed, and also prevent any checkpoint process.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-03 17:27:51 +09:00
Jaegeuk Kim 56ae674cc2 f2fs: remove redundant lock_page calls
In get_node_page, we do not need to call lock_page all the time.

If the node page is cached as uptodate,

1. grab_cache_page locks the page,
2. read_node_page unlocks the page, and
3. lock_page is called for further process.

Let's avoid this.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-03 17:27:42 +09:00
Alexandru Gheorghiu 79b5793be4 f2fs: use kmemdup
Use kmemdup instead of kzalloc and memcpy.

Signed-off-by: Alexandru Gheorghiu <gheorghiuandru@gmail.com>
Acked-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-31 09:12:18 +09:00
Jaegeuk Kim fa37241743 f2fs: remain nat cache entries for further free nid allocation
In the checkpoint flow, the f2fs investigates the total nat cache entries.
Previously, if an entry has NULL_ADDR, f2fs drops the entry and adds the
obsolete nid to the free nid list.
However, this free nid will be reused sooner, resulting in its nat entry miss.
In order to avoid this, we don't need to drop the nat cache entry at this moment.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-27 09:16:18 +09:00
Jaegeuk Kim 04431c44e5 f2fs: fix not to allocate max_nid
The build_free_nid should not add free nids over nm_i->max_nid.
But, there was a hole that invalid free nid was added by the following scenario.

Let's suppose nm_i->max_nid = 150 and the last NAT page has 100 ~ 200 nids.

build_free_nids
  - get_current_nat_page loads the last NAT page
  - scan_nat_page can add 100 ~ 200 nids
    -> Bug here!
So, when scanning an NAT page, we should check each candidate whether it is
over max_nid or not.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-20 18:30:13 +09:00
Jaegeuk Kim c3850aa1cb f2fs: fix return value of releasepage for node and data
If the return value of releasepage is equal to zero, the page cannot be reclaimed.
Instead, we should return 1 in order to reclaim clean pages.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-20 18:30:13 +09:00
Jaegeuk Kim 48cb76c7be f2fs: scan next nat page to reuse free nids in there
When we build new free nids, let's scan the just next NAT page instead of
skipping a couple of previously scanned pages in order to reuse free nids in
there.
Otherwise, we can use too much wide range of nids even though several nids were
deallocated, and also their node pages can be cached in the node_inode's address
space.
This means that we can retain lots of clean pages in the main memory, which
induces mm's reclaiming overhead.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-20 18:30:12 +09:00
Jaegeuk Kim 08d8058be6 f2fs: should check the node page was truncated first
Currently, f2fs doesn't reclaim any node pages.
However, if we found that a node page was truncated by checking its block
address with zero during f2fs_write_node_page, we should not skip that node
page and return zero to reclaim it.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-20 18:30:12 +09:00
Jaegeuk Kim 393ff91f57 f2fs: reduce unncessary locking pages during read
This patch reduces redundant locking and unlocking pages during read operations.
In f2fs_readpage, let's use wait_on_page_locked() instead of lock_page.
And then, when we need to modify any data finally, let's lock the page so that
we can avoid lock contention.

[readpage rule]
- The f2fs_readpage returns unlocked page, or released page too in error cases.
- Its caller should handle read error, -EIO, after locking the page, which
  indicates read completion.
- Its caller should check PageUptodate after grab_cache_page.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-20 18:30:06 +09:00
Namjae Jeon 25c0a6e529 f2fs: avoid extra ++ while returning from get_node_path
In all the breaking conditions in get_node_path, 'n' is used to
track index in offset[] array, but while breaking out also, in all
paths n++ is done.
So, remove the ++ from breaking paths. Also, avoid
reset of 'level=0' in first case.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-18 21:00:36 +09:00
Namjae Jeon 3aa770a9c9 f2fs: optimize and change return path in lookup_free_nid_list
Optimize and change return path in lookup_free_nid_list

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-18 21:00:35 +09:00
Namjae Jeon e0f56cb44b f2fs: optimize get node page readahead part
We can remove the call to find_get_page to get a page from the cache
and check for up-to-date, instead we can make use of grab_cache_page
part itself to fetch the page from the cache.
So, removing the call and moving the PageUptodate at proper place, also
taken care of moving the lock_page condition in the page_hit part.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-18 21:00:34 +09:00
Changman Lee 52c2db3f95 f2fs: check the level before calling get_nid function
The caller of get_nid should be careful not to put lower value than
NODE_DIR1_BLOCK in case of level is zero.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-18 21:00:34 +09:00
Jaegeuk Kim 266e97a81c f2fs: introduce readahead mode of node pages
Previously, f2fs reads several node pages ahead when get_dnode_of_data is called
with RDONLY_NODE flag.
And, this flag is set by the following functions.
- get_data_block_ro
- get_lock_data_page
- do_write_data_page
- truncate_blocks
- truncate_hole

However, this readahead mechanism is initially introduced for the use of
get_data_block_ro to enhance the sequential read performance.

So, let's clarify all the cases with the additional modes as follows.

enum {
	ALLOC_NODE,	/* allocate a new node page if needed */
	LOOKUP_NODE,	/* look up a node without readahead */
	LOOKUP_NODE_RA,	/*
			 * look up a node with readahead called
			 * by get_datablock_ro.
			 */
}

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
2013-03-18 21:00:33 +09:00
Jaegeuk Kim 66d36a2944 f2fs: read with READ_SYNC when getting dnode page
The get_node_page_ra tries to:
1. grab or read a target node page for the given nid,
2. then, call ra_node_page to read other adjacent node pages in advance.

So, when we try to read a target node page by #1, we should submit bio with
READ_SYNC instead of READA.
And, in #2, READA should be used.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
2013-03-18 21:00:33 +09:00
Jaegeuk Kim 12faafe454 f2fs: fix to unlock node page when it was truncated
If the node page was truncated, its block address became zero.
This means that we don't need to write the node page, but have to unlock
NODE_WRITE, decrease the number of dirty node pages, and then unlock_page
before returning the f2fs_write_node_page with zero.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-03-18 21:00:09 +09:00
Jaegeuk Kim 7dd690c820 f2fs: avoid build warning
This patch removes the following build warning:
fs/f2fs/node.c: warning: 'nofs' may be used uninitialized in this function
[-Wuninitialized]:  => 738:8

Note that this is a false alarm.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-02-12 07:28:55 +09:00
Jaegeuk Kim 90b2fc64f0 Merge branch 'f2fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into dev
Pull f2fs cleanup patches from Al Viro:

f2fs: get rid of fake on-stack dentries
f2fs: switch init_inode_metadata() to passing parent and name separately
f2fs: switch new_inode_page() from dentry to qstr
f2fs: init_dent_inode() should take qstr

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>

Conflicts:
	fs/f2fs/recovery.c
2013-02-12 07:17:20 +09:00
Jaegeuk Kim 437275272f f2fs: clarify and enhance the f2fs_gc flow
This patch makes clearer the ambiguous f2fs_gc flow as follows.

1. Remove intermediate checkpoint condition during f2fs_gc
 (i.e., should_do_checkpoint() and GC_BLOCKED)

2. Remove unnecessary return values of f2fs_gc because of #1.
 (i.e., GC_NODE, GC_OK, etc)

3. Simplify write_checkpoint() because of #2.

4. Clarify the main f2fs_gc flow.
 o monitor how many freed sections during one iteration of do_garbage_collect().
 o do GC more without checkpoints if we can't get enough free sections.
 o do checkpoint once we've got enough free sections through forground GCs.

5. Adopt thread-logging (Slack-Space-Recycle) scheme more aggressively on data
  log types. See. get_ssr_segement()

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-02-12 07:15:02 +09:00