Commit Graph

413 Commits

Author SHA1 Message Date
Jens Axboe 7dcda1c96d fs: export empty_aops
With the ->sync_page() hook gone, we have a few users that
add their own static address_space_operations without any
functions defined.

fs/inode.c already has an empty_aops that it uses for init
purposes. Lets export that and use it in the places where
an otherwise empty aops was defined.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-05 23:51:48 +02:00
Nicolas Kaiser af71dda0b8 nilfs2: fix whitespace coding style issues
Fixes whitespace coding style issues.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-30 17:39:54 +09:00
Ryusuke Konishi d611b22f1a nilfs2: fix oops due to a bad aops initialization
Nilfs in 2.6.39-rc1 hit the following oops:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
 IP: [<ffffffff810ac235>] try_to_release_page+0x2a/0x3d
 PGD 234cb6067 PUD 234c72067 PMD 0
 Oops: 0000 [#1] SMP
 <snip>
 Process truncate (pid: 10995, threadinfo ffff8802353c2000, task ffff880234cfa000)
 Stack:
  ffff8802333c77b8 ffffffff810b64b0 0000000000003802 ffffffffa0052cca
  0000000000000000 ffff8802353c3b58 0000000000000000 ffff8802353c3b58
  0000000000000001 0000000000000000 ffffea0007b92308 ffffea0007b92308
 Call Trace:
  [<ffffffff810b64b0>] ? invalidate_inode_pages2_range+0x15f/0x273
  [<ffffffffa0052cca>] ? nilfs_palloc_get_block+0x2d/0xaf [nilfs2]
  [<ffffffff810589e7>] ? bit_waitqueue+0x14/0xa1
  [<ffffffff81058ab1>] ? wake_up_bit+0x10/0x20
  [<ffffffffa00433fd>] ? nilfs_forget_buffer+0x66/0x7a [nilfs2]
  [<ffffffffa00467b8>] ? nilfs_btree_concat_left+0x5c/0x77 [nilfs2]
  [<ffffffffa00471fc>] ? nilfs_btree_delete+0x395/0x3cf [nilfs2]
  [<ffffffffa00449a3>] ? nilfs_bmap_do_delete+0x6e/0x79 [nilfs2]
  [<ffffffffa0045845>] ? nilfs_btree_last_key+0x14b/0x15e [nilfs2]
  [<ffffffffa00449dd>] ? nilfs_bmap_truncate+0x2f/0x83 [nilfs2]
  [<ffffffffa0044ab2>] ? nilfs_bmap_last_key+0x35/0x62 [nilfs2]
  [<ffffffffa003e99b>] ? nilfs_truncate_bmap+0x6b/0xc7 [nilfs2]
  [<ffffffffa003ee4a>] ? nilfs_truncate+0x79/0xe4 [nilfs2]
  [<ffffffff810b6c00>] ? vmtruncate+0x33/0x3b
  [<ffffffffa003e8f1>] ? nilfs_setattr+0x4d/0x8c [nilfs2]
  [<ffffffff81026106>] ? do_page_fault+0x31b/0x356
  [<ffffffff810f9d61>] ? notify_change+0x17d/0x262
  [<ffffffff810e5046>] ? do_truncate+0x65/0x80
  [<ffffffff810e52af>] ? sys_ftruncate+0xf1/0xf6
  [<ffffffff8132c012>] ? system_call_fastpath+0x16/0x1b
 Code: c3 48 83 ec 08 48 8b 17 48 8b 47 18 80 e2 01 75 04 0f 0b eb fe 48 8b 17 80 e6 20 74 05 31 c0 41 59 c3 48 85 c0 74 11 48 8b 40 58
  8b 40 48 48 85 c0 74 04 41 58 ff e0 59 e9 b1 b5 05 00 41 54
 RIP  [<ffffffff810ac235>] try_to_release_page+0x2a/0x3d
  RSP <ffff8802353c3b08>
 CR2: 0000000000000048

This oops was brought in by the change "block: remove per-queue
plugging" (commit: 7eaceaccab).  It initializes mapping->a_ops
with a NULL pointer for some pages in nilfs (e.g. btree node pages),
but mm code doesn't NULL pointer checks against mapping->a_ops. (the
check is done for each callback function)

This corrects the aops initialization and fixes the oops.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-30 17:39:25 +09:00
Ryusuke Konishi 3409453794 nilfs2: fix data loss in mmap page write for hole blocks
From the result of a function test of mmap, mmap write to shared pages
turned out to be broken for hole blocks.  It doesn't write out filled
blocks and the data will be lost after umount.  This is due to a bug
that the target file is not queued for log writer when filling hole
blocks.

Also, nilfs_page_mkwrite function exits normal code path even after
successfully filled hole blocks due to a change of block_page_mkwrite
function; just after nilfs was merged into the mainline,
block_page_mkwrite() started to return VM_FAULT_LOCKED instead of zero
by the patch "mm: close page_mkwrite races" (commit:
b827e496c8).  The current nilfs_page_mkwrite() is not handling
this value properly.

This corrects nilfs_page_mkwrite() and will resolve the data loss
problem in mmap write.

[This should be applied to every kernel since 2.6.30 but a fix is
 needed for 2.6.37 and prior kernels]

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: stable <stable@kernel.org>  [2.6.38]
2011-03-30 10:45:31 +09:00
Linus Torvalds 6c51038900 Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
  Documentation/iostats.txt: bit-size reference etc.
  cfq-iosched: removing unnecessary think time checking
  cfq-iosched: Don't clear queue stats when preempt.
  blk-throttle: Reset group slice when limits are changed
  blk-cgroup: Only give unaccounted_time under debug
  cfq-iosched: Don't set active queue in preempt
  block: fix non-atomic access to genhd inflight structures
  block: attempt to merge with existing requests on plug flush
  block: NULL dereference on error path in __blkdev_get()
  cfq-iosched: Don't update group weights when on service tree
  fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
  block: Require subsystems to explicitly allocate bio_set integrity mempool
  jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  fs: make fsync_buffers_list() plug
  mm: make generic_writepages() use plugging
  blk-cgroup: Add unaccounted time to timeslice_used.
  block: fixup plugging stubs for !CONFIG_BLOCK
  block: remove obsolete comments for blkdev_issue_zeroout.
  blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
  ...

Fix up conflicts in fs/{aio.c,super.c}
2011-03-24 10:16:26 -07:00
Serge E. Hallyn 2e14967075 userns: rename is_owner_or_cap to inode_owner_or_capable
And give it a kernel-doc comment.

[akpm@linux-foundation.org: btrfs changed in linux-next]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:47:13 -07:00
Akinobu Mita a49ebbabb0 nilfs2: use little-endian bitops
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h.  This converts ext2 non-atomic bit operations to
little-endian bit operations.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:46:18 -07:00
Jens Axboe 4c63f5646e Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/core
Conflicts:
	block/blk-core.c
	block/blk-flush.c
	drivers/md/raid1.c
	drivers/md/raid10.c
	drivers/md/raid5.c
	fs/nilfs2/btnode.c
	fs/nilfs2/mdt.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:58:35 +01:00
Jens Axboe 721a9602e6 block: kill off REQ_UNPLUG
With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Jens Axboe 7eaceaccab block: remove per-queue plugging
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:07 +01:00
Ryusuke Konishi e3154e9748 nilfs2: get rid of nilfs_sb_info structure
This directly uses sb->s_fs_info to keep a nilfs filesystem object and
fully removes the intermediate nilfs_sb_info structure.  With this
change, the hierarchy of on-memory structures of nilfs will be
simplified as follows:

Before:
  super_block
       -> nilfs_sb_info
             -> the_nilfs
                   -> cptree --+-> nilfs_root (current file system)
                               +-> nilfs_root (snapshot A)
                               +-> nilfs_root (snapshot B)
                               :
             -> nilfs_sc_info (log writer structure)
After:
  super_block
       -> the_nilfs
             -> cptree --+-> nilfs_root (current file system)
                         +-> nilfs_root (snapshot A)
                         +-> nilfs_root (snapshot B)
                         :
             -> nilfs_sc_info (log writer structure)

The reason why we didn't design so from the beginning is because the
initial shape also differed from the above.  The early hierachy was
composed of "per-mount-point" super_block -> nilfs_sb_info pairs and a
shared nilfs object.  On the kernel 2.6.37, it was changed to the
current shape in order to unify super block instances into one per
device, and this cleanup became applicable as the result.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-09 11:54:26 +09:00
Ryusuke Konishi f7545144c2 nilfs2: use sb instance instead of nilfs_sb_info struct
This replaces sbi uses with direct reference to sb instance.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-09 11:05:08 +09:00
Ryusuke Konishi d96bbfa28a nilfs2: get rid of sc_sbi back pointer
Removes sci->sc_sbi which is a back pointer to nilfs_sb_info struct
from log writer object (nilfs_sc_info).

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-09 11:05:08 +09:00
Ryusuke Konishi 3fd3fe5aea nilfs2: move log writer onto nilfs object
Log writer is held by the nilfs_sb_info structure.  This moves it into
nilfs object and replaces all uses of NILFS_SC() accessor.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-09 11:05:08 +09:00
Ryusuke Konishi 9b1fc4e497 nilfs2: move next generation counter into nilfs object
Moves s_next_generation counter and a spinlock protecting it to nilfs
object from nilfs_sb_info structure.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-09 11:05:08 +09:00
Ryusuke Konishi 693dd32122 nilfs2: move s_inode_lock and s_dirty_files into nilfs object
Moves s_inode_lock spinlock and s_dirty_files list to nilfs object
from nilfs_sb_info structure.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-09 11:05:07 +09:00
Ryusuke Konishi 574e6c3145 nilfs2: move parameters on nilfs_sb_info into nilfs object
This moves four parameter variables on nilfs_sb_info s_resuid,
s_resgid, s_interval and s_watermark to the nilfs object.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-09 11:05:07 +09:00
Ryusuke Konishi 3b2ce58b0f nilfs2: move mount options to nilfs object
This moves mount_opt local variable to nilfs object from nilfs_sb_info
struct.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-09 11:05:07 +09:00
Ryusuke Konishi be667377a8 nilfs2: record used amount of each checkpoint in checkpoint list
This records the number of used blocks per checkpoint in each
checkpoint entry of cpfile.  Even though userland tools can get the
block count via nilfs_get_cpinfo ioctl, it was not updated by the
nilfs2 kernel code.  This fixes the issue and makes it available for
userland tools to calculate used amount per checkpoint.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Jiro SEKIBA <jir@unicus.jp>
2011-03-08 14:58:31 +09:00
Ryusuke Konishi ae191838b0 nilfs2: optimize rec_len functions
This is a similar change to those in ext2/ext3 codebase (commit
40a063f669 and a4ae309486, respectively).

The addition of 64k block capability in the rec_len_from_disk and
rec_len_to_disk functions added a bit of math overhead which slows
down file create workloads needlessly when the architecture cannot
even support 64k blocks.  This will cut the corner.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 14:58:30 +09:00
Ryusuke Konishi 4138ec2382 nilfs2: append blocksize info to warnings during loading super blocks
At present, the same warning message can be output twice when nilfs
detected a problem on super blocks:

 NILFS warning: broken superblock. using spare superblock.
 NILFS warning: broken superblock. using spare superblock.
 ...

This is because these super blocks are reloaded with the block size
written in a super block if it differs from the first block size, but
this repetition looks somewhat confusing.  So, we hint at what is
going on by appending block size information to those messages.

Reported-by: Wakko Warner <wakko@animx.eu.org>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 14:58:30 +09:00
Ryusuke Konishi 828b1c50ae nilfs2: add compat ioctl
The current FS_IOC_GETFLAGS/SETFLAGS/GETVERSION will fail if
application is 32 bit and kernel is 64 bit.

This issue is avoidable by adding compat_ioctl method.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 14:58:30 +09:00
Ryusuke Konishi cde98f0f84 nilfs2: implement FS_IOC_GETFLAGS/SETFLAGS/GETVERSION
Add support for the standard attributes set via chattr and read via
lsattr.  These attributes are already in the flags value in the nilfs2
inode, but currently we don't have any ioctl commands that expose them
to the userland.

Collaterally, this adds the FS_IOC_GETVERSION ioctl for getting
i_generation, which allows users to list the file's generation number
with "lsattr -v".

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 14:58:30 +09:00
Ryusuke Konishi b253a3e4f2 nilfs2: tighten restrictions on inode flags
Nilfs has few rectrictions on which flags may be set on which inodes
like ext2/3/4 filesystems used to be.  Specifically DIRSYNC may only
be set on directories and IMMUTABLE and APPEND may not be set on
links.  Tighten that to disallow TOPDIR being set on non-directories
and only NODUMP and NOATIME to be set on non-regular file,
non-directories.

This introduces a flags masking function like those of extN and uses
it during inode creation.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 14:58:29 +09:00
Ryusuke Konishi 32f4aeb315 nilfs2: mark S_NOATIME on inodes only if NOATIME attribute is set
At present, nilfs marks S_NOATIME flag on all inodes.  This restricts
nilfs_set_inode_flags function so that it marks S_NOATIME only if a
given inode has an FS_NOATIME_FL flag.

Although nilfs does not support atime yet, touch_atime() still safely
returns on IS_NOATIME check since MS_NOATIME is always set on sb.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 14:58:29 +09:00