Commit Graph

24438 Commits

Author SHA1 Message Date
Paul Gortmaker 143cb494cb fs: add module.h to files that were implicitly using it
Some files were using the complete module.h infrastructure without
actually including the header at all.  Fix them up in advance so
once the implicit presence is removed, we won't get failures like this:

  CC [M]  fs/nfsd/nfssvc.o
fs/nfsd/nfssvc.c: In function 'nfsd_create_serv':
fs/nfsd/nfssvc.c:335: error: 'THIS_MODULE' undeclared (first use in this function)
fs/nfsd/nfssvc.c:335: error: (Each undeclared identifier is reported only once
fs/nfsd/nfssvc.c:335: error: for each function it appears in.)
fs/nfsd/nfssvc.c: In function 'nfsd':
fs/nfsd/nfssvc.c:555: error: implicit declaration of function 'module_put_and_exit'
make[3]: *** [fs/nfsd/nfssvc.o] Error 1

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:31 -04:00
Paul Gortmaker afeacc8c1f fs: add export.h to files using EXPORT_SYMBOL/THIS_MODULE macros
These files were getting <linux/module.h> via an implicit include
path, but we want to crush those out of existence since they cost
time during compiles of processing thousands of lines of headers
for no reason.  Give them the lightweight header that just contains
the EXPORT_SYMBOL infrastructure.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:31 -04:00
Linus Torvalds 97d2eb13a0 Merge branch 'for-linus' of git://ceph.newdream.net/git/ceph-client
* 'for-linus' of git://ceph.newdream.net/git/ceph-client:
  libceph: fix double-free of page vector
  ceph: fix 32-bit ino numbers
  libceph: force resend of osd requests if we skip an osdmap
  ceph: use kernel DNS resolver
  ceph: fix ceph_monc_init memory leak
  ceph: let the set_layout ioctl set single traits
  Revert "ceph: don't truncate dirty pages in invalidate work thread"
  ceph: replace leading spaces with tabs
  libceph: warn on msg allocation failures
  libceph: don't complain on msgpool alloc failures
  libceph: always preallocate mon connection
  libceph: create messenger with client
  ceph: document ioctls
  ceph: implement (optional) max read size
  ceph: rename rsize -> rasize
  ceph: make readpages fully async
2011-10-28 16:42:18 -07:00
Linus Torvalds f362f98e7c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits)
  leases: fix write-open/read-lease race
  nfs: drop unnecessary locking in llseek
  ext4: replace cut'n'pasted llseek code with generic_file_llseek_size
  vfs: add generic_file_llseek_size
  vfs: do (nearly) lockless generic_file_llseek
  direct-io: merge direct_io_walker into __blockdev_direct_IO
  direct-io: inline the complete submission path
  direct-io: separate map_bh from dio
  direct-io: use a slab cache for struct dio
  direct-io: rearrange fields in dio/dio_submit to avoid holes
  direct-io: fix a wrong comment
  direct-io: separate fields only used in the submission path from struct dio
  vfs: fix spinning prevention in prune_icache_sb
  vfs: add a comment to inode_permission()
  vfs: pass all mask flags check_acl and posix_acl_permission
  vfs: add hex format for MAY_* flag values
  vfs: indicate that the permission functions take all the MAY_* flags
  compat: sync compat_stats with statfs.
  vfs: add "device" tag to /proc/self/mountstats
  cleanup: vfs: small comment fix for block_invalidatepage
  ...

Fix up trivial conflict in fs/gfs2/file.c (llseek changes)
2011-10-28 10:49:34 -07:00
Linus Torvalds f793f29611 Merge http://sucs.org/~rohan/git/gfs2-3.0-nmw
* http://sucs.org/~rohan/git/gfs2-3.0-nmw: (24 commits)
  GFS2: Move readahead of metadata during deallocation into its own function
  GFS2: Remove two unused variables
  GFS2: Misc fixes
  GFS2: rewrite fallocate code to write blocks directly
  GFS2: speed up delete/unlink performance for large files
  GFS2: Fix off-by-one in gfs2_blk2rgrpd
  GFS2: Clean up ->page_mkwrite
  GFS2: Correctly set goal block after allocation
  GFS2: Fix AIL flush issue during fsync
  GFS2: Use cached rgrp in gfs2_rlist_add()
  GFS2: Call do_strip() directly from recursive_scan()
  GFS2: Remove obsolete assert
  GFS2: Cache the most recently used resource group in the inode
  GFS2: Make resource groups "append only" during life of fs
  GFS2: Use rbtree for resource groups and clean up bitmap buffer ref count scheme
  GFS2: Fix lseek after SEEK_DATA, SEEK_HOLE have been added
  GFS2: Clean up gfs2_create
  GFS2: Use ->dirty_inode()
  GFS2: Fix bug trap and journaled data fsync
  GFS2: Fix inode allocation error path
  ...
2011-10-28 10:44:50 -07:00
Linus Torvalds dabcbb1bae Merge branch '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6
* '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6: (52 commits)
  Fix build break when freezer not configured
  Add definition for share encryption
  CIFS: Make cifs_push_locks send as many locks at once as possible
  CIFS: Send as many mandatory unlock ranges at once as possible
  CIFS: Implement caching mechanism for posix brlocks
  CIFS: Implement caching mechanism for mandatory brlocks
  CIFS: Fix DFS handling in cifs_get_file_info
  CIFS: Fix error handling in cifs_readv_complete
  [CIFS] Fixup trivial checkpatch warning
  [CIFS] Show nostrictsync and noperm mount options in /proc/mounts
  cifs, freezer: add wait_event_freezekillable and have cifs use it
  cifs: allow cifs_max_pending to be readable under /sys/module/cifs/parameters
  cifs: tune bdi.ra_pages in accordance with the rsize
  cifs: allow for larger rsize= options and change defaults
  cifs: convert cifs_readpages to use async reads
  cifs: add cifs_async_readv
  cifs: fix protocol definition for READ_RSP
  cifs: add a callback function to receive the rest of the frame
  cifs: break out 3rd receive phase into separate function
  cifs: find mid earlier in receive codepath
  ...
2011-10-28 10:43:32 -07:00
Linus Torvalds 5619a69396 Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs: (69 commits)
  xfs: add AIL pushing tracepoints
  xfs: put in missed fix for merge problem
  xfs: do not flush data workqueues in xfs_flush_buftarg
  xfs: remove XFS_bflush
  xfs: remove xfs_buf_target_name
  xfs: use xfs_ioerror_alert in xfs_buf_iodone_callbacks
  xfs: clean up xfs_ioerror_alert
  xfs: clean up buffer allocation
  xfs: remove buffers from the delwri list in xfs_buf_stale
  xfs: remove XFS_BUF_STALE and XFS_BUF_SUPER_STALE
  xfs: remove XFS_BUF_SET_VTYPE and XFS_BUF_SET_VTYPE_REF
  xfs: remove XFS_BUF_FINISH_IOWAIT
  xfs: remove xfs_get_buftarg_list
  xfs: fix buffer flushing during unmount
  xfs: optimize fsync on directories
  xfs: reduce the number of log forces from tail pushing
  xfs: Don't allocate new buffers on every call to _xfs_buf_find
  xfs: simplify xfs_trans_ijoin* again
  xfs: unlock the inode before log force in xfs_change_file_space
  xfs: unlock the inode before log force in xfs_fs_nfs_commit_metadata
  ...
2011-10-28 10:31:42 -07:00
J. Bruce Fields f3c7691e8d leases: fix write-open/read-lease race
In setlease, we use i_writecount to decide whether we can give out a
read lease.

In open, we break leases before incrementing i_writecount.

There is therefore a window between the break lease and the i_writecount
increment when setlease could add a new read lease.

This would leave us with a simultaneous write open and read lease, which
shouldn't happen.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:59:00 +02:00
Andi Kleen 79835a710d nfs: drop unnecessary locking in llseek
This makes NFS follow the standard generic_file_llseek locking scheme.

Cc: Trond.Myklebust@netapp.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:59:00 +02:00
Andi Kleen 4cce0e28b9 ext4: replace cut'n'pasted llseek code with generic_file_llseek_size
This gives ext4 the benefits of unlocked llseek.

Cc: tytso@mit.edu
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:59 +02:00
Andi Kleen 5760495a87 vfs: add generic_file_llseek_size
Add a generic_file_llseek variant to the VFS that allows passing in
the maximum file size of the file system, instead of always
using maxbytes from the superblock.

This can be used to eliminate some cut'n'paste seek code in ext4.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:59 +02:00
Andi Kleen ef3d0fd27e vfs: do (nearly) lockless generic_file_llseek
The i_mutex lock use of generic _file_llseek hurts.  Independent processes
accessing the same file synchronize over a single lock, even though
they have no need for synchronization at all.

Under high utilization this can cause llseek to scale very poorly on larger
systems.

This patch does some rethinking of the llseek locking model:

First the 64bit f_pos is not necessarily atomic without locks
on 32bit systems. This can already cause races with read() today.
This was discussed on linux-kernel in the past and deemed acceptable.
The patch does not change that.

Let's look at the different seek variants:

SEEK_SET: Doesn't really need any locking.
If there's a race one writer wins, the other loses.

For 32bit the non atomic update races against read()
stay the same. Without a lock they can also happen
against write() now.  The read() race was deemed
acceptable in past discussions, and I think if it's
ok for read it's ok for write too.

=> Don't need a lock.

SEEK_END: This behaves like SEEK_SET plus it reads
the maximum size too. Reading the maximum size would have the
32bit atomic problem. But luckily we already have a way to read
the maximum size without locking (i_size_read), so we
can just use that instead.

Without i_mutex there is no synchronization with write() anymore,
however since the write() update is atomic on 64bit it just behaves
like another racy SEEK_SET.  On non atomic 32bit it's the same
as SEEK_SET.

=> Don't need a lock, but need to use i_size_read()

SEEK_CUR: This has a read-modify-write race window
on the same file. One could argue that any application
doing unsynchronized seeks on the same file is already broken.
But for the sake of not adding a regression here I'm
using the file->f_lock to synchronize this. Using this
lock is much better than the inode mutex because it doesn't
synchronize between processes.

=> So still need a lock, but can use a f_lock.

This patch implements this new scheme in generic_file_llseek.
I dropped generic_file_llseek_unlocked and changed all callers.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:58 +02:00
Andi Kleen 847cc6371b direct-io: merge direct_io_walker into __blockdev_direct_IO
This doesn't change anything for the compiler, but hch thought it would
make the code clearer.

I moved the reference counting into its own little inline.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:58 +02:00
Andi Kleen ba253fbf6d direct-io: inline the complete submission path
Add inlines to all the submission path functions. While this increases
code size it also gives gcc a lot of optimization opportunities
in this critical hotpath.

In particular -- together with some other changes -- this
allows gcc to get rid of the unnecessary clearing of
sdio at the beginning and optimize the messy parameter passing.
Any non inlining of a function which takes a sdio parameter
would break this optimization because they cannot be done if the
address of a structure is taken.

Note that benefits are only seen with CONFIG_OPTIMIZE_INLINING
and CONFIG_CC_OPTIMIZE_FOR_SIZE both set to off.

This gives about 2.2% improvement on a large database benchmark
with a high IOPS rate.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:58 +02:00
Andi Kleen 18772641db direct-io: separate map_bh from dio
Only a single b_private field in the map_bh buffer head is needed after
the submission path. Move map_bh separately to avoid storing
this information in the long term slab.

This avoids the weird 104 byte hole in struct dio_submit which also needed
to be memseted early.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:57 +02:00
Andi Kleen 6e8267f532 direct-io: use a slab cache for struct dio
A direct slab call is slightly faster than kmalloc and can be better cached
per CPU. It also avoids rounding to the next kmalloc slab.

In addition this enforces cache line alignment for struct dio to avoid
any false sharing.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:57 +02:00
Andi Kleen 0dc2bc49be direct-io: rearrange fields in dio/dio_submit to avoid holes
Fix most problems reported by pahole.

There is still a weird 104 byte hole after map_bh. I'm not sure what
causes this.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:56 +02:00
Andi Kleen cde1ecb324 direct-io: fix a wrong comment
There's nothing on the stack, even before my changes.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:56 +02:00
Andi Kleen eb28be2b4c direct-io: separate fields only used in the submission path from struct dio
This large, but largely mechanic, patch moves all fields in struct dio
that are only used in the submission path into a separate on stack
data structure. This has the advantage that the memory is very likely
cache hot, which is not guaranteed for memory fresh out of kmalloc.

This also gives gcc more optimization potential because it can easier
determine that there are no external aliases for these variables.

The sdio initialization is a initialization now instead of memset.
This allows gcc to break sdio into individual fields and optimize
away unnecessary zeroing (after all the functions are inlined)

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:56 +02:00
Christoph Hellwig 62a3ddef61 vfs: fix spinning prevention in prune_icache_sb
We need to move the inode to the end of the list to actually make the
spinning prevention explained in the comment above it work.  With a
plain list_move it will simply stay in place as we're always reclaiming
from the head of the list.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:55 +02:00
Andreas Gruenbacher 948409c74d vfs: add a comment to inode_permission()
Acked-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:55 +02:00
Andreas Gruenbacher d124b60a83 vfs: pass all mask flags check_acl and posix_acl_permission
Acked-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:54 +02:00
Andreas Gruenbacher 8fd90c8d1d vfs: indicate that the permission functions take all the MAY_* flags
Acked-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:54 +02:00
Eric W. Biederman 1448c721e4 compat: sync compat_stats with statfs.
This was found by inspection while tracking a similar
bug in compat_statfs64, that has been fixed in mainline
since decemeber.

- This fixes a bug where not all of the f_spare fields
  were cleared on mips and s390.
- Add the f_flags field to struct compat_statfs
- Copy f_flags to userspace in case someone cares.
- Use __clear_user to copy the f_spare field to userspace
  to ensure that all of the elements of f_spare are cleared.
  On some architectures f_spare is has 5 ints and on some
  architectures f_spare only has 4 ints.  Which makes
  the previous technique of clearing each int individually
  broken.

I don't expect anyone actually uses the old statfs system
call anymore but if they do let them benefit from having
the compat and the native version working the same.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:53 +02:00
Bryan Schumaker a877ee03ac vfs: add "device" tag to /proc/self/mountstats
nfsiostat was failing to find mounted filesystems on kernels after
2.6.38 because of changes to show_vfsstat() by commit
c7f404b40a.  This patch adds back the
"device" tag before the nfs server entry so scripts can parse the
mountstats file correctly.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
CC: stable@kernel.org [>=2.6.39]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 13:55:08 +02:00