Commit Graph

1410 Commits

Author SHA1 Message Date
Linus Torvalds
790eac5640 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull second set of VFS changes from Al Viro:
 "Assorted f_pos race fixes, making do_splice_direct() safe to call with
  i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
  ->d_hash/->d_compare calling conventions changes from Linus, misc
  stuff all over the place."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  Document ->tmpfile()
  ext4: ->tmpfile() support
  vfs: export lseek_execute() to modules
  lseek_execute() doesn't need an inode passed to it
  block_dev: switch to fixed_size_llseek()
  cpqphp_sysfs: switch to fixed_size_llseek()
  tile-srom: switch to fixed_size_llseek()
  proc_powerpc: switch to fixed_size_llseek()
  ubi/cdev: switch to fixed_size_llseek()
  pci/proc: switch to fixed_size_llseek()
  isapnp: switch to fixed_size_llseek()
  lpfc: switch to fixed_size_llseek()
  locks: give the blocked_hash its own spinlock
  locks: add a new "lm_owner_key" lock operation
  locks: turn the blocked_list into a hashtable
  locks: convert fl_link to a hlist_node
  locks: avoid taking global lock if possible when waking up blocked waiters
  locks: protect most of the file_lock handling with i_lock
  locks: encapsulate the fl_link list handling
  locks: make "added" in __posix_lock_file a bool
  ...
2013-07-03 09:10:19 -07:00
Linus Torvalds
fc76a258d4 Merge tag 'driver-core-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
 "Here's the big driver core merge for 3.11-rc1

  Lots of little things, and larger firmware subsystem updates, all
  described in the shortlog.  Nice thing here is that we finally get rid
  of CONFIG_HOTPLUG, after 10+ years, thanks to Stephen Rohtwell (it had
  been always on for a number of kernel releases, now it's just
  removed)"

* tag 'driver-core-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (27 commits)
  driver core: device.h: fix doc compilation warnings
  firmware loader: fix another compile warning with PM_SLEEP unset
  build some drivers only when compile-testing
  firmware loader: fix compile warning with PM_SLEEP set
  kobject: sanitize argument for format string
  sysfs_notify is only possible on file attributes
  firmware loader: simplify holding module for request_firmware
  firmware loader: don't export cache_firmware and uncache_firmware
  drivers/base: Use attribute groups to create sysfs memory files
  firmware loader: fix compile warning
  firmware loader: fix build failure with !CONFIG_FW_LOADER_USER_HELPER
  Documentation: Updated broken link in HOWTO
  Finally eradicate CONFIG_HOTPLUG
  driver core: firmware loader: kill FW_ACTION_NOHOTPLUG requests before suspend
  driver core: firmware loader: don't cache FW_ACTION_NOHOTPLUG firmware
  Documentation: Tidy up some drivers/base/core.c kerneldoc content.
  platform_device: use a macro instead of platform_driver_register
  firmware: move EXPORT_SYMBOL annotations
  firmware: Avoid deadlock of usermodehelper lock at shutdown
  dell_rbu: Select CONFIG_FW_LOADER_USER_HELPER explicitly
  ...
2013-07-02 11:44:19 -07:00
Linus Torvalds
c4eb1b0730 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw
Pull GFS2 updates from Steven Whitehouse:
 "There are a few bug fixes for various, mostly very minor corner cases,
  plus some interesting new features.

  The new features include atomic_open whose main benefit will be the
  reduction in locking overhead in case of combined lookup/create and
  open operations, sorting the log buffer lists by block number to
  improve the efficiency of AIL writeback, and aggressively issuing
  revokes in gfs2_log_flush to reduce overhead when dropping glocks."

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
  GFS2: Reserve journal space for quota change in do_grow
  GFS2: Fix fstrim boundary conditions
  GFS2: fix warning message
  GFS2: aggressively issue revokes in gfs2_log_flush
  GFS2: fix regression in dir_double_exhash
  GFS2: Add atomic_open support
  GFS2: Only do one directory search on create
  GFS2: fix error propagation in init_threads()
  GFS2: Remove no-op wrapper function
  GFS2: Cocci spatch "ptr_ret.spatch"
  GFS2: Eliminate gfs2_rg_lops
  GFS2: Sort buffer lists by inplace block number
2013-07-02 09:41:18 -07:00
Linus Torvalds
9e239bb939 Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 update from Ted Ts'o:
 "Lots of bug fixes, cleanups and optimizations.  In the bug fixes
  category, of note is a fix for on-line resizing file systems where the
  block size is smaller than the page size (i.e., file systems 1k blocks
  on x86, or more interestingly file systems with 4k blocks on Power or
  ia64 systems.)

  In the cleanup category, the ext4's punch hole implementation was
  significantly improved by Lukas Czerner, and now supports bigalloc
  file systems.  In addition, Jan Kara significantly cleaned up the
  write submission code path.  We also improved error checking and added
  a few sanity checks.

  In the optimizations category, two major optimizations deserve
  mention.  The first is that ext4_writepages() is now used for
  nodelalloc and ext3 compatibility mode.  This allows writes to be
  submitted much more efficiently as a single bio request, instead of
  being sent as individual 4k writes into the block layer (which then
  relied on the elevator code to coalesce the requests in the block
  queue).  Secondly, the extent cache shrink mechanism, which was
  introduce in 3.9, no longer has a scalability bottleneck caused by the
  i_es_lru spinlock.  Other optimizations include some changes to reduce
  CPU usage and to avoid issuing empty commits unnecessarily."

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits)
  ext4: optimize starting extent in ext4_ext_rm_leaf()
  jbd2: invalidate handle if jbd2_journal_restart() fails
  ext4: translate flag bits to strings in tracepoints
  ext4: fix up error handling for mpage_map_and_submit_extent()
  jbd2: fix theoretical race in jbd2__journal_restart
  ext4: only zero partial blocks in ext4_zero_partial_blocks()
  ext4: check error return from ext4_write_inline_data_end()
  ext4: delete unnecessary C statements
  ext3,ext4: don't mess with dir_file->f_pos in htree_dirblock_to_tree()
  jbd2: move superblock checksum calculation to jbd2_write_superblock()
  ext4: pass inode pointer instead of file pointer to punch hole
  ext4: improve free space calculation for inline_data
  ext4: reduce object size when !CONFIG_PRINTK
  ext4: improve extent cache shrink mechanism to avoid to burn CPU time
  ext4: implement error handling of ext4_mb_new_preallocation()
  ext4: fix corruption when online resizing a fs with 1K block size
  ext4: delete unused variables
  ext4: return FIEMAP_EXTENT_UNKNOWN for delalloc extents
  jbd2: remove debug dependency on debug_fs and update Kconfig help text
  jbd2: use a single printk for jbd_debug()
  ...
2013-07-02 09:39:34 -07:00
Jeff Layton
1c8c601a8c locks: protect most of the file_lock handling with i_lock
Having a global lock that protects all of this code is a clear
scalability problem. Instead of doing that, move most of the code to be
protected by the i_lock instead. The exceptions are the global lists
that the ->fl_link sits on, and the ->fl_block list.

->fl_link is what connects these structures to the
global lists, so we must ensure that we hold those locks when iterating
over or updating these lists.

Furthermore, sound deadlock detection requires that we hold the
blocked_list state steady while checking for loops. We also must ensure
that the search and update to the list are atomic.

For the checking and insertion side of the blocked_list, push the
acquisition of the global lock into __posix_lock_file and ensure that
checking and update of the  blocked_list is done without dropping the
lock in between.

On the removal side, when waking up blocked lock waiters, take the
global lock before walking the blocked list and dequeue the waiters from
the global list prior to removal from the fl_block list.

With this, deadlock detection should be race free while we minimize
excessive file_lock_lock thrashing.

Finally, in order to avoid a lock inversion problem when handling
/proc/locks output we must ensure that manipulations of the fl_block
list are also protected by the file_lock_lock.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:57:42 +04:00
Linus Torvalds
da53be12bb Don't pass inode to ->d_hash() and ->d_compare()
Instances either don't look at it at all (the majority of cases) or
only want it to find the superblock (which can be had as dentry->d_sb).
A few cases that want more are actually safe with dentry->d_inode -
the only precaution needed is the check that it hadn't been replaced with
NULL by rmdir() or by overwriting rename(), which case should be simply
treated as cache miss.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:57:36 +04:00
Al Viro
ac6614b764 [readdir] constify ->actor
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:57:05 +04:00
Al Viro
d81a8ef598 [readdir] convert gfs2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:56:35 +04:00
Bob Peterson
a01aedfe21 GFS2: Reserve journal space for quota change in do_grow
If a GFS2 file system is mounted with quotas and a file is grown
in such a way that its free blocks for the allocation are represented
in a secondary bitmap, GFS2 ran out of blocks in the transaction.
That resulted in "fatal: assertion "tr->tr_num_buf <= tr->tr_blocks".
This patch reserves extra blocks for the quota change so the
transaction has enough space.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-27 18:16:27 +01:00
Abhijith Das
6a98c333ed GFS2: Fix fstrim boundary conditions
This patch correctly distinguishes two boundary conditions:

1. When the given range is entire within the unaccounted space between
   two rgrps, and
2. The range begins beyond the end of the filesystem

Also fix the unit of the returned value r.len (total trimming) to be in bytes 
instead of the (incorrect) 512 byte blocks

With this patch, GFS2 passes multiple iterations of all the relevant xfstests
(251, 260, 288) with different fs block sizes.

Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-19 21:41:26 +01:00
Benjamin Marzinski
2b12eea656 GFS2: fix warning message
This patch fixes a warning message introduced in the recent
"GFS2: aggressively issue revokes in gfs2_log_flush" patch.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-19 21:29:19 +01:00
Benjamin Marzinski
5d054964f5 GFS2: aggressively issue revokes in gfs2_log_flush
This patch looks at all the outstanding blocks in all the transactions
on the log, and moves the completed ones to the ail2 list.  Then it
issues revokes for these blocks.  This will hopefully speed things up
in situations where there is a lot of contention for glocks, especially
if they are acquired serially.

revoke_lo_before_commit will issue at most one log block's full of these
preemptive revokes. The amount of reserved log space that
gfs2_log_reserve() ignores has been incremented to allow for this extra
block.

This patch also consolidates the common revoke instructions into one
function, gfs2_add_revoke().

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-19 09:41:59 +01:00
Greg Kroah-Hartman
bb07b00be7 Merge 3.10-rc6 into driver-core-next
We want these fixes here too.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-17 16:57:20 -07:00
Bob Peterson
512cbf02fd GFS2: fix regression in dir_double_exhash
Recent commit e8830d8 introduced a bug in function dir_double_exhash;
it was failing to set h in the fall-back case. This patch corrects it.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-14 13:27:17 +01:00
Steven Whitehouse
6d4ade986f GFS2: Add atomic_open support
I've restricted atomic_open to only operate on regular files, although
I still don't understand why atomic_open should not be possible also for
directories on GFS2. That can always be added in later though, if it
makes sense.

The ->atomic_open function can be passed negative dentries, which
in most cases means either ENOENT (->lookup) or a call to d_instantiate
(->create). In the GFS2 case though, we need to actually perform the
look up, since we do not know whether there has been a new inode created
on another node. The look up calls d_splice_alias which then tries to
rehash the dentry - so the solution here is to simply check for that
in d_splice_alias. The same issue is likely to affect any other cluster
filesystem implementing ->atomic_open

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields fieldses org>
Cc: Jeff Layton <jlayton@redhat.com>
2013-06-14 11:17:15 +01:00
Steven Whitehouse
5a00f3cc97 GFS2: Only do one directory search on create
Creation of a new inode requires a directory search in order to ensure
that we are not trying to create an inode with the same name as an
existing one. This was hidden away inside the create_ok() function.

In the case that there was an existing inode, and a lookup can be
substituted for a create (which is the case with regular files
when the O_EXCL flag is not in use) then we were doing a second
lookup in order to return the inode.

This patch merges these two lookups into one. This can be done by
passing a flag to gfs2_dir_search() to tell it to just return -EEXIST
in the cases where we don't actually want to look up the inode.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-11 13:45:29 +01:00
Alexey Khoroshilov
a9aefd707c GFS2: fix error propagation in init_threads()
If kthread_run() fails, init_threads() returns
IS_ERR(p) instead of PTR_ERR(p).

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-06 09:52:29 +01:00
Steven Whitehouse
edd2e9acc0 GFS2: Remove no-op wrapper function
This wrapper function is no longer required, so get rid of it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-05 09:51:23 +01:00
Thomas Meyer
2b6f8860e1 GFS2: Cocci spatch "ptr_ret.spatch"
Use PTR_RET in place of open coding this function.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-05 09:51:08 +01:00
Bob Peterson
cd51e61eac GFS2: Eliminate gfs2_rg_lops
With recent changes to the transactions, it appears that we
are no longer using the "log ops" for resource groups. Since the
log commit code processes the array of log ops, eliminating this
should be marginally better for performance. Therefore this patch
eliminates it.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-05 09:50:40 +01:00
Benjamin Marzinski
7f63257da1 GFS2: Sort buffer lists by inplace block number
This patch simply sort the data and metadata buffer lists by their
inplace block number.  This makes gfs2_log_flush issue the inplace IO
in sequential order, which will hopefully speed up writing the IO
out to disk.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-05 09:50:20 +01:00
Stephen Rothwell
40b313608a Finally eradicate CONFIG_HOTPLUG
Ever since commit 45f035ab9b ("CONFIG_HOTPLUG should be always on"),
it has been basically impossible to build a kernel with CONFIG_HOTPLUG
turned off.  Remove all the remaining references to it.

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-03 14:20:18 -07:00
Bob Peterson
a6a4d98b01 GFS2: Don't cache iopen glocks
This patch makes GFS2 immediately reclaim/delete all iopen glocks
as soon as they're dequeued. This allows deleters to get an
EXclusive lock on iopen so files are deleted properly instead of
being set as unlinked.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-03 16:40:22 +01:00
Bob Peterson
e8830d8856 GFS2: Fall back to vmalloc if kmalloc fails for dir hash tables
This version has one more correction: the vmalloc calls are replaced
by __vmalloc calls to preserve the GFP_NOFS flag.

When GFS2's directory management code allocates buffers for a
directory hash table, if it can't get the memory it needs, it
currently gives a bad return code. Rather than giving an error,
this patch allows it to use virtual memory rather than kernel
memory for the hash table. This should make it possible for
directories to function properly, even when kernel memory becomes
very fragmented.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-03 16:39:44 +01:00
Bob Peterson
2b3dcf3581 GFS2: Increase i_writecount during gfs2_setattr_size
This patch calls get_write_access in a few functions. This
merely increases inode->i_writecount for the duration of the function.
That will ensure that any file closes won't delete the inode's
multi-block reservation while the function is running.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-03 16:38:58 +01:00