xfs_sync.c now only contains inode reclaim functions and inode cache
iteration functions. It is not related to sync operations anymore.
Rename to xfs_icache.c to reflect it's contents and prepare for
consolidation with the other inode cache file that exists
(xfs_iget.c).
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
xfs_quiesce_attr() is supposed to leave the log empty with an
unmount record written. Right now it does not wait for the AIL to be
emptied before writing the unmount record, not does it wait for
metadata IO completion, either. Fix it to use the same method and
code as xfs_log_unmount().
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Both callers of xfs_quiesce_attr() are in xfs_super.c, and there's
nothing really sync-specific about this functionality so it doesn't
really matter where it lives. Move it to benext to it's callers, so
all the remount/sync_fs code is in the one place.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Why do we need to write the superblock to disk once we've written
all the data? We don't actually - the reasons for doing this are
lost in the mists of time, and go back to the way Irix used to drive
VFS flushing.
On linux, this code is only called from two contexts: remount and
.sync_fs. In the remount case, the call is followed by a metadata
sync, which unpins and writes the superblock. In the sync_fs case,
we only need to force the log to disk to ensure that the superblock
is correctly on disk, so we don't actually need to write it. Hence
the functionality is either redundant or superfluous and thus can be
removed.
Seeing as xfs_quiesce_data is essentially now just a log force,
remove it as well and fold the code back into the two callers.
Neither of them need the log covering check, either, as that is
redundant for the remount case, and unnecessary for the .sync_fs
case.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
With the syncd functions moved to the log and/or removed, the syncd
workqueue is the only remaining bit left. It is used by the log
covering/ail pushing work, as well as by the inode reclaim work.
Given how cheap workqueues are these days, give the log and inode
reclaim work their own work queues and kill the syncd work queue.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
We don't do any data writeback from XFS any more - the VFS is
completely responsible for that, including for freeze. We can
replace the remaining caller with a VFS level function that
achieves the same thing, but without conflicting with current
writeback work.
This means we can remove the flush_work and xfs_flush_inodes() - the
VFS functionality completely replaces the internal flush queue for
doing this writeback work in a separate context to avoid stack
overruns.
This does have one complication - it cannot be called with page
locks held. Hence move the flushing of delalloc space when ENOSPC
occurs back up into xfs_file_aio_buffered_write when we don't hold
any locks that will stall writeback.
Unfortunately, writeback_inodes_sb_if_idle() is not sufficient to
trigger delalloc conversion fast enough to prevent spurious ENOSPC
whent here are hundreds of writers, thousands of small files and GBs
of free RAM. Hence we need to use sync_sb_inodes() to block callers
while we wait for writeback like the previous xfs_flush_inodes
implementation did.
That means we have to hold the s_umount lock here, but because this
call can nest inside i_mutex (the parent directory in the create
case, held by the VFS), we have to use down_read_trylock() to avoid
potential deadlocks. In practice, this trylock will succeed on
almost every attempt as unmount/remount type operations are
exceedingly rare.
Note: we always need to pass a count of zero to
generic_file_buffered_write() as the previously written byte count.
We only do this by accident before this patch by the virtue of ret
always being zero when there are no errors. Make this explicit
rather than needing to specifically zero ret in the ENOSPC retry
case.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
When unmounting the filesystem, there are lots of operations that
need to be done in a specific order, and they are spread across
across a couple of functions. We have to drain the AIL before we
write the unmount record, and we have to shut down the background
log work before we do either of them.
But this is all split haphazardly across xfs_unmountfs() and
xfs_log_unmount(). Move all the AIL flushing and log manipulations
to xfs_log_unmount() so that the responisbilities of each function
is clear and the operations they perform obvious.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
The only thing the periodic sync work does now is flush the AIL and
idle the log. These are really functions of the log code, so move
the work to xfs_log.c and rename it appropriately.
The only wart that this leaves behind is the xfssyncd_centisecs
sysctl, otherwise the xfssyncd is dead. Clean up any comments that
related to xfssyncd to reflect it's passing.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
If the filesystem is mounted or remounted read-only, stop the sync
worker that tries to flush or cover the log if the filesystem is
dirty. It's read-only, so it isn't dirty. Restart it on a remount,rw
as necessary. This avoids the need for RO checks in the work.
Similarly, stop the sync work when the filesystem is frozen, and
start it again when the filesysetm is thawed. This avoids the need
for special freeze checks in the work.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Instead of starting and stopping background work on the xfs_mount_wq
all at the same time, separate them to where they really are needed
to start and stop.
The xfs_sync_worker, only needs to be started after all the mount
processing has completed successfully, while it needs to be stopped
before the log is unmounted.
The xfs_reclaim_worker is started on demand, and can be
stopped before the unmount process does it's own inode reclaim pass.
The xfs_flush_inodes work is run on demand, and so we really only
need to ensure that it has stopped running before we start
processing an unmount, freeze or remount,ro.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
xfs_syncd_start and xfs_syncd_stop tie a bunch of unrelated
functionailty together that actually have different start and stop
requirements. Kill these functions and open code the start/stop
methods for each of the background functions.
Subsequent patches will move the start/stop functions around to the
correct places to avoid races and shutdown issues.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Pull MIPS update from Ralf Baechle:
"Cleanups and fixes for breakage that occured earlier during this merge
phase. Also a few patches that didn't make the first pull request.
Of those is the Alchemy work that merges code for many of the SOCs and
evaluation boards thus among other code shrinkage, reduces the number
of MIPS defconfigs by 5."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (22 commits)
MIPS: SNI: Switch RM400 serial to SCCNXP driver
MIPS: Remove unused empty_bad_pmd_table[] declaration.
MIPS: MT: Remove kspd.
MIPS: Malta: Fix section mismatch.
MIPS: asm-offset.c: Delete unused irq_cpustat_t struct offsets.
MIPS: Alchemy: Merge PB1100/1500 support into DB1000 code.
MIPS: Alchemy: merge PB1550 support into DB1550 code
MIPS: Alchemy: Single kernel for DB1200/1300/1550
MIPS: Optimize TLB refill for RI/XI configurations.
MIPS: proc: Cleanup printing of ASEs.
MIPS: Hardwire detection of DSP ASE Rev 2 for systems, as required.
MIPS: Add detection of DSP ASE Revision 2.
MIPS: Optimize pgd_init and pmd_init
MIPS: perf: Add perf functionality for BMIPS5000
MIPS: perf: Split the Kconfig option CONFIG_MIPS_MT_SMP
MIPS: perf: Remove unnecessary #ifdef
MIPS: perf: Add cpu feature bit for PCI (performance counter interrupt)
MIPS: perf: Change the "mips_perf_event" table unsupported indicator.
MIPS: Align swapper_pg_dir to 64K for better TLB Refill code.
vmlinux.lds.h: Allow architectures to add sections to the front of .bss
...
Pull module signing support from Rusty Russell:
"module signing is the highlight, but it's an all-over David Howells frenzy..."
Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.
* 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
X.509: Fix indefinite length element skip error handling
X.509: Convert some printk calls to pr_devel
asymmetric keys: fix printk format warning
MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
MODSIGN: Make mrproper should remove generated files.
MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
MODSIGN: Use the same digest for the autogen key sig as for the module sig
MODSIGN: Sign modules during the build process
MODSIGN: Provide a script for generating a key ID from an X.509 cert
MODSIGN: Implement module signature checking
MODSIGN: Provide module signing public keys to the kernel
MODSIGN: Automatically generate module signing keys if missing
MODSIGN: Provide Kconfig options
MODSIGN: Provide gitignore and make clean rules for extra files
MODSIGN: Add FIPS policy
module: signature checking hook
X.509: Add a crypto key parser for binary (DER) X.509 certificates
MPILIB: Provide a function to read raw data into an MPI
X.509: Add an ASN.1 decoder
X.509: Add simple ASN.1 grammar compiler
...
The hostprogs need access to the CONFIG_* symbols found in
include/generated/autoconf.h. But commit abbf1590de ("UAPI: Partition
the header include path sets and add uapi/ header directories") replaced
$(LINUXINCLUDE) with $(USERINCLUDE) which doesn't contain the necessary
include paths.
This has the undesirable effect of breaking the EFI boot stub because
the #ifdef CONFIG_EFI_STUB code in arch/x86/boot/tools/build.c is
never compiled.
It should also be noted that because $(USERINCLUDE) isn't exported by
the top-level Makefile it's actually empty in arch/x86/boot/Makefile.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The UAPI commits forgot to test tooling builds such as tools/perf/,
and this fixes the fallout.
Manual conversion.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ARM update from Russell King:
"This is the final round of stuff for ARM, left until the end of the
merge window to reduce the number of conflicts. This set contains the
ARM part of David Howells UAPI changes, and a fix to the ordering of
'select' statements in ARM Kconfig files (see the appropriate commit
for why this happened - thanks to Andrew Morton for pointing out the
problem.)
I've left this as long as I dare for this window to avoid conflicts,
and I regenerated the config patch yesterday, posting it to our
mailing list for review and testing. I have several acks which
include successful test reports for it.
However, today I notice we've got new conflicts with previously unseen
code... though that conflict should be trivial (it's my changes vs a
one liner.)"
* 'late-for-linus' of git://git.linaro.org/people/rmk/linux-arm:
ARM: config: make sure that platforms are ordered by option string
ARM: config: sort select statements alphanumerically
UAPI: (Scripted) Disintegrate arch/arm/include/asm
Fix up fairly conflict in arch/arm/Kconfig (the select re-organization
vs recent addition of GENERIC_KERNEL_EXECVE)
Pull UAPI disintegration for include/linux/{,byteorder/}*.h from David Howells:
"The patches contained herein do the following:
(1) Remove kernel-only stuff in linux/ppp-comp.h from the UAPI. I checked
this with Paul Mackerras before I created the patch and he suggested some
extra bits to unexport.
(2) Remove linux/blk_types.h entirely from the UAPI as none of it is userspace
applicable, and remove from the UAPI that part of linux/fs.h that was the
reason for linux/blk_types.h being exported in the first place. I
discussed this with Jens Axboe before creating the patch.
(3) The big patch of the series to disintegrate include/linux/*.h as a unit.
This could be split up, though there would be collisions in moving stuff
between the two Kbuild files when the parts are merged as that file is
sorted alphabetically rather than being grouped by subsystem.
Of this set of headers, 17 files have changed in the UAPI exported region
since the 4th and only 8 since the 9th so there isn't much change in this
area - as one might expect.
It should be pretty obvious and straightforward if it does come to fixing
up: stuff in __KERNEL__ guards stays where it is and stuff outside moves
to the same file in the include/uapi/linux/ directory.
If a new file appears then things get a bit more complicated as the
"headers +=" line has to move to include/uapi/linux/Kbuild. Only one new
file has appeared since the 9th and I judge this type of event relatively
unlikely.
(4) A patch to disintegrate include/linux/byteorder/*.h as a unit.
Signed-off-by: David Howells <dhowells@redhat.com>"
* tag 'disintegrate-main-20121013' of git://git.infradead.org/users/dhowells/linux-headers:
UAPI: (Scripted) Disintegrate include/linux/byteorder
UAPI: (Scripted) Disintegrate include/linux
UAPI: Unexport linux/blk_types.h
UAPI: Unexport part of linux/ppp-comp.h
Pull spi UAPI disintegration from David Howells:
"This is to complete part of the Userspace API (UAPI) disintegration
for which the preparatory patches were pulled recently. After these
patches, userspace headers will be segregated into:
include/uapi/linux/.../foo.h
for the userspace interface stuff, and:
include/linux/.../foo.h
for the strictly kernel internal stuff.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>"
* tag 'disintegrate-spi-20121009' of git://git.infradead.org/users/dhowells/linux-headers:
UAPI: (Scripted) Disintegrate include/linux/spi
Pull OpenRISC uapi disintegration from Jonas Bonn:
"OpenRISC UAPI disintegration work from David Howells"
* tag 'openrisc-uapi' of git://openrisc.net/jonas/linux:
UAPI: (Scripted) Disintegrate arch/openrisc/include/asm
Pull user namespace compile fixes from Eric W Biederman:
"This tree contains three trivial fixes. One compiler warning, one
thinko fix, and one build fix"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
btrfs: Fix compilation with user namespace support enabled
userns: Fix posix_acl_file_xattr_userns gid conversion
userns: Properly print bluetooth socket uids
Pull md updates from NeilBrown:
- "discard" support, some dm-raid improvements and other assorted bits
and pieces.
* tag 'md-3.7' of git://neil.brown.name/md: (29 commits)
md: refine reporting of resync/reshape delays.
md/raid5: be careful not to resize_stripes too big.
md: make sure manual changes to recovery checkpoint are saved.
md/raid10: use correct limit variable
md: writing to sync_action should clear the read-auto state.
Subject: [PATCH] md:change resync_mismatches to atomic64_t to avoid races
md/raid5: make sure to_read and to_write never go negative.
md: When RAID5 is dirty, force reconstruct-write instead of read-modify-write.
md/raid5: protect debug message against NULL derefernce.
md/raid5: add some missing locking in handle_failed_stripe.
MD: raid5 avoid unnecessary zero page for trim
MD: raid5 trim support
md/bitmap:Don't use IS_ERR to judge alloc_page().
md/raid1: Don't release reference to device while handling read error.
raid: replace list_for_each_continue_rcu with new interface
add further __init annotations to crypto/xor.c
DM RAID: Fix for "sync" directive ineffectiveness
DM RAID: Fix comparison of index and quantity for "rebuild" parameter
DM RAID: Add rebuild capability for RAID10
DM RAID: Move 'rebuild' checking code to its own function
...
The large platform selection choice should be sorted by option string
so it's easy to find the platform you're looking for. Fix the few
options which are out of this order.
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As suggested by Andrew Morton:
This is a pet peeve of mine. Any time there's a long list of items
(header file inclusions, kconfig entries, array initalisers, etc) and
someone wants to add a new item, they *always* go and stick it at the
end of the list.
Guys, don't do this. Either put the new item into a randomly-chosen
position or, probably better, alphanumerically sort the list.
lets sort all our select statements alphanumerically. This commit was
created by the following perl:
while (<>) {
while (/\\\s*$/) {
$_ .= <>;
}
undef %selects if /^\s*config\s+/;
if (/^\s+select\s+(\w+).*/) {
if (defined($selects{$1})) {
if ($selects{$1} eq $_) {
print STDERR "Warning: removing duplicated $1 entry\n";
} else {
print STDERR "Error: $1 differently selected\n".
"\tOld: $selects{$1}\n".
"\tNew: $_\n";
exit 1;
}
}
$selects{$1} = $_;
next;
}
if (%selects and (/^\s*$/ or /^\s+help/ or /^\s+---help---/ or
/^endif/ or /^endchoice/)) {
foreach $k (sort (keys %selects)) {
print "$selects{$k}";
}
undef %selects;
}
print;
}
if (%selects) {
foreach $k (sort (keys %selects)) {
print "$selects{$k}";
}
}
It found two duplicates:
Warning: removing duplicated S5P_SETUP_MIPIPHY entry
Warning: removing duplicated HARDIRQS_SW_RESEND entry
and they are identical duplicates, hence the shrinkage in the diffstat
of two lines.
We have four testers reporting success of this change (Tony, Stephen,
Linus and Sekhar.)
Acked-by: Jason Cooper <jason@lakedaemon.net>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>