Commit Graph

383701 Commits

Author SHA1 Message Date
Linus Torvalds e3a0dd98e1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs update from Chris Mason:
 "These are the usual mixture of bugs, cleanups and performance fixes.
  Miao has some really nice tuning of our crc code as well as our
  transaction commits.

  Josef is peeling off more and more problems related to early enospc,
  and has a number of important bug fixes in here too"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (81 commits)
  Btrfs: wait ordered range before doing direct io
  Btrfs: only do the tree_mod_log_free_eb if this is our last ref
  Btrfs: hold the tree mod lock in __tree_mod_log_rewind
  Btrfs: make backref walking code handle skinny metadata
  Btrfs: fix crash regarding to ulist_add_merge
  Btrfs: fix several potential problems in copy_nocow_pages_for_inode
  Btrfs: cleanup the code of copy_nocow_pages_for_inode()
  Btrfs: fix oops when recovering the file data by scrub function
  Btrfs: make the chunk allocator completely tree lockless
  Btrfs: cleanup orphaned root orphan item
  Btrfs: fix wrong mirror number tuning
  Btrfs: cleanup redundant code in btrfs_submit_direct()
  Btrfs: remove btrfs_sector_sum structure
  Btrfs: check if we can nocow if we don't have data space
  Btrfs: stop using try_to_writeback_inodes_sb_nr to flush delalloc
  Btrfs: use a percpu to keep track of possibly pinned bytes
  Btrfs: check for actual acls rather than just xattrs when caching no acl
  Btrfs: move btrfs_truncate_page to btrfs_cont_expand instead of btrfs_truncate
  Btrfs: optimize reada_for_balance
  Btrfs: optimize read_block_for_search
  ...
2013-07-09 12:33:09 -07:00
Linus Torvalds da89bd213f Merge tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfs
Pull xfs update from Ben Myers:
 "This includes several bugfixes, part of the work for project quotas
  and group quotas to be used together, performance improvements for
  inode creation/deletion, buffer readahead, and bulkstat,
  implementation of the inode change count, an inode create transaction,
  and the removal of a bunch of dead code.

  There are also some duplicate commits that you already have from the
  3.10-rc series.

   - part of the work to allow project quotas and group quotas to be
     used together
   - inode change count
   - inode create transaction
   - block queue plugging in buffer readahead and bulkstat
   - ordered log vector support
   - removal of dead code in and around xfs_sync_inode_grab,
     xfs_ialloc_get_rec, XFS_MOUNT_RETERR, XFS_ALLOCFREE_LOG_RES,
     XFS_DIROP_LOG_RES, xfs_chash, ctl_table, and
     xfs_growfs_data_private
   - don't keep silent if sunit/swidth can not be changed via mount
   - fix a leak of remote symlink blocks into the filesystem when xattrs
     are used on symlinks
   - fix for fiemap to return FIEMAP_EXTENT_UNKOWN flag on delay extents
   - part of a fix for xfs_fsr
   - disable speculative preallocation with small files
   - performance improvements for inode creates and deletes"

* tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfs: (61 commits)
  xfs: Remove incore use of XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD
  xfs: Change xfs_dquot_acct to be a 2-dimensional array
  xfs: Code cleanup and removal of some typedef usage
  xfs: Replace macro XFS_DQ_TO_QIP with a function
  xfs: Replace macro XFS_DQUOT_TREE with a function
  xfs: Define a new function xfs_is_quota_inode()
  xfs: implement inode change count
  xfs: Use inode create transaction
  xfs: Inode create item recovery
  xfs: Inode create transaction reservations
  xfs: Inode create log items
  xfs: Introduce an ordered buffer item
  xfs: Introduce ordered log vector support
  xfs: xfs_ifree doesn't need to modify the inode buffer
  xfs: don't do IO when creating an new inode
  xfs: don't use speculative prealloc for small files
  xfs: plug directory buffer readahead
  xfs: add pluging for bulkstat readahead
  xfs: Remove dead function prototype xfs_sync_inode_grab()
  xfs: Remove the left function variable from xfs_ialloc_get_rec()
  ...
2013-07-09 12:29:12 -07:00
Linus Torvalds be0c5d8c0b Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
 "Feature highlights include:
   - Add basic client support for NFSv4.2
   - Add basic client support for Labeled NFS (selinux for NFSv4.2)
   - Fix the use of credentials in NFSv4.1 stateful operations, and add
     support for NFSv4.1 state protection.

  Bugfix highlights:
   - Fix another NFSv4 open state recovery race
   - Fix an NFSv4.1 back channel session regression
   - Various rpc_pipefs races
   - Fix another issue with NFSv3 auth negotiation

  Please note that Labeled NFS does require some additional support from
  the security subsystem.  The relevant changesets have all been
  reviewed and acked by James Morris."

* tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits)
  NFS: Set NFS_CS_MIGRATION for NFSv4 mounts
  NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs
  nfs: have NFSv3 try server-specified auth flavors in turn
  nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it
  nfs: move server_authlist into nfs_try_mount_request
  nfs: refactor "need_mount" code out of nfs_try_mount
  SUNRPC: PipeFS MOUNT notification optimization for dying clients
  SUNRPC: split client creation routine into setup and registration
  SUNRPC: fix races on PipeFS UMOUNT notifications
  SUNRPC: fix races on PipeFS MOUNT notifications
  NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount
  NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount
  NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize
  NFS: Improve legacy idmapping fallback
  NFSv4.1 end back channel session draining
  NFS: Apply v4.1 capabilities to v4.2
  NFSv4.1: Clean up layout segment comparison helper names
  NFSv4.1: layout segment comparison helpers should take 'const' parameters
  NFSv4: Move the DNS resolver into the NFSv4 module
  rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
  ...
2013-07-09 12:09:43 -07:00
Linus Torvalds 1f792dd176 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext3 fix and quota cleanup from Jan Kara:
 "A fix of ext3 error reporting from fsync and a quota cleanup"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: Convert use of typedef ctl_table to struct ctl_table
  ext3: Fix fsync error handling after filesystem abort.
2013-07-09 12:08:43 -07:00
Linus Torvalds c75e24752c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull third set of VFS updates from Al Viro:
 "Misc stuff all over the place.  There will be one more pile in a
  couple of days"

This is an "evil merge" that also uses the new d_count helper in
fs/configfs/dir.c, missed by commit 84d08fa888 ("helper for reading
->d_count")

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ncpfs: fix error return code in ncp_parse_options()
  locks: move file_lock_list to a set of percpu hlist_heads and convert file_lock_lock to an lglock
  seq_file: add seq_list_*_percpu helpers
  f2fs: fix readdir incorrectness
  mode_t whack-a-mole...
  lustre: kill the pointless wrapper
  helper for reading ->d_count
2013-07-09 11:26:44 -07:00
Wei Yongjun 4fbeb19d53 ncpfs: fix error return code in ncp_parse_options()
Fix to return -EINVAL from the option parse error handling
case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-08 13:36:43 +04:00
Jeff Layton 7012b02a2b locks: move file_lock_list to a set of percpu hlist_heads and convert file_lock_lock to an lglock
The file_lock_list is only used for /proc/locks. The vastly common case
is for locks to be put onto the list and come off again, without ever
being traversed.

Help optimize for this use-case by moving to percpu hlist_head-s. At the
same time, we can make the locking less contentious by moving to an
lglock. When iterating over the lists for /proc/locks, we must take the
global lock and then iterate over each CPU's list in turn.

This change necessitates a new fl_link_cpu field to keep track of which
CPU the entry is on. On x86_64 at least, this field is placed within an
existing hole in the struct to avoid growing the size.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-08 13:36:42 +04:00
Jeff Layton 0bc77381c1 seq_file: add seq_list_*_percpu helpers
When we convert the file_lock_list to a set of percpu lists, we'll need
a way to iterate over them in order to output /proc/locks info. Add
some seq_list_*_percpu helpers to handle that.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-08 13:36:41 +04:00
Jaegeuk Kim 99b072bb38 f2fs: fix readdir incorrectness
In the previous Al Viro's readdir patch set, there occurs a bug when
running
xfstest: 006 as follows.

[Error output]
alpha size = 4, name length = 6, total files = 4096, nproc=1
1023 files created
rm: cannot remove `/mnt/f2fs/permname.15150/a': Directory not empty

[Correct output]
alpha size = 4, name length = 6, total files = 4096, nproc=1
4097 files created

This bug is due to the misupdate of directory position in ctx.
So, this patch fixes this.

[AV: fixed a braino]

CC: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-08 13:35:48 +04:00
Linus Torvalds d2b4a64671 Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine updates from Vinod Koul:
 "Once you have some time from extended weekend celebrations please
  consider pulling the following to get:
   - Various fixes and PCI driver for dw_dmac by Andy
   - DT binding for imx-dma by Markus & imx-sdma by Shawn
   - DT fixes for dmaengine by Lars
   - jz4740 dmac driver by Lars
   - and various fixes across the drivers"

What "extended weekend celebrations"?  I'm in the merge window, who has
time for extended celebrations..

* 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (40 commits)
  DMA: shdma: add DT support
  DMA: shdma: shdma_chan_filter() has to be in shdma-base.h
  DMA: shdma: (cosmetic) don't re-calculate a pointer
  dmaengine: at_hdmac: prepare clk before calling enable
  dmaengine/trivial: at_hdmac: add curly brackets to if/else expressions
  dmaengine: at_hdmac: remove unsuded atc_cleanup_descriptors()
  dmaengine: at_hdmac: add FIFO configuration parameter to DMA DT binding
  ARM: at91: dt: add header to define at_hdmac configuration
  MIPS: jz4740: Correct clock gate bit for DMA controller
  MIPS: jz4740: Remove custom DMA API
  MIPS: jz4740: Register jz4740 DMA device
  dma: Add a jz4740 dmaengine driver
  MIPS: jz4740: Acquire and enable DMA controller clock
  dma: mmp_tdma: disable irq when disabling dma channel
  dmaengine: PL08x: Avoid collisions with get_signal() macro
  dmaengine: dw: select DW_DMAC_BIG_ENDIAN_IO automagically
  dma: dw: add PCI part of the driver
  dma: dw: split driver to library part and platform code
  dma: move dw_dmac driver to an own directory
  dw_dmac: don't check resource with devm_ioremap_resource
  ...
2013-07-07 11:11:43 -07:00
Linus Torvalds 8dce5f3dee Merge branch 'cpuinit-delete' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull first stage of __cpuinit removal from Paul Gortmaker:
 "The two commits here 1) dummy out all the __cpuinit macros so that we
  no longer generate such sections, and then 2) remove all the section
  processing that we used to do for those sections.

  This makes all the __cpuinit and friends no-ops, so that we can remove
  the use cases of it at our leisure.  Expect stage 2, which does the
  tree wide removal sweep at the end of the merge window."

* 'cpuinit-delete' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
  modpost: remove all traces of cpuinit/cpuexit sections
  init.h: remove __cpuinit sections from the kernel
2013-07-07 11:01:19 -07:00
Linus Torvalds 21884a83b2 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core updates from Thomas Gleixner:
 "The timer changes contain:

   - posix timer code consolidation and fixes for odd corner cases

   - sched_clock implementation moved from ARM to core code to avoid
     duplication by other architectures

   - alarm timer updates

   - clocksource and clockevents unregistration facilities

   - clocksource/events support for new hardware

   - precise nanoseconds RTC readout (Xen feature)

   - generic support for Xen suspend/resume oddities

   - the usual lot of fixes and cleanups all over the place

  The parts which touch other areas (ARM/XEN) have been coordinated with
  the relevant maintainers.  Though this results in an handful of
  trivial to solve merge conflicts, which we preferred over nasty cross
  tree merge dependencies.

  The patches which have been committed in the last few days are bug
  fixes plus the posix timer lot.  The latter was in akpms queue and
  next for quite some time; they just got forgotten and Frederic
  collected them last minute."

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (59 commits)
  hrtimer: Remove unused variable
  hrtimers: Move SMP function call to thread context
  clocksource: Reselect clocksource when watchdog validated high-res capability
  posix-cpu-timers: don't account cpu timer after stopped thread runtime accounting
  posix_timers: fix racy timer delta caching on task exit
  posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule()
  selftests: add basic posix timers selftests
  posix_cpu_timers: consolidate expired timers check
  posix_cpu_timers: consolidate timer list cleanups
  posix_cpu_timer: consolidate expiry time type
  tick: Sanitize broadcast control logic
  tick: Prevent uncontrolled switch to oneshot mode
  tick: Make oneshot broadcast robust vs. CPU offlining
  x86: xen: Sync the CMOS RTC as well as the Xen wallclock
  x86: xen: Sync the wallclock when the system time is set
  timekeeping: Indicate that clock was set in the pvclock gtod notifier
  timekeeping: Pass flags instead of multiple bools to timekeeping_update()
  xen: Remove clock_was_set() call in the resume path
  hrtimers: Support resuming with two or more CPUs online (but stopped)
  timer: Fix jiffies wrap behavior of round_jiffies_common()
  ...
2013-07-06 14:09:38 -07:00
Linus Torvalds 8b70a90cab Merge branch 'for-v3.11' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull ARM DMA mapping updates from Marek Szyprowski:
 "This contains important bugfixes and an update for IOMMU integration
  support for ARM architecture"

* 'for-v3.11' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: dma: Drop __GFP_COMP for iommu dma memory allocations
  ARM: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as clean
  ARM: dma-mapping: NULLify dev->archdata.mapping pointer on detach
  ARM: dma-mapping: convert DMA direction into IOMMU protection attributes
  ARM: dma-mapping: Get pages if the cpu_addr is out of atomic_pool
2013-07-06 12:41:54 -07:00
Linus Torvalds 7644a448cc Merge tag 'metag-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag
Pull Metag architecture changes from James Hogan:
 - Infrastructure and DT files for TZ1090 SoC (pin control drivers
   already merged via pinctrl tree).
 - Panic on boot instead of just warning if cache aliasing possible.
 - Various SMP/hotplug fixes.
 - Various other randconfig/sparse fixes.

* tag 'metag-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (24 commits)
  metag: move EXPORT_SYMBOL(csum_partial) to metag_ksyms.c
  metag: cpu hotplug: route_irq: preserve irq mask
  metag: kick: add missing irq_enter/exit to kick_handler()
  metag: smp: don't spin waiting for CPU to start
  metag: smp: enable irqs after set_cpu_online
  metag: use clear_tasks_mm_cpumask()
  metag: tz1090: select and instantiate pinctrl-tz1090-pdc
  metag: tz1090: select and instantiate pinctrl-tz1090
  metag: don't check for cache aliasing on smp cpu boot
  metag: panic if cache aliasing possible
  metag: *.dts: include using preprocessor
  metag: add <dt-bindings/> symlink
  metag/.gitignore: Extend the *.dtb pattern to match the dtb.S files
  metag/traps: include setup.h for the per_cpu_trap_init declaration
  metag/traps: Mark die() as __noreturn to match the declaration.
  metag/processor.h: Add missing cpuinfo_op declaration.
  metag/setup: Restrict scope for the capabilities variable
  metag/mm/cache: Restrict scope for metag_lnkget_probe
  metag/asm/irq.h: Declare init_IRQ
  metag/kernel/irq.c: Declare root_domain as static
  ...
2013-07-06 12:39:39 -07:00
Linus Torvalds 16984ce15e Merge tag 'xenarm-for-3.11-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen
Pull Xen ARM update rom Stefano Stabellini:
 "Just one commit this time: the implementation of the tmem hypercall
  for arm and arm64"

* tag 'xenarm-for-3.11-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen:
  xen/arm and xen/arm64: implement HYPERVISOR_tmem_op
2013-07-06 12:38:42 -07:00
Linus Torvalds 2cb7b5a38c Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux
Pull irqdomain refactoring from Grant Likely:
 "This is the long awaited simplification of irqdomain.  It gets rid of
  the different types of irq domains and instead both linear and tree
  mappings can be supported in a single domain.  Doing this removes a
  lot of special case code and makes irq domains simpler to understand
  overall"

* tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux:
  irq: fix checkpatch error
  irqdomain: Include hwirq number in /proc/interrupts
  irqdomain: make irq_linear_revmap() a fast path again
  irqdomain: remove irq_domain_generate_simple()
  irqdomain: Refactor irq_domain_associate_many()
  irqdomain: Beef up debugfs output
  irqdomain: Clean up aftermath of irq_domain refactoring
  irqdomain: Eliminate revmap type
  irqdomain: merge linear and tree reverse mappings.
  irqdomain: Add a name field
  irqdomain: Replace LEGACY mapping with LINEAR
  irqdomain: Relax failure path on setting up mappings
2013-07-06 12:37:04 -07:00
Al Viro 3de91f9237 mode_t whack-a-mole...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-06 23:04:23 +04:00
Thomas Gleixner 73b0cd674c hrtimer: Remove unused variable
Sigh, should have noticed myself.

Reported-by: fengguang.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-07-06 10:34:00 +02:00
Linus Torvalds b2c311075d Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 - Do not idle omap device between crypto operations in one session.
 - Added sha224/sha384 shims for SSSE3.
 - More optimisations for camellia-aesni-avx2.
 - Removed defunct blowfish/twofish AVX2 implementations.
 - Added unaligned buffer self-tests.
 - Added PCLMULQDQ optimisation for CRCT10DIF.
 - Added support for Freescale's DCP co-processor
 - Misc fixes.

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (44 commits)
  crypto: testmgr - test hash implementations with unaligned buffers
  crypto: testmgr - test AEADs with unaligned buffers
  crypto: testmgr - test skciphers with unaligned buffers
  crypto: testmgr - check that entries in alg_test_descs are in correct order
  Revert "crypto: twofish - add AVX2/x86_64 assembler implementation of twofish cipher"
  Revert "crypto: blowfish - add AVX2/x86_64 implementation of blowfish cipher"
  crypto: camellia-aesni-avx2 - tune assembly code for more performance
  hwrng: bcm2835 - fix MODULE_LICENSE tag
  hwrng: nomadik - use clk_prepare_enable()
  crypto: picoxcell - replace strict_strtoul() with kstrtoul()
  crypto: dcp - Staticize local symbols
  crypto: dcp - Use NULL instead of 0
  crypto: dcp - Use devm_* APIs
  crypto: dcp - Remove redundant platform_set_drvdata()
  hwrng: use platform_{get,set}_drvdata()
  crypto: omap-aes - Don't idle/start AES device between Encrypt operations
  crypto: crct10dif - Use PTR_RET
  crypto: ux500 - Cocci spatch "resource_size.spatch"
  crypto: sha256_ssse3 - add sha224 support
  crypto: sha512_ssse3 - add sha384 support
  ...
2013-07-05 12:12:33 -07:00
Linus Torvalds 45175476ae Merge tag 'upstream-3.11-rc1' of git://git.infradead.org/linux-ubi
Pull ubi fixes from Artem Bityutskiy:
 "A couple of fixes and clean-ups, allow for assigning user-defined UBI
  device numbers when attaching MTD devices by using the "mtd=" module
  parameter"

* tag 'upstream-3.11-rc1' of git://git.infradead.org/linux-ubi:
  UBI: support ubi_num on mtd.ubi command line
  UBI: fastmap break out of used PEB search
  UBI: document UBI_IOCVOLUP better in user header
  UBI: do not abort init when ubi.mtd devices cannot be found
  UBI: drop redundant "UBI error" string
2013-07-05 12:09:48 -07:00
Linus Torvalds 2dd1cb5a7e Merge tag 'upstream-3.11-rc1' of git://git.infradead.org/linux-ubifs
Pull ubifs fix from Artem Bityutskiy:
 "Only a single patch which fixes a message"

* tag 'upstream-3.11-rc1' of git://git.infradead.org/linux-ubifs:
  UBIFS: correct mount message
2013-07-05 12:08:47 -07:00
Thomas Gleixner 5ec2481b7b hrtimers: Move SMP function call to thread context
smp_call_function_* must not be called from softirq context.

But clock_was_set() which calls on_each_cpu() is called from softirq
context to implement a delayed clock_was_set() for the timer interrupt
handler. Though that almost never gets invoked. A recent change in the
resume code uses the softirq based delayed clock_was_set to support
Xens resume mechanism.

linux-next contains a new warning which warns if smp_call_function_*
is called from softirq context which gets triggered by that Xen
change.

Fix this by moving the delayed clock_was_set() call to a work context.

Reported-and-tested-by: Artem Savkov <artem.savkov@gmail.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>,
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: xen-devel@lists.xen.org
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-07-05 17:25:58 +02:00
Al Viro 193deee199 lustre: kill the pointless wrapper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-05 19:06:16 +04:00
Al Viro 84d08fa888 helper for reading ->d_count
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-05 18:59:33 +04:00
Thomas Gleixner 332962f2c8 clocksource: Reselect clocksource when watchdog validated high-res capability
Up to commit 5d33b883a (clocksource: Always verify highres capability)
we had no sanity check when selecting a clocksource, which prevented
that a non highres capable clocksource is used when the system already
switched to highres/nohz mode.

The new sanity check works as Alex and Tim found out. It prevents the
TSC from being used. This happens because on x86 the boot process
looks like this:

 tsc_start_freqency_validation(TSC);
 clocksource_register(HPET);
 clocksource_done_booting();
	clocksource_select()
		Selects HPET which is valid for high-res

 switch_to_highres();

 clocksource_register(TSC);
 	TSC is not selected, because it is not yet
	flagged as VALID_HIGH_RES

 clocksource_watchdog()
	Validates TSC for highres, but that does not make TSC
	the current clocksource.

Before the sanity check was added, we installed TSC unvalidated which
worked most of the time. If the TSC was really detected as unstable,
then the unstable logic removed it and installed HPET again.

The sanity check is correct and needed. So the watchdog needs to kick
a reselection of the clocksource, when it qualifies TSC as a valid
high res clocksource.

To solve this, we mark the clocksource which got the flag
CLOCK_SOURCE_VALID_FOR_HRES set by the watchdog with an new flag
CLOCK_SOURCE_RESELECT and trigger the watchdog thread. The watchdog
thread evaluates the flag and invokes clocksource_select() when set.

To avoid that the clocksource_done_booting() code, which is about to
install the first real clocksource anyway, needs to go through
clocksource_select and tick_oneshot_notify() pointlessly, split out
the clocksource_watchdog_kthread() list walk code and invoke the
select/notify only when called from clocksource_watchdog_kthread().

So clocksource_done_booting() can utilize the same splitout code
without the select/notify invocation and the clocksource_mutex
unlock/relock dance.

Reported-and-tested-by: Alex Shi <alex.shi@intel.com>
Cc: Hans Peter Anvin <hpa@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1307042239150.11637@ionos.tec.linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-07-05 11:09:28 +02:00