Commit Graph

1266 Commits

Author SHA1 Message Date
Linus Torvalds
1200b6809d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support more Realtek wireless chips, from Jes Sorenson.

   2) New BPF types for per-cpu hash and arrap maps, from Alexei
      Starovoitov.

   3) Make several TCP sysctls per-namespace, from Nikolay Borisov.

   4) Allow the use of SO_REUSEPORT in order to do per-thread processing
   of incoming TCP/UDP connections.  The muxing can be done using a
   BPF program which hashes the incoming packet.  From Craig Gallek.

   5) Add a multiplexer for TCP streams, to provide a messaged based
      interface.  BPF programs can be used to determine the message
      boundaries.  From Tom Herbert.

   6) Add 802.1AE MACSEC support, from Sabrina Dubroca.

   7) Avoid factorial complexity when taking down an inetdev interface
      with lots of configured addresses.  We were doing things like
      traversing the entire address less for each address removed, and
      flushing the entire netfilter conntrack table for every address as
      well.

   8) Add and use SKB bulk free infrastructure, from Jesper Brouer.

   9) Allow offloading u32 classifiers to hardware, and implement for
      ixgbe, from John Fastabend.

  10) Allow configuring IRQ coalescing parameters on a per-queue basis,
      from Kan Liang.

  11) Extend ethtool so that larger link mode masks can be supported.
      From David Decotigny.

  12) Introduce devlink, which can be used to configure port link types
      (ethernet vs Infiniband, etc.), port splitting, and switch device
      level attributes as a whole.  From Jiri Pirko.

  13) Hardware offload support for flower classifiers, from Amir Vadai.

  14) Add "Local Checksum Offload".  Basically, for a tunneled packet
      the checksum of the outer header is 'constant' (because with the
      checksum field filled into the inner protocol header, the payload
      of the outer frame checksums to 'zero'), and we can take advantage
      of that in various ways.  From Edward Cree"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
  bonding: fix bond_get_stats()
  net: bcmgenet: fix dma api length mismatch
  net/mlx4_core: Fix backward compatibility on VFs
  phy: mdio-thunder: Fix some Kconfig typos
  lan78xx: add ndo_get_stats64
  lan78xx: handle statistics counter rollover
  RDS: TCP: Remove unused constant
  RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
  net: smc911x: convert pxa dma to dmaengine
  team: remove duplicate set of flag IFF_MULTICAST
  bonding: remove duplicate set of flag IFF_MULTICAST
  net: fix a comment typo
  ethernet: micrel: fix some error codes
  ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
  bpf, dst: add and use dst_tclassid helper
  bpf: make skb->tc_classid also readable
  net: mvneta: bm: clarify dependencies
  cls_bpf: reset class and reuse major in da
  ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
  ldmvsw: Add ldmvsw.c driver code
  ...
2016-03-19 10:05:34 -07:00
Joonsoo Kim
95813b8faa mm/page_ref: add tracepoint to track down page reference manipulation
CMA allocation should be guaranteed to succeed by definition, but,
unfortunately, it would be failed sometimes.  It is hard to track down
the problem, because it is related to page reference manipulation and we
don't have any facility to analyze it.

This patch adds tracepoints to track down page reference manipulation.
With it, we can find exact reason of failure and can fix the problem.
Following is an example of tracepoint output.  (note: this example is
stale version that printing flags as the number.  Recent version will
print it as human readable string.)

<...>-9018  [004]    92.678375: page_ref_set:         pfn=0x17ac9 flags=0x0 count=1 mapcount=0 mapping=(nil) mt=4 val=1
<...>-9018  [004]    92.678378: kernel_stack:
 => get_page_from_freelist (ffffffff81176659)
 => __alloc_pages_nodemask (ffffffff81176d22)
 => alloc_pages_vma (ffffffff811bf675)
 => handle_mm_fault (ffffffff8119e693)
 => __do_page_fault (ffffffff810631ea)
 => trace_do_page_fault (ffffffff81063543)
 => do_async_page_fault (ffffffff8105c40a)
 => async_page_fault (ffffffff817581d8)
[snip]
<...>-9018  [004]    92.678379: page_ref_mod:         pfn=0x17ac9 flags=0x40048 count=2 mapcount=1 mapping=0xffff880015a78dc1 mt=4 val=1
[snip]
...
...
<...>-9131  [001]    93.174468: test_pages_isolated:  start_pfn=0x17800 end_pfn=0x17c00 fin_pfn=0x17ac9 ret=fail
[snip]
<...>-9018  [004]    93.174843: page_ref_mod_and_test: pfn=0x17ac9 flags=0x40068 count=0 mapcount=0 mapping=0xffff880015a78dc1 mt=4 val=-1 ret=1
 => release_pages (ffffffff8117c9e4)
 => free_pages_and_swap_cache (ffffffff811b0697)
 => tlb_flush_mmu_free (ffffffff81199616)
 => tlb_finish_mmu (ffffffff8119a62c)
 => exit_mmap (ffffffff811a53f7)
 => mmput (ffffffff81073f47)
 => do_exit (ffffffff810794e9)
 => do_group_exit (ffffffff81079def)
 => SyS_exit_group (ffffffff81079e74)
 => entry_SYSCALL_64_fastpath (ffffffff817560b6)

This output shows that problem comes from exit path.  In exit path, to
improve performance, pages are not freed immediately.  They are gathered
and processed by batch.  During this process, migration cannot be
possible and CMA allocation is failed.  This problem is hard to find
without this page reference tracepoint facility.

Enabling this feature bloat kernel text 30 KB in my configuration.

   text    data     bss     dec     hex filename
12127327        2243616 1507328 15878271         f2487f vmlinux_disabled
12157208        2258880 1507328 15923416         f2f8d8 vmlinux_enabled

Note that, due to header file dependency problem between mm.h and
tracepoint.h, this feature has to open code the static key functions for
tracepoints.  Proposed by Steven Rostedt in following link.

https://lkml.org/lkml/2015/12/9/699

[arnd@arndb.de: crypto/async_pq: use __free_page() instead of put_page()]
[iamjoonsoo.kim@lge.com: fix build failure for xtensa]
[akpm@linux-foundation.org: tweak Kconfig text, per Vlastimil]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Kirill A. Shutemov
bcf6691797 mm, tracing: refresh __def_vmaflag_names
Get list of VMA flags up-to-date and sort it to match VM_* definition
order.

[vbabka@suse.cz: add a note above vmaflag definitions to update the names when changing]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Vlastimil Babka
698b1b3064 mm, compaction: introduce kcompactd
Memory compaction can be currently performed in several contexts:

 - kswapd balancing a zone after a high-order allocation failure
 - direct compaction to satisfy a high-order allocation, including THP
   page fault attemps
 - khugepaged trying to collapse a hugepage
 - manually from /proc

The purpose of compaction is two-fold.  The obvious purpose is to
satisfy a (pending or future) high-order allocation, and is easy to
evaluate.  The other purpose is to keep overal memory fragmentation low
and help the anti-fragmentation mechanism.  The success wrt the latter
purpose is more

The current situation wrt the purposes has a few drawbacks:

 - compaction is invoked only when a high-order page or hugepage is not
   available (or manually).  This might be too late for the purposes of
   keeping memory fragmentation low.
 - direct compaction increases latency of allocations.  Again, it would
   be better if compaction was performed asynchronously to keep
   fragmentation low, before the allocation itself comes.
 - (a special case of the previous) the cost of compaction during THP
   page faults can easily offset the benefits of THP.
 - kswapd compaction appears to be complex, fragile and not working in
   some scenarios.  It could also end up compacting for a high-order
   allocation request when it should be reclaiming memory for a later
   order-0 request.

To improve the situation, we should be able to benefit from an
equivalent of kswapd, but for compaction - i.e. a background thread
which responds to fragmentation and the need for high-order allocations
(including hugepages) somewhat proactively.

One possibility is to extend the responsibilities of kswapd, which could
however complicate its design too much.  It should be better to let
kswapd handle reclaim, as order-0 allocations are often more critical
than high-order ones.

Another possibility is to extend khugepaged, but this kthread is a
single instance and tied to THP configs.

This patch goes with the option of a new set of per-node kthreads called
kcompactd, and lays the foundations, without introducing any new
tunables.  The lifecycle mimics kswapd kthreads, including the memory
hotplug hooks.

For compaction, kcompactd uses the standard compaction_suitable() and
ompact_finished() criteria and the deferred compaction functionality.
Unlike direct compaction, it uses only sync compaction, as there's no
allocation latency to minimize.

This patch doesn't yet add a call to wakeup_kcompactd.  The kswapd
compact/reclaim loop for high-order pages will be replaced by waking up
kcompactd in the next patch with the description of what's wrong with
the old approach.

Waking up of the kcompactd threads is also tied to kswapd activity and
follows these rules:
 - we don't want to affect any fastpaths, so wake up kcompactd only from
   the slowpath, as it's done for kswapd
 - if kswapd is doing reclaim, it's more important than compaction, so
   don't invoke kcompactd until kswapd goes to sleep
 - the target order used for kswapd is passed to kcompactd

Future possible future uses for kcompactd include the ability to wake up
kcompactd on demand in special situations, such as when hugepages are
not available (currently not done due to __GFP_NO_KSWAPD) or when a
fragmentation event (i.e.  __rmqueue_fallback()) occurs.  It's also
possible to perform periodic compaction with kcompactd.

[arnd@arndb.de: fix build errors with kcompactd]
[paul.gortmaker@windriver.com: don't use modular references for non modular code]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Linus Torvalds
277edbabf6 Merge tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
 "This time the majority of changes go into cpufreq and they are
  significant.

  First off, the way CPU frequency updates are triggered is different
  now.  Instead of having to set up and manage a deferrable timer for
  each CPU in the system to evaluate and possibly change its frequency
  periodically, cpufreq governors set up callbacks to be invoked by the
  scheduler on a regular basis (basically on utilization updates).  The
  "old" governors, "ondemand" and "conservative", still do all of their
  work in process context (although that is triggered by the scheduler
  now), but intel_pstate does it all in the callback invoked by the
  scheduler with no need for any additional asynchronous processing.

  Of course, this eliminates the overhead related to the management of
  all those timers, but also it allows the cpufreq governor code to be
  simplified quite a bit.  On top of that, the common code and data
  structures used by the "ondemand" and "conservative" governors are
  cleaned up and made more straightforward and some long-standing and
  quite annoying problems are addressed.  In particular, the handling of
  governor sysfs attributes is modified and the related locking becomes
  more fine grained which allows some concurrency problems to be avoided
  (particularly deadlocks with the core cpufreq code).

  In principle, the new mechanism for triggering frequency updates
  allows utilization information to be passed from the scheduler to
  cpufreq.  Although the current code doesn't make use of it, in the
  works is a new cpufreq governor that will make decisions based on the
  scheduler's utilization data.  That should allow the scheduler and
  cpufreq to work more closely together in the long run.

  In addition to the core and governor changes, cpufreq drivers are
  updated too.  Fixes and optimizations go into intel_pstate, the
  cpufreq-dt driver is updated on top of some modification in the
  Operating Performance Points (OPP) framework and there are fixes and
  other updates in the powernv cpufreq driver.

  Apart from the cpufreq updates there is some new ACPICA material,
  including a fix for a problem introduced by previous ACPICA updates,
  and some less significant changes in the ACPI code, like CPPC code
  optimizations, ACPI processor driver cleanups and support for loading
  ACPI tables from initrd.

  Also updated are the generic power domains framework, the Intel RAPL
  power capping driver and the turbostat utility and we have a bunch of
  traditional assorted fixes and cleanups.

  Specifics:

   - Redesign of cpufreq governors and the intel_pstate driver to make
     them use callbacks invoked by the scheduler to trigger CPU
     frequency evaluation instead of using per-CPU deferrable timers for
     that purpose (Rafael Wysocki).

   - Reorganization and cleanup of cpufreq governor code to make it more
     straightforward and fix some concurrency problems in it (Rafael
     Wysocki, Viresh Kumar).

   - Cleanup and improvements of locking in the cpufreq core (Viresh
     Kumar).

   - Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
     Kumar, Eric Biggers).

   - intel_pstate driver updates including fixes, optimizations and a
     modification to make it enable enable hardware-coordinated P-state
     selection (HWP) by default if supported by the processor (Philippe
     Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
     Franciosi).

   - Operating Performance Points (OPP) framework updates to improve its
     handling of voltage regulators and device clocks and updates of the
     cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).

   - Updates of the powernv cpufreq driver to fix initialization and
     cleanup problems in it and correct its worker thread handling with
     respect to CPU offline, new powernv_throttle tracepoint (Shilpasri
     Bhat).

   - ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).

   - ACPICA updates including one fix for a regression introduced by
     previos changes in the ACPICA code (Bob Moore, Lv Zheng, David Box,
     Colin Ian King).

   - Support for installing ACPI tables from initrd (Lv Zheng).

   - Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
     Chaugule).

   - Support for _HID(ACPI0010) devices (ACPI processor containers) and
     ACPI processor driver cleanups (Sudeep Holla).

   - Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
     Aleksey Makarov).

   - Modification of the ACPI PCI IRQ management code to make it treat
     255 in the Interrupt Line register as "not connected" on x86 (as
     per the specification) and avoid attempts to use that value as a
     valid interrupt vector (Chen Fan).

   - ACPI APEI fixes related to resource leaks (Josh Hunt).

   - Removal of modularity from a few ACPI drivers (BGRT, GHES,
     intel_pmic_crc) that cannot be built as modules in practice (Paul
     Gortmaker).

   - PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
     as a valid resource type (Harb Abdulhamid).

   - New device ID (future AMD I2C controller) in the ACPI driver for
     AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).

   - Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).

   - cpuidle menu governor optimization to avoid a square root
     computation in it (Rasmus Villemoes).

   - Fix for potential use-after-free in the generic device properties
     framework (Heikki Krogerus).

   - Updates of the generic power domains (genpd) framework including
     support for multiple power states of a domain, fixes and debugfs
     output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
     Geert Uytterhoeven).

   - Intel RAPL power capping driver updates to reduce IPI overhead in
     it (Jacob Pan).

   - System suspend/hibernation code cleanups (Eric Biggers, Saurabh
     Sengar).

   - Year 2038 fix for the process freezer (Abhilash Jindal).

   - turbostat utility updates including new features (decoding of more
     registers and CPUID fields, sub-second intervals support, GFX MHz
     and RC6 printout, --out command line option), fixes (syscall jitter
     detection and workaround, reductioin of the number of syscalls
     made, fixes related to Xeon x200 processors, compiler warning
     fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu)"

* tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (182 commits)
  tools/power turbostat: bugfix: TDP MSRs print bits fixing
  tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
  tools/power turbostat: call __cpuid() instead of __get_cpuid()
  tools/power turbostat: indicate SMX and SGX support
  tools/power turbostat: detect and work around syscall jitter
  tools/power turbostat: show GFX%rc6
  tools/power turbostat: show GFXMHz
  tools/power turbostat: show IRQs per CPU
  tools/power turbostat: make fewer systems calls
  tools/power turbostat: fix compiler warnings
  tools/power turbostat: add --out option for saving output in a file
  tools/power turbostat: re-name "%Busy" field to "Busy%"
  tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
  tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
  tools/power turbostat: allow sub-sec intervals
  ACPI / APEI: ERST: Fixed leaked resources in erst_init
  ACPI / APEI: Fix leaked resources
  intel_pstate: Do not skip samples partially
  intel_pstate: Remove freq calculation from intel_pstate_calc_busy()
  intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()
  ...
2016-03-16 14:10:53 -07:00
Linus Torvalds
271ecc5253 Merge branch 'akpm' (patches from Andrew)
Merge first patch-bomb from Andrew Morton:

 - some misc things

 - ofs2 updates

 - about half of MM

 - checkpatch updates

 - autofs4 update

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (120 commits)
  autofs4: fix string.h include in auto_dev-ioctl.h
  autofs4: use pr_xxx() macros directly for logging
  autofs4: change log print macros to not insert newline
  autofs4: make autofs log prints consistent
  autofs4: fix some white space errors
  autofs4: fix invalid ioctl return in autofs4_root_ioctl_unlocked()
  autofs4: fix coding style line length in autofs4_wait()
  autofs4: fix coding style problem in autofs4_get_set_timeout()
  autofs4: coding style fixes
  autofs: show pipe inode in mount options
  kallsyms: add support for relative offsets in kallsyms address table
  kallsyms: don't overload absolute symbol type for percpu symbols
  x86: kallsyms: disable absolute percpu symbols on !SMP
  checkpatch: fix another left brace warning
  checkpatch: improve UNSPECIFIED_INT test for bare signed/unsigned uses
  checkpatch: warn on bare unsigned or signed declarations without int
  checkpatch: exclude asm volatile from complex macro check
  mm: memcontrol: drop unnecessary lru locking from mem_cgroup_migrate()
  mm: migrate: consolidate mem_cgroup_migrate() calls
  mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
  ...
2016-03-16 11:51:08 -07:00
Linus Torvalds
10dc374766 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
 "One of the largest releases for KVM...  Hardly any generic
  changes, but lots of architecture-specific updates.

  ARM:
   - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
   - PMU support for guests
   - 32bit world switch rewritten in C
   - various optimizations to the vgic save/restore code.

  PPC:
   - enabled KVM-VFIO integration ("VFIO device")
   - optimizations to speed up IPIs between vcpus
   - in-kernel handling of IOMMU hypercalls
   - support for dynamic DMA windows (DDW).

  s390:
   - provide the floating point registers via sync regs;
   - separated instruction vs.  data accesses
   - dirty log improvements for huge guests
   - bugfixes and documentation improvements.

  x86:
   - Hyper-V VMBus hypercall userspace exit
   - alternative implementation of lowest-priority interrupts using
     vector hashing (for better VT-d posted interrupt support)
   - fixed guest debugging with nested virtualizations
   - improved interrupt tracking in the in-kernel IOAPIC
   - generic infrastructure for tracking writes to guest
     memory - currently its only use is to speedup the legacy shadow
     paging (pre-EPT) case, but in the future it will be used for
     virtual GPUs as well
   - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits)
  KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch
  KVM: x86: disable MPX if host did not enable MPX XSAVE features
  arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit
  arm64: KVM: vgic-v3: Reset LRs at boot time
  arm64: KVM: vgic-v3: Do not save an LR known to be empty
  arm64: KVM: vgic-v3: Save maintenance interrupt state only if required
  arm64: KVM: vgic-v3: Avoid accessing ICH registers
  KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit
  KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit
  KVM: arm/arm64: vgic-v2: Reset LRs at boot time
  KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty
  KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function
  KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required
  KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers
  KVM: s390: allocate only one DMA page per VM
  KVM: s390: enable STFLE interpretation only if enabled for the guest
  KVM: s390: wake up when the VCPU cpu timer expires
  KVM: s390: step the VCPU timer while in enabled wait
  KVM: s390: protect VCPU cpu timer with a seqcount
  KVM: s390: step VCPU cpu timer during kvm_run ioctl
  ...
2016-03-16 09:55:35 -07:00
Vlastimil Babka
420adbe9fc mm, tracing: unify mm flags handling in tracepoints and printk
In tracepoints, it's possible to print gfp flags in a human-friendly
format through a macro show_gfp_flags(), which defines a translation
array and passes is to __print_flags().  Since the following patch will
introduce support for gfp flags printing in printk(), it would be nice
to reuse the array.  This is not straightforward, since __print_flags()
can't simply reference an array defined in a .c file such as mm/debug.c
- it has to be a macro to allow the macro magic to communicate the
format to userspace tools such as trace-cmd.

The solution is to create a macro __def_gfpflag_names which is used both
in show_gfp_flags(), and to define the gfpflag_names[] array in
mm/debug.c.

On the other hand, mm/debug.c also defines translation tables for page
flags and vma flags, and desire was expressed (but not implemented in
this series) to use these also from tracepoints.  Thus, this patch also
renames the events/gfpflags.h file to events/mmflags.h and moves the
table definitions there, using the same macro approach as for gfpflags.
This allows translating all three kinds of mm-specific flags both in
tracepoints and printk.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
1f7866b4ae mm, tracing: make show_gfp_flags() up to date
The show_gfp_flags() macro provides human-friendly printing of gfp flags
in tracepoints.  However, it is somewhat out of date and missing several
flags.  This patches fills in the missing flags, and distinguishes
properly between GFP_ATOMIC and __GFP_ATOMIC which were both translated
to "GFP_ATOMIC".  More generally, all __GFP_X flags which were
previously printed as GFP_X, are now printed as __GFP_X, since ommiting
the underscores results in output that doesn't actually match the source
code, and can only lead to confusion.  Where both variants are defined
equal (e.g.  _DMA and _DMA32), the variant without underscores are
preferred.

Also add a note in gfp.h so hopefully future changes will be synced
better.

__GFP_MOVABLE is defined twice in include/linux/gfp.h with different
comments.  Leave just the newer one, which was intended to replace the
old one.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Linus Torvalds
710d60cbf1 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cpu hotplug updates from Thomas Gleixner:
 "This is the first part of the ongoing cpu hotplug rework:

   - Initial implementation of the state machine

   - Runs all online and prepare down callbacks on the plugged cpu and
     not on some random processor

   - Replaces busy loop waiting with completions

   - Adds tracepoints so the states can be followed"

More detailed commentary on this work from an earlier email:
 "What's wrong with the current cpu hotplug infrastructure?

   - Asymmetry

     The hotplug notifier mechanism is asymmetric versus the bringup and
     teardown.  This is mostly caused by the notifier mechanism.

   - Largely undocumented dependencies

     While some notifiers use explicitely defined notifier priorities,
     we have quite some notifiers which use numerical priorities to
     express dependencies without any documentation why.

   - Control processor driven

     Most of the bringup/teardown of a cpu is driven by a control
     processor.  While it is understandable, that preperatory steps,
     like idle thread creation, memory allocation for and initialization
     of essential facilities needs to be done before a cpu can boot,
     there is no reason why everything else must run on a control
     processor.  Before this patch series, bringup looks like this:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu

       bring the rest up

   - All or nothing approach

     There is no way to do partial bringups.  That's something which is
     really desired because we waste e.g.  at boot substantial amount of
     time just busy waiting that the cpu comes to life.  That's stupid
     as we could very well do preparatory steps and the initial IPI for
     other cpus and then go back and do the necessary low level
     synchronization with the freshly booted cpu.

   - Minimal debuggability

     Due to the notifier based design, it's impossible to switch between
     two stages of the bringup/teardown back and forth in order to test
     the correctness.  So in many hotplug notifiers the cancel
     mechanisms are either not existant or completely untested.

   - Notifier [un]registering is tedious

     To [un]register notifiers we need to protect against hotplug at
     every callsite.  There is no mechanism that bringup/teardown
     callbacks are issued on the online cpus, so every caller needs to
     do it itself.  That also includes error rollback.

  What's the new design?

     The base of the new design is a symmetric state machine, where both
     the control processor and the booting/dying cpu execute a well
     defined set of states.  Each state is symmetric in the end, except
     for some well defined exceptions, and the bringup/teardown can be
     stopped and reversed at almost all states.

     So the bringup of a cpu will look like this in the future:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu

                                       bring itself up

     The synchronization step does not require the control cpu to wait.
     That mechanism can be done asynchronously via a worker or some
     other mechanism.

     The teardown can be made very similar, so that the dying cpu cleans
     up and brings itself down.  Cleanups which need to be done after
     the cpu is gone, can be scheduled asynchronously as well.

  There is a long way to this, as we need to refactor the notion when a
  cpu is available.  Today we set the cpu online right after it comes
  out of the low level bringup, which is not really correct.

  The proper mechanism is to set it to available, i.e. cpu local
  threads, like softirqd, hotplug thread etc. can be scheduled on that
  cpu, and once it finished all booting steps, it's set to online, so
  general workloads can be scheduled on it.  The reverse happens on
  teardown.  First thing to do is to forbid scheduling of general
  workloads, then teardown all the per cpu resources and finally shut it
  off completely.

  This patch series implements the basic infrastructure for this at the
  core level.  This includes the following:

   - Basic state machine implementation with well defined states, so
     ordering and prioritization can be expressed.

   - Interfaces to [un]register state callbacks

     This invokes the bringup/teardown callback on all online cpus with
     the proper protection in place and [un]installs the callbacks in
     the state machine array.

     For callbacks which have no particular ordering requirement we have
     a dynamic state space, so that drivers don't have to register an
     explicit hotplug state.

     If a callback fails, the code automatically does a rollback to the
     previous state.

   - Sysfs interface to drive the state machine to a particular step.

     This is only partially functional today.  Full functionality and
     therefor testability will be achieved once we converted all
     existing hotplug notifiers over to the new scheme.

   - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying
     processor:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu
       wait for boot
                                       bring itself up

                                       Signal completion to control cpu

     In a previous step of this work we've done a full tree mechanical
     conversion of all hotplug notifiers to the new scheme.  The balance
     is a net removal of about 4000 lines of code.

     This is not included in this series, as we decided to take a
     different approach.  Instead of mechanically converting everything
     over, we will do a proper overhaul of the usage sites one by one so
     they nicely fit into the symmetric callback scheme.

     I decided to do that after I looked at the ugliness of some of the
     converted sites and figured out that their hotplug mechanism is
     completely buggered anyway.  So there is no point to do a
     mechanical conversion first as we need to go through the usage
     sites one by one again in order to achieve a full symmetric and
     testable behaviour"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  cpu/hotplug: Document states better
  cpu/hotplug: Fix smpboot thread ordering
  cpu/hotplug: Remove redundant state check
  cpu/hotplug: Plug death reporting race
  rcu: Make CPU_DYING_IDLE an explicit call
  cpu/hotplug: Make wait for dead cpu completion based
  cpu/hotplug: Let upcoming cpu bring itself fully up
  arch/hotplug: Call into idle with a proper state
  cpu/hotplug: Move online calls to hotplugged cpu
  cpu/hotplug: Create hotplug threads
  cpu/hotplug: Split out the state walk into functions
  cpu/hotplug: Unpark smpboot threads from the state machine
  cpu/hotplug: Move scheduler cpu_online notifier to hotplug core
  cpu/hotplug: Implement setup/removal interface
  cpu/hotplug: Make target state writeable
  cpu/hotplug: Add sysfs state interface
  cpu/hotplug: Hand in target state to _cpu_up/down
  cpu/hotplug: Convert the hotplugged cpu work to a state machine
  cpu/hotplug: Convert to a state machine for the control processor
  cpu/hotplug: Add tracepoints
  ...
2016-03-15 13:50:29 -07:00
Linus Torvalds
e23604edac Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull NOHZ updates from Ingo Molnar:
 "NOHZ enhancements, by Frederic Weisbecker, which reorganizes/refactors
  the NOHZ 'can the tick be stopped?' infrastructure and related code to
  be data driven, and harmonizes the naming and handling of all the
  various properties"

[ This makes the ugly "fetch_or()" macro that the scheduler used
  internally a new generic helper, and does a bad job at it.

  I'm pulling it, but I've asked Ingo and Frederic to get this
  fixed up ]

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched-clock: Migrate to use new tick dependency mask model
  posix-cpu-timers: Migrate to use new tick dependency mask model
  sched: Migrate sched to use new tick dependency mask model
  sched: Account rr tasks
  perf: Migrate perf to use new tick dependency mask model
  nohz: Use enum code for tick stop failure tracing message
  nohz: New tick dependency mask
  nohz: Implement wide kick on top of irq work
  atomic: Export fetch_or()
2016-03-14 19:44:38 -07:00
Rafael J. Wysocki
4ed3900427 Merge branch 'pm-cpufreq'
* pm-cpufreq: (94 commits)
  intel_pstate: Do not skip samples partially
  intel_pstate: Remove freq calculation from intel_pstate_calc_busy()
  intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()
  intel_pstate: Optimize calculation for max/min_perf_adj
  intel_pstate: Remove extra conversions in pid calculation
  cpufreq: Move scheduler-related code to the sched directory
  Revert "cpufreq: postfix policy directory with the first CPU in related_cpus"
  cpufreq: Reduce cpufreq_update_util() overhead a bit
  cpufreq: Select IRQ_WORK if CPU_FREQ_GOV_COMMON is set
  cpufreq: Remove 'policy->governor_enabled'
  cpufreq: Rename __cpufreq_governor() to cpufreq_governor()
  cpufreq: Relocate handle_update() to kill its declaration
  cpufreq: governor: Drop unnecessary checks from show() and store()
  cpufreq: governor: Fix race in dbs_update_util_handler()
  cpufreq: governor: Make gov_set_update_util() static
  cpufreq: governor: Narrow down the dbs_data_mutex coverage
  cpufreq: governor: Make dbs_data_mutex static
  cpufreq: governor: Relocate definitions of tuners structures
  cpufreq: governor: Move per-CPU data to the common code
  cpufreq: governor: Make governor private data per-policy
  ...
2016-03-14 14:22:03 +01:00
Mark Brown
3b22371e20 Merge remote-tracking branches 'asoc/fix/jack', 'asoc/fix/max98088', 'asoc/fix/max98095', 'asoc/fix/omap', 'asoc/fix/pxa' and 'asoc/fix/qcom-be' into asoc-linus 2016-03-05 21:26:45 +09:00
Frederic Weisbecker
e6e6cc22e0 nohz: Use enum code for tick stop failure tracing message
It makes nohz tracing more lightweight, standard and easier to parse.

Examples:

       user_loop-2904  [007] d..1   517.701126: tick_stop: success=1 dependency=NONE
       user_loop-2904  [007] dn.1   518.021181: tick_stop: success=0 dependency=SCHED
    posix_timers-6142  [007] d..1  1739.027400: tick_stop: success=0 dependency=POSIX_TIMER
       user_loop-5463  [007] dN.1  1185.931939: tick_stop: success=0 dependency=PERF_EVENTS

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2016-03-02 16:42:15 +01:00
Thomas Gleixner
5ba9ac8e2c cpu/hotplug: Add tracepoints
We want to trace the hotplug machinery. Add tracepoints to track the
invocation of callbacks and their result.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel <riel@redhat.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/20160226182340.593563875@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-01 20:36:54 +01:00
Arnd Bergmann
f4833a519a ASoC: trace: fix printing jack name
After a change to the snd_jack structure, the 'name' member
is no longer available in all configurations, which results in a
build failure in the tracing code:

include/trace/events/asoc.h: In function 'trace_event_raw_event_snd_soc_jack_report':
include/trace/events/asoc.h:240:32: error: 'struct snd_jack' has no member named 'name'

The name field is normally initialized from the card shortname and
the jack "id" field:

        snprintf(jack->name, sizeof(jack->name), "%s %s",
                 card->shortname, jack->id);

This changes the tracing output to just contain the 'id' by
itself, which slightly changes the output format but avoids the
link error and is hopefully still enough to see what is going on.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: fe0d128c57 ("ALSA: jack: Allow building the jack layer without input device")
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-26 10:52:48 +09:00
Christian Borntraeger
6b6de68c63 KVM: halt_polling: improve grow/shrink settings
Right now halt_poll_ns can be change during runtime. The
grow and shrink factors can only be set during module load.
Lets fix several aspects of grow shrink:
- make grow/shrink changeable by root
- make all variables unsigned int
- read the variables once to prevent races

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 18:48:29 +01:00
Sowmini Varadhan
46fcc6ef9d sunvnet: Add support for perf LDC event tracing
Add perf event macros for support of tracing and instrumentation
of LDC state machine

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-07 14:13:05 -05:00
Shilpasri G Bhat
0306e481d4 cpufreq: powernv/tracing: Add powernv_throttle tracepoint
This patch adds the powernv_throttle tracepoint to trace the CPU
frequency throttling event, which is used by the powernv-cpufreq
driver in POWER8.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-02-05 02:38:02 +01:00
Linus Torvalds
26cd83670f Merge tag 'trace-v4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull minor tracing fixes from Steven Rostedt:
 "This includes three minor fixes, mostly due to cut-and-paste issues.

  The first is a cut and paste issue that changed the amount of stack to
  skip when tracing a stack dump from 0 to 6, which basically made the
  stack disappear for small stack traces.

  The second fix is just removing an unused field in a struct that is no
  longer used, and currently just wastes space.

  The third is another cut-and-paste fix that had a tracepoint recording
  the wrong field (it was recording the previous field a second time)"

* tag 'trace-v4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/dma-buf/fence: Fix timeline str value on fence_annotate_wait_on
  ftrace: Remove unused nr_trampolines var
  tracing: Fix stacktrace skip depth in trace_buffer_unlock_commit_regs()
2016-01-28 17:00:50 -08:00
Gustavo Padovan
7fd1361599 tracing/dma-buf/fence: Fix timeline str value on fence_annotate_wait_on
timeline was wrongly assigned with ->get_driver_name().

Link: http://lkml.kernel.org/r/1453376895-30747-1-git-send-email-gustavo@padovan.org

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-01-25 11:03:17 -05:00
Linus Torvalds
048ccca8c1 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma updates from Doug Ledford:
 "Initial roundup of 4.5 merge window patches

   - Remove usage of ib_query_device and instead store attributes in
     ib_device struct

   - Move iopoll out of block and into lib, rename to irqpoll, and use
     in several places in the rdma stack as our new completion queue
     polling library mechanism.  Update the other block drivers that
     already used iopoll to use the new mechanism too.

   - Replace the per-entry GID table locks with a single GID table lock

   - IPoIB multicast cleanup

   - Cleanups to the IB MR facility

   - Add support for 64bit extended IB counters

   - Fix for netlink oops while parsing RDMA nl messages

   - RoCEv2 support for the core IB code

   - mlx4 RoCEv2 support

   - mlx5 RoCEv2 support

   - Cross Channel support for mlx5

   - Timestamp support for mlx5

   - Atomic support for mlx5

   - Raw QP support for mlx5

   - MAINTAINERS update for mlx4/mlx5

   - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates

   - Add support for remote invalidate to the iSER driver (pushed
     through the RDMA tree due to dependencies, acknowledged by nab)

   - Update to NFSoRDMA (pushed through the RDMA tree due to
     dependencies, acknowledged by Bruce)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (169 commits)
  IB/mlx5: Unify CQ create flags check
  IB/mlx5: Expose Raw Packet QP to user space consumers
  {IB, net}/mlx5: Move the modify QP operation table to mlx5_ib
  IB/mlx5: Support setting Ethernet priority for Raw Packet QPs
  IB/mlx5: Add Raw Packet QP query functionality
  IB/mlx5: Add create and destroy functionality for Raw Packet QP
  IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types
  IB/mlx5: Allocate a Transport Domain for each ucontext
  net/mlx5_core: Warn on unsupported events of QP/RQ/SQ
  net/mlx5_core: Add RQ and SQ event handling
  net/mlx5_core: Export transport objects
  IB/mlx5: Expose CQE version to user-space
  IB/mlx5: Add CQE version 1 support to user QPs and SRQs
  IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext
  IB/sa: Fix netlink local service GFP crash
  IB/srpt: Remove redundant wc array
  IB/qib: Improve ipoib UD performance
  IB/mlx4: Advertise RoCE v2 support
  IB/mlx4: Create and use another QP1 for RoCEv2
  IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
  ...
2016-01-23 18:45:06 -08:00
Linus Torvalds
391f2a16b7 Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
 "Some locking and page fault bug fixes from Jan Kara, some ext4
  encryption fixes from me, and Li Xi's Project Quota commits"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  fs: clean up the flags definition in uapi/linux/fs.h
  ext4: add FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR interface support
  ext4: add project quota support
  ext4: adds project ID support
  ext4 crypto: simplify interfaces to directory entry insert functions
  ext4 crypto: add missing locking for keyring_key access
  ext4: use pre-zeroed blocks for DAX page faults
  ext4: implement allocation of pre-zeroed blocks
  ext4: provide ext4_issue_zeroout()
  ext4: get rid of EXT4_GET_BLOCKS_NO_LOCK flag
  ext4: document lock ordering
  ext4: fix races of writeback with punch hole and zero range
  ext4: fix races between buffered IO and collapse / insert range
  ext4: move unlocked dio protection from ext4_alloc_file_blocks()
  ext4: fix races between page faults and hole punching
2016-01-22 11:23:35 -08:00
yalin wang
16fd0fe4aa mm: fix kernel crash in khugepaged thread
This crash is caused by NULL pointer deference, in page_to_pfn() marco,
when page == NULL :

  Unable to handle kernel NULL pointer dereference at virtual address 00000000
  Internal error: Oops: 94000006 [#1] SMP
  Modules linked in:
  CPU: 1 PID: 26 Comm: khugepaged Tainted: G        W       4.3.0-rc6-next-20151022ajb-00001-g32f3386-dirty #3
  PC is at khugepaged+0x378/0x1af8
  LR is at khugepaged+0x418/0x1af8
  Process khugepaged (pid: 26, stack limit = 0xffffffc079638020)
  Call trace:
    khugepaged+0x378/0x1af8
    kthread+0xdc/0xf4
    ret_from_fork+0xc/0x40
  Code: 35001700 f0002c60 aa0703e3 f9009fa0 (f94000e0)
  ---[ end trace 637503d8e28ae69e  ]---
  Kernel panic - not syncing: Fatal exception
  CPU2: stopping
  CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D W       4.3.0-rc6-next-20151022ajb-00001-g32f3386-dirty #3
  Hardware name: linux,dummy-virt (DT)

[akpm@linux-foundation.org: fix fat-fingered merge resolution]
Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-21 17:20:51 -08:00
Linus Torvalds
c1a198d923 Merge branch 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
 "This has our usual assortment of fixes and cleanups, but the biggest
  change included is Omar Sandoval's free space tree.  It's not the
  default yet, mounting -o space_cache=v2 enables it and sets a readonly
  compat bit.  The tree can actually be deleted and regenerated if there
  are any problems, but it has held up really well in testing so far.

  For very large filesystems (30T+) our existing free space caching code
  can end up taking a huge amount of time during commits.  The new tree
  based code is faster and less work overall to update as the commit
  progresses.

  Omar worked on this during the summer and we'll hammer on it in
  production here at FB over the next few months"

* 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (73 commits)
  Btrfs: fix fitrim discarding device area reserved for boot loader's use
  Btrfs: Check metadata redundancy on balance
  btrfs: statfs: report zero available if metadata are exhausted
  btrfs: preallocate path for snapshot creation at ioctl time
  btrfs: allocate root item at snapshot ioctl time
  btrfs: do an allocation earlier during snapshot creation
  btrfs: use smaller type for btrfs_path locks
  btrfs: use smaller type for btrfs_path lowest_level
  btrfs: use smaller type for btrfs_path reada
  btrfs: cleanup, use enum values for btrfs_path reada
  btrfs: constify static arrays
  btrfs: constify remaining structs with function pointers
  btrfs tests: replace whole ops structure for free space tests
  btrfs: use list_for_each_entry* in backref.c
  btrfs: use list_for_each_entry_safe in free-space-cache.c
  btrfs: use list_for_each_entry* in check-integrity.c
  Btrfs: use linux/sizes.h to represent constants
  btrfs: cleanup, remove stray return statements
  btrfs: zero out delayed node upon allocation
  btrfs: pass proper enum type to start_transaction()
  ...
2016-01-18 12:44:40 -08:00