Commit Graph

20217 Commits

Author SHA1 Message Date
Al Viro
c0fec3a98b Merge branch 'iocb' into for-next 2015-04-11 22:24:41 -04:00
Thomas Gleixner
b7dccbea6b Merge tag 'irqchip-core-4.1-3' of git://git.infradead.org/users/jcooper/linux into irq/core
irqchip core change for v4.1 (round 3) from Jason Cooper

 Purge the gic_arch_extn hacks and abuse by using the new stacked domains

   NOTE: Due to the nature of these changes, patches crossing subsystems have
         been kept together in their own branches.

    - tegra
       - Handle the LIC properly

    - omap
       - Convert crossbar to stacked domains
       - kill arm,routable-irqs in GIC binding

    - exynos
       - Convert PMU wakeup to stacked domains

    - shmobile, ux500, zynq (irq_set_wake branch)
       - Switch from abusing gic_arch_extn to using gic_set_irqchip_flags
2015-04-11 11:17:28 +02:00
Rafael J. Wysocki
be77002101 Merge back earlier suspend/hibernate material for v4.1. 2015-04-10 12:01:59 +02:00
Linus Torvalds
e5e02de066 Merge tag 'pm+acpi-4.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
 "These are stable-candidate fixes of some recently reported issues in
  the cpufreq core, cpuidle core, the ACPI cpuidle driver and the
  hibernate core.

  Specifics:

   - Revert a 3.17 hibernate commit that was supposed to fix an issue
     related to e820 reserved regions, but broke resume from hibernation
     on Lenovo x230 (Rafael J Wysocki).

   - Prevent the ACPI cpuidle driver from overwriting the name and
     description of the C0 state set by the core when the list of
     C-states changes (Thomas Schlichter).

   - Remove the no longer needed state_count field from struct
     cpuidle_device which prevents the list of C-states shown by the
     sysfs interface from becoming incorrect when the current number of
     them is different from the number of C-states on boot (Bartlomiej
     Zolnierkiewicz).

   - The cpufreq core updates the policy object of the only online CPU
     during system resume to make it reflect the current hardware state,
     but it always assumes that CPU to be CPU0 which need not be the
     case, so fix the code to avoid that assumption (Viresh Kumar)"

* tag 'pm+acpi-4.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "PM / hibernate: avoid unsafe pages in e820 reserved regions"
  cpuidle: ACPI: do not overwrite name and description of C0
  cpuidle: remove state_count field from struct cpuidle_device
  cpufreq: Schedule work for the first-online CPU on resume
2015-04-09 17:44:27 -07:00
Rafael J. Wysocki
b2d5fb97d3 Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
* pm-sleep:
  Revert "PM / hibernate: avoid unsafe pages in e820 reserved regions"

* pm-cpufreq:
  cpufreq: Schedule work for the first-online CPU on resume

* pm-cpuidle:
  cpuidle: ACPI: do not overwrite name and description of C0
  cpuidle: remove state_count field from struct cpuidle_device
2015-04-09 23:25:23 +02:00
Jason Low
01ac33c1f9 locking/mutex: Further simplify mutex_spin_on_owner()
Similar to what Linus suggested for rwsem_spin_on_owner(), in
mutex_spin_on_owner() instead of having while (true) and
breaking out of the spin loop on lock->owner != owner, we can
have the loop directly check for while (lock->owner == owner) to
improve the readability of the code.

It also shrinks the code a bit:

   text    data     bss     dec     hex filename
   3721       0       0    3721     e89 mutex.o.before
   3705       0       0    3705     e79 mutex.o.after

Signed-off-by: Jason Low <jason.low2@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1428521960-5268-2-git-send-email-jason.low2@hp.com
[ Added code generation info. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-09 08:10:23 +02:00
Al Viro
237dae8890 Merge branch 'iocb' into for-davem
trivial conflict in net/socket.c and non-trivial one in crypto -
that one had evaded aio_complete() removal.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09 00:01:38 -04:00
Linus Torvalds
b97fdef8e6 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "Three fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: numa: disable change protection for vma(VM_HUGETLB)
  include/linux/dmapool.h: declare struct device
  mm: move zone lock to a different cache line than order-0 free page lists
2015-04-08 14:42:49 -07:00
Linus Torvalds
3afe9f8496 Copy the kernel module data from user space in chunks
Unlike most (all?) other copies from user space, kernel module loading
is almost unlimited in size.  So we do a potentially huge
"copy_from_user()" when we copy the module data from user space to the
kernel buffer, which can be a latency concern when preemption is
disabled (or voluntary).

Also, because 'copy_from_user()' clears the tail of the kernel buffer on
failures, even a *failed* copy can end up wasting a lot of time.

Normally neither of these are concerns in real life, but they do trigger
when doing stress-testing with trinity.  Running in a VM seems to add
its own overheadm causing trinity module load testing to even trigger
the watchdog.

The simple fix is to just chunk up the module loading, so that it never
tries to copy insanely big areas in one go.  That bounds the latency,
and also the amount of (unnecessarily, in this case) cleared memory for
the failure case.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-08 14:35:48 -07:00
Marc Zyngier
1b7047edfc genirq: Allow the irqchip state of an IRQ to be save/restored
There is a number of cases where a kernel subsystem may want to
introspect the state of an interrupt at the irqchip level:

- When a peripheral is shared between virtual machines,
  its interrupt state becomes part of the guest's state,
  and must be switched accordingly. KVM on arm/arm64 requires
  this for its guest-visible timer
- Some GPIO controllers seem to require peeking into the
  interrupt controller they are connected to to report
  their internal state

This seem to be a pattern that is common enough for the core code
to try and support this without too many horrible hacks. Introduce
a pair of accessors (irq_get_irqchip_state/irq_set_irqchip_state)
to retrieve the bits that can be of interest to another subsystem:
pending, active, and masked.

- irq_get_irqchip_state returns the state of the interrupt according
  to a parameter set to IRQCHIP_STATE_PENDING, IRQCHIP_STATE_ACTIVE,
  IRQCHIP_STATE_MASKED or IRQCHIP_STATE_LINE_LEVEL.
- irq_set_irqchip_state similarly sets the state of the interrupt.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Tested-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Phong Vo <pvo@apm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Tin Huynh <tnhuynh@apm.com>
Cc: Y Vo <yvo@apm.com>
Cc: Toan Le <toanle@apm.com>
Cc: Bjorn Andersson <bjorn@kryo.se>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1426676484-21812-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-08 23:28:28 +02:00
Marc Zyngier
fe0c52fc00 genirq: MSI: Fix freeing of unallocated MSI
While debugging an unrelated issue with the GICv3 ITS driver, the
following trace triggered:

WARNING: CPU: 1 PID: 1 at kernel/irq/irqdomain.c:1121 irq_domain_free_irqs+0x160/0x17c()
NULL pointer, cannot free irq
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W      3.19.0-rc6+ #3690
Hardware name: FVP Base (DT)
Call trace:
[<ffffffc000089398>] dump_backtrace+0x0/0x13c
[<ffffffc0000894e4>] show_stack+0x10/0x1c
[<ffffffc00066d134>] dump_stack+0x74/0x94
[<ffffffc0000a92f8>] warn_slowpath_common+0x9c/0xd4
[<ffffffc0000a938c>] warn_slowpath_fmt+0x5c/0x80
[<ffffffc0000ee04c>] irq_domain_free_irqs+0x15c/0x17c
[<ffffffc0000ef918>] msi_domain_free_irqs+0x58/0x74
[<ffffffc000386f58>] free_msi_irqs+0xb4/0x1c0

    // The msi_prepare callback fails here

[<ffffffc0003872c0>] pci_enable_msix+0x25c/0x3d4
[<ffffffc00038746c>] pci_enable_msix_range+0x34/0x80
[<ffffffc0003924ac>] vp_try_to_find_vqs+0xec/0x528
[<ffffffc000392954>] vp_find_vqs+0x6c/0xa8
[<ffffffc0003ee2a8>] init_vq+0x120/0x248
[<ffffffc0003eefb0>] virtblk_probe+0xb0/0x6bc
[<ffffffc00038fc34>] virtio_dev_probe+0x17c/0x214
[<ffffffc0003d4a04>] driver_probe_device+0x7c/0x23c
[<ffffffc0003d4cb0>] __driver_attach+0x98/0xa0
[<ffffffc0003d2c60>] bus_for_each_dev+0x60/0xb4
[<ffffffc0003d455c>] driver_attach+0x1c/0x28
[<ffffffc0003d41b0>] bus_add_driver+0x150/0x208
[<ffffffc0003d54c0>] driver_register+0x64/0x130
[<ffffffc00038f9e8>] register_virtio_driver+0x24/0x68
[<ffffffc00091320c>] init+0x70/0xac
[<ffffffc0000828f0>] do_one_initcall+0x94/0x1d0
[<ffffffc0008e9b00>] kernel_init_freeable+0x144/0x1e4
[<ffffffc00066a434>] kernel_init+0xc/0xd8
---[ end trace f9ee562a77cc7bae ]---

The ITS msi_prepare callback having failed, we end-up trying to
free MSIs that have never been allocated. Oddly enough, the kernel
is pretty upset about it.

It turns out that this behaviour was expected before the MSI domain
was introduced (and dealt with in arch_teardown_msi_irqs).

The obvious fix is to detect this early enough and bail out.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/1422299419-6051-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-08 23:28:28 +02:00
Thomas Gleixner
462b69b1e4 Merge branch 'linus' into irq/core to get the GIC updates which
conflict with pending GIC changes.

Conflicts:
	drivers/usb/isp1760/isp1760-core.c
2015-04-08 23:26:21 +02:00
Steven Rostedt (Red Hat)
9828413d47 tracing: Add enum_map file to show enums that have been mapped
Add a enum_map file in the tracing directory to see what enums have been
saved to convert in the print fmt files.

As this requires the enum mapping to be persistent in memory, it is only
created if the new config option CONFIG_TRACE_ENUM_MAP_FILE is enabled.
This is for debugging and will increase the persistent memory footprint
of the kernel.

Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-04-08 10:58:35 -04:00
Steven Rostedt (Red Hat)
3673b8e4ce tracing: Allow for modules to convert their enums to values
Update the infrastructure such that modules that declare TRACE_DEFINE_ENUM()
will have those enums converted into their values in the tracepoint
print fmt strings.

Link: http://lkml.kernel.org/r/87vbhjp74q.fsf@rustcorp.com.au

Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-04-08 09:39:57 -04:00
Steven Rostedt (Red Hat)
0c564a538a tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
Several tracepoints use the helper functions __print_symbolic() or
__print_flags() and pass in enums that do the mapping between the
binary data stored and the value to print. This works well for reading
the ASCII trace files, but when the data is read via userspace tools
such as perf and trace-cmd, the conversion of the binary value to a
human string format is lost if an enum is used, as userspace does not
have access to what the ENUM is.

For example, the tracepoint trace_tlb_flush() has:

 __print_symbolic(REC->reason,
    { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
    { TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
    { TLB_LOCAL_SHOOTDOWN, "local shootdown" },
    { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })

Which maps the enum values to the strings they represent. But perf and
trace-cmd do no know what value TLB_LOCAL_MM_SHOOTDOWN is, and would
not be able to map it.

With TRACE_DEFINE_ENUM(), developers can place these in the event header
files and ftrace will convert the enums to their values:

By adding:

 TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
 TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
 TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
 TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);

 $ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format
[...]
 __print_symbolic(REC->reason,
    { 0, "flush on task switch" },
    { 1, "remote shootdown" },
    { 2, "local shootdown" },
    { 3, "local mm shootdown" })

The above is what userspace expects to see, and tools do not need to
be modified to parse them.

Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org

Cc: Guilherme Cox <cox@computer.org>
Cc: Tony Luck <tony.luck@gmail.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-04-08 09:39:56 -04:00
Naoya Horiguchi
6b79c57b92 mm: numa: disable change protection for vma(VM_HUGETLB)
Currently when a process accesses a hugetlb range protected with
PROTNONE, unexpected COWs are triggered, which finally puts the hugetlb
subsystem into a broken/uncontrollable state, where for example
h->resv_huge_pages is subtracted too much and wraps around to a very
large number, and the free hugepage pool is no longer maintainable.

This patch simply stops changing protection for vma(VM_HUGETLB) to fix
the problem.  And this also allows us to avoid useless overhead of minor
faults.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Suggested-by: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-07 16:45:33 -07:00
Greg Kroah-Hartman
b3e3bf2ef2 Merge 4.0-rc7 into tty-next
We want the fixes in here as well, also to help out with merge issues.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-07 11:07:20 +02:00
Rafael J. Wysocki
f82daee49c Revert "PM / hibernate: avoid unsafe pages in e820 reserved regions"
Commit 84c91b7ae0 (PM / hibernate: avoid unsafe pages in e820 reserved
regions) is reported to make resume from hibernation on Lenovo x230
unreliable, so revert it.

We will revisit the issue the commit in question was supposed to fix
in the future.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=96111
Reported-by: rhn <kebuac.rhn@porcupinefactory.org>
Cc: 3.17+ <stable@vger.kernel.org> # 3.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-07 01:13:23 +02:00
Frederic Weisbecker
6ba94429c8 workqueue: Reorder sysfs code
The sysfs code usually belongs to the botom of the file since it deals
with high level objects. In the workqueue code it's misplaced and such
that we'll need to work around functions references to allow the sysfs
code to call APIs like apply_workqueue_attrs().

Lets move that block further in the file, almost the botom.

And declare workqueue_sysfs_unregister() just before destroy_workqueue()
which reference it.

tj: Moved workqueue_sysfs_unregister() forward declaration where other
    forward declarations are.

Suggested-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2015-04-06 11:16:04 -04:00
Rafael J. Wysocki
def747087e timers/PM: Drop unnecessary braces from tick_freeze()
Some braces in tick_freeze() are not necessary, so drop them.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: peterz@infradead.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1534128.H5hN3KBFB4@vostro.rjw.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03 15:15:52 +02:00
Rafael J. Wysocki
422fe7502e timers/PM: Fix up tick_unfreeze()
A recent conflict resolution has left tick_resume() in
tick_unfreeze() which leads to an unbalanced execution of
tick_resume_broadcast() every time that function runs.

Fix that by replacing the tick_resume() in tick_unfreeze()
with tick_resume_local() as appropriate.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: boris.ostrovsky@oracle.com
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: peterz@infradead.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8099075.V0LvN3pQAV@vostro.rjw.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03 15:15:51 +02:00
Borislav Petkov
62a935b256 sched/core: Drop debugging leftover trace_printk call
Commit:

  3c18d447b3 ("sched/core: Check for available DL bandwidth in cpuset_cpu_inactive()")

forgot a trace_printk() debugging piece in and Steve's banner screamed
in dmesg. Remove it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1428050570-21041-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03 10:48:25 +02:00
Thomas Gleixner
347c6f6dda timekeeping: Get rid of stale comment
Arch specific management of xtime/jiffies/wall_to_monotonic is
gone for quite a while. Zap the stale comment.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/2422730.dmO29q661S@vostro.rjw.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03 08:44:37 +02:00
Thomas Gleixner
a49b116dcb clockevents: Cleanup dead cpu explicitely
clockevents_notify() is a leftover from the early design of the
clockevents facility. It's really not a notification mechanism,
it's a multiplex call. We are way better off to have explicit
calls instead of this monstrosity.

Split out the cleanup function for a dead cpu and invoke it
directly from the cpu down code. Make it conditional on
CPU_HOTPLUG as well.

Temporary change, will be refined in the future.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ Rebased, added clockevents_notify() removal ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1735025.raBZdQHM3m@vostro.rjw.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03 08:44:37 +02:00
Thomas Gleixner
52c063d1ad clockevents: Make tick handover explicit
clockevents_notify() is a leftover from the early design of the
clockevents facility. It's really not a notification mechanism,
it's a multiplex call. We are way better off to have explicit
calls instead of this monstrosity.

Split out the tick_handover call and invoke it explicitely from
the hotplug code. Temporary solution will be cleaned up in later
patches.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ Rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1658173.RkEEILFiQZ@vostro.rjw.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03 08:44:36 +02:00