Commit Graph

19679 Commits

Author SHA1 Message Date
Linus Torvalds
0ba97bc4b4 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Ingo Molnar:
 "The main changes in this cycle were:

   - rework hrtimer expiry calculation in hrtimer_interrupt(): the
     previous code had a subtle bug where expiry caching would miss an
     expiry, resulting in occasional bogus (late) expiry of hrtimers.

   - continuing Y2038 fixes

   - ktime division optimization

   - misc smaller fixes and cleanups"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  hrtimer: Make __hrtimer_get_next_event() static
  rtc: Convert rtc_set_ntp_time() to use timespec64
  rtc: Remove redundant rtc_valid_tm() from rtc_hctosys()
  rtc: Modify rtc_hctosys() to address y2038 issues
  rtc: Update rtc-dev to use y2038-safe time interfaces
  rtc: Update interface.c to use y2038-safe time interfaces
  time: Expose get_monotonic_boottime64 for in-kernel use
  time: Expose getboottime64 for in-kernel uses
  ktime: Optimize ktime_divns for constant divisors
  hrtimer: Prevent stale expiry time in hrtimer_interrupt()
  ktime.h: Introduce ktime_ms_delta
2015-02-09 16:33:07 -08:00
Linus Torvalds
5b9b28a63f Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main scheduler changes in this cycle were:

   - various sched/deadline fixes and enhancements

   - rescheduling latency fixes/cleanups

   - rework the rq->clock code to be more consistent and more robust.

   - minor micro-optimizations

   - ->avg.decay_count fixes

   - add a stack overflow check to might_sleep()

   - idle-poll handler fix, possibly resulting in power savings

   - misc smaller updates and fixes"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/Documentation: Remove unneeded word
  sched/wait: Introduce wait_on_bit_timeout()
  sched: Pull resched loop to __schedule() callers
  sched/deadline: Remove cpu_active_mask from cpudl_find()
  sched: Fix hrtick_start() on UP
  sched/deadline: Avoid pointless __setscheduler()
  sched/deadline: Fix stale yield state
  sched/deadline: Fix hrtick for a non-leftmost task
  sched/deadline: Modify cpudl::free_cpus to reflect rd->online
  sched/idle: Add missing checks to the exit condition of cpu_idle_poll()
  sched: Fix missing preemption opportunity
  sched/rt: Reduce rq lock contention by eliminating locking of non-feasible target
  sched/debug: Print rq->clock_task
  sched/core: Rework rq->clock update skips
  sched/core: Validate rq_clock*() serialization
  sched/core: Remove check of p->sched_class
  sched/fair: Fix sched_entity::avg::decay_count initialization
  sched/debug: Fix potential call to __ffs(0) in sched_show_task()
  sched/debug: Check for stack overflow in ___might_sleep()
  sched/fair: Fix the dealing with decay_count in __synchronize_entity_decay()
2015-02-09 16:06:06 -08:00
Linus Torvalds
a4cbbf549a Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel side changes:

   - AMD range breakpoints support:

     Extend breakpoint tools and core to support address range through
     perf event with initial backend support for AMD extended
     breakpoints.

     The syntax is:

         perf record -e mem:addr/len:type

     For example set write breakpoint from 0x1000 to 0x1200 (0x1000 + 512)

         perf record -e mem:0x1000/512:w

   - event throttling/rotating fixes

   - various event group handling fixes, cleanups and general paranoia
     code to be more robust against bugs in the future.

    - kernel stack overhead fixes

  User-visible tooling side changes:

   - Show precise number of samples in at the end of a 'record' session,
     if processing build ids, since we will then traverse the whole
     perf.data file and see all the PERF_RECORD_SAMPLE records,
     otherwise stop showing the previous off-base heuristicly counted
     number of "samples" (Namhyung Kim).

   - Support to read compressed module from build-id cache (Namhyung
     Kim)

   - Enable sampling loads and stores simultaneously in 'perf mem'
     (Stephane Eranian)

   - 'perf diff' output improvements (Namhyung Kim)

   - Fix error reporting for evsel pgfault constructor (Arnaldo Carvalho
     de Melo)

  Tooling side infrastructure changes:

   - Cache eh/debug frame offset for dwarf unwind (Namhyung Kim)

   - Support parsing parameterized events (Cody P Schafer)

   - Add support for IP address formats in libtraceevent (David Ahern)

  Plus other misc fixes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
  perf: Decouple unthrottling and rotating
  perf: Drop module reference on event init failure
  perf: Use POLLIN instead of POLL_IN for perf poll data in flag
  perf: Fix put_event() ctx lock
  perf: Fix move_group() order
  perf: Fix event->ctx locking
  perf: Add a bit of paranoia
  perf symbols: Convert lseek + read to pread
  perf tools: Use perf_data_file__fd() consistently
  perf symbols: Support to read compressed module from build-id cache
  perf evsel: Set attr.task bit for a tracking event
  perf header: Set header version correctly
  perf record: Show precise number of samples
  perf tools: Do not use __perf_session__process_events() directly
  perf callchain: Cache eh/debug frame offset for dwarf unwind
  perf tools: Provide stub for missing pthread_attr_setaffinity_np
  perf evsel: Don't rely on malloc working for sz 0
  tools lib traceevent: Add support for IP address formats
  perf ui/tui: Show fatal error message only if exists
  perf tests: Fix typo in sample-parsing.c
  ...
2015-02-09 15:43:55 -08:00
Linus Torvalds
8308756f45 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking updates from Ingo Molnar:
 "The main changes are:

   - mutex, completions and rtmutex micro-optimizations
   - lock debugging fix
   - various cleanups in the MCS and the futex code"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Optimize setting task running after being blocked
  locking/rwsem: Use task->state helpers
  sched/completion: Add lock-free checking of the blocking case
  sched/completion: Remove unnecessary ->wait.lock serialization when reading completion state
  locking/mutex: Explicitly mark task as running after wakeup
  futex: Fix argument handling in futex_lock_pi() calls
  doc: Fix misnamed FUTEX_CMP_REQUEUE_PI op constants
  locking/Documentation: Update code path
  softirq/preempt: Add missing current->preempt_disable_ip update
  locking/osq: No need for load/acquire when acquire-polling
  locking/mcs: Better differentiate between MCS variants
  locking/mutex: Introduce ww_mutex_set_context_slowpath()
  locking/mutex: Move MCS related comments to proper location
  locking/mutex: Checking the stamp is WW only
2015-02-09 15:24:03 -08:00
Linus Torvalds
23e8fe2e16 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main RCU changes in this cycle are:

   - Documentation updates.

   - Miscellaneous fixes.

   - Preemptible-RCU fixes, including fixing an old bug in the
     interaction of RCU priority boosting and CPU hotplug.

   - SRCU updates.

   - RCU CPU stall-warning updates.

   - RCU torture-test updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  rcu: Initialize tiny RCU stall-warning timeouts at boot
  rcu: Fix RCU CPU stall detection in tiny implementation
  rcu: Add GP-kthread-starvation checks to CPU stall warnings
  rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors
  rcu: Optionally run grace-period kthreads at real-time priority
  ksoftirqd: Use new cond_resched_rcu_qs() function
  ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  rcutorture: Add more diagnostics in rcu_barrier() test failure case
  torture: Flag console.log file to prevent holdovers from earlier runs
  torture: Add "-enable-kvm -soundhw pcspk" to qemu command line
  rcutorture: Handle different mpstat versions
  rcutorture: Check from beginning to end of grace period
  rcu: Remove redundant rcu_batches_completed() declaration
  rcutorture: Drop rcu_torture_completed() and friends
  rcu: Provide rcu_batches_completed_sched() for TINY_RCU
  rcutorture: Use unsigned for Reader Batch computations
  rcutorture: Make build-output parsing correctly flag RCU's warnings
  rcu: Make _batches_completed() functions return unsigned long
  rcutorture: Issue warnings on close calls due to Reader Batch blows
  documentation: Fix smp typo in memory-barriers.txt
  ...
2015-02-09 14:28:42 -08:00
Linus Torvalds
26cdd1f76a Merge branches 'timers-urgent-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer and x86 fix from Ingo Molnar:
 "A CLOCK_TAI early expiry fix and an x86 microcode driver oops fix"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  hrtimer: Fix incorrect tai offset calculation for non high-res timer systems

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode: Return error from driver init code when loader is disabled
2015-02-06 13:56:02 -08:00
Linus Torvalds
396e9099ea Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Misc fixes"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Fix deadline parameter modification handling
  sched/wait: Remove might_sleep() from wait_event_cmd()
  sched: Fix crash if cpuset_cpumask_can_shrink() is passed an empty cpumask
  sched/fair: Avoid using uninitialized variable in preferred_group_nid()
2015-02-06 13:34:26 -08:00
Linus Torvalds
29f12c48df Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core kernel fixes from Ingo Molnar:
 "Two liblockdep fixes and a CPU hotplug race fix"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/liblockdep: don't include host headers
  tools/liblockdep: ignore generated .so file
  smpboot: Add missing get_online_cpus() in smpboot_register_percpu_thread()
2015-02-06 13:06:10 -08:00
John Stultz
2d926c15d6 hrtimer: Fix incorrect tai offset calculation for non high-res timer systems
I noticed some CLOCK_TAI timer test failures on one of my
less-frequently used configurations. And after digging in I
found in 76f4108892 (Cleanup hrtimer accessors to the
timekepeing state), the hrtimer_get_softirq_time tai offset
calucation was incorrectly rewritten, as the tai offset we
return shold be from CLOCK_MONOTONIC, and not CLOCK_REALTIME.

This results in CLOCK_TAI timers expiring early on non-highres
capable machines.

This patch fixes the issue, calculating the tai time properly
from the monotonic base.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable <stable@vger.kernel.org> # 3.17+
Link: http://lkml.kernel.org/r/1423097126-10236-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-05 08:39:37 +01:00
Mark Rutland
2fde4f94e0 perf: Decouple unthrottling and rotating
Currently the adjusments made as part of perf_event_task_tick() use the
percpu rotation lists to iterate over any active PMU contexts, but these
are not used by the context rotation code, having been replaced by
separate (per-context) hrtimer callbacks. However, some manipulation of
the rotation lists (i.e. removal of contexts) has remained in
perf_rotate_context(). This leads to the following issues:

* Contexts are not always removed from the rotation lists. Removal of
  PMUs which have been placed in rotation lists, but have not been
  removed by a hrtimer callback can result in corruption of the rotation
  lists (when memory backing the context is freed).

  This has been observed to result in hangs when PMU drivers built as
  modules are inserted and removed around the creation of events for
  said PMUs.

* Contexts which do not require rotation may be removed from the
  rotation lists as a result of a hrtimer, and will not be considered by
  the unthrottling code in perf_event_task_tick.

This patch fixes the issue by updating the rotation ist when events are
scheduled in/out, ensuring that each rotation list stays in sync with
the HW state. As each event holds a refcount on the module of its PMU,
this ensures that when a PMU module is unloaded none of its CPU contexts
can be in a rotation list. By maintaining a list of perf_event_contexts
rather than perf_event_cpu_contexts, we don't need separate paths to
handle the cpu and task contexts, which also makes the code a little
simpler.

As the rotation_list variables are not used for rotation, these are
renamed to active_ctx_list, which better matches their current function.
perf_pmu_rotate_{start,stop} are renamed to
perf_pmu_ctx_{activate,deactivate}.

Reported-by: Johannes Jensen <johannes.jensen@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <Will.Deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150129134511.GR17721@leverpostej
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 08:07:16 +01:00
Mark Rutland
cc34b98bac perf: Drop module reference on event init failure
When initialising an event, perf_init_event will call try_module_get() to
ensure that the PMU's module cannot be removed for the lifetime of the
event, with __free_event() dropping the reference when the event is
finally destroyed. If something fails after the event has been
initialised, but before the event is installed, perf_event_alloc will
drop the reference on the module.

However, if we fail to initialise an event for some reason (e.g. we ask
an uncore PMU to perform sampling, and it refuses to initialise the
event), we do not drop the refcount. If we try to open such a bogus
event without a precise IDR type, we will loop over each PMU in the pmus
list, incrementing each of their refcounts without decrementing them.

This patch adds a module_put when pmu->event_init(event) fails, ensuring
that the refcounts are balanced in failure cases. As the innards of the
precise and search based initialisation look very similar, this logic is
hoisted out into a new helper function. While the early return for the
failed try_module_get is removed from the search case, this is handled
by the remaining return when ret is not -ENOENT.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1420642611-22667-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 08:07:14 +01:00
Jiri Olsa
7c60fc0e02 perf: Use POLLIN instead of POLL_IN for perf poll data in flag
Currently we flag available data (via poll syscall) on perf fd with
POLL_IN macro, which is normally used for SIGIO interface.

We've been lucky, because POLLIN (0x1) is subset of POLL_IN (0x20001)
and sys_poll (do_pollfd function) cut the extra bit out (0x20000).

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422467678-22341-1-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 08:07:13 +01:00
Peter Zijlstra
a83fe28e2e perf: Fix put_event() ctx lock
So what I suspect; but I'm in zombie mode today it seems; is that while
I initially thought that it was impossible for ctx to change when
refcount dropped to 0, I now suspect its possible.

Note that until perf_remove_from_context() the event is still active and
visible on the lists. So a concurrent sys_perf_event_open() from another
task into this task can race.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: mark.rutland@arm.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150129134434.GB26304@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 08:07:12 +01:00
Peter Zijlstra (Intel)
8f95b435b6 perf: Fix move_group() order
Jiri reported triggering the new WARN_ON_ONCE in event_sched_out over
the weekend:

  event_sched_out.isra.79+0x2b9/0x2d0
  group_sched_out+0x69/0xc0
  ctx_sched_out+0x106/0x130
  task_ctx_sched_out+0x37/0x70
  __perf_install_in_context+0x70/0x1a0
  remote_function+0x48/0x60
  generic_exec_single+0x15b/0x1d0
  smp_call_function_single+0x67/0xa0
  task_function_call+0x53/0x80
  perf_install_in_context+0x8b/0x110

I think the below should cure this; if we install a group leader it
will iterate the (still intact) group list and find its siblings and
try and install those too -- even though those still have the old
event->ctx -- in the new ctx.

Upon installing the first group sibling we'd try and schedule out the
group and trigger the above warn.

Fix this by installing the group leader last, installing siblings
would have no effect, they're not reachable through the group lists
and therefore we don't schedule them.

Also delay resetting the state until we're absolutely sure the events
are quiescent.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Reported-by: vincent.weaver@maine.edu
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150126162639.GA21418@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 08:07:11 +01:00
Peter Zijlstra
f63a8daa58 perf: Fix event->ctx locking
There have been a few reported issues wrt. the lack of locking around
changing event->ctx. This patch tries to address those.

It avoids the whole rwsem thing; and while it appears to work, please
give it some thought in review.

What I did fail at is sensible runtime checks on the use of
event->ctx, the RCU use makes it very hard.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.209535886@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 08:07:10 +01:00
Peter Zijlstra
652884fe0c perf: Add a bit of paranoia
Add a few WARN()s to catch things that should never happen.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.150481799@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 08:07:09 +01:00
Ingo Molnar
8f4bf4bcc4 Merge tag 'v3.19-rc7' into perf/core, to merge fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 07:58:29 +01:00
Davidlohr Bueso
afffc6c180 locking/rtmutex: Optimize setting task running after being blocked
We explicitly mark the task running after returning from
a __rt_mutex_slowlock() call, which does the actual sleeping
via wait-wake-trylocking. As such, this patch does two things:

(1) refactors the code so that setting current to TASK_RUNNING
    is done by __rt_mutex_slowlock(), and not by the callers. The
    downside to this is that it becomes a bit unclear when at what
    point we block. As such I've added a comment that the task
    blocks when calling __rt_mutex_slowlock() so readers can figure
    out when it is running again.

(2) relaxes setting current's state through __set_current_state(),
    instead of it's more expensive barrier alternative. There was no
    need for the implied barrier as we're obviously not planning on
    blocking.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422857784.18096.1.camel@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 07:57:42 +01:00
Davidlohr Bueso
73105994c5 locking/rwsem: Use task->state helpers
Call __set_task_state() instead of assigning the new state
directly. These interfaces also aid CONFIG_DEBUG_ATOMIC_SLEEP
environments, keeping track of who last changed the state.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422257769-14083-2-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 07:57:39 +01:00
Nicholas Mc Guire
7c34e3180a sched/completion: Add lock-free checking of the blocking case
The "thread would block" case can be checked without grabbing ->wait.lock.

[ If the check does not return early then grab the lock and recheck.
  A memory barrier is not needed as complete() and complete_all() imply
  a barrier.

  The ACCESS_ONCE() is needed for calls in a loop that, if inlined, could
  optimize out the re-fetching of x->done. ]

Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422013307-13200-1-git-send-email-der.herr@hofr.at
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 07:57:37 +01:00
Nicholas Mc Guire
de30ec4730 sched/completion: Remove unnecessary ->wait.lock serialization when reading completion state
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1421467534-22834-1-git-send-email-der.herr@hofr.at
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 07:57:36 +01:00
Davidlohr Bueso
51587bcf31 locking/mutex: Explicitly mark task as running after wakeup
By the time we wake up and get the lock after being asleep
in the slowpath, we better be running. As good practice,
be explicit about this and avoid any mischief.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1421717961.4903.11.camel@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 07:57:33 +01:00
Ingo Molnar
2622e849a1 Merge tag 'v3.19-rc7' into locking/core, to refresh the branch before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 07:53:17 +01:00
Sharon Dvir
139b6fd26d sched/Documentation: Remove unneeded word
The second 'mutex' shouldn't be there, it can't be about the mutex,
as the mutex can't be freed, but unlocked, the memory where the
mutex resides however, can be freed.

Signed-off-by: Sharon Dvir <sharon.dvir1@mail.huji.ac.il>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1422827252-31363-1-git-send-email-sharon.dvir1@mail.huji.ac.il
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 07:52:33 +01:00
Frederic Weisbecker
bfd9b2b5f8 sched: Pull resched loop to __schedule() callers
__schedule() disables preemption during its job and re-enables it
afterward without doing a preemption check to avoid recursion.

But if an event happens after the context switch which requires
rescheduling, we need to check again if a task of a higher priority
needs the CPU. A preempt irq can raise such a situation. To handle that,
__schedule() loops on need_resched().

But preempt_schedule_*() functions, which call __schedule(), also loop
on need_resched() to handle missed preempt irqs. Hence we end up with
the same loop happening twice.

Lets simplify that by attributing the need_resched() loop responsibility
to all __schedule() callers.

There is a risk that the outer loop now handles reschedules that used
to be handled by the inner loop with the added overhead of caller details
(inc/dec of PREEMPT_ACTIVE, irq save/restore) but assuming those inner
rescheduling loop weren't too frequent, this shouldn't matter. Especially
since the whole preemption path is now losing one loop in any case.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1422404652-29067-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 07:52:30 +01:00