mirror of
https://github.com/linux-apfs/linux-apfs.git
synced 2026-05-01 15:00:59 -07:00
1b375dc30710180c4b88cc59caba6e3481ec5c8b
16103 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
80628ca06c |
kernel/fork.c:copy_process(): unify CLONE_THREAD-or-thread_group_leader code
Cleanup and preparation for the next changes. Move the "if (clone_flags & CLONE_THREAD)" code down under "if (likely(p->pid))" and turn it into the "else" branch. This makes the process/thread initialization more symmetrical and removes one check. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
b57922b6c7 |
fork: reorder permissions when violating number of processes limits
When a task is attempting to violate the RLIMIT_NPROC limit we have a check to see if the task is sufficiently privileged. The check first looks at CAP_SYS_ADMIN, then CAP_SYS_RESOURCE, then if the task is uid=0. A result is that tasks which are allowed by the uid=0 check are first checked against the security subsystem. This results in the security subsystem auditing a denial for sys_admin and sys_resource and then the task passing the uid=0 check. This patch rearranges the code to first check uid=0, since if we pass that we shouldn't hit the security system at all. We then check sys_resource, since it is the smallest capability which will solve the problem. Lastly we check the fallback everything cap_sysadmin. We don't want to give this capability many places since it is so powerful. This will eliminate many of the false positive/needless denial messages we get when a root task tries to violate the nproc limit. (note that kthreads count against root, so on a sufficiently large machine we can actually get past the default limits before any userspace tasks are launched.) Signed-off-by: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
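The effect of the reordering described in b57922b6c7 can be sketched in plain userspace C. This is only a model under stated assumptions: `struct task_model`, `capable_model()` and both `may_exceed_nproc_*()` helpers are hypothetical names invented for illustration, and the audit counter merely stands in for the denial messages the security subsystem would log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task; none of these names are kernel APIs. */
struct task_model {
    bool is_root;           /* uid == 0 */
    bool cap_sys_resource;
    bool cap_sys_admin;
    int audited_denials;    /* failed capability checks that got audited */
};

/* Stand-in for a capability check routed through the security subsystem:
 * every failed check is counted as an audited denial. */
static bool capable_model(struct task_model *t, bool has_cap)
{
    assert(t != NULL);
    if (!has_cap)
        t->audited_denials++;
    return has_cap;
}

/* Old order: CAP_SYS_ADMIN, then CAP_SYS_RESOURCE, then uid 0. */
static bool may_exceed_nproc_old(struct task_model *t)
{
    return capable_model(t, t->cap_sys_admin) ||
           capable_model(t, t->cap_sys_resource) ||
           t->is_root;
}

/* New order: uid 0 first, then the narrow capability, then the broad one. */
static bool may_exceed_nproc_new(struct task_model *t)
{
    return t->is_root ||
           capable_model(t, t->cap_sys_resource) ||
           capable_model(t, t->cap_sys_admin);
}
```

In this model a root task without either capability is allowed in both versions, but the old order records two needless denials on the way while the new order records none.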
|
|
81dabb4641 |
exit.c: unexport __set_special_pids()
Move __set_special_pids() from exit.c to sys.c close to its single caller and make it static. And rename it to set_special_pids(), another helper with this name has gone away. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
7f57cfa4e2 |
usermodehelper: kill the sub_info->path[0] check
call_usermodehelper_exec() does nothing but return success if path[0] ==
0. The only user which needs this strange feature is request_module(), it
can check modprobe_path[0] itself like other users do if they want to
detect the "disabled by admin" case.
Kill it. Not only does it look strange, it can confuse other callers. And
this allows us to revert
|
||
|
|
29000caecb |
ptrace: add ability to get/set signal-blocked mask
crtools uses parasite code for dumping processes. The parasite code is injected into a process with the help of PTRACE_SEIZE. Currently crtools blocks signals from the parasite code. If a process has pending signals, crtools waits while the process handles these signals. This method is not suitable for stopped tasks. A stopped task can have a few pending signals; when we try to execute the parasite code, we will need to drop SIGSTOP, but all other signals must remain pending, because the state of processes must not be changed during checkpointing. This patch adds two ptrace commands to set/get the signal-blocked mask. I think gdb can use these commands too. [akpm@linux-foundation.org: be consistent with brace layout] Signed-off-by: Andrey Vagin <avagin@openvz.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
10fb46d5f7 |
kprobes: handle empty/invalid input to debugfs "enabled" file
When writing invalid input to 'debug/kprobes/enabled' it'll silently be ignored. Even worse, when writing an empty string to this file, the outcome is purely random as the switch statement will make its decision based on the value of an uninitialized stack variable. Fix this by handling invalid/empty input as error returning -EINVAL. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
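The input handling described in 10fb46d5f7 can be illustrated with a small userspace sketch. The function name `parse_enabled()` and the exact set of accepted characters are assumptions made for this example, not the kernel's code; the point is only that empty and unrecognized input both return -EINVAL instead of falling through a switch on an uninitialized value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical parser for an "enabled"-style control file.
 * Returns 1 to enable, 0 to disable, or -EINVAL on empty/invalid input. */
static int parse_enabled(const char *buf, size_t count)
{
    assert(buf != NULL);

    /* An empty write must not reach the switch: there is no byte to read. */
    if (count == 0)
        return -EINVAL;

    switch (buf[0]) {
    case 'y':
    case 'Y':
    case '1':
        return 1;
    case 'n':
    case 'N':
    case '0':
        return 0;
    default:
        /* Anything else is reported instead of being silently ignored. */
        return -EINVAL;
    }
}
```

A caller would pass the buffer and byte count it received from the write, for example `parse_enabled("1\n", 2)`.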
|
|
45c64940c8 |
kernel/sys.c:do_sysinfo(): use get_monotonic_boottime()
Change do_sysinfo() to use get_monotonic_boottime() instead of do_posix_clock_monotonic_gettime() + monotonic_to_bootbased(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Tomas Janousek <tjanouse@redhat.com> Cc: Tomas Smetana <tsmetana@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
7ec75e1ca1 |
kernel/sys.c: sys_reboot(): fix malformed panic message
If LINUX_REBOOT_CMD_HALT for reboot failed, the message "cannot halt" will stay on the same line with the next message, so append a '\n'. Signed-off-by: liguang <lig.fnst@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
f170168b9a |
drivers: avoid parsing names as kthread_run() format strings
Calling kthread_run with a single name parameter causes it to be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
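The hazard addressed in f170168b9a applies to any printf-style API, so it can be shown with a userspace stand-in. `set_thread_name()` below is a hypothetical function that, like kthread_run(), treats its name argument as a format string; `name_from_dynamic()` shows the "%s" pattern the commit applies to callers.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an API whose name parameter is a format string. */
static int set_thread_name(char *out, size_t len, const char *namefmt, ...)
{
    va_list ap;
    int n;

    assert(out != NULL && namefmt != NULL);
    va_start(ap, namefmt);
    n = vsnprintf(out, len, namefmt, ap);
    va_end(ap);
    return n;
}

/* Safe pattern: the dynamic string is an argument to "%s", so any '%'
 * characters it contains are copied literally rather than interpreted. */
static int name_from_dynamic(char *out, size_t len, const char *dynamic)
{
    return set_thread_name(out, len, "%s", dynamic);
}
```

Passing the dynamic string directly as `namefmt` would instead make a name such as "dev%d" consume a variadic argument that was never supplied, which is undefined behavior.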
|
|
0ed5fd1385 |
mm: use totalram_pages instead of num_physpages at runtime
The global variable num_physpages is scheduled to be removed, so use totalram_pages instead of num_physpages at runtime. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
f991fae5c6 |
Merge tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"This time the total number of ACPI commits is slightly greater than
the number of cpufreq commits, but Viresh Kumar (who works on cpufreq)
remains the most active patch submitter.
To me, the most significant change is the addition of offline/online
device operations to the driver core (with Greg's blessing) and
the related modifications of the ACPI core hotplug code. Next are the
freezer updates from Colin Cross that should make the freezing of
tasks a bit less heavy weight.
We also have a couple of regression fixes, a number of fixes for
issues that have not been identified as regressions, two new drivers
and a bunch of cleanups all over.
Highlights:
- Hotplug changes to support graceful hot-removal failures.
It sometimes is necessary to fail device hot-removal operations
gracefully if they cannot be carried out completely. For example,
if memory from a memory module being hot-removed has been allocated
for the kernel's own use and cannot be moved elsewhere, it's
desirable to fail the hot-removal operation in a graceful way
rather than to crash the kernel, but currently a success or a kernel
crash are the only possible outcomes of an attempted memory
hot-removal. Needless to say, that is not a very attractive
alternative and it had to be addressed.
However, in order to make it work for memory, I first had to make
it work for CPUs and for this purpose I needed to modify the ACPI
processor driver. It's been split into two parts, a resident one
handling the low-level initialization/cleanup and a modular one
playing the actual driver's role (but it binds to the CPU system
device objects rather than to the ACPI device objects representing
processors). That's been sort of like a live brain surgery on a
patient who's riding a bike.
So this is a little scary, but since we found and fixed a couple of
regressions it caused to happen during the early linux-next testing
(a month ago), nobody has complained.
As a bonus we remove some duplicated ACPI hotplug code, because the
ACPI-based CPU hotplug is now going to use the common ACPI hotplug
code.
- Lighter weight freezing of tasks.
These changes from Colin Cross and Mandeep Singh Baines are
targeted at making the freezing of tasks a bit less heavy weight
operation. They reduce the number of tasks woken up every time
during the freezing, by using the observation that the freezer
simply doesn't need to wake up some of them and wait for them all
to call refrigerator(). The time needed for the freezer to decide
to report a failure is reduced too.
Also reintroduced is the check causing a lockdep warning to
trigger when try_to_freeze() is called with locks held (which is
generally unsafe and shouldn't happen).
- cpufreq updates
First off, a commit from Srivatsa S Bhat fixes a resume regression
introduced during the 3.10 cycle causing some cpufreq sysfs
attributes to return wrong values to user space after resume. The
fix is kind of fresh, but also it's pretty obvious once Srivatsa
has identified the root cause.
Second, we have a new freqdomain_cpus sysfs attribute for the
acpi-cpufreq driver to provide information previously available via
related_cpus. From Lan Tianyu.
Finally, we fix a number of issues, mostly related to the
CPUFREQ_POSTCHANGE notifier and cpufreq Kconfig options and clean
up some code. The majority of changes from Viresh Kumar with bits
from Jacob Shin, Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia,
Arnd Bergmann, and Tang Yuantian.
- ACPICA update
A usual bunch of updates from the ACPICA upstream.
During the 3.4 cycle we introduced support for ACPI 5 extended
sleep registers, but they are only supposed to be used if the
HW-reduced mode bit is set in the FADT flags and the code attempted
to use them without checking that bit. That caused suspend/resume
regressions to happen on some systems. Fix from Lv Zheng causes
those registers to be used only if the HW-reduced mode bit is set.
Apart from this some other ACPICA bugs are fixed and code cleanups
are made by Bob Moore, Tomasz Nowicki, Lv Zheng, Chao Guan, and
Zhang Rui.
- cpuidle updates
New driver for Xilinx Zynq processors is added by Michal Simek.
Multidriver support simplification, addition of some missing
kerneldoc comments and Kconfig-related fixes come from Daniel
Lezcano.
- ACPI power management updates
Changes to make suspend/resume work correctly in Xen guests from
Konrad Rzeszutek Wilk, sparse warning fix from Fengguang Wu and
cleanups and fixes of the ACPI device power state selection
routine.
- ACPI documentation updates
Some previously missing pieces of ACPI documentation are added by
Lv Zheng and Aaron Lu (hopefully, that will help people to
understand how the ACPI subsystem works) and one outdated doc is
updated by Hanjun Guo.
- Assorted ACPI updates
We finally nailed down the IA-64 issue that was the reason for
reverting commit
|
||
|
|
a0b2062b09 |
posix_timers: fix racy timer delta caching on task exit
When a task exits, we cache the remaining cputime delta
before expiring its timers.
This is done from the following places:
* When the task is reaped. We iterate through its list of
posix cpu timers and store the remaining timer delta to
the timer struct instead of the absolute value.
(See posix_cpu_timers_exit() / posix_cpu_timers_exit_group() )
* When we call posix_cpu_timer_get() or posix_cpu_timer_schedule().
If the timer's task is considered dying when watched from these
places, the same conversion from absolute to relative expiry time
is performed. Then the given task's reference is released.
(See clear_dead_task() ).
The relevance of this caching is questionable but this is another
and deeper debate.
The big issue here is that these two sources of caching don't mix
very well.
More specifically, the caching can easily be done twice, resulting
in a wrong delta as the elapsed clock gets spuriously subtracted from
it a second time. This can happen in the following scenario:
1) The task exits and gets reaped: we call posix_cpu_timers_exit()
and the absolute timer expiry values are converted to a relative
delta.
2) timer_gettime() -> posix_cpu_timer_get() is called and relies on
clear_dead_task() because tsk->exit_state == EXIT_DEAD.
The elapsed clock gets subtracted from the delta again and we return
a wrong result.
To fix this, just remove the caching done on task reaping time. It
doesn't bring much value on its own. The caching done from
posix_cpu_timer_get/schedule is enough.
And it would also be hard to get it really right: we could make it put and
clear the target task in the timer struct so that readers know if they are
dealing with a cached relative or an absolute value. But it would be racy.
The only safe way to do it would be to lock the itimer->it_lock so that we
know nobody reads the cputime expiry value while we modify it and its
target task reference. Doing so would involve some funny workarounds to
avoid circular lock against the sighand lock. There is just no reason to
maintain this.
The user visible effect of this patch can be observed by running the
following code: it creates a subthread that launches a posix cputimer
which expires after 10 seconds. But then the subthread only busy loops for 2
seconds and exits. The parent reaps the subthread and reads the timer value.
Its expected value should be the initial timer's expiration value
minus the cputime elapsed in the subthread. Roughly 10 - 2 = 8 seconds:
#include <sys/time.h>
#include <stdio.h>
#include <unistd.h>
#include <time.h>
#include <pthread.h>

static timer_t id;
static struct itimerspec val = { .it_value.tv_sec = 10, }, new;

static void *thread(void *unused)
{
	int err;
	struct timeval start, end, diff;

	/* Create a timer on this thread's CPU clock */
	err = timer_create(CLOCK_THREAD_CPUTIME_ID, NULL, &id);
	if (err < 0) {
		perror("Can't create timer");
		return NULL;
	}

	/* Arm 10 sec timer */
	err = timer_settime(id, 0, &val, NULL);
	if (err < 0) {
		perror("Can't set timer");
		return NULL;
	}

	/* Exit after 2 seconds of execution */
	gettimeofday(&start, NULL);
	do {
		gettimeofday(&end, NULL);
		timersub(&end, &start, &diff);
	} while (diff.tv_sec < 2);

	return NULL;
}

int main(int argc, char **argv)
{
	pthread_t pthread;
	int err;

	err = pthread_create(&pthread, NULL, thread, NULL);
	if (err) {
		perror("Can't create thread");
		return -1;
	}
	pthread_join(pthread, NULL);

	/* Just wait a little bit to make sure the child got reaped */
	sleep(1);

	err = timer_gettime(id, &new);
	if (err)
		perror("Can't get timer value");
	printf("%ld %ld\n", (long)new.it_value.tv_sec, new.it_value.tv_nsec);

	return 0;
}
Before the patch:
$ ./posix_cpu_timers
6 2278074
After the patch:
$ ./posix_cpu_timers
8 1158766
Before the patch, the elapsed time got two more seconds spuriously accounted.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Olivier Langlois <olivier@trillion01.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
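The double accounting fixed by a0b2062b09 reduces to simple arithmetic, which the following userspace model tries to make explicit. It is a sketch only: `struct timer_model`, `cache_delta()` and the two `read_after_exit_*()` helpers are hypothetical names, and the timer is reduced to a single integer number of seconds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified timer: 'expires' holds an absolute expiry
 * until it is cached, after which it holds the remaining delta. */
struct timer_model {
    long long expires;
};

/* Convert an absolute expiry into the remaining delta, given the cputime
 * 'now' consumed by the task. Applying this twice is the bug. */
static void cache_delta(struct timer_model *t, long long now)
{
    assert(t != NULL);
    t->expires = (t->expires > now) ? t->expires - now : 0;
}

/* Before the fix: cached once when the task is reaped and once more
 * when the timer is read on the dead task. */
static long long read_after_exit_buggy(long long expires, long long cputime)
{
    struct timer_model t = { .expires = expires };

    cache_delta(&t, cputime);   /* task reaped */
    cache_delta(&t, cputime);   /* timer read on the dead task */
    return t.expires;
}

/* After the fix: only the read path performs the conversion. */
static long long read_after_exit_fixed(long long expires, long long cputime)
{
    struct timer_model t = { .expires = expires };

    cache_delta(&t, cputime);   /* timer read on the dead task */
    return t.expires;
}
```

With a 10 second timer and 2 seconds of consumed cputime, the buggy path yields 6 and the fixed path yields 8, matching the whole-second parts of the outputs quoted in the commit message.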
|
|
76cdcdd979 |
posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule()
In order to re-arm a timer after it fired, we take a sample of the current
process or thread cputime.
If the task is dying though, we don't arm anything but we cache the
remaining timer expiration delta for further reads.
Something similar is performed in posix_cpu_timer_get() but here we forget
to take the process wide cputime sample before caching it.
As a result we are storing random stack content, causing every further
read of that timer to return junk values.
Fix this by taking the appropriate sample in the case of process wide
timers.
This probably doesn't matter much in practice because, at this stage, the
thread is the last one in the group and we reached exit_notify(). This
implies that we called exit_itimers() and there should be no more timers
to handle for that task.
So this is likely dead code anyway but let's fix the current logic
and the warning that came along:
kernel/posix-cpu-timers.c: In function 'posix_cpu_timer_schedule':
kernel/posix-cpu-timers.c:1127: warning: 'now' may be used uninitialized in this function
Then we can start to think further about cleaning up that code.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Chen Gang <gang.chen@asianux.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Olivier Langlois <olivier@trillion01.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
|
|
2473f3e7a9 |
posix_cpu_timers: consolidate expired timers check
Consolidate the common code amongst per thread and per process timers list on tick time. List traversal, expiry check and subsequent updates can be shared in a common helper. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
1a7fa510b3 |
posix_cpu_timers: consolidate timer list cleanups
Cleaning up the posix cpu timers on task exit shares some common code among timer list types, most notably the list traversal and expiry time update. Unify this in a common helper. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
55ccb616a6 |
posix_cpu_timer: consolidate expiry time type
The posix cpu timer expiry time is stored in a union of two types: a 64 bits field if we rely on scheduler precise accounting, or a cputime_t if we rely on jiffies. This results in quite some duplicate code and special cases to handle the two types. Just unify this into a single 64 bits field. cputime_t can always fit into it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
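The type consolidation in 55ccb616a6 can be pictured with a before/after sketch. All names here (`cputime_model_t`, `union expiry_old`, `struct timer_new`, `expiry_from_cputime()`, `expired_new()`) are hypothetical stand-ins chosen for illustration; the real kernel types and helpers differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the jiffies-based cputime_t. */
typedef unsigned long cputime_model_t;

/* Before: two representations share storage, so every user must know
 * which member is live and needs a code path for each. */
union expiry_old {
    uint64_t sched;         /* scheduler-precise accounting */
    cputime_model_t cpu;    /* jiffies-based accounting */
};

/* After: a single 64-bit field serves both clock kinds. */
struct timer_new {
    uint64_t expires;
};

/* The narrower type always fits, which is what makes the merge safe. */
_Static_assert(sizeof(cputime_model_t) <= sizeof(uint64_t),
               "cputime must fit in the unified 64-bit expiry");

/* Widen a cputime value into the unified representation. */
static uint64_t expiry_from_cputime(cputime_model_t c)
{
    return (uint64_t)c;
}

/* One expiry check now works for every clock kind. */
static int expired_new(const struct timer_new *t, uint64_t now)
{
    assert(t != NULL);
    return now >= t->expires;
}
```

The design point is that one comparison helper replaces the per-type special cases the union forced on callers.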
|
|
0b0585c3e1 |
Merge branch 'for-3.11-cpuset' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cpuset changes from Tejun Heo:
"cpuset has always been rather odd about its configurations - a cgroup
right after creation didn't allow any task executions before
configuration, changing configuration in the parent modifies the
descendants irreversibly and so on. These behaviors are inherently
nasty and almost hostile against sharing the hierarchy with other
controllers making it very difficult to use in unified hierarchy.
Li is currently in the process of updating the behaviors for
__DEVEL__sane_behavior which is the bulk of changes in this pull
request. It isn't complete yet and the behaviors will change further
but all changes are gated behind sane_behavior. In the process, the
rather hairy work-item punting which was used to work around the
limitations of cgroup descendant iterator was simplified."
* 'for-3.11-cpuset' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: rename @cont to @cgrp
cpuset: fix to migrate mm correctly in a corner case
cpuset: allow to move tasks to empty cpusets
cpuset: allow to keep tasks in empty cpusets
cpuset: introduce effective_{cpumask|nodemask}_cpuset()
cpuset: record old_mems_allowed in struct cpuset
cpuset: remove async hotplug propagation work
cpuset: let hotplug propagation work wait for task attaching
cpuset: re-structure update_cpumask() a bit
cpuset: remove cpuset_test_cpumask()
cpuset: remove unnecessary variable in cpuset_attach()
cpuset: cleanup guarantee_online_{cpus|mems}()
cpuset: remove redundant check in cpuset_cpus_allowed_fallback()
|
||
|
|
b028161fbb |
Merge branch 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup changes from Tejun Heo:
"This pull request contains the following changes.
- cgroup_subsys_state (css) reference counting has been converted to
percpu-ref. css is what each resource controller embeds into its
own control structure and performs reference counting against. It may
be used in hot paths of various subsystems and is similar to module
refcnt in that aspect. For example, block-cgroup's css refcnting
was showing up a lot in Mikulas's device-mapper scalability work
and this should alleviate it.
- cgroup subtree iterator has been updated so that RCU read lock can
be released after grabbing a reference. This allows simplifying its
users that require blocking, which used to build an iteration list
under the RCU read lock and then traverse it outside. This pull
request contains simplification of cgroup core and device-cgroup.
A separate pull request will update cpuset.
- Fixes for various bugs including corner race conditions and RCU
usage bugs.
- A lot of cleanups and some preparatory work for the planned unified
hierarchy support."
* 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (48 commits)
cgroup: CGRP_ROOT_SUBSYS_BOUND should also be ignored when mounting an existing hierarchy
cgroup: CGRP_ROOT_SUBSYS_BOUND should be ignored when comparing mount options
cgroup: fix deadlock on cgroup_mutex via drop_parsed_module_refcounts()
cgroup: always use RCU accessors for protected accesses
cgroup: fix RCU accesses around task->cgroups
cgroup: fix RCU accesses to task->cgroups
cgroup: grab cgroup_mutex in drop_parsed_module_refcounts()
cgroup: fix cgroupfs_root early destruction path
cgroup: reserve ID 0 for dummy_root and 1 for unified hierarchy
cgroup: implement for_each_[builtin_]subsys()
cgroup: move init_css_set initialization inside cgroup_mutex
cgroup: s/for_each_subsys()/for_each_root_subsys()/
cgroup: clean up find_css_set() and friends
cgroup: remove cgroup->actual_subsys_mask
cgroup: prefix global variables with "cgroup_"
cgroup: convert CFTYPE_* flags to enums
cgroup: rename cont to cgrp
cgroup: clean up cgroup_serial_nr_cursor
cgroup: convert cgroup_cft_commit() to use cgroup_for_each_descendant_pre()
cgroup: make serial_nr_cursor available throughout cgroup.c
...
|
||
|
|
f317ff9eed |
Merge branch 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue changes from Tejun Heo:
"Surprisingly, Lai and I didn't break too many things implementing
custom pools and stuff last time around and there aren't any follow-up
changes necessary at this point.
The only change in this pull request is Viresh's patches to make some
per-cpu workqueues to behave as unbound workqueues dependent on a boot
param whose default can be configured via a config option. This leads
to higher processing overhead / lower bandwidth as more work items are
bounced across CPUs; however, it can lead to noticeable powersave in
certain configurations - ~10% w/ idlish constant workload on a
big.LITTLE configuration according to Viresh.
This is because per-cpu workqueues interfere with how the scheduler
perceives whether or not each CPU is idle by forcing pinned tasks on
them, which makes the scheduler's power-aware scheduling decisions
less effective.
Its effectiveness is likely less pronounced on homogenous
configurations and this type of optimization can probably be made
automatic; however, the changes are pretty minimal and the affected
workqueues are clearly marked, so it's an easy gain for some
configurations for the time being with pretty unintrusive changes."
* 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
fbcon: queue work on power efficient wq
block: queue work on power efficient wq
PHYLIB: queue work on system_power_efficient_wq
workqueue: Add system wide power_efficient workqueues
workqueues: Introduce new flag WQ_POWER_EFFICIENT for power oriented workqueues
|
||
|
|
3045f94a20 |
Merge branch 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 RAS update from Ingo Molnar:
"The changes in this tree are:
- ACPI APEI (ACPI Platform Error Interface) improvements, by Chen
Gong
- misc MCE fixes/cleanups"
* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Update MCE severity condition check
mce: acpi/apei: Add comments to clarify usage of the various bitfields in the MCA subsystem
ACPI/APEI: Update einj documentation for param1/param2
ACPI/APEI: Add parameter check before error injection
ACPI, APEI, EINJ: Fix error return code in einj_init()
x86, mce: Fix "braodcast" typo
|
||
|
|
2d722f6d56 |
Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"The main changes:
- load-calculation cleanups and improvements, by Alex Shi
- various nohz related tidying up of statistics, by Frederic
Weisbecker
- factor out /proc functions to kernel/sched/proc.c, by Paul
Gortmaker
- simplify the RT policy scheduler, by Kirill Tkhai
- various fixes and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
sched/debug: Remove CONFIG_FAIR_GROUP_SCHED mask
sched/debug: Fix formatting of /proc/<PID>/sched
sched: Fix typo in struct sched_avg member description
sched/fair: Fix typo describing flags in enqueue_entity
sched/debug: Add load-tracking statistics to task
sched: Change get_rq_runnable_load() to static and inline
sched/tg: Remove tg.load_weight
sched/cfs_rq: Change atomic64_t removed_load to atomic_long_t
sched/tg: Use 'unsigned long' for load variable in task group
sched: Change cfs_rq load avg to unsigned long
sched: Consider runnable load average in move_tasks()
sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task
sched: Update cpu load after task_tick
sched: Fix sleep time double accounting in enqueue entity
sched: Set an initial value of runnable avg for new forked task
sched: Move a few runnable tg variables into CONFIG_SMP
Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking"
sched: Don't mix use of typedef ctl_table and struct ctl_table
sched: Remove WARN_ON(!sd) from init_sched_groups_power()
sched: Fix memory leakage in build_sched_groups()
...
|
||
|
|
f0bb4c0ab0 |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Kernel improvements:
- watchdog driver improvements by Li Zefan
- Power7 CPI stack events related improvements by Sukadev Bhattiprolu
- event multiplexing via hrtimers and other improvements by Stephane
Eranian
- kernel stack use optimization by Andrew Hunter
- AMD IOMMU uncore PMU support by Suravee Suthikulpanit
- NMI handling rate-limits by Dave Hansen
- various hw_breakpoint fixes by Oleg Nesterov
- hw_breakpoint overflow period sampling and related signal handling
fixes by Jiri Olsa
- Intel Haswell PMU support by Andi Kleen
Tooling improvements:
- Reset SIGTERM handler in workload child process, fix from David
Ahern.
- Makefile reorganization, prep work for Kconfig patches, from Jiri
Olsa.
- Add automated make test suite, from Jiri Olsa.
- Add --percent-limit option to 'top' and 'report', from Namhyung
Kim.
- Sorting improvements, from Namhyung Kim.
- Expand definition of sysfs format attribute, from Michael Ellerman.
Tooling fixes:
- 'perf tests' fixes from Jiri Olsa.
- Make Power7 CPI stack events available in sysfs, from Sukadev
Bhattiprolu.
- Handle death by SIGTERM in 'perf record', fix from David Ahern.
- Fix printing of perf_event_paranoid message, from David Ahern.
- Handle realloc failures in 'perf kvm', from David Ahern.
- Fix divide by 0 in variance, from David Ahern.
- Save parent pid in thread struct, from David Ahern.
- Handle JITed code in shared memory, from Andi Kleen.
- Fixes for 'perf diff', from Jiri Olsa.
- Remove some unused struct members, from Jiri Olsa.
- Add missing liblk.a dependency for python/perf.so, fix from Jiri
Olsa.
- Respect CROSS_COMPILE in liblk.a, from Rabin Vincent.
- No need to do locking when adding hists in perf report, only 'top'
needs that, from Namhyung Kim.
- Fix alignment of symbol column in the hists browser (top,
report) when -v is given, from Namhyung Kim.
- Fix 'perf top' -E option behavior, from Namhyung Kim.
- Fix bug in isupper() and islower(), from Sukadev Bhattiprolu.
- Fix compile errors in bp_signal 'perf test', from Sukadev
Bhattiprolu.
... and more things"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (102 commits)
perf/x86: Disable PEBS-LL in intel_pmu_pebs_disable()
perf/x86: Fix shared register mutual exclusion enforcement
perf/x86/intel: Support full width counting
x86: Add NMI duration tracepoints
perf: Drop sample rate when sampling is too slow
x86: Warn when NMI handlers take large amounts of time
hw_breakpoint: Introduce "struct bp_cpuinfo"
hw_breakpoint: Simplify *register_wide_hw_breakpoint()
hw_breakpoint: Introduce cpumask_of_bp()
hw_breakpoint: Simplify the "weight" usage in toggle_bp_slot() paths
hw_breakpoint: Simplify list/idx mess in toggle_bp_slot() paths
perf/x86/intel: Add mem-loads/stores support for Haswell
perf/x86/intel: Support Haswell/v4 LBR format
perf/x86/intel: Move NMI clearing to end of PMI handler
perf/x86/intel: Add Haswell PEBS support
perf/x86/intel: Add simple Haswell PMU support
perf/x86/intel: Add Haswell PEBS record support
perf/x86/intel: Fix sparse warning
perf/x86/amd: AMD IOMMU Performance Counter PERF uncore PMU implementation
perf/x86/amd: Add IOMMU Performance Counter resource management
...
|
||
|
|
a4883ef6af |
Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core irq changes from Ingo Molnar:
"The main changes:
- generic-irqchip driver additions, cleanups and fixes
- 3 new irqchip drivers: ARMv7-M NVIC, TB10x and Marvell Orion SoCs
- irq_get_trigger_type() simplification and cross-arch cleanup
- various cleanups, simplifications
- documentation updates"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
softirq: Use _RET_IP_
genirq: Add the generic chip to the genirq docbook
genirq: generic-chip: Export some irq_gc_ functions
genirq: Fix can_request_irq() for IRQs without an action
irqchip: exynos-combiner: Staticize combiner_init
irqchip: Add support for ARMv7-M NVIC
irqchip: Add TB10x interrupt controller driver
irqdomain: Use irq_get_trigger_type() to get IRQ flags
MIPS: octeon: Use irq_get_trigger_type() to get IRQ flags
arm: orion: Use irq_get_trigger_type() to get IRQ flags
mfd: stmpe: use irq_get_trigger_type() to get IRQ flags
mfd: twl4030-irq: Use irq_get_trigger_type() to get IRQ flags
gpio: mvebu: Use irq_get_trigger_type() to get IRQ flags
genirq: Add irq_get_trigger_type() to get IRQ flags
genirq: Irqchip: document gcflags arg of irq_alloc_domain_generic_chips
genirq: Set irq thread to RT priority on creation
irqchip: Add support for Marvell Orion SoCs
genirq: Add kerneldoc for irq_disable.
genirq: irqchip: Add mask to block out invalid irqs
genirq: Generic chip: Add linear irq domain support
...
|
||
|
|
ab3d681e9d |
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
"The major changes:
- Simplify RCU's grace-period and callback processing based on the new
numbering for callbacks.
- Removal of TINY_PREEMPT_RCU in favor of TREE_PREEMPT_RCU for
single-CPU low-latency systems.
- SRCU-related changes and fixes.
- Miscellaneous fixes, including converting a few remaining printk()
calls to pr_*().
- Documentation updates"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
rcu: Shrink TINY_RCU by reworking CPU-stall ifdefs
rcu: Shrink TINY_RCU by moving exit_rcu()
rcu: Remove TINY_PREEMPT_RCU tracing documentation
rcu: Consolidate rcutiny_plugin.h ifdefs
rcu: Remove rcu_preempt_note_context_switch()
rcu: Remove the CONFIG_TINY_RCU ifdefs in rcutiny.h
rcu: Remove check_cpu_stall_preempt()
rcu: Simplify RCU_TINY RCU callback invocation
rcu: Remove rcu_preempt_process_callbacks()
rcu: Remove rcu_preempt_remove_callbacks()
rcu: Remove rcu_preempt_check_callbacks()
rcu: Remove show_tiny_preempt_stats()
rcu: Remove TINY_PREEMPT_RCU
powerpc,kvm: fix imbalance srcu_read_[un]lock()
rcu: Remove srcu_read_lock_raw() and srcu_read_unlock_raw().
rcu: Apply Dave Jones's NOCB Kconfig help feedback
rcu: Merge adjacent identical ifdefs
rcu: Drive quiescent-state-forcing delay from HZ
rcu: Remove "Experimental" flags
kthread: Add kworker kthreads to OS-jitter documentation
...
|
||
|
|
0c46d68d19 |
Merge branch 'core-mutexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull WW mutex support from Ingo Molnar:
"This tree adds support for wound/wait style locks, which the graphics
guys would like to make use of in the TTM graphics subsystem.
Wound/wait mutexes are used when other multiple lock acquisitions of a
similar type can be done in an arbitrary order. The deadlock handling
used here is called wait/wound in the RDBMS literature: The older
task waits until it can acquire the contended lock. The younger
task needs to back off and drop all the locks it is currently
holding, i.e. the younger task is wounded.
See this LWN.net description of W/W mutexes:
https://lwn.net/Articles/548909/
The comments there outline specific use cases for this facility (which
have already been implemented for the DRM tree).
Also see Documentation/ww-mutex-design.txt for more details"
* 'core-mutexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking-selftests: Handle unexpected failures more strictly
mutex: Add more w/w tests to test EDEADLK path handling
mutex: Add more tests to lib/locking-selftest.c
mutex: Add w/w tests to lib/locking-selftest.c
mutex: Add w/w mutex slowpath debugging
mutex: Add support for wound/wait style locks
arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or not
|
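The back-off rule described in the WW mutex merge message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel implementation: the `Task` class, the `on_contention` function, and the ticket counter are hypothetical names invented for illustration, and "age" is modelled as a monotonically increasing stamp where a lower stamp means an older task.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical ticket counter: each new task gets the next stamp,
# so a lower stamp means an older task.
_stamps = count(1)


@dataclass
class Task:
    name: str
    stamp: int = field(default_factory=lambda: next(_stamps))
    held: set = field(default_factory=set)  # names of locks currently held


def on_contention(requester: Task, holder: Task) -> str:
    """Decide what `requester` does when the lock it wants is held by `holder`."""
    if requester.stamp < holder.stamp:
        # The older task simply waits until the contended lock is free.
        return "wait"
    # The younger task is wounded: it drops every lock it holds and backs off.
    requester.held.clear()
    return "back_off"


old = Task("old")      # created first, so it is the older task
young = Task("young")
young.held.update({"lock_a", "lock_b"})

print(on_contention(old, young))    # older requester waits
print(on_contention(young, old))    # younger requester backs off
print(young.held)                   # the younger task no longer holds any locks
```

Because only the younger task ever backs off, two tasks cannot wait on each other indefinitely in this model: the older one always makes progress. See Documentation/ww-mutex-design.txt, cited in the merge message, for the actual design.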