Commit Graph

17356 Commits

Paul E. McKenney
15f5191b6a rcu: Avoid sparse warnings in rcu_nocb_wake trace event
The event-tracing macros do not like bool tracing arguments, so this
commit changes them to type char.  This change has the knock-on effect
of making it illegal to pass a pointer into one of these arguments, so
this commit also changes rcutiny's first call to trace_rcu_batch_end()
to convert from pointer to boolean by prefixing with "!!".
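
A minimal sketch of the pattern, with purely hypothetical names (the real
tracepoint and its arguments differ):

	/* Declare the tracing argument as char rather than bool. */
	void trace_sample_batch_end(char cbs_pending);	/* hypothetical tracepoint stub */

	static void sample_do_batch(struct rcu_head *cblist)
	{
		/* "!!" collapses the pointer to 0 or 1 so that a pointer is
		 * never passed directly into the char argument. */
		trace_sample_batch_end(!!cblist);
	}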

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:18:17 -07:00
Paul E. McKenney
69a79bb12a rcu: Track rcu_nocb_kthread()'s sleeping and awakening
This commit adds event traces to track all of rcu_nocb_kthread()'s
blocking and awakening.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:18:16 -07:00
Paul E. McKenney
756cbf6bef rcu: Distinguish between NOCB and non-NOCB rcu_callback trace events
One way to distinguish between NOCB and non-NOCB rcu_callback trace
events is that the former always print zero for the lazy and non-lazy
queue lengths.  Unfortunately, this also means that we cannot see the NOCB
queue lengths.  This commit therefore accesses the NOCB queue lengths,
but negates them.  NOCB rcu_callback trace events should therefore have
negative queue lengths.
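
A rough sketch of the idea, with simplified names (not the exact fields or
trace event used by the kernel):

	/* NOCB enqueue path: trace the queue lengths negated so the event is
	 * recognizably NOCB (negative counts) yet still shows queue depth. */
	long qlen_lazy = rdp->nocb_q_count_lazy;	/* assumed field names */
	long qlen      = rdp->nocb_q_count;

	trace_sample_callback(rsp->name, head, -qlen_lazy, -qlen);	/* hypothetical event */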

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Match operand size per kbuild test robot's advice. ]
2013-09-23 09:18:14 -07:00
Paul E. McKenney
9261dd0da6 rcu: Add tracing for rcuo no-CBs CPU wakeup handshake
Lost wakeups from call_rcu() to the rcuo kthreads can result in hangs
that are difficult to diagnose.  This commit therefore adds tracing to
help pin down the cause of these hangs.

Reported-by: Clark Williams <williams@redhat.com>
Reported-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Add const per kbuild test robot's advice. ]
2013-09-23 09:18:13 -07:00
Paul E. McKenney
bb311eccbd rcu: Add tracing of normal (non-NOCB) grace-period requests
This commit adds tracing to the normal grace-period request points.
These are rcu_gp_cleanup(), which checks for the need for another
grace period at the end of the previous grace period, and
rcu_start_gp_advanced(), which restarts RCU's state machine after
an idle period.  These trace events are intended to help track down
bugs where RCU remains idle despite there being work for it to do.

Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:18:08 -07:00
Paul E. McKenney
63c4db78e8 rcu: Add tracing to rcu_gp_kthread()
This commit adds tracing to the rcu_gp_kthread() function in order to
help track down hangs potentially involving this kthread.

Reported-by: Clark Williams <williams@redhat.com>
Reported-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:16:14 -07:00
Paul E. McKenney
591c6d1710 rcu: Flag lockless access to ->gp_flags with ACCESS_ONCE()
This commit applies ACCESS_ONCE() to an outside-of-lock access to
->gp_flags.  Although it is hard to imagine any sane compiler messing
this particular case up, the documentation benefits are substantial.
Plus the definition of "sane compiler" grows ever looser.
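
For example (a sketch of the pattern, not necessarily the exact hunk):

	/* Before: plain lockless load of ->gp_flags */
	if (rsp->gp_flags & RCU_GP_FLAG_FQS)
		fqs_requested = true;	/* illustrative consumer */

	/* After: the lockless access is flagged explicitly */
	if (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS)
		fqs_requested = true;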

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:16:13 -07:00
Paul E. McKenney
88d6df612c rcu: Prevent spurious-wakeup DoS attack on rcu_gp_kthread()
Spurious wakeups in the force-quiescent-state loop in rcu_gp_kthread()
cause the timeout to be recalculated, which would prevent rcu_gp_fqs()
from ever being called.  This would in turn prevent the grace period
from ever ending for as long as there was at least one CPU in an
extended quiescent state that had not yet passed through a quiescent
state.

This commit therefore avoids recalculating the timeout unless the
previous pass's call to wait_event_interruptible_timeout() actually
did time out, thus preventing the above scenario.
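
In rough sketch form (heavily simplified; fqs_needed() is a hypothetical
stand-in for the real wakeup condition):

	j = jiffies_till_next_fqs;		/* full FQS timeout */
	for (;;) {
		ret = wait_event_interruptible_timeout(rsp->gp_wq,
						       fqs_needed(rsp), j);
		if (fqs_needed(rsp) || !ret)
			break;			/* real work to do, or timed out */
		/* Spurious wakeup: wait only for the time that remains
		 * rather than restarting the full timeout. */
		j = ret;
	}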

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:16:11 -07:00
Paul E. McKenney
f7be820939 rcu: Improve grace-period start logic
This commit improves grace-period start logic by checking ->gp_flags
under the lock and by issuing a warning if a grace period is already
in progress.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:16:10 -07:00
Paul E. McKenney
0d75292467 rcu: Have rcutiny tracepoints use tracepoint_string()
This commit extends the work done in f7f7bac9 (rcu: Have the RCU
tracepoints use the tracepoint_string infrastructure) to cover rcutiny.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
2013-09-23 09:15:31 -07:00
Paul E. McKenney
26cdfedf6a rcu: Reject memory-order-induced stall-warning false positives
If a system is idle from an RCU perspective for longer than specified
by CONFIG_RCU_CPU_STALL_TIMEOUT, and if one CPU starts a grace period
just as a second checks for CPU stalls, and if this second CPU happens
to see the old value of rsp->jiffies_stall, it will incorrectly report a
CPU stall.  This is quite rare, but apparently occurs deterministically
on systems with about 6TB of memory.

This commit therefore orders accesses to the data used to determine
whether or not a CPU stall is in progress.  Grace-period initialization
and cleanup first increments rsp->completed to mark the end of the
previous grace period, then records the current jiffies in rsp->gp_start,
then records the jiffies at which a stall can be expected to occur in
rsp->jiffies_stall, and finally increments rsp->gpnum to mark the start
of the new grace period.  Now, this ordering by itself does not prevent
false positives.  For example, if grace-period initialization was delayed
between recording rsp->gp_start and rsp->jiffies_stall, the CPU stall
warning code might still see an old value of rsp->jiffies_stall.

This commit therefore also orders the CPU stall warning accesses,
loading rsp->gpnum and jiffies, then rsp->jiffies_stall, then
rsp->gp_start, and finally rsp->completed.  This ordering means that
the false-positive scenario in the previous paragraph would result
in rsp->completed being greater than or equal to rsp->gpnum, which is
never valid for a CPU stall, allowing the false positive to be rejected.
Furthermore, any fetch that gets an old value of rsp->jiffies_stall
must also get an old value of rsp->gpnum, which will again be rejected
by the comparison of rsp->gpnum and rsp->completed.  Situations where
rsp->gp_start is later than rsp->jiffies_stall are also rejected, as
are situations where jiffies is less than rsp->jiffies_stall.

Although use of unsynchronized accesses means that there are likely
still some false-positive scenarios (synchronization has proven to be
a very bad idea on large systems), this should get rid of a large class
of these scenarios.
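
In sketch form (simplified; the real code spreads these accesses across the
grace-period and stall-warning paths and may differ in detail):

	/* Stall-warning side: load in the reverse order of the updates,
	 * then reject any inconsistent snapshot. */
	j = jiffies;
	gpnum = ACCESS_ONCE(rsp->gpnum);
	smp_rmb();
	js = ACCESS_ONCE(rsp->jiffies_stall);
	smp_rmb();
	gps = ACCESS_ONCE(rsp->gp_start);
	smp_rmb();
	completed = ACCESS_ONCE(rsp->completed);
	if (ULONG_CMP_GE(completed, gpnum) ||	/* no GP in progress */
	    ULONG_CMP_LT(j, js) ||		/* not yet time for a stall */
	    ULONG_CMP_GE(gps, js))		/* inconsistent snapshot */
		return;				/* reject the false positive */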

Reported-by: Fabian Herschel <fabian.herschel@suse.com>
Reported-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Jochen Striepe <jochen@tolot.escape.de>
2013-09-23 09:15:30 -07:00
Paul E. McKenney
69c8d28c96 rcu: Micro-optimize rcu_cpu_has_callbacks()
The for_each_rcu_flavor() loop unconditionally scans all flavors, even
when the first flavor already has non-lazy callbacks.  Once the
loop has seen a non-lazy callback, further passes through the loop
cannot change the state.  This is not a huge problem, given that there
can be at most three RCU flavors (RCU-bh, RCU-preempt, and RCU-sched),
but this code is on the path to idle, so speeding it up even a small
amount would have some benefit.

This commit therefore does two things:

1.	Rearranges the order of the list of RCU flavors in order to
	place the most active flavor first in the list.  The most active
	RCU flavor is RCU-preempt, or, if there is no RCU-preempt,
	RCU-sched.

2.	Reworks the for_each_rcu_flavor() loop to exit early when the
	first non-lazy callback is seen, or, in the case where the caller
	does not care about non-lazy callbacks (RCU_FAST_NO_HZ=n), when
	the first callback is seen, as sketched below.
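
A sketch of the reworked loop (simplified; the "_sketch" name and some
details are assumptions, not the exact kernel code):

	static bool rcu_cpu_has_callbacks_sketch(bool *all_lazy)
	{
		bool al = true;		/* all callbacks seen so far are lazy */
		bool hc = false;	/* at least one callback present */
		struct rcu_data *rdp;
		struct rcu_state *rsp;

		for_each_rcu_flavor(rsp) {
			rdp = this_cpu_ptr(rsp->rda);
			if (!rdp->nxtlist)
				continue;
			hc = true;
			if (rdp->qlen != rdp->qlen_lazy || !all_lazy) {
				al = false;
				break;	/* non-lazy seen, or caller doesn't care */
			}
		}
		if (all_lazy)
			*all_lazy = al;
		return hc;
	}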

Reported-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:15:28 -07:00
Paul E. McKenney
289828e62d rcu: Silence unused-variable warnings
The "idle" variable in both rcu_eqs_enter_common() and
rcu_eqs_exit_common() is only used in a WARN_ON_ONCE().  If the kernel
is built disabling WARN_ON_ONCE(), the compiler will complain (rightly)
that "idle" is unused.  This commit therefore adds a __maybe_unused to
the declaration of both variables.
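
An illustrative (not verbatim) example of the pattern:

	int idle __maybe_unused = is_idle_task(current);

	/* If WARN_ON_ONCE() is configured away, "idle" would otherwise be
	 * flagged as set but never used. */
	WARN_ON_ONCE(!idle);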

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:15:27 -07:00
Christoph Lameter
c9d4b0af9e rcu: Replace __get_cpu_var() uses
__get_cpu_var() is used for multiple purposes in the kernel source. One
of them is address calculation via the form &__get_cpu_var(x). This
calculates the address for the instance of the percpu variable of the
current processor based on an offset.

Other use cases are for storing and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as:

	#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

That is, __get_cpu_var() always only does an address determination.
However, store and retrieve operations could use a segment prefix (or a
global register on other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into
a percpu area and use optimized assembly code to read and write per
cpu variables.

This patch converts __get_cpu_var() into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations
that use the offset.  Thereby address calculations are avoided and
fewer registers are used when code is generated.

At the end of the patchset all uses of __get_cpu_var have been removed
so the macro is removed too.

The patchset includes passes over all arches as well.  Once these
operations are used throughout, specialized macros can be defined in
non-x86 arches as well in order to optimize per cpu access by e.g.
using a global register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processor's instance of a per cpu
   variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y);

   Converts to

	int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y) = x;

   Converts to

	this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	this_cpu_inc(y)

Signed-off-by: Christoph Lameter <cl@linux.com>
[ paulmck: Address conflicts. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:15:22 -07:00
Paul E. McKenney
829511d8aa rcu: Fix dubious "if" condition in __call_rcu_nocb_enqueue()
This commit replaces an incorrect (but fortunately functional)
bitwise OR ("|") operator with the correct logical OR ("||").
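
Illustratively (the operand and helper names here are made up):

	if (was_empty | (len > qhimark))	/* bitwise OR: happens to work on 0/1 values */
		wake_nocb_kthread(rdp);		/* hypothetical */

	if (was_empty || (len > qhimark))	/* logical OR: what was meant */
		wake_nocb_kthread(rdp);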

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:13:44 -07:00
Paul E. McKenney
01896f7e0a rcu: Convert local functions to static
The rcu_cpu_stall_timeout kernel parameter, the rcu_dynticks per-CPU
variable, and the rcu_gp_fqs() function are used only locally.  This
commit therefore marks them as static.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:12:31 -07:00
Li Zefan
cd64647f04 hung_task: Change sysctl_hung_task_check_count to 'int'
As 'sysctl_hung_task_check_count' is 'unsigned long', when this value
is assigned to max_count in check_hung_uninterruptible_tasks() it is
truncated to 'int' type.

This causes a minor artifact: if we write 2^32 to sysctl.hung_task_check_count,
hung task detection will be effectively disabled.

With this fix, it will still truncate the user input to 32 bits, but
reading sysctl.hung_task_check_count reflects the actual truncated value.
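
A sketch of the mismatch being fixed (simplified):

	/* Before: the sysctl variable is unsigned long ... */
	unsigned long sysctl_hung_task_check_count = PID_MAX_LIMIT;

	/* ... but the checker reads it back into an int, so a value of 2^32
	 * written via the sysctl silently becomes 0 (detection disabled): */
	int max_count = sysctl_hung_task_check_count;

Making the sysctl itself an 'int' keeps the stored value (as read back
through /proc/sys) consistent with the value the checker actually uses.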

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/523FFF4E.9050401@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-23 11:10:49 +02:00
Rusty Russell
3f2b9c9cdf module: remove rmmod --wait option.
The option to wait for a module reference count to reach zero was in
the initial module implementation, but it was never supported in
modprobe (you had to use rmmod --wait).  After discussion with Lucas,
it has been deprecated (with a 10-second sleep) in kmod for the last
year.

This finally removes it: the flag will evoke a printk warning and a
normal (non-blocking) remove attempt.

Cc: Lucas De Marchi <lucas.de.marchi@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-09-23 15:44:58 +09:30
Paul E. McKenney
b3f2d02598 rcu: Use proper cpp macro for ->gp_flags
One of the ->gp_flags assignments used a raw number rather than the
cpp macro that was intended for this purpose, which this commit fixes.
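
Illustratively:

	rsp->gp_flags = 1;			/* before: magic number */
	rsp->gp_flags = RCU_GP_FLAG_INIT;	/* after: the intended cpp macro */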

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-20 09:43:06 -07:00
Jason Low
f48627e686 sched/balancing: Periodically decay max cost of idle balance
This patch builds on patch 2 and periodically decays each sched
domain's max idle-balance cost by approximately 1% per second.  The
rq's max_idle_balance_cost value is decayed as well.
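
A sketch of the decay step run from the periodic rebalance path (close to,
but not guaranteed to match, the exact kernel code):

	if (time_after(jiffies, sd->next_decay_max_lb_cost)) {
		/* 253/256 ~= 0.99: decay the recorded max by about 1%,
		 * roughly once per second. */
		sd->max_newidle_lb_cost = (sd->max_newidle_lb_cost * 253) / 256;
		sd->next_decay_max_lb_cost = jiffies + HZ;
	}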

Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379096813-3032-4-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20 12:03:46 +02:00
Jason Low
9bd721c55c sched/balancing: Consider max cost of idle balance per sched domain
In this patch, we keep track of the max cost we spend doing idle load balancing
for each sched domain.  If the avg time the CPU remains idle is less than the
time we have already spent on idle balancing plus the max cost of idle balancing
in the sched domain, then we don't continue to attempt the balance.  We also
keep a per-rq variable, max_idle_balance_cost, which keeps track of the max
time spent on newidle load balances throughout all its domains so that we can
determine avg_idle's max value.

By using the max, we avoid overrunning the average. This further reduces the
chance we attempt balancing when the CPU is not idle for longer than the cost
to balance.
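
In sketch form (simplified from the newidle-balance path; balance_one_domain()
is a hypothetical helper standing in for the actual balancing work):

	for_each_domain(this_cpu, sd) {
		/* The expected idle time cannot cover what balancing this
		 * domain has cost in the worst case: stop here. */
		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost)
			break;

		domain_cost = balance_one_domain(sd);
		if (domain_cost > sd->max_newidle_lb_cost)
			sd->max_newidle_lb_cost = domain_cost;
		curr_cost += domain_cost;
	}

	if (curr_cost > this_rq->max_idle_balance_cost)
		this_rq->max_idle_balance_cost = curr_cost;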

Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379096813-3032-3-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20 12:03:44 +02:00
Jason Low
abfafa54db sched: Reduce overestimating rq->avg_idle
When updating avg_idle, if the delta exceeds some max value, then avg_idle
gets set to the max, regardless of what the previous avg was. This can cause
avg_idle to often be overestimated.

This patch modifies the way we update avg_idle by always updating it with the
function call to update_avg() first. Then, if avg_idle exceeds the max, we set
it to the max.
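
In sketch form (the cap shown is an assumption, not necessarily the bound
used in the kernel):

	u64 max = 2 * sysctl_sched_migration_cost;

	/* Fold the new sample into the average first ... */
	update_avg(&rq->avg_idle, delta);
	/* ... and only then clamp, so one huge idle period no longer
	 * overwrites the average outright. */
	if (rq->avg_idle > max)
		rq->avg_idle = max;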

Signed-off-by: Jason Low <jason.low2@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379096813-3032-2-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20 12:03:41 +02:00
Vladimir Davydov
7aff2e3a56 sched/balancing: Prevent the reselection of a previous env.dst_cpu if some tasks are pinned
Currently it is actually new_dst_cpu, not dst_cpu, that is prevented
from being reselected.  This can result in attempting to pull tasks to
this_cpu twice.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/281f59b6e596c718dd565ad267fc38f5b8e5c995.1379265590.git.vdavydov@parallels.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20 12:02:20 +02:00
Ingo Molnar
40a0c68ca9 Merge branch 'sched/urgent' into sched/core
Merge in the latest fixes before applying a dependent patch.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20 12:01:01 +02:00
Vladimir Davydov
7e3115ef51 sched/balancing: Fix cfs_rq->task_h_load calculation
Patch a003a2 (sched: Consider runnable load average in move_tasks())
sets all top-level cfs_rqs' h_load to rq->avg.load_avg_contrib, which is
always 0.  This typo leads to all tasks having weight 0 when load
balancing in a cpu-cgroup-enabled setup.  It should instead be the sum
of the weights of all runnable tasks.  Fix it.
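
A sketch of the one-line change (assuming it lands in update_cfs_rq_h_load(),
as the surrounding description suggests):

	cfs_rq->h_load = rq->avg.load_avg_contrib;	/* before: always 0 for the top-level cfs_rq */
	cfs_rq->h_load = cfs_rq->runnable_load_avg;	/* after: sum of runnable task weights */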

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379173186-11944-1-git-send-email-vdavydov@parallels.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20 11:59:39 +02:00