Commit Graph

96 Commits

Author SHA1 Message Date
Alexey Dobriyan
f22ef333c3 cpumask: make cpumask_next() out-of-line
Every for_each_XXX_cpu() invocation calls cpumask_next() which is an
inline function:

	static inline unsigned int cpumask_next(int n, const struct cpumask *srcp)
	{
	        /* -1 is a legal arg here. */
	        if (n != -1)
	                cpumask_check(n);
	        return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
	}

However!

find_next_bit() is regular out-of-line function which means "nr_cpu_ids"
load and increment happen at the caller resulting in a lot of bloat

x86_64 defconfig:
	add/remove: 3/0 grow/shrink: 8/373 up/down: 155/-5668 (-5513)
x86_64 allyesconfig-ish:
	add/remove: 3/1 grow/shrink: 57/634 up/down: 3515/-28177 (-24662) !!!

Some archs redefine find_next_bit() but it is OK:

	m68k		inline but SMP is not supported
	arm		out-of-line
	unicore32	out-of-line

Function call will happen anyway, so move load and increment into callee.

Link: http://lkml.kernel.org/r/20170824230010.GA1593@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:51 -07:00
Alexey Dobriyan
9b130ad5bb treewide: make "nr_cpu_ids" unsigned
First, number of CPUs can't be negative number.

Second, different signnnedness leads to suboptimal code in the following
cases:

1)
	kmalloc(nr_cpu_ids * sizeof(X));

"int" has to be sign extended to size_t.

2)
	while (loff_t *pos < nr_cpu_ids)

MOVSXD is 1 byte longed than the same MOV.

Other cases exist as well. Basically compiler is told that nr_cpu_ids
can't be negative which can't be deduced if it is "int".

Code savings on allyesconfig kernel: -3KB

	add/remove: 0/0 grow/shrink: 25/264 up/down: 261/-3631 (-3370)
	function                                     old     new   delta
	coretemp_cpu_online                          450     512     +62
	rcu_init_one                                1234    1272     +38
	pci_device_probe                             374     399     +25

				...

	pgdat_reclaimable_pages                      628     556     -72
	select_fallback_rq                           446     369     -77
	task_numa_find_cpu                          1923    1807    -116

Link: http://lkml.kernel.org/r/20170819114959.GA30580@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:48 -07:00
Peter Zijlstra
6c8557bdb2 smp, cpumask: Use non-atomic cpumask_{set,clear}_cpu()
The cpumasks in smp_call_function_many() are private and not subject
to concurrency, atomic bitops are pointless and expensive.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-23 10:01:32 +02:00
Peter Zijlstra
c743f0a5c5 sched/fair, cpumask: Export for_each_cpu_wrap()
More users for for_each_cpu_wrap() have appeared. Promote the construct
to generic cpumask interface.

The implementation is slightly modified to reduce arguments.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Lauro Ramos Venancio <lvenanci@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: lwang@redhat.com
Link: http://lkml.kernel.org/r/20170414122005.o35me2h5nowqkxbv@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15 10:15:23 +02:00
Alexey Dobriyan
c311c79799 cpumask: make "nr_cpumask_bits" unsigned
Bit searching functions accept "unsigned long" indices but
"nr_cpumask_bits" is "int" which is signed, so inevitable sign
extensions occur on x86_64.  Those MOVSX are #1 MOVSX bloat by number of
uses across whole kernel.

Change "nr_cpumask_bits" to unsigned, this number can't be negative
after all.  It allows to do implicit zero-extension on x86_64 without
MOVSX.

Change signed comparisons into unsigned comparisons where necessary.

Other uses looks fine because it is either argument passed to a function
or comparison is already unsigned.

Net win on allyesconfig type of kernel: ~2.8 KB (!)

	add/remove: 0/0 grow/shrink: 8/725 up/down: 93/-2926 (-2833)
	function                                     old     new   delta
	xen_exit_mmap                                691     735     +44
	qstat_read                                   426     440     +14
	__cpufreq_cooling_register                  1678    1687      +9
	trace_rb_cpu_prepare                         447     455      +8
	vermagic                                      54      60      +6
	nfp_driver_version                            54      60      +6
	rcu_torture_stats_print                     1147    1151      +4
	find_next_push_cpu                           267     269      +2
	xen_irq_resume                               961     960      -1
				...
	init_vp_index                                946     906     -40
	od_set_powersave_bias                        328     281     -47
	power_cpu_exit                               193     139     -54
	arch_show_interrupts                        3538    3484     -54
	select_idle_sibling                         1558    1471     -87
	Total: Before=158358910, After=158356077, chg -0.00%

Same arguments apply to "nr_cpu_ids" but I haven't yet found enough
courage to delve into this issue (and proper fix may require new type
"cpu_t" which is whole separate story).

Link: http://lkml.kernel.org/r/20170309205322.GA1728@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:11 -07:00
Matthias Kaehlcke
f7e30f01a9 cpumask: Add helper cpumask_available()
With CONFIG_CPUMASK_OFFSTACK=y cpumask_var_t is a struct cpumask
pointer, otherwise a struct cpumask array with a single element.

Some code dealing with cpumasks needs to validate that a cpumask_var_t
is not a NULL pointer when CONFIG_CPUMASK_OFFSTACK=y. This is typically
done by performing the check always, regardless of the underlying type
of cpumask_var_t. This works in both cases, however clang raises a
warning like this when CONFIG_CPUMASK_OFFSTACK=n:

kernel/irq/manage.c:839:28: error: address of array
'desc->irq_common_data.affinity' will always evaluate to 'true'
[-Werror,-Wpointer-bool-conversion]

Add the inline helper cpumask_available() which only performs the
pointer check if CONFIG_CPUMASK_OFFSTACK=y.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Cc: Grant Grundler <grundler@chromium.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Hackmann <ghackmann@google.com>
Cc: Michael Davidson <md@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20170412182030.83657-1-mka@chromium.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14 19:50:47 +02:00
Linus Torvalds
20dcfe1b7d Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "Nothing exciting, just the usual pile of fixes, updates and cleanups:

   - A bunch of clocksource driver updates

   - Removal of CONFIG_TIMER_STATS and the related /proc file

   - More posix timer slim down work

   - A scalability enhancement in the tick broadcast code

   - Math cleanups"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  hrtimer: Catch invalid clockids again
  math64, tile: Fix build failure
  clocksource/drivers/arm_arch_timer:: Mark cyclecounter __ro_after_init
  timerfd: Protect the might cancel mechanism proper
  timer_list: Remove useless cast when printing
  time: Remove CONFIG_TIMER_STATS
  clocksource/drivers/arm_arch_timer: Work around Hisilicon erratum 161010101
  clocksource/drivers/arm_arch_timer: Introduce generic errata handling infrastructure
  clocksource/drivers/arm_arch_timer: Remove fsl-a008585 parameter
  clocksource/drivers/arm_arch_timer: Add dt binding for hisilicon-161010101 erratum
  clocksource/drivers/ostm: Add renesas-ostm timer driver
  clocksource/drivers/ostm: Document renesas-ostm timer DT bindings
  clocksource/drivers/tcb_clksrc: Use 32 bit tcb as sched_clock
  clocksource/drivers/gemini: Add driver for the Cortina Gemini
  clocksource: add DT bindings for Cortina Gemini
  clockevents: Add a clkevt-of mechanism like clksrc-of
  tick/broadcast: Reduce lock cacheline contention
  timers: Omit POSIX timer stuff from task_struct when disabled
  x86/timer: Make delay() work during early bootup
  delay: Add explanation of udelay() inaccuracy
  ...
2017-02-20 10:06:32 -08:00
Tejun Heo
4d59b6ccf0 cpumask: use nr_cpumask_bits for parsing functions
Commit 513e3d2d11 ("cpumask: always use nr_cpu_ids in formatting and
parsing functions") converted both cpumask printing and parsing
functions to use nr_cpu_ids instead of nr_cpumask_bits.  While this was
okay for the printing functions as it just picked one of the two output
formats that we were alternating between depending on a kernel config,
doing the same for parsing wasn't okay.

nr_cpumask_bits can be either nr_cpu_ids or NR_CPUS.  We can always use
nr_cpu_ids but that is a variable while NR_CPUS is a constant, so it can
be more efficient to use NR_CPUS when we can get away with it.
Converting the printing functions to nr_cpu_ids makes sense because it
affects how the masks get presented to userspace and doesn't break
anything; however, using nr_cpu_ids for parsing functions can
incorrectly leave the higher bits uninitialized while reading in these
masks from userland.  As all testing and comparison functions use
nr_cpumask_bits which can be larger than nr_cpu_ids, the parsed cpumasks
can erroneously yield false negative results.

This made the taskstats interface incorrectly return -EINVAL even when
the inputs were correct.

Fix it by restoring the parse functions to use nr_cpumask_bits instead
of nr_cpu_ids.

Link: http://lkml.kernel.org/r/20170206182442.GB31078@htj.duckdns.org
Fixes: 513e3d2d11 ("cpumask: always use nr_cpu_ids in formatting and parsing functions")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Martin Steigerwald <martin.steigerwald@teamix.de>
Debugged-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-08 15:41:43 -08:00
Waiman Long
668802c257 tick/broadcast: Reduce lock cacheline contention
It was observed that on an Intel x86 system without the ARAT (Always
running APIC timer) feature and with fairly large number of CPUs as
well as CPUs coming in and out of intel_idle frequently, the lock
contention on the tick_broadcast_lock can become significant.

To reduce contention, the lock is put into its own cacheline and all
the cpumask_var_t variables are put into the __read_mostly section.

Running the SP benchmark of the NAS Parallel Benchmarks on a 4-socket
16-core 32-thread Nehalam system, the performance number improved
from 3353.94 Mop/s to 3469.31 Mop/s when this patch was applied on
a 4.9.6 kernel.  This is a 3.4% improvement.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1485799063-20857-1-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04 08:54:46 +01:00
Thomas Gleixner
427d77a323 x86/smpboot: Prevent false positive out of bounds cpumask access warning
prefill_possible_map() reinitializes the cpu_possible_map by setting the
possible cpu bits and clearing all other bits up to NR_CPUS.

This is technically always correct because cpu_possible_map is statically
allocated and sized NR_CPUS. With CPUMASK_OFFSTACK and DEBUG_PER_CPU_MAPS
enabled the bounds check of cpu masks happens on nr_cpu_ids. nr_cpu_ids is
initialized to NR_CPUS and only limited after the set/clear bit loops have
been executed. 

But if the system was booted with "nr_cpus=N" on the command line, where N
is < NR_CPUS then nr_cpu_ids is limited in the parameter parsing function
before prefill_possible_map() is invoked. As a consequence the cpumask
bounds check triggers when clearing the bits past nr_cpu_ids.

Add a helper which allows to reset cpu_possible_map w/o the bounds check
and then set only the possible bits which are well inside bounds.

Reported-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: 0x7f454c46@gmail.com
Cc: Jan Beulich <JBeulich@novell.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612131836050.3415@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-15 11:32:31 +01:00
Geliang Tang
b06fb41533 cpumask: fix code comment
Fix code comment for cpumask_parse().

Link: http://lkml.kernel.org/r/71aae2c60ae5dae0cf554199ce6aea8f88c69347.1465380581.git.geliangtang@gmail.com
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 19:35:24 -04:00
Peter Zijlstra (Intel)
e9d867a67f sched: Allow per-cpu kernel threads to run on online && !active
In order to enable symmetric hotplug, we must mirror the online &&
!active state of cpu-down on the cpu-up side.

However, to retain sanity, limit this state to per-cpu kthreads.

Aside from the change to set_cpus_allowed_ptr(), which allow moving
the per-cpu kthreads on, the other critical piece is the cpu selection
for pinned tasks in select_task_rq(). This avoids dropping into
select_fallback_rq().

select_fallback_rq() cannot be allowed to select !active cpus because
its used to migrate user tasks away. And we do not want to move user
tasks onto cpus that are in transition.

Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jan H. Schönherr <jschoenh@amazon.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160301152303.GV6356@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-06 14:58:22 +02:00
Eric Biggers
95f27356e4 cpumask: remove incorrect information from comment
Since commit cdfdef75e7 ("cpumask: only allocate nr_cpumask_bits."),
this comment above cpumask_size() is no longer relevant.

Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Rasmus Villemoes
9425676a36 kernel/cpu.c: make set_cpu_* static inlines
Almost all callers of the set_cpu_* functions pass an explicit true or
false.  Making them static inline thus replaces the function calls with a
simple set_bit/clear_bit, saving some .text.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rasmus Villemoes
5aec01b834 kernel/cpu.c: eliminate cpu_*_mask
Replace the variables cpu_possible_mask, cpu_online_mask, cpu_present_mask
and cpu_active_mask with macros expanding to expressions of the same type
and value, eliminating some indirection.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rasmus Villemoes
4b804c85dc kernel/cpu.c: export __cpu_*_mask
Exporting the cpumasks __cpu_possible_mask and friends will allow us to
remove the extra indirection through the cpu_*_mask variables.  It will
also allow the set_cpu_* functions to become static inlines, which will
give a .text reduction.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Rusty Russell
f36963c9d3 cpumask_set_cpu_local_first => cpumask_local_spread, lament
da91309e0a (cpumask: Utility function to set n'th cpu...) created a
genuinely weird function.  I never saw it before, it went through DaveM.
(He only does this to make us other maintainers feel better about our own
mistakes.)

cpumask_set_cpu_local_first's purpose is say "I need to spread things
across N online cpus, choose the ones on this numa node first"; you call
it in a loop.

It can fail.  One of the two callers ignores this, the other aborts and
fails the device open.

It can fail in two ways: allocating the off-stack cpumask, or through a
convoluted codepath which AFAICT can only occur if cpu_online_mask
changes.  Which shouldn't happen, because if cpu_online_mask can change
while you call this, it could return a now-offline cpu anyway.

It contains a nonsensical test "!cpumask_of_node(numa_node)".  This was
drawn to my attention by Geert, who said this causes a warning on Sparc.
It sets a single bit in a cpumask instead of returning a cpu number,
because that's what the callers want.

It could be made more efficient by passing the previous cpu rather than
an index, but that would be more invasive to the callers.

Fixes: da91309e0a
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (then rebased)
Tested-by: Amir Vadai <amirv@mellanox.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Acked-by: David S. Miller <davem@davemloft.net>
2015-05-28 11:05:20 +09:30
Rusty Russell
1527781d22 cpumask: resurrect CPU_MASK_CPU0
We removed it in 2f0f267ea0 (cpumask: remove deprecated functions.),
but grep shows it still used by MIPS, and not unreasonably.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-04-16 12:33:51 +09:30
Rasmus Villemoes
3bbf7f4624 linux/cpumask.h: add typechecking to cpumask_test_cpu
The Subtlety (1) referred to vanished with 6ba2ef7baa ("cpumask:
Move deprecated functions to end of header."). That used to mention
some suboptimal code generation by a, by now, rather ancient gcc. With
gcc 4.7, I don't see any change in the generated code by making it a
static inline, so let's add type checking and get rid of the ghost
reference.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-31 13:55:31 +10:30
Rusty Russell
cdfdef75e7 cpumask: only allocate nr_cpumask_bits.
Now we'll find out the hard way if anyone has CPUMASK_OFFSTACK and is
returning these or assigning them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-10 13:54:42 +10:30
Rusty Russell
2f0f267ea0 cpumask: remove deprecated functions.
Using these functions with offstack cpus is unsafe.  They use all NR_CPUS
bits, unstead of nr_cpumask_bits.

In particular, lustre (in staging) used cpus_ and that caused a bug.

Reported-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-10 13:54:41 +10:30
Rusty Russell
9941a383df CPU_MASK_ALL/CPU_MASK_NONE: remove from deprecated region.
They're used to initialize various static fields, though static
cpumasks should generally be avoided.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-05 15:27:06 +10:30
Tejun Heo
46385326cc bitmap, cpumask, nodemask: remove dedicated formatting functions
Now that all bitmap formatting usages have been converted to
'%*pb[l]', the separate formatting functions are unnecessary.  The
following functions are removed.

* bitmap_scn[list]printf()
* cpumask_scnprintf(), cpulist_scnprintf()
* [__]nodemask_scnprintf(), [__]nodelist_scnprintf()
* seq_bitmap[_list](), seq_cpumask[_list](), seq_nodemask[_list]()
* seq_buf_bitmask()

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:39 -08:00
Tejun Heo
f1bbc032e4 cpumask, nodemask: implement cpumask/nodemask_pr_args()
printf family of functions can now format bitmaps using '%*pb[l]' and
all cpumask and nodemask formatting will be converted to use it.  To
ease printing these masks with '%*pb[l]' which require two params -
the number of bits and the actual bitmap, this patch implement
cpumask_pr_args() and nodemask_pr_args() which can be used to provide
arguments for '%*pb[l]'

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:36 -08:00
Tejun Heo
513e3d2d11 cpumask: always use nr_cpu_ids in formatting and parsing functions
bitmap implements two variants of scnprintf functions to format a bitmap
into a string and cpumask and nodemask wrap them to provide equivalent
interfaces.  The scnprintf family of functions require a string buffer as
an output target which complicates code paths which just want to print out
the mask through printk for informational or debug purposes as they have
to worry about how large the buffer should be and whether it's too large
to allocate on stack.

Neither cpumask or nodemask provides a guildeline on how large the target
buffer should be forcing users come up with their own solutions - some
allocate an arbitrarily sized buffer which is small enough to allocate on
stack but may be too short in corner cases, other come up with a custom
upper limit calculation considering the output format, some allocate the
buffer dynamically while one resorted to using lock to synchronize access
to a static buffer.

This is an artificial problem which is being solved repeatedly for no
benefit.  In a lot of cases, the output area already exists and can be
targeted directly making the intermediate buffer unnecessary.  This
patchset teaches printf family of functions how to format bitmaps and
replace the dedicated formatting functions with it.

Pointer formatting is extended to cover bitmap formatting.  It uses the
field width for the number of bits instead of precision.  The format used
is '%*pb[l]', with the optional trailing 'l' specifying list format
instead of hex masks.  For more details, please see 0002.

This patch (of 31):

Currently, the formatting and parsing functions in cpumask.h use
nr_cpumask_bits like other cpumask functions; however, nr_cpumask_bits
is either NR_CPUS or nr_cpu_ids depending on CONFIG_CPUMASK_OFFSTACK.
This leads to inconsistent behaviors.

With CONFIG_NR_CPUS=512 and !CONFIG_CPUMASK_OFFSTACK

  # cat /sys/devices/virtual/net/lo/queues/rx-0/rps_cpus
  00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000
  # cat /proc/self/status | grep Cpus_allowed:
  Cpus_allowed:   f

With CONFIG_NR_CPUS=1024 and CONFIG_CPUMASK_OFFSTACK (fedora default)

  # cat /sys/devices/virtual/net/lo/queues/rx-0/rps_cpus
  0
  # cat /proc/self/status | grep Cpus_allowed:
  Cpus_allowed:   f

Note that /proc/self/status is always using nr_cpu_ids regardless of
config.  This is because seq cpumask formattings functions always use
nr_cpu_ids.

Given that the same output fields may switch between the two forms,
converging on nr_cpu_ids always isn't too likely to surprise userland.
This patch updates the formatting and parsing functions in cpumask.h
to always use nr_cpu_ids.  There's no point in dealing with CPUs which
aren't even possible on the machine.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:36 -08:00