The smpboot threads rely on the park/unpark mechanism which binds per
cpu threads on a particular core. Though the functionality is racy:
CPU0 CPU1 CPU2
unpark(T) wake_up_process(T)
clear(SHOULD_PARK) T runs
leave parkme() due to !SHOULD_PARK
bind_to(CPU2) BUG_ON(wrong CPU)
We cannot let the tasks move themself to the target CPU as one of
those tasks is actually the migration thread itself, which requires
that it starts running on the target cpu right away.
The solution to this problem is to prevent wakeups in park mode which
are not from unpark(). That way we can guarantee that the association
of the task to the target cpu is working correctly.
Add a new task state (TASK_PARKED) which prevents other wakeups and
use this state explicitly for the unpark wakeup.
Peter noticed: Also, since the task state is visible to userspace and
all the parked tasks are still in the PID space, its a good hint in ps
and friends that these tasks aren't really there for the moment.
The migration thread has another related issue.
CPU0 CPU1
Bring up CPU2
create_thread(T)
park(T)
wait_for_completion()
parkme()
complete()
sched_set_stop_task()
schedule(TASK_PARKED)
The sched_set_stop_task() call is issued while the task is on the
runqueue of CPU1 and that confuses the hell out of the stop_task class
on that cpu. So we need the same synchronizaion before
sched_set_stop_task().
Reported-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Dave Hansen <dave@sr71.net>
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Acked-by: Peter Ziljstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: dhillf@gmail.com
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull slave-dmaengine fixes from Vinod Koul:
"The first one fixes issue in pl330 to check for DT compatible and
the second one fixes omap-dma to start without delay"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: omap-dma: Start DMA without delay for cyclic channels
DMA: PL330: Add check if device tree compatible
Pull power management fixes from Rafael Wysocki:
- System reboot/halt fix related to CPU offline ordering from Huacai
Chen.
- intel_pstate driver fix for a delay time computation error
occasionally crashing systems using it from Dirk Brandewie.
* tag 'pm-3.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()
cpufreq / intel_pstate: Set timer timeout correctly
Pull regmap revert from Mark Brown:
"regmap: Back out work buffer fix
This reverts commit bc8ce4 (regmap: don't corrupt work buffer in
_regmap_raw_write()) since it turns out that it can cause issues when
taken in isolation from the other changes in -next that lead to its
discovery. On the basis that nobody noticed the problems for quite
some time without that subsequent work let's drop it from v3.9."
* tag 'regmap-v3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Back out work buffer fix
Pull GPIO fix from Linus Walleij:
"Oneliner fix for the PCA 953x driver."
* tag 'gpio-fixes-v3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: pca953x: fix irq_domain_add_simple usage
Pull ARM SoC bug fixes from Arnd Bergmann:
"A little later during the week than the last few pull requests, since
there was very little that came in before 3.9-rc6. At least things
have calmed down again here.
Some important bug fixes that came in over the last 10 days, mostly
mvebu and imx:
- Multiple regressions on i.mx following the conversion of the clock
code, hopefully the last we are seeing of those.
- a regression in the mvebu irq handling code
- An incorrect register offset in the rewritten s3c24xx irq code.
- Two bugs in setting up the iomega_ix2_200 machine
- Turning on an extra bus clock on imx
- A MAINTAINERS file entry for Roland Stigge"
* tag 'arm-soc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm: mvebu: Fix the irq map function in SMP mode
Fix GE0/GE1 init on ix2-200 as GE0 has no PHY
ARM: S3C24XX: Fix interrupt pending register offset of the EINT controller
ARM: S3C24XX: Correct NR_IRQS definition for s3c2440
ARM i.MX6: Fix ldb_di clock selection
ARM: imx: provide twd clock lookup from device tree
ARM: imx35 Bugfix admux clock
ARM: clk-imx35: Bugfix iomux clock
ARM: mxs: Slow down the I2C clock speed
MAINTAINERS: Add maintainer for LPC32xx
ARM: Kirkwood: Fix typo in the definition of ix2-200 rebuild LED
We actually have to pass chip as the host_data parameter of
irq_domain_add_simple() as later on, it is used to initialize chip_data
in pca953x_gpio_irq_map(). Failing to do so is leading to a NULL pointer
dereference after calling irq_data_get_irq_chip_data() in
pca953x_irq_mask(), pca953x_irq_unmask(), pca953x_irq_bus_lock(),
pca953x_irq_bus_sync_unlock() and pca953x_irq_set_type().
Fixes regression introduced by commit
0e8f2fdacf (gpio: pca953x: use simple
irqdomain)
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
From Jason Cooper <jason@lakedaemon.net>:
mvebu fixes for v3.9 round 3
- Kirkwood
- a couple of small fixes for the Iomega ix2-200 board (ether and led)
- mvebu
- allow GPIO button to work on Mirabox when running SMP
* tag 'mvebu_fixes_for_v3.9_round3' of git://git.infradead.org/users/jcooper/linux:
arm: mvebu: Fix the irq map function in SMP mode
Fix GE0/GE1 init on ix2-200 as GE0 has no PHY
ARM: Kirkwood: Fix typo in the definition of ix2-200 rebuild LED
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull char/misc fix from Greg Kroah-Hartman:
"Here is a single Kconfig dependancy build fix for 3.9.
It's been in linux-next for a while, and fixes a problem that has been
reported multiple times."
* tag 'char-misc-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc/vmw_vmci: Add dependency on CONFIG_NET
Pull tty/serial fixes from Greg Kroah-Hartman:
"Here are 4 small tty/serial fixes for 3.9.
One fixes a bug where we broke the documentation build, and the others
fix reported problems in some serial drivers.
All have been in linux-next for a while"
* tag 'tty-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: mxser: fix cycle termination condition in mxser_probe() and mxser_module_init()
Revert "tty/8250_pnp: serial port detection regression since v3.7"
OMAP/serial: Revert bad fix of Rx FIFO threshold granularity
tty: Documentation: fix a path in a DocBook template
Pull Xen fixes from Konrad Rzeszutek Wilk:
"Two bug-fixes:
- Early bootup issue found on DL380 machines
- Fix for the timer interrupt not being processed right awaym leading
to quite delayed time skew on certain workloads"
* tag 'stable/for-linus-3.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/mmu: On early bootup, flush the TLB when changing RO->RW bits Xen provided pagetables.
xen/events: Handle VIRQ_TIMER before any other hardirq in event loop.
Pull tracing fix from Steven Rostedt:
"Namhyung Kim fixed a long standing bug that can cause a kernel panic.
If the function profiler fails to allocate memory for everything, it
will do a double free on the same pointer which can cause a panic"
* tag 'trace-fixes-3.9-rc-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix double free when function profile init failed
Pull networking fixes from David Miller:
1) cfg80211_conn_scan() must be called with the sched_scan_mutex, fix
from Artem Savkov.
2) Fix regression in TCP ICMPv6 processing, we do not want to treat
redirects as socket errors, from Christoph Paasch.
3) Fix several recvmsg() msg_name kernel memory leaks into userspace,
in ATM, AX25, Bluetooth, CAIF, IRDA, s390 IUCV, L2TP, LLC, Netrom,
NFC, Rose, TIPC, and VSOCK. From Mathias Krause and Wei Yongjun.
4) Fix AF_IUCV handling of segmented SKBs in recvmsg(), from Ursula
Braun and Eric Dumazet.
5) CAN gw.c code does kfree() on SLAB cache memory, use
kmem_cache_free() instead. Fix from Wei Yongjun.
6) Fix LSM regression on TCP SYN/ACKs, some LSMs such as SELINUX want
an skb->sk socket context available for these packets, but nothing
else requires it. From Eric Dumazet and Paul Moore.
7) Fix ipv4 address lifetime processing so that we don't perform
sleepable acts inside of rcu_read_lock() sections, do them in an
rtnl_lock() section instead. From Jiri Pirko.
8) mvneta driver accidently sets HW features after device registry, it
should do so beforehand. Fix from Willy Tarreau.
9) Fix bonding unload races more correctly, from Nikolay Aleksandrov
and Veaceslav Falico.
10) rtnl_dump_ifinfo() and rtnl_calcit() invoke nlmsg_parse() with wrong
header size argument. Fix from Michael Riesch.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
lsm: add the missing documentation for the security_skb_owned_by() hook
bnx2x: Prevent null pointer dereference in AFEX mode
e100: Add dma mapping error check
selinux: add a skb_owned_by() hook
can: gw: use kmem_cache_free() instead of kfree()
netrom: fix invalid use of sizeof in nr_recvmsg()
qeth: fix qeth_wait_for_threads() deadlock for OSN devices
af_iucv: fix recvmsg by replacing skb_pull() function
rtnetlink: Call nlmsg_parse() with correct header length
bonding: fix bonding_masters race condition in bond unloading
Revert "bonding: remove sysfs before removing devices"
net: mvneta: enable features before registering the driver
hyperv: Fix RNDIS send_completion code path
hyperv: Fix a kernel warning from netvsc_linkstatus_callback()
net: ipv4: fix schedule while atomic bug in check_lifetime()
net: ipv4: reset check_lifetime_work after changing lifetime
bnx2x: Fix KR2 rapid link flap
sctp: remove 'sridhar' from maintainers list
VSOCK: Fix missing msg_namelen update in vsock_stream_recvmsg()
VSOCK: vmci - fix possible info leak in vmci_transport_dgram_dequeue()
...
Pull C6X fix from Mark Salter.
Final (?) fix from the barrier discussion.
* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
add memory barrier to arch_local_irq_restore
Unfortunately we didn't catch the missing comments earlier when the
patch was merged.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The cnic module is responsible for initializing various bnx2x structs
via callbacks provided by the bnx2x module.
One such struct is the queue object for the FCoE queue.
If a device is working in AFEX mode and its configuration allows FCoE yet
the cnic module is not loaded, it's very likely a null pointer dereference
will occur, as the bnx2x will erroneously access the FCoE's queue object.
Prevent said access until cnic properly registers itself.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull another nfs fixlet from Trond Myklebust:
"I suddenly noticed that a one-line issue that I _thought_ I had fixed
with the nfs41_walk_client_list patch was apparently still there in
the pull request I sent earlier today. I'm very sorry for not
catching that in time.
- Fix a brain fart in nfs41_walk_client_list"
* tag 'nfs-for-3.9-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Doh! Typo in the fix to nfs41_walk_client_list
This patch fix the regression introduced by the commit 3202bf0157
"arm: mvebu: Improve the SMP support of the interrupt controller":
GPIO IRQ were no longer delivered to the CPUs.
To be delivered to a CPU an interrupt must be enabled at CPU level and
at interrupt source level. Before the offending patch, all the
interrupts were enabled at source level during map() function. Mask()
and unmask() was done by handling the per-CPU part. It was fine when
running in UP with only one CPU.
The offending patch added support for SMP, in this case mask() and
unmask() was done by handling the interrupt source level part. The
per-CPU level part was handled by the affinity API to select the CPU
which will receive the interrupt. (Due to some hardware limitation
only one CPU at a time can received a given interrupt).
For "normal" interrupt __setup_irq() was called when an irq was
registered. irq_set_affinity() is called from this function, which
enabled the interrupt on one of the CPUs. Whereas for GPIO IRQ which
were chained interrupts, the irq_set_affinity() was never called and
none of the CPUs was selected to receive the interrupt.
With this patch all the interrupt are enable on the current CPU during
map() function. Enabling the interrupts on a CPU doesn't depend
anymore on irq_set_affinity() and then the chained irq are not anymore
a special case. However the CPU which will receive the irq can still
be modify later using irq_set_affinity().
Tested with Mirabox (A370) and Openblocks AX3 (AXP), rootfs mounted
over NFS, compiled with CONFIG_SMP=y/N.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reported-by: Ryan Press <ryan@presslab.us>
Investigated-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Tested-by: Ryan Press <ryan@presslab.us>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Make sure that we set the status to 0 on success. Missed in testing
because it never appears when doing multiple mounts to _different_
servers.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: <stable@vger.kernel.org> # 3.7.x: 7b1f1fd: NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list
Pull NFS client bugfixes from Trond Myklebust:
- fix for memory corruption issues in nfs4[01]_walk_client_list (stable)
- fix for an Oopsable bug in rpc_clone_client (stable)
- another state manager deadlock in the NFSv4 open code
- memory leaks in nfs4_discover_server_trunking and rpc_new_client
* tag 'nfs-for-3.9-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Fix another potential state manager deadlock
SUNRPC: Fix a potential memory leak in rpc_new_client
NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list
NFSv4: Fix a memory leak in nfs4_discover_server_trunking
SUNRPC: Remove extra xprt_put()
Pull crypto fixes from Herbert Xu:
"This fixes a GCM bug that breaks IPsec and a compile problem in
ux500."
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ux500 - add missing comma
crypto: gcm - fix assumption that assoc has one segment
Pull drm fixes from Dave Airlie:
"Just a spare semicolon in nouveau that caused some issues, and an
mgag200 fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/mgag200: Index 24 in extended CRTC registers is 24 in hex, not decimal.
drm/nouveau: fix unconditional return waiting on memory