There's a deadlock when concurrently hot-adding memory through the probe
interface and switching a memory block from offline to online.
When hot-adding memory via the probe interface, add_memory() first takes
mem_hotplug_begin() and then device_lock() is later taken when registering
the newly initialized memory block. This creates a lock dependency of (1)
mem_hotplug.lock (2) dev->mutex.
When switching a memory block from offline to online, dev->mutex is first
grabbed in device_online() when the write(2) transitions an existing
memory block from offline to online, and then online_pages() will take
mem_hotplug_begin().
This creates a lock inversion between mem_hotplug.lock and dev->mutex.
Vitaly reports that this deadlock can happen when a kworker handling a probe
event races with systemd-udevd switching a memory block's state.
This patch requires the state transition to take mem_hotplug_begin()
before dev->mutex. Hot-adding memory via the probe interface creates a
memory block while holding mem_hotplug_begin(), so there is no way to take
dev->mutex first in this case.
online_pages() and offline_pages() are only called when transitioning
memory block state. We now require that mem_hotplug_begin() is taken
before calling them -- this requires exporting mem_hotplug_begin() and
mem_hotplug_done() to generic code. In all hot-add and hot-remove cases,
mem_hotplug_begin() is done prior to device_online(). This is all that is
needed to avoid the deadlock.
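
The ordering rule can be illustrated with a small userspace sketch. This is
not kernel code: the two pthread mutexes are stand-ins for mem_hotplug.lock
and dev->mutex, and probe_path(), online_path() and run_both_paths() are
hypothetical names chosen for the illustration. Both paths take the locks in
the same order, so neither can hold one lock while waiting for the other.

```c
#include <pthread.h>

/* Userspace stand-ins for mem_hotplug.lock and dev->mutex (assumption). */
static pthread_mutex_t mem_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t dev_mutex = PTHREAD_MUTEX_INITIALIZER;

int blocks_added;
int blocks_onlined;

/* Models the probe path: hotplug lock first, device lock second. */
static void *probe_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < 1000; i++) {
		pthread_mutex_lock(&mem_hotplug_lock);
		pthread_mutex_lock(&dev_mutex);
		blocks_added++;
		pthread_mutex_unlock(&dev_mutex);
		pthread_mutex_unlock(&mem_hotplug_lock);
	}
	return NULL;
}

/* Models the online path after the fix: same order as probe_path(). */
static void *online_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < 1000; i++) {
		pthread_mutex_lock(&mem_hotplug_lock);
		pthread_mutex_lock(&dev_mutex);
		blocks_onlined++;
		pthread_mutex_unlock(&dev_mutex);
		pthread_mutex_unlock(&mem_hotplug_lock);
	}
	return NULL;
}

/* Runs both paths concurrently; returns 0 once both have finished. */
int run_both_paths(void)
{
	pthread_t a, b;

	if (pthread_create(&a, NULL, probe_path, NULL))
		return -1;
	if (pthread_create(&b, NULL, online_path, NULL)) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}
```

If online_path() took dev_mutex before mem_hotplug_lock instead, the two
threads could each hold one lock and wait forever for the other, which is the
inversion described above.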
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull driver core updates from Greg KH:
"Here's the driver-core / kobject / lz4 tree update for 4.1-rc1.
Everything here has been in linux-next for a while with no reported
issues. It's mostly just coding style cleanups, with other minor
changes in here as well, nothing big"
* tag 'driver-core-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits)
debugfs: allow bad parent pointers to be passed in
stable_kernel_rules: Add clause about specification of kernel versions to patch.
kobject: WARN as tip when call kobject_get() to a kobject not initialized
lib/lz4: Pull out constant tables
drivers: platform: parse IRQ flags from resources
driver core: Make probe deferral more quiet
drivers/core/of: Add symlink to device-tree from devices with an OF node
device: Add dev_of_node() accessor
drivers: base: fw: fix ret value when loading fw
firmware: Avoid manual device_create_file() calls
drivers/base: cacheinfo: validate device node for all the caches
drivers/base: use tabs where possible in code indentation
driver core: add missing blank line after declaration
drivers: base: node: Delete space after pointer declaration
drivers: base: memory: Use tabs instead of spaces
firmware_class: Fix whitespace and indentation
drivers: base: dma-mapping: Erase blank space after pointer
drivers: base: class: Add a blank line after declarations
attribute_container: fix missing blank lines after declarations
drivers: base: memory: Fix switch indent
...
Pull regmap update from Mark Brown:
"Just one patch for regmap this time around, a change from Steven
Rostedt to prettify the way we're making the regmap internal header
available to the trace events (it turns out that the trace subsystem
doesn't actually need to be in trace/events)"
* tag 'regmap-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Move tracing header into drivers/base/regmap
This fixes a regression from the net subsystem:
After commit d52fdbb735
"smc91x: retrieve IRQ and trigger flags in a modern way"
a regression would appear on some legacy platforms such
as the ARM PXA Zylonite that specify IRQ resources like
this:
static struct resource r = {
.start = X,
.end = X,
.flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHEDGE,
};
The previous code would retrieve the resource and parse
the high edge setting in the SMC91x driver, a use pattern
that means every driver specifying an IRQ flag from a
static resource needs to parse resource flags and apply
them at runtime.
As we switched the code to use IRQ descriptors to retrieve
the trigger type like this:
irqd_get_trigger_type(irq_get_irq_data(...));
the code would work for new platforms using e.g. device
tree as the backing irq descriptor would have its flags
properly set, whereas this kind of old-style static
resources at no point assign the trigger flags to the
corresponding IRQ descriptor.
To make the behaviour identical on modern device tree
and legacy static platform data platforms, modify
platform_get_irq() to assign the trigger flags to the
irq descriptor when a client looks up an IRQ from static
resources.
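
A rough userspace model of the idea, assuming the low resource flag bits
carry the trigger type. The constant values below are written from memory of
the kernel headers and should be treated as assumptions; struct
irq_desc_model and lookup_irq() are hypothetical stand-ins, not the real
platform_get_irq() implementation.

```c
/* Assumed values, mirroring the kernel headers as recalled. */
#define IORESOURCE_BITS		0x000000ffUL
#define IORESOURCE_IRQ		0x00000400UL
#define IORESOURCE_IRQ_HIGHEDGE	(1UL << 0)
#define IRQ_TYPE_NONE		0x0UL
#define IRQ_TYPE_EDGE_RISING	0x1UL

struct resource {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/* Hypothetical model of an IRQ descriptor: only the trigger type. */
struct irq_desc_model {
	unsigned long trigger;
};

/*
 * Models an IRQ lookup from a static resource: when the resource carries
 * trigger bits, copy them into the descriptor before returning the IRQ.
 */
unsigned long lookup_irq(const struct resource *r, struct irq_desc_model *desc)
{
	if (r->flags & IORESOURCE_BITS)
		desc->trigger = r->flags & IORESOURCE_BITS;
	return r->start;
}
```

With this in place a driver that asks the descriptor for its trigger type
sees the same answer whether the IRQ came from device tree or from a static
resource.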
Fixes: d52fdbb735 ("smc91x: retrieve IRQ and trigger flags in a modern way")
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently probe deferral prints a message every time a device requests
deferral at info severity (which is displayed by default). This can have
an impact on system boot times with serial consoles and is generally quite
noisy.
Since subsystems and drivers should already be logging the specific reason
for probe deferral in order to aid users in understanding problems, the
messages from the driver core should be redundant. Lower the severity of
the messages printed, cutting down on the volume of output on the console.
This does mean that if the drivers and subsystems aren't doing a good job
we get no output on the console by default. Ideally we'd be able to arrange
to print if nothing else printed, though that's a little fun. Even better
would be to come up with a mechanism that explicitly does dependencies so
we don't have to keep polling and erroring.
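
As a sketch of why lowering the severity quiets the console, assume the usual
rule that a message reaches the console only when its level is numerically
below the console loglevel, and assume the new severity is debug. The macro
names mirror the kernel's, but reaches_console() is a hypothetical helper and
the sketch ignores that debug messages may be compiled out or controlled by
dynamic debug.

```c
/* Assumed values, mirroring the kernel's log levels. */
#define LOGLEVEL_INFO			6
#define LOGLEVEL_DEBUG			7
#define CONSOLE_LOGLEVEL_DEFAULT	7

/* Returns 1 when a message at msg_level would be shown on the console. */
int reaches_console(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}
```

Under these assumptions an info message passes the default filter while a
debug message does not, unless the console loglevel is raised.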
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
So I've been annoyed lately with having a bunch of devices such as i2c
eeproms (for use by VPDs, server world !) and other bits and pieces that
I want to be able to identify from userspace, and possibly provide
additional data about from FW.
Basically, it boils down to correlating the sysfs device with the OF
tree device node, so that user space can use device-tree info such as
additional "location" or "label" (or whatever else we can come up with)
properties to identify a given device, or get some attributes of use
about it, etc...
Now, so far, we've done that in some subsystems on a fairly ad-hoc basis
using "devspec" properties. For example, PCI creates them if it can
correlate the probed device with a DT node. Some powerpc specific busses
do that too.
However, i2c doesn't and it would be nice to have something more generic
since technically any device can have a corresponding device tree node.
This patch adds an "of_node" symlink to devices that have a non-NULL
dev->of_node pointer. The patch is pretty trivial and seems to work just
fine for me.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When using the user mode helper to load firmware, _request_firmware()
gets a positive return value from fw_load_from_user_helper() and, because of
this, the firmware buffer is not assigned: the assignment happens only when
the return value is zero. This patch fixes the problem in
_request_firmware_load() by setting the return value to zero once the
completion is ready.
Signed-off-by: Zahari Doychev <zahari.doychev@linux.com>
Cc: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use the static attribute groups assigned to the device instead of
manual device_create_file() & co calls. It simplifies the code and
can avoid possible races, too.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On architectures that depend on DT for obtaining cache hierarchy, we need
to validate the device node for all the cache indices. Failing to do so
might result in wrong information being exposed to userspace.
This is quite possible on initial/incomplete versions of the device
trees. In such cases, it's better to bail out if all the required device
nodes are not present.
This patch adds checks for the validation of device node for all the
caches and doesn't initialise the cacheinfo if there's any error.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: stable <stable@vger.kernel.org> # 4.0
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linux kernel coding style requires that tabs should be used instead of
spaces for code indentation.
Problem found using checkpatch.pl script.
Signed-off-by: Lavinia Tache <lavinia.tachee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes the following error found by checkpatch.pl:
ERROR: "foo * bar" should be "foo *bar"
Signed-off-by: Ana Nedelcu <anafnedelcu@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes the following checkpatch.pl error:
ERROR: "foo * bar" should be "foo *bar"
Signed-off-by: Marius Cristian Eseanu <eseanu.cristian@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes the following warning found by checkpatch.pl:
WARNING: Missing a blank line after declarations
Signed-off-by: Cosmin Tomulescu <cosmintom@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported by checkpatch.pl
While at it, removed blank line between function call and error
checking.
Signed-off-by: Andrei Poenaru <andreigpoenaru@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The put_device() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
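
A tiny model of why the guard is redundant. struct device_model and
put_device_model() are hypothetical stand-ins that only mimic the NULL check
described above, with a plain counter in place of the real reference count.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted device. */
struct device_model {
	int refcount;
};

/* Models put_device(): a NULL argument is silently ignored. */
void put_device_model(struct device_model *dev)
{
	if (dev)
		dev->refcount--;
}
```

Because the callee already checks, a caller writing "if (dev)
put_device_model(dev);" performs the same test twice.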
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We can use the ATTRIBUTE_GROUPS() macro here, so use it, saving some
lines of code.
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of manual calls of multiple device_create_file() and
device_remove_file(), use the static attribute groups assigned to the
new device. This also fixes the possible races.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is not necessary to call device_remove_groups() when device_add_groups()
fails.
The group added by device_add_groups() should be removed if sysfs_create_link()
fails.
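
The intended unwinding order can be sketched with a userspace model. All
names here (struct bus_model, add_groups(), create_link(), remove_groups(),
bus_add_device_model()) are hypothetical stand-ins for the steps named above,
and the fail_* fields exist only to force each error path.

```c
/* Hypothetical state: what has been set up, and which step should fail. */
struct bus_model {
	int groups_added;
	int link_created;
	int fail_groups;
	int fail_link;
};

static int add_groups(struct bus_model *b)
{
	if (b->fail_groups)
		return -1;
	b->groups_added = 1;
	return 0;
}

static void remove_groups(struct bus_model *b)
{
	b->groups_added = 0;
}

static int create_link(struct bus_model *b)
{
	if (b->fail_link)
		return -1;
	b->link_created = 1;
	return 0;
}

/* Each failure undoes only the steps that already succeeded. */
int bus_add_device_model(struct bus_model *b)
{
	int error;

	error = add_groups(b);
	if (error)
		goto out;		/* nothing was added, nothing to undo */

	error = create_link(b);
	if (error)
		goto out_groups;	/* undo the groups added above */

	return 0;

out_groups:
	remove_groups(b);
out:
	return error;
}
```

The labels are ordered so that a later failure falls through the cleanup of
every earlier step, while the first failure skips cleanup entirely.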
Fixes: fa6fdb33b4 ("driver core: bus_type: add dev_groups")
Signed-off-by: Junjie Mao <junjie_mao@yeah.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>