Commit Graph

94 Commits

Author SHA1 Message Date
Zhang Zhen
ed2f240094 memory-hotplug: add sysfs valid_zones attribute
Currently memory-hotplug has two limits:

1. If the memory block is in ZONE_NORMAL, you can change it to
   ZONE_MOVABLE, but this memory block must be adjacent to ZONE_MOVABLE.

2. If the memory block is in ZONE_MOVABLE, you can change it to
   ZONE_NORMAL, but this memory block must be adjacent to ZONE_NORMAL.

With this patch, we can easy to know a memory block can be onlined to
which zone, and don't need to know the above two limits.

Updated the related Documentation.

[akpm@linux-foundation.org: use conventional comment layout]
[akpm@linux-foundation.org: fix build with CONFIG_MEMORY_HOTREMOVE=n]
[akpm@linux-foundation.org: remove unused local zone_prev]
Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:52 -04:00
Zhang Zhen
b69deb2b7e mm/mem-hotplug: replace simple_strtoull() with kstrtoull()
Use the newer and more pleasant kstrtoull() to replace
simple_strtoull(), because simple_strtoull() is marked for obsoletion.

Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:18 -07:00
Tang Chen
4f7c6b49c4 mem-hotplug: introduce MMOP_OFFLINE to replace the hard coding -1
In store_mem_state(), we have:

  ...
  334         else if (!strncmp(buf, "offline", min_t(int, count, 7)))
  335                 online_type = -1;
  ...
  355         case -1:
  356                 ret = device_offline(&mem->dev);
  357                 break;
  ...

Here, "offline" is hard coded as -1.

This patch does the following renaming:

 ONLINE_KEEP     ->  MMOP_ONLINE_KEEP
 ONLINE_KERNEL   ->  MMOP_ONLINE_KERNEL
 ONLINE_MOVABLE  ->  MMOP_ONLINE_MOVABLE

and introduces MMOP_OFFLINE = -1 to avoid hard coding.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Hu Tao <hutao@cn.fujitsu.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:16 -07:00
Tang Chen
1f6a6cc82e mem-hotplug: avoid illegal state prefixed with legal state when changing state of memory_block
We use the following command to online a memory_block:

    echo online|online_kernel|online_movable > /sys/devices/system/memory/memoryXXX/state

But, if we do the following:

    echo online_fhsjkghfkd > /sys/devices/system/memory/memoryXXX/state

the block will also be onlined.

This is because the following code in store_mem_state() does not compare
the whole string, but only the prefix of the string.

  store_mem_state()
  {
       ......
   328         if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))

Here, only compare the first 13 letters of the string. If we give "online_kernelXXXXXX",
it will be recognized as online_kernel, which is incorrect.

   329                 online_type = ONLINE_KERNEL;
   330         else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))

We have the same problem here,

   331                 online_type = ONLINE_MOVABLE;
   332         else if (!strncmp(buf, "online", min_t(int, count, 6)))

here,

(Here is more problematic.  If we give online_movalbe, which is a typo
of online_movable, it will be recognized as online without noticing the
author.)

   333                 online_type = ONLINE_KEEP;
   334         else if (!strncmp(buf, "offline", min_t(int, count, 7)))

and here.

   335                 online_type = -1;
   336         else {
   337                 ret = -EINVAL;
   338                 goto err;
   339         }
       ......
  }

This patch fixes this problem by using sysfs_streq() to compare the
whole string.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reported-by: Hu Tao <hutao@cn.fujitsu.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:16 -07:00
Li Zhong
56a3c655a3 memory-hotplug: update documentation to hide information about SECTIONS and remove end_phys_index
Seems we all agree that information about SECTION, e.g. section size,
sections per memory block should be kept as kernel internals, and not
exposed to userspace.

This patch updates Documentation/memory-hotplug.txt to refer to memory
blocks instead of memory sections where appropriate and added a
paragraph to explain that memory blocks are made of memory sections.
The documentation update is mostly provided by Nathan.

Also, as end_phys_index in code is actually not the end section id, but
the end memory block id, which should always be the same as phys_index.
So it is removed here.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:58 -07:00
Yasuaki Ishimatsu
a37f86305c driver core: Release device_hotplug_lock when store_mem_state returns EINVAL
When inserting a wrong value to /sys/devices/system/memory/memoryX/state file,
following messages are shown. And device_hotplug_lock is never released.

================================================
[ BUG: lock held when returning to user space! ]
3.12.0-rc4-debug+ #3 Tainted: G        W
------------------------------------------------
bash/6442 is leaving the kernel with locks still held!
1 lock held by bash/6442:
 #0:  (device_hotplug_lock){+.+.+.}, at: [<ffffffff8146cbb5>] lock_device_hotplug_sysfs+0x15/0x50

This issue was introdued by commit fa2be40 (drivers: base: use standard
device online/offline for state change).

This patch releases device_hotplug_lcok when store_mem_state returns EINVAL.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
CC: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-16 18:42:41 -07:00
Linus Torvalds
40031da445 Merge tag 'pm+acpi-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:

 1) ACPI-based PCI hotplug (ACPIPHP) subsystem rework and introduction
    of Intel Thunderbolt support on systems that use ACPI for signalling
    Thunderbolt hotplug events.  This also should make ACPIPHP work in
    some cases in which it was known to have problems.  From
    Rafael J Wysocki, Mika Westerberg and Kirill A Shutemov.

 2) ACPI core code cleanups and dock station support cleanups from
    Jiang Liu and Rafael J Wysocki.

 3) Fixes for locking problems related to ACPI device hotplug from
    Rafael J Wysocki.

 4) ACPICA update to version 20130725 includig fixes, cleanups, support
    for more than 256 GPEs per GPE block and a change to make the ACPI
    PM Timer optional (we've seen systems without the PM Timer in the
    field already).  One of the fixes, related to the DeRefOf operator,
    is necessary to prevent some Windows 8 oriented AML from causing
    problems to happen.  From Bob Moore, Lv Zheng, and Jung-uk Kim.

 5) Removal of the old and long deprecated /proc/acpi/event interface
    and related driver changes from Thomas Renninger.

 6) ACPI and Xen changes to make the reduced hardware sleep work with
    the latter from Ben Guthro.

 7) ACPI video driver cleanups and a blacklist of systems that should
    not tell the BIOS that they are compatible with Windows 8 (or ACPI
    backlight and possibly other things will not work on them).  From
    Felipe Contreras.

 8) Assorted ACPI fixes and cleanups from Aaron Lu, Hanjun Guo,
    Kuppuswamy Sathyanarayanan, Lan Tianyu, Sachin Kamat, Tang Chen,
    Toshi Kani, and Wei Yongjun.

 9) cpufreq ondemand governor target frequency selection change to
    reduce oscillations between min and max frequencies (essentially,
    it causes the governor to choose target frequencies proportional
    to load) from Stratos Karafotis.

10) cpufreq fixes allowing sysfs attributes file permissions to be
    preserved over suspend/resume cycles Srivatsa S Bhat.

11) Removal of Device Tree parsing for CPU device nodes from multiple
    cpufreq drivers that required some changes related to
    of_get_cpu_node() to be made in a few architectures and in the
    driver core.  From Sudeep KarkadaNagesha.

12) cpufreq core fixes and cleanups related to mutual exclusion and
    driver module references from Viresh Kumar, Lukasz Majewski and
    Rafael J Wysocki.

13) Assorted cpufreq fixes and cleanups from Amit Daniel Kachhap,
    Bartlomiej Zolnierkiewicz, Hanjun Guo, Jingoo Han, Joseph Lo,
    Julia Lawall, Li Zhong, Mark Brown, Sascha Hauer, Stephen Boyd,
    Stratos Karafotis, and Viresh Kumar.

14) Fixes to prevent race conditions in coupled cpuidle from happening
    from Colin Cross.

15) cpuidle core fixes and cleanups from Daniel Lezcano and
    Tuukka Tikkanen.

16) Assorted cpuidle fixes and cleanups from Daniel Lezcano,
    Geert Uytterhoeven, Jingoo Han, Julia Lawall, Linus Walleij,
    and Sahara.

17) System sleep tracing changes from Todd E Brandt and Shuah Khan.

18) PNP subsystem conversion to using struct dev_pm_ops for power
    management from Shuah Khan.

* tag 'pm+acpi-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (217 commits)
  cpufreq: Don't use smp_processor_id() in preemptible context
  cpuidle: coupled: fix race condition between pokes and safe state
  cpuidle: coupled: abort idle if pokes are pending
  cpuidle: coupled: disable interrupts after entering safe state
  ACPI / hotplug: Remove containers synchronously
  driver core / ACPI: Avoid device hot remove locking issues
  cpufreq: governor: Fix typos in comments
  cpufreq: governors: Remove duplicate check of target freq in supported range
  cpufreq: Fix timer/workqueue corruption due to double queueing
  ACPI / EC: Add ASUSTEK L4R to quirk list in order to validate ECDT
  ACPI / thermal: Add check of "_TZD" availability and evaluating result
  cpufreq: imx6q: Fix clock enable balance
  ACPI: blacklist win8 OSI for buggy laptops
  cpufreq: tegra: fix the wrong clock name
  cpuidle: Change struct menu_device field types
  cpuidle: Add a comment warning about possible overflow
  cpuidle: Fix variable domains in get_typical_interval()
  cpuidle: Fix menu_device->intervals type
  cpuidle: CodingStyle: Break up multiple assignments on single line
  cpuidle: Check called function parameter in get_typical_interval()
  ...
2013-09-03 15:59:39 -07:00
Linus Torvalds
542a086ac7 Merge tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core patches from Greg KH:
 "Here's the big driver core pull request for 3.12-rc1.

  Lots of tiny changes here fixing up the way sysfs attributes are
  created, to try to make drivers simpler, and fix a whole class race
  conditions with creations of device attributes after the device was
  announced to userspace.

  All the various pieces are acked by the different subsystem
  maintainers"

* tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (119 commits)
  firmware loader: fix pending_fw_head list corruption
  drivers/base/memory.c: introduce help macro to_memory_block
  dynamic debug: line queries failing due to uninitialized local variable
  sysfs: sysfs_create_groups returns a value.
  debugfs: provide debugfs_create_x64() when disabled
  rbd: convert bus code to use bus_groups
  firmware: dcdbas: use binary attribute groups
  sysfs: add sysfs_create/remove_groups for when SYSFS is not enabled
  driver core: add #include <linux/sysfs.h> to core files.
  HID: convert bus code to use dev_groups
  Input: serio: convert bus code to use drv_groups
  Input: gameport: convert bus code to use drv_groups
  driver core: firmware: use __ATTR_RW()
  driver core: core: use DEVICE_ATTR_RO
  driver core: bus: use DRIVER_ATTR_WO()
  driver core: create write-only attribute macros for devices and drivers
  sysfs: create __ATTR_WO()
  driver-core: platform: convert bus code to use dev_groups
  workqueue: convert bus code to use dev_groups
  MEI: convert bus code to use dev_groups
  ...
2013-09-03 11:37:15 -07:00
Rafael J. Wysocki
5e33bc4165 driver core / ACPI: Avoid device hot remove locking issues
device_hotplug_lock is held around the acpi_bus_trim() call in
acpi_scan_hot_remove() which generally removes devices (it removes
ACPI device objects at least, but it may also remove "physical"
device objects through .detach() callbacks of ACPI scan handlers).
Thus, potentially, device sysfs attributes are removed under that
lock and to remove those attributes it is necessary to hold the
s_active references of their directory entries for writing.

On the other hand, the execution of a .show() or .store() callback
from a sysfs attribute is carried out with that attribute's s_active
reference held for reading.  Consequently, if any device sysfs
attribute that may be removed from within acpi_scan_hot_remove()
through acpi_bus_trim() has a .store() or .show() callback which
acquires device_hotplug_lock, the execution of that callback may
deadlock with the removal of the attribute.  [Unfortunately, the
"online" device attribute of CPUs and memory blocks is one of them.]

To avoid such deadlocks, make all of the sysfs attribute callbacks
that need to lock device hotplug, for example store_online(), use
a special function, lock_device_hotplug_sysfs(), to lock device
hotplug and return the result of that function immediately if it is
not zero.  This will cause the s_active reference of the directory
entry in question to be released and the syscall to be restarted
if device_hotplug_lock cannot be acquired.

[show_online() actually doesn't need to lock device hotplug, but
it is useful to serialize it with respect to device_offline() and
device_online() for the same device (in case user space attempts to
run them concurrently) which can be done with the help of
device_lock().]

Reported-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reported-and-tested-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>
2013-08-29 22:00:53 +02:00
Russ Anderson
21ea9f5ace drivers/base/memory.c: fix show_mem_removable() to handle missing sections
"cat /sys/devices/system/memory/memory*/removable" crashed the system.

The problem is that show_mem_removable() is passing a
bad pfn to is_mem_section_removable(), which causes

    if (!node_online(page_to_nid(page)))

to blow up.  Why is it passing in a bad pfn?

The reason is that show_mem_removable() will loop sections_per_block
times.  sections_per_block is 16, but mem->section_count is 8,
indicating holes in this memory block.  Checking that the memory section
is present before checking to see if the memory section is removable
fixes the problem.

   harp5-sys:~ # cat /sys/devices/system/memory/memory*/removable
   0
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   BUG: unable to handle kernel paging request at ffffea00c3200000
   IP: [<ffffffff81117ed1>] is_pageblock_removable_nolock+0x1/0x90
   PGD 83ffd4067 PUD 37bdfce067 PMD 0
   Oops: 0000 [#1] SMP
   Modules linked in: autofs4 binfmt_misc rdma_ucm rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_cm ib_uverbs ib_umad iw_cxgb3 cxgb3 mdio mlx4_en mlx4_ib ib_sa mlx4_core ib_mthca ib_mad ib_core fuse nls_iso8859_1 nls_cp437 vfat fat joydev loop hid_generic usbhid hid hwperf(O) numatools(O) dm_mod iTCO_wdt ipv6 iTCO_vendor_support igb i2c_i801 ioatdma i2c_algo_bit ehci_pci pcspkr lpc_ich i2c_core ehci_hcd ptp sg mfd_core dca rtc_cmos pps_core mperf button xhci_hcd sd_mod crc_t10dif usbcore usb_common scsi_dh_emc scsi_dh_hp_sw scsi_dh_alua scsi_dh_rdac scsi_dh gru(O) xvma(O) xfs crc32c libcrc32c thermal sata_nv processor piix mptsas mptscsih scsi_transport_sas mptbase megaraid_sas fan thermal_sys hwmon ext3 jbd ata_piix ahci libahci libata scsi_mod
   CPU: 4 PID: 5991 Comm: cat Tainted: G           O 3.11.0-rc5-rja-uv+ #10
   Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013
   task: ffff88081f034580 ti: ffff880820022000 task.ti: ffff880820022000
   RIP: 0010:[<ffffffff81117ed1>]  [<ffffffff81117ed1>] is_pageblock_removable_nolock+0x1/0x90
   RSP: 0018:ffff880820023df8  EFLAGS: 00010287
   RAX: 0000000000040000 RBX: ffffea00c3200000 RCX: 0000000000000004
   RDX: ffffea00c30b0000 RSI: 00000000001c0000 RDI: ffffea00c3200000
   RBP: ffff880820023e38 R08: 0000000000000000 R09: 0000000000000001
   R10: 0000000000000000 R11: 0000000000000001 R12: ffffea00c33c0000
   R13: 0000160000000000 R14: 6db6db6db6db6db7 R15: 0000000000000001
   FS:  00007ffff7fb2700(0000) GS:ffff88083fc80000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: ffffea00c3200000 CR3: 000000081b954000 CR4: 00000000000407e0
   Call Trace:
     show_mem_removable+0x41/0x70
     dev_attr_show+0x2a/0x60
     sysfs_read_file+0xf7/0x1c0
     vfs_read+0xc8/0x130
     SyS_read+0x5d/0xa0
     system_call_fastpath+0x16/0x1b

Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-28 19:26:38 -07:00
Gu Zheng
7315f0ccfc drivers/base/memory.c: introduce help macro to_memory_block
Introduce help macro to_memory_block to hide the conversion(device-->memory_block),
just clean up.

Reviewed-by: Yasuaki Ishimatsu  <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-28 15:47:38 -07:00
Seth Jennings
fa2be40fe7 drivers: base: use standard device online/offline for state change
There are two ways to set the online/offline state for a memory block:
echo 0|1 > online and echo online|online_kernel|online_movable|offline >
state.

The state attribute can online a memory block with extra data, the
"online type", where the online attribute uses a default online type of
ONLINE_KEEP, same as echo online > state.

Currently there is a state_mutex that provides consistency between the
memory block state and the underlying memory.

The problem is that this code does a lot of things that the common
device layer can do for us, such as the serialization of the
online/offline handlers using the device lock, setting the dev->offline
field, and calling kobject_uevent().

This patch refactors the online/offline code to allow the common
device_[online|offline] functions to be used.  The result is a simpler
and more common code path for the two state setting mechanisms.  It also
removes the state_mutex from the struct memory_block as the memory block
device lock provides the state consistency.

No functional change is intended by this patch.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21 11:52:20 -07:00
Seth Jennings
cb5e39b803 drivers: base: refactor add_memory_section() to add_memory_block()
Right now memory_dev_init() maintains the memory block pointer
between iterations of add_memory_section().  This is nasty.

This patch refactors add_memory_section() to become add_memory_block().
The refactoring pulls the section scanning out of memory_dev_init()
and simplifies the signature.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21 11:49:47 -07:00
Seth Jennings
37171e3cb7 drivers: base: remove improper get/put in add_memory_section()
The path through add_memory_section() when the memory block already
exists uses flawed refcounting logic.  A get_device() is done on a
memory block using a pointer that might not be valid as we dropped
our previous reference and didn't obtain a new reference in the
proper way.

Lets stop pretending and just remove the get/put.  The
mem_sysfs_mutex, which we hold over the entire init loop now, will
prevent the memory blocks from disappearing from under us.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21 11:49:47 -07:00
Seth Jennings
37a7bd6255 drivers: base: reduce add_memory_section() for boot-time only
Now that add_memory_section() is only called from boot time, reduce
the logic and remove the enum.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21 11:48:41 -07:00
Seth Jennings
d7f80530ad drivers: base: unshare add_memory_section() from hotplug
add_memory_section() is currently called from both boot time and run
time via hotplug and there is a lot of nastiness to allow for shared
code including an enum parameter to convey the calling context to
add_memory_section().

This patch is the first step in breaking up the messy code sharing by
pulling the hotplug path for add_memory_section() directly into
register_new_memory().

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21 11:48:40 -07:00
Seth Jennings
df2b717c66 drivers: base: use device get/put functions
Use the [get|put]_device functions for ref'ing the memory block device
rather than the kobject functions which should be hidden away by the
device layer.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21 11:48:40 -07:00
Seth Jennings
879f1bec8e drivers: base: remove unneeded variable
The error variable is not needed.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21 11:47:39 -07:00
Seth Jennings
b1eaef3da5 drivers: base: move mutex lock out of add_memory_section()
There is no point in releasing the mutex for each section that is added
during boot time.  Just hold it over the entire initialization loop.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21 11:47:39 -07:00
Jingoo Han
34da5e6770 driver core: replace strict_strto*() with kstrto*()
The usage of strict_strto*() is not preferred, because
strict_strto*() is obsolete. Thus, kstrto*() should be
used.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-26 18:02:43 -07:00
Linus Torvalds
f991fae5c6 Merge tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
 "This time the total number of ACPI commits is slightly greater than
  the number of cpufreq commits, but Viresh Kumar (who works on cpufreq)
  remains the most active patch submitter.

  To me, the most significant change is the addition of offline/online
  device operations to the driver core (with the Greg's blessing) and
  the related modifications of the ACPI core hotplug code.  Next are the
  freezer updates from Colin Cross that should make the freezing of
  tasks a bit less heavy weight.

  We also have a couple of regression fixes, a number of fixes for
  issues that have not been identified as regressions, two new drivers
  and a bunch of cleanups all over.

  Highlights:

   - Hotplug changes to support graceful hot-removal failures.

     It sometimes is necessary to fail device hot-removal operations
     gracefully if they cannot be carried out completely.  For example,
     if memory from a memory module being hot-removed has been allocated
     for the kernel's own use and cannot be moved elsewhere, it's
     desirable to fail the hot-removal operation in a graceful way
     rather than to crash the kernel, but currenty a success or a kernel
     crash are the only possible outcomes of an attempted memory
     hot-removal.  Needless to say, that is not a very attractive
     alternative and it had to be addressed.

     However, in order to make it work for memory, I first had to make
     it work for CPUs and for this purpose I needed to modify the ACPI
     processor driver.  It's been split into two parts, a resident one
     handling the low-level initialization/cleanup and a modular one
     playing the actual driver's role (but it binds to the CPU system
     device objects rather than to the ACPI device objects representing
     processors).  That's been sort of like a live brain surgery on a
     patient who's riding a bike.

     So this is a little scary, but since we found and fixed a couple of
     regressions it caused to happen during the early linux-next testing
     (a month ago), nobody has complained.

     As a bonus we remove some duplicated ACPI hotplug code, because the
     ACPI-based CPU hotplug is now going to use the common ACPI hotplug
     code.

   - Lighter weight freezing of tasks.

     These changes from Colin Cross and Mandeep Singh Baines are
     targeted at making the freezing of tasks a bit less heavy weight
     operation.  They reduce the number of tasks woken up every time
     during the freezing, by using the observation that the freezer
     simply doesn't need to wake up some of them and wait for them all
     to call refrigerator().  The time needed for the freezer to decide
     to report a failure is reduced too.

     Also reintroduced is the check causing a lockdep warining to
     trigger when try_to_freeze() is called with locks held (which is
     generally unsafe and shouldn't happen).

   - cpufreq updates

     First off, a commit from Srivatsa S Bhat fixes a resume regression
     introduced during the 3.10 cycle causing some cpufreq sysfs
     attributes to return wrong values to user space after resume.  The
     fix is kind of fresh, but also it's pretty obvious once Srivatsa
     has identified the root cause.

     Second, we have a new freqdomain_cpus sysfs attribute for the
     acpi-cpufreq driver to provide information previously available via
     related_cpus.  From Lan Tianyu.

     Finally, we fix a number of issues, mostly related to the
     CPUFREQ_POSTCHANGE notifier and cpufreq Kconfig options and clean
     up some code.  The majority of changes from Viresh Kumar with bits
     from Jacob Shin, Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia,
     Arnd Bergmann, and Tang Yuantian.

   - ACPICA update

     A usual bunch of updates from the ACPICA upstream.

     During the 3.4 cycle we introduced support for ACPI 5 extended
     sleep registers, but they are only supposed to be used if the
     HW-reduced mode bit is set in the FADT flags and the code attempted
     to use them without checking that bit.  That caused suspend/resume
     regressions to happen on some systems.  Fix from Lv Zheng causes
     those registers to be used only if the HW-reduced mode bit is set.

     Apart from this some other ACPICA bugs are fixed and code cleanups
     are made by Bob Moore, Tomasz Nowicki, Lv Zheng, Chao Guan, and
     Zhang Rui.

   - cpuidle updates

     New driver for Xilinx Zynq processors is added by Michal Simek.

     Multidriver support simplification, addition of some missing
     kerneldoc comments and Kconfig-related fixes come from Daniel
     Lezcano.

   - ACPI power management updates

     Changes to make suspend/resume work correctly in Xen guests from
     Konrad Rzeszutek Wilk, sparse warning fix from Fengguang Wu and
     cleanups and fixes of the ACPI device power state selection
     routine.

   - ACPI documentation updates

     Some previously missing pieces of ACPI documentation are added by
     Lv Zheng and Aaron Lu (hopefully, that will help people to
     uderstand how the ACPI subsystem works) and one outdated doc is
     updated by Hanjun Guo.

   - Assorted ACPI updates

     We finally nailed down the IA-64 issue that was the reason for
     reverting commit 9f29ab11dd ("ACPI / scan: do not match drivers
     against objects having scan handlers"), so we can fix it and move
     the ACPI scan handler check added to the ACPI video driver back to
     the core.

     A mechanism for adding CMOS RTC address space handlers is
     introduced by Lan Tianyu to allow some EC-related breakage to be
     fixed on some systems.

     A spec-compliant implementation of acpi_os_get_timer() is added by
     Mika Westerberg.

     The evaluation of _STA is added to do_acpi_find_child() to avoid
     situations in which a pointer to a disabled device object is
     returned instead of an enabled one with the same _ADR value.  From
     Jeff Wu.

     Intel BayTrail PCH (Platform Controller Hub) support is added to
     the ACPI driver for Intel Low-Power Subsystems (LPSS) and that
     driver is modified to work around a couple of known BIOS issues.
     Changes from Mika Westerberg and Heikki Krogerus.

     The EC driver is fixed by Vasiliy Kulikov to use get_user() and
     put_user() instead of dereferencing user space pointers blindly.

     Code cleanups are made by Bjorn Helgaas, Nicholas Mazzuca and Toshi
     Kani.

   - Assorted power management updates

     The "runtime idle" helper routine is changed to take the return
     values of the callbacks executed by it into account and to call
     rpm_suspend() if they return 0, which allows us to reduce the
     overall code bloat a bit (by dropping some code that's not
     necessary any more after that modification).

     The runtime PM documentation is updated by Alan Stern (to reflect
     the "runtime idle" behavior change).

     New trace points for PM QoS are added by Sahara
     (<keun-o.park@windriver.com>).

     PM QoS documentation is updated by Lan Tianyu.

     Code cleanups are made and minor issues are addressed by Bernie
     Thompson, Bjorn Helgaas, Julius Werner, and Shuah Khan.

   - devfreq updates

     New driver for the Exynos5-bus device from Abhilash Kesavan.

     Minor cleanups, fixes and MAINTAINERS update from MyungJoo Ham,
     Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and Wei Yongjun.

   - OMAP power management updates

     Adaptive Voltage Scaling (AVS) SmartReflex voltage control driver
     updates from Andrii Tseglytskyi and Nishanth Menon."

* tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
  cpufreq: Fix cpufreq regression after suspend/resume
  ACPI / PM: Fix possible NULL pointer deref in acpi_pm_device_sleep_state()
  PM / Sleep: Warn about system time after resume with pm_trace
  cpufreq: don't leave stale policy pointer in cdbs->cur_policy
  acpi-cpufreq: Add new sysfs attribute freqdomain_cpus
  cpufreq: make sure frequency transitions are serialized
  ACPI: implement acpi_os_get_timer() according the spec
  ACPI / EC: Add HP Folio 13 to ec_dmi_table in order to skip DSDT scan
  ACPI: Add CMOS RTC Operation Region handler support
  ACPI / processor: Drop unused variable from processor_perflib.c
  cpufreq: tegra: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: s3c64xx: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: omap: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: imx6q: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: exynos: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: dbx500: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: davinci: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: arm-big-little: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: powernow-k8: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: pcc: call CPUFREQ_POSTCHANGE notfier in error cases
  ...
2013-07-03 14:35:40 -07:00
Nathan Fontenot
96b2c0fc8e drivers/base: Use attribute groups to create sysfs memory files
Update the sysfs memory code to create/delete files at the time of device
and subsystem registration.

The current code creates files in the root memory directory explicitly through
the use of init_* routines. The files for each memory block are created and
deleted explicitly using the mem_[create|delete]_simple_file macros.

This patch creates attribute groups for the memory root files and files in
each memory block directory so that they are created and deleted implicitly
at subsys and device register and unregister time.

This did necessitate moving the register_memory() updating it to set the
dev.groups field.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-06 12:38:11 -07:00
Rafael J. Wysocki
ea50be5934 Driver core / MM: Drop offline_memory_block()
Since offline_memory_block(mem) is functionally equivalent to
device_offline(&mem->dev), make the only caller of the former use
the latter instead and drop offline_memory_block() entirely.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>
2013-06-01 21:37:10 +02:00
Rafael J. Wysocki
b2c064b25a Driver core / memory: Simplify __memory_block_change_state()
As noted by Tang Chen, the last_online field in struct memory_block
introduced by commit 4960e05 (Driver core: Introduce offline/online
callbacks for memory blocks) is not really necessary, because
online_pages() restores the previous state if passed ONLINE_KEEP as
the last argument.  Therefore, remove that field along with the code
referring to it.

References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
2013-06-01 21:37:09 +02:00
Rafael J. Wysocki
4960e05e22 Driver core: Introduce offline/online callbacks for memory blocks
Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks.  That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.

The 'online' sysfs attribute of memory block devices will attempt to
put them offline if 0 is written to it and will attempt to apply the
previously used online type when onlining them (i.e. when 1 is
written to it).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
2013-05-12 14:14:45 +02:00