Commit Graph

635261 Commits

Author SHA1 Message Date
Len Brown 388e9c8134 tools/power turbostat: Make extensible via the --add parameter
Create the "--add" parameter.  This can be used to teach an existing
turbostat binary about any number of any type of counter.

turbostat(8) details the syntax for --add.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-24 15:16:10 -05:00
Len Brown 7268d407ad tools/power turbostat: Denverton uses a 25 MHz crystal, not 19.2 MHz
This changes only the TSC frequency decoding line seen with --debug

old: TSC: 1382 MHz (19200000 Hz * 216 / 3 / 1000000)
new: TSC: 1800 MHz (25000000 Hz * 216 / 3 / 1000000)

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 23:18:26 -05:00
Len Brown 5cc6323c79 tools/power turbostat: line up headers when -M is used
The -M option adds an 18-column item, and the header
needs to be wide enough to keep the header aligned
with the columns.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 21:14:38 -05:00
Len Brown d8ebb44226 tools/power turbostat: fix SKX PKG_CSTATE_LIMIT decoding
SKX has fewer package C-states than previous generations,
and so the decoding of PKG_CSTATE_LIMIT has changed.

This changes the line ending with pkg-cstate-limit=XXX: pcYYY

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 20:27:46 -05:00
Len Brown 005c82d64d tools/power turbostat: Support Knights Mill (KNM)
Original-author: Piotr Luc <piotr.luc@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:35:38 -05:00
Srinivas Pandruvada ddadb8adea tools/power turbostat: Display HWP OOB status
Display if the HWP is enabled in OOB (Out of band) mode.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:20 -05:00
Xiaolong Wang 5bbac26eae tools/power turbostat: fix Denverton BCLK
Add Denverton to the group of SandyBridge and later processors,
to let the bclk be recognized as 100MHz rather than 133MHz,
then avoid the wrong value of the frequencies based on it,
including Bzy_MHz, max efficiency freuency, base frequency,
and turbo mode frequencies.

Signed-off-by: Xiaolong Wang <xiaolong.wang@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:19 -05:00
Len Brown 869ce69e1e tools/power turbostat: use intel-family.h model strings
All except for model 1F, a Nehalem, which is currently incorrectly
indentified as a Westmere in that new header.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:19 -05:00
Jacob Pan 0f64490978 tools/power/turbostat: Add Denverton RAPL support
The Denverton CPU RAPL supports package, core, and DRAM domains.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:18 -05:00
Jacob Pan 2c48c990ea tools/power/turbostat: Add Denverton support
Denverton is an Atom based micro server which shares the same
Goldmont architecture as Broxton. The available C-states on
Denverton is a subset of Broxton with only C1, C1e, and C6.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:18 -05:00
Jacob Pan 9148494c59 tools/power/turbostat: split core MSR support into status + limit
Some CPUs may not have PP0/Core domain power limit MSRs. We
should still allow its domain energy status to be used. This
patch splits PP0/Core RAPL into two separate flags for power
limit and energy status such that energy status can continue
to be reported without power limit.

Without this patch, turbostat will not be able to use the
remaining RAPL features if some PL MSRs are not present.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:17 -05:00
Colin Ian King 0a91e55152 tools/power turbostat: fix error case overflow read of slm_freq_table[]
When i >= SLM_BCLK_FREQS, the frequency read from the slm_freq_table
is off the end of the array because msr is set to 3 rather than the
actual array index i.  Set i to 3 rather than msr to fix this.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:17 -05:00
Mika Westerberg 01a67adfc5 tools/power turbostat: Allocate correct amount of fd and irq entries
The tool uses topo.max_cpu_num to determine number of entries needed for
fd_percpu[] and irqs_per_cpu[]. For example on a system with 4 CPUs
topo.max_cpu_num is 3 so we get too small array for holding per-CPU items.

Fix this to use right number of entries, which is topo.max_cpu_num + 1.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:16 -05:00
Len Brown 3d109de23c tools/power turbostat: switch to tab delimited output
Switch to tab-delimited output from fixed-width columns
to make it simpler to import into spreadsheets.

As the fixed width columnns were 8-spaces wide,
the output on the screen should not change.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:16 -05:00
Len Brown ba3dec99fc tools/power turbostat: Gracefully handle ACPI S3
turbostat gives valid results across suspend to idle, aka freeze,
whether invoked in  interval mode, or in command mode.
Indeed, this can be used to measure suspend to idle:

turbostat echo freeze > /sys/power/state

But this does not work across suspend to ACPI S3, because the
processor counters, including the TSC, are reset on resume.
Further, when turbostat detects a problem, it does't forgive
the hardware, and interval mode will print *'s from there on out.

Instead, upon detecting counters going backwards, simply
reset and start over.

Interval mode across ACPI S3: (observe TSC going backwards)

root@sharkbay:/home/lenb/turbostat-src# ./turbostat -M 0x10
     CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz           MSR 0x010
       -       1    0.06     858    2294  0x0000000000000000
       0       0    0.06     847    2294  0x0000002a254b98ac
       1       1    0.06     878    2294  0x0000002a254efa3a
       2       1    0.07     843    2294  0x0000002a2551df65
       3       0    0.05     863    2294  0x0000002a2553fea2
turbostat: re-initialized with num_cpus 4
     CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz           MSR 0x010
       -       2    0.20     849    2294  0x0000000000000000
       0       2    0.26     856    2294  0x0000000449abb60d
       1       2    0.20     844    2294  0x0000000449b087ec
       2       2    0.21     850    2294  0x0000000449b35d5d
       3       1    0.12     839    2294  0x0000000449b5fd5a
^C

Command mode across ACPI S3:
root@sharkbay:/home/lenb/turbostat-src# ./turbostat -M 0x10 sleep 10
./turbostat: Counter reset detected
14.196299 sec

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:15 -05:00
Len Brown e975db5d52 tools/power turbostat: tidy up output on Joule counter overflow
The RAPL Joules counter is limited in capacity.
Turbostat estimates how soon it can roll-over
based on the max TDP of the processor --
which tells us the maximum increment rate.

eg.
RAPL: 2759 sec. Joule Counter Range, at 95 Watts

So if a sample duration is longer than 2759 seconds on this system,
'**' replace the decimal place in the display to indicate
that the results may be suspect.

But the display had an extra ' ' in this case, throwing off the columns.

Also, the -J "Joules" option appended an extra "time" column
to the display.  While this may be useful, it printed the interval time,
which may not be the accurate time per processor.  Remove this column,
which appeared only when using '-J',
as we plan to add accurate per-cpu interval times in a future commit.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:15 -05:00
Linus Torvalds 43c4f67c96 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "7 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: fix false-positive WARN_ON() in truncate/invalidate for hugetlb
  kasan: support use-after-scope detection
  kasan: update kasan_global for gcc 7
  lib/debugobjects: export for use in modules
  zram: fix unbalanced idr management at hot removal
  thp: fix corner case of munlock() of PTE-mapped THPs
  mm, thp: propagation of conditional compilation in khugepaged.c
2016-11-30 16:33:41 -08:00
Kirill A. Shutemov 5cbc198ae0 mm: fix false-positive WARN_ON() in truncate/invalidate for hugetlb
Hugetlb pages have ->index in size of the huge pages (PMD_SIZE or
PUD_SIZE), not in PAGE_SIZE as other types of pages.  This means we
cannot user page_to_pgoff() to check whether we've got the right page
for the radix-tree index.

Let's introduce page_to_index() which would return radix-tree index for
given page.

We will be able to get rid of this once hugetlb will be switched to
multi-order entries.

Fixes: fc127da085 ("truncate: handle file thp")
Link: http://lkml.kernel.org/r/20161123093053.mjbnvn5zwxw5e6lk@black.fi.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Doug Nelson <doug.nelson@intel.com>
Tested-by: Doug Nelson <doug.nelson@intel.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Dmitry Vyukov 828347f8f9 kasan: support use-after-scope detection
Gcc revision 241896 implements use-after-scope detection.  Will be
available in gcc 7.  Support it in KASAN.

Gcc emits 2 new callbacks to poison/unpoison large stack objects when
they go in/out of scope.  Implement the callbacks and add a test.

[dvyukov@google.com: v3]
  Link: http://lkml.kernel.org/r/1479998292-144502-1-git-send-email-dvyukov@google.com
Link: http://lkml.kernel.org/r/1479226045-145148-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Dmitry Vyukov 045d599a28 kasan: update kasan_global for gcc 7
kasan_global struct is part of compiler/runtime ABI.  gcc revision
241983 has added a new field to kasan_global struct.  Update kernel
definition of kasan_global struct to include the new field.

Without this patch KASAN is broken with gcc 7.

Link: http://lkml.kernel.org/r/1479219743-28682-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Chris Wilson f8ff04e2be lib/debugobjects: export for use in modules
Drivers, or other modules, that use a mixture of objects (especially
objects embedded within other objects) would like to take advantage of
the debugobjects facilities to help catch misuse.  Currently, the
debugobjects interface is only available to builtin drivers and requires
a set of EXPORT_SYMBOL_GPL for use by modules.

I am using the debugobjects in i915.ko to try and catch some invalid
operations on embedded objects.  The problem currently only presents
itself across module unload so forcing i915 to be builtin is not an
option.

Link: http://lkml.kernel.org/r/20161122143039.6433-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Du, Changbin" <changbin.du@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Takashi Iwai 529e71e164 zram: fix unbalanced idr management at hot removal
The zram hot removal code calls idr_remove() even when zram_remove()
returns an error (typically -EBUSY).  This results in a leftover at the
device release, eventually leading to a crash when the module is
reloaded.

As described in the bug report below, the following procedure would
cause an Oops with zram:

 - provision three zram devices via modprobe zram num_devices=3
 - configure a size for each device
   + echo "1G" > /sys/block/$zram_name/disksize
 - mkfs and mount zram0 only
 - attempt to hot remove all three devices
   + echo 2 > /sys/class/zram-control/hot_remove
   + echo 1 > /sys/class/zram-control/hot_remove
   + echo 0 > /sys/class/zram-control/hot_remove
     - zram0 removal fails with EBUSY, as expected
 - unmount zram0
 - try zram0 hot remove again
   + echo 0 > /sys/class/zram-control/hot_remove
     - fails with ENODEV (unexpected)
 - unload zram kernel module
   + completes successfully
 - zram0 device node still exists
 - attempt to mount /dev/zram0
   + mount command is killed
   + following BUG is encountered

 BUG: unable to handle kernel paging request at ffffffffa0002ba0
 IP: get_disk+0x16/0x50
 Oops: 0000 [#1] SMP
 CPU: 0 PID: 252 Comm: mount Not tainted 4.9.0-rc6 #176
 Call Trace:
   exact_lock+0xc/0x20
   kobj_lookup+0xdc/0x160
   get_gendisk+0x2f/0x110
   __blkdev_get+0x10c/0x3c0
   blkdev_get+0x19d/0x2e0
   blkdev_open+0x56/0x70
   do_dentry_open.isra.19+0x1ff/0x310
   vfs_open+0x43/0x60
   path_openat+0x2c9/0xf30
   do_filp_open+0x79/0xd0
   do_sys_open+0x114/0x1e0
   SyS_open+0x19/0x20
   entry_SYSCALL_64_fastpath+0x13/0x94

This patch adds the proper error check in hot_remove_store() not to call
idr_remove() unconditionally.

Fixes: 17ec4cd985 ("zram: don't call idr_remove() from zram_remove()")
Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1010970
Link: http://lkml.kernel.org/r/20161121132140.12683-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Reported-by: David Disseldorp <ddiss@suse.de>
Tested-by: David Disseldorp <ddiss@suse.de>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: <stable@vger.kernel.org>    [4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Kirill A. Shutemov 655548bf62 thp: fix corner case of munlock() of PTE-mapped THPs
The following program triggers BUG() in munlock_vma_pages_range():

	// autogenerated by syzkaller (http://github.com/google/syzkaller)
	#include <sys/mman.h>

	int main()
	{
	  mmap((void*)0x20105000ul, 0xc00000ul, 0x2ul, 0x2172ul, -1, 0);
	  mremap((void*)0x201fd000ul, 0x4000ul, 0xc00000ul, 0x3ul, 0x203f0000ul);
	  return 0;
	}

The test-case constructs the situation when munlock_vma_pages_range()
finds PTE-mapped THP-head in the middle of page table and, by mistake,
skips HPAGE_PMD_NR pages after that.

As result, on the next iteration it hits the middle of PMD-mapped THP
and gets upset seeing mlocked tail page.

The solution is only skip HPAGE_PMD_NR pages if the THP was mlocked
during munlock_vma_page().  It would guarantee that the page is
PMD-mapped as we never mlock PTE-mapeed THPs.

Fixes: e90309c9f7 ("thp: allow mlocked THP again")
Link: http://lkml.kernel.org/r/20161115132703.7s7rrgmwttegcdh4@black.fi.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Jérémy Lefaure e1465d125d mm, thp: propagation of conditional compilation in khugepaged.c
Commit b46e756f5e ("thp: extract khugepaged from mm/huge_memory.c")
moved code from huge_memory.c to khugepaged.c.  Some of this code should
be compiled only when CONFIG_SYSFS is enabled but the condition around
this code was not moved into khugepaged.c.

The result is a compilation error when CONFIG_SYSFS is disabled:

  mm/built-in.o: In function `khugepaged_defrag_store': khugepaged.c:(.text+0x2d095): undefined reference to `single_hugepage_flag_store'
  mm/built-in.o: In function `khugepaged_defrag_show': khugepaged.c:(.text+0x2d0ab): undefined reference to `single_hugepage_flag_show'

This commit adds the #ifdef CONFIG_SYSFS around the code related to
sysfs.

Link: http://lkml.kernel.org/r/20161114203448.24197-1-jeremy.lefaure@lse.epita.fr
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Linus Torvalds f513581c35 Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "Two small fixes for MIPI PLLs on sunxi devices and a build fix for a
  Broadcom clk driver having unmet dependencies"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: bcm: Fix unmet Kconfig dependencies for CLK_BCM_63XX
  clk: sunxi-ng: enable so-said LDOs for A33 SoC's pll-mipi clock
  clk: sunxi-ng: sun6i-a31: Enable PLL-MIPI LDOs when ungating it
2016-11-30 15:15:49 -08:00