Merge tag 'kvm-s390-next-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: features and fixes for 4.9

- lazy enablement of runtime instrumentation
- up to 255 CPUs for nested guests
- rework of machine check delivery
- cleanups/fixes
Paolo Bonzini
2016-09-08 15:35:44 +02:00
138 changed files with 2460 additions and 651 deletions
+1 -1
@@ -131,7 +131,7 @@ pygments_style = 'sphinx'
todo_include_todos = False
primary_domain = 'C'
highlight_language = 'C'
highlight_language = 'guess'
# -- Options for HTML output ----------------------------------------------
+2 -2
@@ -19,5 +19,5 @@ enhancements. It can monitor up to 4 voltages, 16 temperatures and
implemented in this driver.
Specification of the chip can be found here:
ftp:///pub/Mainboard-OEM-Sales/Services/Software&Tools/Linux_SystemMonitoring&Watchdog&GPIO/BMC-Teutates_Specification_V1.21.pdf
ftp:///pub/Mainboard-OEM-Sales/Services/Software&Tools/Linux_SystemMonitoring&Watchdog&GPIO/Fujitsu_mainboards-1-Sensors_HowTo-en-US.pdf
ftp://ftp.ts.fujitsu.com/pub/Mainboard-OEM-Sales/Services/Software&Tools/Linux_SystemMonitoring&Watchdog&GPIO/BMC-Teutates_Specification_V1.21.pdf
ftp://ftp.ts.fujitsu.com/pub/Mainboard-OEM-Sales/Services/Software&Tools/Linux_SystemMonitoring&Watchdog&GPIO/Fujitsu_mainboards-1-Sensors_HowTo-en-US.pdf
-6
@@ -366,8 +366,6 @@ Domain`_ references.
Cross-referencing from reStructuredText
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. highlight:: none
To cross-reference the functions and types defined in the kernel-doc comments
from reStructuredText documents, please use the `Sphinx C Domain`_
references. For example::
@@ -390,8 +388,6 @@ For further details, please refer to the `Sphinx C Domain`_ documentation.
Function documentation
----------------------
.. highlight:: c
The general format of a function and function-like macro kernel-doc comment is::
/**
@@ -572,8 +568,6 @@ DocBook XML [DEPRECATED]
Converting DocBook to Sphinx
----------------------------
.. highlight:: none
Over time, we expect all of the documents under ``Documentation/DocBook`` to be
converted to Sphinx and reStructuredText. For most DocBook XML documents, a good
enough solution is to use the simple ``Documentation/sphinx/tmplcvt`` script,
+26 -1
@@ -164,7 +164,32 @@ load n/2 modules more and try again.
Again, if you find the offending module(s), it(they) must be unloaded every time
before hibernation, and please report the problem with it(them).
c) Advanced debugging
c) Using the "test_resume" hibernation option
/sys/power/disk generally tells the kernel what to do after creating a
hibernation image. One of the available options is "test_resume" which
causes the just created image to be used for immediate restoration. Namely,
after doing:
# echo test_resume > /sys/power/disk
# echo disk > /sys/power/state
a hibernation image will be created and a resume from it will be triggered
immediately without involving the platform firmware in any way.
That test can be used to check if failures to resume from hibernation are
related to bad interactions with the platform firmware. That is, if the above
works every time, but resume from actual hibernation does not work or is
unreliable, the platform firmware may be responsible for the failures.
On architectures and platforms that support using different kernels to restore
hibernation images (that is, the kernel used to read the image from storage and
load it into memory is different from the one included in the image) or support
kernel address space randomization, it also can be used to check if failures
to resume may be related to the differences between the restore and image
kernels.
d) Advanced debugging
In case that hibernation does not work on your system even in the minimal
configuration and compiling more drivers as modules is not practical or some
+59 -58
@@ -1,75 +1,76 @@
Power Management Interface
Power Management Interface for System Sleep
Copyright (c) 2016 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The power management subsystem provides a unified sysfs interface to
userspace, regardless of what architecture or platform one is
running. The interface exists in /sys/power/ directory (assuming sysfs
is mounted at /sys).
The power management subsystem provides userspace with a unified sysfs interface
for system sleep regardless of the underlying system architecture or platform.
The interface is located in the /sys/power/ directory (assuming that sysfs is
mounted at /sys).
/sys/power/state controls system power state. Reading from this file
returns what states are supported, which is hard-coded to 'freeze',
'standby' (Power-On Suspend), 'mem' (Suspend-to-RAM), and 'disk'
(Suspend-to-Disk).
/sys/power/state is the system sleep state control file.
Writing to this file one of those strings causes the system to
transition into that state. Please see the file
Documentation/power/states.txt for a description of each of those
states.
Reading from it returns a list of supported sleep states, encoded as:
'freeze' (Suspend-to-Idle)
'standby' (Power-On Suspend)
'mem' (Suspend-to-RAM)
'disk' (Suspend-to-Disk)
/sys/power/disk controls the operating mode of the suspend-to-disk
mechanism. Suspend-to-disk can be handled in several ways. We have a
few options for putting the system to sleep - using the platform driver
(e.g. ACPI or other suspend_ops), powering off the system or rebooting the
system (for testing).
Suspend-to-Idle is always supported. Suspend-to-Disk is always supported
too as long as the kernel has been configured to support hibernation at all
(i.e. CONFIG_HIBERNATION is set in the kernel configuration file). Support
for Suspend-to-RAM and Power-On Suspend depends on the capabilities of the
platform.
Additionally, /sys/power/disk can be used to turn on one of the two testing
modes of the suspend-to-disk mechanism: 'testproc' or 'test'. If the
suspend-to-disk mechanism is in the 'testproc' mode, writing 'disk' to
/sys/power/state will cause the kernel to disable nonboot CPUs and freeze
tasks, wait for 5 seconds, unfreeze tasks and enable nonboot CPUs. If it is
in the 'test' mode, writing 'disk' to /sys/power/state will cause the kernel
to disable nonboot CPUs and freeze tasks, shrink memory, suspend devices, wait
for 5 seconds, resume devices, unfreeze tasks and enable nonboot CPUs. Then,
we are able to look in the log messages and work out, for example, which code
is being slow and which device drivers are misbehaving.
If one of the strings listed in /sys/power/state is written to it, the system
will attempt to transition into the corresponding sleep state. Refer to
Documentation/power/states.txt for a description of each of those states.
Reading from this file will display all supported modes and the currently
selected one in brackets, for example
/sys/power/disk controls the operating mode of hibernation (Suspend-to-Disk).
Specifically, it tells the kernel what to do after creating a hibernation image.
[shutdown] reboot test testproc
Reading from it returns a list of supported options encoded as:
Writing to this file will accept one of
'platform' (put the system into sleep using a platform-provided method)
'shutdown' (shut the system down)
'reboot' (reboot the system)
'suspend' (trigger a Suspend-to-RAM transition)
'test_resume' (resume-after-hibernation test mode)
'platform' (only if the platform supports it)
'shutdown'
'reboot'
'testproc'
'test'
The currently selected option is printed in square brackets.
/sys/power/image_size controls the size of the image created by
the suspend-to-disk mechanism. It can be written a string
representing a non-negative integer that will be used as an upper
limit of the image size, in bytes. The suspend-to-disk mechanism will
do its best to ensure the image size will not exceed that number. However,
if this turns out to be impossible, it will try to suspend anyway using the
smallest image possible. In particular, if "0" is written to this file, the
suspend image will be as small as possible.
The 'platform' option is only available if the platform provides a special
mechanism to put the system to sleep after creating a hibernation image (ACPI
does that, for example). The 'suspend' option is available if Suspend-to-RAM
is supported. Refer to Documentation/power/basic_pm_debugging.txt for the
description of the 'test_resume' option.
Reading from this file will display the current image size limit, which
is set to 2/5 of available RAM by default.
To select an option, write the string representing it to /sys/power/disk.
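As an illustration of the bracket convention described above for /sys/power/disk, the following C sketch extracts the currently selected option from a line of that form. The function name selected_option and the sample strings used with it are hypothetical and are not part of the kernel or of this commit; the sketch only assumes that the selected option is the first token wrapped in square brackets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed token of a /sys/power/disk style line into 'out'.
 * Returns 0 on success, or -1 if no bracketed token is found or if it
 * does not fit into 'out' (including the terminating NUL).
 * Hypothetical helper, for illustration only.
 */
int selected_option(const char *line, char *out, size_t outlen)
{
	const char *open, *close;
	size_t len;

	assert(line != NULL && out != NULL);

	open = strchr(line, '[');
	if (!open)
		return -1;
	close = strchr(open + 1, ']');
	if (!close)
		return -1;

	len = (size_t)(close - open - 1);
	if (len + 1 > outlen)
		return -1;

	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}
```

A caller would read one line from /sys/power/disk into a buffer and pass it to the helper; a line without brackets is reported as an error rather than guessed at.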
/sys/power/pm_trace controls the code which saves the last PM event point in
the RTC across reboots, so that you can debug a machine that just hangs
during suspend (or more commonly, during resume). Namely, the RTC is only
used to save the last PM event point if this file contains '1'. Initially it
contains '0' which may be changed to '1' by writing a string representing a
nonzero integer into it.
/sys/power/image_size controls the size of hibernation images.
To use this debugging feature you should attempt to suspend the machine, then
reboot it and run
It can be written a string representing a non-negative integer that will be
used as a best-effort upper limit of the image size, in bytes. The hibernation
core will do its best to ensure that the image size will not exceed that number.
However, if that turns out to be impossible to achieve, a hibernation image will
still be created and its size will be as small as possible. In particular,
writing '0' to this file will enforce hibernation images to be as small as
possible.
dmesg -s 1000000 | grep 'hash matches'
Reading from this file returns the current image size limit, which is set to
around 2/5 of available RAM by default.
CAUTION: Using it will cause your machine's real-time (CMOS) clock to be
set to a random invalid time after a resume.
/sys/power/pm_trace controls the PM trace mechanism saving the last suspend
or resume event point in the RTC across reboots.
It helps to debug hard lockups or reboots due to device driver failures that
occur during system suspend or resume (which is more common) more effectively.
If /sys/power/pm_trace contains '1', the fingerprint of each suspend/resume
event point in turn will be stored in the RTC memory (overwriting the actual
RTC information), so it will survive a system crash if one occurs right after
storing it and it can be used later to identify the driver that caused the crash
to happen (see Documentation/power/s2ram.txt for more information).
Initially it contains '0' which may be changed to '1' by writing a string
representing a nonzero integer into it.
@@ -42,11 +42,12 @@
caption a.headerlink { opacity: 0; }
caption a.headerlink:hover { opacity: 1; }
/* inline literal: drop the borderbox and red color */
/* inline literal: drop the borderbox, padding and red color */
code, .rst-content tt, .rst-content code {
color: inherit;
border: none;
padding: unset;
background: inherit;
font-size: 85%;
}
+6
@@ -4525,6 +4525,12 @@ L: linux-edac@vger.kernel.org
S: Maintained
F: drivers/edac/sb_edac.c
EDAC-SKYLAKE
M: Tony Luck <tony.luck@intel.com>
L: linux-edac@vger.kernel.org
S: Maintained
F: drivers/edac/skx_edac.c
EDAC-XGENE
APPLIED MICRO (APM) X-GENE SOC EDAC
M: Loc Ho <lho@apm.com>
+1 -1
@@ -1,7 +1,7 @@
VERSION = 4
PATCHLEVEL = 8
SUBLEVEL = 0
EXTRAVERSION = -rc2
EXTRAVERSION = -rc3
NAME = Psychotic Stoned Sheep
# *DOCUMENTATION*
+1
@@ -295,6 +295,7 @@ __und_svc_fault:
bl __und_fault
__und_svc_finish:
get_thread_info tsk
ldr r5, [sp, #S_PSR] @ Get SVC cpsr
svc_exit r5 @ return from exception
UNWIND(.fnend )
+6
@@ -271,6 +271,12 @@ static int __init imx_gpc_init(struct device_node *node,
for (i = 0; i < IMR_NUM; i++)
writel_relaxed(~0, gpc_base + GPC_IMR1 + i * 4);
/*
* Clear the OF_POPULATED flag set in of_irq_init so that
* later the GPC power domain driver will not be skipped.
*/
of_node_clear_flag(node, OF_POPULATED);
return 0;
}
IRQCHIP_DECLARE(imx_gpc, "fsl,imx6q-gpc", imx_gpc_init);
+16 -5
@@ -728,7 +728,8 @@ static void *__init late_alloc(unsigned long sz)
{
void *ptr = (void *)__get_free_pages(PGALLOC_GFP, get_order(sz));
BUG_ON(!ptr);
if (!ptr || !pgtable_page_ctor(virt_to_page(ptr)))
BUG();
return ptr;
}
@@ -1155,10 +1156,19 @@ void __init sanity_check_meminfo(void)
{
phys_addr_t memblock_limit = 0;
int highmem = 0;
phys_addr_t vmalloc_limit = __pa(vmalloc_min - 1) + 1;
u64 vmalloc_limit;
struct memblock_region *reg;
bool should_use_highmem = false;
/*
* Let's use our own (unoptimized) equivalent of __pa() that is
* not affected by wrap-arounds when sizeof(phys_addr_t) == 4.
* The result is used as the upper bound on physical memory address
* and may itself be outside the valid range for which phys_addr_t
* and therefore __pa() is defined.
*/
vmalloc_limit = (u64)(uintptr_t)vmalloc_min - PAGE_OFFSET + PHYS_OFFSET;
for_each_memblock(memory, reg) {
phys_addr_t block_start = reg->base;
phys_addr_t block_end = reg->base + reg->size;
@@ -1183,10 +1193,11 @@ void __init sanity_check_meminfo(void)
if (reg->size > size_limit) {
phys_addr_t overlap_size = reg->size - size_limit;
pr_notice("Truncating RAM at %pa-%pa to -%pa",
&block_start, &block_end, &vmalloc_limit);
memblock_remove(vmalloc_limit, overlap_size);
pr_notice("Truncating RAM at %pa-%pa",
&block_start, &block_end);
block_end = vmalloc_limit;
pr_cont(" to -%pa", &block_end);
memblock_remove(vmalloc_limit, overlap_size);
should_use_highmem = true;
}
}
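The wrap-around that the new comment in sanity_check_meminfo() refers to can be reproduced in plain user-space C. The sketch below is only an illustration under stated assumptions: the helper names limit_32bit and limit_64bit are hypothetical, and the sample PAGE_OFFSET, PHYS_OFFSET and vmalloc_min values in the usage note are made up to force a result above 4 GiB. It is not the kernel's __pa() implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Old-style computation: every step is carried out in 32 bits, as it is
 * when sizeof(phys_addr_t) == 4, so a limit of 4 GiB or more wraps around.
 */
uint32_t limit_32bit(uint32_t vmalloc_min, uint32_t page_offset,
		     uint32_t phys_offset)
{
	uint32_t pa;

	/* The sketch assumes vmalloc_min lies above PAGE_OFFSET. */
	assert(vmalloc_min > page_offset);

	pa = vmalloc_min - 1 - page_offset + phys_offset;
	return pa + 1;
}

/*
 * New-style computation: widen to 64 bits first, so the upper bound may
 * exceed the range representable in a 32-bit physical address.
 */
uint64_t limit_64bit(uint32_t vmalloc_min, uint32_t page_offset,
		     uint32_t phys_offset)
{
	assert(vmalloc_min > page_offset);

	return (uint64_t)vmalloc_min - page_offset + phys_offset;
}
```

With the assumed values vmalloc_min = 0xEF800000, PAGE_OFFSET = 0xC0000000 and PHYS_OFFSET = 0xF0000000, the 32-bit variant yields 0x1F800000 while the 64-bit variant yields 0x11F800000, which is why the patch switches vmalloc_limit to u64.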
+9 -1
@@ -101,12 +101,20 @@ ENTRY(cpu_resume)
bl el2_setup // if in EL2 drop to EL1 cleanly
/* enable the MMU early - so we can access sleep_save_stash by va */
adr_l lr, __enable_mmu /* __cpu_setup will return here */
ldr x27, =_cpu_resume /* __enable_mmu will branch here */
adr_l x27, _resume_switched /* __enable_mmu will branch here */
adrp x25, idmap_pg_dir
adrp x26, swapper_pg_dir
b __cpu_setup
ENDPROC(cpu_resume)
.pushsection ".idmap.text", "ax"
_resume_switched:
ldr x8, =_cpu_resume
br x8
ENDPROC(_resume_switched)
.ltorg
.popsection
ENTRY(_cpu_resume)
mrs x1, mpidr_el1
adrp x8, mpidr_hash
+3 -3
@@ -242,7 +242,7 @@ static void note_page(struct pg_state *st, unsigned long addr, unsigned level,
static void walk_pte(struct pg_state *st, pmd_t *pmd, unsigned long start)
{
pte_t *pte = pte_offset_kernel(pmd, 0);
pte_t *pte = pte_offset_kernel(pmd, 0UL);
unsigned long addr;
unsigned i;
@@ -254,7 +254,7 @@ static void walk_pte(struct pg_state *st, pmd_t *pmd, unsigned long start)
static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
{
pmd_t *pmd = pmd_offset(pud, 0);
pmd_t *pmd = pmd_offset(pud, 0UL);
unsigned long addr;
unsigned i;
@@ -271,7 +271,7 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
{
pud_t *pud = pud_offset(pgd, 0);
pud_t *pud = pud_offset(pgd, 0UL);
unsigned long addr;
unsigned i;
+2
@@ -23,6 +23,8 @@
#include <linux/module.h>
#include <linux/of.h>
#include <asm/acpi.h>
struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
EXPORT_SYMBOL(node_data);
nodemask_t numa_nodes_parsed __initdata;
+2 -2
@@ -97,10 +97,10 @@
#define ENOTCONN 235 /* Transport endpoint is not connected */
#define ESHUTDOWN 236 /* Cannot send after transport endpoint shutdown */
#define ETOOMANYREFS 237 /* Too many references: cannot splice */
#define EREFUSED ECONNREFUSED /* for HP's NFS apparently */
#define ETIMEDOUT 238 /* Connection timed out */
#define ECONNREFUSED 239 /* Connection refused */
#define EREMOTERELEASE 240 /* Remote peer released connection */
#define EREFUSED ECONNREFUSED /* for HP's NFS apparently */
#define EREMOTERELEASE 240 /* Remote peer released connection */
#define EHOSTDOWN 241 /* Host is down */
#define EHOSTUNREACH 242 /* No route to host */
-8
@@ -51,8 +51,6 @@ EXPORT_SYMBOL(_parisc_requires_coherency);
DEFINE_PER_CPU(struct cpuinfo_parisc, cpu_data);
extern int update_cr16_clocksource(void); /* from time.c */
/*
** PARISC CPU driver - claim "device" and initialize CPU data structures.
**
@@ -228,12 +226,6 @@ static int processor_probe(struct parisc_device *dev)
}
#endif
/* If we've registered more than one cpu,
* we'll use the jiffies clocksource since cr16
* is not synchronized between CPUs.
*/
update_cr16_clocksource();
return 0;
}
-12
@@ -221,18 +221,6 @@ static struct clocksource clocksource_cr16 = {
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
};
int update_cr16_clocksource(void)
{
/* since the cr16 cycle counters are not synchronized across CPUs,
we'll check if we should switch to a safe clocksource: */
if (clocksource_cr16.rating != 0 && num_online_cpus() > 1) {
clocksource_change_rating(&clocksource_cr16, 0);
return 1;
}
return 0;
}
void __init start_cpu_itimer(void)
{
unsigned int cpu = smp_processor_id();
+24
@@ -55,4 +55,28 @@ static struct facility_def facility_defs[] = {
-1 /* END */
}
},
{
.name = "FACILITIES_KVM",
.bits = (int[]){
0, /* N3 instructions */
1, /* z/Arch mode installed */
2, /* z/Arch mode active */
3, /* DAT-enhancement */
4, /* idte segment table */
5, /* idte region table */
6, /* ASN-and-LX reuse */
7, /* stfle */
8, /* enhanced-DAT 1 */
9, /* sense-running-status */
10, /* conditional sske */
13, /* ipte-range */
14, /* nonquiescing key-setting */
73, /* transactional execution */
75, /* access-exception-fetch/store indication */
76, /* msa extension 3 */
77, /* msa extension 4 */
78, /* enhanced-DAT 2 */
-1 /* END */
}
},
};
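The FACILITIES_KVM entry above is a list of facility bit numbers terminated by -1. As a rough sketch of how such a list could be folded into facility-list doublewords, the C helper below assumes the s390 convention that facility bit 0 is the most significant bit of the first 64-bit word. The name facility_words is hypothetical, and this is not the generator code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fold a -1 terminated list of facility bit numbers into 'nwords'
 * 64-bit words. Bit numbers count from the most significant bit of
 * words[0]; bits that do not fit into 'nwords' words are ignored.
 * Hypothetical helper, for illustration only.
 */
void facility_words(const int *bits, uint64_t *words, size_t nwords)
{
	size_t i;

	assert(bits != NULL && words != NULL);

	for (i = 0; i < nwords; i++)
		words[i] = 0;

	for (; *bits >= 0; bits++) {
		size_t word = (size_t)*bits / 64;

		if (word >= nwords)
			continue;
		words[word] |= UINT64_C(1) << (63 - (*bits % 64));
	}
}
```

For the list {0, 1, 73, -1}, bits 0 and 1 land in the two topmost bits of the first word, and bit 73 becomes bit position 9 (counted from the left) of the second word.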
+1 -1
@@ -28,7 +28,7 @@
#define KVM_S390_BSCA_CPU_SLOTS 64
#define KVM_S390_ESCA_CPU_SLOTS 248
#define KVM_MAX_VCPUS KVM_S390_ESCA_CPU_SLOTS
#define KVM_MAX_VCPUS 255
#define KVM_USER_MEM_SLOTS 32
/*
+1
@@ -125,6 +125,7 @@ int main(void)
OFFSET(__LC_STFL_FAC_LIST, lowcore, stfl_fac_list);
OFFSET(__LC_STFLE_FAC_LIST, lowcore, stfle_fac_list);
OFFSET(__LC_MCCK_CODE, lowcore, mcck_interruption_code);
OFFSET(__LC_EXT_DAMAGE_CODE, lowcore, external_damage_code);
OFFSET(__LC_MCCK_FAIL_STOR_ADDR, lowcore, failing_storage_address);
OFFSET(__LC_LAST_BREAK, lowcore, breaking_event_addr);
OFFSET(__LC_RST_OLD_PSW, lowcore, restart_old_psw);

Some files were not shown because too many files have changed in this diff.