Commit Graph

366 Commits

Author SHA1 Message Date
Boris Ostrovsky a71dbdaa8c hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests
Commit 61f01dd941 ("x86_64, asm: Work around AMD SYSRET SS descriptor
attribute issue") makes AMD processors set SS to __KERNEL_DS in
__switch_to() to deal with cases when SS is NULL.

This breaks Xen PV guests who do not want to load SS with__KERNEL_DS.

Since the problem that the commit is trying to address would have to be
fixed in the hypervisor (if it in fact exists under Xen) there is no
reason to set X86_BUG_SYSRET_SS_ATTRS flag for PV VPCUs here.

This can be easily achieved by adding x86_hyper_xen_hvm.set_cpu_features
op which will clear this flag. (And since this structure is no longer
HVM-specific we should do some renaming).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-05-05 18:27:43 +01:00
Linus Torvalds 497a5df7bf Merge tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen features and fixes from David Vrabel:

 - use a single source list of hypercalls, generating other tables etc.
   at build time.

 - add a "Xen PV" APIC driver to support >255 VCPUs in PV guests.

 - significant performance improve to guest save/restore/migration.

 - scsiback/front save/restore support.

 - infrastructure for multi-page xenbus rings.

 - misc fixes.

* tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/pci: Try harder to get PXM information for Xen
  xenbus_client: Extend interface to support multi-page ring
  xen-pciback: also support disabling of bus-mastering and memory-write-invalidate
  xen: support suspend/resume in pvscsi frontend
  xen: scsiback: add LUN of restored domain
  xen-scsiback: define a pr_fmt macro with xen-pvscsi
  xen/mce: fix up xen_late_init_mcelog() error handling
  xen/privcmd: improve performance of MMAPBATCH_V2
  xen: unify foreign GFN map/unmap for auto-xlated physmap guests
  x86/xen/apic: WARN with details.
  x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs
  xen/pciback: Don't print scary messages when unsupported by hypervisor.
  xen: use generated hypercall symbols in arch/x86/xen/xen-head.S
  xen: use generated hypervisor symbols in arch/x86/xen/trace.c
  xen: synchronize include/xen/interface/xen.h with xen
  xen: build infrastructure for generating hypercall depending symbols
  xen: balloon: Use static attribute groups for sysfs entries
  xen: pcpu: Use static attribute groups for sysfs entry
2015-04-16 14:01:03 -05:00
Konrad Rzeszutek Wilk feb44f1f7a x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs
Instead of mangling the default APIC driver, provide a Xen PV guest
specific one that explicitly provides appropriate methods.

This allows use to report that all APIC IDs are valid, allowing dom0
to boot with more than 255 VCPUs.

Since the probe order of APIC drivers is link dependent, we add in an
late probe function to change to the Xen PV if it hadn't been done
during bootup.

Suggested-by: David Vrabel <david.vrabel@citrix.com>
Reported-by: Cathy Avery <cathy.avery@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-03-16 14:49:14 +00:00
Andy Lutomirski 8ef46a672a x86/asm/entry: Add this_cpu_sp0() to read sp0 for the current cpu
We currently store references to the top of the kernel stack in
multiple places: kernel_stack (with an offset) and
init_tss.x86_tss.sp0 (no offset).  The latter is defined by
hardware and is a clean canonical way to find the top of the
stack.  Add an accessor so we can start using it.

This needs minor paravirt tweaks.  On native, sp0 defines the
top of the kernel stack and is therefore always correct.  On Xen
and lguest, the hypervisor tracks the top of the stack, but we
want to start reading sp0 in the kernel.  Fixing this is simple:
just update our local copy of sp0 as well as the hypervisor's
copy on task switches.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8d675581859712bee09a055ed8f785d80dac1eca.1425611534.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-06 08:32:57 +01:00
Boris Ostrovsky 5054daa285 x86/xen: Initialize cr4 shadow for 64-bit PV(H) guests
Commit 1e02ce4ccc ("x86: Store a per-cpu shadow copy of CR4")
introduced CR4 shadows.

These shadows are initialized in early boot code. The commit missed
initialization for 64-bit PV(H) guests that this patch adds.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-02-23 16:30:26 +00:00
Boris Ostrovsky 31795b470b x86/xen: Make sure X2APIC_ENABLE bit of MSR_IA32_APICBASE is not set
Commit d524165cb8 ("x86/apic: Check x2apic early") tests X2APIC_ENABLE
bit of MSR_IA32_APICBASE when CONFIG_X86_X2APIC is off and panics
the kernel when this bit is set.

Xen's PV guests will pass this MSR read to the hypervisor which will
return its version of the MSR, where this bit might be set. Make sure
we clear it before returning MSR value to the caller.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-02-23 16:30:23 +00:00
Andy Lutomirski 375074cc73 x86: Clean up cr4 manipulation
CR4 manipulation was split, seemingly at random, between direct
(write_cr4) and using a helper (set/clear_in_cr4).  Unfortunately,
the set_in_cr4 and clear_in_cr4 helpers also poke at the boot code,
which only a small subset of users actually wanted.

This patch replaces all cr4 access in functions that don't leave cr4
exactly the way they found it with new helpers cr4_set_bits,
cr4_clear_bits, and cr4_set_bits_and_update_boot.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Vince Weaver <vince@deater.net>
Cc: "hillf.zj" <hillf.zj@alibaba-inc.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/495a10bdc9e67016b8fd3945700d46cfd5c12c2f.1414190806.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 12:10:41 +01:00
Linus Torvalds 613d4cefbb Merge tag 'stable/for-linus-3.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
 "Several critical linear p2m fixes that prevented some hosts from
  booting"

* tag 'stable/for-linus-3.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: properly retrieve NMI reason
  xen: check for zero sized area when invalidating memory
  xen: use correct type for physical addresses
  xen: correct race in alloc_p2m_pmd()
  xen: correct error for building p2m list on 32 bits
  x86/xen: avoid freeing static 'name' when kasprintf() fails
  x86/xen: add extra memory for remapped frames during setup
  x86/xen: don't count how many PFNs are identity mapped
  x86/xen: Free bootmem in free_p2m_page() during early boot
  x86/xen: Remove unnecessary BUG_ON(preemptible()) in xen_setup_timer()
2015-01-14 08:07:42 +13:00
Jan Beulich f221b04fe0 x86/xen: properly retrieve NMI reason
Using the native code here can't work properly, as the hypervisor would
normally have cleared the two reason bits by the time Dom0 gets to see
the NMI (if passed to it at all). There's a shared info field for this,
and there's an existing hook to use - just fit the two together. This
is particularly relevant so that NMIs intended to be handled by APEI /
GHES actually make it to the respective handler.

Note that the hook can (and should) be used irrespective of whether
being in Dom0, as accessing port 0x61 in a DomU would be even worse,
while the shared info field would just hold zero all the time. Note
further that hardware NMI handling for PVH doesn't currently work
anyway due to missing code in the hypervisor (but it is expected to
work the native rather than the PV way).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-01-13 09:39:50 +00:00
Juergen Gross 47591df505 xen: Support Xen pv-domains using PAT
With the dynamical mapping between cache modes and pgprot values it is
now possible to use all cache modes via the Xen hypervisor PAT settings
in a pv domain.

All to be done is to read the PAT configuration MSR and set up the
translation tables accordingly.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stefan.bader@canonical.com
Cc: xen-devel@lists.xensource.com
Cc: ville.syrjala@linux.intel.com
Cc: jbeulich@suse.com
Cc: toshi.kani@hp.com
Cc: plagnioj@jcrosoft.com
Cc: tomi.valkeinen@ti.com
Cc: bhelgaas@google.com
Link: http://lkml.kernel.org/r/1415019724-4317-19-git-send-email-jgross@suse.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16 11:04:26 +01:00
Juergen Gross 2c185687ab x86/xen: delay construction of mfn_list_list
The 3 level p2m tree for the Xen tools is constructed very early at
boot by calling xen_build_mfn_list_list(). Memory needed for this tree
is allocated via extend_brk().

As this tree (other than the kernel internal p2m tree) is only needed
for domain save/restore, live migration and crash dump analysis it
doesn't matter whether it is constructed very early or just some
milliseconds later when memory allocation is possible by other means.

This patch moves the call of xen_build_mfn_list_list() just after
calling xen_pagetable_p2m_copy() simplifying this function, too, as it
doesn't have to bother with two parallel trees now. The same applies
for some other internal functions.

While simplifying code, make early_can_reuse_p2m_middle() static and
drop the unused second parameter. p2m_mid_identity_mfn can be removed
as well, it isn't used either.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-23 16:24:02 +01:00
Linus Torvalds 0429fbc0bd Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu consistent-ops changes from Tejun Heo:
 "Way back, before the current percpu allocator was implemented, static
  and dynamic percpu memory areas were allocated and handled separately
  and had their own accessors.  The distinction has been gone for many
  years now; however, the now duplicate two sets of accessors remained
  with the pointer based ones - this_cpu_*() - evolving various other
  operations over time.  During the process, we also accumulated other
  inconsistent operations.

  This pull request contains Christoph's patches to clean up the
  duplicate accessor situation.  __get_cpu_var() uses are replaced with
  with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

  Unfortunately, the former sometimes is tricky thanks to C being a bit
  messy with the distinction between lvalues and pointers, which led to
  a rather ugly solution for cpumask_var_t involving the introduction of
  this_cpu_cpumask_var_ptr().

  This converts most of the uses but not all.  Christoph will follow up
  with the remaining conversions in this merge window and hopefully
  remove the obsolete accessors"

* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
  irqchip: Properly fetch the per cpu offset
  percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
  ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
  percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
  Revert "powerpc: Replace __get_cpu_var uses"
  percpu: Remove __this_cpu_ptr
  clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
  sparc: Replace __get_cpu_var uses
  avr32: Replace __get_cpu_var with __this_cpu_write
  blackfin: Replace __get_cpu_var uses
  tile: Use this_cpu_ptr() for hardware counters
  tile: Replace __get_cpu_var uses
  powerpc: Replace __get_cpu_var uses
  alpha: Replace __get_cpu_var
  ia64: Replace __get_cpu_var uses
  s390: cio driver &__get_cpu_var replacements
  s390: Replace __get_cpu_var uses
  mips: Replace __get_cpu_var uses
  MIPS: Replace __get_cpu_var uses in FPU emulator.
  arm: Replace __this_cpu_ptr with raw_cpu_ptr
  ...
2014-10-15 07:48:18 +02:00
Mukesh Rathor a2ef5dc2c7 x86/xen: Set EFER.NX and EFER.SCE in PVH guests
This fixes two bugs in PVH guests:

  - Not setting EFER.NX means the NX bit in page table entries is
    ignored on Intel processors and causes reserved bit page faults on
    AMD processors.

  - After the Xen commit 7645640d6ff1 ("x86/PVH: don't set EFER_SCE for
    pvh guest") PVH guests are required to set EFER.SCE to enable the
    SYSCALL instruction.

Secondary VCPUs are started with pagetables with the NX bit set so
EFER.NX must be set before using any stack or data segment.
xen_pvh_cpu_early_init() is the new secondary VCPU entry point that
sets EFER before jumping to cpu_bringup_and_idle().

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-06 10:27:47 +01:00
Juergen Gross d1e9abd630 xen: eliminate scalability issues from initrd handling
Size restrictions native kernels wouldn't have resulted from the initrd
getting mapped into the initial mapping. The kernel doesn't really need
the initrd to be mapped, so use infrastructure available in Xen to avoid
the mapping and hence the restriction.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-03 12:34:52 +01:00
David Vrabel f955371ca9 x86: remove the Xen-specific _PAGE_IOMAP PTE flag
The _PAGE_IO_MAP PTE flag was only used by Xen PV guests to mark PTEs
that were used to map I/O regions that are 1:1 in the p2m.  This
allowed Xen to obtain the correct PFN when converting the MFNs read
from a PTE back to their PFN.

Xen guests no longer use _PAGE_IOMAP for this. Instead mfn_to_pfn()
returns the correct PFN by using a combination of the m2p and p2m to
determine if an MFN corresponds to a 1:1 mapping in the the p2m.

Remove _PAGE_IOMAP, replacing it with _PAGE_UNUSED2 to allow for
future uses of the PTE flag.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
2014-09-23 13:36:20 +00:00
Christoph Lameter 89cbc76768 x86: Replace __get_cpu_var uses
__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :

#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.

This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-08-26 13:45:49 -04:00
Linus Torvalds e306e3be1c Merge tag 'stable/for-linus-3.17-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen updates from David Vrabel:
 - remove unused V2 grant table support
 - note that Konrad is xen-blkkback/front maintainer
 - add 'xen_nopv' option to disable PV extentions for x86 HVM guests
 - misc minor cleanups

* tag 'stable/for-linus-3.17-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen-pciback: Document the 'quirks' sysfs file
  xen/pciback: Fix error return code in xen_pcibk_attach()
  xen/events: drop negativity check of unsigned parameter
  xen/setup: Remove Identity Map Debug Message
  xen/events/fifo: remove a unecessary use of BM()
  xen/events/fifo: ensure all bitops are properly aligned even on x86
  xen/events/fifo: reset control block and local HEADs on resume
  xen/arm: use BUG_ON
  xen/grant-table: remove support for V2 tables
  x86/xen: safely map and unmap grant frames when in atomic context
  MAINTAINERS: Make me the Xen block subsystem (front and back) maintainer
  xen: Introduce 'xen_nopv' to disable PV extensions for HVM guests.
2014-08-07 11:33:15 -07:00
Linus Torvalds 76f09aa464 Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI changes from Ingo Molnar:
 "Main changes in this cycle are:

   - arm64 efi stub fixes, preservation of FP/SIMD registers across
     firmware calls, and conversion of the EFI stub code into a static
     library - Ard Biesheuvel

   - Xen EFI support - Daniel Kiper

   - Support for autoloading the efivars driver - Lee, Chun-Yi

   - Use the PE/COFF headers in the x86 EFI boot stub to request that
     the stub be loaded with CONFIG_PHYSICAL_ALIGN alignment - Michael
     Brown

   - Consolidate all the x86 EFI quirks into one file - Saurabh Tangri

   - Additional error logging in x86 EFI boot stub - Ulf Winkelvos

   - Support loading initrd above 4G in EFI boot stub - Yinghai Lu

   - EFI reboot patches for ACPI hardware reduced platforms"

* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  efi/arm64: Handle missing virtual mapping for UEFI System Table
  arch/x86/xen: Silence compiler warnings
  xen: Silence compiler warnings
  x86/efi: Request desired alignment via the PE/COFF headers
  x86/efi: Add better error logging to EFI boot stub
  efi: Autoload efivars
  efi: Update stale locking comment for struct efivars
  arch/x86: Remove efi_set_rtc_mmss()
  arch/x86: Replace plain strings with constants
  xen: Put EFI machinery in place
  xen: Define EFI related stuff
  arch/x86: Remove redundant set_bit(EFI_MEMMAP) call
  arch/x86: Remove redundant set_bit(EFI_SYSTEM_TABLES) call
  efi: Introduce EFI_PARAVIRT flag
  arch/x86: Do not access EFI memory map if it is not available
  efi: Use early_mem*() instead of early_io*()
  arch/ia64: Define early_memunmap()
  x86/reboot: Add EFI reboot quirk for ACPI Hardware Reduced flag
  efi/reboot: Allow powering off machines using EFI
  efi/reboot: Add generic wrapper around EfiResetSystem()
  ...
2014-08-04 17:13:50 -07:00
Daniel Kiper c7341d6a61 arch/x86/xen: Silence compiler warnings
Compiler complains in the following way when x86 32-bit kernel
with Xen support is build:

  CC      arch/x86/xen/enlighten.o
arch/x86/xen/enlighten.c: In function ‘xen_start_kernel’:
arch/x86/xen/enlighten.c:1726:3: warning: right shift count >= width of type [enabled by default]

Such line contains following EFI initialization code:

boot_params.efi_info.efi_systab_hi = (__u32)(__pa(efi_systab_xen) >> 32);

There is no issue if x86 64-bit kernel is build. However, 32-bit case
generate warning (even if that code will not be executed because Xen
does not work on 32-bit EFI platforms) due to __pa() returning unsigned long
type which has 32-bits width. So move whole EFI initialization stuff
to separate function and build it conditionally to avoid above mentioned
warning on x86 32-bit architecture.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-07-18 21:24:03 +01:00
Daniel Kiper be81c8a1da xen: Put EFI machinery in place
This patch enables EFI usage under Xen dom0. Standard EFI Linux
Kernel infrastructure cannot be used because it requires direct
access to EFI data and code. However, in dom0 case it is not possible
because above mentioned EFI stuff is fully owned and controlled
by Xen hypervisor. In this case all calls from dom0 to EFI must
be requested via special hypercall which in turn executes relevant
EFI code in behalf of dom0.

When dom0 kernel boots it checks for EFI availability on a machine.
If it is detected then artificial EFI system table is filled.
Native EFI callas are replaced by functions which mimics them
by calling relevant hypercall. Later pointer to EFI system table
is passed to standard EFI machinery and it continues EFI subsystem
initialization taking into account that there is no direct access
to EFI boot services, runtime, tables, structures, etc. After that
system runs as usual.

This patch is based on Jan Beulich and Tang Liang work.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Tang Liang <liang.tang@oracle.com>
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-07-18 21:23:58 +01:00
Konrad Rzeszutek Wilk 8d693b911b xen: Introduce 'xen_nopv' to disable PV extensions for HVM guests.
By default when CONFIG_XEN and CONFIG_XEN_PVHVM kernels are
run, they will enable the PV extensions (drivers, interrupts, timers,
etc) - which is the best option for the majority of use cases.

However, in some cases (kexec not fully working, benchmarking)
we want to disable Xen PV extensions. As such introduce the
'xen_nopv' parameter that will do it.

This parameter is intended only for HVM guests as the Xen PV
guests MUST boot with PV extensions. However, even if you use
'xen_nopv' on Xen PV guests it will be ignored.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
---
[v2: s/off/xen_nopv/ per Boris Ostrovsky recommendation.]
[v3: Add Reviewed-by]
[v4: Clarify that this is only for HVM guests]
2014-07-14 12:21:59 -04:00
Linus Torvalds 3d09c62394 Merge tag 'stable/for-linus-3.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen fixes from David Vrabel:
 "Xen regression and PVH fixes for 3.16-rc1

   - fix dom0 PVH memory setup on latest unstable Xen releases
   - fix 64-bit x86 PV guest boot failure on Xen 3.1 and earlier
   - fix resume regression on non-PV (auto-translated physmap) guests"

* tag 'stable/for-linus-3.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/grant-table: fix suspend for non-PV guests
  x86/xen: no need to explicitly register an NMI callback
  Revert "xen/pvh: Update E820 to work with PVH (v2)"
  x86/xen: fix memory setup for PVH dom0
2014-06-19 07:53:27 -10:00
David Vrabel abacaadc41 x86/xen: fix memory setup for PVH dom0
Since af06d66ee32b (x86: fix setup of PVH Dom0 memory map) in Xen, PVH
dom0 need only use the memory memory provided by Xen which has already
setup all the correct holes.

xen_memory_setup() then ends up being trivial for a PVH guest so
introduce a new function (xen_auto_xlated_memory_setup()).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Roger Pau Monné <roger.pau@citrix.com>
2014-06-05 14:22:27 +01:00
Linus Torvalds 9f888b3a10 Merge tag 'stable/for-linus-3.16-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip into next
Pull Xen updates from David Vrabel:
 "xen: features and fixes for 3.16-rc0
   - support foreign mappings in PVH domains (needed when dom0 is PVH)

   - fix mapping high MMIO regions in x86 PV guests (this is also the
     first half of removing the PAGE_IOMAP PTE flag).

   - ARM suspend/resume support.

   - ARM multicall support"

* tag 'stable/for-linus-3.16-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: map foreign pfns for autotranslated guests
  xen-acpi-processor: Don't display errors when we get -ENOSYS
  xen/pciback: Document the entry points for 'pcistub_put_pci_dev'
  xen/pciback: Document when the 'unbind' and 'bind' functions are called.
  xen-pciback: Document when we FLR an PCI device.
  xen-pciback: First reset, then free.
  xen-pciback: Cleanup up pcistub_put_pci_dev
  x86/xen: do not use _PAGE_IOMAP in xen_remap_domain_mfn_range()
  x86/xen: set regions above the end of RAM as 1:1
  x86/xen: only warn once if bad MFNs are found during setup
  x86/xen: compactly store large identity ranges in the p2m
  x86/xen: fix set_phys_range_identity() if pfn_e > MAX_P2M_PFN
  x86/xen: rename early_p2m_alloc() and early_p2m_alloc_middle()
  xen/x86: set panic notifier priority to minimum
  arm,arm64/xen: introduce HYPERVISOR_suspend()
  xen: refactor suspend pre/post hooks
  arm: xen: export HYPERVISOR_multicall to modules.
  arm64: introduce virt_to_pfn
  arm/xen: Remove definiition of virt_to_pfn in asm/xen/page.h
  arm: xen: implement multicall hypercall support.
2014-06-02 08:24:12 -07:00
Radim Krčmář bc5eb20161 xen/x86: set panic notifier priority to minimum
Execution is not going to continue after telling Xen about the crash.
Let other panic notifiers run by postponing the final hypercall as much
as possible.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-05-15 15:54:57 +01:00