Memory {increase|decrease}_reservation and VA mappings update/reset
code used in balloon driver can be made common, so other drivers can
also re-use the same functionality without open-coding.
Create a dedicated file for the shared code and export corresponding
symbols for other kernel modules.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Commit f5775e0b61 ("x86/xen: discard RAM regions above the maximum
reservation") left host memory not assigned to dom0 as available for
memory hotplug.
Unfortunately this also meant that those regions could be used by
others. Specifically, commit fa564ad963 ("x86/PCI: Enable a 64bit BAR
on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") may try to map those
addresses as MMIO.
To prevent this mark unallocated host memory as E820_TYPE_UNUSABLE (thus
effectively reverting f5775e0b61) and keep track of that region as
a hostmem resource that can be used for the hotplug.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Commit aba831a69632 ("xen: remove tests for pvh mode in pure pv paths")
removed XENFEAT_auto_translated_physmap test in xen_alloc_p2m_entry()
since it is assumed that the routine is never called by non-PV guests.
However, alloc_xenballooned_pages() may make this call on a PVH guest.
Prevent this from happening by adding XENFEAT_auto_translated_physmap
check there.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Fixes: aba831a69632 ("xen: remove tests for pvh mode in pure pv paths")
When setting up the Xenstore watch for the memory target size the new
watch will fire at once. Don't try to reach the configured target size
by onlining new memory in this case, as the current memory size will
be smaller in almost all cases due to e.g. BIOS reserved pages.
Onlining new memory will lead to more problems e.g. undesired conflicts
with NVMe devices meant to be operated as block devices.
Instead remember the difference between target size and current size
when the watch fires for the first time and apply it to any further
size changes, too.
In order to avoid races between balloon.c and xen-balloon.c init calls
do the xen-balloon.c initialization from balloon.c.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Balloon driver uses several PV-only concepts (xen_start_info,
xen_extra_mem,..) and it seems the simpliest solution to make HVM-only
build happy is to decorate these parts with #ifdefs.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Add #include <linux/cred.h> dependencies to all .c files rely on sched.h
doing that for them.
Note that even if the count where we need to add extra headers seems high,
it's still a net win, because <linux/sched.h> is included in over
2,200 files ...
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Only mark a page as managed when it is released back to the allocator.
This ensures that the managed page count does not get falsely increased
when a VM is running. Correspondingly change it so that pages are
marked as unmanaged after getting them from the allocator.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Fix a declared-but-not-defined warning when building with
XEN_BALLOON_MEMORY_HOTPLUG=n. This fixes a regression introduced by
commit dfd74a1edf ("xen/balloon: Fix crash when ballooning on x86 32
bit PAE").
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Commit 55b3da98a4 (xen/balloon: find
non-conflicting regions to place hotplugged memory) caused a
regression in 4.4.
When ballooning on an x86 32 bit PAE system with close to 64 GiB of
memory, the address returned by allocate_resource may be above 64 GiB.
When using CONFIG_SPARSEMEM, this setup is limited to using physical
addresses < 64 GiB. When adding memory at this address, it runs off
the end of the mem_section array and causes a crash. Instead, fail
the ballooning request.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: <stable@vger.kernel.org> # 4.4+
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Pull xen updates from David Vrabel:
"Features and fixes for 4.6:
- Make earlyprintk=xen work for HVM guests
- Remove module support for things never built as modules"
* tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
drivers/xen: make platform-pci.c explicitly non-modular
drivers/xen: make sys-hypervisor.c explicitly non-modular
drivers/xen: make xenbus_dev_[front/back]end explicitly non-modular
drivers/xen: make [xen-]ballon explicitly non-modular
xen: audit usages of module.h ; remove unnecessary instances
xen/x86: Drop mode-selecting ifdefs in startup_xen()
xen/x86: Zero out .bss for PV guests
hvc_xen: make early_printk work with HVM guests
hvc_xen: fix xenboot for DomUs
hvc_xen: add earlycon support
The Makefile / Kconfig currently controlling compilation here is:
obj-y += grant-table.o features.o balloon.o manage.o preempt.o time.o
[...]
obj-$(CONFIG_XEN_BALLOON) += xen-balloon.o
...with:
drivers/xen/Kconfig:config XEN_BALLOON
drivers/xen/Kconfig: bool "Xen memory balloon driver"
...meaning that they currently are not being built as modules by anyone.
Lets remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.
In doing so we uncover two implict includes that were obtained
by module.h having such a wide include scope itself:
In file included from drivers/xen/xen-balloon.c:41:0:
include/xen/balloon.h:26:51: warning: ‘struct page’ declared inside parameter list [enabled by default]
int alloc_xenballooned_pages(int nr_pages, struct page **pages);
^
include/xen/balloon.h: In function ‘register_xen_selfballooning’:
include/xen/balloon.h:35:10: error: ‘ENOSYS’ undeclared (first use in this function)
return -ENOSYS;
^
This is fixed by adding mm-types.h and errno.h to the list.
We also delete the MODULE_LICENSE tags since all that information
is already contained at the top of the file in the comments.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Currently, all newly added memory blocks remain in 'offline' state
unless someone onlines them, some linux distributions carry special udev
rules like:
SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
to make this happen automatically. This is not a great solution for
virtual machines where memory hotplug is being used to address high
memory pressure situations as such onlining is slow and a userspace
process doing this (udev) has a chance of being killed by the OOM killer
as it will probably require to allocate some memory.
Introduce default policy for the newly added memory blocks in
/sys/devices/system/memory/auto_online_blocks file with two possible
values: "offline" which preserves the current behavior and "online"
which causes all newly added memory blocks to go online as soon as
they're added. The default is "offline".
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The type of the item in frame_list is xen_pfn_t which is not an unsigned
long on ARM but an uint64_t.
With the current computation, the size of frame_list will be 2 *
PAGE_SIZE rather than PAGE_SIZE.
I bet it's just mistake when the type has been switched from "unsigned
long" to "xen_pfn_t" in commit 965c0aaafe
"xen: balloon: use correct type for frame_list".
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
For ARM64 guests, Linux is able to support either 64K or 4K page
granularity. Although, the hypercall interface is always based on 4K
page granularity.
With 64K page granularity, a single page will be spread over multiple
Xen frame.
To avoid splitting the page into 4K frame, take advantage of the
extent_order field to directly allocate/free chunk of the Linux page
size.
Note that PVMMU is only used for PV guest (which is x86) and the page
granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure
that because the code has not been modified.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Pages returned by alloc_xenballooned_pages() will be used for grant
mapping which will call set_phys_to_machine() (in PV guests).
Ballooned pages are set as INVALID_P2M_ENTRY in the p2m and thus may
be using the (shared) missing tables and a subsequent
set_phys_to_machine() will need to allocate new tables.
Since the grant mapping may be done from a context that cannot sleep,
the p2m entries must already be allocated.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
alloc_xenballooned_pages() is used to get ballooned pages to back
foreign mappings etc. Instead of having to balloon out real pages,
use (if supported) hotplugged memory.
This makes more memory available to the guest and reduces
fragmentation in the p2m.
This is only enabled if the xen.balloon.hotplug_unpopulated sysctl is
set to 1. This sysctl defaults to 0 in case the udev rules to
automatically online hotplugged memory do not exist.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Add xen.balloon.hotplug_unpopulated sysctl to enable use of hotplug
for unpopulated pages.
All users of alloc_xenballoon_pages() wanted low memory pages, so
remove the option for high memory.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Now that we track the total number of pages (included hotplugged
regions), it is easy to determine if more memory needs to be
hotplugged.
Add a new BP_WAIT state to signal that the balloon process needs to
wait until kicked by the memory add notifier (when the new section is
onlined by userspace).
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Return BP_WAIT if enough sections are already hotplugged.
v2:
- New BP_WAIT status after adding new memory sections.
The stats used for memory hotplug make no sense and are fiddled with
in odd ways. Remove them and introduce total_pages to track the total
number of pages (both populated and unpopulated) including those within
hotplugged regions (note that this includes not yet onlined pages).
This will be used in a subsequent commit (xen/balloon: only hotplug
additional memory if required) when deciding whether additional memory
needs to be hotplugged.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Instead of placing hotplugged memory at the end of RAM (which may
conflict with PCI devices or reserved regions) use allocate_resource()
to get a new, suitably aligned resource that does not conflict.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Remove stale comment.
Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
is meant, I suspect this is because the first support for Xen was for
PV. This resulted in some misimplementation of helpers on ARM and
confused developers about the expected behavior.
For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
Although, if we look at the implementation on x86, it's returning a GFN.
For clarity and avoid new confusion, replace any reference to mfn with
gfn in any helpers used by PV drivers. The x86 code will still keep some
reference of pfn_to_mfn which may be used by all kind of guests
No changes as been made in the hypercall field, even
though they may be invalid, in order to keep the same as the defintion
in xen repo.
Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a
name to close to the KVM function gfn_to_page.
Take also the opportunity to simplify simple construction such
as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean up
will come in follow-up patches.
[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
When dom0 is being ballooned balloon_process() will hold the balloon
mutex until it is finished. This will block e.g. creation of new
domains as the device backends for the new domain need some
autoballooned pages for the ring buffers.
Avoid this by releasing the balloon mutex from time to time during
ballooning. Adjust the comment above balloon_process() regarding
multiple instances of balloon_process().
Instead of open coding it, just use cond_resched().
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>