Commit Graph

93 Commits

Author SHA1 Message Date
Tomáš Golembiovský 4d32029b8d virtio_balloon: include disk/file caches memory statistics
Add a new field VIRTIO_BALLOON_S_CACHES to virtio_balloon memory
statistics protocol. The value represents all disk/file caches.

In this case it corresponds to the sum of values
Buffers+Cached+SwapCached from /proc/meminfo.

Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-31 01:47:33 +02:00
Jan Stancek d9e427f6ab virtio_balloon: fix increment of vb->num_pfns in fill_balloon()
commit c7cdff0e86 ("virtio_balloon: fix deadlock on OOM")
changed code to increment vb->num_pfns before call to
set_page_pfns(), which used to happen only after.

This patch fixes boot hang for me on ppc64le KVM guests.

Fixes: c7cdff0e86 ("virtio_balloon: fix deadlock on OOM")
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-12-01 16:55:45 +02:00
Michael S. Tsirkin c7cdff0e86 virtio_balloon: fix deadlock on OOM
fill_balloon doing memory allocations under balloon_lock
can cause a deadlock when leak_balloon is called from
virtballoon_oom_notify and tries to take same lock.

To fix, split page allocation and enqueue and do allocations outside the lock.

Here's a detailed analysis of the deadlock by Tetsuo Handa:

In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
serialize against fill_balloon(). But in fill_balloon(),
alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, despite __GFP_NORETRY
is specified, this allocation attempt might indirectly depend on somebody
else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
__GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
will cause OOM lockup.

  Thread1                                       Thread2
    fill_balloon()
      takes a balloon_lock
      balloon_page_enqueue()
        alloc_page(GFP_HIGHUSER_MOVABLE)
          direct reclaim (__GFP_FS context)       takes a fs lock
            waits for that fs lock                  alloc_page(GFP_NOFS)
                                                      __alloc_pages_may_oom()
                                                        takes the oom_lock
                                                        out_of_memory()
                                                          blocking_notifier_call_chain()
                                                            leak_balloon()
                                                              tries to take that balloon_lock and deadlocks

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-11-14 23:57:38 +02:00
Wei Wang f9aada5fff virtio-balloon: coding format cleanup
Clean up the comment format.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-25 16:37:35 +03:00
Liang Li 195a8c43e9 virtio-balloon: deflate via a page list
This patch saves the deflated pages to a list, instead of the PFN array.
Accordingly, the balloon_pfn_to_page() function is removed.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-25 16:37:35 +03:00
Michael S. Tsirkin e41b135550 virtio_balloon: disable VIOMMU support
virtio balloon bypasses the DMA API entirely so does not support the
VIOMMU right now.  It's not clear we need that support, for now let's
just make sure we don't pretend to support it.

Cc: stable@vger.kernel.org
Cc: Wei Wang <wei.w.wang@intel.com>
Fixes: 1a93769399 ("virtio: new feature to detect IOMMU device quirk")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2017-06-18 23:13:35 +03:00
Michael S. Tsirkin 9b2bbdb227 virtio: wrap find_vqs
We are going to add more parameters to find_vqs, let's wrap the call so
we don't need to tweak all drivers every time.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-02 23:41:42 +03:00
Arnd Bergmann f0bb2d50df virtio_balloon: prevent uninitialized variable use
The latest gcc-7.0.1 snapshot reports a new warning:

virtio/virtio_balloon.c: In function 'update_balloon_stats':
virtio/virtio_balloon.c:258:26: error: 'events[2]' is used uninitialized in this function [-Werror=uninitialized]
virtio/virtio_balloon.c:260:26: error: 'events[3]' is used uninitialized in this function [-Werror=uninitialized]
virtio/virtio_balloon.c:261:56: error: 'events[18]' is used uninitialized in this function [-Werror=uninitialized]
virtio/virtio_balloon.c:262:56: error: 'events[17]' is used uninitialized in this function [-Werror=uninitialized]

This seems absolutely right, so we should add an extra check to
prevent copying uninitialized stack data into the statistics.
>From all I can tell, this has been broken since the statistics code
was originally added in 2.6.34.

Fixes: 9564e138b1 ("virtio: Add memory statistics reporting to the balloon driver (V4)")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-28 20:41:28 +03:00
Ladi Prosek 9646b26e85 virtio-balloon: use actual number of stats for stats queue buffers
The virtio balloon driver contained a not-so-obvious invariant that
update_balloon_stats has to update exactly VIRTIO_BALLOON_S_NR counters
in order to send valid stats to the host. This commit fixes it by having
update_balloon_stats return the actual number of counters, and its
callers use it when pushing buffers to the stats virtqueue.

Note that it is still out of spec to change the number of counters
at run-time. "Driver MUST supply the same subset of statistics in all
buffers submitted to the statsq."

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-28 20:41:28 +03:00
Ladi Prosek fc8653228c virtio_balloon: init 1st buffer in stats vq
When init_vqs runs, virtio_balloon.stats is either uninitialized or
contains stale values. The host updates its state with garbage data
because it has no way of knowing that this is just a marker buffer
used for signaling.

This patch updates the stats before pushing the initial buffer.

Alternative fixes:
* Push an empty buffer in init_vqs. Not easily done with the current
  virtio implementation and violates the spec "Driver MUST supply the
  same subset of statistics in all buffers submitted to the statsq".
* Push a buffer with invalid tags in init_vqs. Violates the same
  spec clause, plus "invalid tag" is not really defined.

Note: the spec says:
	When using the legacy interface, the device SHOULD ignore all values in
	the first buffer in the statsq supplied by the driver after device
	initialization. Note: Historically, drivers supplied an uninitialized
	buffer in the first buffer.

Unfortunately QEMU does not seem to implement the recommendation
even for the legacy interface.

Cc: stable@vger.kernel.org
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-28 20:41:27 +03:00
Linus Torvalds 1827adb11a Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull sched.h split-up from Ingo Molnar:
 "The point of these changes is to significantly reduce the
  <linux/sched.h> header footprint, to speed up the kernel build and to
  have a cleaner header structure.

  After these changes the new <linux/sched.h>'s typical preprocessed
  size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K
  lines), which is around 40% faster to build on typical configs.

  Not much changed from the last version (-v2) posted three weeks ago: I
  eliminated quirks, backmerged fixes plus I rebased it to an upstream
  SHA1 from yesterday that includes most changes queued up in -next plus
  all sched.h changes that were pending from Andrew.

  I've re-tested the series both on x86 and on cross-arch defconfigs,
  and did a bisectability test at a number of random points.

  I tried to test as many build configurations as possible, but some
  build breakage is probably still left - but it should be mostly
  limited to architectures that have no cross-compiler binaries
  available on kernel.org, and non-default configurations"

* 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits)
  sched/headers: Clean up <linux/sched.h>
  sched/headers: Remove #ifdefs from <linux/sched.h>
  sched/headers: Remove the <linux/topology.h> include from <linux/sched.h>
  sched/headers, hrtimer: Remove the <linux/wait.h> include from <linux/hrtimer.h>
  sched/headers, x86/apic: Remove the <linux/pm.h> header inclusion from <asm/apic.h>
  sched/headers, timers: Remove the <linux/sysctl.h> include from <linux/timer.h>
  sched/headers: Remove <linux/magic.h> from <linux/sched/task_stack.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/init.h>
  sched/core: Remove unused prefetch_stack()
  sched/headers: Remove <linux/rculist.h> from <linux/sched.h>
  sched/headers: Remove the 'init_pid_ns' prototype from <linux/sched.h>
  sched/headers: Remove <linux/signal.h> from <linux/sched.h>
  sched/headers: Remove <linux/rwsem.h> from <linux/sched.h>
  sched/headers: Remove the runqueue_is_locked() prototype
  sched/headers: Remove <linux/sched.h> from <linux/sched/hotplug.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/debug.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/nohz.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/stat.h>
  sched/headers: Remove the <linux/gfp.h> include from <linux/sched.h>
  sched/headers: Remove <linux/rtmutex.h> from <linux/sched.h>
  ...
2017-03-03 10:16:38 -08:00
Linus Torvalds 54d7989f47 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull vhost updates from Michael Tsirkin:
 "virtio, vhost: optimizations, fixes

  Looks like a quiet cycle for vhost/virtio, just a couple of minor
  tweaks. Most notable is automatic interrupt affinity for blk and scsi.
  Hopefully other devices are not far behind"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-console: avoid DMA from stack
  vhost: introduce O(1) vq metadata cache
  virtio_scsi: use virtio IRQ affinity
  virtio_blk: use virtio IRQ affinity
  blk-mq: provide a default queue mapping for virtio device
  virtio: provide a method to get the IRQ affinity mask for a virtqueue
  virtio: allow drivers to request IRQ affinity when creating VQs
  virtio_pci: simplify MSI-X setup
  virtio_pci: don't duplicate the msix_enable flag in struct pci_dev
  virtio_pci: use shared interrupts for virtqueues
  virtio_pci: remove struct virtio_pci_vq_info
  vhost: try avoiding avail index access when getting descriptor
  virtio_mmio: expose header to userspace
2017-03-02 13:53:13 -08:00
Ingo Molnar 50d34394ce sched/headers: Prepare to remove the <linux/magic.h> include from <linux/sched/task_stack.h>
Update files that depend on the magic.h inclusion.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:40 +01:00
Christoph Hellwig fb5e31d970 virtio: allow drivers to request IRQ affinity when creating VQs
Add a struct irq_affinity pointer to the find_vqs methods, which if set
is used to tell the PCI layer to create the MSI-X vectors for our I/O
virtqueues with the proper affinity from the start.  Compared to after
the fact affinity hints this gives us an instantly working setup and
allows to allocate the irq descritors node-local and avoid interconnect
traffic.  Last but not least this will allow blk-mq queues are created
based on the interrupt affinity for storage drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-27 20:54:04 +02:00
Yisheng Xie 9c57b5808c mm balloon: umount balloon_mnt when removing vb device
With CONFIG_BALLOON_COMPACTION=y the kernel will mount balloon_mnt for
balloon page migration when we probe a virtio_balloon device.  However
we do not unmount it when removing the device.  Fix this.

Fixes: b1123ea6d3 ("mm: balloon: use general non-lru movable page feature")
Link: http://lkml.kernel.org/r/1486531318-35189-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Konstantin Neumoin 8424af5336 virtio: update balloon size in balloon "probe"
The following commit 'fad7b7b27b6a (virtio_balloon: Use a workqueue
instead of "vballoon" kthread)' has added a regression. Original code with
kthread starts the thread inside probe and checks the necessity to update
balloon inside the thread immediately.

Nowadays the code behaves differently. Work is queued only on the first
command from the host after the negotiation. Thus there is a window
especially at the guest startup or the module reloading when the balloon
size is not updated until the notification from the host.

This patch adds balloon size check at the end of the probe to match
original behaviour.

Signed-off-by: Konstantin Neumoin <kneumoin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-31 00:21:41 +02:00
Linus Torvalds 0803e04011 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost updates from Michael Tsirkin:

 - new vsock device support in host and guest

 - platform IOMMU support in host and guest, including compatibility
   quirks for legacy systems.

 - misc fixes and cleanups.

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  VSOCK: Use kvfree()
  vhost: split out vringh Kconfig
  vhost: detect 32 bit integer wrap around
  vhost: new device IOTLB API
  vhost: drop vringh dependency
  vhost: convert pre sorted vhost memory array to interval tree
  vhost: introduce vhost memory accessors
  VSOCK: Add Makefile and Kconfig
  VSOCK: Introduce vhost_vsock.ko
  VSOCK: Introduce virtio_transport.ko
  VSOCK: Introduce virtio_vsock_common.ko
  VSOCK: defer sock removal to transports
  VSOCK: transport-specific vsock_transport functions
  vhost: drop vringh dependency
  vop: pull in vhost Kconfig
  virtio: new feature to detect IOMMU device quirk
  balloon: check the number of available pages in leak balloon
  vhost: lockless enqueuing
  vhost: simplify work flushing
2016-08-06 09:20:13 -04:00
Konstantin Neumoin 37cf99e08c balloon: check the number of available pages in leak balloon
The balloon has a special mechanism that is subscribed to the oom
notification which leads to deflation for a fixed number of pages.
The number is always fixed even when the balloon is fully deflated.
But leak_balloon did not expect that the pages to deflate will be more
than taken, and raise a "BUG" in balloon_page_dequeue when page list
will be empty.

So, the simplest solution would be to check that the number of releases
pages is less or equal to the number taken pages.

Cc: stable@vger.kernel.org
Signed-off-by: Konstantin Neumoin <kneumoin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-01 21:44:51 +03:00
Minchan Kim dd4123f324 mm: fix build warnings in <linux/compaction.h>
Randy reported below build error.

> In file included from ../include/linux/balloon_compaction.h:48:0,
>                  from ../mm/balloon_compaction.c:11:
> ../include/linux/compaction.h:237:51: warning: 'struct node' declared inside parameter list [enabled by default]
>  static inline int compaction_register_node(struct node *node)
> ../include/linux/compaction.h:237:51: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
> ../include/linux/compaction.h:242:54: warning: 'struct node' declared inside parameter list [enabled by default]
>  static inline void compaction_unregister_node(struct node *node)
>

It was caused by non-lru page migration which needs compaction.h but
compaction.h doesn't include any header to be standalone.

I think proper header for non-lru page migration is migrate.h rather
than compaction.h because migrate.h has already headers needed to work
non-lru page migration indirectly like isolate_mode_t, migrate_mode
MIGRATEPAGE_SUCCESS.

[akpm@linux-foundation.org: revert mm-balloon-use-general-non-lru-movable-page-feature-fix.patch temp fix]
Link: http://lkml.kernel.org/r/20160610003304.GE29779@bbox
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Minchan Kim b1123ea6d3 mm: balloon: use general non-lru movable page feature
Now, VM has a feature to migrate non-lru movable pages so balloon
doesn't need custom migration hooks in migrate.c and compaction.c.

Instead, this patch implements the page->mapping->a_ops->
{isolate|migrate|putback} functions.

With that, we could remove hooks for ballooning in general migration
functions and make balloon compaction simple.

[akpm@linux-foundation.org: compaction.h requires that the includer first include node.h]
Link: http://lkml.kernel.org/r/1464736881-24886-4-git-send-email-minchan@kernel.org
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Michael S. Tsirkin 87c9403b0d virtio_balloon: fix PFN format for virtio-1
Everything should be LE when using virtio-1, but
the linux balloon driver does not seem to care about that.

Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-22 19:44:13 +03:00
Linus Torvalds f0691533b7 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost updates from Michael Tsirkin:
 "New features, performance improvements, cleanups:

   - basic polling support for vhost
   - rework virtio to optionally use DMA API, fixing it on Xen
   - balloon stats gained a new entry
   - using the new napi_alloc_skb speeds up virtio net
   - virtio blk stats can now be read while another VCPU is busy
     inflating or deflating the balloon

  plus misc cleanups in various places"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_net: replace netdev_alloc_skb_ip_align() with napi_alloc_skb()
  vhost_net: basic polling support
  vhost: introduce vhost_vq_avail_empty()
  vhost: introduce vhost_has_work()
  virtio_balloon: Allow to resize and update the balloon stats in parallel
  virtio_balloon: Use a workqueue instead of "vballoon" kthread
  virtio/s390: size of SET_IND payload
  virtio/s390: use dev_to_virtio
  vhost: rename vhost_init_used()
  vhost: rename cross-endian helpers
  virtio_blk: VIRTIO_BLK_F_WCE->VIRTIO_BLK_F_FLUSH
  vring: Use the DMA API on Xen
  virtio_pci: Use the DMA API if enabled
  virtio_mmio: Use the DMA API if enabled
  virtio: Add improved queue allocation API
  virtio_ring: Support DMA APIs
  vring: Introduce vring_use_dma_api()
  s390/dma: Allow per device dma ops
  alpha/dma: use common noop dma ops
  dma: Provide simple noop dma ops
2016-03-20 13:28:18 -07:00
Igor Redko 5057dcd0f1 virtio_balloon: export 'available' memory to balloon statistics
Add a new field, VIRTIO_BALLOON_S_AVAIL, to virtio_balloon memory
statistics protocol, corresponding to 'Available' in /proc/meminfo.

It indicates to the hypervisor how big the balloon can be inflated
without pushing the guest system to swap.

Signed-off-by: Igor Redko <redkoi@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Petr Mladek fd0e21c31e virtio_balloon: Allow to resize and update the balloon stats in parallel
The virtio balloon statistics are not updated when the balloon
is being resized. But it seems that both tasks could be done
in parallel.

stats_handle_request() updates the statistics in the balloon
structure and then communicates with the host.

update_balloon_stats() calls all_vm_events() that just reads
some per-CPU variables. The values might change during and
after the call but it is expected and happens even without
this patch.

update_balloon_stats() also calls si_meminfo(). It is a bit
more complex function. It too just reads some variables and
looks lock-less safe. In each case, it seems to be called
lock-less on several similar locations, e.g. from post_status()
in dm_thread_func(), or from vmballoon_send_get_target().

The communication with the host is done via a separate virtqueue,
see vb->stats_vq vs. vb->inflate_vq and vb->deflate_vq. Therefore
it could be used in parallel with fill_balloon() and leak_balloon().

This patch splits the existing work into two pieces. One is for
updating the balloon stats. The other is for resizing of the balloon.
It seems that they can be proceed in parallel without any
extra locking.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 01:40:18 +02:00
Petr Mladek fad7b7b27b virtio_balloon: Use a workqueue instead of "vballoon" kthread
This patch moves the deferred work from the "vballoon" kthread into a
system freezable workqueue.

We do not need to maintain and run a dedicated kthread. Also the event
driven workqueues API makes the logic much easier. Especially, we do
not longer need an own wait queue, wait function, and freeze point.

The conversion is pretty straightforward. One cycle of the main loop
is put into a work. The work is queued instead of waking the kthread.

fill_balloon() and leak_balloon() have a limit for the amount of modified
pages. The work re-queues itself when necessary. For this, we make
fill_balloon() to return the number of really modified pages.
Note that leak_balloon() already did this.

virtballoon_restore() queues the work only when really needed.

The only complication is that we need to prevent queuing the work
when the balloon is being removed. It was easier before because the
kthread simply removed itself from the wait queue. We need an
extra boolean and spin lock now.

My initial idea was to use a dedicated workqueue. Michael S. Tsirkin
suggested using a system one. Tejun Heo confirmed that the system
workqueue has a pretty high concurrency level (256) by default.
Therefore we need not be afraid of too long blocking.

Signed-off-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 01:40:13 +02:00