Commit Graph

360789 Commits

Author SHA1 Message Date
Sasha Levin 55257d72bd virtio-net: fill only rx queues which are being used
Due to MQ support we may allocate a whole bunch of rx queues but
never use them. With this patch we'll save the space used by
the receive buffers until they are actually in use:

sh-4.2# free -h
             total       used       free     shared    buffers     cached
Mem:          490M        35M       455M         0B         0B       4.1M
-/+ buffers/cache:        31M       459M
Swap:           0B         0B         0B
sh-4.2# ethtool -L eth0 combined 8
sh-4.2# free -h
             total       used       free     shared    buffers     cached
Mem:          490M       162M       327M         0B         0B       4.1M
-/+ buffers/cache:       158M       331M
Swap:           0B         0B         0B

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-29 12:47:05 +09:30
Rusty Russell 6b39271746 lguest: map Switcher below fixmap.
Now we've adjusted all the code, we can simply set switcher_addr to
wherever it needs to go below the fixmaps, rather than asserting that
it should be so.

With large NR_CPUS and PAE, people were hitting the "mapping switcher
would thwack fixmap" message.

Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:45:03 +09:30
Rusty Russell 6d0cda93c0 lguest: cache last cpu we ran on.
This optimizes the frobbing of our Switcher map.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:45:03 +09:30
Rusty Russell 86935fc4ee lguest: map Switcher text whenever we allocate a new pagetable.
It's always the same, so no need to put in the PTE every time we're
about to run.  Keep a flag to track whether the pagetable has the
Switcher entries allocated, and when allocating always initialize the
Switcher text PTE.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:45:02 +09:30
Rusty Russell 3412b6ae29 lguest: don't share Switcher PTE pages between guests.
We currently use the whole top PGD entry for the switcher, so we
simply share a fixed page of PTEs between all guests (actually, it's
one per Host CPU, to ensure isolation between guests).

Change to a scheme where every guest has its own mappings.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:45:01 +09:30
Rusty Russell f1f394b1c3 lguest: expose switcher_pages array (as lg_switcher_pages).
We will need this in page_table.c soon.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:45:00 +09:30
Rusty Russell 17427e08fa lguest: extract shadow PTE walking / allocating.
We want a separate find_pte() function so we can call it for populating the
switcher PTE entries.

We can also use it in page_writable().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:44:47 +09:30
Rusty Russell e1d12606f7 lguest: make check_gpte et al. return bool.
This is a bit neater: we can immediately return if a PTE/PGD/PMD entry
is invalid (which also kills the guest).  It means we don't risk using
invalid entries as we reshuffle the code.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:31:39 +09:30
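
The bool-returning check pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not lguest's code: `struct guest`, `PTE_ALLOWED_FLAGS`, `check_entry()` and `map_entry()` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest state; real lguest tracks far more than this. */
struct guest {
    bool dead;
};

/* Hypothetical mask of flag bits a guest entry may legally set. */
#define PTE_ALLOWED_FLAGS 0x7u

static void kill_guest(struct guest *g)
{
    g->dead = true;
}

/* Returns false (and kills the guest) if the entry has forbidden bits. */
bool check_entry(struct guest *g, uint32_t flags)
{
    assert(g != NULL);
    if (flags & ~PTE_ALLOWED_FLAGS) {
        kill_guest(g);
        return false;
    }
    return true;
}

/* Caller bails out at once, so an invalid entry is never used further. */
bool map_entry(struct guest *g, uint32_t flags, uint32_t *shadow)
{
    if (!check_entry(g, flags))
        return false;
    *shadow = flags;
    return true;
}
```

Because the check reports failure directly, each caller can return early and the shadow entry is left untouched when validation fails.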
Rusty Russell 93a2cdff98 lguest: assume Switcher text is a single page.
i.e. SHARED_SWITCHER_PAGES == 1.  It is well under a page, and it's a
minor simplification: it's nice to have *one* simplification in a
patch series!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:31:36 +09:30
Rusty Russell 856c608827 lguest: rename switcher_page to switcher_pages.
There is a single page with the Switcher in it, but it's followed by 2
pages per Host CPU.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:31:35 +09:30
Rusty Russell c215a8b9eb lguest: remove RESERVE_MEM constant.
We can use switcher_addr directly.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:31:35 +09:30
Rusty Russell 68a644d734 lguest: check vaddr not pgd for Switcher protection.
We currently assume that the Switcher occupies the top pgd; we want to
remove this assumption, so check that vaddr is OK, rather than checking
the pgd index.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:31:34 +09:30
Rusty Russell 406a590ba1 lguest: prepare to make SWITCHER_ADDR a variable.
We currently use the whole top PGD entry for the switcher, but that's
hitting the fixmap in some configurations (mainly, large NR_CPUS).
Introduce a variable, currently set to the constant.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-22 15:31:33 +09:30
Amit Shah 74ff582cd6 virtio: console: replace EMFILE with EBUSY for already-open port
Returning EMFILE (process has too many open files) is incorrect to
indicate a port is already open by another process.  Use EBUSY for that.

This does change what we report to userspace, but I believe userspace
can look at it this way: it gets EBUSY, a new error code, instead of
EMFILE.  It's still an error, and that's not changing.

Reported-by: Mateusz Guzik <mguzik@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-15 15:17:39 +09:30
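
The error-code change above can be illustrated with a small user-space sketch. It is not the virtio console driver's code: `struct port`, its `is_open` field and `port_open()` are hypothetical names, and the real driver takes locks that are omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical port state: only one opener is allowed at a time. */
struct port {
    bool is_open;
};

/*
 * Open the port. A second open fails with -EBUSY, meaning the resource
 * is in use, rather than -EMFILE, which would claim that the calling
 * process has too many open files.
 */
int port_open(struct port *p)
{
    assert(p != NULL);
    if (p->is_open)
        return -EBUSY;
    p->is_open = true;
    return 0;
}
```

A caller that previously tested for `-EMFILE` still sees a negative return value, so generic error handling keeps working.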
Wanlong Gao 285e71ea6f virtio-scsi: reset virtqueue affinity when doing cpu hotplug
Add hot cpu notifier to reset the request virtqueue affinity
when doing cpu hotplug.

Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-08 23:06:56 +09:30
Paolo Bonzini 9141a4ca0d virtio-scsi: introduce multiqueue support
This patch adds queue steering to virtio-scsi.  When a target is sent
multiple requests, we always drive them to the same queue so that FIFO
processing order is kept.  However, if a target was idle, we can choose
a queue arbitrarily.  In this case the queue is chosen according to the
current VCPU, so the driver expects the number of request queues to be
equal to the number of VCPUs.  This makes it easy and fast to select
the queue, and also lets the driver optimize the IRQ affinity for the
virtqueues (each virtqueue's affinity is set to the CPU that "owns"
the queue).

The speedup comes from improving cache locality and giving CPU affinity
to the virtqueues, which is why this scheme was selected.  Assuming that
the thread that is sending requests to the device is I/O-bound, it is
likely to be sleeping at the time the ISR is executed, and thus executing
the ISR on the same processor that sent the requests is cheap.

However, the kernel will not execute the ISR on the "best" processor
unless you explicitly set the affinity.  This is because in practice
you will have many such I/O-bound processes and thus many otherwise
idle processors.  Then the kernel will execute the ISR on a random
processor, rather than the one that is sending requests to the device.

The alternative to per-CPU virtqueues is per-target virtqueues.  To
achieve the same locality, we could dynamically choose the virtqueue's
affinity based on the CPU of the last task that sent a request.  This
is less appealing because we do not set the affinity directly: we only
provide a hint to the irqbalance daemon running in userspace.  Dynamically
changing the affinity only works if userspace applies the hint
fast enough.

Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Tested-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-08 23:06:55 +09:30
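
The queue steering rule described in this commit can be sketched as below. This is a simplified, single-threaded illustration under stated assumptions: `struct target_state`, `pick_queue()` and `complete_request()` are hypothetical names, and the real driver needs atomic counters and locking that are left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-target bookkeeping. */
struct target_state {
    unsigned int reqs; /* requests currently in flight for this target */
    size_t queue;      /* queue carrying those in-flight requests */
};

/*
 * Choose a request queue for a new command.
 * - Busy target: reuse its current queue so FIFO order is kept.
 * - Idle target: use the queue that belongs to the submitting CPU.
 */
size_t pick_queue(struct target_state *tgt, size_t current_cpu,
                  size_t num_queues)
{
    assert(tgt != NULL && num_queues > 0);
    if (tgt->reqs == 0)
        tgt->queue = current_cpu % num_queues;
    tgt->reqs++;
    return tgt->queue;
}

/* Account for a finished request; the target is idle again at zero. */
void complete_request(struct target_state *tgt)
{
    assert(tgt != NULL && tgt->reqs > 0);
    tgt->reqs--;
}
```

With one queue per VCPU, the modulo is a no-op and the choice reduces to "the queue owned by the current CPU", which is what makes the selection cheap.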
Paolo Bonzini 10f34f64d3 virtio-scsi: push vq lock/unlock into virtscsi_vq_done
Avoid duplicated code in all of the callers.

Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-08 23:06:53 +09:30
Paolo Bonzini 7f82b3c915 virtio-scsi: pass struct virtio_scsi to virtqueue completion function
This will be needed soon in order to retrieve the per-target
struct.

Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-08 23:06:49 +09:30
Wanlong Gao 5c370194df virtio-scsi: redo allocation of target data
virtio_scsi_target_state is now empty.  We will find new uses for it in
the next few patches, so this patch does not drop it completely.

And as James suggested, we use entries target_alloc and target_destroy
in the host template to allocate and destroy the virtio_scsi_target_state
of each target, attach this struct to scsi_target->hostdata. Now
we can get at it from the sdev with scsi_target(sdev)->hostdata.
No messing around with fixed size arrays and bulk memory allocation
and no need to pass in the maximum target size as a parameter because
everything should now happen dynamically.

Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-08 23:06:47 +09:30
Wei Yongjun 3826835ab8 virtio_console: make local symbols static
These symbols are only used within this file, and should be static.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-08 23:00:26 +09:30
Wei Yongjun 1aef76e9c4 caif_virtio: fix error return code in cfv_create_genpool()
Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-02 16:48:25 +10:30
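
The class of bug fixed here, an error path that returns 0, can be sketched as follows. The code is illustrative only: `struct pool`, `alloc_buf()` and `create_pool()` are hypothetical, and the zero-size failure is just a convenient way to trigger the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical pool object. */
struct pool {
    void *buf;
};

/* Hypothetical allocator: a zero-size request is treated as a failure. */
static void *alloc_buf(size_t size)
{
    return size ? malloc(size) : NULL;
}

int create_pool(struct pool *p, size_t size)
{
    int err;

    assert(p != NULL);
    p->buf = alloc_buf(size);
    if (!p->buf) {
        /* Set the code before jumping; otherwise the caller sees success. */
        err = -ENOMEM;
        goto out_err;
    }
    return 0;

out_err:
    /* Shared cleanup for all failure paths would go here. */
    return err;
}
```

The key point is that every `goto` into the shared error label is preceded by an assignment of a negative error code.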
Amos Kong 916cdabc31 MAINTAINERS: add missing entries for virtio
Some header files were split or moved to uapi/ without
updating MAINTAINERS.

Signed-off-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-02 16:43:00 +10:30
Paul Bolle 608c380c30 virtio: do not export "u16" and "u64" to userspace
virtio_balloon.h exports "u16" and "u64" to userspace. Use "__u16" and
"__u64" instead.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-02 16:42:58 +10:30
Sjur Brændeland a8c7687bf2 caif_virtio: Check that vringh_config is not null
Check that vringh_config is not NULL before using it.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-03-24 14:29:15 +10:30
Sjur Brændeland b2273be8d2 caif_virtio: Use vringh_notify_enable correctly
Check on the correct return value from
vringh_notify_enable_kern(). It returns false if
more packets are available, not true.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-03-24 14:29:14 +10:30
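
The return-value convention this fix relies on can be sketched with a stand-in ring. Nothing below is the vringh API itself: `struct fake_ring`, `ring_notify_enable()`, `ring_notify_disable()` and `must_poll_again()` are hypothetical and only mirror the rule stated in the commit, namely that enabling notifications returns false when more packets are already available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ring with pending buffers. */
struct fake_ring {
    int pending; /* buffers waiting to be processed */
    bool notify; /* whether notifications are enabled */
};

/* Enables notifications; returns false if buffers are already waiting. */
static bool ring_notify_enable(struct fake_ring *r)
{
    r->notify = true;
    return r->pending == 0;
}

static void ring_notify_disable(struct fake_ring *r)
{
    r->notify = false;
}

/*
 * Called after draining the ring. A false return from the enable call
 * means work arrived in the meantime, so notifications are switched
 * off again and the caller must keep polling.
 */
bool must_poll_again(struct fake_ring *r)
{
    assert(r != NULL);
    if (!ring_notify_enable(r)) {
        ring_notify_disable(r);
        return true;
    }
    return false;
}
```

Testing for `true` instead of `false` here would invert the logic: the poller would stop exactly when more work is pending.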