Commit Graph

248 Commits

Author SHA1 Message Date
Linus Torvalds 53861af9a1 Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio updates from Rusty Russell:
 "OK, this has the big virtio 1.0 implementation, as specified by OASIS.

  On top of tht is the major rework of lguest, to use PCI and virtio
  1.0, to double-check the implementation.

  Then comes the inevitable fixes and cleanups from that work"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (80 commits)
  virtio: don't set VIRTIO_CONFIG_S_DRIVER_OK twice.
  virtio_net: unconditionally define struct virtio_net_hdr_v1.
  tools/lguest: don't use legacy definitions for net device in example launcher.
  virtio: Don't expose legacy net features when VIRTIO_NET_NO_LEGACY defined.
  tools/lguest: use common error macros in the example launcher.
  tools/lguest: give virtqueues names for better error messages
  tools/lguest: more documentation and checking of virtio 1.0 compliance.
  lguest: don't look in console features to find emerg_wr.
  tools/lguest: don't start devices until DRIVER_OK status set.
  tools/lguest: handle indirect partway through chain.
  tools/lguest: insert driver references from the 1.0 spec (4.1 Virtio Over PCI)
  tools/lguest: insert device references from the 1.0 spec (4.1 Virtio Over PCI)
  tools/lguest: rename virtio_pci_cfg_cap field to match spec.
  tools/lguest: fix features_accepted logic in example launcher.
  tools/lguest: handle device reset correctly in example launcher.
  virtual: Documentation: simplify and generalize paravirt_ops.txt
  lguest: remove NOTIFY call and eventfd facility.
  lguest: remove NOTIFY facility from demonstration launcher.
  lguest: use the PCI console device's emerg_wr for early boot messages.
  lguest: always put console in PCI slot #1.
  ...
2015-02-18 09:24:01 -08:00
Rusty Russell d9bab50aa4 lguest: remove NOTIFY call and eventfd facility.
Disappointing, as this was kind of neat (especially getting to use RCU
to manage the address -> eventfd mapping).  But now the devices are PCI
handled in userspace, we get rid of both the NOTIFY hypercall and
the interface to connect an eventfd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:46 +10:30
Rusty Russell e68ccd1f9d lguest: remove support for lguest bus.
The demonstration launcher now uses PCI entirely.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:41 +10:30
Rusty Russell 7313d5217e lguest: add iomem region, where guest page faults get sent to userspace.
This lets us implement PCI.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:33 +10:30
Rusty Russell c565650b10 lguest: send trap 13 through to userspace.
We copy 7 bytes at eip for userspace's instruction decode; we have to
carefully handle the case where eip is at the end of a page.  We can't
leave this to userspace since kernel has all the page table decode
logic.

The decode logic moves to userspace, basically unchanged.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:31 +10:30
Rusty Russell c9e433e4b8 lguest: add infrastructure to check mappings.
We normally abort the guest unconditionally when it gives us a bad address,
but in the next patch we want to copy some bytes which may not be mapped.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:31 +10:30
Rusty Russell 8ed313001a lguest: add infrastructure for userspace to deliver a trap to the guest.
This is required for instruction emulation to move to userspace.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:30 +10:30
Rusty Russell 69a09dc174 lguest: write more information to userspace about pending traps.
This is preparation for userspace handling MMIO and ioport accesses.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:30 +10:30
Rusty Russell 18c137371b lguest: add operations to get/set a register from the Launcher.
We use the ptrace API struct, and we currently don't let them set
anything but the normal registers (we'd have to filter the others).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:29 +10:30
Andy Lutomirski 375074cc73 x86: Clean up cr4 manipulation
CR4 manipulation was split, seemingly at random, between direct
(write_cr4) and using a helper (set/clear_in_cr4).  Unfortunately,
the set_in_cr4 and clear_in_cr4 helpers also poke at the boot code,
which only a small subset of users actually wanted.

This patch replaces all cr4 access in functions that don't leave cr4
exactly the way they found it with new helpers cr4_set_bits,
cr4_clear_bits, and cr4_set_bits_and_update_boot.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Vince Weaver <vince@deater.net>
Cc: "hillf.zj" <hillf.zj@alibaba-inc.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/495a10bdc9e67016b8fd3945700d46cfd5c12c2f.1414190806.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 12:10:41 +01:00
Michael S. Tsirkin 5c609a5ef0 virtio: allow finalize_features to fail
This will make it easy for transports to validate features and return
failure.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 16:32:32 +02:00
Michael S. Tsirkin 93d389f820 virtio: assert 32 bit features in transports
At this point, no transports set any of the high 32 feature bits.
Since transports generally can't (yet) cope with such bits, add BUG_ON
checks to make sure they are not set by mistake.

Based on rproc patch by Rusty.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:24 +02:00
Michael S. Tsirkin d025477368 virtio: add support for 64 bit features.
Change u32 to u64, and use BIT_ULL and 1ULL everywhere.

Note: transports are unchanged, and only set low 32 bit.
This guarantees that no transport sets e.g. VERSION_1
by mistake without proper support.

Based on patch by Rusty.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:24 +02:00
Michael S. Tsirkin e16e12be34 virtio: use u32, not bitmap for features
It seemed like a good idea to use bitmap for features
in struct virtio_device, but it's actually a pain,
and seems to become even more painful when we get more
than 32 feature bits.  Just change it to a u32 for now.

Based on patch by Rusty.

Suggested-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:23 +02:00
WANG Chao f6f8ed4735 mm/vmalloc.c: clean up map_vm_area third argument
Currently map_vm_area() takes (struct page *** pages) as third argument,
and after mapping, it moves (*pages) to point to (*pages +
nr_mappped_pages).

It looks like this kind of increment is useless to its caller these
days.  The callers don't care about the increments and actually they're
trying to avoid this by passing another copy to map_vm_area().

The caller can always guarantee all the pages can be mapped into vm_area
as specified in first argument and the caller only cares about whether
map_vm_area() fails or not.

This patch cleans up the pointer movement in map_vm_area() and updates
its callers accordingly.

Signed-off-by: WANG Chao <chaowang@redhat.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:19 -07:00
Andrew Morton 179e09637c drivers/lguest/page_tables.c: rename do_set_pte()
"mm: introduce vm_ops->map_pages()" wants to export a do_set_pte() from core
kernel.  Rename lguest's do_set_pte() to something more lguest-specific.

Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:35:52 -07:00
Andi Kleen cdd77e87ea x86, asmlinkage, lguest: Pass in globals into assembler statement
Tell the compiler that the inline assembler statement
references lguest_entry.

This fixes compile problems with LTO where the variable
and the assembler code may end up in different files.

Cc: x86@kernel.org
Cc: rusty@rustcorp.com.au
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-11-07 12:13:05 +10:30
Heinz Graalfs 46f9c2b925 virtio_ring: change host notification API
Currently a host kick error is silently ignored and not reflected in
the virtqueue of a particular virtio device.

Changing the notify API for guest->host notification seems to be one
prerequisite in order to be able to handle such errors in the context
where the kick is triggered.

This patch changes the notify API. The notify function must return a
bool return value. It returns false if the host notification failed.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-10-29 11:28:11 +10:30
Rusty Russell 98fb4e5e6b lguest: fix guest kernel stack overflow when TF bit set.
The symptoms are that running gdb on a binary causes the guest to
overflow the kernels stack (after some period of time), resulting in
it finally being killed with a "Bad address" message.

Reported-by: Sakari Ailus <sakari.ailus@iki.fi>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-09-06 08:09:27 +09:30
Rusty Russell 4623c28e22 lguest: fix BUG_ON() in invalid guest page table.
If we discover the entry is invalid, we kill the guest, but we must
avoid calling gpte_addr() on the invalid pmd, otherwise:

	kernel BUG at drivers/lguest/page_tables.c:157!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-09-06 08:09:26 +09:30
Linus Torvalds 80cc38b163 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "The usual stuff from trivial tree"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
  treewide: relase -> release
  Documentation/cgroups/memory.txt: fix stat file documentation
  sysctl/net.txt: delete reference to obsolete 2.4.x kernel
  spinlock_api_smp.h: fix preprocessor comments
  treewide: Fix typo in printk
  doc: device tree: clarify stuff in usage-model.txt.
  open firmware: "/aliasas" -> "/aliases"
  md: bcache: Fixed a typo with the word 'arithmetic'
  irq/generic-chip: fix a few kernel-doc entries
  frv: Convert use of typedef ctl_table to struct ctl_table
  sgi: xpc: Convert use of typedef ctl_table to struct ctl_table
  doc: clk: Fix incorrect wording
  Documentation/arm/IXP4xx fix a typo
  Documentation/networking/ieee802154 fix a typo
  Documentation/DocBook/media/v4l fix a typo
  Documentation/video4linux/si476x.txt fix a typo
  Documentation/virtual/kvm/api.txt fix a typo
  Documentation/early-userspace/README fix a typo
  Documentation/video4linux/soc-camera.txt fix a typo
  lguest: fix CONFIG_PAE -> CONFIG_x86_PAE in comment
  ...
2013-07-04 11:40:58 -07:00
H. Peter Anvin 1adfa76a95 x86, flags: Rename X86_EFLAGS_BIT1 to X86_EFLAGS_FIXED
Bit 1 in the x86 EFLAGS is always set.  Name the macro something that
actually tries to explain what it is all about, rather than being a
tautology.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/n/tip-f10rx5vjjm6tfnt8o1wseb3v@git.kernel.org
2013-06-25 16:25:32 -07:00
Paul Bolle e3d1848f29 lguest: fix CONFIG_PAE -> CONFIG_x86_PAE in comment
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-05-29 15:13:05 +02:00
Rusty Russell f616fe4fee lguest: clear cached last cpu when guest_set_pgd() called.
commit v3.9-rc1-53-g6d0cda9 "lguest: cache last cpu we ran on." missed
one case, which causes a triple fault.  The guest calls guest_set_pgd()
on the top page, and we carefully remap the Switcher text page.  But
we didn't reset last_host_cpu, so map_switcher_in_guest() thinks
the guest's regs and IDT/GDT etc are already mapped.

Reported-by: Paul Bolle <pebolle@tiscali.nl>
Tested-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-05-08 10:49:18 +09:30
Linus Torvalds 736a2dd257 Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio & lguest updates from Rusty Russell:
 "Lots of virtio work which wasn't quite ready for last merge window.

  Plus I dived into lguest again, reworking the pagetable code so we can
  move the switcher page: our fixmaps sometimes take more than 2MB now..."

Ugh.  Annoying conflicts with the tcm_vhost -> vhost_scsi rename.
Hopefully correctly resolved.

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (57 commits)
  caif_virtio: Remove bouncing email addresses
  lguest: improve code readability in lg_cpu_start.
  virtio-net: fill only rx queues which are being used
  lguest: map Switcher below fixmap.
  lguest: cache last cpu we ran on.
  lguest: map Switcher text whenever we allocate a new pagetable.
  lguest: don't share Switcher PTE pages between guests.
  lguest: expost switcher_pages array (as lg_switcher_pages).
  lguest: extract shadow PTE walking / allocating.
  lguest: make check_gpte et. al return bool.
  lguest: assume Switcher text is a single page.
  lguest: rename switcher_page to switcher_pages.
  lguest: remove RESERVE_MEM constant.
  lguest: check vaddr not pgd for Switcher protection.
  lguest: prepare to make SWITCHER_ADDR a variable.
  virtio: console: replace EMFILE with EBUSY for already-open port
  virtio-scsi: reset virtqueue affinity when doing cpu hotplug
  virtio-scsi: introduce multiqueue support
  virtio-scsi: push vq lock/unlock into virtscsi_vq_done
  virtio-scsi: pass struct virtio_scsi to virtqueue completion function
  ...
2013-05-02 14:14:04 -07:00